[couchbase] Add Integration Package with Cluster Data Stream #3706

Merged
merged 10 commits into from
Aug 12, 2022
…into package_couchbase_cluster
bhagyaraj-crest committed Jul 27, 2022
commit 7b60b4d65ee3d09cbeb4bff05690e4f733bfa7c6
14 changes: 11 additions & 3 deletions packages/couchbase/_dev/build/docs/README.md
@@ -1,8 +1,8 @@
# Couchbase Integration

This Elastic integration collects and parses the [Cluster](https://docs.couchbase.com/server/current/rest-api/rest-cluster-details.html) metrics from [Couchbase](https://www.couchbase.com/) so that the user could monitor and troubleshoot the performance of the Couchbase instances.
This Elastic integration collects and parses [Bucket](https://docs.couchbase.com/server/current/rest-api/rest-buckets-summary.html) and [Cluster](https://docs.couchbase.com/server/current/rest-api/rest-cluster-details.html) metrics from [Couchbase](https://www.couchbase.com/) so that the user could monitor and troubleshoot the performance of the Couchbase instances.

This integration uses `http` metricbeat module to collect `cluster` metrics.
This integration uses `http` metricbeat module to collect `bucket` and `cluster` metrics.

Note: For Couchbase cluster setup, there is an ideal scenario of single host with administrator access for the entire cluster to collect metrics. Providing multiple hosts from the same cluster might lead to data duplication. In case of multiple clusters, adding a new integration to collect data from a different cluster's host is a good option.

@@ -20,10 +20,18 @@ Example Host Configuration: `http://Administrator:password@localhost:8091`

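The example host configuration packs the scheme, credentials, host, and port into a single URL. As a minimal sketch, assuming Python's standard `urllib.parse` (the integration itself does not necessarily parse the value this way), the components can be separated as follows. `describe_host` is a hypothetical helper name used only for illustration.

```python
from urllib.parse import urlsplit


def describe_host(url: str) -> dict:
    """Split a host URL of the form scheme://user:password@host:port into its parts."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "hostname": parts.hostname,
        "port": parts.port,
    }


# Value taken from the README's example host configuration.
host = describe_host("http://Administrator:password@localhost:8091")
print(host)
```

Because the credentials travel inside the URL, the value is best treated as a secret wherever it is stored or logged.
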
## Metrics

### Bucket

This is the `bucket` data stream. A bucket is a logical container for a related set of items such as key-value pairs or documents.

{{event "bucket"}}

{{fields "bucket"}}

### Cluster

This is the `cluster` data stream. A cluster is a collection of nodes that are accessed and managed as a single group. Each node is an equal partner in orchestrating the cluster to provide facilities such as operational information (monitoring) or managing cluster membership of nodes and health of nodes.

{{event "cluster"}}

{{fields "cluster"}}
{{fields "cluster"}}
3 changes: 3 additions & 0 deletions packages/couchbase/changelog.yml
@@ -4,3 +4,6 @@
- description: Couchbase integration package with "cluster" data stream.
  type: enhancement
  link: https://github.com/elastic/integrations/pull/3706
- description: Couchbase integration package with "bucket" data stream.
  type: enhancement
  link: https://github.com/elastic/integrations/pull/3666
150 changes: 147 additions & 3 deletions packages/couchbase/docs/README.md
@@ -1,8 +1,8 @@
# Couchbase Integration

This Elastic integration collects and parses the [Cluster](https://docs.couchbase.com/server/current/rest-api/rest-cluster-details.html) metrics from [Couchbase](https://www.couchbase.com/) so that the user could monitor and troubleshoot the performance of the Couchbase instances.
This Elastic integration collects and parses [Bucket](https://docs.couchbase.com/server/current/rest-api/rest-buckets-summary.html) and [Cluster](https://docs.couchbase.com/server/current/rest-api/rest-cluster-details.html) metrics from [Couchbase](https://www.couchbase.com/) so that the user could monitor and troubleshoot the performance of the Couchbase instances.

This integration uses `http` metricbeat module to collect `cluster` metrics.
This integration uses `http` metricbeat module to collect `bucket` and `cluster` metrics.

Note: For Couchbase cluster setup, there is an ideal scenario of single host with administrator access for the entire cluster to collect metrics. Providing multiple hosts from the same cluster might lead to data duplication. In case of multiple clusters, adding a new integration to collect data from a different cluster's host is a good option.

@@ -20,6 +20,151 @@ Example Host Configuration: `http://Administrator:password@localhost:8091`

## Metrics

### Bucket

This is the `bucket` data stream. A bucket is a logical container for a related set of items such as key-value pairs or documents.

An example event for `bucket` looks as following:

```json
{
    "@timestamp": "2022-07-22T10:40:36.032Z",
    "agent": {
        "ephemeral_id": "b6b8e21b-ded1-41d8-a193-c5aead533ff1",
        "id": "5d67808a-0fe5-4f5f-9636-ec161f0cdcf0",
        "name": "docker-fleet-agent",
        "type": "metricbeat",
        "version": "8.3.2"
    },
    "couchbase": {
        "bucket": {
            "data": {
                "used": {
                    "bytes": 20892210
                }
            },
            "disk": {
                "fetches": 0,
                "used": {
                    "bytes": 20914347
                }
            },
            "item": {
                "count": 7303
            },
            "memory": {
                "used": {
                    "bytes": 34972008
                }
            },
            "name": "beer-sample",
            "operations_per_sec": 0,
            "ram": {
                "quota": {
                    "bytes": 104857600,
                    "used": {
                        "pct": 33.35190582275391
                    }
                }
            },
            "type": "membase"
        }
    },
    "data_stream": {
        "dataset": "couchbase.bucket",
        "namespace": "ep",
        "type": "metrics"
    },
    "ecs": {
        "version": "8.3.0"
    },
    "elastic_agent": {
        "id": "5d67808a-0fe5-4f5f-9636-ec161f0cdcf0",
        "snapshot": false,
        "version": "8.3.2"
    },
    "event": {
        "agent_id_status": "verified",
        "category": [
            "database"
        ],
        "dataset": "couchbase.bucket",
        "duration": 6674276,
        "ingested": "2022-07-22T10:40:39Z",
        "kind": "metric",
        "module": "couchbase",
        "type": [
            "info"
        ]
    },
    "host": {
        "architecture": "x86_64",
        "containerized": true,
        "hostname": "docker-fleet-agent",
        "ip": [
            "172.26.0.7"
        ],
        "mac": [
            "02:42:ac:1a:00:07"
        ],
        "name": "docker-fleet-agent",
        "os": {
            "codename": "focal",
            "family": "debian",
            "kernel": "5.4.0-110-generic",
            "name": "Ubuntu",
            "platform": "ubuntu",
            "type": "linux",
            "version": "20.04.4 LTS (Focal Fossa)"
        }
    },
    "metricset": {
        "name": "json",
        "period": 10000
    },
    "service": {
        "address": "http://elastic-package-service_couchbase_1:8091/pools/default/buckets",
        "type": "http"
    },
    "tags": [
        "forwarded",
        "couchbase-bucket"
    ]
}
```
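In the sample event, `ram.quota.used.pct` appears to equal `memory.used.bytes` divided by `ram.quota.bytes`, expressed as a percentage. The sketch below checks that arithmetic against the sample values. It is an observation about this one event, not a statement of how Couchbase computes the figure, and `ram_quota_used_pct` is a hypothetical helper name.

```python
def ram_quota_used_pct(memory_used_bytes: int, ram_quota_bytes: int) -> float:
    """Return memory used as a percentage of the RAM quota."""
    if ram_quota_bytes <= 0:
        raise ValueError("RAM quota must be positive")
    return memory_used_bytes / ram_quota_bytes * 100


# Values copied from the sample `bucket` event.
memory_used = 34972008
ram_quota = 104857600  # 100 MiB
reported_pct = 33.35190582275391

computed_pct = ram_quota_used_pct(memory_used, ram_quota)
# Compare with a small tolerance rather than exact float equality.
matches = abs(computed_pct - reported_pct) < 1e-9
print(computed_pct, matches)
```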

**Exported fields**

| Field | Description | Type | Unit | Metric Type |
|---|---|---|---|---|
| @timestamp | Event timestamp. | date | | |
| couchbase.bucket.data.used.bytes | Size of user data within buckets of the specified state that are resident in RAM. | long | byte | gauge |
| couchbase.bucket.disk.fetches | Number of disk fetches. | long | | gauge |
| couchbase.bucket.disk.used.bytes | Amount of disk used (bytes). | long | byte | gauge |
| couchbase.bucket.item.count | Number of items associated with the bucket. | long | | counter |
| couchbase.bucket.memory.used.bytes | Amount of memory used by the bucket (bytes). | long | byte | gauge |
| couchbase.bucket.name | Name of the bucket. | keyword | | |
| couchbase.bucket.operations_per_sec | Number of operations per second. | long | | gauge |
| couchbase.bucket.ram.quota.bytes | RAM quota allocated to the bucket (bytes). | long | byte | gauge |
| couchbase.bucket.ram.quota.used.pct | Percentage of RAM used (for active objects) against the configured bucket size (%). | scaled_float | percent | gauge |
| couchbase.bucket.type | Type of the bucket. | keyword | | |
| data_stream.dataset | Data stream dataset. | constant_keyword | | |
| data_stream.namespace | Data stream namespace. | constant_keyword | | |
| data_stream.type | Data stream type. | constant_keyword | | |
| ecs.version | ECS version this event conforms to. `ecs.version` is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. | keyword | | |
| error.message | Error message. | match_only_text | | |
| event.category | This is one of four ECS Categorization Fields, and indicates the second level in the ECS category hierarchy. `event.category` represents the "big buckets" of ECS categories. For example, filtering on `event.category:process` yields all events relating to process activity. This field is closely related to `event.type`, which is used as a subcategory. This field is an array. This will allow proper categorization of some events that fall in multiple categories. | keyword | | |
| event.dataset | Name of the dataset. If an event source publishes more than one type of log or events (e.g. access log, error log), the dataset is used to specify which one the event comes from. It's recommended but not required to start the dataset name with the module name, followed by a dot, then the dataset name. | keyword | | |
| event.duration | Duration of the event in nanoseconds. If event.start and event.end are known this value should be the difference between the end and start time. | long | | |
| event.ingested | Timestamp when an event arrived in the central data store. This is different from `@timestamp`, which is when the event originally occurred. It's also different from `event.created`, which is meant to capture the first time an agent saw the event. In normal conditions, assuming no tampering, the timestamps should chronologically look like this: `@timestamp` \< `event.created` \< `event.ingested`. | date | | |
| event.kind | This is one of four ECS Categorization Fields, and indicates the highest level in the ECS category hierarchy. `event.kind` gives high-level information about what type of information the event contains, without being specific to the contents of the event. For example, values of this field distinguish alert events from metric events. The value of this field can be used to inform how these kinds of events should be handled. They may warrant different retention, different access control, it may also help understand whether the data coming in at a regular interval or not. | keyword | | |
| event.module | Name of the module this data is coming from. If your monitoring agent supports the concept of modules or plugins to process events of a given source (e.g. Apache logs), `event.module` should contain the name of this module. | keyword | | |
| event.type | This is one of four ECS Categorization Fields, and indicates the third level in the ECS category hierarchy. `event.type` represents a categorization "sub-bucket" that, when used along with the `event.category` field values, enables filtering events down to a level appropriate for single visualization. This field is an array. This will allow proper categorization of some events that fall in multiple event types. | keyword | | |
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword | | |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword | | |
| tags | List of keywords used to tag each event. | keyword | | |

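To relate the exported fields to the data collected from the `/pools/default/buckets` endpoint shown in `service.address`, here is a rough Python sketch that reshapes one bucket object into the `couchbase.bucket.*` layout. The input key names (`basicStats`, `quota`, `bucketType`, and the individual stat names) are assumptions about the Couchbase REST response and should be checked against the Couchbase documentation. The integration's actual mapping is not shown in this diff, and `to_bucket_fields` is a hypothetical name.

```python
from typing import Any, Dict


def to_bucket_fields(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Reshape one (assumed) REST bucket object into the couchbase.bucket.* layout."""
    stats = raw.get("basicStats", {})
    return {
        "name": raw.get("name"),
        "type": raw.get("bucketType"),
        "data": {"used": {"bytes": stats.get("dataUsed")}},
        "disk": {
            "fetches": stats.get("diskFetches"),
            "used": {"bytes": stats.get("diskUsed")},
        },
        "item": {"count": stats.get("itemCount")},
        "memory": {"used": {"bytes": stats.get("memUsed")}},
        "operations_per_sec": stats.get("opsPerSec"),
        "ram": {
            "quota": {
                "bytes": raw.get("quota", {}).get("ram"),
                "used": {"pct": stats.get("quotaPercentUsed")},
            }
        },
    }


# Hypothetical input; the values are copied from the sample `bucket` event.
raw_bucket = {
    "name": "beer-sample",
    "bucketType": "membase",
    "quota": {"ram": 104857600},
    "basicStats": {
        "quotaPercentUsed": 33.35190582275391,
        "opsPerSec": 0,
        "diskFetches": 0,
        "itemCount": 7303,
        "diskUsed": 20914347,
        "dataUsed": 20892210,
        "memUsed": 34972008,
    },
}

fields = to_bucket_fields(raw_bucket)
print(fields["name"], fields["item"]["count"])
```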

### Cluster

This is the `cluster` data stream. A cluster is a collection of nodes that are accessed and managed as a single group. Each node is an equal partner in orchestrating the cluster to provide facilities such as operational information (monitoring) or managing cluster membership of nodes and health of nodes.
@@ -203,4 +348,3 @@ An example event for `cluster` looks as following:
| service.address | Address where data about this service was collected from. This should be a URI, network address (ipv4:port or [ipv6]:port) or a resource path (sockets). | keyword | | |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword | | |
| tags | List of keywords used to tag each event. | keyword | | |
