[Azure Storage Account] Add metric_type metadata to the storage_account datastream #7488

Merged
5 changes: 5 additions & 0 deletions packages/azure_metrics/changelog.yml
@@ -1,3 +1,8 @@
- version: "1.0.25"
  changes:
    - description: Add metric_type metadata to the `storage_account` datastream
      type: enhancement
      link: https://github.com/elastic/integrations/pull/7488
- version: "1.0.23"
  changes:
    - description: Add dimension and metric_type metadata to the compute_vm_scaleset datastream
115 changes: 108 additions & 7 deletions packages/azure_metrics/data_stream/storage_account/fields/fields.yml
@@ -1,7 +1,108 @@
- name: azure.storage_account.*.*
  type: object
  object_type: float
  object_type_mapping_type: "*"
  description: >
    storage account

- name: azure.storage_account
  type: group
  fields:
Contributor

@zmoog zmoog Aug 22, 2023

Contributor


Oh, please disregard my comment about IndexCapacity. I could not find it at first, but it's there. My bad.

Contributor Author

Contributor


Oh, this is surprising: how does the FileShareQuota field from the docs become the file_share_capacity_quota field in ES?

Contributor Author


I think this is taken from here or from the SDK.

Contributor


Got it, thanks.

I see we have two documents for Azure Files:

Since the first one mentions the "Microsoft.ClassicStorage/storageAccounts" namespace, this may mean the first is for the classic version of the Files service and the second is for the current/modern version.

Are we supporting metrics for the classic namespace?

We should also check whether we need to add some IOPS-related fields from the current namespace doc, such as:

  • FileShareMaxUsedIOPS
  • FileShareProvisionedIOPS
  • FileShareMaxUsedBandwidthMiBps

WDYT?

Contributor Author


Oops, my mistake! I referred to the classicStorage link instead of the correct one, which should be Microsoft.Storage/storageAccounts, as this is the namespace we are supporting. To address this, I will update the metric_type mapping to use the object type instead of the group format, like this:

- name: azure.storage_account.*.*
  type: object
  object_type: float
  metric_type: gauge
  object_type_mapping_type: "*"
  description: >
    storage account

This will ensure that all the fields within this namespace are correctly accounted for.

I didn't use this format before due to some issues associated with it. However, it seems those issues have been resolved, as I tested it successfully.

[Screenshots: two images captured 2023-08-22 at 9:34 PM]
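
As a rough illustration of why the wildcard definition in the snippet above would cover every metric in the namespace, here is a minimal Python sketch. It uses `fnmatchcase` from the standard library as a stand-in for the path matching that the mapping applies; that equivalence is an assumption, and the real matcher may differ in edge cases. The `covered` helper and the sample field list are hypothetical.

```python
from fnmatch import fnmatchcase

# Wildcard field name taken from the fields.yml snippet in this comment.
PATTERN = "azure.storage_account.*.*"


def covered(field: str, pattern: str = PATTERN) -> bool:
    """Return True if `field` matches the wildcard pattern.

    fnmatchcase only approximates the real matcher: its `*` matches
    any run of characters, including dots.
    """
    return fnmatchcase(field, pattern)


# Sample names; the first three appear in the README table of this PR.
fields = [
    "azure.storage_account.availability.avg",
    "azure.storage_account.file_share_capacity_quota.avg",
    "azure.storage_account.transactions.total",
    "azure.resource.name",  # outside the storage_account namespace
]

for name in fields:
    print(f"{name}: {covered(name)}")
```

Under this approximation, any metric reported under `azure.storage_account.<metric>.<aggregation>` is picked up without being listed one by one, which is the trade-off against the explicit group format.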

Contributor


Got it! Thanks for the check!

I'm not an expert in TSDB, but this makes sense.

I updated the link in the PR description with the non-classic one; please take a look to double-check if it's the correct one.

Contributor Author


Thanks @zmoog!

    - name: availability.avg
      type: float
      metric_type: gauge
      unit: percent
      description: The percentage of availability for the storage service or the specified API operation.
    - name: egress.total
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of egress data, in bytes.
    - name: ingress.total
      type: float
      unit: byte
      metric_type: gauge
      description: The amount of ingress data, in bytes.
    - name: success_e2elatency.avg
      type: float
      metric_type: gauge
      unit: ms
      description: The end-to-end latency of successful requests made to a storage service or the specified API operation, in milliseconds.
    - name: success_server_latency.avg
      type: float
      metric_type: gauge
      unit: ms
      description: The latency used by Azure Storage to process a successful request, in milliseconds.
    - name: transactions.total
      type: float
      metric_type: gauge
      description: The number of requests made to a storage service or the specified API operation.
    - name: used_capacity.avg
      type: float
      metric_type: gauge
      unit: byte
      description: Account used capacity
    - name: blob_capacity.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of storage used by the storage account's Blob service in bytes.
    - name: blob_count.avg
      type: float
      metric_type: gauge
      description: The number of blobs in the storage account's Blob service.
    - name: container_count.avg
      type: float
      metric_type: gauge
      description: The number of containers in the storage account's Blob service.
    - name: index_capacity.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of storage used by ADLS Gen2 (Hierarchical) Index in bytes.
    - name: file_capacity.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of storage used by the storage account's File service in bytes.
    - name: file_count.avg
      type: float
      metric_type: gauge
      description: The number of files in the storage account's File service.
    - name: file_share_count.avg
      type: float
      metric_type: gauge
      description: The number of file shares in the storage account's File service.
    - name: file_share_capacity_quota.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The upper limit on the amount of storage that can be used by Azure Files Service in bytes.
    - name: file_share_snapshot_count.avg
      type: float
      metric_type: gauge
      description: The number of snapshots present on the share in the storage account's Files Service.
    - name: file_share_snapshot_size.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of storage used by the snapshots in the storage account's File service in bytes.
    - name: queue_capacity.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of storage used by the storage account's Queue service in bytes.
    - name: queue_count.avg
      type: float
      metric_type: gauge
      description: The number of queues in the storage account's Queue service.
    - name: queue_message_count.avg
      type: float
      metric_type: gauge
      description: The approximate number of queue messages in the storage account's Queue service.
    - name: table_capacity.avg
      type: float
      metric_type: gauge
      unit: byte
      description: The amount of storage used by the storage account's Table service in bytes.
    - name: table_count.avg
      type: float
      metric_type: gauge
      description: The number of tables in the storage account's Table service.
    - name: table_entity_count.avg
      type: float
      metric_type: gauge
      description: The number of table entities in the storage account's Table service.
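
With the explicit group format, every field has to carry its own `metric_type`, so a small check can catch entries that were missed. The following is a minimal sketch, not part of the PR: the `missing_metric_type` helper is hypothetical, the allowed set of metric types is an assumption, and the field list is a hand-copied excerpt of the group above rather than a parsed file.

```python
# Assumed set of accepted values; adjust to whatever the package spec allows.
ALLOWED_METRIC_TYPES = {"gauge", "counter"}


def missing_metric_type(group: dict) -> list[str]:
    """Return the full names of fields that lack a valid metric_type."""
    problems = []
    for field in group.get("fields", []):
        if field.get("metric_type") not in ALLOWED_METRIC_TYPES:
            problems.append(f"{group['name']}.{field['name']}")
    return problems


# Hand-copied excerpt of the azure.storage_account group.
storage_account = {
    "name": "azure.storage_account",
    "type": "group",
    "fields": [
        {"name": "availability.avg", "type": "float",
         "metric_type": "gauge", "unit": "percent"},
        {"name": "egress.total", "type": "float",
         "metric_type": "gauge", "unit": "byte"},
        {"name": "transactions.total", "type": "float",
         "metric_type": "gauge"},
    ],
}

print(missing_metric_type(storage_account))  # prints []
```

A field added later without `metric_type` would show up in the returned list under its full dotted name.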
@@ -62,10 +62,3 @@
  description: >
    Azure metric dimensions.

- name: metrics.*.*
  type: object
  object_type: float
  object_type_mapping_type: "*"
  description: >
    Metrics returned.

133 changes: 77 additions & 56 deletions packages/azure_metrics/docs/README.md
@@ -278,62 +278,83 @@ so the `period` for `storage_account` should be `300s` or multiples of `300s`.
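
The constraint on `period` stated in the line above can be expressed as a small validation. This is a minimal sketch under stated assumptions: the `period_is_valid` helper is hypothetical, and it only understands the plain `<integer>s` spelling shown in the README.

```python
import re

BASE_SECONDS = 300  # from the README: `300s` or multiples of `300s`


def period_is_valid(period: str) -> bool:
    """Check that a period such as "600s" is a positive multiple of 300 seconds.

    Only the "<integer>s" form is handled; other duration spellings
    (for example "5m") are rejected by this sketch.
    """
    match = re.fullmatch(r"(\d+)s", period)
    if match is None:
        return False
    seconds = int(match.group(1))
    return seconds > 0 and seconds % BASE_SECONDS == 0


for candidate in ("300s", "900s", "60s"):
    print(candidate, period_is_valid(candidate))
```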

**Exported fields**

| Field | Description | Type |
|---|---|---|
| @timestamp | Event timestamp. | date |
| azure.application_id | The application ID | keyword |
| azure.dimensions.\* | Azure metric dimensions. | object |
| azure.metrics.\*.\* | Metrics returned. | object |
| azure.namespace | The namespace selected | keyword |
| azure.resource.group | The resource group | keyword |
| azure.resource.id | The id of the resource | keyword |
| azure.resource.name | The name of the resource | keyword |
| azure.resource.tags.\* | Azure resource tags. | object |
| azure.resource.type | The type of the resource | keyword |
| azure.storage_account.\*.\* | storage account | object |
| azure.subscription_id | The subscription ID | keyword |
| azure.timegrain | The Azure metric timegrain | keyword |
| cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword |
| cloud.availability_zone | Availability zone in which this host is running. | keyword |
| cloud.image.id | Image ID for the cloud instance. | keyword |
| cloud.instance.id | Instance ID of the host machine. | keyword |
| cloud.instance.name | Instance name of the host machine. | keyword |
| cloud.machine.type | Machine type of the host machine. | keyword |
| cloud.project.id | Name of the project in Google Cloud. | keyword |
| cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword |
| cloud.region | Region in which this host is running. | keyword |
| container.id | Unique container id. | keyword |
| container.image.name | Name of the image the container was built on. | keyword |
| container.labels | Image labels. | object |
| container.name | Container name. | keyword |
| container.runtime | Runtime managing this container. | keyword |
| data_stream.dataset | Data stream dataset name. | constant_keyword |
| data_stream.namespace | Data stream namespace. | constant_keyword |
| data_stream.type | Data stream type. | constant_keyword |
| dataset.name | Dataset name. | constant_keyword |
| dataset.namespace | Dataset namespace. | constant_keyword |
| dataset.type | Dataset type. | constant_keyword |
| ecs.version | ECS version this event conforms to. `ecs.version` is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. | keyword |
| host | A host is defined as a general computing instance. ECS host.\* fields should be populated with details about the host on which the event happened, or from which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes. | group |
| host.architecture | Operating system architecture. | keyword |
| host.containerized | If the host is a container. | boolean |
| host.domain | Name of the domain of which the host is a member. For example, on Windows this could be the host's Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host's LDAP provider. | keyword |
| host.hostname | Hostname of the host. It normally contains what the `hostname` command returns on the host machine. | keyword |
| host.id | Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of `beat.name`. | keyword |
| host.ip | Host ip addresses. | ip |
| host.mac | Host mac addresses. | keyword |
| host.name | Name of the host. It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use. | keyword |
| host.os.build | OS build information. | keyword |
| host.os.codename | OS codename, if any. | keyword |
| host.os.family | OS family (such as redhat, debian, freebsd, windows). | keyword |
| host.os.kernel | Operating system kernel version as a raw string. | keyword |
| host.os.name | Operating system name, without the version. | keyword |
| host.os.name.text | Multi-field of `host.os.name`. | text |
| host.os.platform | Operating system platform (such as centos, ubuntu, windows). | keyword |
| host.os.version | Operating system version as a raw string. | keyword |
| host.type | Type of host. For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment. | keyword |
| service.address | Service address | keyword |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword |
| Field | Description | Type | Unit | Metric Type |
|---|---|---|---|---|
| @timestamp | Event timestamp. | date | | |
| azure.application_id | The application ID | keyword | | |
| azure.dimensions.\* | Azure metric dimensions. | object | | |
| azure.namespace | The namespace selected | keyword | | |
| azure.resource.group | The resource group | keyword | | |
| azure.resource.id | The id of the resource | keyword | | |
| azure.resource.name | The name of the resource | keyword | | |
| azure.resource.tags.\* | Azure resource tags. | object | | |
| azure.resource.type | The type of the resource | keyword | | |
| azure.storage_account.availability.avg | The percentage of availability for the storage service or the specified API operation. | float | percent | gauge |
| azure.storage_account.blob_capacity.avg | The amount of storage used by the storage account's Blob service in bytes. | float | byte | gauge |
| azure.storage_account.blob_count.avg | The number of blobs in the storage account's Blob service. | float | | gauge |
| azure.storage_account.container_count.avg | The number of containers in the storage account's Blob service. | float | | gauge |
| azure.storage_account.egress.total | The amount of egress data, in bytes. | float | byte | gauge |
| azure.storage_account.file_capacity.avg | The amount of storage used by the storage account's File service in bytes. | float | byte | gauge |
| azure.storage_account.file_count.avg | The number of files in the storage account's File service. | float | | gauge |
| azure.storage_account.file_share_capacity_quota.avg | The upper limit on the amount of storage that can be used by Azure Files Service in bytes. | float | byte | gauge |
| azure.storage_account.file_share_count.avg | The number of file shares in the storage account's File service. | float | | gauge |
| azure.storage_account.file_share_snapshot_count.avg | The number of snapshots present on the share in the storage account's Files Service. | float | | gauge |
| azure.storage_account.file_share_snapshot_size.avg | The amount of storage used by the snapshots in the storage account's File service in bytes. | float | byte | gauge |
| azure.storage_account.index_capacity.avg | The amount of storage used by ADLS Gen2 (Hierarchical) Index in bytes. | float | byte | gauge |
| azure.storage_account.ingress.total | The amount of ingress data, in bytes. | float | byte | gauge |
| azure.storage_account.queue_capacity.avg | The amount of storage used by the storage account's Queue service in bytes. | float | byte | gauge |
| azure.storage_account.queue_count.avg | The number of queues in the storage account's Queue service. | float | | gauge |
| azure.storage_account.queue_message_count.avg | The approximate number of queue messages in the storage account's Queue service. | float | | gauge |
| azure.storage_account.success_e2elatency.avg | The end-to-end latency of successful requests made to a storage service or the specified API operation, in milliseconds. | float | ms | gauge |
| azure.storage_account.success_server_latency.avg | The latency used by Azure Storage to process a successful request, in milliseconds. | float | ms | gauge |
| azure.storage_account.table_capacity.avg | The amount of storage used by the storage account's Table service in bytes. | float | byte | gauge |
| azure.storage_account.table_count.avg | The number of tables in the storage account's Table service. | float | | gauge |
| azure.storage_account.table_entity_count.avg | The number of table entities in the storage account's Table service. | float | | gauge |
| azure.storage_account.transactions.total | The number of requests made to a storage service or the specified API operation. | float | | gauge |
| azure.storage_account.used_capacity.avg | Account used capacity | float | byte | gauge |
| azure.subscription_id | The subscription ID | keyword | | |
| azure.timegrain | The Azure metric timegrain | keyword | | |
| cloud.account.id | The cloud account or organization id used to identify different entities in a multi-tenant environment. Examples: AWS account id, Google Cloud ORG Id, or other unique identifier. | keyword | | |
| cloud.availability_zone | Availability zone in which this host is running. | keyword | | |
| cloud.image.id | Image ID for the cloud instance. | keyword | | |
| cloud.instance.id | Instance ID of the host machine. | keyword | | |
| cloud.instance.name | Instance name of the host machine. | keyword | | |
| cloud.machine.type | Machine type of the host machine. | keyword | | |
| cloud.project.id | Name of the project in Google Cloud. | keyword | | |
| cloud.provider | Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean. | keyword | | |
| cloud.region | Region in which this host is running. | keyword | | |
| container.id | Unique container id. | keyword | | |
| container.image.name | Name of the image the container was built on. | keyword | | |
| container.labels | Image labels. | object | | |
| container.name | Container name. | keyword | | |
| container.runtime | Runtime managing this container. | keyword | | |
| data_stream.dataset | Data stream dataset name. | constant_keyword | | |
| data_stream.namespace | Data stream namespace. | constant_keyword | | |
| data_stream.type | Data stream type. | constant_keyword | | |
| dataset.name | Dataset name. | constant_keyword | | |
| dataset.namespace | Dataset namespace. | constant_keyword | | |
| dataset.type | Dataset type. | constant_keyword | | |
| ecs.version | ECS version this event conforms to. `ecs.version` is a required field and must exist in all events. When querying across multiple indices -- which may conform to slightly different ECS versions -- this field lets integrations adjust to the schema version of the events. | keyword | | |
| host | A host is defined as a general computing instance. ECS host.\* fields should be populated with details about the host on which the event happened, or from which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes. | group | | |
| host.architecture | Operating system architecture. | keyword | | |
| host.containerized | If the host is a container. | boolean | | |
| host.domain | Name of the domain of which the host is a member. For example, on Windows this could be the host's Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host's LDAP provider. | keyword | | |
| host.hostname | Hostname of the host. It normally contains what the `hostname` command returns on the host machine. | keyword | | |
| host.id | Unique host id. As hostname is not always unique, use values that are meaningful in your environment. Example: The current usage of `beat.name`. | keyword | | |
| host.ip | Host ip addresses. | ip | | |
| host.mac | Host mac addresses. | keyword | | |
| host.name | Name of the host. It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use. | keyword | | |
| host.os.build | OS build information. | keyword | | |
| host.os.codename | OS codename, if any. | keyword | | |
| host.os.family | OS family (such as redhat, debian, freebsd, windows). | keyword | | |
| host.os.kernel | Operating system kernel version as a raw string. | keyword | | |
| host.os.name | Operating system name, without the version. | keyword | | |
| host.os.name.text | Multi-field of `host.os.name`. | text | | |
| host.os.platform | Operating system platform (such as centos, ubuntu, windows). | keyword | | |
| host.os.version | Operating system version as a raw string. | keyword | | |
| host.type | Type of host. For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment. | keyword | | |
| service.address | Service address | keyword | | |
| service.type | The type of the service data is collected from. The type can be used to group and correlate logs and metrics from one service type. Example: If logs or metrics are collected from Elasticsearch, `service.type` would be `elasticsearch`. | keyword | | |
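
The two extra columns in the table above correspond to the `unit` and `metric_type` keys of each field definition. As a rough illustration of that correspondence (not the tooling the repository actually uses to produce the README), a hypothetical `render_row` helper could build one row from a field dictionary:

```python
def render_row(field: dict) -> str:
    """Render one field definition as a row of the five-column table.

    Missing keys become empty cells, written as "| |" to match the
    spacing used in the table above.
    """
    cells = [
        field["name"],
        field.get("description", ""),
        field.get("type", ""),
        field.get("unit", ""),
        field.get("metric_type", ""),
    ]
    return "|" + "".join(f" {cell} |" if cell else " |" for cell in cells)


# Field values copied from the storage_account definitions in this PR.
row = render_row({
    "name": "azure.storage_account.egress.total",
    "description": "The amount of egress data, in bytes.",
    "type": "float",
    "unit": "byte",
    "metric_type": "gauge",
})
print(row)
```

Fields without a unit or metric type, such as the `keyword` fields, simply leave the last two cells empty.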


`container_instance`