Guidance on product/project name inside attribute/metric name #608

Open
lmolkova opened this issue Dec 13, 2023 · 10 comments

@lmolkova
Contributor

lmolkova commented Dec 13, 2023

We should provide guidance on whether a product/project name should be fully qualified, and we should keep the same pattern across different semconvs.
Examples of inconsistencies:

  • db.cosmosdb (Azure), db.dynamodb (AWS), db.couchdb (under the Apache umbrella), db.spanner (GCP), db2 (IBM), hanadb (SAP HANA), etc., and the corresponding values in the db.system enum
  • we also have aws_sqs, gcp_pubsub, azure_servicebus in the messaging.system enum

We should provide guidance on how to represent multiple words:

We should require consistency across signals:

  • Redis can be used as messaging system and should have exactly the same representation in db.system and messaging.system
  • If Azure Service Bus is instrumented on the service side to report telemetry to end users, it should use the same value in resource attributes (Define cloud.platform and/or rename it #609)
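A cross-signal consistency requirement like this could be checked mechanically. Below is a minimal sketch, assuming the enum values are available as plain dictionaries; the function name `inconsistent_products`, the product labels, and the `azure.service_bus` resource value are all made up for illustration and are not taken from semconv.

```python
from collections import defaultdict


def inconsistent_products(enums):
    """Return products whose identifier differs between enums.

    `enums` maps an enum/attribute name to {product label: identifier}.
    """
    seen = defaultdict(dict)
    for enum_name, members in enums.items():
        for product, identifier in members.items():
            seen[product][enum_name] = identifier
    # Keep only products that appear with more than one distinct identifier.
    return {
        product: by_enum
        for product, by_enum in seen.items()
        if len(set(by_enum.values())) > 1
    }


# Hypothetical input: the resource value for Azure Service Bus is invented
# here purely to demonstrate a mismatch.
enums = {
    "db.system": {"Redis": "redis"},
    "messaging.system": {"Redis": "redis", "Azure Service Bus": "azure_servicebus"},
    "resource": {"Azure Service Bus": "azure.service_bus"},
}

for product, by_enum in inconsistent_products(enums).items():
    print(product, by_enum)
```

A check of this shape only compares spellings; deciding which spelling is the right one is still the naming question this issue is about.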

Misc discrepancies:

  • mssqlcompact (Microsoft SQL Server Compact) should probably become mssql_compact

[Update]
Other attributes that have the same problem:

@alanwest
Member

alanwest commented Feb 9, 2024

With regards to representing multiple words, can a product's branding provide a guide? For example, use dynamodb, couchdb, and cosmos_db because they are respectively branded DynamoDB, CouchDB, and Cosmos DB.
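One possible reading of this suggestion, sketched under the assumption that the official brand name is the only input and that word boundaries are exactly the spaces in that brand name. The helper name `brand_to_identifier` is hypothetical.

```python
def brand_to_identifier(brand: str) -> str:
    """Lowercase a brand name and join its space-separated words with underscores."""
    # str.split() with no arguments splits on any run of whitespace.
    return "_".join(word.lower() for word in brand.split())


for brand in ("DynamoDB", "CouchDB", "Cosmos DB"):
    print(brand, "->", brand_to_identifier(brand))
```

Camel-case humps inside a single word (DynamoDB, CouchDB) are deliberately not split, which is what keeps dynamodb and couchdb as one word while Cosmos DB becomes cosmos_db.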

@trask
Member

trask commented Feb 9, 2024

It would be nice if whatever the enum is, e.g. cosmosdb, that is also the namespace, e.g. db.cosmosdb.*, for product-specific attributes
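A rough sketch of how that enum-to-namespace alignment could be linted, assuming attribute names are dot-separated strings. The function `orphan_namespaces`, the `example_attribute` names, and the `generic` allowlist are hypothetical and only serve the illustration.

```python
def orphan_namespaces(system_values, attribute_names, root="db", generic=frozenset()):
    """List product namespaces under `root` that have no matching enum value.

    `generic` holds non-product namespaces (e.g. "operation") to ignore.
    """
    known = set(system_values) | set(generic)
    orphans = set()
    for name in attribute_names:
        parts = name.split(".")
        # Only names shaped like root.<namespace>.<rest> carry a namespace.
        if len(parts) >= 3 and parts[0] == root and parts[1] not in known:
            orphans.add(parts[1])
    return sorted(orphans)


# Hypothetical data: enum value "hanadb" but attributes under "db.hana.*".
system_values = ["cosmosdb", "cassandra", "hanadb"]
attribute_names = [
    "db.cosmosdb.example_attribute",
    "db.cassandra.example_attribute",
    "db.hana.example_attribute",
    "db.operation.name",
]

print(orphan_namespaces(system_values, attribute_names, generic={"operation"}))
```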

@trask
Member

trask commented Feb 9, 2024

do we think we need mssqlcompact? that doesn't seem like something we'd necessarily know from the client side, and on the server side it could potentially be a resource attribute describing the "edition"

@KalleOlaviNiemitalo

SQL Server Compact runs in-process rather than as a network service, so yes, the application should know it's using that.
It's no longer supported by Microsoft, but open-telemetry/opentelemetry-specification#3105 shows it's still used.

trask moved this to Post Stability in Database Client Semantic Conventions on Apr 24, 2024
lmolkova moved this from V1 - Stable Semantics to Post-stability in Spec: Messaging Semantics on Jun 20, 2024
@lmolkova
Contributor Author

Based on the messaging SIG discussion on 6/13, none of the controversial systems are part of the initial stability (kafka + RabbitMQ), so removing the stability blocker label.

@lmolkova
Contributor Author

lmolkova commented Jan 10, 2025

Been discussing it in the scope of #1734. There are competing consistency goals when it comes to project names. Let's explore them:

1. Stay consistent with external project/product/system/etc name whenever possible

The guidance would be:

  • Use a registered trademark (wordmark) or another 'official' name
  • When it's ambiguous (e.g. caché), it needs to be disambiguated, for example, prefixed with a company name (intersystems_cache)

Non-controversial examples:

  • mongodb
  • postgresql
  • cassandra
  • ibm.mq
  • oracle.db (oracledb, oracle.database and other possible variations)

Controversial examples

  • informix - it's a product that was acquired by IBM but had a history before that, has a unique name, and is registered as a trademark. The controversy is that it coexists with ibm.mq
  • sap.hana (trademark on SAP HANA) and maxdb (trademark on MaxDB)
  • cloud_spanner and gcp.pubsub - the former is a trademark, the latter is ambiguous and has to be qualified
  • s3 - it's a trademark. Controversy is that we use aws.s3 today and have a root aws namespace for cross-AWS attributes.

2. Stay consistent within semantic conventions

The guidance would be:

Product name should be qualified with a company/division name with the following exceptions:

  • company and product name are the same (or similar).
  • OSS/community-driven projects that don't belong to a company

Non-controversial examples:

  • mongodb - company name is the same as product name
  • elasticsearch - elastic is already part of elasticsearch, let's use common sense
  • cassandra - apache project
  • ibm.mq

Controversial examples

  • ibm.informix - informix was a product before IBM acquired it

Obviously wrong examples:

  • oracle.mysql - TIL that MySQL belongs to Oracle.
  • broadcom.spring - Spring deserves a root namespace.

I personally prioritize consistency within semantic conventions higher than strict consistency with a trademark. Given that products get acquired/renamed/evolve, we won't be able to stay fully consistent.

I'd prefer option 1.7 (bullets are listed in descending order of priority):

  1. Avoid ambiguity - always qualify ambiguous names (gcp.pubsub, ibm.mq, oracle.db - never pubsub, mq, oracle)
  2. Use well-recognized and unique project names as is (spring, mysql, mssql, postgresql) regardless of company affiliations
  3. Qualify product name with the company/division/etc in other cases. Qualify cloud services with the cloud provider name.
  4. When defining system name/attribute for a product, check if there are system names/attributes for other products from this company. Follow the same pattern.

There are plenty of edge cases and we'd need to use our judgement on a case-by-case basis. E.g. is informix a well-recognized product? Then it should fall under p2 and be informix, otherwise it falls under p3 and becomes ibm.informix.
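The first three priorities can be sketched as a small decision function. This is only an illustration under stated assumptions: the `Product` type and its `ambiguous` / `well_recognized` flags are hypothetical inputs that a human would still have to decide, and p4 (follow the pattern of sibling products from the same company) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str                      # short product name, e.g. "pubsub"
    owner: str                     # company / cloud provider prefix, e.g. "gcp"
    ambiguous: bool = False        # judgement call: is the bare name ambiguous?
    well_recognized: bool = False  # judgement call: unique, well-known name?


def choose_name(product: Product) -> str:
    # p1: always qualify ambiguous names.
    if product.ambiguous:
        return f"{product.owner}.{product.name}"
    # p2: well-recognized, unique names are used as is.
    if product.well_recognized:
        return product.name
    # p3: everything else is qualified with the company/provider name.
    return f"{product.owner}.{product.name}"


print(choose_name(Product("pubsub", "gcp", ambiguous=True)))
print(choose_name(Product("mysql", "oracle", well_recognized=True)))
print(choose_name(Product("informix", "ibm")))
print(choose_name(Product("informix", "ibm", well_recognized=True)))
```

The two informix calls show the edge case above: the outcome flips between ibm.informix and informix depending solely on the `well_recognized` judgement.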

@trask
Member

trask commented Jan 11, 2025

"is Xyz a well-recognized product"

this is a tough thing to decide. do you think it's possible to only use "avoid ambiguity", so that as long as a name is not ambiguous (maybe relying on ownership of the name under a common TLD, or a high Google search ranking?), we'd allow xyz.* to be a top-level namespace?

not sure how this would work with hive, geode, derby (apache projects) though...

@lmolkova
Contributor Author

lmolkova commented Jan 11, 2025

I'm thinking about azure.
Leaving az vs azure aside, I see value in having az.cosmosdb, az.servicebus, az.blob vs cosmosdb, servicebus, az.blob:

  • when typing query, users don't need to remember where to use az prefix
  • everything under az would be governed by azure if we had decentralized semconv
  • az.* things use az attributes (including common ones like az.namespace)

From this perspective, I don't see a difference between IBM DB2 that can be hosted anywhere and Azure CosmosDB that's a cloud service and can run only on Azure. So if we use az for the latter, why don't we use ibm for the former?

@trask
Member

trask commented Jan 12, 2025

I think we can justify mysql not being oracle.mysql given it has its own domain under a common TLD: https://mysql.com

@trask
Member

trask commented Jan 13, 2025

I think we can justify mysql not being oracle.mysql given it has its own domain under a common TLD: https://mysql.com

Similar justification for spring as top-level namespace due to it being hosted at https://spring.io
