Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ECO][Inventory v2] APM changes #202299

Closed
cauemarcondes opened this issue Nov 29, 2024 · 11 comments · Fixed by #202497
Closed

[ECO][Inventory v2] APM changes #202299

cauemarcondes opened this issue Nov 29, 2024 · 11 comments · Fixed by #202497
Assignees
Labels
Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team

Comments

@cauemarcondes
Copy link
Contributor

cauemarcondes commented Nov 29, 2024

We're moving away from materialized entities, meaning we won't have the entity-*-latest index anymore. Due to this, these changes need to be made:

Warning

We should NOT call the EEM rest APIs directly, we should use the services exposed on the EEM client plugin.

  • APM fetches the datastream_type from a given service name on the entities latest index. We'll need to update it to call the new API instead:
    API: GET /internal/apm/entities/services/{serviceName}/summary
POST kbn:/internal/entities/v2/_search
{
    "type": "service",
    "limit": 1,
    "metadata_fields": ["data_stream.type"],
    "filters": ["service.name == \"service-9\"", "service.environment == \"Synthtrace: other_bucket_group development\""]
}
  • Delete GET /internal/apm/entities/services.
  • /link-to/entity/{serviceName} route:
    This link is used to link service names shown on logs explorer and discover, It also uses the summary API GET /internal/apm/entities/services/{serviceName}/summary, and needs to be tested.
@cauemarcondes cauemarcondes added the Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team label Nov 29, 2024
@elasticmachine
Copy link
Contributor

Pinging @elastic/obs-ux-infra_services-team (Team:obs-ux-infra_services)

@cauemarcondes
Copy link
Contributor Author

@cauemarcondes cauemarcondes removed their assignment Dec 4, 2024
@cauemarcondes
Copy link
Contributor Author

I created a draft PR in case the person making these changes wants to continue what I've done.

@jennypavlova jennypavlova self-assigned this Dec 11, 2024
@jennypavlova
Copy link
Member

jennypavlova commented Dec 12, 2024

I was comparing the results of entityManagerClient.v2.searchEntities and getServiceEntitySummary I see some differences:

Before we were using the latestEntityServices query and now I am trying to replace it with entityManagerClient.v2.searchEntities
The agentName and environments are not present in the response:

getServiceEntitySummary entityManagerClient.v2.searchEntities
{
"serviceName": "synth-dotnet",
"agentName": "rum-js",
"lastSeenTimestamp": "2024-12-12T18:23:14.753Z",
"dataStreamTypes": [
"traces",
"logs"
],
"environments": [
[
"Synthtrace: logs_traces_hosts"
]
}
[
{
"data_stream.type": [
"logs",
"traces"
],
"entity.last_seen_timestamp": "2024-12-12T18:24:20.753Z",
"service.name": "synth-dotnet",
"entity.type": "built_in_services_from_ecs_data",
"entity.id": "synth-dotnet",
"entity.display_name": "synth-dotnet""
}
]

So regarding the serviceEntitySummary missing fields usage: the agent name is used here and I couldn't find environments usage so far.

cc: @miltonhultgren @klacabane Do you know how to get the agent name there and if we plan to add it in the response?

I did some updates in #202497 so both responses are now logged and present

@klacabane
Copy link
Contributor

klacabane commented Dec 12, 2024

@jennypavlova how does the searchEntities call looks like ? generally you need to explicitly ask for metadata fields in v2 APIs, so in that case it should be searchEntities({ ..., metadata_fields: ['data_stream.type', 'agent.name', 'service.environment'], ... }).

If you want some metadata fields to always be included when you search for entities, you'd need to set them on the source definition (here for services) but you'd have to consider the tradeoff (ie always aggregating a given field even if not used vs only aggregating when needed). If a field is required in all the places a given entity type is manipulated then it may be a good call to set it on the definition source

@klacabane
Copy link
Contributor

klacabane commented Dec 12, 2024

One additional note on metadata fields is that you may see different results between v1 and v2 as v1 used a terms aggregation to get n unique values but v2 uses a top aggregation that does not dedupe. For a given time range there's a good chance you won't get all the unique values for a field but only the top result sorted alphabetically. I'm looking into ways to improve this (see #204137)

@jennypavlova
Copy link
Member

jennypavlova commented Dec 13, 2024

generally you need to explicitly ask for metadata fields in v2 APIs, so in that case it should be searchEntities({ ..., metadata_fields: ['data_stream.type', 'agent.name', 'service.environment'], ... }).

@klacabane Good point, this works for the use case here!

If you want some metadata fields to always be included when you search for entities, you'd need to set them on the source definition (here for services) but you'd have to consider the tradeoff (ie always aggregating a given field even if not used vs only aggregating when needed).

I don't think we should add it for now - I am not sure if it will be used everywhere so I think we can add it once we confirm that.

One additional note on metadata fields is that you may see different results between v1 and v2 as v1 used a terms aggregation to get n unique values but v2 uses a top aggregation that does not dedupe. For a given time range there's a good chance you won't get all the unique values for a field but only the top result sorted alphabetically. I'm looking into ways to improve this

Thanks! Agent name is a string so I guess it should be fine there (it could be also an array from what I saw) and for the data stream type that would be useful :)

@jennypavlova
Copy link
Member

@rmyz found an issue with duplicate values in the metadata and enviroment and I checked and it's related to the EEM response - I used synthtrace to generate data and I got this response:

Image

I handled that on our side to allow only unique values but it might be something you should also take a look, @miltonhultgren , @klacabane

@klacabane
Copy link
Contributor

@jennypavlova this is caused by the TOP behavior we use on metadata aggregation which does not dedup values and eem only dedup values when merging entities from different sources. we've discussed this and TOP does not satisfy our requirements so we'll revert to VALUES which does the dedup

@jennypavlova
Copy link
Member

Thank you for checking that! I already added a logic to pick only unique values so it's not a blocker for the PR

jennypavlova added a commit that referenced this issue Dec 17, 2024
Closes [202299](#202299)

## Summary

This PR replaces the query in `getServiceEntitySummary` with the v2
function (`entityManagerClient.v2.searchEntities`)

## Testing
- Verify the response of the summary endpoint in the UI:



https://github.com/user-attachments/assets/ba895f7d-57c8-492b-81dd-cf7869ffbc86



- Dev tools query
  -  APM service
      #### Request:
      ```
GET
kbn:/internal/apm/entities/services/synth-node-0/summary?environment=ENVIRONMENT_ALL
      ```
      #### Response
      ```
      {
        "serviceName": "synth-node-0",
        "agentName": "nodejs",
        "lastSeenTimestamp": "2024-12-13T16:29:19.868Z",
        "dataStreamTypes": [
          "logs",
          "traces"
        ],
        "environments": [
          "Synthtrace: simple_trace"
        ]
      }
      ```

  -  Service from logs
      #### Request:
      ```
GET
kbn:/internal/apm/entities/services/synth-node/summary?environment=ENVIRONMENT_ALL
      ```
      #### Response: 
      ```
      {
        "serviceName": "synth-node",
        "agentName": "go",
        "lastSeenTimestamp": "2024-12-13T16:27:43.461Z",
        "dataStreamTypes": [
          "logs",
          "traces"
        ],
        "environments": [
          "Synthtrace: logs_traces_hosts"
        ]
      }
      ```

---------

Co-authored-by: Jenny <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Dec 17, 2024
Closes [202299](elastic#202299)

## Summary

This PR replaces the query in `getServiceEntitySummary` with the v2
function (`entityManagerClient.v2.searchEntities`)

## Testing
- Verify the response of the summary endpoint in the UI:

https://github.com/user-attachments/assets/ba895f7d-57c8-492b-81dd-cf7869ffbc86

- Dev tools query
  -  APM service
      #### Request:
      ```
GET
kbn:/internal/apm/entities/services/synth-node-0/summary?environment=ENVIRONMENT_ALL
      ```
      #### Response
      ```
      {
        "serviceName": "synth-node-0",
        "agentName": "nodejs",
        "lastSeenTimestamp": "2024-12-13T16:29:19.868Z",
        "dataStreamTypes": [
          "logs",
          "traces"
        ],
        "environments": [
          "Synthtrace: simple_trace"
        ]
      }
      ```

  -  Service from logs
      #### Request:
      ```
GET
kbn:/internal/apm/entities/services/synth-node/summary?environment=ENVIRONMENT_ALL
      ```
      #### Response:
      ```
      {
        "serviceName": "synth-node",
        "agentName": "go",
        "lastSeenTimestamp": "2024-12-13T16:27:43.461Z",
        "dataStreamTypes": [
          "logs",
          "traces"
        ],
        "environments": [
          "Synthtrace: logs_traces_hosts"
        ]
      }
      ```

---------

Co-authored-by: Jenny <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
(cherry picked from commit a66c139)
@klacabane
Copy link
Contributor

@jennypavlova fyi #204460 has been merged so the dedup logic is not needed anymore

JoseLuisGJ pushed a commit to JoseLuisGJ/kibana that referenced this issue Dec 19, 2024
Closes [202299](elastic#202299)

## Summary

This PR replaces the query in `getServiceEntitySummary` with the v2
function (`entityManagerClient.v2.searchEntities`)

## Testing
- Verify the response of the summary endpoint in the UI:



https://github.com/user-attachments/assets/ba895f7d-57c8-492b-81dd-cf7869ffbc86



- Dev tools query
  -  APM service
      #### Request:
      ```
GET
kbn:/internal/apm/entities/services/synth-node-0/summary?environment=ENVIRONMENT_ALL
      ```
      #### Response
      ```
      {
        "serviceName": "synth-node-0",
        "agentName": "nodejs",
        "lastSeenTimestamp": "2024-12-13T16:29:19.868Z",
        "dataStreamTypes": [
          "logs",
          "traces"
        ],
        "environments": [
          "Synthtrace: simple_trace"
        ]
      }
      ```

  -  Service from logs
      #### Request:
      ```
GET
kbn:/internal/apm/entities/services/synth-node/summary?environment=ENVIRONMENT_ALL
      ```
      #### Response: 
      ```
      {
        "serviceName": "synth-node",
        "agentName": "go",
        "lastSeenTimestamp": "2024-12-13T16:27:43.461Z",
        "dataStreamTypes": [
          "logs",
          "traces"
        ],
        "environments": [
          "Synthtrace: logs_traces_hosts"
        ]
      }
      ```

---------

Co-authored-by: Jenny <[email protected]>
Co-authored-by: kibanamachine <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Team:obs-ux-infra_services Observability Infrastructure & Services User Experience Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants