Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can't get the connector given name on outputs? #65

Open
ismajl-ramadani opened this issue Nov 8, 2024 · 11 comments
Open

Can't get the connector given name on outputs? #65

ismajl-ramadani opened this issue Nov 8, 2024 · 11 comments
Assignees

Comments

@ismajl-ramadani
Copy link
Contributor

Im implementing a custom output plugin, and I found out that I can't get the connector provided name that I specified during the config.

If I use this connector config:

{
  "identity": "EC0FFEE1",
  "key": "xoxb-...",
  "connector": "slack_audit",
  "name": "Slack-Instance-A"
}

According to documentation:

To allow Grove to collect logs from all of these it supports multiple instances of the same connectors to be configured.

I should be able to add another instance for slack_audit, like so:

{
  "identity": "...",
  "key": "...-...",
  "connector": "slack_audit",
  "name": "Slack-Instance-B"
}

and that works perfectly fine, but then in the output[Submit] plugin I don't see a way how to get the name of the instance instead of the name of the connector.


What I can see from the implementation, is that the base connector sends the logs, and attaches the necessary arguments and metadata. The name part, as I see comes from the concrete implementation of the connector, but its static. The slack connector for example, is set up like following:

# The part where grove send the logs (from base connector)
self._output.submit(
  data=self._output.serialize(
      data=to_save,
      metadata=self.metadata(),
  ),
  part=self._part,
  operation=self.operation,
  connector=self.NAME,
  identity=self.identity,
  descriptor=descriptor,
)

# Slack Connector Implementation
...
class Connector(BaseConnector):
    NAME = "slack_audit_logs"
....

Also what I was able to observe, was the way the cache_key was built using a prefix, connectorName and the hash of identity... meaning that the provided name in the config was ignored completely.

Is this intentionally, or is there another way to get the exact name as provided in the configuration.

@hcpadkins
Copy link
Contributor

Hey there,

Thanks for raising this issue! Let me dig into this a little further and get back to you :)

@hcpadkins hcpadkins self-assigned this Nov 11, 2024
@hcpadkins
Copy link
Contributor

Sorry for the delay in response here!

Looking through the implementation, you are correct. The name field doesn't appear to be referenced anywhere, and is only used as a plain-text identifier in the connector configuration. This is likely an oversight from the initial implementation a number of years ago.

One challenge we have here is that if we were to swap this today, it would be a breaking change as a number of internals would change - such as caches key, and paths in output object stores. We can likely avoid updating the cache keys entirely, as those aren't exposed to the user in any way.

Two ways we could look to approach this:

  1. Expose a configuration flag.
    1. This would indicate to Grove that name should be used in place of identity when outputting logs.
    2. I don't feel great about this, but it would avoid an API break, while still allowing control over which should be used.
    3. This could cause some sharp edges for users, as there's no uniqueness checks on name.
  2. We break the API - requiring a major version change.
    1. In this case, we would add the name field to the method signature of the submit method of BaseOutput.
    2. This would allow this to be used in output modules, but would also require a breaking change.

Let me have a bit of a think about this, and discuss with a few others internally. In the interim, if you have any other suggestions for a fix, please let me know!

Thank you again for your help.

@ismajl-ramadani
Copy link
Contributor Author

Thanks for getting back.

I would like to participate on fixing this.

One hack/quick fix that will not require breaking change is to pass the name into the metadata as it is accessible with config.name..., but again it will create confusion. However I would suggest to implement it correctly and to go with the second approach, and I'm willing to do the implementation.

@hcpadkins
Copy link
Contributor

That sounds great! I'd be more than happy for you to raise a pull-request for this one, if you'd like to take it on, and we'll get it reviewed and merged.

There is a candidate v2.0 which will be raised soon as there are some other changes we are making to the core, but those won't conflict with these changes. However, we can likely look to get this into the same release. This would allow the breaking change to be released to PyPi without a subsequent major version bump.

@ismajl-ramadani
Copy link
Contributor Author

Amazing, will send a PR over the coming days.

@ismajl-ramadani
Copy link
Contributor Author

Hey @hcpadkins, I'm working on this branch #66 on an attempt to fix this issue.

I need some input from you though:

Do you think its a bad idea to change the self.NAME to self.CONNECTOR? That way it would be more concise when we refer to that term, and then we can use use the name as it is provided from configuration. If that doesn't make sense, can you please suggest any other approach on how to fix this?

@hcpadkins
Copy link
Contributor

Hey @ismajl-ramadani,

Thanks for submitting this one! I've had a quick look and I can't see any reason why this should be a problem. However, I'll run a few more tests over the next couple of hours and get back to you.

@hcpadkins
Copy link
Contributor

Thanks for your patience, this should be fine! I've double checked, and you've updated all of the references that I can find related to the old connector NAME field - including the documentation.

@ismajl-ramadani
Copy link
Contributor Author

Thats amazing, let me just double check the signature method for outputs and then I will submit the PR.

@ismajl-ramadani
Copy link
Contributor Author

I opened the PR now, feel free to let me know if I need to change anything.

@hcpadkins
Copy link
Contributor

Hey @ismajl-ramadani,

I've just merged this, so the changes should be in main shortly. I'll get this integrated into the v2.0.0 version when that's ready to go, but in the mean time you should be able to install from the commit ref in the main repository :)

Thank you again for raising this, and for resolving it as well!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants