fix(langchain): handle secret str api keys #7430

Merged: 10 commits, Nov 3, 2023

@albertjan albertjan commented Oct 31, 2023

Currently, the Anthropic chain implementation in langchain uses a pydantic SecretStr as its api key. This causes errors in our pipeline when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the OpenAI implementation will also start using a SecretStr. I'm sure at that point there will be a few more people asking why things are broken.

I'm struggling to set up and run the tests; riot doesn't print anything, and I have no experience with the cassette-based testing method. Can someone help with this? I think that if we add a test that uses the Anthropic LLM, we will see the failure without this change, and this change will fix it.

I've updated the type comment to the function, but the env doesn't know about Pydantic so I don't know if this is a valid thing to do.
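
For readers following along, here is a minimal sketch of the kind of handling described above. It is not the actual ddtrace code: `format_api_key` and `FakeSecretStr` are hypothetical names, the stand-in class only mimics the `get_secret_value()` method that pydantic's SecretStr provides, and the "last four characters" output format is an assumption made for illustration.

```python
from typing import Any


class FakeSecretStr:
    """Hypothetical stand-in for pydantic's SecretStr so the sketch runs without pydantic."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __str__(self) -> str:
        # Like SecretStr, hide the value when converted to a string
        return "**********"


def format_api_key(api_key: Any) -> str:
    """Return an obfuscated key that exposes only the last four characters (assumed format)."""
    # Duck-type on get_secret_value() so the tracer does not need to import pydantic
    if hasattr(api_key, "get_secret_value"):
        api_key = api_key.get_secret_value()
    # Empty or very short keys are not worth tagging
    if not api_key or len(api_key) < 4:
        return ""
    return "...%s" % api_key[-4:]


print(format_api_key("sk-plain-1234"))                  # ...1234
print(format_api_key(FakeSecretStr("sk-secret-abcd")))  # ...abcd
```

Checking for the method rather than the type keeps the formatter usable in an environment that does not have pydantic installed, which is the concern raised about the type comment.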

Checklist

  • Change(s) are motivated and described in the PR description.
  • Testing strategy is described if automated tests are not included in the PR.
  • Risk is outlined (performance impact, potential for breakage, maintainability, etc).
  • Change is maintainable (easy to change, telemetry, documentation).
  • Library release note guidelines are followed. If no release note is required, add label changelog/no-changelog.
  • Documentation is included (in-code, generated user docs, public corp docs).
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Title is accurate.
  • No unnecessary changes are introduced.
  • Description motivates each change.
  • Avoids breaking API changes unless absolutely necessary.
  • Testing strategy adequately addresses listed risk(s).
  • Change is maintainable (easy to change, telemetry, documentation).
  • Release note makes sense to a user of the library.
  • Reviewer has explicitly acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment.
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy
  • If this PR touches code that signs or publishes builds or packages, or handles credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
  • This PR doesn't touch any of that.

Yun-Kim commented Oct 31, 2023

Hi @albertjan, thanks for your contribution! I can take over testing and formatting and getting this merged.

@Yun-Kim Yun-Kim changed the title from "fix(langchain): Correctly handle SecretStr api_keys while formatting" to "fix(langchain): handle secret str api keys" Oct 31, 2023

Yun-Kim commented Oct 31, 2023

The existing test cases should be sufficient since we continue to test the latest versions of langchain. We'll be receiving SecretStr api keys (we already are for AI21) as langchain continues to make these api key changes. Thanks again @albertjan for your contribution!

@Yun-Kim Yun-Kim enabled auto-merge (squash) October 31, 2023 19:37

albertjan commented Oct 31, 2023

You're welcome @Yun-Kim ☺️.

> The existing test cases should be sufficient since we continue to test latest versions of langchain

A test using the ChatAnthropic would already exercise the SecretStr bit of the _format_api_key function. But yeah the tests will eventually exercise it. 👍

@majorgreys majorgreys left a comment

Do the existing tests cover the fixed case? I don't see any change to a span tag check.

@ZStriker19

Agreeing with Tahir, can we add a test for `span.set_tag_str(API_KEY, _format_api_key(api_key))  # override api_key for Pinecone` to verify the tag is changed as we expect?

Yun-Kim commented Nov 2, 2023

@majorgreys @ZStriker19 the existing test cases cover this fixed case. langchain is in the process of switching over to SecretStr api keys and has already been using SecretStr keys for the AI21 provider, which we already have a test case for.

> can we add a test for `span.set_tag_str(API_KEY, _format_api_key(api_key))  # override api_key for Pinecone` to verify the tag is changed as we expect?

That change is just a refactor, no functionality change as we're not getting that information through langchain.
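
As an illustration of the span tag check the reviewers asked about, a test could look roughly like the sketch below. Everything here is hypothetical: `StubSpan` and `StubSecret` are stand-ins written for this example, `obfuscate` is a placeholder for the real formatter, and the tag name stored in `API_KEY` is an assumption rather than a value taken from this PR.

```python
API_KEY = "langchain.request.api_key"  # assumed tag name, for illustration only


class StubSpan:
    """Hypothetical minimal span that only records string tags."""

    def __init__(self):
        self.tags = {}

    def set_tag_str(self, key, value):
        self.tags[key] = value


class StubSecret:
    """Hypothetical object exposing get_secret_value(), like pydantic's SecretStr."""

    def __init__(self, value):
        self._value = value

    def get_secret_value(self):
        return self._value


def obfuscate(api_key):
    """Placeholder formatter: unwrap secret-like objects, keep the last four characters."""
    if hasattr(api_key, "get_secret_value"):
        api_key = api_key.get_secret_value()
    if not api_key or len(api_key) < 4:
        return ""
    return "..." + api_key[-4:]


def test_secret_api_key_is_tagged_obfuscated():
    span = StubSpan()
    span.set_tag_str(API_KEY, obfuscate(StubSecret("key-0000-wxyz")))
    # Only the obfuscated suffix should end up on the span
    assert span.tags[API_KEY] == "...wxyz"
    assert "key-0000" not in span.tags[API_KEY]


test_secret_api_key_is_tagged_obfuscated()
```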

@Yun-Kim Yun-Kim closed this Nov 2, 2023
auto-merge was automatically disabled November 2, 2023 18:07

Pull request was closed

@Yun-Kim Yun-Kim reopened this Nov 2, 2023
@Yun-Kim Yun-Kim enabled auto-merge (squash) November 2, 2023 18:08
@Yun-Kim Yun-Kim requested a review from mabdinur November 3, 2023 15:08
@Yun-Kim Yun-Kim merged commit 6dc61f5 into DataDog:2.x Nov 3, 2023
github-actions bot pushed a commit that referenced this pull request Nov 3, 2023
Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 6dc61f5)
@albertjan

Thanks all! 😄

@albertjan albertjan deleted the fix/handling_of_secret_str_api_keys branch November 4, 2023 07:59
mabdinur pushed a commit that referenced this pull request Nov 6, 2023
Backport 6dc61f5 from #7430 to 1.20.

Co-authored-by: Albert-Jan Nijburg <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
Yun-Kim added a commit that referenced this pull request Nov 6, 2023
Backport 6dc61f5 from #7430 to 2.1.

Co-authored-by: Albert-Jan Nijburg <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
majorgreys pushed a commit that referenced this pull request Nov 6, 2023
Backport 6dc61f5 from #7430 to 2.0.

Co-authored-by: Albert-Jan Nijburg <[email protected]>
Co-authored-by: Yun Kim <[email protected]>