Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mask API Key for OpenAI based ChatModels (OpenAI,AzureOpenAi,Konko) #12542

Closed

Conversation

onesolpark
Copy link
Contributor

  • Description: Added API Key masking for all OpenAI based ChatModels (OpenAI, Azure OpenAI, Konko)
    • Updated OpenAI Chat model to use SecretStr to mask sensitive API Key. (Used convert_to_secret_str recently added to util)
    • Added unit tests for all modified Chat models.
  • Issue: For New Contributors: Use SecretStr for api_keys #12165
  • Dependencies: None
  • Tag maintainer: @eyurtsev

Konko and OpenAI both implemented OpenAI, so needed to change all three to avoid errors.

@vercel
Copy link

vercel bot commented Oct 30, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Nov 1, 2023 3:25pm

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:improvement Medium size change to existing code to handle new use-cases labels Oct 30, 2023
@baskaryan baskaryan assigned eyurtsev and unassigned eyurtsev Oct 30, 2023
@baskaryan baskaryan added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Oct 30, 2023
@onesolpark onesolpark force-pushed the azureopenai-secretstr branch from 73d2f91 to 75e6434 Compare October 31, 2023 06:04
@onesolpark
Copy link
Contributor Author

Rebased and updated poetry.lock with changes to content-hash
@baskaryan Can you please another look at this?

@onesolpark onesolpark closed this Nov 1, 2023
@onesolpark onesolpark force-pushed the azureopenai-secretstr branch from 977aa26 to f0eba1a Compare November 1, 2023 01:11
@onesolpark onesolpark reopened this Nov 1, 2023
@onesolpark
Copy link
Contributor Author

@eyurtsev pinging for your help on workflow and merge :)

@eyurtsev eyurtsev self-assigned this Nov 1, 2023
@eyurtsev eyurtsev self-requested a review November 1, 2023 14:22
Yun-Kim added a commit to DataDog/dd-trace-py that referenced this pull request Nov 3, 2023
Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Yun Kim <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
github-actions bot pushed a commit to DataDog/dd-trace-py that referenced this pull request Nov 3, 2023
Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Yun Kim <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 6dc61f5)
github-actions bot pushed a commit to DataDog/dd-trace-py that referenced this pull request Nov 3, 2023
Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Yun Kim <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 6dc61f5)
github-actions bot pushed a commit to DataDog/dd-trace-py that referenced this pull request Nov 3, 2023
Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Yun Kim <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 6dc61f5)
mabdinur pushed a commit to DataDog/dd-trace-py that referenced this pull request Nov 6, 2023
Backport 6dc61f5 from #7430 to 1.20.

Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Albert-Jan Nijburg <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
Yun-Kim added a commit to DataDog/dd-trace-py that referenced this pull request Nov 6, 2023
Backport 6dc61f5 from #7430 to 2.1.

Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Albert-Jan Nijburg <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
majorgreys pushed a commit to DataDog/dd-trace-py that referenced this pull request Nov 6, 2023
Backport 6dc61f5 from #7430 to 2.0.

Currently the anthropic chain implementation in langchain uses a
pydantic SecretStr as an api key this is causing errors in our pipeline
when ddtrace tries to format the api key.

With this PR: langchain-ai/langchain#12542 the
OpenAI implementation will also start using a SecretStr. I'm sure at
that point there will be a few more people asking why things are broken.

I'm struggling setting up and running the tests, riot doesn't print
anything. And I have no experience with the cassettes testing methods.
Can someone help with this? I think if we add a test that uses the
Anthropic LLM we will see the failure before. And this will fix it.

I've updated the type comment to the function, but the env doesn't know
about Pydantic so I don't know if this is a valid thing to do.

## Checklist

- [X] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [X] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [X] Change is maintainable (easy to change, telemetry, documentation).
- [X] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [X] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [x] Title is accurate.
- [x] No unnecessary changes are introduced.
- [x] Description motivates each change.
- [x] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [x] Testing strategy adequately addresses listed risk(s).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] Release note makes sense to a user of the library.
- [x] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [x] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [x] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [x] This PR doesn't touch any of that.

---------

Co-authored-by: Albert-Jan Nijburg <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
@onesolpark
Copy link
Contributor Author

@eyurtsev lots of changes happened and I am trying to change accordingly and asking your opinion.

  1. Konko no longer implements ChatOpenAI model Bagatur/oai v1 scratch #12948 (seems accidental as you mentioned)
    => Should I remove changes to Konko in this PR or just keep them.
  2. AzureChatOpenAI derives from ChatOpenAI chat-model and needs both of them to get fixed to work correctly but ChatOpenAI seems like to be assigned to someone else.
    => Is it okay to fix both of them in this PR?

Also this PR has a lot of commits that are predated, do you want me to create a new PR with the changes?

@hwchase17
Copy link
Contributor

i would likely suggest close and open new ones, probably one for each model

thanks (and sorry!)

@hwchase17 hwchase17 closed this Nov 30, 2023
baskaryan pushed a commit that referenced this pull request May 1, 2024
**Description:** Add tests to check API keys and Active Directory tokens
are masked
**Issue:** Resolves #12165 for OpenAI and Azure OpenAI models
**Dependencies:** None

Also resolves #12473 which may be closed.

Additional contributors @alex4321 (#12473) and @onesolpark (#12542)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:improvement Medium size change to existing code to handle new use-cases lgtm PR looks good. Use to confirm that a PR is ready for merging. Ɑ: models Related to LLMs or chat model modules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants