Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: redundant casting to SecretStr API keys #14446

Closed
wants to merge 20 commits into from

Conversation

rancomp
Copy link
Contributor

@rancomp rancomp commented Dec 8, 2023

  • Description: This PR serves as standardization of the masking and testing.
  • Some API keys were not really masked when stored in the values dict. They should be masked and cast back to string only during the POST request. I fixed 1 occurance of this.
  • Some tests were not covering this flaw because they were repeating the back-and-forth pattern of casting to SecretStr and invoking .get_secret_value() immediately. This pattern is confusing, and in order to clear this out I added an extract_secret_str : SecretStr | Str | None -> str method.
  • The code-base was filled with secretstr, secret_str, secret_string names. I replaced all of them with secretstr.
  • Issue: fixes Issue: Flawed implementations of SecretStr for API keys #14445, related For New Contributors: Use SecretStr for api_keys #12165
  • Dependencies: None
  • Tag maintainer: @eyurtsev @hwchase17

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. Ɑ: models Related to LLMs or chat model modules 🤖:improvement Medium size change to existing code to handle new use-cases labels Dec 8, 2023
Copy link

vercel bot commented Dec 8, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Jan 18, 2024 8:40am

@rancomp rancomp changed the title remove back and forth casting and correct some tests Fix: correct flawed implementations of SecretStr masking of API keys Dec 8, 2023
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Dec 8, 2023
@rancomp rancomp changed the title Fix: correct flawed implementations of SecretStr masking of API keys Fix: correct redundant casting to SecretStr API keys Dec 8, 2023
Copy link
Member

@efriis efriis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appreciate it @rancomp ! I'll actually recommend picking this up again next week. We're in the middle of a big package organizing effort (context here) that may cause merge conflicts with this that will have to be resolved later next week.

Great change though! Just going to need a few more days to merge it in :)

@efriis efriis self-assigned this Dec 9, 2023
@rancomp
Copy link
Contributor Author

rancomp commented Dec 9, 2023

Hey @efriis ! Thanks, I was actually not aware of that big move. I'll revisit this PR after the merge is complete.

@rancomp rancomp changed the title Fix: correct redundant casting to SecretStr API keys Fix: redundant casting to SecretStr API keys Dec 9, 2023
@efriis
Copy link
Member

efriis commented Dec 12, 2023

As you can see from the merge conflict list, it's in 😅

Feel free to ping when this PR is updated!

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Dec 13, 2023
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Dec 13, 2023
@rancomp
Copy link
Contributor Author

rancomp commented Dec 14, 2023

Hey @efriis , can you check it out now? There's another thing I wanted to do which is clean up the tests, but I think it's better to separate it to another PR.

@hwchase17
Copy link
Contributor

is it possible to keep the name of the function as-is in master for this pr? two reasons:

  1. makes it tough to see what logic is actually changing
  2. with that change in naming, is not backwards compatible, so we'd need to make sure to bump versions across langchain/langchain-community/langchain-core at the same time which is tricky

also, in general trying to NOT make backwards incompatible changes to langchain-core, which this is

@rancomp
Copy link
Contributor Author

rancomp commented Dec 22, 2023

is it possible to keep the name of the function as-is in master for this pr? two reasons:

  1. makes it tough to see what logic is actually changing
  2. with that change in naming, is not backwards compatible, so we'd need to make sure to bump versions across langchain/langchain-community/langchain-core at the same time which is tricky

also, in general trying to NOT make backwards incompatible changes to langchain-core, which this is

hey. yeah definitely agree with you.

@rancomp
Copy link
Contributor Author

rancomp commented Jan 8, 2024

@hwchase17 @efriis I have a small change I'm proposing within langchain-core over the convert_to_secret_str method. The change I'm proposing allows this to accept a None and treat it as an empty string "". This makes it 1) consistent with the behavior of the SecretStr type from pydantic, 2) more compatible with the api_key types, which are usually typed as Optional[str], and defaulting to None.

@rancomp
Copy link
Contributor Author

rancomp commented Jan 18, 2024

@efriis this is a cleanup / standardization. Do you think it's still relevant?

@efriis
Copy link
Member

efriis commented Jan 19, 2024

Ah I think this actually bypasses an important part of type checking, where some classes may want to require that the passed in string is not None.

If you want to fix the case where it wasn't actually storing a masked SecretStr separately though, that would be great!

@efriis efriis closed this Jan 19, 2024
@rancomp
Copy link
Contributor Author

rancomp commented Jan 19, 2024

Ah I think this actually bypasses an important part of type checking, where some classes may want to require that the passed in string is not None.

If you want to fix the case where it wasn't actually storing a masked SecretStr separately though, that would be great!

I think there's confusion here. This PR:

  • Still allows for the API-keys to be None
  • Changes the way the SecretStr is extracted: Instead of casting None to SecretStr and then extracting the secret, the secret is extracted for Union[SecretStr, str, None] type all together through a single function.
  • I made sure that all the classes use the convert_to_secretstr function (for standardization).

What I can also change is that the API-key type hint is changed to Union[SecretStr, str, None] instead of Optional[str] or Optional[SecretStr].

What I can do is allow for the API-keys to be passed as SecretStr and not just str.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🤖:improvement Medium size change to existing code to handle new use-cases Ɑ: models Related to LLMs or chat model modules size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issue: Flawed implementations of SecretStr for API keys
3 participants