Describe the need of your request

It would be great to have functionality that allows users to anonymize parts of the data (source files) sent to AI providers.
Companies don't want to send their code to a third party (even if contractual protections are in place), so it is important for them that the code cannot be associated with them.
It would also be useful to let users save the sent body content in a local log file so it can be analyzed by a security team.
Proposed solution
For the anonymization part, it could be done by adding a configuration table to the plugin settings.
The table contains two columns:
- one with a regexp to identify the pattern to anonymize, and a
- second with the action to apply (shuffle, random, specific text) — see the sketch below.
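Here is a minimal sketch of what such a rule table and its application could look like; `MaskRule`, `MaskAction`, and `applyRules` are illustrative names, not anything that exists in the plugin today:

```kotlin
// One row of the proposed settings table: a regexp plus a masking action.
enum class MaskAction { SHUFFLE, RANDOM, SPECIFIC_TEXT }

data class MaskRule(
    val pattern: Regex,
    val action: MaskAction,
    val replacement: String = "" // only used when action == SPECIFIC_TEXT
)

// Apply every configured rule to the outgoing text, in table order.
fun applyRules(input: String, rules: List<MaskRule>): String =
    rules.fold(input) { text, rule ->
        rule.pattern.replace(text) { match ->
            when (rule.action) {
                MaskAction.SHUFFLE -> match.value.toList().shuffled().joinToString("")
                MaskAction.RANDOM -> List(match.value.length) { ('a'..'z').random() }.joinToString("")
                MaskAction.SPECIFIC_TEXT -> rule.replacement
            }
        }
    }
```

A row pairing `Regex("""com\.mycompany\.[\w.]+""")` with `SPECIFIC_TEXT` and `com.example.app` would, for example, rewrite internal package names before anything leaves the IDE.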
Here is the link to the code I have started (branch feat/anonymization)
Additional context

To simplify the implementation, it would be great to have a centralized method in CodeGPT that is responsible for calling the external AI provider. That way, we could do the anonymization and logging in a single place ;-)
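For illustration, such a central choke point might look roughly like this; `ProviderGateway` and the `transport` callback are assumptions, not CodeGPT's actual classes:

```kotlin
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardOpenOption

// Hypothetical single entry point for every outbound provider call, so that
// masking and logging happen in exactly one place.
class ProviderGateway(
    private val rules: List<MaskRule>, // from the settings table sketched above
    private val auditLog: Path?        // optional local log for the security team
) {
    // `transport` stands in for the provider-specific HTTP call.
    fun send(requestBody: String, transport: (String) -> String): String {
        val masked = applyRules(requestBody, rules)
        auditLog?.let {
            Files.writeString(
                it, masked + System.lineSeparator(),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND
            )
        }
        return transport(masked)
    }
}
```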
Something I have never thought about, thank you! This could definitely benefit the community members who use the extension within their company (especially those who work with proprietary data). However, this seems like a very use-case-specific feature that I unfortunately can't find the time to implement, but I'm happy to accept PRs.
A few notes though:
From a UI/UX perspective, in my opinion, 'Editor Anonymizations' is a bit misleading, and I would rather call it 'Data Masking' or something similar. This is because there are many things not related to the editor, such as git commit message generation or even basic chatting. Also, I think the masking and log path configuration deserve their own settings page: Tools | CodeGPT | Security, since this is something that most users probably don't find that useful.
Implementation-wise, in theory, we could mask the data after the request body is built (the same place where you added the logging) and work on the final string. However, there's always a risk of masking some important keys, which could break everything; this seems avoidable if you convert the final string back to a map and mask only the values.
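A sketch of that idea, assuming a kotlinx-serialization-json request body and the `applyRules` helper from the earlier sketch; it walks the parsed JSON and runs the rules over string values only, so keys like `model` or `role` stay intact:

```kotlin
import kotlinx.serialization.json.*

// Recursively mask only the values of the parsed request body.
fun maskValues(element: JsonElement, rules: List<MaskRule>): JsonElement = when (element) {
    is JsonObject -> JsonObject(element.mapValues { (_, v) -> maskValues(v, rules) })
    is JsonArray -> JsonArray(element.map { maskValues(it, rules) })
    is JsonPrimitive ->
        if (element.isString) JsonPrimitive(applyRules(element.content, rules)) else element
}

// Parse the final request string back into a tree, mask, and re-serialize.
fun maskRequestBody(body: String, rules: List<MaskRule>): String =
    maskValues(Json.parseToJsonElement(body), rules).toString()
```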
Yes, the objective is to anonymize the data when it is sent, and to de-anonymize the result before displaying the response in the text field.
The de-anonymization will be done by caching the transformations.
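A minimal sketch of what that transformation cache could look like; the class name and the `ANON_n` placeholder format are assumptions for illustration:

```kotlin
// Remembers every masked fragment so the provider's response can be restored
// before it is shown to the user.
class AnonymizationCache {
    private val replacements = LinkedHashMap<String, String>() // placeholder -> original
    private var counter = 0

    // Replace each match with a unique placeholder and record the mapping.
    fun anonymize(text: String, pattern: Regex): String =
        pattern.replace(text) { match ->
            val placeholder = "ANON_${counter++}"
            replacements[placeholder] = match.value
            placeholder
        }

    // Apply the reverse mapping to the response before display.
    fun deanonymize(response: String): String =
        replacements.entries.fold(response) { acc, (placeholder, original) ->
            acc.replace(placeholder, original)
        }
}
```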