forked from ZachNagengast/chatgpt-retrieval-plugin
pii_detection.py
from services.openai import get_chat_completion


def screen_text_for_pii(text: str) -> bool:
    """Return True if the chat model judges that `text` contains PII."""
    # This prompt is just an example; change it to fit your use case.
    # It is a plain string (no f-prefix) because nothing is interpolated.
    messages = [
        {
            "role": "system",
            "content": """
            You can only respond with the word "True" or "False", where your answer indicates whether the text in the user's message contains PII.
            Do not explain your answer, and do not use punctuation.
            Your task is to identify whether the text extracted from your company files
            contains sensitive PII that should not be shared with the broader company. Here are some things to look out for:
            - An email address that identifies a specific person in either the local-part or the domain
            - The postal address of a private residence (must include at least a street name)
            - The postal address of a public place (must include either a street name or business name)
            - Notes about hiring decisions that mention candidates by name
            The user will send a document for you to analyze.
            """,
        },
        {"role": "user", "content": text},
    ]

    completion = get_chat_completion(
        messages,
    )

    # Strip surrounding whitespace so a reply such as " True\n" still counts.
    # Any reply that does not start with "True" is treated as "no PII".
    return completion.strip().startswith("True")
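
The screening logic above can be exercised without calling the OpenAI API by passing the completion function in as an argument. The following is a minimal sketch under stated assumptions, not part of the repository: `screen_with`, `fake_completion`, and the shortened `SYSTEM_PROMPT` are hypothetical names introduced only for illustration, and the stub assumes a reply is a plain string, as the `startswith` check in `screen_text_for_pii` suggests.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]

# Hypothetical, shortened stand-in for the example system prompt.
SYSTEM_PROMPT = (
    'Respond only with "True" or "False": '
    "does the user's message contain PII?"
)


def screen_with(complete: Callable[[List[Message]], str], text: str) -> bool:
    """Screen `text` using any callable that maps chat messages to a reply string."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": text},
    ]
    reply = complete(messages)
    # Mirrors the prefix check in screen_text_for_pii, with surrounding
    # whitespace stripped first. Any other reply counts as "no PII".
    return reply.strip().startswith("True")


def fake_completion(messages: List[Message]) -> str:
    """Hypothetical stub for get_chat_completion: flags any text containing '@'."""
    user_text = messages[-1]["content"]
    return "True" if "@" in user_text else "False"


if __name__ == "__main__":
    print(screen_with(fake_completion, "Contact jane.doe@example.com"))
    print(screen_with(fake_completion, "Quarterly roadmap summary"))
```

One design point this makes visible: a reply that is neither "True" nor "False" (for example a refusal or an explanation) is treated as "no PII", so the check fails open. If that is too permissive for your use case, consider treating unexpected replies as positives instead.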