Added partially extracted slots support for the GroupSlots #394

Merged
merged 25 commits into from
Nov 7, 2024
f1857f6
Added flag to
NotBioWaste905 Oct 1, 2024
c334ff5
First test attempts
NotBioWaste905 Oct 1, 2024
8306bbb
linting
NotBioWaste905 Oct 1, 2024
33f05d0
Added groupslot tutorial for slots
NotBioWaste905 Oct 7, 2024
09937ae
Switched to unit tests
NotBioWaste905 Oct 9, 2024
218e8e9
Lint
NotBioWaste905 Oct 9, 2024
f217832
simplify recursive_setattr
RLKRo Oct 14, 2024
da48f68
update docstrings
RLKRo Oct 14, 2024
a09037c
remove unrelated llm tests
RLKRo Oct 14, 2024
e534f4c
rewrite partial extraction tests
RLKRo Oct 14, 2024
66f3db0
rename allow_partially_extracted to allow_partial_extraction
RLKRo Oct 15, 2024
c489024
add check_happy_path block to tutorial
RLKRo Oct 15, 2024
e6a9468
reformat tutorial
RLKRo Oct 15, 2024
78f3b2b
rewrite tutorial text
RLKRo Oct 15, 2024
4f59a35
Merge branch 'refs/heads/dev' into feat/slots_extraction_update
RLKRo Oct 15, 2024
37b0218
Updated happy path, fixed the script
NotBioWaste905 Oct 22, 2024
7915188
minor text changes
RLKRo Oct 23, 2024
a06ea3c
fix codestyle
RLKRo Oct 23, 2024
204c4e2
Working on tutorial
NotBioWaste905 Oct 30, 2024
8869eef
Added GroupSlotsExtracted class with required field
NotBioWaste905 Oct 31, 2024
376bebd
lint
NotBioWaste905 Oct 31, 2024
713203c
Removed `GroupSlotsExtracted`, updated tutorial
NotBioWaste905 Nov 2, 2024
08dab5d
update tutorial: fix wording
RLKRo Nov 6, 2024
332fa34
update tutorial: change script to showcase behavior with different se…
RLKRo Nov 6, 2024
b31936a
update tutorial: fix wording
RLKRo Nov 7, 2024
231 changes: 124 additions & 107 deletions tutorials/slots/2_partial_extraction.py
@@ -12,10 +12,8 @@
from chatsky import (
RESPONSE,
TRANSITIONS,
PRE_TRANSITION,
PRE_RESPONSE,
GLOBAL,
LOCAL,
Pipeline,
Transition as Tr,
conditions as cnd,
@@ -63,150 +61,169 @@

## Code explanation

In this example we define two group slots: `person` and `friend`.
Note that in the `person` slot we set `allow_partial_extraction` to `True`
which allows us to _update_ slot values and not
rewrite them in case we don't get full information at once.
In this example we showcase the behavior of
different group slot extraction settings:

So if we send "[email protected]" as user email and after that send only Bitcoin address
the extracted user data would be "<bitcoin_address> [email protected]"
and not "<bitcoin_address> default_email".
We can compare that behaviour with `fried` slot extraction where we have set `success_only=False`
that enables us to send unly partial info that can be overwritten with default values.
Group `partial_extraction` is marked with `allow_partial_extraction`.
Any slot in this group is saved if and only if that slot is successfully
extracted.

Group `success_only_extraction` is extracted with the `success_only`
flag set to True.
Any slot in this group is saved if and only if all of the slots in the group
are successfully extracted within a single `Extract` call.

Group `success_only_false` is extracted with `success_only` set to False.
Every slot in this group is overwritten on each `Extract` call,
even if extraction was not successful.

Group `sub_slot_success_only_extraction` is extracted by passing all of its
child slots to the `Extract` method with the `success_only` flag set to True.
The behavior is equivalent to that of `partial_extraction`.
"""

# %%
sub_slots = {
"date": RegexpSlot(
regexp=r"(0?[1-9]|(?:1|2)[0-9]|3[0-1])[\.\/]"
r"(0?[1-9]|1[0-2])[\.\/](\d{4}|\d{2})",
),
"email": RegexpSlot(
regexp=r"[\w\.-]+@[\w\.-]+\.\w{2,4}",
),
}

SLOTS = {
"person": GroupSlot(
coin_address=RegexpSlot(
regexp=r"(\b[a-zA-Z0-9]{34}\b)",
default_value="default_address",
match_group_idx=1,
required=True,
),
email=RegexpSlot(
regexp=r"([\w\.-]+@[\w\.-]+\.\w{2,4})",
default_value="default_email",
match_group_idx=1,
),
"partial_extraction": GroupSlot(
**sub_slots,
allow_partial_extraction=True,
),
"friend": GroupSlot(
coin_address=RegexpSlot(
regexp=r"(\b[a-zA-Z0-9]{34}\b)", default_value="default_address"
),
email=RegexpSlot(
regexp=r"([\w\.-]+@[\w\.-]+\.\w{2,4})",
default_value="default_email",
),
"success_only_extraction": GroupSlot(
**sub_slots,
),
"success_only_false": GroupSlot(
**sub_slots,
),
"sub_slot_success_only_extraction": GroupSlot(
**sub_slots,
),
}

script = {
GLOBAL: {
TRANSITIONS: [
Tr(dst=("user_flow", "ask"), cnd=cnd.Regexp(r"^[sS]tart"))
Tr(dst=("main", "start"), cnd=cnd.ExactMatch("/start")),
Tr(dst=("main", "reset"), cnd=cnd.ExactMatch("/reset")),
Tr(dst=("main", "print"), priority=0.5),
]
},
"user_flow": {
LOCAL: {
PRE_TRANSITION: {"get_slots": proc.Extract("person")},
TRANSITIONS: [
Tr(
dst=("root", "utter_user"),
cnd=cnd.SlotsExtracted("person.email"),
priority=1.2,
),
# Tr(dst=("user_flow", "repeat_question"), priority=0.8),
],
},
"ask": {RESPONSE: "Please, send your email and bitcoin address."},
"repeat_question": {
RESPONSE: "Please, send your bitcoin address and email again."
"main": {
"start": {RESPONSE: "Hi! Send me email and date."},
"reset": {
PRE_RESPONSE: {"reset_slots": proc.UnsetAll()},
RESPONSE: "All slots have been reset.",
},
},
"friend_flow": {
LOCAL: {
PRE_TRANSITION: {
"get_slots": proc.Extract("friend", success_only=False)
},
TRANSITIONS: [
Tr(
dst=("root", "utter_friend"),
cnd=cnd.SlotsExtracted(
"friend.coin_address", "friend.email", mode="any"
),
priority=1.2,
"print": {
PRE_RESPONSE: {
"partial_extraction": proc.Extract("partial_extraction"),
# partial extraction is always successful;
# success_only doesn't matter
"success_only_extraction": proc.Extract(
"success_only_extraction", success_only=True
),
Tr(
dst=("friend_flow", "ask"),
cnd=cnd.ExactMatch("update"),
priority=0.8,
# success_only is True by default
"success_only_false": proc.Extract(
"success_only_false", success_only=False
),
Tr(dst=("friend_flow", "repeat_question"), priority=0.8),
],
},
"ask": {
RESPONSE: "Please, send your friends bitcoin address and email."
},
"repeat_question": {
RESPONSE: "Please, send your friends bitcoin address and email again."
},
},
"root": {
"start": {
TRANSITIONS: [Tr(dst=("user_flow", "ask"))],
},
"fallback": {
RESPONSE: "Finishing query",
TRANSITIONS: [Tr(dst=("user_flow", "ask"))],
},
"utter_friend": {
"sub_slot_success_only_extraction": proc.Extract(
"sub_slot_success_only_extraction.email",
"sub_slot_success_only_extraction.date",
success_only=True,
),
},
RESPONSE: rsp.FilledTemplate(
"Your friends address is {friend.coin_address} and email is {friend.email}"
"Extracted slots:\n"
" Group with partial extraction:\n"
" {partial_extraction}\n"
" Group with success_only:\n"
" {success_only_extraction}\n"
" Group without success_only:\n"
" {success_only_false}\n"
" Extracting sub-slots with success_only:\n"
" {sub_slot_success_only_extraction}"
),
TRANSITIONS: [Tr(dst=("friend_flow", "ask"))],
},
"utter_user": {
RESPONSE: "Your bitcoin address is {person.coin_address}. Your email is {person.email}. You can update your data or type /send to proceed.",
PRE_RESPONSE: {"fill": proc.FillTemplate()},
TRANSITIONS: [
Tr(dst=("friend_flow", "ask"), cnd=cnd.ExactMatch("/send")),
Tr(dst=("user_flow", "ask")),
],
},
},
}

HAPPY_PATH = [
("Start", "Please, send your email and bitcoin address."),
("/start", "Hi! Send me email and date."),
(
"Only email: [email protected]",
"Extracted slots:\n"
" Group with partial extraction:\n"
" {'date': 'None', 'email': '[email protected]'}\n"
" Group with success_only:\n"
" {'date': 'None', 'email': 'None'}\n"
" Group without success_only:\n"
" {'date': 'None', 'email': '[email protected]'}\n"
" Extracting sub-slots with success_only:\n"
" {'date': 'None', 'email': '[email protected]'}",
),
(
"[email protected]",
"Your bitcoin address is default_address. Your email is [email protected]. You can update your data or type /send to proceed.",
"Only date: 01.01.2024",
"Extracted slots:\n"
" Group with partial extraction:\n"
" {'date': '01.01.2024', 'email': '[email protected]'}\n"
" Group with success_only:\n"
" {'date': 'None', 'email': 'None'}\n"
" Group without success_only:\n"
" {'date': '01.01.2024', 'email': 'None'}\n"
" Extracting sub-slots with success_only:\n"
" {'date': '01.01.2024', 'email': '[email protected]'}",
),
("update", "Please, send your email and bitcoin address."),
(
"1FeexV6bAHb8ybZjqQMjJrcCrHGW9sb6uF",
"Your bitcoin address is 1FeexV6bAHb8ybZjqQMjJrcCrHGW9sb6uF. Your email is [email protected]. You can update your data or type /send to proceed.",
"Both email and date: [email protected]; 02.01.2024",
"Extracted slots:\n"
" Group with partial extraction:\n"
" {'date': '02.01.2024', 'email': '[email protected]'}\n"
" Group with success_only:\n"
" {'date': '02.01.2024', 'email': '[email protected]'}\n"
" Group without success_only:\n"
" {'date': '02.01.2024', 'email': '[email protected]'}\n"
" Extracting sub-slots with success_only:\n"
" {'date': '02.01.2024', 'email': '[email protected]'}",
),
("/send", "Please, send your friends bitcoin address and email."),
(
"[email protected]",
"Your friends address is default_address and email is [email protected]",
"Partial update (date only): 03.01.2024",
"Extracted slots:\n"
" Group with partial extraction:\n"
" {'date': '03.01.2024', 'email': '[email protected]'}\n"
" Group with success_only:\n"
" {'date': '02.01.2024', 'email': '[email protected]'}\n"
" Group without success_only:\n"
" {'date': '03.01.2024', 'email': 'None'}\n"
" Extracting sub-slots with success_only:\n"
" {'date': '03.01.2024', 'email': '[email protected]'}",
),
("update", "Please, send your friends bitcoin address and email."),
(
"3Nxwenay9Z8Lc9JBiywExpnEFiLp6Afp8v",
"Your friends address is 3Nxwenay9Z8Lc9JBiywExpnEFiLp6Afp8v and email is default_email",
"No slots here but `Extract` will still be called.",
"Extracted slots:\n"
" Group with partial extraction:\n"
" {'date': '03.01.2024', 'email': '[email protected]'}\n"
" Group with success_only:\n"
" {'date': '02.01.2024', 'email': '[email protected]'}\n"
" Group without success_only:\n"
" {'date': 'None', 'email': 'None'}\n"
" Extracting sub-slots with success_only:\n"
" {'date': '03.01.2024', 'email': '[email protected]'}",
),
]


# %%
pipeline = Pipeline(
script=script,
start_label=("root", "start"),
fallback_label=("root", "fallback"),
start_label=("main", "start"),
slots=SLOTS,
)
