feat: request contextualisation - core functionality #65

Open
wants to merge 57 commits into
base: main
21506e6
context logic subpackage; type-hint context extraction
Jun 21, 2024
a87e8e2
reworked type hint info extraction; extended functionality to also re…
ds-jakub-cierocki Jun 24, 2024
3ad4ecd
hidden args handling enabled
ds-jakub-cierocki Jun 24, 2024
b0cc0ae
improved type hints parsing and compatibility using package
ds-jakub-cierocki Jun 28, 2024
4ff5f62
dedicated exceptions for context-related operations
ds-jakub-cierocki Jun 28, 2024
c479c50
useful classmethods for context-related operations
ds-jakub-cierocki Jun 28, 2024
e3bb127
make whole context utils module protected; added IQL parsing helper; …
ds-jakub-cierocki Jun 28, 2024
de72c7c
parsing type hints _extract_params_and_context() no longer excludes B…
ds-jakub-cierocki Jun 28, 2024
d3958c0
adjusted the existing code to be aware of contexts (prompts yet untouc…
ds-jakub-cierocki Jun 28, 2024
be338bf
adjusted _type_validators.validate_arg_type() to handle typing.Union[]
ds-jakub-cierocki Jul 2, 2024
78f1535
context._utils._does_arg_allow_context() fix
ds-jakub-cierocki Jul 2, 2024
308e2e1
context record is now based on pydantic.BaseModel rather than datacla…
ds-jakub-cierocki Jul 2, 2024
73741d9
type hint lifting
ds-jakub-cierocki Jul 2, 2024
902f5ff
IQL generating LLM prompt passes BaseCallerContext() as filter argume…
ds-jakub-cierocki Jul 2, 2024
6309070
comments cleanup
ds-jakub-cierocki Jul 2, 2024
d523bf7
type hint fixes
ds-jakub-cierocki Jul 3, 2024
efe212f
Merge branch 'main' (which includes a large refactor by Michal) into …
ds-jakub-cierocki Jul 3, 2024
9ba89e5
post-merge fixes + minor refactor
ds-jakub-cierocki Jul 3, 2024
5fd802f
added missing docstrings; fixed type hints; fixed issues detected by …
ds-jakub-cierocki Jul 4, 2024
09bac55
reworked parse_param_type() function to increase performance, general…
ds-jakub-cierocki Jul 4, 2024
d42a369
fix: removed duplicated line from the prompt template
ds-jakub-cierocki Jul 4, 2024
c0b0522
adjusted existing unit tests to work with new contextualization logic
ds-jakub-cierocki Jul 4, 2024
9b2e131
linter-recommended fixes
ds-jakub-cierocki Jul 4, 2024
2d0ef4b
contextualization mechanism - dedicated unit tests
ds-jakub-cierocki Jul 5, 2024
6466f61
cleaned up overengineered code remaining from the previous iteration…
ds-jakub-cierocki Jul 5, 2024
637f7fa
replaced pydantic.BaseModel by dataclasses.dataclass, pydantic no lon…
ds-jakub-cierocki Jul 8, 2024
f867e25
BaseCallerContext: dataclass w.o. fields -> interface (abstract class…
ds-jakub-cierocki Jul 8, 2024
3423033
LLM now pastes Context() instead of BaseCallerContext() to indicate t…
ds-jakub-cierocki Jul 8, 2024
0d8cd1e
docstring typo fixes; more precise return type hint
ds-jakub-cierocki Jul 9, 2024
c97ba15
renamed Context() -> AskerContext(); added more detailed exa…
ds-jakub-cierocki Jul 9, 2024
1294a9c
type hint parsing changes: SomeCustomContext -> AskerContext; Union[a…
ds-jakub-cierocki Jul 9, 2024
999759b
refactor: collection.results.[ViewExecutionResult, ExecutionResult]."…
ds-jakub-cierocki Jul 12, 2024
2e1005a
param type parsing: correctly handling builtins types with args (e.g.…
ds-jakub-cierocki Jul 12, 2024
820066d
type hint fix: explicitly marked BaseCallerContext.alias as typing.Cla…
ds-jakub-cierocki Jul 12, 2024
25fbfa6
docs + benchmarks adjusted to meet new naming [ExecutionResult, ViewE…
ds-jakub-cierocki Jul 15, 2024
a154577
redesigned context-not-available error to follow the same principles …
ds-jakub-cierocki Jul 15, 2024
623effd
EXPERIMENTAL: reworked context injection such it is handled immediate…
ds-jakub-cierocki Jul 15, 2024
afacf5b
additional unit tests for the new contextualization mechanism
ds-jakub-cierocki Jul 19, 2024
dd8b339
context benchmark script and data
ds-jakub-cierocki Jul 22, 2024
6bb0816
refactored main prompt (too long lines), missing end-of-line characters
ds-jakub-cierocki Jul 22, 2024
f388f92
better error handling
ds-jakub-cierocki Jul 22, 2024
fbecc51
context benchmark dataset fix
ds-jakub-cierocki Jul 23, 2024
5d4ff64
added polars-based accuracy summary to the benchmark
ds-jakub-cierocki Jul 23, 2024
e7e8826
adjusted prompt to reduce hallucinations: nested filter/context calls …
ds-jakub-cierocki Jul 23, 2024
f8bf64e
merged main (inc. new benchmarks + large refactor) -> jc/issue-54-req…
ds-jakub-cierocki Aug 7, 2024
c1c871b
merge main
micpst Sep 23, 2024
8eefd9b
fix linters
micpst Sep 23, 2024
c28091f
fix tests
micpst Sep 23, 2024
69a8d58
fix tests
micpst Sep 23, 2024
d6c8fc6
fix tests
micpst Sep 23, 2024
d7026d4
rm old benchmarks
micpst Sep 23, 2024
e8271ac
some renames and stuff
micpst Sep 23, 2024
bdcc7b3
fix benchmarks
micpst Sep 23, 2024
71f53be
merge main
micpst Sep 25, 2024
c82e579
rm chroma file
micpst Sep 25, 2024
f5a40cb
add contexts to benchmarks + fix types
micpst Sep 30, 2024
fab9d3f
small refactor
micpst Oct 1, 2024
post-merge fixes + minor refactor
ds-jakub-cierocki committed Jul 3, 2024
commit 9ba89e5e43e365cf8fcc105d4eedbff095295d8a
4 changes: 2 additions & 2 deletions src/dbally/collection/collection.py
@@ -157,7 +157,7 @@ async def ask(
dry_run: bool = False,
return_natural_response: bool = False,
llm_options: Optional[LLMOptions] = None,
-context: Optional[CustomContextsList] = None
+contexts: Optional[CustomContextsList] = None
) -> ExecutionResult:
"""
Ask question in a text form and retrieve the answer based on the available views.
@@ -217,7 +217,7 @@ async def ask(
n_retries=self.n_retries,
dry_run=dry_run,
llm_options=llm_options,
-context=context
+contexts=contexts
)
end_time_view = time.monotonic()

4 changes: 2 additions & 2 deletions src/dbally/iql/_query.py
@@ -30,7 +30,7 @@ async def parse(
source: str,
allowed_functions: List["ExposedFunction"],
event_tracker: Optional[EventTracker] = None,
-context: Optional[CustomContextsList] = None
+contexts: Optional[CustomContextsList] = None
) -> Self:
"""
Parse IQL string to IQLQuery object.
@@ -43,5 +43,5 @@ async def parse(
IQLQuery object
"""

-root = await IQLProcessor(source, allowed_functions, context, event_tracker).process()
+root = await IQLProcessor(source, allowed_functions, contexts, event_tracker).process()
return cls(root=root, source=source)
8 changes: 6 additions & 2 deletions src/dbally/iql_generator/iql_generator.py
@@ -8,6 +8,7 @@
from dbally.prompt.elements import FewShotExample
from dbally.prompt.template import PromptTemplate
from dbally.views.exposed_functions import ExposedFunction
+from dbally.context.context import CustomContextsList

ERROR_MESSAGE = "Unfortunately, generated IQL is not valid. Please try again, \
generation of correct IQL is very important. Below you have errors generated by the system:\n{error}"
@@ -42,7 +43,8 @@ async def generate_iql(
examples: Optional[List[FewShotExample]] = None,
llm_options: Optional[LLMOptions] = None,
n_retries: int = 3,
-) -> IQLQuery:
+contexts: Optional[CustomContextsList] = None
+) -> Optional[IQLQuery]:
"""
Generates IQL in text form using LLM.

@@ -60,7 +62,7 @@ async def generate_iql(
prompt_format = IQLGenerationPromptFormat(
question=question,
filters=filters,
-examples=examples,
+examples=examples or [],
)
formatted_prompt = self._prompt_template.format_prompt(prompt_format)

@@ -78,7 +80,9 @@ async def generate_iql(
source=iql,
allowed_functions=filters,
event_tracker=event_tracker,
+contexts=contexts
)
except IQLError as exc:
+# TODO: handle the possibility of `response` being uninitialized when the following line runs
formatted_prompt = formatted_prompt.add_assistant_message(response)
formatted_prompt = formatted_prompt.add_user_message(ERROR_MESSAGE.format(error=exc))
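
The TODO in this hunk points at a common pitfall in retry loops. If `response` is first assigned inside the `try` block, any error raised before that assignment leaves the name unbound when the `except` branch reads it. The sketch below shows one possible guard: binding the name before `try`. It is a standalone illustration, not dbally's code; `fake_llm`, `parse`, `generate`, and this `IQLError` are hypothetical stand-ins.

```python
import asyncio
from typing import List, Optional


class IQLError(Exception):
    """Hypothetical stand-in for the error raised on invalid IQL."""


async def fake_llm(messages: List[str]) -> str:
    """Hypothetical LLM call: returns invalid IQL until it has seen an error message."""
    if any("not valid" in message for message in messages):
        return "filter_by_city('Paris')"
    return "bad iql"


def parse(iql: str) -> str:
    """Hypothetical parser: rejects anything that does not look like a call."""
    if "(" not in iql:
        raise IQLError(f"cannot parse: {iql!r}")
    return iql


async def generate(question: str, n_retries: int = 3) -> Optional[str]:
    messages = [question]
    for _ in range(n_retries + 1):
        # Bind the name before `try`, so the `except` branch can always read it.
        response: Optional[str] = None
        try:
            response = await fake_llm(messages)
            return parse(response)
        except IQLError as exc:
            if response is not None:
                messages.append(response)
            messages.append(f"Generated IQL is not valid: {exc}")
    return None  # all attempts failed


result = asyncio.run(generate("Show candidates from Paris"))
```

Returning `None` once the retries are exhausted mirrors the `Optional[IQLQuery]` return type introduced in this commit.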
73 changes: 0 additions & 73 deletions src/dbally/iql_generator/iql_prompt_template.py

This file was deleted.

5 changes: 5 additions & 0 deletions src/dbally/iql_generator/prompt.py
@@ -74,6 +74,11 @@ def __init__(
"You MUST use only these methods:\n"
"\n{filters}\n"
"It is VERY IMPORTANT not to use methods other than those listed above."
+"If a called function argument value is not directly specified in the query but instead requires knowledge of some additional context, then substitute that argument value by: BaseCallerContext()."
+'The typical input phrase referencing some additional context contains the word "my" or similar phrasing, e.g. "my position name", "my company valuation".'
+"In that case, the part of the output will look like this:"
+"filter4(BaseCallerContext())"
+"It is VERY IMPORTANT not to use methods other than those listed above."
"""If you DON'T KNOW HOW TO ANSWER DON'T SAY \"\", SAY: `UNSUPPORTED QUERY` INSTEAD! """
"This is CRUCIAL, otherwise the system will crash. "
),
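
To illustrate what the added prompt lines ask of the model: a question such as "my position name" should yield a filter call whose argument is the placeholder `BaseCallerContext()`, which is later replaced with a concrete context supplied by the caller. The sketch below is a hedged illustration of that substitution step only. `UserContext`, `resolve_args`, and this `BaseCallerContext` are hypothetical and do not reproduce dbally's implementation; picking the first supplied context is a simplification.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


class BaseCallerContext:
    """Hypothetical placeholder: marks an argument the caller must supply."""


@dataclass
class UserContext(BaseCallerContext):
    """Hypothetical concrete context carrying data about the asking user."""

    position: str


def resolve_args(args: Sequence[Any], contexts: Optional[List[BaseCallerContext]]) -> List[Any]:
    """Replace bare placeholder instances with the first caller-supplied context."""
    resolved = []
    for arg in args:
        if type(arg) is BaseCallerContext:  # the bare placeholder emitted by the LLM
            if not contexts:
                raise ValueError("query needs a context, but none was provided")
            arg = contexts[0]  # simplification: a real system would likely match by type
        resolved.append(arg)
    return resolved


# "Find people with my position name" -> filter4(BaseCallerContext())
args = resolve_args([BaseCallerContext()], [UserContext(position="Data Engineer")])
```

Raising an explicit error when no context is available keeps the failure visible to the caller instead of passing an empty placeholder to the filter.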
2 changes: 1 addition & 1 deletion src/dbally/views/base.py
@@ -27,7 +27,7 @@ async def ask(
n_retries: int = 3,
dry_run: bool = False,
llm_options: Optional[LLMOptions] = None,
-context: Optional[CustomContextsList] = None
+contexts: Optional[CustomContextsList] = None
) -> ViewExecutionResult:
"""
Executes the query and returns the result.
3 changes: 2 additions & 1 deletion src/dbally/views/structured.py
@@ -42,7 +42,7 @@ async def ask(
n_retries: int = 3,
dry_run: bool = False,
llm_options: Optional[LLMOptions] = None,
-context: Optional[CustomContextsList] = None
+contexts: Optional[CustomContextsList] = None
) -> ViewExecutionResult:
"""
Executes the query and returns the result. It generates the IQL query from the natural language query\
@@ -71,6 +71,7 @@ async def ask(
event_tracker=event_tracker,
llm_options=llm_options,
n_retries=n_retries,
+contexts=contexts
)

await self.apply_filters(iql)