Merge extended_conditions into main repository #35

Open · wants to merge 194 commits into base: dev
Commits (194)
4343dff
dev: commit all
kudep Oct 24, 2022
cd218c0
Add source, example, test files
RLKRo Nov 3, 2022
9b706b0
Move df_extended_conditions to dff/script/logic/extended_conditions
RLKRo Nov 3, 2022
46cbc61
Move examples to examples/extended_conditions
RLKRo Nov 3, 2022
33a1dbe
Move tests to tests/extended_conditions
RLKRo Nov 3, 2022
9a57271
Replace old addon names
RLKRo Nov 3, 2022
ea80beb
refactor: remove info from __init__.py
RLKRo Nov 9, 2022
de4c2e3
fix: references to files in examples
RLKRo Nov 9, 2022
b55e5b8
Add rasa docker
RLKRo Nov 9, 2022
e4a3853
Update setup.py
RLKRo Nov 9, 2022
8ed8b1c
Add env variables
RLKRo Nov 9, 2022
4d70a8a
test: partially fix huggingface tests
RLKRo Nov 9, 2022
2795e73
merge moved examples
ruthenian8 Dec 1, 2022
7230f62
remove utils, migrate examples to pipeline, add readme, alter…
ruthenian8 Dec 5, 2022
e13e2dc
reformat docs for RST, debug hf models: save dataset for hf matcher
ruthenian8 Dec 7, 2022
c7a1f44
debug examples #1: introduce skip conditions
ruthenian8 Dec 7, 2022
f33d657
update documentation and examples: add docstrings, module-level docs
ruthenian8 Dec 11, 2022
f66d67e
Merge branch 'rdev' into merge/extended_conditions
ruthenian8 Dec 27, 2022
3f1563c
Apply formatting:
ruthenian8 Dec 27, 2022
15cc5aa
update references in tests
ruthenian8 Dec 27, 2022
b448d9b
fix tests for remote execution
ruthenian8 Dec 28, 2022
57407d5
Merge branch 'dev' into merge/extended_conditions
ruthenian8 Dec 28, 2022
e3ad218
format test_dialogflow.py
ruthenian8 Dec 28, 2022
fa8feee
Fix CI problems:
ruthenian8 Dec 29, 2022
f92b95d
Alter testing options:
ruthenian8 Dec 29, 2022
31512d9
Change deployment options:
ruthenian8 Dec 29, 2022
b9372f3
fix tests for rasa & dialogflow
ruthenian8 Dec 29, 2022
cefddaa
improve coverage by removing untested lines and adding new tests
ruthenian8 Dec 30, 2022
6a31168
revert rasa example
ruthenian8 Dec 30, 2022
df76f85
reformat rasa example
ruthenian8 Dec 30, 2022
ce5af99
debug Dataset class: allow instantiation from list
ruthenian8 Jan 4, 2023
aa5197a
adjust examples for doc building
ruthenian8 Jan 9, 2023
4381faa
rewrite examples
ruthenian8 Jan 13, 2023
ef641ac
merge remote dev
ruthenian8 Mar 6, 2023
3c64aae
adapt for Message class
ruthenian8 Mar 6, 2023
2e5fa91
format file headers; alter coverage.yml
ruthenian8 Mar 6, 2023
6bf99f3
add device to hf example; remove hf from .env_file
ruthenian8 Mar 6, 2023
9fc1eca
add parameters to BaseModel abstract class; change build_docs.yml
ruthenian8 Mar 6, 2023
ff16b82
fix rasa random_seed for training uniformity
ruthenian8 Mar 7, 2023
3abaa3e
rasa add random_seed
ruthenian8 Mar 7, 2023
abd7767
Update dockerfile_extended_conditions
ruthenian8 Mar 7, 2023
7e65385
use ast.literal_eval to circumvent file creation
ruthenian8 Mar 7, 2023
e0002b5
redefine skip conditions for tests; update docs
ruthenian8 Mar 7, 2023
8bec534
remove skip conditions for test_dialogflow
ruthenian8 Mar 7, 2023
c86939e
docs: fix warnings
avsakharov Mar 9, 2023
19d0bc8
change workflow for build docs
ruthenian8 Mar 10, 2023
f66229e
correct typo
ruthenian8 Mar 10, 2023
a7a094a
fix typo
ruthenian8 Mar 10, 2023
e496c37
add python hash seed to .env_file
ruthenian8 Mar 11, 2023
e10ead8
rework gensim example
ruthenian8 Mar 14, 2023
03de8b5
change thresholds for gensim example && remove variables from test_full
ruthenian8 Mar 14, 2023
2a3ffa1
use correct url && remove unused imports
ruthenian8 Mar 14, 2023
8c4e6f5
remove old code from test_dialogflow
ruthenian8 Mar 14, 2023
63fe863
use word2vec format to avoid problems with pickle.load
ruthenian8 Mar 14, 2023
77f4d94
employ additional skip conditions for examples; change threshold in g…
ruthenian8 Mar 15, 2023
5d72701
remove sklearn dependency from conftest; check spelling; import sklea…
ruthenian8 Mar 15, 2023
64f9def
circumvent import errors from pyyaml; remove torch.device from type a…
ruthenian8 Mar 15, 2023
7cd57de
Merge branch 'dev' into merge/extended_conditions
ruthenian8 Mar 15, 2023
3361577
add empty line to test_full
ruthenian8 Mar 15, 2023
73b0d5c
circumvent 'import joblib' error in 'test_no_deps'
ruthenian8 Mar 15, 2023
677ff33
import numpy in test_sklearn after skip_conditions
ruthenian8 Mar 15, 2023
64d643b
change docs for modules
ruthenian8 Mar 15, 2023
ef8659d
apply lint
ruthenian8 Mar 15, 2023
29af70a
Update documentation for extra_conditions
ruthenian8 Mar 16, 2023
388e5a5
document utils; change header for hf_api_model
ruthenian8 Mar 16, 2023
77f6971
apply lint: invalid docs in utils
ruthenian8 Mar 16, 2023
50ad934
merge dev into extended conditions
ruthenian8 Nov 29, 2023
1713194
partial fix of tests
ruthenian8 Nov 29, 2023
752b438
Update workflows
ruthenian8 Nov 29, 2023
9a2dc1b
Update setup.py
ruthenian8 Nov 30, 2023
75017cd
Update setup.py
ruthenian8 Nov 30, 2023
b4d80dd
Update setup.py
ruthenian8 Nov 30, 2023
eb96c42
correct setup.py
ruthenian8 Nov 30, 2023
c9e0a34
Update conftest.py
ruthenian8 Nov 30, 2023
3ced545
update tutorials; use categorical_code as normal attribute
ruthenian8 Nov 30, 2023
90e8f25
Update pytest markers
ruthenian8 Dec 1, 2023
a58e5bd
Update tutors
ruthenian8 Dec 1, 2023
8794344
update test_full
ruthenian8 Dec 1, 2023
25b6d2d
Update docs & code comments
ruthenian8 Dec 4, 2023
bd7355a
require requests for extended conditions; update requirements in tuto…
ruthenian8 Dec 4, 2023
4614c93
rename BaseModel to ExtrasBaseModel
ruthenian8 Dec 4, 2023
f89605d
set up GDF in test_full
ruthenian8 Dec 4, 2023
8ef14f6
add debug print to test_tutors
ruthenian8 Dec 4, 2023
881826e
add realpath directives to workflows; alter transformers version
ruthenian8 Dec 4, 2023
fa2d619
Update env variables
ruthenian8 Dec 4, 2023
e533d5b
Update imports
ruthenian8 Dec 4, 2023
b9d90b5
Update hf example
ruthenian8 Dec 4, 2023
98fbda2
configure softmax from dim=0 to dim=1
ruthenian8 Dec 4, 2023
cdf8ec0
Update happy path
ruthenian8 Dec 4, 2023
5fbc16f
Merge branch 'dev' into merge/extended_conditions
RLKRo Dec 13, 2023
9488b6c
Updated extra dependencies
NotBioWaste Jul 1, 2024
564601f
Added ext profile to CONTRIBUTING.md
NotBioWaste Jul 1, 2024
ca02a08
Fix typo
NotBioWaste Jul 1, 2024
a0edbc6
Merge remote-tracking branch 'origin/dev' into merge/extended_conditions
NotBioWaste Jul 5, 2024
cd6c025
Reworking namespaces and label caching
NotBioWaste Jul 10, 2024
3809d57
Moved llm_conditions to
NotBioWaste Jul 11, 2024
d520d0c
Fixed models call
NotBioWaste Jul 11, 2024
fd77a11
Fixed dependecies and references to modules
NotBioWaste Jul 15, 2024
d2d3680
Fixed tests, rewriting tutorials
NotBioWaste Jul 18, 2024
8ace188
Added caching for async API calls, working on async ExtrasBaseAPIModel
NotBioWaste905 Jul 19, 2024
f639141
Removed local models, updated tutorials
NotBioWaste905 Jul 19, 2024
99ced4d
Fixed namespace reference
NotBioWaste905 Jul 19, 2024
f15be68
Started working on llm_responses
NotBioWaste905 Jul 19, 2024
7dd03a1
Fixed typos in tutorials
NotBioWaste Jul 22, 2024
56b7789
Created class, created 1st tutorial
NotBioWaste Jul 22, 2024
af60115
Added dependecies for langchain
NotBioWaste Jul 22, 2024
b3b79a5
Fixed adding custom prompt for each node
NotBioWaste Jul 22, 2024
6eb910d
Added image processing, updated tutorial
NotBioWaste Jul 22, 2024
1f8cddc
Added typehint
NotBioWaste Jul 22, 2024
74cd954
Added llm_response, LLM_API, history management
NotBioWaste Jul 22, 2024
1fd31a2
Fixed image reading
NotBioWaste Jul 22, 2024
2c48490
Started llm condition
NotBioWaste Jul 24, 2024
a1884e5
Added message_to_langchain
NotBioWaste Jul 24, 2024
61f302e
Implementing deepeval integration
NotBioWaste Jul 29, 2024
38a8f8f
Figured out how to implement DeepEval functions
NotBioWaste905 Jul 30, 2024
592267f
Adding conditions
NotBioWaste Jul 31, 2024
baccc47
Implemented simple conditions call, added BaseMethod class, renaming,…
NotBioWaste Aug 1, 2024
8e84ba1
Fixed history extraction
NotBioWaste Aug 2, 2024
2b2847b
Delete test_bot.py
NotBioWaste905 Aug 2, 2024
7e336ac
Fixed prompt handling, switched to AIMessage in LLM response
NotBioWaste Aug 5, 2024
71babbf
Merge branch 'feat/llm_responses' of https://github.com/deeppavlov/di…
NotBioWaste Aug 5, 2024
351ae06
Fixed conditions call
NotBioWaste Aug 5, 2024
e3d0d15
Working on autotesting
NotBioWaste Aug 5, 2024
0405998
Added tests
NotBioWaste Aug 7, 2024
3dbfd0c
Removed unused method
NotBioWaste Aug 7, 2024
5c876ba
Added annotations
NotBioWaste Aug 7, 2024
8f1932c
Added structured output support, tweaked tests
NotBioWaste Aug 7, 2024
aedf47e
Reworking tutorials
NotBioWaste Aug 7, 2024
adadb05
Reworked prompt usage and hierarchy, reworked filters and methods
NotBioWaste Aug 12, 2024
0288896
No idea how to make script smaller in tutorials
NotBioWaste Aug 12, 2024
67e2758
Small fixes in tutorials and structured generation
NotBioWaste Aug 13, 2024
428a9f0
Working on user guide
NotBioWaste Aug 14, 2024
5e26b4b
Fixed some tutorials, finished user guide
NotBioWaste Aug 14, 2024
5dbb6cd
Bugfixes in docs
NotBioWaste Aug 14, 2024
db63d1a
Lint
NotBioWaste Aug 14, 2024
2b9080f
Removed type annotation that broke docs building
NotBioWaste Aug 14, 2024
2bcda71
Tests and bugfixes
NotBioWaste Aug 15, 2024
d2f28ed
Deleted DeepEval references
NotBioWaste Aug 15, 2024
7318c91
Numpy versions trouble
NotBioWaste Aug 15, 2024
27eae27
Fixed dependecies
NotBioWaste Aug 16, 2024
3fed1fc
Made everything asynchronous
NotBioWaste Aug 16, 2024
30862ca
Added and unified docstring
NotBioWaste Aug 16, 2024
06ab5bc
Added 4th tutorial, fixed message_schema parameter passing
NotBioWaste Aug 16, 2024
798a77b
Bugfix, added max_size to the message_to_langchain function
NotBioWaste Aug 20, 2024
3343159
Made even more everything asynchronous
NotBioWaste Aug 21, 2024
014ff7e
Remade condition, added logprob check
NotBioWaste Aug 21, 2024
761bd81
Async bugfix, added model_result_to_text, working on message_schema f…
NotBioWaste Aug 22, 2024
90a811e
Minor fixes, tinkering tests
NotBioWaste Aug 23, 2024
5bff191
Merge branch 'refs/heads/dev' into feat/llm_responses
RLKRo Aug 23, 2024
8b88ba6
update lock file
RLKRo Aug 23, 2024
20c4afd
Merge remote-tracking branch 'origin/feat/llm_responses' into feat/ll…
RLKRo Aug 23, 2024
0139421
Merge remote-tracking branch 'origin/master' into feat/llm_responses
NotBioWaste905 Sep 18, 2024
9bb0cba
Updating to v1.0
NotBioWaste905 Sep 23, 2024
f2d6b68
Finished tests, finished update
NotBioWaste905 Sep 26, 2024
6fddaea
lint
NotBioWaste905 Sep 26, 2024
e06bc2b
Started working on llm slots
NotBioWaste905 Sep 26, 2024
22d8efc
Resolving pydantic errors
NotBioWaste905 Sep 27, 2024
aa735b5
Delete llmslot_test.py
NotBioWaste905 Sep 27, 2024
cc91133
Finished LLMSlot, working on LLMGroupSlot
NotBioWaste905 Sep 27, 2024
8756838
Merge remote-tracking branch 'origin/feat/llm_responses' into feat/ll…
NotBioWaste905 Sep 27, 2024
f1857f6
Added flag to
NotBioWaste905 Oct 1, 2024
c334ff5
First test attempts
NotBioWaste905 Oct 1, 2024
8306bbb
linting
NotBioWaste905 Oct 1, 2024
f842776
Merge branch 'feat/slots_extraction_update' into feat/llm_responses
NotBioWaste905 Oct 1, 2024
ada17ca
Merge remote-tracking branch 'origin/feat/llm_responses' into feat/ll…
NotBioWaste905 Oct 1, 2024
a45f653
File structure fixed
NotBioWaste905 Oct 3, 2024
3838d30
Fixed naming
NotBioWaste905 Oct 3, 2024
0e650f8
Create LLMCondition and LLMResponse classes
NotBioWaste905 Oct 3, 2024
ca79f94
Merge branch 'dev' into merge/extended_conditions
NotBioWaste905 Oct 9, 2024
015cb4f
Debugging flattening
NotBioWaste905 Oct 23, 2024
b6e5eeb
Bugfix
NotBioWaste905 Oct 23, 2024
b20137e
Added return_type property for LLMSlot
NotBioWaste905 Oct 23, 2024
25f5b04
Changed return_type from Any to type
NotBioWaste905 Oct 23, 2024
b651087
lint
NotBioWaste905 Oct 23, 2024
284555d
Fixed dependency namings
NotBioWaste905 Oct 23, 2024
354b51d
Fixed singledispatch
NotBioWaste905 Oct 24, 2024
640aeb3
Removed Dataset and ExtrasBaseModel, created HasLabel and HasMatch co…
NotBioWaste905 Oct 28, 2024
ee7d5e2
Removed deprecated files
NotBioWaste905 Oct 28, 2024
492239d
Deleted synchronous variants, removed property models_labels from Con…
NotBioWaste905 Oct 28, 2024
474cd7f
Deleted unused modules, merged classes with their abstract variants
NotBioWaste905 Oct 30, 2024
1b5a77b
removed deprecated from_script from tutorials
NotBioWaste905 Nov 2, 2024
c18d375
Fixed LLMCondition class
NotBioWaste905 Nov 2, 2024
e884494
Removed inner functions, fixed signatures in conditions
NotBioWaste905 Nov 2, 2024
459f7fc
Fixed missing 'models' field in Pipeline, updated tutorials
NotBioWaste905 Nov 6, 2024
57a2d9d
Merge branch 'feat/llm_responses' into merge/extended_conditions
NotBioWaste905 Nov 6, 2024
96545dc
Revert "Merge branch 'feat/llm_responses' into merge/extended_conditi…
NotBioWaste905 Feb 19, 2025
29815b6
Moved models from remote_api directory
NotBioWaste905 Feb 19, 2025
074bd4e
Removed unused methods
NotBioWaste905 Feb 19, 2025
463a309
lint
NotBioWaste905 Feb 19, 2025
bb9aa58
Merge remote-tracking branch 'origin/dev' into merge/extended_conditions
NotBioWaste905 Feb 19, 2025
feab6c0
Refactor Rasa integration: update model classes and remove unused uti…
NotBioWaste905 Feb 20, 2025
a2c5290
Reworked a tutorial
NotBioWaste905 Feb 27, 2025
d7746e7
Now models are stored in pipeline
NotBioWaste905 Feb 27, 2025
b4a43a9
Working on models_labels field
NotBioWaste905 Feb 27, 2025
Files changed
4 changes: 3 additions & 1 deletion .env_file
@@ -17,10 +17,12 @@ MON_PORT=8765
YDB_ENDPOINT=grpc://localhost:2136
YDB_DATABASE=/local
YDB_ANONYMOUS_CREDENTIALS=1
RASA_API_KEY=rasa
PYTHONHASHSEED=42
CLICKHOUSE_DB=test
CLICKHOUSE_USER=username
CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT=1
CLICKHOUSE_PASSWORD=pass
SUPERSET_USERNAME=superset
SUPERSET_PASSWORD=superset
SUPERSET_METADATA_PORT=5433
SUPERSET_METADATA_PORT=5433
12 changes: 12 additions & 0 deletions .github/workflows/test_coverage.yml
@@ -35,9 +35,21 @@ jobs:
python -m pip install --upgrade pip poetry==1.8.5
python -m poetry install --with test,tutorials --all-extras --no-ansi --no-interaction

- name: Create gdf_account.json
uses: jsdaniell/[email protected]
with:
name: "gdf_account.json"
json: ${{ secrets.GDF_ACCOUNT_JSON }}

- name: write realpath to env
run: |
echo "GDF_ACCOUNT_JSON=$(realpath ./gdf_account.json)" >> $GITHUB_ENV

- name: run tests
env:
TG_BOT_TOKEN: ${{ secrets.TG_BOT_TOKEN }}
TG_BOT_USERNAME: ${{ secrets.TG_BOT_USERNAME }}
HF_API_KEY: ${{ secrets.HF_API_KEY }}
GDF_ACCOUNT_JSON: ${{ env.GDF_ACCOUNT_JSON }}
run: |
python -m poetry run poe test_all
24 changes: 24 additions & 0 deletions .github/workflows/test_full.yml
@@ -37,10 +37,22 @@ jobs:
python -m pip install --upgrade pip poetry==1.8.5
python -m poetry install --with test,tutorials --all-extras --no-ansi --no-interaction

- name: Create gdf_account.json
uses: jsdaniell/[email protected]
with:
name: "gdf_account.json"
json: ${{ secrets.GDF_ACCOUNT_JSON }}

- name: write realpath to env
run: |
echo "GDF_ACCOUNT_JSON=$(realpath ./gdf_account.json)" >> $GITHUB_ENV

- name: run pytest
env:
TG_BOT_TOKEN: ${{ secrets.TG_BOT_TOKEN }}
TG_BOT_USERNAME: ${{ secrets.TG_BOT_USERNAME }}
HF_API_KEY: ${{ secrets.HF_API_KEY }}
GDF_ACCOUNT_JSON: ${{ env.GDF_ACCOUNT_JSON }}
run: |
python -m poetry run poe test_no_cov

Collaborator:
Leave new lines be, please.
@@ -59,9 +71,21 @@ jobs:
python -m pip install --upgrade pip poetry==1.8.5
python -m poetry install --with test --no-ansi --no-interaction

- name: Create gdf_account.json
uses: jsdaniell/[email protected]
with:
name: "gdf_account.json"
json: ${{ secrets.GDF_ACCOUNT_JSON }}

- name: write realpath to env
run: |
echo "GDF_ACCOUNT_JSON=$(realpath ./gdf_account.json)" >> $GITHUB_ENV

- name: run pytest
env:
TG_BOT_TOKEN: ${{ secrets.TG_BOT_TOKEN }}
TG_BOT_USERNAME: ${{ secrets.TG_BOT_USERNAME }}
HF_API_KEY: ${{ secrets.HF_API_KEY }}
GDF_ACCOUNT_JSON: ${{ env.GDF_ACCOUNT_JSON }}
run: |
python -m poetry run poe test_no_deps
11 changes: 9 additions & 2 deletions CONTRIBUTING.md
@@ -147,6 +147,7 @@ Tests are configured via [`.env_file`](.env_file).
Chatsky uses docker images for three purposes:
1. Database images for integration testing.
2. Images for statistics collection.
3. Setting up the Rasa framework for working with extended conditions.

The first group can be launched via

@@ -164,9 +165,15 @@ docker compose --profile stats up

This will download and launch Superset Dashboard, Clickhouse, OpenTelemetry Collector.

To launch both groups run
The third group can be launched via

```bash
docker compose --profile ext up
```

To launch all groups run
```bash
docker compose --profile context_storage --profile stats up
docker compose --profile context_storage --profile stats --profile ext up
```

This will be done automatically when running `poetry run poe test_all`.
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
![Chatsky](https://raw.githubusercontent.com/deeppavlov/chatsky/master/docs/source/_static/images/Chatsky-full-dark.svg)
![Chatsky](docs/source/_static/images/Chatsky-full-dark.svg)

[![Documentation Status](https://github.com/deeppavlov/chatsky/workflows/build_and_publish_docs/badge.svg?branch=dev)](https://deeppavlov.github.io/chatsky)
[![Codestyle](https://github.com/deeppavlov/chatsky/workflows/codestyle/badge.svg?branch=dev)](https://github.com/deeppavlov/chatsky/actions/workflows/codestyle.yml)
2 changes: 2 additions & 0 deletions chatsky/__rebuild_pydantic_models__.py
@@ -10,6 +10,8 @@
from chatsky.core.ctx_utils import ServiceState, FrameworkData, ContextMainInfo
from chatsky.core.service import PipelineComponent
from chatsky.llm import LLM_API
from chatsky.ml.models.base_model import ExtrasBaseAPIModel
from chatsky.ml.models.hf_api_model import HFAPIModel

ContextMainInfo.model_rebuild()
ContextDict.model_rebuild()
85 changes: 85 additions & 0 deletions chatsky/conditions/ml.py
@@ -0,0 +1,85 @@
"""
Conditions
------------

This module provides conditions that rely on labels predicted by annotation (label-scoring) models.
"""

from typing import Optional, List

try:
# !!! remove sklearn, use pure python instead
from sklearn.metrics.pairwise import cosine_similarity

sklearn_available = True
except ImportError:
sklearn_available = False
from chatsky import Context
from chatsky.conditions.standard import BaseCondition
from chatsky.ml.models.base_model import ExtrasBaseAPIModel


class HasLabel(BaseCondition):
"""
Use this condition when you need to check whether the probability
of a particular label for the last annotated user utterance surpasses the threshold.

:param label: Name of the label whose predicted score should be checked.
:param model_name: Name of the model in ``Pipeline.models`` that should predict the label.
:param threshold: The minimal label probability that triggers a positive response
from the function.
"""

label: str
model_name: str
threshold: float = 0.9

async def call(self, ctx: Context) -> bool:
model = ctx.pipeline.models[self.model_name]
# Predict labels for the last request
# and store them in framework_data with uuid of the model as a key
await model(ctx)
if model.model_id not in ctx.framework_data.models_labels:
return False
if model.model_id is not None:
return ctx.framework_data.models_labels.get(model.model_id, {}).get(self.label, 0) >= self.threshold
scores = [item.get(self.label, 0) for item in ctx.framework_data.models_labels.values()]
comparison_array = [item >= self.threshold for item in scores]
return any(comparison_array)


class HasMatch(BaseCondition):
"""
Use this condition if you need to check whether the last request matches
any of the pre-defined intent utterances.
The model referenced by ``model_name`` should already be fitted.

:param model_name: Name of a model in ``Pipeline.models`` that implements ``transform``.
:param positive_examples: Utterances that the request should match.
:param negative_examples: Utterances that the request should not match.
:param threshold: Similarity threshold that triggers a positive response from the function.
"""

model_name: str
positive_examples: Optional[List[str]]
negative_examples: Optional[List[str]] = []
threshold: float = 0.9

def __init__(self, *args, **kwargs):
super().__init__(*args, **kwargs)

async def call(self, ctx: Context) -> bool:
if not (ctx.last_request and ctx.last_request.text):
return False

model = ctx.pipeline.models[self.model_name]

input_vector = model.transform(ctx.last_request.text)
positive_vectors = [model.transform(item) for item in self.positive_examples]
negative_vectors = [model.transform(item) for item in self.negative_examples]
positive_sims = [cosine_similarity(input_vector, item)[0][0] for item in positive_vectors]
negative_sims = [cosine_similarity(input_vector, item)[0][0] for item in negative_vectors]
max_pos_sim = max(positive_sims)
max_neg_sim = 0 if len(negative_sims) == 0 else max(negative_sims)
return bool(max_pos_sim > self.threshold > max_neg_sim)
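For orientation (not part of the diff), a minimal sketch of how these conditions could be wired into a script, assuming the Chatsky 1.x script API (`Transition`, `TRANSITIONS`, `RESPONSE`); the model name `"intents"`, the flow/node names, the label `"greet"`, and the credentials path are placeholders:

```python
from chatsky import Pipeline, Transition as Tr, TRANSITIONS, RESPONSE
from chatsky.conditions.ml import HasLabel
from chatsky.ml.models import GoogleDialogFlowModel

# Placeholder model: any ExtrasBaseAPIModel subclass registered in
# Pipeline.models can be referenced by name from HasLabel / HasMatch.
intent_model = GoogleDialogFlowModel.from_file("gdf_account.json", language="en")

script = {
    "flow": {
        "start": {
            RESPONSE: "Hi! How can I help?",
            TRANSITIONS: [
                # Fires when the model registered as "intents" scores the
                # "greet" label at or above 0.8 for the last request.
                Tr(dst="greeting", cnd=HasLabel(label="greet", model_name="intents", threshold=0.8)),
            ],
        },
        "greeting": {RESPONSE: "Hello there!"},
    }
}

pipeline = Pipeline(
    script=script,
    start_label=("flow", "start"),
    models={"intents": intent_model},  # exposed to conditions via ctx.pipeline.models
)
```

If `threshold` is omitted, the default of 0.9 from the class definition applies.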
5 changes: 5 additions & 0 deletions chatsky/core/ctx_utils.py
@@ -64,6 +64,11 @@ class FrameworkData(BaseModel, arbitrary_types_allowed=True):
"Enables complex stats collection across multiple turns."
slot_manager: SlotManager = Field(default_factory=SlotManager)
"Stores extracted slots."
models_labels: Dict[str, Dict[str, float]] = Field(default_factory=dict)
"""
Stores labels predicted by models.
The key is the model id, the value is a dictionary with labels and their probabilities.
"""


class ContextMainInfo(BaseModel):
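As a sketch of the layout this field introduces (the helper below is illustrative only; the `model_id` key comes from the model instance that wrote the scores):

```python
from typing import Optional

from chatsky import Context


def top_label(ctx: Context, model_id: str) -> Optional[str]:
    # models_labels maps model id -> {label: probability}, as documented above.
    scores = ctx.framework_data.models_labels.get(model_id, {})
    return max(scores, key=scores.get) if scores else None
```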
3 changes: 2 additions & 1 deletion chatsky/core/pipeline.py
@@ -32,6 +32,7 @@
from chatsky.core.script_parsing import JSONImporter, Path

if TYPE_CHECKING:
from chatsky.ml.models.base_model import ExtrasBaseAPIModel
from chatsky.llm.llm_api import LLM_API

logger = logging.getLogger(__name__)
@@ -82,7 +83,7 @@ class Pipeline(BaseModel, extra="forbid", arbitrary_types_allowed=True):
"""
Slots configuration.
"""
models: Dict[str, LLM_API] = Field(default_factory=dict)
models: Dict[str, Union[LLM_API, ExtrasBaseAPIModel]] = Field(default_factory=dict)
"""
LLM and label-scoring (ML annotation) models made available to conditions and custom functions via ``ctx.pipeline.models``.
"""
1 change: 1 addition & 0 deletions chatsky/ml/__init__.py
@@ -0,0 +1 @@
# -*- coding: utf-8 -*-
3 changes: 3 additions & 0 deletions chatsky/ml/models/__init__.py
@@ -0,0 +1,3 @@
from .google_dialogflow_model import GoogleDialogFlowModel # noqa: F401
from .rasa_model import RasaModel # noqa: F401
from .hf_api_model import HFAPIModel # noqa: F401
58 changes: 58 additions & 0 deletions chatsky/ml/models/base_model.py
@@ -0,0 +1,58 @@
"""
Base Model
-----------
This module defines an abstract interface for label-scoring models, :py:class:`~ExtrasBaseAPIModel`.
When defining custom label-scoring models, always inherit from this class.
"""

from copy import copy
from abc import ABC, abstractmethod

from chatsky import Context

import uuid


class ExtrasBaseAPIModel(ABC):
"""
Base class for label-scoring models running on remote server and accessed via API.
Predicted scores for labels are stored in :py:class:`~chatsky.script.Context.framework_data`.
"""

def __init__(self) -> None:
self.model_id = uuid.uuid4()

def __deepcopy__(self, *args, **kwargs):
return copy(self)

@abstractmethod
async def predict(self, request: str) -> dict:
"""
Predict the probability of one or several classes.

:param request: Target request string.
"""
raise NotImplementedError

async def transform(self, request: str):
"""
Get a numeric representation of the input data.

:param request: Target request string.
"""
raise NotImplementedError

async def __call__(self, ctx: Context):
"""
Predicts labels for the last request and saves them to
``ctx.framework_data.models_labels`` under this model's ``model_id``.
"""

if ctx.last_request and ctx.last_request.text:
labels: dict = await self.predict(ctx.last_request.text)
else:
labels = dict()

ctx.framework_data.models_labels[self.model_id] = labels

return ctx
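To illustrate the contract, here is a hypothetical minimal subclass (not part of this PR) that scores labels by keyword lookup; only `predict` has to be implemented, while `__call__` from the base class handles storing the result:

```python
from typing import Dict

from chatsky.ml.models.base_model import ExtrasBaseAPIModel


class KeywordModel(ExtrasBaseAPIModel):
    """Toy model: a label scores 1.0 whenever its keyword occurs in the request."""

    def __init__(self, keywords: Dict[str, str]) -> None:
        super().__init__()  # assigns the model_id used as the storage key
        self.keywords = keywords  # label -> keyword

    async def predict(self, request: str) -> dict:
        lowered = request.lower()
        return {label: 1.0 for label, word in self.keywords.items() if word in lowered}
```

Such an instance could then be registered in `Pipeline.models` and queried with `HasLabel`, just like the API-backed models in the following modules.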
84 changes: 84 additions & 0 deletions chatsky/ml/models/google_dialogflow_model.py
@@ -0,0 +1,84 @@
"""
Google Dialogflow Model
------------------------

The module allows you to use Google Dialogflow as a service
to gain insights about user intents.
"""

import uuid
import json
from pathlib import Path
from async_lru import alru_cache

from chatsky.ml.models.base_model import ExtrasBaseAPIModel

try:
from google.cloud import dialogflow_v2
from google.oauth2 import service_account

dialogflow_available = True
except ImportError:
dialogflow_v2 = None
service_account = None
dialogflow_available = False


class GoogleDialogFlowModel(ExtrasBaseAPIModel):
"""
This class implements an asynchronous connection to Google Dialogflow for dialog annotation.
Note, that before you use the class, you need to set up a Dialogflow project,
create intents, and train a language model, which can be easily done
using the Dialogflow web interface (see the official
`instructions <https://cloud.google.com/dialogflow/es/docs/quick/build-agent>`_).
After this is done, you should obtain a service account JSON file from Google
and pass it to this class, using :py:meth:`~from_file` method.

:param model: A parsed service account json for your dialogflow project.
Calling json.load() on the file obtained from Google is sufficient to get the
credentials object. Alternatively, use :py:meth:`~from_file` method.
:param language: Language parameter is passed to the Dialogflow wrapper.
"""

def __init__(
self,
model: dict,
*,
language: str = "en",
) -> None:
if not dialogflow_available:
raise ImportError("`google-cloud-dialogflow` package missing. Try `pip install chatsky[dialogflow]`.")
super().__init__()
self._language = language
if isinstance(model, dict):
info = model
else:
raise ValueError("Please, pass the service account credentials as dict.")

self._credentials = service_account.Credentials.from_service_account_info(info)

@classmethod
def from_file(cls, filename: str, language: str = "en"):
"""
:param filename: Path to the Dialogflow credentials saved as JSON.
:param language: The language parameter is forwarded to the underlying
Dialogflow wrapper.
"""
assert Path(filename).exists(), f"Path {filename} does not exist."
with open(filename, "r", encoding="utf-8") as file:
info = json.load(file)
return cls(model=info, language=language)

@alru_cache(maxsize=10)
async def predict(self, request: str) -> dict:
session_id = uuid.uuid4()
session_client = dialogflow_v2.SessionsAsyncClient(credentials=self._credentials)
session_path = session_client.session_path(self._credentials.project_id, session_id)
query_input = dialogflow_v2.QueryInput(text=dialogflow_v2.TextInput(text=request, language_code=self._language))
request = dialogflow_v2.DetectIntentRequest(session=session_path, query_input=query_input)
response = await session_client.detect_intent(request=request)
result: dialogflow_v2.QueryResult = response.query_result
if result.intent is not None:
return {result.intent.display_name: result.intent_detection_confidence}
return {}
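A hedged usage sketch, assuming a trained Dialogflow agent and a service-account file at the placeholder path `gdf_account.json`; the utterance and intent name are illustrative:

```python
import asyncio

from chatsky.ml.models import GoogleDialogFlowModel


async def main() -> None:
    # Requires the Dialogflow client library; see the ImportError message above.
    model = GoogleDialogFlowModel.from_file("gdf_account.json", language="en")
    scores = await model.predict("I'd like to order a pizza")
    print(scores)  # e.g. {"order.pizza": 0.93} -- top intent mapped to its confidence


if __name__ == "__main__":
    asyncio.run(main())
```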