merge main
micpst committed Sep 2, 2024
2 parents 9e37c82 + 9f6b5df commit db368ca
Showing 50 changed files with 2,352 additions and 579 deletions.
30 changes: 20 additions & 10 deletions README.md
@@ -1,18 +1,26 @@
# <h1 align="center">🦮 db-ally</h1>
<div align="center">

<p align="center">
<em>Efficient, consistent and secure library for querying structured data with natural language</em>
<picture>
<source media="(prefers-color-scheme: light)" srcset="docs/assets/banner-light.svg">
<img alt="dbally logo" src="docs/assets/banner-dark.svg" width="40%" height="40%">
</picture>

<br/>
<br/>

<p>
<em>Efficient, consistent and secure library for querying structured data with natural language</em>
</p>

---
[![PyPI - License](https://img.shields.io/pypi/l/dbally)](https://pypi.org/project/dbally)
[![PyPI - Version](https://img.shields.io/pypi/v/dbally)](https://pypi.org/project/dbally)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/dbally)](https://pypi.org/project/dbally)

* **Documentation:** [db-ally.deepsense.ai](https://db-ally.deepsense.ai/)
* **Source code:** [github.com/deepsense-ai/db-ally](https://github.com/deepsense-ai/db-ally)
</div>

---


**db-ally** is an LLM-powered library for creating natural language interfaces to data sources. While it occupies a similar space to the text-to-SQL solutions, its goals and methods are different. db-ally allows developers to outline specific use cases for the LLM to handle, detailing the desired data format and the possible operations to fetch this data.
db-ally is an LLM-powered library for creating natural language interfaces to data sources. While it occupies a similar space to the text-to-SQL solutions, its goals and methods are different. db-ally allows developers to outline specific use cases for the LLM to handle, detailing the desired data format and the possible operations to fetch this data.

db-ally effectively shields the complexity of the underlying data source from the model, presenting only the essential information needed for solving the specific use cases. Instead of generating arbitrary SQL, the model is asked to generate responses in a simplified query language.

@@ -25,7 +33,7 @@ The benefits of db-ally can be described in terms of its four main characteristi

## Quickstart

In db-ally, developers define their use cases by implementing [**views**](https://db-ally.deepsense.ai/concepts/views) and **filters**. A list of possible filters is presented to the LLM in terms of [**IQL**](https://db-ally.deepsense.ai/concepts/iql) (Intermediate Query Language). Views are grouped and registered within a [**collection**](https://db-ally.deepsense.ai/concepts/views), which then serves as an entry point for asking questions in natural language.
In db-ally, developers define their use cases by implementing [**views**](https://db-ally.deepsense.ai/concepts/views), **filters** and **aggregations**. A list of possible filters and aggregations is presented to the LLM in terms of [**IQL**](https://db-ally.deepsense.ai/concepts/iql) (Intermediate Query Language). Views are grouped and registered within a [**collection**](https://db-ally.deepsense.ai/concepts/views), which then serves as an entry point for asking questions in natural language.

This is a basic implementation of a db-ally view for an example HR application, which retrieves candidates from an SQL database:

@@ -52,8 +60,10 @@ class CandidateView(SqlAlchemyBaseView):
"""
return Candidate.country == country

engine = create_engine('sqlite:///examples/recruiting/data/candidates.db')

llm = LiteLLM(model_name="gpt-3.5-turbo")
engine = create_engine("sqlite:///examples/recruiting/data/candidates.db")

my_collection = create_collection("collection_name", llm)
my_collection.add(CandidateView, lambda: CandidateView(engine))

26 changes: 25 additions & 1 deletion benchmarks/sql/bench/pipelines/base.py
@@ -1,8 +1,12 @@
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, Dict, Optional
from typing import Any, Dict, Optional, Union

from dbally.iql._exceptions import IQLError
from dbally.iql._query import IQLQuery
from dbally.iql_generator.prompt import UnsupportedQueryError
from dbally.llms.base import LLM
from dbally.llms.clients.exceptions import LLMError
from dbally.llms.litellm import LiteLLM
from dbally.llms.local import LocalLLM

@@ -16,6 +20,25 @@ class IQL:
    source: Optional[str] = None
    unsupported: bool = False
    valid: bool = True
    generated: bool = True

    @classmethod
    def from_query(cls, query: Optional[Union[IQLQuery, Exception]]) -> "IQL":
        """
        Creates an IQL object from the query.

        Args:
            query: The IQL query or exception.

        Returns:
            The IQL object.
        """
        return cls(
            source=query.source if isinstance(query, (IQLQuery, IQLError)) else None,
            unsupported=isinstance(query, UnsupportedQueryError),
            valid=not isinstance(query, IQLError),
            generated=not isinstance(query, LLMError),
        )


@dataclass
@@ -47,6 +70,7 @@ class EvaluationResult:
"""

db_id: str
question_id: str
question: str
reference: ExecutionResult
prediction: ExecutionResult
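
Reviewer note: the new `IQL.from_query` classmethod in this file maps a generation outcome (a query, an exception, or `None`) onto the four `IQL` flags. The following is a minimal, self-contained sketch of that mapping. The exception and query classes are hypothetical stand-ins for the dbally imports shown in the hunk, and their constructors and inheritance are assumptions, not the library's actual definitions.

```python
from dataclasses import dataclass
from typing import Optional, Union


# Hypothetical stand-ins for the dbally classes imported in the diff.
# The real classes may have different constructors and base classes.
class LLMError(Exception):
    """Stand-in for dbally.llms.clients.exceptions.LLMError."""


class UnsupportedQueryError(Exception):
    """Stand-in for dbally.iql_generator.prompt.UnsupportedQueryError."""


class IQLError(Exception):
    """Stand-in for dbally.iql._exceptions.IQLError, assumed to carry the source."""

    def __init__(self, message: str, source: str) -> None:
        super().__init__(message)
        self.source = source


@dataclass
class IQLQuery:
    """Stand-in for dbally.iql._query.IQLQuery, reduced to its source string."""

    source: str


@dataclass
class IQL:
    source: Optional[str] = None
    unsupported: bool = False
    valid: bool = True
    generated: bool = True

    @classmethod
    def from_query(cls, query: Optional[Union[IQLQuery, Exception]]) -> "IQL":
        # Same isinstance-based mapping as the method added in the diff.
        return cls(
            source=query.source if isinstance(query, (IQLQuery, IQLError)) else None,
            unsupported=isinstance(query, UnsupportedQueryError),
            valid=not isinstance(query, IQLError),
            generated=not isinstance(query, LLMError),
        )


# One call per outcome the benchmark distinguishes (sample values are made up).
ok = IQL.from_query(IQLQuery(source='filter_by_country("France")'))  # parsed query
bad = IQL.from_query(IQLError("syntax error", source="filter_by("))  # invalid IQL
unsupported = IQL.from_query(UnsupportedQueryError())  # question not answerable
failed = IQL.from_query(LLMError("timeout"))  # LLM call itself failed
empty = IQL.from_query(None)  # nothing was generated or needed
```

Under these assumptions, only an `IQLError` clears `valid`, only an `LLMError` clears `generated`, and `None` yields the dataclass defaults, which is why the pipelines can pass `exc.iql.filters` and `exc.iql.aggregation` straight through without inspecting `exc.__cause__`.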
40 changes: 9 additions & 31 deletions benchmarks/sql/bench/pipelines/collection.py
@@ -5,10 +5,8 @@
import dbally
from dbally.collection.collection import Collection
from dbally.collection.exceptions import NoViewFoundError
from dbally.iql._exceptions import IQLError
from dbally.iql_generator.prompt import UnsupportedQueryError
from dbally.view_selection.llm_view_selector import LLMViewSelector
from dbally.views.exceptions import IQLGenerationError
from dbally.views.exceptions import ViewExecutionError

from ..views import VIEWS_REGISTRY
from .base import IQL, EvaluationPipeline, EvaluationResult, ExecutionResult, IQLResult
@@ -74,44 +72,23 @@ async def __call__(self, data: Dict[str, Any]) -> EvaluationResult:
return_natural_response=False,
)
except NoViewFoundError:
prediction = ExecutionResult(
view_name=None,
iql=None,
sql=None,
)
except IQLGenerationError as exc:
prediction = ExecutionResult()
except ViewExecutionError as exc:
prediction = ExecutionResult(
view_name=exc.view_name,
iql=IQLResult(
filters=IQL(
source=exc.filters,
unsupported=isinstance(exc.__cause__, UnsupportedQueryError),
valid=not (exc.filters and not exc.aggregation and isinstance(exc.__cause__, IQLError)),
),
aggregation=IQL(
source=exc.aggregation,
unsupported=isinstance(exc.__cause__, UnsupportedQueryError),
valid=not (exc.aggregation and isinstance(exc.__cause__, IQLError)),
),
filters=IQL.from_query(exc.iql.filters),
aggregation=IQL.from_query(exc.iql.aggregation),
),
sql=None,
)
else:
prediction = ExecutionResult(
view_name=result.view_name,
iql=IQLResult(
filters=IQL(
source=result.context.get("iql"),
unsupported=False,
valid=True,
),
aggregation=IQL(
source=None,
unsupported=False,
valid=True,
),
filters=IQL(source=result.context["iql"]["filters"]),
aggregation=IQL(source=result.context["iql"]["aggregation"]),
),
sql=result.context.get("sql"),
sql=result.context["sql"],
)

reference = ExecutionResult(
@@ -134,6 +111,7 @@ async def __call__(self, data: Dict[str, Any]) -> EvaluationResult:

return EvaluationResult(
db_id=data["db_id"],
question_id=data["question_id"],
question=data["question"],
reference=reference,
prediction=prediction,
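
Reviewer note: in the success branch of this file, the generated IQL is now read from a nested mapping (`result.context["iql"]["filters"]` and `result.context["iql"]["aggregation"]`) rather than from a single string, and `sql` is read with `[]` instead of `.get`. A minimal sketch of that access pattern, assuming the context is a plain dict of this shape. The helper `split_iql` and the sample values are hypothetical and not part of dbally.

```python
from typing import Any, Dict, Optional, Tuple


def split_iql(context: Dict[str, Any]) -> Tuple[Optional[str], Optional[str]]:
    """Return (filters, aggregation) from a context dict shaped like the diff's.

    Uses plain indexing, as the diff does, so a missing key raises KeyError
    instead of silently returning None.
    """
    iql = context["iql"]
    return iql["filters"], iql["aggregation"]


# Hypothetical context, shaped like the keys accessed in the diff.
sample_context = {
    "iql": {"filters": 'filter_by_country("France")', "aggregation": None},
    "sql": "SELECT * FROM candidates WHERE country = 'France'",
}

filters, aggregation = split_iql(sample_context)
```

Indexing rather than `.get` makes a malformed context fail loudly during benchmarking instead of producing an empty prediction.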
35 changes: 8 additions & 27 deletions benchmarks/sql/bench/pipelines/view.py
@@ -5,9 +5,7 @@

from sqlalchemy import create_engine

from dbally.iql._exceptions import IQLError
from dbally.iql_generator.prompt import UnsupportedQueryError
from dbally.views.exceptions import IQLGenerationError
from dbally.views.exceptions import ViewExecutionError
from dbally.views.freeform.text2sql.view import BaseText2SQLView
from dbally.views.sqlalchemy_base import SqlAlchemyBaseView

@@ -94,37 +92,20 @@ async def __call__(self, data: Dict[str, Any]) -> EvaluationResult:
dry_run=True,
n_retries=0,
)
except IQLGenerationError as exc:
except ViewExecutionError as exc:
prediction = ExecutionResult(
view_name=data["view_name"],
iql=IQLResult(
filters=IQL(
source=exc.filters,
unsupported=isinstance(exc.__cause__, UnsupportedQueryError),
valid=not (exc.filters and not exc.aggregation and isinstance(exc.__cause__, IQLError)),
),
aggregation=IQL(
source=exc.aggregation,
unsupported=isinstance(exc.__cause__, UnsupportedQueryError),
valid=not (exc.aggregation and isinstance(exc.__cause__, IQLError)),
),
filters=IQL.from_query(exc.iql.filters),
aggregation=IQL.from_query(exc.iql.aggregation),
),
sql=None,
)
else:
prediction = ExecutionResult(
view_name=data["view_name"],
iql=IQLResult(
filters=IQL(
source=result.context["iql"],
unsupported=False,
valid=True,
),
aggregation=IQL(
source=None,
unsupported=False,
valid=True,
),
filters=IQL(source=result.context["iql"]["filters"]),
aggregation=IQL(source=result.context["iql"]["aggregation"]),
),
sql=result.context["sql"],
)
@@ -135,12 +116,10 @@ async def __call__(self, data: Dict[str, Any]) -> EvaluationResult:
filters=IQL(
source=data["iql_filters"],
unsupported=data["iql_filters_unsupported"],
valid=True,
),
aggregation=IQL(
source=data["iql_aggregation"],
unsupported=data["iql_aggregation_unsupported"],
valid=True,
),
context=data["iql_context"],
),
@@ -149,6 +128,7 @@ async def __call__(self, data: Dict[str, Any]) -> EvaluationResult:

return EvaluationResult(
db_id=data["db_id"],
question_id=data["question_id"],
question=data["question"],
reference=reference,
prediction=prediction,
@@ -209,6 +189,7 @@ async def __call__(self, data: Dict[str, Any]) -> EvaluationResult:

return EvaluationResult(
db_id=data["db_id"],
question_id=data["question_id"],
question=data["question"],
reference=reference,
prediction=prediction,
1 change: 0 additions & 1 deletion benchmarks/sql/bench/views/structured/superhero.py
@@ -552,7 +552,6 @@ class SuperheroView(
SqlAlchemyBaseView,
SuperheroAggregationMixin,
SuperheroFilterMixin,
SuperheroColourAggregationMixin,
SuperheroColourFilterMixin,
AlignmentAggregationMixin,
AlignmentFilterMixin,
2 changes: 1 addition & 1 deletion docs/about/roadmap.md
@@ -9,7 +9,7 @@ Below you can find a list of planned features and integrations.

## Planned Features

- [ ] **Support analytical queries**: support for exposing operations beyond filtering.
- [x] **Support analytical queries**: support for exposing operations beyond filtering.
- [x] **Few-shot prompting configuration**: allow users to configure the few-shot prompting in View definition to
improve IQL generation accuracy.
- [ ] **Request contextualization**: allow to provide extra context for db-ally runs, such as user asking the question.