Release v0.5.0 (#207)

* Added Command Execution backend which uses the Command Execution API on
a cluster ([#95](#95)). In
this release, the Databricks Labs lsql library has been updated with a
new Command Execution backend that utilizes the Command Execution API. A
new `CommandExecutionBackend` class has been implemented, which
initializes a `CommandExecutor` instance taking a cluster ID, workspace
client, and language as parameters. The `execute` method runs SQL
commands on the specified cluster, and the `fetch` method returns the
query result as an iterator of Row objects. The existing
`StatementExecutionBackend` class has been updated to inherit from a new
abstract base class called `ExecutionBackend`, which includes a
`save_table` method for saving data to tables and is meant to be a
common base class for both Statement and Command Execution backends. The
`StatementExecutionBackend` class has also been updated to use the new
`ExecutionBackend` abstract class and its constructor now accepts a
`max_records_per_batch` parameter. The `execute` and `fetch` methods
have been updated to use the new `_only_n_bytes` method for logging
truncated SQL statements. Additionally, the `CommandExecutionBackend`
class provides `execute`, `fetch`, and `save_table` methods to execute
commands on a cluster and save the results to tables in the Databricks
workspace.
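
  A minimal usage sketch, assuming the `CommandExecutionBackend` takes a
  workspace client and cluster ID as described above (the cluster ID and
  table names below are placeholders):

  ```python
  from databricks.sdk import WorkspaceClient
  from databricks.labs.lsql.backends import CommandExecutionBackend

  ws = WorkspaceClient()
  # Runs SQL through the Command Execution API on an all-purpose cluster.
  backend = CommandExecutionBackend(ws, "0123-456789-abcdef00")

  backend.execute("CREATE TABLE IF NOT EXISTS samples.demo (id INT)")
  for row in backend.fetch("SELECT id FROM samples.demo LIMIT 10"):
      print(row)
  ```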
* Added basic integration with Lakeview Dashboards
([#66](#66)). In this
release, we've added basic integration with Lakeview Dashboards to the
project, enhancing its capabilities. This includes updating the
`databricks-labs-blueprint` dependency to version 0.4.2 with the
`[yaml]` extra, allowing for additional functionality related to
handling YAML files. A new file, `dashboards.py`, has been introduced,
providing a class for interacting with Databricks dashboards, along with
methods for retrieving and saving dashboard configurations.
Additionally, a new `__init__.py` file under the
`src/databricks/labs/lsql/lakeview` directory imports all classes and
functions from the `model.py` module, providing a foundation for further
development and customization. The release also introduces a new file,
`model.py`, containing code generated from OpenAPI specs by the
Databricks SDK Generator, and a template file, `model.py.tmpl`, used for
handling JSON data during integration with Lakeview Dashboards. A new
file, `polymorphism.py`, provides utilities for checking if a value can
be assigned to a specific type, supporting correct data typing and
formatting with Lakeview Dashboards. Furthermore, a `.gitignore` file
has been added to the `tests/integration` directory as part of the
initial steps in adding integration testing to ensure compatibility with
the Lakeview Dashboards platform. Lastly, the `test_dashboards.py` file
in the `tests/integration` directory contains a function,
`test_load_dashboard(ws)`, which uses the `Dashboards` class to save a
dashboard from a source to a destination path, facilitating testing
during the integration process.
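
  As a rough sketch of the round trip this integration enables (the
  method names match those described in the `Improve dashboard as code`
  entry below; exact signatures may differ):

  ```python
  from pathlib import Path

  from databricks.sdk import WorkspaceClient
  from databricks.labs.lsql.dashboards import Dashboards

  ws = WorkspaceClient()
  dashboards = Dashboards(ws)

  # Retrieve a Lakeview dashboard definition from the workspace ...
  dashboard = dashboards.get_dashboard("/Workspace/Users/me@example.com/my.lvdash.json")
  # ... and save it to a local folder for inspection and editing.
  dashboards.save_to_folder(dashboard, Path("my_dashboard"))
  ```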
* Added dashboard-as-code functionality
([#201](#201)). This commit
introduces dashboard-as-code functionality for the UCX project, enabling
the creation and management of dashboards using code. The feature
resolves multiple issues and includes a new `create-dashboard` command
for creating unpublished dashboards. The functionality is available in
the `lsql` lab and allows for specifying the order and width of widgets,
overriding default widget identifiers, and supporting various SQL and
markdown header arguments. The `dashboard.yml` file is used to define
top-level metadata for the dashboard. This commit also includes
extensive documentation and examples for using the dashboard as a
library and configuring different options.
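
  The folder layout this feature consumes looks roughly as follows; the
  `display_name` key and the header-argument syntax are described in
  later entries, so treat the exact spelling here as illustrative:

  ```python
  from pathlib import Path

  folder = Path("my_dashboard")
  folder.mkdir(exist_ok=True)

  # Top-level metadata lives in dashboard.yml.
  (folder / "dashboard.yml").write_text("display_name: My Dashboard\n")

  # Each .sql file becomes a tile; a header comment can override
  # widget defaults such as width and height.
  (folder / "counter.sql").write_text(
      "-- --width 2 --height 3\n"
      "SELECT COUNT(*) AS count FROM samples.nyctaxi.trips\n"
  )
  ```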
* Automate opening integration test dashboard in debug mode
([#167](#167)). A new
feature has been added to automatically open the integration test
dashboard in debug mode, making it easier for software engineers to
debug and troubleshoot. This has been achieved by importing the
`webbrowser` module and the `is_in_debug` function from
`databricks.labs.blueprint.entrypoint`, and adding a check in the
`create` function to determine if the code is running in debug mode. If
it is, a dashboard URL is constructed from the workspace configuration
and dashboard ID, and then opened in a web browser using
`webbrowser.open`. This allows for a more streamlined debugging process
for the integration test dashboard. No other parts of the code have been
affected by this change.
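
  A condensed sketch of the check described above (the dashboard URL
  pattern is illustrative, not taken from the source):

  ```python
  import webbrowser

  from databricks.labs.blueprint.entrypoint import is_in_debug

  def open_dashboard_when_debugging(host: str, dashboard_id: str) -> None:
      # Only open a browser tab when running under a debugger.
      if is_in_debug():
          # Illustrative URL layout; the real code builds it from the
          # workspace configuration and dashboard ID.
          webbrowser.open(f"{host}/sql/dashboardsv3/{dashboard_id}")
  ```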
* Automatically tile widgets
([#109](#109)). In this
release, we've introduced an automatic widget tiling feature for the
dashboard creation process in our open-source library. The `Dashboards`
class now includes a new class variable, `_maximum_dashboard_width`, set
to 6, representing the maximum width allowed for each row of widgets in
the dashboard. The `create_dashboard` method has been converted from a
static method into an instance method. A new
`_get_position` method has been introduced to calculate and return the
next available position for placing a widget, and a
`_get_width_and_height` method has been added to return the width and
height for a widget specification, initially handling `CounterSpec`
instances. Additionally, we've added new unit tests to improve testing
coverage, ensuring that widgets are created, positioned, and sized
correctly. These tests also cover the correct positioning of widgets
based on their order and available space, as well as the expected width
and height for each widget.
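
  A simplified sketch of the tiling rule (the real `_get_position`
  inspects previously placed widgets; this version just tracks a cursor
  and wraps when a row is full):

  ```python
  _MAXIMUM_DASHBOARD_WIDTH = 6  # grid units per dashboard row

  def next_position(x: int, y: int, width: int, height: int) -> tuple[int, int]:
      """Return the top-left (x, y) for a widget of the given size."""
      if x + width > _MAXIMUM_DASHBOARD_WIDTH:
          x, y = 0, y + height  # row is full: wrap to the next row
      return x, y

  # Three widgets of width 4 cannot share a 6-unit row, so each wraps.
  x = y = 0
  for _ in range(3):
      x, y = next_position(x, y, 4, 3)
      print(x, y)  # prints 0 0, then 0 3, then 0 6
      x += 4
  ```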
* Bump actions/checkout from 4.1.3 to 4.1.6
([#102](#102)). In the
latest release, the `actions/checkout` GitHub Action has been updated
from version 4.1.3 to 4.1.6, which includes checking the platform to set
the archive extension appropriately. This release also bumps the version
of github/codeql-action from 2 to 3, actions/setup-node from 1 to 4, and
actions/upload-artifact from 2 to 4. Additionally, the
minor-actions-dependencies group was updated with two new versions.
Disabling extensions.worktreeConfig when disabling sparse-checkout was
introduced in version 4.1.4. The release notes and changelog for this
update can be found in the provided link. This commit was made by
dependabot[bot] with contributions from cory-miller and jww3.
* Bump actions/checkout from 4.1.6 to 4.1.7
([#151](#151)). In the
latest release, the `actions/checkout` GitHub Action has been updated
from version 4.1.6 to 4.1.7 in the project's push workflow, which checks
out the repository at the start of the workflow. This change brings
potential bug fixes, performance improvements, or new features compared
to the previous version. The update only affects the version number in
the YAML configuration for the `actions/checkout` step in the
`release.yml` file, with no new methods or alterations to existing
functionality. This update aims to ensure a smooth and enhanced user
experience for those utilizing the project's push workflows by taking
advantage of the possible improvements or bug fixes in the new version
of `actions/checkout`.
* Create a dashboard with a counter from a single query
([#107](#107)). In this
release, we have introduced several enhancements to our
dashboard-as-code approach, including the creation of a `Dashboards`
class that provides methods for getting, saving, and deploying
dashboards. A new method, `create_dashboard`, has been added to create a
dashboard with a single page containing a counter widget. The counter
widget is associated with a query that counts the number of rows in a
specified dataset. The `deploy_dashboard` method has also been added to
deploy the dashboard to the workspace. Additionally, we have implemented
a new feature for creating dashboards with a counter from a single
query, including modifications to the `test_dashboards.py` file and the
addition of four new tests. These changes improve the robustness of the
dashboard creation process and provide a more automated way to view
important metrics.
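
  An end-to-end sketch of this flow, using the method names from this
  entry (exact signatures are a guess):

  ```python
  from pathlib import Path

  from databricks.sdk import WorkspaceClient
  from databricks.labs.lsql.dashboards import Dashboards

  ws = WorkspaceClient()
  dashboards = Dashboards(ws)

  # A folder with a single counting query yields a one-page dashboard
  # containing a single counter widget.
  folder = Path("single_counter")
  folder.mkdir(exist_ok=True)
  (folder / "counter.sql").write_text(
      "SELECT COUNT(*) AS count FROM samples.nyctaxi.trips\n"
  )

  dashboard = dashboards.create_dashboard(folder)
  dashboards.deploy_dashboard(dashboard, display_name="Single counter")
  ```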
* Create text widget from markdown file
([#142](#142)). A new
feature has been implemented in the library that allows for the creation
of a text widget from a markdown file, enhancing customization and
readability for users. This development resolves issue
[#1](#1).
* Design document for dashboards-as-code
([#105](#105)). "The latest
release introduces 'Dashboards as Code,' a method for defining and
managing dashboards through configuration files, enabling version
control and controlled changes. The building blocks include `.sql`,
`.md`, and `dashboard.yml` files, with `.sql` defining queries and
determining tile order, and `dashboard.yml` specifying top-level
metadata and tile overrides. Metadata can be inferred or explicitly
defined in the query or files. The tile order can be determined by SQL
file order, `tiles` order in `dashboard.yml`, or SQL file metadata. This
project can also be used as a library for embedding dashboard generation
in your code. Configuration precedence follows command-line flags, SQL
file headers, `dashboard.yml`, and SQL query content. The command-line
interface is utilized for dashboard generation from configuration
files."
* Ensure propagation of `lsql` version into `User-Agent` header when it
is used as a library
([#206](#206)). In this
release, the `pyproject.toml` file has been updated to ensure that the
correct version of the `lsql` library is propagated into the
`User-Agent` header when used as a library, improving attribution. The
`databricks-sdk` version has been updated from `0.22.0` to `0.29.0`, and
the `__init__.py` file of the `lsql` library has been modified to add
the `with_user_agent_extra` function from the `databricks.sdk.core`
package for correct attribution. The `backends.py` file has also been
updated with improved type handling in the `_row_to_sql` and
`save_table` functions for accurate SQL insertion and handling of
user-defined classes. Additionally, a test has been added to ensure that
the `lsql` version is correctly propagated in the `User-Agent` header
when used as a library. These changes offer improved functionality and
accurate type handling, making it easier for developers to identify the
library version when used in other projects.
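
  A sketch of the attribution call, assuming the package exposes its
  version via an `__about__` module:

  ```python
  from databricks.sdk.core import with_user_agent_extra

  from databricks.labs.lsql.__about__ import __version__

  # Tags every outgoing SDK request, e.g. "... lsql/0.5.0", so that
  # downstream services can attribute traffic to the library.
  with_user_agent_extra("lsql", __version__)
  ```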
* Fixed counter encodings
([#143](#143)). In this
release, we have improved the encoding of counters in the lsql dashboard
by modifying the `create_dashboard` function in the `dashboards.py`
file. Previously, the counter field encoding was hardcoded as `count`;
it now dynamically uses the name of the first field of the given
fields, since counters are expected to have only one field.
Additionally, a new integration test has been added to the
`tests/integration/test_dashboards.py` file to ensure that the dashboard
deployment functionality correctly handles SQL queries that do not
perform a count. A new test for the `Dashboards` class has also been
added to check that counter field encoding names are created as
expected. The `WorkspaceClient` is mocked and not called in this test.
These changes enhance the accuracy of counter encoding and improve the
overall functionality and reliability of the lsql dashboard.
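
  A sketch of the encoding fix (`CounterFieldEncoding` is part of the
  generated Lakeview model; the helper shown here is hypothetical):

  ```python
  from databricks.labs.lsql.lakeview import CounterFieldEncoding

  def counter_encoding(fields):
      # Counters are expected to have exactly one field, so use its
      # actual name instead of the previously hardcoded "count".
      field = fields[0]
      return CounterFieldEncoding(field_name=field.name, display_name=field.name)
  ```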
* Fixed non-existing reference and typo in the documentation
([#104](#104)). In this
release, we've made improvements to the documentation of our open-source
library, specifically addressing issue
[#104](#104). The changes
include fixing a non-existent reference and a typo in the `Library size
comparison` section of the `comparison.md` document. This section
provides guidance for selecting a library based on factors like library
size, unified authentication, and compatibility with various Databricks
warehouses and SQL Python APIs. The updates clarify the required
dependency size for simple applications and scripts, and offer more
detailed information about each library option. We've also added a new
subsection titled `Detailed comparison` to provide a more comprehensive
overview of each library's features. These changes are intended to help
software engineers better understand which library is best suited for
their specific needs; for applications that transfer large amounts of
data serialized in Apache Arrow format and need low result-fetching
latency, the documentation recommends the Databricks SQL Connector for
Python.
* Fixed parsing message
([#146](#146)). In this
release, the warning message logged during the creation of a dashboard
when a ParseError occurs has been updated to provide clearer and more
detailed information about the parsing error. The new error message now
includes the specific query being parsed and the exact parsing error,
enabling developers to quickly identify the cause of parsing issues.
This change ensures that engineers can efficiently diagnose and address
parsing errors, improving the overall development and debugging
experience with a more informative log format: "Parsing {query}:
{error}".
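
  A sketch of the updated log call (function name hypothetical;
  `sqlglot` is the parser used elsewhere in this release):

  ```python
  import logging

  import sqlglot

  logger = logging.getLogger(__name__)

  def try_parse(query: str):
      try:
          return sqlglot.parse_one(query)
      except sqlglot.ParseError as e:
          # Surface both the offending query and the exact parser error.
          logger.warning(f"Parsing {query}: {e}")
          return None
  ```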
* Improve dashboard as code
([#108](#108)). The
`Dashboards` class in the `dashboards.py` file has been updated to
improve functionality and usability, with changes such as the addition
of a type variable `T` for type checking and more descriptive names for
methods. The `save_to_folder` method now accepts a `Dashboard` object
and returns a `Dashboard` object, and a new static method
`create_dashboard` has been added. Additionally, two new methods
`_with_better_names` and `_replace_names` have been added for improved
readability. The `get_dashboard` method now returns a `Dashboard` object
instead of a dictionary. The `save_to_folder` method now also formats
SQL code before saving it to file. These changes aim to enhance the
functionality and readability of the codebase and provide more
user-friendly methods for interacting with the `Dashboards` class. In
addition to the changes in the `Dashboards` class, there have been
updates to the project structure: the `queries/counter.sql` file has
been moved to `dashboards/one_counter/counter.sql` in the
`tests/integration` directory.
Furthermore, several tests for the `Dashboards` class have been
introduced in the `databricks.labs.lsql.dashboards` module,
demonstrating various functionalities of the class and ensuring that it
functions as intended. The tests cover saving SQL and YML files to a
specified folder, creating a dataset and a counter widget for each
query, deploying dashboards with a given display name or dashboard ID,
and testing the behavior of the `save_to_folder` and `deploy_dashboard`
methods. Lastly, the commit removes the `test_load_dashboard` function
and updates the `test_dashboard_creates_one_dataset_per_query` and
`test_dashboard_creates_one_counter_widget_per_query` functions to use
the updated `Dashboard` class. A new `replace_recursively` function is
introduced to replace specific fields in a dataclass recursively. A new
test function `test_dashboards_deploys_exported_dashboard_definition`
has been added, which reads a dashboard definition from a JSON file,
deploys it, and checks if it's successfully deployed using the
`Dashboards` class. A new test function
`test_dashboard_deploys_dashboard_the_same_as_created_dashboard` has
also been added, which compares the original and deployed dashboards to
ensure they are identical. Overall, these changes make the `Dashboards`
class easier to work with, tidy up the project structure, and extend
the test suite to verify the class functions as intended.
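
  A sketch of what the new `replace_recursively` helper could look like
  (the actual implementation may differ):

  ```python
  import dataclasses

  def replace_recursively(obj, **replacements):
      """Copy a dataclass, replacing matching fields at any nesting depth."""
      updates = {}
      for field in dataclasses.fields(obj):
          value = getattr(obj, field.name)
          if field.name in replacements:
              updates[field.name] = replacements[field.name]
          elif dataclasses.is_dataclass(value):
              updates[field.name] = replace_recursively(value, **replacements)
          elif isinstance(value, list):
              updates[field.name] = [
                  replace_recursively(v, **replacements)
                  if dataclasses.is_dataclass(v) else v
                  for v in value
              ]
      return dataclasses.replace(obj, **updates)
  ```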
* Infer fields from a query
([#111](#111)). The
`Dashboards` class in the `dashboards.py` file has been updated with the
addition of a new method, `_get_fields`, which accepts a SQL query as
input and returns a list of `Field` objects using the `sqlglot` library
to parse the query and extract the necessary information. The
`create_dashboard` method has been modified to call this new function
when creating `Query` objects for each dataset. If a `ParseError`
occurs, a warning is logged and iteration continues. This allows for the
automatic population of fields when creating a new dashboard,
eliminating the need for manual specification. Additionally, new tests
have been added for invalid queries and for checking if the fields in a
query have the expected names. These tests include
`test_dashboards_skips_invalid_query` and
`test_dashboards_gets_fields_with_expected_names`, which utilize the
caplog fixture and create temporary query files to verify functionality.
Existing functionality related to creating dashboards remains unchanged.
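
  A sketch of the field inference described here (`Field` comes from
  the generated Lakeview model; the expression format is a guess):

  ```python
  import sqlglot

  from databricks.labs.lsql.lakeview import Field

  def get_fields(query: str) -> list[Field]:
      # Parse the statement and derive one Field per named output column,
      # e.g. "SELECT COUNT(*) AS count FROM t" yields a field named "count".
      expression = sqlglot.parse_one(query, dialect="databricks")
      return [Field(name=name, expression=f"`{name}`") for name in expression.named_selects]
  ```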
* Make constant all caps
([#140](#140)). In this
release, the project's `dashboards.py` file has been updated to improve
code readability and maintainability. A constant variable
`_maximum_dashboard_width` has been changed to all caps, becoming
`_MAXIMUM_DASHBOARD_WIDTH`. This modification affects the `Dashboards`
class and its methods, particularly `_get_fields` and `_get_position`.
The `_get_position` method has been revised to use the new all caps
constant variable. This change ensures better visibility of constants
within the code, addressing issue
[#140](#140). It's
important to note that this modification only impacts the
`dashboards.py` file and does not affect any other functionality.
* Read display name from `dashboard.yml`
([#144](#144)). In this
release, we have introduced a new `DashboardMetadata` dataclass that
reads the display name of a dashboard from a `dashboard.yml` file
located in the dashboard's directory. If the `dashboard.yml` file is
absent, the folder name will be used as the display name. This change
improves the readability and maintainability of the dashboard
configuration by explicitly defining the display name and reducing the
need to specify widget information in multiple places. We have also
added a new fixture called `make_dashboard` for creating and cleaning up
Lakeview dashboards in the test suite. The fixture handles creation and
deletion of the dashboard and provides an option to set a custom display
name. Additionally, we have added and modified several unit tests to
ensure the proper handling of the `DashboardMetadata` class and the
dashboard creation process, including tests for missing, present, or
incorrect `display_name` keys in the YAML file. The
`dashboards.deploy_dashboard()` function has been updated to handle
cases where only `dashboard_id` is provided.
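
  A sketch of the fallback behaviour described above (implementation
  details are a guess):

  ```python
  from dataclasses import dataclass
  from pathlib import Path

  import yaml

  @dataclass
  class DashboardMetadata:
      display_name: str

      @classmethod
      def from_path(cls, folder: Path) -> "DashboardMetadata":
          fallback = cls(display_name=folder.name)  # folder name as default
          try:
              raw = yaml.safe_load((folder / "dashboard.yml").read_text())
          except FileNotFoundError:
              return fallback
          if not isinstance(raw, dict) or "display_name" not in raw:
              return fallback  # missing or malformed key
          return cls(display_name=raw["display_name"])
  ```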
* Set widget id in query header
([#154](#154)). In this
release, we've made significant improvements to widget metadata handling
in our open-source library. We've introduced a reworked `WidgetMetadata`
class that replaces the previous dataclass of the same name, now
featuring a `path` attribute, `spec_type` property, and optional
parameters for `order`, `width`, `height`, and `_id`. The `_get_widgets`
method has been updated to accept an Iterable of `WidgetMetadata`
objects, and both `_get_layouts` and `_get_widgets` methods now sort
widgets using the order field. A new class method,
`WidgetMetadata.from_path`, handles parsing widget metadata from a file
path, replacing the removed `_get_width_and_height` method.
Additionally, the `WidgetMetadata` class is now used in the
`deploy_dashboard` method, and the test suite for the `dashboards`
module has been enhanced with updated
`test_widget_metadata_replaces_width_and_height` and
`test_widget_metadata_replaces_attribute` functions, as well as new
tests for specific scenarios. Issue
[#154](#154) has been
addressed by setting the widget id in the query header, and the
aforementioned changes improve flexibility and ease of use for dashboard
development.
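
  A sketch of the reworked class's shape, as described in this entry
  (the `spec_type` logic shown is illustrative only):

  ```python
  from dataclasses import dataclass
  from pathlib import Path

  @dataclass
  class WidgetMetadata:
      path: Path
      order: int | None = None
      width: int = 0
      height: int = 0
      _id: str = ""

      @property
      def spec_type(self) -> str:
          # Markdown files become text widgets; SQL files get a spec
          # inferred from the query (for example, a counter).
          return "text" if self.path.suffix == ".md" else "counter"

      @classmethod
      def from_path(cls, path: Path) -> "WidgetMetadata":
          # Header-argument parsing is sketched under the next entry;
          # fall back to defaults when the file has no header.
          return cls(path=path)
  ```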
* Use order key in query header if defined
([#149](#149)). In this
release, we've introduced a new feature to use an order key in the query
header if defined, enhancing the flexibility and control over the
dashboard creation process. The `WidgetMetadata` dataclass now includes
an optional `order` parameter of type `int`, and the
`_get_arguments_parser()` method accepts the `--order` flag with type
`int`. The `replace_from_arguments()` method has been updated to support
the new `order` parameter, with a default value of `self.order`. The
`create_dashboard()` method now implements a new `_get_datasets()`
method to retrieve datasets from the dashboard folder and introduces a
`_get_widgets()` method, which accepts a list of files, iterates over
them, and yields tuples containing widgets and their corresponding
metadata, including the order. These improvements enable the use of an
order key in query headers, ensuring the correct order of widgets in the
dashboard creation process. Additionally, a new test case has been added
to verify the correct behavior of the dashboard deployment with a
specified order key in the query header. This feature resolves issue
[#148](#148).
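
  A sketch of header-argument parsing along these lines
  (`add_help=False` frees up `-h` for `--height`, per the next entry):

  ```python
  import argparse
  import shlex

  def parse_header_arguments(header: str) -> argparse.Namespace:
      """Parse widget arguments from a line like '-- --order 1 -w 2 -h 3'."""
      parser = argparse.ArgumentParser(add_help=False, exit_on_error=False)
      parser.add_argument("--order", type=int, default=None)
      parser.add_argument("-w", "--width", type=int, default=0)
      parser.add_argument("-h", "--height", type=int, default=0)
      # Strip the SQL comment marker, split shell-style, ignore strays.
      args, _ = parser.parse_known_args(shlex.split(header.removeprefix("--")))
      return args

  print(parse_header_arguments("-- --order 1 -w 2 -h 3"))
  # Namespace(height=3, order=1, width=2)
  ```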
* Use widget width and height defined in query header
([#147](#147)). In this
release, the handling of metadata in SQL files has been updated to
utilize the header of the file, instead of the first line, for improved
readability and flexibility. This change includes a new WidgetMetadata
class for defining the width and height of a widget in a dashboard, as
well as new methods for parsing the widget metadata from a provided
path. The release also includes updates to the documentation to cover
the supported widget arguments `-w`/`--width` and `-h`/`--height`, and
resolves issue [#114](#114)
by adding a test for deploying a dashboard with a big widget using a new
function `test_dashboard_deploys_dashboard_with_big_widget`.
Additionally, new test cases have been added for creating dashboards
with custom-sized widgets based on query header width and height values,
improving functionality and error handling.

Dependency updates:

* Bump actions/checkout from 4.1.3 to 4.1.6
([#102](#102)).
* Bump actions/checkout from 4.1.6 to 4.1.7
([#151](#151)).