
Releases: CESNET/dp3

Release v0.8.0

23 Oct 13:21

This release brings improved performance, new callbacks for modules and many other improvements and bugfixes.

Database Optimizations

This release brings a significant jump in DP³ performance, as many database operations have been optimized through batching. With scalability in mind, there have been some breaking changes to the DB schema, which may require your attention. To update your DP³ instance to this release, please refer to the Database Optimizations PR for instructions on how to migrate.
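As a rough illustration of why batching helps, the sketch below groups individual writes into chunks so that each chunk costs one round trip instead of one per item. It is not DP³'s implementation: CountingStore, write_many and the batch size are hypothetical names chosen for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


class CountingStore:
    """Hypothetical stand-in for a database client; it only counts round trips."""

    def __init__(self) -> None:
        self.round_trips = 0
        self.rows: list = []

    def write_many(self, rows: list) -> None:
        self.round_trips += 1  # one call models one round trip to the database
        self.rows.extend(rows)


store = CountingStore()
for chunk in batched(range(10), 4):
    store.write_many(chunk)

print(store.round_trips)  # ten rows written in chunks of 4, 4 and 2
```

The same ten rows written one at a time would cost ten round trips; the gain grows with the number of operations per batch.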

Updater

We discovered a need for a new type of periodic callback that does not fit into the established snapshot mechanism. The main use case is keeping your application up to date with data from external systems while avoiding rate limits or other restrictions imposed by the remote API. For these (and other "slow update") cases, two new module callbacks have been exposed: periodic_update_hook and periodic_eid_update_hook.
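One general way to stay under a remote rate limit is to spread entities evenly across the update period, so that each tick only touches a slice of them. The sketch below shows that idea in isolation. It does not use the DP³ hook signatures, and bucket_of, due_in_tick and the bucket count are hypothetical names for illustration only.

```python
import hashlib
from typing import Iterable, List


def bucket_of(eid: str, n_buckets: int) -> int:
    """Map an entity ID to a stable bucket in the range 0..n_buckets-1."""
    digest = hashlib.sha256(eid.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def due_in_tick(eids: Iterable[str], tick: int, n_buckets: int) -> List[str]:
    """Return the entities whose bucket comes up in the given tick."""
    current = tick % n_buckets
    return [eid for eid in eids if bucket_of(eid, n_buckets) == current]


eids = [f"host-{i}" for i in range(20)]
n_buckets = 4  # one full period consists of four ticks

seen: List[str] = []
for tick in range(n_buckets):
    # In a real module, this is where the external system would be queried.
    seen.extend(due_in_tick(eids, tick, n_buckets))
```

After one full period every entity has been visited exactly once, and the number of remote calls per tick stays close to the total divided by the number of ticks.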

API

There is a new endpoint to get distinct value counts for an attribute, mainly aimed at simplifying filtering UIs.
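To make "distinct value counts" concrete, here is a minimal in-memory sketch of the computation a filtering UI would consume. The record layout and the function name are assumptions made for this example and do not reflect the endpoint's path or response format.

```python
from collections import Counter
from typing import Iterable, Mapping

# Hypothetical latest snapshots; only the "os" attribute matters here.
snapshots = [
    {"eid": "a", "os": "linux"},
    {"eid": "b", "os": "windows"},
    {"eid": "c", "os": "linux"},
    {"eid": "d"},  # the attribute is not set for this entity
]


def distinct_value_counts(records: Iterable[Mapping], attr: str) -> Counter:
    """Count how many records carry each distinct value of `attr`."""
    return Counter(record[attr] for record in records if attr in record)


counts = distinct_value_counts(snapshots, "os")
print(counts.most_common())
```

A UI can render such counts next to each filter option, so users see how many entities a filter would match before applying it.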

Scripts - A New dp3-script executable

For a while now, DP³ has included multiple scripts that are helpful for various tasks around running your applications. However, using those scripts previously required downloading the full source code, and the packages required only by those scripts (namely pandas) were included in the main requirements. This change allows you to execute all included DP³ scripts from the installed package. If you are not using them, their dependencies have been moved to a separate group for a lighter installation.

Full Changelog: v0.7.0...v0.8.0

Release 0.7.0 - Config Reload

19 Dec 14:22

With this release, DP³ moves to use the newer version 2 of Pydantic, which is used internally for most data validation needs, be it incoming datapoints or configuration. This also means a bump in the FastAPI version used. You may need to reinstall the requirements for DP³ in your existing installations.

Modules

  • Modules and values derived using on_entity_creation and on_new_<attr> callbacks can now be refreshed after configuration changes using the API.
  • Updated BaseModule class to initialize the logger and a SharedFlag for module refreshing purposes. Modules should now place the loading of configuration into the load_config method.
  • CallbackRegistrar now offers new init and finalize hooks to give modules context about snapshot creation.

API

  • Added a new telemetry endpoint to show source validity.
  • Added fulltext filters when querying the latest snapshots.
  • The Control section has two new endpoints, one for refreshing all on_entity_creation callbacks for a particular entity, another for reloading the module configuration as mentioned above.

Links

  • Links have become more expressive, as mirrored links are now available, allowing for easier 1-M relation modelling and making data more accessible on all sides of the relation. (see config)
  • Another addition is allowing links in arrays and sets, which allows M-N relations without using multi-value observations attributes.

Schema Tracking

  • The schema defined by db_entities is now tracked in the database and will prompt users on conflicting changes on platform startup.
  • The required DB changes can be applied using dp3 schema-update and should require no manual interaction with the database.

Internal changes:

  • Snapshooter now links entities when making a snapshot only if necessary (previously, unused relations were loaded regardless).
  • Snapshooter and GarbageCollector link cache collections have been merged into one managed by LinkManager.
  • Links to and from deleted entities are now deleted from master records.
  • Various minor logging changes and other fixes.

Release 0.6.0 - Entity Lifetimes

26 Oct 10:13

This release adds two new mechanisms to handle entity lifetimes, i.e. ways to scope when entities will be deleted:

  • TTL tokens (entities with specified lifetime timers)
  • Reference counting (weak entities)

Both of these are implemented in a new core module called GarbageCollector. There is also a third lifetime, immortal, which means the platform will not delete the entity by itself, just as before. This is the default for backwards compatibility reasons.

A new endpoint has been added to the API, which enables sending TTL tokens to specified entities and extending their lifetime.
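The three lifetimes described above can be summarised as a single deletion decision. The sketch below is a simplified model and not the GarbageCollector code: the string labels "ttl" and "weak", the function name and its parameters are assumptions, and it further assumes that an entity with several TTL tokens lives until the last token expires.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional


def should_delete(
    lifetime: str,
    now: datetime,
    ttl_tokens: Optional[Dict[str, datetime]] = None,
    reference_count: int = 0,
) -> bool:
    """Decide whether an entity may be removed under the given lifetime."""
    if lifetime == "immortal":
        # The platform never deletes immortal entities on its own.
        return False
    if lifetime == "ttl":
        # Assumption: the entity survives while at least one token is valid.
        tokens = ttl_tokens or {}
        return all(expiry <= now for expiry in tokens.values())
    if lifetime == "weak":
        # A weak entity exists only while something still refers to it.
        return reference_count == 0
    raise ValueError(f"unknown lifetime: {lifetime!r}")


now = datetime(2023, 10, 26, 12, 0)
expired = {"scan": now - timedelta(hours=1)}
extended = {"scan": now - timedelta(hours=1), "api": now + timedelta(days=1)}

print(should_delete("ttl", now, ttl_tokens=expired))
print(should_delete("ttl", now, ttl_tokens=extended))
```

In this model, sending a fresh token (as with "api" above) is what extends the lifetime of an entity whose earlier tokens have already run out.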

Delete API
Selected entities can now be deleted from both the secondary modules and the external API. The delete action removes the master record and all snapshots.

Filter Empty Snapshots
A new SnapShooter option, keep_empty, controls whether empty entity records are kept as snapshots or filtered out during processing.

Docs

Other changes:

  • BUGFIX - When running hooks on multiple linked entities, SnapShooter no longer runs them on only one entity per entity type.
  • An additional reconnect attempt added to TaskQueueReader.
  • Snapshots returned from /{etype} endpoint will be sorted by the entity ID.
  • A few minor changes in logging messages, severity, and length.
  • Deployments using gunicorn will now also include access logs.

v0.5.0

03 Oct 06:13

Major bugfix

  • Fixed Snapshooter thread crashing after DB replica set primary change.

Configuration cleaning

  • Extra fields in configuration will now cause an error - previously, users could set entries that would be in the configuration, but be ignored by the platform, leading to confusion. Now, any specified configuration field that is not expected will cause an error.
  • These configuration fields were removed, as they have long been unused by the platform. If you have any of them in your configuration, you can safely remove them (an error will be thrown otherwise):
    • entityspec.key_data_type
    • entityspec.auto_create_record
    • attrspec.probability
    • attrspec.color
    • attrspec.categories
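The stricter behaviour can be pictured as a check that rejects any key outside the expected set. This is a plain-Python sketch of the idea, not the validation DP³ performs internally; the allowed field names and the function name are placeholders invented for the example.

```python
from typing import Mapping, Set

# Hypothetical set of fields an attribute specification may contain.
ALLOWED_ATTR_FIELDS: Set[str] = {"name", "type", "description"}


def check_no_extra_fields(section: str, config: Mapping, allowed: Set[str]) -> None:
    """Raise ValueError if `config` contains keys outside `allowed`."""
    extra = sorted(set(config) - allowed)
    if extra:
        raise ValueError(
            f"{section}: unexpected configuration field(s): {', '.join(extra)}"
        )


old_style_attr = {"name": "hostname", "type": "plain", "color": "#ff0000"}

try:
    check_no_extra_fields("attrspec", old_style_attr, ALLOWED_ATTR_FIELDS)
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```

Failing loudly on a leftover field such as color is what prevents the earlier situation where a setting was silently ignored.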

Better record and datapoint history management

  • Improved aggregation of multi-value attributes. Plain attributes are now properly archived. Together, these changes bring substantial savings in the disk space used by the database, over 40% in one of our deployments.

CLI Improvements

  • dp3 check now gives better error descriptions, printing each piece of source code only once.
  • The deployment setup executable dp3 config has also been improved, making installs easier. The supervisor service.ini config has been fine-tuned. Detection of Python installation directories now also accounts for installations outside a virtual environment.

API improvements & bugfixes

  • GET /entity/{etype} endpoint now has an optional filter feature.
  • The entity overview request response now includes a document count.

Many small documentation improvements, also a new History management page.

v0.4.0: Schedule Configuring Update

09 Aug 12:38

This release adds options to fully configure the history management of your application. Previously, all functions of HistoryManager had at most a "tick_rate" specified in minutes, which is excessive for some applications receiving data several times a day. Similarly, the archivation age could only be specified in days, where some applications may need to remove data much sooner.

The new configuration format is fully documented here and the specifics of cron expressions here.

To use the update, please update your history_manager.yml configuration file - this is how the default config looks now:

aggregation_schedule:  
  minute: "*/10"  

datapoint_cleaning_schedule:  
  minute: "*/30"

snapshot_cleaning:
  schedule: {minute: "15,45"}  
  older_than: 7d  

datapoint_archivation:
  schedule: {hour: 2, minute: 0}  
  older_than: 7d  
  archive_dir: "data/datapoints/"
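To read the schedule entries above, it helps to see how such cron-style fields select points in time. The sketch below is a deliberately reduced matcher that only understands the three forms appearing in this config (plain numbers, comma-separated lists and "*/n" steps) for the hour and minute fields. It is not the scheduler DP³ uses, and treating an unspecified field as "*" is an assumption of this example.

```python
from datetime import datetime
from typing import Mapping, Union


def field_matches(spec: Union[str, int], value: int) -> bool:
    """Check one field: supports "*", "*/n", "a,b,c" and plain integers."""
    for part in str(spec).split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


def schedule_matches(schedule: Mapping, when: datetime) -> bool:
    """True if `when` satisfies every field; missing fields count as "*"."""
    return all(
        field_matches(schedule.get(name, "*"), getattr(when, name))
        for name in ("hour", "minute")
    )


aggregation = {"minute": "*/10"}
snapshot_cleaning = {"minute": "15,45"}
archivation = {"hour": 2, "minute": 0}

print(schedule_matches(aggregation, datetime(2023, 8, 9, 12, 30)))
print(schedule_matches(archivation, datetime(2023, 8, 9, 2, 0)))
```

Read this way, aggregation runs every ten minutes, snapshot cleaning runs at a quarter past and a quarter to each hour, and datapoint archivation runs once a day at 02:00.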

Patch: Testing automatic publish on release workflow

07 Aug 11:37