Releases: CESNET/dp3
Release v0.8.0
This release brings improved performance, new callbacks for modules and many other improvements and bugfixes.
Database Optimizations
This release brings a significant jump in DP³ performance, as many database operations have been optimized through batching. With scalability in mind, there have been some breaking changes to the DB schema, which may require your attention. To update your DP³ instance to this release, please refer to the Database Optimizations PR for instructions on how to migrate.
Updater
We discovered a need for a new type of periodic callback that does not fit the established snapshot mechanism. The main use case is keeping your application up to date with data from external systems while avoiding rate limits or other restrictions imposed by the remote API. For these (and other "slow update") cases, two new module callbacks have been exposed: `periodic_update_hook` and `periodic_eid_update_hook`.
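The "slow update" idea can be illustrated with a minimal sketch that spreads entity refreshes across a period, so that each tick sends only a fraction of the requests to a rate-limited remote API. This is not the DP³ implementation or its hook signature; `batch_for_tick`, its parameters, and the sample entity IDs are hypothetical names used only for illustration.

```python
from typing import Sequence


def batch_for_tick(eids: Sequence[str], tick: int, ticks_per_period: int) -> list[str]:
    """Return the slice of entity IDs to refresh on the given tick.

    The IDs are split into `ticks_per_period` interleaved slices, so every
    entity is visited exactly once per period while each tick handles only
    a fraction of them.
    """
    if ticks_per_period <= 0:
        raise ValueError("ticks_per_period must be positive")
    return list(eids[tick % ticks_per_period :: ticks_per_period])


# Hypothetical entity IDs, refreshed over a period split into 3 ticks.
eids = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4", "10.0.0.5"]
batches = [batch_for_tick(eids, tick, 3) for tick in range(3)]
```

Each entity appears in exactly one of the three batches, so the remote system sees a steady trickle of requests instead of one burst per period.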
API
There is a new endpoint to get distinct value counts for an attribute, mainly aimed at simplifying filtering UIs.
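To show what "distinct value counts" means for a filtering UI, here is a minimal in-memory sketch. It does not call the DP³ API, and the function name, record shape, and sample data are assumptions made for illustration.

```python
from collections import Counter


def distinct_value_counts(records, attr):
    """Count how often each distinct value of `attr` occurs.

    Records that do not have the attribute are skipped. Values must be
    hashable. The result is ordered from the most to the least common value.
    """
    counts = Counter(record[attr] for record in records if attr in record)
    return dict(counts.most_common())


# Hypothetical entity records with an "os" attribute.
records = [{"os": "linux"}, {"os": "linux"}, {"os": "windows"}, {}]
counts = distinct_value_counts(records, "os")
```

A UI could use such counts to render each filter option together with the number of matching entities.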
Scripts - A New `dp3-script` executable
For a while now, DP³ has included multiple scripts that are helpful for various tasks around running your applications. However, using those scripts previously required downloading all of the source code, and the packages required only by those scripts (namely `pandas`) were included in the main requirements. This change allows you to execute all included DP³ scripts from the installed package. If you are not using them, their dependencies have been moved to a separate group for a lighter installation.
Full Changelog: v0.7.0...v0.8.0
Release 0.7.0 - Config Reload
With this release, DP³ moves to use the newer version 2 of Pydantic, which is used internally for most data validation needs, be it incoming datapoints or configuration. This also means a bump in the FastAPI version used. You may need to reinstall the requirements for DP³ in your existing installations.
- Modules and values derived using `on_entity_creation` and `on_new_<attr>` callbacks can now be refreshed after configuration changes using the API.
- Updated `BaseModule` class to initialize the logger and a `SharedFlag` for module refreshing purposes. Modules should now place the loading of configuration into the `load_config` method.
- `CallbackRegistrar` now offers new `init` and `finalize` hooks to give modules context about snapshot creation.
- Added a new telemetry endpoint to show source validity.
- Added fulltext filters when querying the latest snapshots.
- The Control section has two new endpoints: one for refreshing all `on_entity_creation` callbacks for a particular entity, and another for reloading the module configuration as mentioned above.
- Links have become more expressive: mirrored links are now available, allowing for easier 1-M relation modelling and making data more accessible on all sides of the relation (see config).
- Another addition is allowing links in arrays and sets, which allows M-N relations without using multi-value observations attributes.
- The schema defined by `db_entities` is now tracked in the database and will prompt users on conflicting changes on platform startup.
- The required DB changes can be applied using `dp3 schema-update` and should require no manual interaction with the database.
Internal changes:
- `Snapshooter` now links entities when making snapshots only when necessary (previously, unused relations were loaded regardless).
- `Snapshooter` and `GarbageCollector` link cache collections have been merged into one managed by `LinkManager`.
- Links to and from deleted entities are now deleted from master records.
- Various minor logging changes and other fixes.
Release 0.6.0 - Entity Lifetimes
This release adds two new mechanisms to handle entity lifetimes, i.e. ways to scope when entities will be deleted:
- TTL tokens (entities with specified lifetime timers)
- Reference counting (weak entities)
Both of these are implemented in a new core module called `GarbageCollector`. There is also a third lifetime, immortal, which means the platform will not delete the entity on its own, just as before. This is the default for backwards compatibility reasons.
A new endpoint has been added to the API, which enables sending TTL tokens to specified entities and extending their lifetime.
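The three lifetimes can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the `GarbageCollector` implementation: the `Entity` class, the lifetime labels, and the rule that a TTL entity with no live tokens is deletable are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Entity:
    eid: str
    lifetime: str = "immortal"  # assumed labels: "immortal", "ttl", "weak"
    ttl_tokens: dict[str, datetime] = field(default_factory=dict)
    ref_count: int = 0  # number of other entities referencing this one


def extend_ttl(entity: Entity, token: str, expires_at: datetime) -> None:
    """Add a TTL token or extend it; a token is never shortened."""
    current = entity.ttl_tokens.get(token)
    if current is None or expires_at > current:
        entity.ttl_tokens[token] = expires_at


def should_delete(entity: Entity, now: datetime) -> bool:
    """Decide whether the entity is eligible for deletion at `now`."""
    if entity.lifetime == "ttl":
        # Assumption: deletable once every token has expired (or none exist).
        return all(expiry <= now for expiry in entity.ttl_tokens.values())
    if entity.lifetime == "weak":
        # Weak entities live only while something references them.
        return entity.ref_count == 0
    return False  # immortal entities are never deleted by the platform


now = datetime(2024, 1, 1)
host = Entity("host-1", lifetime="ttl")
extend_ttl(host, "scanner", now + timedelta(hours=1))
```

In this sketch, sending a token again with a later expiry extends the entity's lifetime, which mirrors the purpose of the TTL token endpoint.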
Delete API
Selected entities can now be deleted, both from secondary modules and through the external API. The delete action removes the master record and all snapshots.
Filter Empty Snapshots
An additional option, `keep_empty`, was added to `SnapShooter`. It controls whether empty entity records are kept as snapshots or filtered out in processing.
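The effect of such an option can be sketched as a simple filter. The definition of "empty" used here (a record with no fields other than its ID) and the names `filter_snapshots` and `id_field` are assumptions for illustration, not the `SnapShooter` code.

```python
def filter_snapshots(snapshots, *, keep_empty, id_field="eid"):
    """Return snapshots, optionally dropping records that carry no attributes.

    A record counts as empty when it has no keys other than `id_field`.
    """
    if keep_empty:
        return list(snapshots)
    return [snap for snap in snapshots if any(key != id_field for key in snap)]


# Hypothetical snapshot records; the second one has no attribute values.
snapshots = [{"eid": "a", "hostname": "srv1"}, {"eid": "b"}]
kept = filter_snapshots(snapshots, keep_empty=False)
```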
Docs
- Entity Lifetimes describes the new mechanics in detail
- API changes: Added delete endpoint and TTL token endpoint
- Better Configuration docs (newly added `API`, `Control`, `GarbageCollector`)
Other changes:
- BUGFIX - When running hooks on multiple linked entities, `SnapShooter` no longer runs them on only one entity per entity type.
- An additional reconnect attempt has been added to `TaskQueueReader`.
- Snapshots returned from the `/{etype}` endpoint will be sorted by the entity ID.
- A few minor changes in logging messages, severity, and length.
- Deployments using `gunicorn` will now also include access logs.
v0.5.0
Major bugfix
- Fixed Snapshooter thread crashing after DB replica set primary change.
Configuration cleaning
- Extra fields in configuration will now cause an error - previously, users could set entries that would be in the configuration, but be ignored by the platform, leading to confusion. Now, any specified configuration field that is not expected will cause an error.
- The following configuration fields were removed, as they have long been unused by the platform. If you have any of them in your configuration, you can safely remove them (an error will be thrown otherwise):
  - `entityspec.key_data_type`
  - `entityspec.auto_create_record`
  - `attrspec.probability`
  - `attrspec.color`
  - `attrspec.categories`
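Before upgrading, it can help to scan your specifications for the removed fields. The following is a minimal sketch that assumes each entity or attribute specification has already been loaded into a plain dictionary. The helper name and the sample specification are hypothetical, and the platform's own validation works differently.

```python
# Field names taken from the list above, split by the spec they belonged to.
REMOVED_ENTITY_FIELDS = {"key_data_type", "auto_create_record"}
REMOVED_ATTR_FIELDS = {"probability", "color", "categories"}


def find_removed_fields(spec: dict, removed: set) -> list:
    """Return the removed field names still present in `spec`, sorted."""
    return sorted(removed.intersection(spec))


# Hypothetical attribute specification loaded from a config file.
attr_spec = {"name": "Open ports", "color": "#ff0000", "probability": False}
leftovers = find_removed_fields(attr_spec, REMOVED_ATTR_FIELDS)
```

Any name reported this way can be deleted from the configuration without changing platform behaviour, since the fields were unused.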
Better record and datapoint history management
- Improved aggregation of multi-value attributes. Plain attributes are now properly archived. Together, these changes significantly reduce the disk space used by the database, by over 40% in one of our deployments.
CLI Improvements
- `dp3 check` now gives better error descriptions, printing each piece of source code only once.
- The deployment setup executable `dp3 config` has also been improved, making installs easier. The supervisor `service.ini` config has been fine-tuned. Detection of Python installation directories now also accounts for installations outside a virtual environment.
API improvements & bugfixes
- The `GET /entity/{etype}` endpoint now has an optional filter feature.
- The entity overview request response now includes a document count.
Many small documentation improvements, also a new History management page.
v0.4.0: Schedule Configuring Update
This release adds options to fully configure the history management of your application. Previously, all functions of HistoryManager had at most a "tick_rate" specified in minutes, which is excessive for applications that receive data only several times a day. Similarly, the archivation age could only be specified in days, while some applications may need to remove data much sooner.
The new configuration format and the specifics of cron expressions are covered in the documentation.
To use the update, please update your `history_manager.yml` configuration file. This is how the default config looks now:
aggregation_schedule:
  minute: "*/10"
datapoint_cleaning_schedule:
  minute: "*/30"
snapshot_cleaning:
  schedule: {minute: "15,45"}
  older_than: 7d
datapoint_archivation:
  schedule: {hour: 2, minute: 0}
  older_than: 7d
  archive_dir: "data/datapoints/"
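To make the values in the default config easier to read, here is a rough sketch of how an age such as `7d` and cron fields such as `*/10` or `15,45` can be interpreted. It is not the parser DP³ uses. The helper names are hypothetical, the unit suffixes other than `d` are assumptions, and only the field forms shown above are handled.

```python
import re
from datetime import timedelta

# Assumed unit suffixes; only "d" appears in the default config above.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_age(text: str) -> timedelta:
    """Convert an age such as "7d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"Unsupported age format: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def field_matches(expr, value: int) -> bool:
    """Check one cron field ("*", "*/n", "a,b", or a number) against a value."""
    for part in str(expr).split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif int(part) == value:
            return True
    return False


snapshot_max_age = parse_age("7d")
runs_at_minute_20 = field_matches("*/10", 20)
```

Read this way, `minute: "*/10"` fires at every minute divisible by ten, and `schedule: {hour: 2, minute: 0}` fires once a day at 02:00.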