Add enum for duplication policy, updated readme
codepr committed Mar 10, 2024
1 parent ca3ceed commit 79c53c5
Showing 4 changed files with 43 additions and 13 deletions.
35 changes: 25 additions & 10 deletions README.md
@@ -7,11 +7,13 @@ structure applications.

 ### Basics

-Still at the very early stages, the main concepts are
+Still at a very early stage, the main concepts are

 - Fixed size records, to keep things simple each record is represented by just
   a timestamp with nanoseconds precision and a double
 - In memory segments: Data is stored in timeseries format, allowing efficient
   querying and retrieval based on timestamps, with the last slice of data in
-  memory, composed by 2 segments (currently covering 15 minutes of data each)
+  memory, composed by two segments (currently covering 15 minutes of data each)
+    - The last 15 minutes of data
+    - The previous 15 minutes for records out of order, totalling 30 minutes
 - Commit Log: Persistence is achieved using a commit log at the base, ensuring
@@ -22,15 +24,28 @@ Still at the very early stages, the main concepts are

### TODO

- Adopt an arena for allocations
- Text based protocol
- TCP server
- Memory mapped indexes
- Duplicate points policy
- CRC32 of records for data integrity
- Adopt an arena for memory allocations
- Memory mapped indexes, above a threshold enable binary search
- Schema definitions
- Server: Text based protocol, a simplified SQL-like would be cool

### Usage

At the current stage, no server attached, just a tiny library with some crude APIs.
At the current stage, no server attached, just a tiny library with some crude APIs;

- `tsdb_init(1)` creates a new database
- `tsdb_close(1)` closes the database
- `ts_create(3)` creates a new timeseries in a given database
- `ts_get(2)` retrieve an existing timeseries from a database
- `ts_insert(3)` inserts a new point into the timeseries
- `ts_find(3)` finds a point inside the timeseries
- `ts_range(4)` finds a range of points in the timeseries, returning a vector
with the results
- `ts_close(1)` closes a timeseries

Plus a few other helpers.

#### As a library

@@ -48,9 +63,9 @@ In the target project, a generic hello world

 int main(void) {
     // Example usage of timeseries library functions
-    Timeseries *ts = ts_create("example_ts");
+    Timeseries_DB *db = tsdb_init("example_ts");
     // Use timeseries functions...
-    ts_destroy(ts);
+    tsdb_close(db);
     return 0;
 }

@@ -74,7 +89,7 @@ int main() {
         abort();

     // Create a timeseries, retention is not implemented yet
-    Timeseries *ts = ts_create(db, "temperatures", 0);
+    Timeseries *ts = ts_create(db, "temperatures", 0, IGNORE);
     if (!ts)
         abort();

14 changes: 13 additions & 1 deletion include/timeseries.h
@@ -16,6 +16,17 @@
 extern const size_t TS_FLUSH_SIZE;
 extern const size_t TS_BATCH_OFFSET;

+/*
+ * Enum defining the rules to apply when a duplicate point is
+ * inserted in the timeseries.
+ *
+ * It currently just support
+ * - IGNORE drops the point, returning a failure at insert attempt
+ * - INSERT just appends the point
+ * - UPDATE updates the point with the new value
+ */
+typedef enum dup_policy { IGNORE, INSERT } Duplication_Policy;
+
 /*
  * Simple record struct, wrap around a column inside the database, defined as a
  * key-val couple alike, though it's used only to describe the value of each
@@ -70,6 +81,7 @@ typedef struct timeseries {
     Timeseries_Chunk prev;
     Partition partitions[TS_MAX_PARTITIONS];
     size_t partition_nr;
+    Duplication_Policy policy;
 } Timeseries;

 extern int ts_init(Timeseries *ts);
@@ -93,7 +105,7 @@ extern Timeseries_DB *tsdb_init(const char *data_path);
 extern void tsdb_close(Timeseries_DB *tsdb);

 extern Timeseries *ts_create(const Timeseries_DB *tsdb, const char *name,
-                             int64_t retention);
+                             int64_t retention, Duplication_Policy policy);

 extern Timeseries *ts_get(const Timeseries_DB *tsdb, const char *name);
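
The header comment above describes what each duplication policy is meant to do, but the insert path that would honor it is not part of this commit. The following is only a minimal sketch of how such a check might look, assuming a tiny fixed-capacity in-memory series. Every `demo_`/`DEMO_` name is hypothetical and not part of the library, and `DEMO_UPDATE` is included only because the comment mentions `UPDATE`; the committed enum contains just `IGNORE` and `INSERT`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of Duplication_Policy. DEMO_UPDATE is an assumption:
 * the header comment mentions UPDATE, but the committed enum does not have it. */
typedef enum demo_dup_policy { DEMO_IGNORE, DEMO_INSERT, DEMO_UPDATE } Demo_Dup_Policy;

/* Fixed size record: a timestamp plus a double, as described in the README. */
typedef struct demo_point {
    uint64_t timestamp;
    double value;
} Demo_Point;

#define DEMO_CAPACITY 16

/* Hypothetical stand-in for a timeseries, just enough to show the policy. */
typedef struct demo_series {
    Demo_Point points[DEMO_CAPACITY];
    size_t length;
    Demo_Dup_Policy policy;
} Demo_Series;

/* Returns 0 on success, -1 when the point is rejected: either a duplicate
 * under DEMO_IGNORE, or the series is full. */
int demo_insert(Demo_Series *s, uint64_t timestamp, double value) {
    assert(s != NULL);

    /* Linear scan for an existing point with the same timestamp. */
    for (size_t i = 0; i < s->length; ++i) {
        if (s->points[i].timestamp != timestamp)
            continue;
        if (s->policy == DEMO_IGNORE)
            return -1; /* drop the point and report the failure */
        if (s->policy == DEMO_UPDATE) {
            s->points[i].value = value; /* overwrite in place */
            return 0;
        }
        break; /* DEMO_INSERT: fall through and append the duplicate */
    }

    if (s->length == DEMO_CAPACITY)
        return -1;

    s->points[s->length++] = (Demo_Point){timestamp, value};
    return 0;
}
```

With `DEMO_IGNORE`, a second `demo_insert` at the same timestamp returns -1 and leaves the series untouched. With `DEMO_INSERT` the duplicate is appended, and with `DEMO_UPDATE` the first matching point is overwritten. The linear scan keeps the sketch short; a real implementation would more likely reuse whatever lookup the chunks already provide.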

2 changes: 1 addition & 1 deletion src/main.c
@@ -50,7 +50,7 @@ int main(void) {
         abort();
     }

-    Timeseries *ts = ts_create(db, "temperatures", 0);
+    Timeseries *ts = ts_create(db, "temperatures", 0, IGNORE);
     /* Timeseries *ts = ts_get(db, "temperatures"); */
     if (!ts) {
         log_error("Panic: mkdir");
5 changes: 4 additions & 1 deletion src/timeseries.c
@@ -43,7 +43,7 @@ Timeseries_DB *tsdb_init(const char *data_path) {
 void tsdb_close(Timeseries_DB *tsdb) { free(tsdb); }

 Timeseries *ts_create(const Timeseries_DB *tsdb, const char *name,
-                      int64_t retention) {
+                      int64_t retention, Duplication_Policy policy) {
     if (!tsdb || !name)
         return NULL;
@@ -56,6 +56,7 @@ Timeseries *ts_create(const Timeseries_DB *tsdb, const char *name,

     ts->retention = retention;
     ts->partition_nr = 0;
+    ts->policy = policy;
     for (int i = 0; i < TS_MAX_PARTITIONS; ++i)
         memset(&ts->partitions[i], 0x00, sizeof(ts->partitions[i]));
@@ -96,6 +97,8 @@ Timeseries *ts_get(const Timeseries_DB *tsdb, const char *name) {
     snprintf(ts->db_data_path, DATA_PATH_SIZE, "%s", tsdb->data_path);
     snprintf(ts->name, TS_NAME_MAX_LENGTH, "%s", name);

+    // TODO consider adding some metadata file which saves TS info such as the
+    // duplication policy
     if (ts_init(ts) < 0) {
         ts_close(ts);
         return NULL;
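
The TODO in `ts_get` points at a gap: the policy passed to `ts_create` lives only in memory, so a timeseries reopened with `ts_get` has no way to recover it. One possible approach, shown here purely as a sketch, is a small per-timeseries metadata file holding the policy as a decimal integer. The file layout, the `meta_` function names, and the `Meta_Dup_Policy` enum are all assumptions made for illustration; nothing in the commit specifies them.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical mirror of the committed Duplication_Policy enum. */
typedef enum meta_dup_policy { META_IGNORE, META_INSERT } Meta_Dup_Policy;

/* Writes the policy to `path` as a single decimal integer.
 * Returns 0 on success, -1 on any I/O error. */
int meta_save(const char *path, Meta_Dup_Policy policy) {
    assert(path != NULL);

    FILE *fp = fopen(path, "w");
    if (!fp)
        return -1;

    int ok = fprintf(fp, "%d\n", (int)policy) > 0;
    if (fclose(fp) != 0)
        ok = 0; /* a failed close can mean the data never reached disk */

    return ok ? 0 : -1;
}

/* Reads the policy back from `path` into `*policy`.
 * Returns 0 on success, -1 if the file is missing or holds an unknown value. */
int meta_load(const char *path, Meta_Dup_Policy *policy) {
    assert(path != NULL && policy != NULL);

    FILE *fp = fopen(path, "r");
    if (!fp)
        return -1;

    int raw = -1;
    int matched = fscanf(fp, "%d", &raw);
    fclose(fp);

    /* Reject values outside the known enum range instead of trusting the file. */
    if (matched != 1 || raw < META_IGNORE || raw > META_INSERT)
        return -1;

    *policy = (Meta_Dup_Policy)raw;
    return 0;
}
```

Under this assumption, `ts_create` would call something like `meta_save` once, and `ts_get` would call `meta_load` before `ts_init`, falling back to a default policy when the file is absent. A text format keeps the file easy to inspect by hand; a binary header would work equally well if more fields are added later.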