Merge pull request #324 from sshanks-kx/refactor
refactor & editorial changes.
natalietanner authored Jul 18, 2024
2 parents 9ede98b + 24028bd commit 61de8b9
Showing 33 changed files with 414 additions and 496 deletions.
45 changes: 22 additions & 23 deletions docs/architecture/index.md
@@ -1,37 +1,35 @@
---
title: Architecture | Documentation for q and kdb+
description: How to construct systems from kdb+ processes
keywords: hdb, kdb+, q, rdb, tick, tickerplant, streaming
---
# Architecture of kdb+ systems

A kdb-tick based architecture can be used to capture, process and analyse vast amounts of real-time and historical data.




_Applications that use kdb+ typically comprise multiple processes_
The following diagram illustrates the components that are often found in a vanilla kdb-tick setup:

![architecture](../img/architecture.png)

The small footprint of the q interpreter, the [interprocess communication](../basics/ipc.md) baked into q, and the range of [interfaces](../interfaces/index.md) available make it straightforward to incorporate kdb+ into a multi-process application architecture.

Certain kinds of process recur across applications.
## Components

### Data feed

## Data feed
This is a source of real-time data; for example, financial quotes and trades from Bloomberg or Refinitiv, or readings from a network of sensors.

This is a source of real-time data; for example, financial quotes and trades from [Bloomberg](https://www.bloomberg.com/professional/solution/content-and-data/) or [Refinitiv](https://www.refinitiv.com/), or readings from a network of sensors


## Feedhandler
### Feedhandler

Parses data from the data feed to a format that can be ingested by kdb+.

Multiple feed handlers can be used to gather data from a number of different sources and feed it to the kdb+ system for storage and analysis.

KX’s [Fusion interfaces](../interfaces/index.md#fusion-interfaces) connect kdb+ to a range of other technologies, such as [R](../interfaces/r.md), Apache Kafka, Java, Python and [C](../interfaces/c-client-for-q.md).


## Tickerplant
### Tickerplant (TP)

Captures the initial data feed, writes it to the log file and [publishes](../kb/publish-subscribe.md) these messages to any registered subscribers.
A kdb+ process acting as a TP (tickerplant) captures the initial data feed, writes it to the log file and [publishes](../kb/publish-subscribe.md) these messages to any registered subscribers.
Aims for zero-latency.
Includes ingesting data in batch mode.

@@ -46,14 +44,12 @@ Handles end-of-day (EOD) processing.
For best resilience, and to avoid core resource competition, run them on their own cores.


## Log file
#### TP Log

This is the file to which the Tickerplant logs the q messages it receives from the feedhandler. It is used for recovery: if the RDB has to restart, the log file is replayed to return to the current state.
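
The recovery path described above can be sketched in a single q session. This is a minimal illustration rather than the kdb-tick implementation: the `trade` schema, the file name `tplog` and the definition of `upd` are all assumptions made for the example.

```q
/ Hypothetical names for illustration: trade schema, file name tplog, upd definition
trade:([] sym:`symbol$(); price:`float$())
upd:insert                          / each replayed message calls upd[table;data]

`:tplog set ()                      / create an empty log file
h:hopen `:tplog                     / open a handle for appending messages
h enlist (`upd;`trade;(`a;1.5))     / log one message, as a tickerplant would
h enlist (`upd;`trade;(`b;2.5))
hclose h

-11!`:tplog                         / replay: evaluates each logged message in turn
count trade                         / number of rows recovered by the replay
```

Because each logged message has the form `(function;table;data)`, the replaying process only needs a definition of `upd` for the replay to rebuild its tables.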

!!! tip "Best practices for log files"

The logging process can run on any hardware and OS, from a RaspberryPi to a cloud server.

Store the file on a fast local disk to minimize publication delay and I/O waits.

:fontawesome-regular-map:
@@ -63,9 +59,11 @@ This is the file to which the Tickerplant logs the q messages it receives from t
[Linux production notes](../kb/linux-production.md)


## Real-time database
### Real-time database (RDB)

A kdb+ process acting as an RDB (real-time database) subscribes to messages from the Tickerplant, stores them in memory, and allows this data to be queried intraday.

Subscribes to messages from the Tickerplant, stores them in memory, and allows this data to be queried intraday.
At startup, the RDB sends a message to the tickerplant and receives a reply containing the data schema, the location of the log file, and the number of lines to read from the log file. It then receives subsequent updates from the TP as they are published.

At end of day usually writes intraday data to the Historical Database, and sends it a new EOD message.

@@ -86,9 +84,10 @@ At end of day usually writes intraday data to the Historical Database, and sends
[Intraday writedown solutions](../wp/intraday-writedown/index.md)


## Real-time subscriber
### Real-time engine/subscriber (RTE/RTS)

Subscribes to the intraday messages and typically performs some additional function on receipt of new data – e.g. calculating an order book or maintaining a subtable with the latest price for each instrument.
A kdb+ process acting as an RTE (real-time engine) subscribes to the intraday messages and typically performs some additional function on receipt of new data – e.g. calculating an order book or maintaining a subtable with the latest price for each instrument.
An RTE is sometimes referred to as an RTS (real-time subscriber).
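
As a rough sketch of the latest-price case, the subscriber's update handler might look like the following. The schema, the helper name `updTrade` and the simulated batch are assumptions for illustration; a real subscriber would receive `upd` calls from the tickerplant rather than invoke them itself.

```q
/ Hypothetical schema and handler names, for illustration only
latest:([sym:`symbol$()] time:`timespan$(); price:`float$())

/ keep only the most recent row per sym from each incoming batch
updTrade:{`latest upsert select last time, last price by sym from x}
upd:{[t;x] if[t=`trade; updTrade x]}

/ simulate one batch as it might be published by the tickerplant
upd[`trade; ([] time:0D09:30:00 0D09:30:01 0D09:30:02; sym:`a`b`a; price:10 20 11f)]
latest
```

Keying `latest` on `sym` lets `upsert` overwrite the existing row for an instrument, so the table stays at one row per instrument however many updates arrive.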

!!! tip "Best practices for real-time subscribers"

@@ -104,9 +103,9 @@ Subscribes to the intraday messages and typically performs some additional funct



## Historical database
### Historical database (HDB)

Provides a queryable data store of historical data;
A kdb+ process acting as an HDB (historical database) provides a queryable data store of historical data;
for example, for creating customer reports on order execution times, or sensor failure analyses.

Large tables are usually stored on disk partitioned by date, with each column stored as its own file.
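
A small sketch of writing and reading one date partition, assuming a hypothetical `trade` table, a database directory named `hdb` in the current working directory, and an arbitrary date:

```q
/ Hypothetical table, directory name hdb and date, for illustration only
trade:([] sym:`a`b`a; price:10 20 11f)

/ write trade to the 2024.07.18 partition under ./hdb, sorted and parted on sym
.Q.dpft[`:hdb; 2024.07.18; `sym; `trade]

/ load the database and query a single partition
\l hdb
select from trade where date=2024.07.18
```

After the write, the `hdb` directory should contain a `2024.07.18/trade/` subdirectory with one file per column, plus a `sym` file holding the enumeration domain for symbol columns.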
@@ -135,7 +134,7 @@ The dates are referred to as _partitions_ and this on-disk structure contributes
[Compression in kdb+](../wp/compress/index.md)


## Gateway
### Gateway

The entry point into the kdb+ system. Responsible for routing incoming queries to the appropriate processes, and returning their results.

2 changes: 1 addition & 1 deletion docs/basics/datatypes.md
@@ -259,7 +259,7 @@ q)0w + 5
q)-0Wh
-32767h

Integer promotion is documented for [Add](../../ref/add/#range-and-domains).
Integer promotion is documented for [Add](../ref/add.md#range-and-domains).

Integer infinities

2 changes: 1 addition & 1 deletion docs/basics/errors.md
@@ -56,7 +56,7 @@ bad lambda
[](){#badmsg}
badmsg

: Failure in [IPC validator](../releases/ChangesIn2.7/#ipc-message-validator)
: Failure in [IPC validator](../releases/ChangesIn2.7.md#ipc-message-validator)

bad meta data in file

2 changes: 1 addition & 1 deletion docs/basics/glossary.md
@@ -102,7 +102,7 @@ Applying a value to its argument/s or indexes by writing it to the left of an ar

## Chained tickerplant

A [chained tickerplant](../kb/chained-tickerplant.md) subscribes to the master tickerplant and receives updates like any other subscriber, and then serves that data to its subscribers in turn.
A [chained tickerplant](../kb/kdb-tick.md#chained-tickerplants) subscribes to the master tickerplant and receives updates like any other subscriber, and then serves that data to its subscribers in turn.


## Character constant
58 changes: 34 additions & 24 deletions docs/basics/internal.md
@@ -10,25 +10,25 @@ description: The operator ! with a negative integer as left-argument calls an in
The operator `!` with a negative integer as left argument calls an internal function.

<div markdown="1" class="typewriter">
[0N!x](#0nx-show) show Replaced:
[-4!x](#-4x-tokens) tokens -1! [hsym](../ref/hsym.md)
[-8!x](#-8x-to-bytes) to bytes -2! [attr](../ref/attr.md)
[-9!x](#-9x-from-bytes) from bytes -3! [.Q.s1](../ref/dotq.md#s1-string-representation)
[-10!x](#-10x-type-enum) type enum -5! [parse](../ref/parse.md)
[-11!](#-11-streaming-execute) streaming execute -6! [eval](../ref/eval.md)
[-14!x](#-14x-quote-escape) quote escape -7! [hcount](../ref/hcount.md)
[-16!x](#-16x-ref-count) ref count -12! [.Q.host](../ref/dotq.md#host-hostname)
[-18!x](#-18x-compress-byte) compress byte -13! [.Q.addr](../ref/dotq.md#addr-ip-address)
[-21!x](#-21x-compression-stats) compression stats -15! [md5](../ref/md5.md)
[-22!x](#-22x-uncompressed-length) uncompressed length -19! [set](../ref/get.md#set)
[-23!x](#-23x-memory-map) memory map -20! [.Q.gc](../ref/dotq.md#gc-garbage-collect)
[-25!x](#-25x-async-broadcast) async broadcast -24! [reval](../ref/eval.md#reval)
[-26!x](#-26x-ssl) SSL -29! [.j.k](../ref/dotj.md#jk-deserialize)
[-27!(x;y)](#-27xy-format) format -31! [.j.jd](../ref/dotj.md#jjd-serialize-infinity)
[-30!x](#-30x-deferred-response) deferred response -32! [.Q.btoa](../ref/dotq.md#btoa-b64-encode)
[-33!x](#-33x-sha-1-hash) SHA-1 hash -34! [.Q.ts](../ref/dotq.md#ts-time-and-space)
[-36!](#-36-load-master-key) load master key -35! [.Q.gz](../ref/dotq.md#gz-gzip)
[-38!x](#-38x-socket-table) socket table -37! [.Q.prf0](../ref/dotq.md#prf0-code-profiler)
[0N!x](#0nx-show) show Replaced:
[-4!x](#-4x-tokens) tokens -1! [hsym](../ref/hsym.md)
[-8!x](#-8x-to-bytes) to bytes -2! [attr](../ref/attr.md)
[-9!x](#-9x-from-bytes) from bytes -3! [.Q.s1](../ref/dotq.md#s1-string-representation)
[-10!x](#-10x-type-enum) type enum -5! [parse](../ref/parse.md)
[-11!](#-11-streaming-execute) streaming execute -6! [eval](../ref/eval.md)
[-14!x](#-14x-quote-escape) quote escape -7! [hcount](../ref/hcount.md)
[-16!x](#-16x-ref-count) ref count -12! [.Q.host](../ref/dotq.md#host-hostname)
[-18!x](#-18x-compress-byte) compress byte -13! [.Q.addr](../ref/dotq.md#addr-ip-address)
[-21!x](#-21x-compressionencryption-stats) compression/encryption stats -15! [md5](../ref/md5.md)
[-22!x](#-22x-uncompressed-length) uncompressed length -19! [set](../ref/get.md#set)
[-23!x](#-23x-memory-map) memory map -20! [.Q.gc](../ref/dotq.md#gc-garbage-collect)
[-25!x](#-25x-async-broadcast) async broadcast -24! [reval](../ref/eval.md#reval)
[-26!x](#-26x-ssl) SSL -29! [.j.k](../ref/dotj.md#jk-deserialize)
[-27!(x;y)](#-27xy-format) format -31! [.j.jd](../ref/dotj.md#jjd-serialize-infinity)
[-30!x](#-30x-deferred-response) deferred response -32! [.Q.btoa](../ref/dotq.md#btoa-b64-encode)
[-33!x](#-33x-sha-1-hash) SHA-1 hash -34! [.Q.ts](../ref/dotq.md#ts-time-and-space)
[-36!](#-36-load-master-key) load master key -35! [.Q.gz](../ref/dotq.md#gz-gzip)
[-38!x](#-38x-socket-table) socket table -37! [.Q.prf0](../ref/dotq.md#prf0-code-profiler)
[-120!x](#-120x-memory-domain) memory domain
</div>

@@ -150,7 +150,7 @@ Where `n` is a non-negative integer and `x` is a logfile handle
In replaying, if the logfile references an undefined function, the function name is signalled as an error.

:fontawesome-solid-graduation-cap:
[Replaying logfiles](../kb/replay-log.md)
[Log files](../kb/logging.md)


## `-14!x` (quote escape)
@@ -208,13 +208,14 @@ q)get[`:test]~get`:ztest
[File compression](../kb/file-compression.md)
<br>
:fontawesome-solid-book:
[`.z.zd` zip defaults](../ref/dotz.md#zzd-zip-defaults)
[`.z.zd` zip defaults](../ref/dotz.md#zzd-compressionencryption-defaults)
-->

## `-21!x` (compression stats)
[](){#-21x-compression-stats}
## `-21!x` (compression/encryption stats)

Where `x` is a file symbol, returns a dictionary of compression statistics for it.
The dictionary is empty if the file is not compressed.
Where `x` is a file symbol, returns a dictionary of compression/encryption statistics for it. Encryption available since 4.0 2019.12.12.
The dictionary is empty if the file is not compressed/encrypted.

```q
q)-21!`:ztest / compressed
@@ -226,13 +227,22 @@ zipLevel | 6i
q)-21!`:test / not compressed
q)count -21!`:test
0
q)-21!`:ztest / encrypted
compressedLength | 40088
uncompressedLength| 40008
algorithm | 16i
logicalBlockSize | 17i
zipLevel | 6i
```

:fontawesome-solid-book:
[`set`](../ref/get.md#set)
<br>
:fontawesome-solid-database:
[File compression](../kb/file-compression.md)
<br>
:fontawesome-solid-database:
[Data at rest encryption (DARE)](../kb/dare.md)


## `-22!x` (uncompressed length)
2 changes: 1 addition & 1 deletion docs/basics/ipc.md
@@ -294,7 +294,7 @@ Finer grained authorization can be implemented by tracking user information with
```q
q)\p 5000
q)allowedFns:(`func1;`func2;`func3;+;-) / list of allowed function/ops to call
q)checkFn:{if[not x in allowedFns;'(-3!x)," not allowed"];}
q)checkFn:{if[not x in allowedFns;'(.Q.s1 x)," not allowed"];}
q)validatePT:{if[0h=t:type x;if[(not 0h=type first x)&1=count first x;checkFn first x;];.z.s each x where 0h=type each x;];}
q).z.pg:{if[10h=type x;x:parse x;];validatePT x;eval x}
```
2 changes: 1 addition & 1 deletion docs/basics/syscmds.md
@@ -338,7 +338,7 @@ There is no garbage since q uses reference counting. As soon as there are no ref

During that return of memory, q checks if the capacity of the object is ≥64MB. If it is and `\g` is 1, the memory is returned immediately to the OS; otherwise, the memory is returned to the thread-local heap for reuse.

Executing [`.Q.gc[]`](../ref/dotq/#qgc-garbage-collect) additionally attempts to coalesce pieces of the heap into their original allocation units and returns any units ≥64MB to the OS.
Executing [`.Q.gc[]`](../ref/dotq.md#qgc-garbage-collect) additionally attempts to coalesce pieces of the heap into their original allocation units and returns any units ≥64MB to the OS.

Since V3.3 2015.08.23 (Linux only) unused pages in the heap are dropped from RSS during `.Q.gc[]`.

51 changes: 0 additions & 51 deletions docs/kb/chained-tickerplant.md

This file was deleted.

4 changes: 2 additions & 2 deletions docs/kb/dare.md
@@ -160,7 +160,7 @@ Individual files can be encrypted as e.g.
(`:ztest;17;16;6) set asc 10000?`3 / encrypt an individual file
```

Or use [`.z.zd`](../ref/dotz.md#zzd-zip-defaults) for a process-wide default setting for all qualifying files.
Or use [`.z.zd`](../ref/dotz.md#zzd-compressionencryption-defaults) for a process-wide default setting for all qualifying files.

```q
.z.zd:17 2 6 / zlib compression
@@ -170,7 +170,7 @@ Or use [`.z.zd`](../ref/dotz.md#zzd-zip-defaults) for a process-wide default set

When using the global setting `.z.zd`, files which do not qualify for encryption are filenames with an extension. e.g. `abc.bin`, `.d`.

Encryption adds a small amount of data, depending on the logical block size chosen, amounting to less than 2% of the overall size for typical DB files. The encoded size is reported via the command [`-21!filename`](../basics/internal.md#-21x-compression-stats).
Encryption adds a small amount of data, depending on the logical block size chosen, amounting to less than 2% of the overall size for typical DB files. The encoded size is reported via the command [`-21!filename`](../basics/internal.md#-21x-compressionencryption-stats).

## File locking

8 changes: 5 additions & 3 deletions docs/kb/file-compression.md
@@ -16,7 +16,7 @@ Q operators and keywords read both compressed and uncompressed files.
## Write compressed files

Use [`set`](../ref/get.md#set) with a left argument that specifies the file or splay target, and the [compression parameters](#compression-parameters).
(For a splayed table, you can [specify the compression of each column](../ref/get.md#compression).)
(For a splayed table, you can [specify the compression of each column](../ref/get.md#compressionencryption).)

```q
q)`:a set 1000#enlist asc 1000?10 / uncompressed file
@@ -76,6 +76,8 @@ alg algorithm level since

!!! detail "Level 0 for `lz4hc` default compression; level>16 behaves the same as 16"

!!! note "Algorithm is also used to specify the [encryption](dare.md#encryption) algorithm which can be [used with compression](dare.md#compression-with-encryption)"


### Selective compression

@@ -87,7 +89,7 @@ So files that do not compress well, or have an access pattern that does not perf

### Compression statistics

The [`-21!` internal function](../basics/internal.md#-21x-compression-stats) returns a dictionary of compression statistics, or an empty dictionary if the file is not compressed.
The [`-21!` internal function](../basics/internal.md#-21x-compressionencryption-stats) returns a dictionary of compression statistics, or an empty dictionary if the file is not compressed.

[`hcount`](../ref/hcount.md) returns the uncompressed file length.

@@ -96,7 +98,7 @@ The [`-21!` internal function](../basics/internal.md#-21x-compression-stats) ret

kdb+ can write compressed files by default.

This is governed by the [zip defaults `.z.zd`](../ref/dotz.md#zzd-zip-defaults).
This is governed by the [zip defaults `.z.zd`](../ref/dotz.md#zzd-compressionencryption-defaults).
Set this as an integer vector, e.g.

```q