Add architecture notes
ohsayan committed Dec 10, 2023
1 parent 2cf0ce5 commit 24549fb
Showing 1 changed file with 28 additions and 0 deletions.
28 changes: 28 additions & 0 deletions docs/4.architecture.md
@@ -11,6 +11,34 @@ That's why, every component in Skytable has been engineered from the ground up,

And all of that, without you having to be an expert, and with the least maintenance that you can expect.

## Fundamental differences from SQL

BlueQL deliberately looks and feels like SQL, so that developers from the SQL world find it easier to migrate. But the way Skytable evaluates and executes queries is fundamentally different from SQL engines, and even from other NoSQL engines. Here are some key differences:

- All DML queries are point queries and **not** range queries:
  - This means that they will either return at least one row or an error
  - A multi-row query won't work unless you add `ALL`. `ALL` explicitly indicates that you're applying to (or selecting) a large range, which can be inefficient
  - Multi-row DML queries are slow and inefficient, and are discouraged
- You can **only** query on the primary index, with a fixed set of operators. Once again, this is because of speed (and the problem with scaling multiple indexes).
- **Remember, in NoSQL systems we denormalize.** Hence, as with many other NoSQL systems, there are no `JOIN`s or foreign keys
- A different transactional model:
  - All DDL and DCL queries are ACID transactions
  - DML queries, however, are not ACID transactions. Instead, they are automatically made part of a batch transaction, and the engine decides *when* it will execute that batch, for example based on the pressure on the cache. That's because our focus is to maximize throughput
  - All these differences mean that **DDL and DCL transactions are ACID transactions** while **DML queries are ACI and *eventually* D** (we call it a *delayed durability transaction*). This delay can be adjusted so that DML queries *emulate* ACID transactions, but that defeats the point of an eventually durable system, which aims to heavily increase throughput.
  - Eventually durable transactions rely on the idea that hardware failure, although a prominent concern, is still rare, thanks to the extreme hard work that cloud vendors put into reliability engineering. If you plan to run on unreliable hardware, then the delay setting (reliability service) is what you need to change.
  - For extremely unreliable hardware, on the other hand, we're working on a new storage driver, `rtsyncblock`, that is expected to be released in Q1'24
  - The transactional engine powering DDL and DCL queries may often choose to demote a transaction to a virtual transaction. This makes sure that the transaction is durable, but it is not necessarily executed right away and is instead executed eventually. If the system crashes, the engine will still be able to execute the transaction (even if it crashed halfway)
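
To make the "ACI and *eventually* D" idea above more concrete, here is a minimal Python sketch of delayed durability. It is a toy model and not Skytable's implementation: the class `DelayedDurabilityStore`, the `flush_threshold` parameter, and the use of a simple pending-change count as a stand-in for "pressure" are all assumptions made for illustration.

```python
class DelayedDurabilityStore:
    """Toy model: writes are visible in memory at once, but only become
    durable when the engine flushes a whole batch (hypothetical design)."""

    def __init__(self, flush_threshold=3):
        self.memory = {}        # in-memory state, visible to readers immediately
        self.pending = []       # DML changes that are not yet durable
        self.durable_log = []   # stand-in for the on-disk journal
        self.flush_threshold = flush_threshold

    def upsert(self, key, value):
        # The change is applied in memory right away.
        self.memory[key] = value
        self.pending.append((key, value))
        # Assumed trigger: flush once enough changes have piled up.
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # The whole batch is persisted as a single unit.
        self.durable_log.append(list(self.pending))
        self.pending.clear()

    def recover(self):
        """Rebuild the state after a crash: only flushed batches survive."""
        state = {}
        for batch in self.durable_log:
            for key, value in batch:
                state[key] = value
        return state


store = DelayedDurabilityStore(flush_threshold=3)
for i in range(4):
    store.upsert(f"user{i}", i)

print(store.memory)     # all four rows are visible to readers
print(store.recover())  # only the first three rows were flushed
```

In this sketch, setting `flush_threshold=1` makes every write durable before the next one is accepted, which mirrors how shortening the delay emulates ACID behaviour at the cost of throughput.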

:::tip
We hope you can now see how Skytable's workings are fundamentally different from traditional engines. We also know
that you might have a lot of questions! If you do, please reach out. We're here to help.
:::
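
The point-query rule from the list of differences above can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Model` class, its method names, and the mandatory `limit` argument on the multi-row path are hypothetical and do not describe Skytable's API or BlueQL syntax.

```python
from itertools import islice


class Model:
    """Toy model with a single primary index (a dict keyed by primary key)."""

    def __init__(self):
        self.rows = {}

    def insert(self, pk, row):
        if pk in self.rows:
            raise KeyError(f"row {pk!r} already exists")
        self.rows[pk] = row

    def select(self, pk):
        """Point query: one primary-key lookup; a miss is an error."""
        if pk not in self.rows:
            raise KeyError(f"row {pk!r} not found")
        return self.rows[pk]

    def select_all(self, limit):
        """Explicit multi-row query: walks the index, so the caller must
        bound it (the bound is an assumption of this sketch)."""
        return list(islice(self.rows.values(), limit))


users = Model()
users.insert("sayan", {"email": "sayan@example.com"})
users.insert("alice", {"email": "alice@example.com"})
users.insert("bob", {"email": "bob@example.com"})

print(users.select("sayan"))       # a single row, found via the primary index
print(users.select_all(limit=2))   # the caller explicitly opts in to a scan
```

The design point is that the cheap path (`select`) never scans, while the expensive path has to be requested by name, much like adding `ALL` to a query.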


## Data model

Just like SQL has `DATABASE`s, Skytable has `SPACE`s, which are collections of what we call data containers like tables.