Merge pull request #63 from starskey-io/starskey-43
- DeleteByFilter, DeleteByRange to return deleted count
- DeleteByRange, DeleteByFilter, DeleteByPrefix to use seen keys like other methods
- Opened sstable vlog and klog no longer need to be synced on reopen
- BackgroundFSync config #60
- BackgroundFSyncInterval config #60
- More logging on Open method
- Corrected prior logic regarding skipping the bloom page
- README refinements, further explanations, and better descriptions #62
- Added discord #61
- Comment updates, corrections and additions
- Remove redundant max from t-tree
- Added OptionalConfig to allow optional internal system configuration
- Added TestOpenOptionalInternalConfig test
guycipher authored Feb 19, 2025
2 parents dfe81bd + 14ecdef commit 6bfb6bd
Showing 6 changed files with 383 additions and 111 deletions.
55 changes: 37 additions & 18 deletions README.md
@@ -5,22 +5,26 @@
[![Go Reference](https://pkg.go.dev/badge/github.com/starskey-io/starskey.svg)](https://pkg.go.dev/github.com/starskey-io/starskey)

Starskey is a fast embedded key-value store package for Go! Starskey implements a multi-level, durable log-structured merge tree,
optimized for write and read efficiency.

## Features
- **Levelled partial merge compaction** Compactions occur on writes. If any disk level reaches its max size, half of its sstables are merged into a new sstable and placed into the next level. This algorithm is recursive until the last level; if the last level is full, all of its sstables are merged into a new sstable. During merge operations, tombstones (deleted keys) are removed once a key reaches the last level.
- **Simple API** with Put, Get, Delete, Range, FilterKeys, Update (for txns), PrefixSearch, LongestPrefixSearch, DeleteByRange, DeleteByFilter, DeleteByPrefix.
- **ACID transactions** You can group multiple operations into a single atomic transaction. If any operation fails, the entire transaction is rolled back; only operations already applied within the transaction are undone. These transactions are fully serializable. Transactions are also thread safe, so you can add operations to them safely.
- **Configurable options** You can configure many options such as max levels, memtable threshold, bloom filters, succinct range filters, logging, compression and more.
- **WAL with recovery** Starskey uses a write-ahead log to ensure durability. The memtable is replayed if a flush did not occur prior to shutdown; after a sorted run to disk, the WAL is truncated.
- **Key value separation** Keys and values are stored separately for sstables within a klog and vlog respectively.
- **Bloom filters** Each sstable has an in memory bloom filter to reduce disk reads. Bloom filters are used to check if a key exists in an SST instead of scanning it entirely.
- **Succinct Range Filters** If enabled, each sstable uses a SuRF instead of a bloom filter; this speeds up range and prefix queries. SuRFs use more memory than bloom filters. Only a bloom filter OR a SuRF can be enabled, not both.
- **Fast** up to 400k+ ops per second.
- **Compression** S2 and Snappy compression are available.
- **Logging** Logging to file is available; Starskey writes to standard out if file logging is not enabled.
- **Thread safe** Starskey is thread safe. Multiple goroutines can read and write to Starskey concurrently. Starskey uses one global lock to keep things consistent.
- **T-Tree memtable** The memory table is a balanced in-memory tree data structure, designed as an alternative to AVL trees and B-Trees for main-memory use.
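The levelled partial-merge compaction rule described above (merge half of a full level's sstables into the next level; at the last level merge everything, dropping tombstones) can be sketched as a small model. This is an illustration of the rule, not Starskey's actual implementation:

```go
package main

import "fmt"

// level models the sstable count of one disk level in this sketch.
type level struct {
	sstables []string // sstable identifiers, oldest first
	maxSize  int      // max sstables before compaction triggers
}

// compact applies the partial-merge rule recursively: when a level is
// full, merge the older half into one sstable and push it down a level.
// At the last level, merge everything into a single sstable (tombstones
// would be dropped during this final merge).
func compact(levels []level, i int) {
	if i >= len(levels) || len(levels[i].sstables) < levels[i].maxSize {
		return
	}
	if i == len(levels)-1 {
		levels[i].sstables = []string{"merged"}
		return
	}
	half := len(levels[i].sstables) / 2
	levels[i].sstables = levels[i].sstables[half:]                // keep the newer half
	levels[i+1].sstables = append(levels[i+1].sstables, "merged") // merged run moves down
	compact(levels, i+1)                                          // next level may now be full
}

func main() {
	levels := []level{
		{sstables: []string{"a", "b", "c", "d"}, maxSize: 4},
		{sstables: []string{"e"}, maxSize: 4},
	}
	compact(levels, 0)
	fmt.Println(len(levels[0].sstables), len(levels[1].sstables)) // 2 2
}
```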

## Discord
Chat about everything Starskey on our Discord [Server](https://discord.gg/HVxkhyys3R)

## Bench
Use the benchmark program at [bench](https://github.com/starskey-io/bench) to compare Starskey with other popular key value stores/engines.

@@ -39,24 +43,29 @@ func main() {
skey, err := starskey.Open(&starskey.Config{
Permission: 0755, // Dir, file permission
Directory: "db_dir", // Directory to store data
FlushThreshold: (1024 * 1024) * 24, // 24mb Flush threshold in bytes, for production use 64mb or higher
MaxLevel: 3, // Max number of disk levels
SizeFactor: 10, // Size factor for each level. Say 10, that's 10 * the FlushThreshold at each level. E.g. with a 1MB threshold, level 1 is 10MB, level 2 is 100MB, level 3 is 1GB.
BloomFilter: false, // If you want to use bloom filters
SuRF: false, // If enabled will speed up range queries as we check if an sstable has the keys we are looking for.
Logging: true, // Enable logging to file
Compression: false, // Enable compression
CompressionOption: starskey.NoCompression, // Or SnappyCompression, S2Compression

// Internal options
// Optional: &OptionalConfig{
// BackgroundFSync: .. If you don't want to fsync writes to disk (default is true)
// BackgroundFSyncInterval: .. Interval for background fsync, if configured true (default is 256ms)
// TTreeMin: .. Minimum degree of the T-Tree
// TTreeMax: .. Maximum degree of the T-Tree
// PageSize: .. Page size for internal pagers
// BloomFilterProbability: .. Bloom filter probability
// },
}) // Config cannot be nil
if err != nil {
// ..handle error
}


key := []byte("some_key")
value := []byte("some_value")

@@ -77,12 +86,18 @@ func main() {
}

fmt.Println(string(key), string(v))

// Close starskey
if err := skey.Close(); err != nil {
// ..handle error
}

}

```
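The per-level capacities implied by `SizeFactor` can be computed as a simple product. This sketch assumes capacity grows as `FlushThreshold * SizeFactor^level`, which is what the comment's example (1MB threshold, factor 10 giving 10MB, 100MB, 1GB) suggests; it is an assumption for illustration, not the library's documented formula:

```go
package main

import "fmt"

// levelCapacity returns the assumed byte capacity of disk level n
// (1-indexed) as flushThreshold * sizeFactor^n. Illustrative only.
func levelCapacity(flushThreshold, sizeFactor int64, n int) int64 {
	c := flushThreshold
	for i := 0; i < n; i++ {
		c *= sizeFactor
	}
	return c
}

func main() {
	mb := int64(1024 * 1024)
	for n := 1; n <= 3; n++ {
		// With a 1MB flush threshold and SizeFactor 10: 10, 100, 1000 MB.
		fmt.Printf("level %d: %d MB\n", n, levelCapacity(mb, 10, n)/mb)
	}
}
```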

## Range Keys
You can provide a start and end key to retrieve a range of keys.
```go
results, err := skey.Range([]byte("key900"), []byte("key980"))
if err != nil {
@@ -110,7 +125,9 @@ if err != nil {
```

## Prefix Searches
Starskey supports optimized prefix searches.

### Longest Prefix Search
You can search for the longest prefix of a key.
```go
result, n, err := skey.LongestPrefixSearch([]byte("key"))
@@ -119,7 +136,7 @@ if err != nil {
}
```
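The README does not spell out `LongestPrefixSearch`'s exact return values; conceptually, a longest-prefix lookup over a set of stored keys behaves like the sketch below, where `n` is the length of the longest shared prefix found. This is an illustration of the concept, not Starskey's implementation:

```go
package main

import "fmt"

// longestPrefix returns the stored key sharing the longest common
// prefix with the query, plus that prefix's length. Illustrative only.
func longestPrefix(keys []string, query string) (string, int) {
	best, bestLen := "", 0
	for _, k := range keys {
		l := 0
		for l < len(k) && l < len(query) && k[l] == query[l] {
			l++
		}
		if l > bestLen {
			best, bestLen = k, l
		}
	}
	return best, bestLen
}

func main() {
	keys := []string{"key1", "keyboard", "kept"}
	k, n := longestPrefix(keys, "keybo")
	fmt.Println(k, n) // keyboard 5
}
```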

### Prefix Search
You can search for a prefix of a key.
```go
results, err := skey.PrefixSearch([]byte("ke"))
@@ -129,7 +146,7 @@ if err != nil {
```

## ACID Transactions
Use transactions to group multiple operations into a single atomic unit. If any operation fails, the entire transaction is rolled back; only operations already applied within the transaction are undone.
```go
txn := skey.BeginTxn()
if txn == nil {
@@ -167,9 +184,10 @@ if err := skey.Delete([]byte("key")); err != nil {

### Delete by range
```go
n, err := skey.DeleteByRange([]byte("startKey"), []byte("endKey"))
if err != nil {
    // ..handle error
}
// n is the number of keys deleted
```

### Delete by filter
@@ -179,9 +197,10 @@ compareFunc := func(key []byte) bool {
return bytes.HasPrefix(key, []byte("c"))
}

n, err := skey.DeleteByFilter(compareFunc)
if err != nil {
    // ..handle error
}
// n is the number of keys deleted
```

### Delete by key prefix
10 changes: 9 additions & 1 deletion pager/pager.go
@@ -29,6 +29,7 @@ import (
"math"
"os"
"sync"
"sync/atomic"
"time"
)

@@ -40,6 +41,7 @@ type Pager struct {
wg *sync.WaitGroup // WaitGroup for background fsync
syncInterval time.Duration // Fsync interval
sync bool // To sync or not to sync
closed atomic.Bool // Tracks whether the pager has already been closed
}

// Iterator is the iterator struct used for
@@ -66,6 +68,10 @@ func Open(filename string, flag int, perm os.FileMode, pageSize int, syncOn bool
if !pager.sync {
return pager, nil
}

// Explicitly initialize closed to false (the atomic.Bool zero value is already false)
pager.closed.Store(false)

// Start background sync
pager.wg.Add(1)
go pager.backgroundSync()
@@ -82,7 +88,9 @@ func (p *Pager) Close() error {
if p.file == nil {
return nil
}

// Only close channel if sync is enabled and we haven't closed before
if p.sync && !p.closed.Swap(true) {
close(p.syncQuit)
p.wg.Wait()
}
