Commit

Update storage docs
jacoblee93 committed Jul 30, 2024
1 parent f4b4e0c commit dc07248
Showing 3 changed files with 28 additions and 178 deletions.
23 changes: 23 additions & 0 deletions docs/core_docs/docs/concepts.mdx
@@ -513,6 +513,29 @@ Retrievers accept a string query as input and return an array of `Document`s as

For specifics on how to use retrievers, see the [relevant how-to guides here](/docs/how_to/#retrievers).

### Key-value stores

For some techniques, such as [indexing and retrieval with multiple vectors per document](/docs/how_to/multi_vector/), having some sort of key-value (KV) storage is helpful.

LangChain includes a [`BaseStore`](https://api.js.langchain.com/classes/langchain_core_stores.BaseStore.html) interface,
which allows for storage of arbitrary data. However, LangChain components that require KV-storage accept a
more specific `BaseStore<string, Uint8Array>` instance that stores binary data (referred to as a `ByteStore`), and internally take care of
encoding and decoding data for their specific needs.

This means that as a user, you only need to think about one type of store rather than different ones for different types of data.
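
As a rough illustration of what "encoding and decoding internally" can look like, the sketch below wraps a byte store with JSON helpers. `ByteStoreLike`, `MapByteStore`, `setJson`, and `getJson` are hypothetical names made up for this example and are not part of LangChain; an actual component may serialize its data differently.

```typescript
// Hypothetical, minimal shape of a byte store: string keys, binary values.
interface ByteStoreLike {
  mget(keys: string[]): Promise<(Uint8Array | undefined)[]>;
  mset(keyValuePairs: [string, Uint8Array][]): Promise<void>;
}

// Illustrative Map-backed implementation (an assumption, not LangChain's own store).
class MapByteStore implements ByteStoreLike {
  private data = new Map<string, Uint8Array>();

  async mget(keys: string[]): Promise<(Uint8Array | undefined)[]> {
    return keys.map((key) => this.data.get(key));
  }

  async mset(keyValuePairs: [string, Uint8Array][]): Promise<void> {
    for (const [key, value] of keyValuePairs) {
      this.data.set(key, value);
    }
  }
}

const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Encode any JSON-serializable value to bytes before writing it.
async function setJson(
  store: ByteStoreLike,
  key: string,
  value: unknown
): Promise<void> {
  await store.mset([[key, encoder.encode(JSON.stringify(value))]]);
}

// Decode bytes back into a value, or return undefined for a missing key.
async function getJson<T>(
  store: ByteStoreLike,
  key: string
): Promise<T | undefined> {
  const [bytes] = await store.mget([key]);
  return bytes === undefined
    ? undefined
    : (JSON.parse(decoder.decode(bytes)) as T);
}
```

With helpers like these, calling code only ever handles plain objects, while the store itself sees nothing but `Uint8Array` values.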

#### Interface

All [`BaseStores`](https://api.js.langchain.com/classes/langchain_core_stores.BaseStore.html) support the following interface. Note that the interface allows
for modifying **multiple** key-value pairs at once:

- `mget(keys: string[]): Promise<(undefined | Uint8Array)[]>`: get the contents of multiple keys, returning `undefined` for any key that does not exist
- `mset(keyValuePairs: [string, Uint8Array][]): Promise<void>`: set the contents of multiple keys
- `mdelete(keys: string[]): Promise<void>`: delete multiple keys
- `yieldKeys(prefix?: string): AsyncGenerator<string>`: yield all keys in the store, optionally filtering by a prefix

For key-value store implementations, see [this section](/docs/integrations/stores/).
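
To make the four methods concrete, here is a minimal sketch of a store with the same method shapes, backed by a `Map`. `SimpleByteStore` and `demo` are made-up names for illustration only; this is not LangChain's implementation, and real integrations may behave differently around persistence and key ordering.

```typescript
// Illustrative Map-backed store exposing the four methods listed above.
// `SimpleByteStore` is a hypothetical name, not a LangChain class.
class SimpleByteStore {
  private data = new Map<string, Uint8Array>();

  // Missing keys come back as undefined, in the same order as `keys`.
  async mget(keys: string[]): Promise<(Uint8Array | undefined)[]> {
    return keys.map((key) => this.data.get(key));
  }

  async mset(keyValuePairs: [string, Uint8Array][]): Promise<void> {
    for (const [key, value] of keyValuePairs) {
      this.data.set(key, value);
    }
  }

  async mdelete(keys: string[]): Promise<void> {
    for (const key of keys) {
      this.data.delete(key);
    }
  }

  // Yield every key, or only those starting with `prefix` when one is given.
  async *yieldKeys(prefix?: string): AsyncGenerator<string> {
    for (const key of this.data.keys()) {
      if (prefix === undefined || key.startsWith(prefix)) {
        yield key;
      }
    }
  }
}

async function demo(): Promise<string[]> {
  const encoder = new TextEncoder();
  const store = new SimpleByteStore();

  // Write three pairs in one call, then delete one of them.
  await store.mset([
    ["user:1", encoder.encode("Ada")],
    ["user:2", encoder.encode("Grace")],
    ["session:1", encoder.encode("abc")],
  ]);
  await store.mdelete(["user:2"]);

  // Collect only the keys that start with "user:".
  const keys: string[] = [];
  for await (const key of store.yieldKeys("user:")) {
    keys.push(key);
  }
  return keys;
}
```

In this sketch, `demo()` resolves to `["user:1"]`: `user:2` was deleted and `session:1` does not match the prefix.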

### Tools

<span data-heading-keywords="tool,tools"></span>
181 changes: 4 additions & 177 deletions docs/core_docs/docs/integrations/stores/index.mdx
@@ -2,183 +2,10 @@
sidebar_class_name: hidden
---

# Stores
# Key-value stores

Storing data in key value format is quick and efficient, and can be a powerful tool for LLM applications. The `BaseStore` class provides a simple interface for getting, setting, deleting and iterating over lists of key value pairs.
[Key-value stores](/docs/concepts/#key-value-stores) are used by other LangChain components to store and retrieve data.

The public API of `BaseStore` in LangChain JS offers four main methods:
import DocCardList from "@theme/DocCardList";

```typescript
abstract mget(keys: K[]): Promise<(V | undefined)[]>;

abstract mset(keyValuePairs: [K, V][]): Promise<void>;

abstract mdelete(keys: K[]): Promise<void>;

abstract yieldKeys(prefix?: string): AsyncGenerator<K | string>;
```

The `m` prefix stands for multiple, and indicates that these methods can be used to get, set and delete multiple key value pairs at once.
The `yieldKeys` method is a generator function that can be used to iterate over all keys in the store, or all keys with a given prefix.

It's that simple!

So far LangChain.js has two base integrations for `BaseStore`:

- [`InMemoryStore`](/docs/integrations/stores/in_memory)
- [`LocalFileStore`](/docs/integrations/stores/file_system) (Node.js only)

## Use Cases

### Chat history

If you're building web apps with chat, the `BaseStore` family of integrations can come in very handy for storing and retrieving chat history.

### Caching

The `BaseStore` family can be a useful alternative to our other caching integrations.
For example the [`LocalFileStore`](/docs/integrations/stores/file_system) allows for persisting data through the file system. It also is incredibly fast, so your users will be able to access cached data in a snap.

See the individual sections for deeper dives on specific storage providers.

## Reading Data

### In Memory

Reading data is simple with KV stores. Below is an example using the [`InMemoryStore`](/docs/integrations/stores/in_memory) and the `.mget()` method.
We'll also set our generic value type to `string` so we can have type safety setting our strings.

Import the [`InMemoryStore`](/docs/integrations/stores/in_memory) class.

```typescript
import { InMemoryStore } from "langchain/storage/in_memory";
```

Instantiate a new instance and pass `string` as our generic for the value type.

```typescript
const store = new InMemoryStore<string>();
```

Next we can call `.mset()` to write multiple values at once.

```typescript
const data: [string, string][] = [
  ["key1", "value1"],
  ["key2", "value2"],
];

await store.mset(data);
```

Finally, call the `.mget()` method to retrieve the values from our store.

```typescript
// Use a new name so this does not clash with the `data` array declared above.
const values = await store.mget(["key1", "key2"]);

console.log(values);
/**
* ["value1", "value2"]
*/
```

### File System

When using the file system integration we need to instantiate via the `fromPath` method. This is required because it needs to perform checks to ensure the directory exists and is readable/writable.
You also must use a directory when using [`LocalFileStore`](/docs/integrations/stores/file_system) because each entry is stored as a unique file in the directory.

```typescript
import { LocalFileStore } from "langchain/storage/file_system";
```

```typescript
const pathToStore = "./my-store-directory";
const store = await LocalFileStore.fromPath(pathToStore);
```

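The file system integration only supports binary data, so values must be converted to `Uint8Array` before they are stored. To do this we can define an encoder for initially setting our data, and a decoder for when we retrieve data.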

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();
```

```typescript
const data: [string, Uint8Array][] = [
  ["key1", encoder.encode(new Date().toDateString())],
  ["key2", encoder.encode(new Date().toDateString())],
];

await store.mset(data);
```

```typescript
// Use a new name so this does not clash with the `data` array declared above.
const values = await store.mget(["key1", "key2"]);

console.log(values.map((v) => decoder.decode(v)));
/**
* [ 'Wed Jan 03 2024', 'Wed Jan 03 2024' ]
*/
```

## Writing Data

### In Memory

Writing data is simple with KV stores. Below is an example using the [`InMemoryStore`](/docs/integrations/stores/in_memory) and the `.mset()` method.
We'll also set our generic value type to `Date` so we can have type safety setting our dates.

Import the [`InMemoryStore`](/docs/integrations/stores/in_memory) class.

```typescript
import { InMemoryStore } from "langchain/storage/in_memory";
```

Instantiate a new instance and pass `Date` as our generic for the value type.

```typescript
const store = new InMemoryStore<Date>();
```

Finally we can call `.mset()` to write multiple values at once.

```typescript
const data: [string, Date][] = [
  ["date1", new Date()],
  ["date2", new Date()],
];

await store.mset(data);
```

### File System

When using the file system integration we need to instantiate via the `fromPath` method. This is required because it needs to perform checks to ensure the directory exists and is readable/writable.
You also must use a directory when using [`LocalFileStore`](/docs/integrations/stores/file_system) because each entry is stored as a unique file in the directory.

```typescript
import { LocalFileStore } from "langchain/storage/file_system";
```

```typescript
const pathToStore = "./my-store-directory";
const store = await LocalFileStore.fromPath(pathToStore);
```

When defining our data we must convert the values to `Uint8Array` because the file system integration only supports binary data.

To do this we can define an encoder for initially setting our data, and a decoder for when we retrieve data.

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();
```

```typescript
const data: [string, Uint8Array][] = [
  ["key1", encoder.encode(new Date().toDateString())],
  ["key2", encoder.encode(new Date().toDateString())],
];

await store.mset(data);
```
<DocCardList />
2 changes: 1 addition & 1 deletion docs/core_docs/sidebars.js
@@ -332,7 +332,7 @@ module.exports = {
},
{
type: "category",
label: "Stores",
label: "Key-value stores",
collapsed: true,
items: [
{