diff --git a/README.md b/README.md index ccf3017a..8f0a0de2 100644 --- a/README.md +++ b/README.md @@ -34,10 +34,12 @@ You can search HuggingFace for available models (Keyword: [`GGUF`](https://huggi For create a GGUF model manually, for example in Llama 2: Download the Llama 2 model + 1. Request access from [here](https://ai.meta.com/llama) 2. Download the model from HuggingFace [here](https://huggingface.co/meta-llama/Llama-2-7b-chat) (`Llama-2-7b-chat`) Convert the model to ggml format + ```bash # Start with submodule in this repo (or you can clone the repo https://github.com/ggerganov/llama.cpp.git) yarn && yarn bootstrap @@ -76,26 +78,53 @@ const context = await initLlama({ // embedding: true, // use embedding }) -// Do completion -const { text, timings } = await context.completion( +const stopWords = ['', '<|end|>', '<|eot_id|>', '<|end_of_text|>', '<|im_end|>', '<|EOT|>', '<|END_OF_TURN_TOKEN|>', '<|end_of_turn|>', '<|endoftext|>'] + +// Do chat completion +const msgResult = await context.completion( + { + messages: [ + { + role: 'system', + content: 'This is a conversation between user and assistant, a friendly chatbot.', + }, + { + role: 'user', + content: 'Hello!', + }, + ], + n_predict: 100, + stop: stopWords, + // ...other params + }, + (data) => { + // This is a partial completion callback + const { token } = data + }, +) +console.log('Result:', msgResult.text) +console.log('Timings:', msgResult.timings) + +// Or do text completion +const textResult = await context.completion( { prompt: 'This is a conversation between user and llama, a friendly chatbot. 
respond in simple markdown.\n\nUser: Hello!\nLlama:', n_predict: 100, - stop: ['', 'Llama:', 'User:'], - // n_threads: 4, + stop: [...stopWords, 'Llama:', 'User:'], + // ...other params }, (data) => { // This is a partial completion callback const { token } = data }, ) -console.log('Result:', text) -console.log('Timings:', timings) +console.log('Result:', textResult.text) +console.log('Timings:', textResult.timings) ``` The binding’s deisgn inspired by [server.cpp](https://github.com/ggerganov/llama.cpp/tree/master/examples/server) example in llama.cpp, so you can map its API to LlamaContext: -- `/completion`: `context.completion(params, partialCompletionCallback)` +- `/completion` and `/chat/completions`: `context.completion(params, partialCompletionCallback)` - `/tokenize`: `context.tokenize(content)` - `/detokenize`: `context.detokenize(tokens)` - `/embedding`: `context.embedding(content)` @@ -110,6 +139,7 @@ Please visit the [Documentation](docs/API) for more details. You can also visit the [example](example) to see how to use it. 
Run the example: + ```bash yarn && yarn bootstrap @@ -142,7 +172,9 @@ You can see [GBNF Guide](https://github.com/ggerganov/llama.cpp/tree/master/gram ```js import { initLlama, convertJsonSchemaToGrammar } from 'llama.rn' -const schema = { /* JSON Schema, see below */ } +const schema = { + /* JSON Schema, see below */ +} const context = await initLlama({ model: 'file://', @@ -153,7 +185,7 @@ const context = await initLlama({ grammar: convertJsonSchemaToGrammar({ schema, propOrder: { function: 0, arguments: 1 }, - }) + }), }) const { text } = await context.completion({ @@ -171,80 +203,81 @@ console.log('Result:', text) { oneOf: [ { - type: "object", - name: "get_current_weather", - description: "Get the current weather in a given location", + type: 'object', + name: 'get_current_weather', + description: 'Get the current weather in a given location', properties: { function: { - const: "get_current_weather", + const: 'get_current_weather', }, arguments: { - type: "object", + type: 'object', properties: { location: { - type: "string", - description: "The city and state, e.g. San Francisco, CA", + type: 'string', + description: 'The city and state, e.g. 
San Francisco, CA', }, unit: { - type: "string", - enum: ["celsius", "fahrenheit"], + type: 'string', + enum: ['celsius', 'fahrenheit'], }, }, - required: ["location"], + required: ['location'], }, }, }, { - type: "object", - name: "create_event", - description: "Create a calendar event", + type: 'object', + name: 'create_event', + description: 'Create a calendar event', properties: { function: { - const: "create_event", + const: 'create_event', }, arguments: { - type: "object", + type: 'object', properties: { title: { - type: "string", - description: "The title of the event", + type: 'string', + description: 'The title of the event', }, date: { - type: "string", - description: "The date of the event", + type: 'string', + description: 'The date of the event', }, time: { - type: "string", - description: "The time of the event", + type: 'string', + description: 'The time of the event', }, }, - required: ["title", "date", "time"], + required: ['title', 'date', 'time'], }, }, }, { - type: "object", - name: "image_search", - description: "Search for an image", + type: 'object', + name: 'image_search', + description: 'Search for an image', properties: { function: { - const: "image_search", + const: 'image_search', }, arguments: { - type: "object", + type: 'object', properties: { query: { - type: "string", - description: "The search query", + type: 'string', + description: 'The search query', }, }, - required: ["query"], + required: ['query'], }, }, }, ], } ``` +
@@ -268,6 +301,7 @@ string ::= "\"" ( 2 ::= "{" space "\"function\"" space ":" space 2-function "," space "\"arguments\"" space ":" space 2-arguments "}" space root ::= 0 | 1 | 2 ``` +
## Mock `llama.rn` @@ -281,12 +315,14 @@ jest.mock('llama.rn', () => require('llama.rn/jest/mock')) ## NOTE iOS: + - The [Extended Virtual Addressing](https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_developer_kernel_extended-virtual-addressing) capability is recommended to enable on iOS project. - Metal: - We have tested to know some devices is not able to use Metal ('params.n_gpu_layers > 0') due to llama.cpp used SIMD-scoped operation, you can check if your device is supported in [Metal feature set tables](https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf), Apple7 GPU will be the minimum requirement. - It's also not supported in iOS simulator due to [this limitation](https://developer.apple.com/documentation/metal/developing_metal_apps_that_run_in_simulator#3241609), we used constant buffers more than 14. Android: + - Currently only supported arm64-v8a / x86_64 platform, this means you can't initialize a context on another platforms. The 64-bit platform are recommended because it can allocate more memory for the model. - No integrated any GPU backend yet. 
diff --git a/docs/API/README.md b/docs/API/README.md index b998afd3..cf9c9447 100644 --- a/docs/API/README.md +++ b/docs/API/README.md @@ -43,17 +43,17 @@ llama.rn #### Defined in -[index.ts:43](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L43) +[index.ts:51](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L51) ___ ### CompletionParams -Ƭ **CompletionParams**: `Omit`<`NativeCompletionParams`, ``"emit_partial_completion"``\> +Ƭ **CompletionParams**: `Omit`<`NativeCompletionParams`, ``"emit_partial_completion"`` \| ``"prompt"``\> & { `messages?`: `RNLlamaOAICompatibleMessage`[] ; `prompt?`: `string` } #### Defined in -[index.ts:41](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L41) +[index.ts:43](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L43) ___ @@ -63,7 +63,7 @@ ___ #### Defined in -[index.ts:39](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L39) +[index.ts:41](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L41) ___ @@ -80,7 +80,7 @@ ___ #### Defined in -[index.ts:29](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L29) +[index.ts:31](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L31) ## Functions @@ -104,7 +104,7 @@ ___ #### Defined in -[grammar.ts:824](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L824) +[grammar.ts:824](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L824) ___ @@ -124,7 +124,7 @@ ___ #### Defined in -[index.ts:166](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L166) +[index.ts:191](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L191) ___ @@ -138,7 +138,7 @@ ___ #### Defined in -[index.ts:182](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L182) +[index.ts:211](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L211) ___ @@ -158,4 +158,4 @@ ___ #### Defined in 
-[index.ts:162](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L162) +[index.ts:187](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L187) diff --git a/docs/API/classes/LlamaContext.md b/docs/API/classes/LlamaContext.md index 984bc1d0..4da652f8 100644 --- a/docs/API/classes/LlamaContext.md +++ b/docs/API/classes/LlamaContext.md @@ -21,6 +21,7 @@ - [completion](LlamaContext.md#completion) - [detokenize](LlamaContext.md#detokenize) - [embedding](LlamaContext.md#embedding) +- [getFormattedChat](LlamaContext.md#getformattedchat) - [loadSession](LlamaContext.md#loadsession) - [release](LlamaContext.md#release) - [saveSession](LlamaContext.md#savesession) @@ -41,7 +42,7 @@ #### Defined in -[index.ts:62](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L62) +[index.ts:72](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L72) ## Properties @@ -51,7 +52,7 @@ #### Defined in -[index.ts:56](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L56) +[index.ts:64](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L64) ___ @@ -61,7 +62,7 @@ ___ #### Defined in -[index.ts:54](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L54) +[index.ts:62](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L62) ___ @@ -69,9 +70,15 @@ ___ • **model**: `Object` = `{}` +#### Type declaration + +| Name | Type | +| :------ | :------ | +| `isChatTemplateSupported?` | `boolean` | + #### Defined in -[index.ts:60](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L60) +[index.ts:68](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L68) ___ @@ -81,7 +88,7 @@ ___ #### Defined in -[index.ts:58](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L58) +[index.ts:66](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L66) ## Methods @@ -104,7 +111,7 @@ ___ #### Defined in 
-[index.ts:135](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L135) +[index.ts:162](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L162) ___ @@ -125,7 +132,7 @@ ___ #### Defined in -[index.ts:90](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L90) +[index.ts:109](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L109) ___ @@ -145,7 +152,7 @@ ___ #### Defined in -[index.ts:127](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L127) +[index.ts:154](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L154) ___ @@ -165,7 +172,27 @@ ___ #### Defined in -[index.ts:131](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L131) +[index.ts:158](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L158) + +___ + +### getFormattedChat + +▸ **getFormattedChat**(`messages`): `Promise`<`string`\> + +#### Parameters + +| Name | Type | +| :------ | :------ | +| `messages` | `RNLlamaOAICompatibleMessage`[] | + +#### Returns + +`Promise`<`string`\> + +#### Defined in + +[index.ts:98](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L98) ___ @@ -187,7 +214,7 @@ Load cached prompt & completion state from a file. #### Defined in -[index.ts:77](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L77) +[index.ts:82](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L82) ___ @@ -201,7 +228,7 @@ ___ #### Defined in -[index.ts:157](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L157) +[index.ts:182](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L182) ___ @@ -225,7 +252,7 @@ Save current cached prompt & completion state to a file. 
#### Defined in -[index.ts:86](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L86) +[index.ts:91](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L91) ___ @@ -239,7 +266,7 @@ ___ #### Defined in -[index.ts:119](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L119) +[index.ts:146](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L146) ___ @@ -259,4 +286,4 @@ ___ #### Defined in -[index.ts:123](https://github.com/mybigday/llama.rn/blob/f95f600/src/index.ts#L123) +[index.ts:150](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/index.ts#L150) diff --git a/docs/API/classes/SchemaGrammarConverter.md b/docs/API/classes/SchemaGrammarConverter.md index 8b9a535f..09cecb43 100644 --- a/docs/API/classes/SchemaGrammarConverter.md +++ b/docs/API/classes/SchemaGrammarConverter.md @@ -46,7 +46,7 @@ #### Defined in -[grammar.ts:211](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L211) +[grammar.ts:211](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L211) ## Properties @@ -56,7 +56,7 @@ #### Defined in -[grammar.ts:201](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L201) +[grammar.ts:201](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L201) ___ @@ -66,7 +66,7 @@ ___ #### Defined in -[grammar.ts:203](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L203) +[grammar.ts:203](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L203) ___ @@ -76,7 +76,7 @@ ___ #### Defined in -[grammar.ts:199](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L199) +[grammar.ts:199](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L199) ___ @@ -90,7 +90,7 @@ ___ #### Defined in -[grammar.ts:207](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L207) +[grammar.ts:207](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L207) ___ @@ -100,7 +100,7 @@ ___ #### Defined in 
-[grammar.ts:209](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L209) +[grammar.ts:209](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L209) ___ @@ -114,7 +114,7 @@ ___ #### Defined in -[grammar.ts:205](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L205) +[grammar.ts:205](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L205) ## Methods @@ -135,7 +135,7 @@ ___ #### Defined in -[grammar.ts:693](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L693) +[grammar.ts:693](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L693) ___ @@ -156,7 +156,7 @@ ___ #### Defined in -[grammar.ts:224](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L224) +[grammar.ts:224](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L224) ___ @@ -179,7 +179,7 @@ ___ #### Defined in -[grammar.ts:710](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L710) +[grammar.ts:710](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L710) ___ @@ -200,7 +200,7 @@ ___ #### Defined in -[grammar.ts:312](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L312) +[grammar.ts:312](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L312) ___ @@ -220,7 +220,7 @@ ___ #### Defined in -[grammar.ts:518](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L518) +[grammar.ts:518](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L518) ___ @@ -241,7 +241,7 @@ ___ #### Defined in -[grammar.ts:323](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L323) +[grammar.ts:323](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L323) ___ @@ -255,7 +255,7 @@ ___ #### Defined in -[grammar.ts:813](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L813) +[grammar.ts:813](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L813) ___ @@ -276,7 +276,7 @@ ___ #### 
Defined in -[grammar.ts:247](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L247) +[grammar.ts:247](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L247) ___ @@ -297,4 +297,4 @@ ___ #### Defined in -[grammar.ts:529](https://github.com/mybigday/llama.rn/blob/f95f600/src/grammar.ts#L529) +[grammar.ts:529](https://github.com/mybigday/llama.rn/blob/ad7e0a5/src/grammar.ts#L529)
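
The `CompletionParams` change in this diff makes `prompt` optional and adds an optional `messages` array, so a single `context.completion` call can take either a chat transcript or a raw prompt. The following is a minimal TypeScript sketch of that type shape and of the stop-word merging used in the README example, not the library's implementation. `NativeParamsSketch`, `ChatMessage`, `CompletionParamsSketch`, `describeRequest`, and `withExtraStops` are hypothetical names introduced for illustration; the `role`/`content` message shape is assumed from the README example, and checking `messages` before `prompt` is an assumption of this sketch rather than documented behaviour.

```typescript
// Hypothetical stand-in for llama.rn's NativeCompletionParams (assumption:
// only the fields that appear in the README example are modelled here).
type NativeParamsSketch = {
  prompt: string
  n_predict?: number
  stop?: string[]
  emit_partial_completion: boolean
}

// Assumed shape of RNLlamaOAICompatibleMessage, inferred from the README example.
type ChatMessage = { role: string; content: string }

// Mirrors the documented type: drop 'emit_partial_completion' and 'prompt',
// then re-add 'prompt' as optional next to the new optional 'messages'.
type CompletionParamsSketch = Omit<
  NativeParamsSketch,
  'emit_partial_completion' | 'prompt'
> & { messages?: ChatMessage[]; prompt?: string }

// Illustration only: report which kind of input a params object carries.
// Looking at 'messages' first is a choice made for this sketch.
function describeRequest(
  params: CompletionParamsSketch,
): 'chat' | 'text' | 'empty' {
  if (params.messages && params.messages.length > 0) return 'chat'
  if (typeof params.prompt === 'string') return 'text'
  return 'empty'
}

// Extend a shared stop list without mutating it.
function withExtraStops(
  base: readonly string[],
  extra: readonly string[],
): string[] {
  return [...base, ...extra]
}

// A subset of the stop words listed in the README example.
const stopWords = ['<|end|>', '<|eot_id|>', '<|im_end|>']

const chatParams: CompletionParamsSketch = {
  messages: [
    {
      role: 'system',
      content:
        'This is a conversation between user and assistant, a friendly chatbot.',
    },
    { role: 'user', content: 'Hello!' },
  ],
  n_predict: 100,
  stop: stopWords,
}

const textParams: CompletionParamsSketch = {
  prompt: 'User: Hello!\nLlama:',
  n_predict: 100,
  stop: withExtraStops(stopWords, ['Llama:', 'User:']),
}

console.log(describeRequest(chatParams), describeRequest(textParams))
```

Because both `messages` and `prompt` are optional in the intersection type, the compiler accepts either form (or neither), so a caller that needs exactly one of them has to check at runtime, as `describeRequest` does here.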