
Running initLlama with lora / lora_scaled set fails to load context and crashes the app #86

Open
tom-lewis-code opened this issue Nov 15, 2024 · 3 comments


@tom-lewis-code

I'm successfully able to call initLlama and run inference on the model, without a lora:

    await initLlama({
        model: 'file://' + file.uri,
        n_ctx: 1024, // Adjust based on model requirements
        n_batch: 1, // Adjust based on device capabilities
        n_threads: 4, // Adjust based on device capabilities
        n_gpu_layers: 1, // Enable if device supports GPU acceleration
        use_mmap: true,
        use_mlock: true,
      })

But if I add lora / lora_scaled, the context fails to load and the app crashes without an error:

      lora: 'file://' + file.lora,
      lora_scaled: 0,

Any help would be greatly appreciated (running on Android). I'm loading the files in from assets/models, then moving them to DocumentDirectoryPath and referencing them from there. 🥸
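
For context, this is roughly how I'm copying the files out of the bundled assets (a sketch using react-native-fs; the helper and file names are illustrative placeholders, not my exact code):

    import RNFS from 'react-native-fs'

    // Copy a bundled asset (android/app/src/main/assets/models/<name>) into
    // the app's document directory so it can be opened by a plain file path.
    // Note: copyFileAssets is Android-only.
    async function copyModelFromAssets(name) {
      const dest = `${RNFS.DocumentDirectoryPath}/${name}`
      if (!(await RNFS.exists(dest))) {
        await RNFS.copyFileAssets(`models/${name}`, dest)
      }
      return dest
    }

    // e.g. file.uri  = await copyModelFromAssets('my-model.gguf')
    //      file.lora = await copyModelFromAssets('my-lora.gguf')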

@jhen0409
Member

The file:// prefix is unnecessary.

For model the prefix is removed internally, but there is no such removal for lora yet. This can be added later:

if (path.startsWith('file://')) path = path.slice(7)
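
As a workaround until that's added, you can strip the prefix on the JS side before calling initLlama. A sketch (stripFilePrefix is just an illustrative helper, not part of the library):

    // Strip the file:// prefix ourselves, since only `model` is normalized
    // internally today.
    const stripFilePrefix = (path) =>
      path.startsWith('file://') ? path.slice(7) : path

    await initLlama({
      model: stripFilePrefix(file.uri),
      lora: stripFilePrefix(file.lora),
      // ...other params as before
    })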

@tom-lewis-code
Author

Thanks for the incredibly quick help, really appreciate it!

I've tried different ways of running initLlama, e.g. with/without the 'file://' prefix on both model and lora, but I can't find a configuration that works without the app closing. Sorry if I'm missing anything obvious here.

In this example file.uri and file.lora are:

/data/user/0/com.rnllamaexample/files/my-model.gguf
/data/user/0/com.rnllamaexample/files/my-lora.gguf

This is my current setup for initLlama():

await initLlama({
    model: file.uri,
    lora: file.lora,
    lora_scaled: 1,
    n_ctx: 1024,
    n_batch: 1,
    n_threads: 4,
    n_gpu_layers: 1,
    use_mmap: true,
    use_mlock: true,
  })

Thanks again!

jhen0409 added a commit that referenced this issue Nov 16, 2024
@jhen0409
Member

jhen0409 commented Nov 17, 2024

Tested with bartowski/Meta-Llama-3.1-8B-Instruct-GGUF as the base model and grimjim/Llama-3-Instruct-abliteration-LoRA-8B (converted) as the lora adapter; no issue on my Android device (Pixel 6).

Could you share which model & lora you are using? Also, any Android hardware info that may be helpful.
