Replies: 11 comments 3 replies
-
Take a look at the question answering and document question answering tasks for this use case instead of just a generic text generation model. There are only a couple of question answering models available for transformers.js currently. However, the most popular document-QA model, layoutlm-document-qa, seems small enough that, quantized, it should easily fit in the browser. So we can upload it to HF in ONNX weights for transformers.js.
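For context, a minimal sketch of what the extractive question-answering route looks like in transformers.js (the model id and the example strings here are illustrative assumptions, not a tested setup):

```js
import { pipeline } from '@xenova/transformers';

// Extractive QA: the model picks an answer span out of the provided context.
const answerer = await pipeline(
  'question-answering',
  'Xenova/distilbert-base-cased-distilled-squad'
);

const question = 'Where does SemanticFinder run?';
const context = 'SemanticFinder performs semantic search fully client-side in the browser.';

const { answer, score } = await answerer(question, context);
console.log(answer, score);
```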
-
A quick heads-up on some models already integrated in transformers.js. The result is not great, but also not terrible: I guess users would expect a more detailed and nuanced answer, but the model is trained to provide short answers. Maybe the context is also too long. The question answering model might be easy to integrate, but I'm not sure about the quality; it also highly depends on the quality of the search results. Summarization might instead be nice to integrate and seems to work ok-ish. I'll continue testing with other models.
-
Very cool! To reduce the context window, one idea is to first use SemanticFinder to get the k most relevant excerpts from the text, then ask the question about those.
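As a rough sketch of that idea (`semanticSearch` is a hypothetical placeholder for whatever SemanticFinder exposes internally, and the QA model id is an assumption):

```js
import { pipeline } from '@xenova/transformers';

const question = 'What does the text say about sea level rise?';

// Hypothetical helper standing in for SemanticFinder's own search:
// returns the k chunks most similar to the question.
const excerpts = await semanticSearch(question, 3);

// Only the top-k excerpts go into the QA context, keeping the input short.
const answerer = await pipeline(
  'question-answering',
  'Xenova/distilbert-base-cased-distilled-squad'
);
const { answer } = await answerer(question, excerpts.join('\n\n'));
console.log(answer);
```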
-
By the way, I just found the company axilla, which is building a frontend for LLMs and the necessary settings (top-k documents, chunk retrieval, etc.). There is a screenshot in the repo.
-
FYI: Llama2 support just landed. I might find some time next week to try some things.
-
Google just announced AI summaries in Chrome, but there are lots of open questions about privacy, quality, etc.
-
I think the summary function of distilbart-cnn-6-6 is pretty good, at least for non-fictional text. Example from the IPCC summary: query, top 3 results, and generated summary (not reproduced here).
I'll take a stab at it and think about how to best (optionally) integrate it in SemanticFinder.
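In transformers.js terms, that would roughly look like the following sketch (the model id Xenova/distilbart-cnn-6-6 and the generation options are assumptions, and `topResults` is a placeholder for the search hits):

```js
import { pipeline } from '@xenova/transformers';

// Load the summarization pipeline once; the weights are cached by the browser.
const summarizer = await pipeline('summarization', 'Xenova/distilbart-cnn-6-6');

// Placeholder: the top search results from SemanticFinder, joined into one input.
const topResults = ['excerpt 1 ...', 'excerpt 2 ...', 'excerpt 3 ...'];
const [summary] = await summarizer(topResults.join('\n'), { max_new_tokens: 120 });
console.log(summary.summary_text);
```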
-
Added the functionality in 9c0b66c. It works quite well with longer, non-fictional texts. Each run takes 1-2 minutes, so it would be great to look for an alternative model. Integration with the progress bar is still missing.
-
I tested https://github.com/Mozilla-Ocho/llamafile last week and it's pretty cool! One binary file running on localhost and that's it.
-
Alright, so I just tested llamafile and the good thing is that it provides a super convenient API with a ReadableStream response. If you have the binary running locally, you can call the API from JS with this code, logging every token to the console:

```js
// Your fetch request
const url = "http://127.0.0.1:8080/completion";
const headers = {
  "accept": "text/event-stream",
  "accept-language": "de-DE,de;q=0.9,en-US;q=0.8,en;q=0.7",
  "cache-control": "no-cache",
  "content-type": "application/json",
  "pragma": "no-cache",
  "sec-ch-ua": "\"Not_A Brand\";v=\"8\", \"Chromium\";v=\"120\", \"Google Chrome\";v=\"120\"",
  "sec-ch-ua-mobile": "?0",
  "sec-ch-ua-platform": "\"Windows\"",
  "sec-fetch-dest": "empty",
  "sec-fetch-mode": "cors",
  "sec-fetch-site": "same-origin",
  // ... other headers
};
const body = {
  "stream": true,
  "n_predict": 400,
  "temperature": 0.7,
  "stop": ["</s>", "Llama:", "User:"],
  "repeat_last_n": 256,
  "repeat_penalty": 1.18,
  "top_k": 40,
  "top_p": 0.5,
  "tfs_z": 1,
  "typical_p": 1,
  "presence_penalty": 0,
  "frequency_penalty": 0,
  "mirostat": 0,
  "mirostat_tau": 5,
  "mirostat_eta": 0.1,
  "grammar": "",
  "n_probs": 0,
  "image_data": [],
  "cache_prompt": true,
  "slot_id": 0,
  "prompt": "This is a conversation between User and Llama, a friendly chatbot. Llama is helpful, kind, honest, good at writing, and never fails to answer any requests immediately and with precision.\n\nUser:What is the meaning of life?\nLlama:"
};

fetch(url, {
  method: "POST",
  headers: headers,
  body: JSON.stringify(body),
  referrer: "http://127.0.0.1:8080/",
  //referrerPolicy: "strict-origin-when-cross-origin",
  mode: "cors",
  credentials: "omit",
  // ... other options
})
  .then(response => {
    const reader = response.body.getReader();
    // Read the stream and process data as it comes
    return reader.read().then(function processText({ done, value }) {
      if (done) {
        console.log("Stream completed");
        return;
      }
      // The value is a Uint8Array; convert it to text
      const text = new TextDecoder().decode(value);
      // Check if the chunk is a server-sent event starting with "data: "
      if (text.startsWith("data: ")) {
        // Remove the "data: " prefix
        const jsonDataString = text.substring("data: ".length);
        try {
          // Parse the event payload as JSON
          const jsonData = JSON.parse(jsonDataString);
          // Log the generated token(s) contained in this event
          console.log(jsonData.content);
        } catch (error) {
          console.error("Error parsing JSON:", error);
        }
      }
      // Continue reading the stream
      return reader.read().then(processText);
    });
  })
  .catch(error => {
    console.error("Error:", error);
  });
```

However, the only thing stopping me from connecting it to SemanticFinder is CORS. I guess this is something that should be optional as a flag on the llamafile side. I will ask the folks over there whether it would be in scope to offer CORS policy modification as an option for web experiments.
-
Just integrated Ollama. It has a huge community and provides exactly what I had in mind!
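For anyone who wants to try the same, here's a minimal sketch of calling Ollama's local REST API from JS (the model name and prompt are placeholders; check the Ollama API docs for the exact fields):

```js
// Ollama serves a local REST API (default: http://localhost:11434).
const response = await fetch("http://localhost:11434/api/generate", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama2",          // any model pulled with `ollama pull`
    prompt: "Summarize: ...", // placeholder prompt
    stream: true,             // tokens arrive as newline-delimited JSON
  }),
});

const reader = response.body.getReader();
const decoder = new TextDecoder();
let output = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // Simplified: assumes each chunk contains complete JSON lines.
  for (const line of decoder.decode(value).split("\n").filter(Boolean)) {
    output += JSON.parse(line).response ?? "";
  }
}
console.log(output);
```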
-
Maybe with the recent efforts in bringing LLMs to the browser we could think about a POC text generation demo based on page content.
Here's the link to a working Llama 2 of ggerganov's implementation running in the browser: https://twitter.com/ggerganov/status/1683174252990660610?t=SghA57AGQTQ4n660HuJ9lg&s=19
There are a few things to test:
- How many results should be piped into the prompt, and how should this number be determined? It would be hard-coded in the beginning; later there could be some heuristic.
- What is the most effective prompt for Llama 2? How can the context be provided efficiently? How should the user's question be wrapped? (See the sketch after this list.)
- How can hallucinations be avoided if the semantic search results don't provide the right context for the user's question?
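On the prompting question, one possible sketch (the Llama 2 chat template is the one documented by Meta; `topResults` and `question` are placeholders for whatever SemanticFinder would pass in):

```js
// Placeholder inputs: top-k excerpts from SemanticFinder and the user's question.
const topResults = ["excerpt 1 ...", "excerpt 2 ...", "excerpt 3 ..."];
const question = "What does the text say about sea level rise?";

// Llama 2 chat format: system prompt inside <<SYS>> tags, user turn wrapped in [INST].
// Telling the model to answer only from the context is one way to limit hallucinations.
const prompt = `<s>[INST] <<SYS>>
You are a helpful assistant. Answer only based on the context below.
If the context does not contain the answer, say so instead of guessing.
<</SYS>>

Context:
${topResults.join("\n\n")}

Question: ${question} [/INST]`;
```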
@VarunNSrivastava this would be very interesting for the browser plugin #15 too! "Chat with the web"-style, it would enable chatting with any (single-page) website.
@lizozom did you follow up on the use of WebGPU so far? That would be a great addition (even though at the moment the C web implementation can't use it). However, I think we should probably wait for transformers.js to add the feature.
Update: I just saw that distilgpt2 for text generation is already integrated in transformers.js at ~122 MB. I'll play with it next week and try to figure out whether the results are satisfying.
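As a starting point, a minimal sketch of the transformers.js text-generation API (the ONNX model id `Xenova/distilgpt2` and the generation options are assumptions for illustration):

```js
import { pipeline } from '@xenova/transformers';

// ~122 MB quantized; downloaded once and then served from the browser cache.
const generator = await pipeline('text-generation', 'Xenova/distilgpt2');

const [result] = await generator('The IPCC report states that', {
  max_new_tokens: 50,
});
console.log(result.generated_text);
```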