Augmenting TEA with open-source LLMs #414
Replies: 5 comments
-
Those are great first results! I took a deep look at Mistral's response, and I was really impressed by two of its suggestions:
However, it also produced some imprecisions, which is expected given we're using the model out-of-the-box:
I can see two ways to improve this: 1) we can fine-tune the model, and/or 2) we can implement Retrieval-Augmented Generation (RAG). In both cases, we would need to develop a dataset that suits our needs. For PoC purposes, I would also suggest focusing on a more modest instruction: "describe" and "improve" seem very ambitious. Maybe, "what evidence can I use for property {X}"?
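To make the RAG idea concrete, here is a minimal sketch of the retrieval step: score snippets from a (hypothetical) assurance-case dataset by token overlap with the query, then prepend the best matches to the prompt before it reaches the LLM. The snippet texts and the `retrieve`/`build_prompt` helpers are illustrative assumptions, not part of any TEA codebase; a real PoC would use embedding-based retrieval.

```python
import re

def tokens(text):
    """Lowercase, punctuation-free token set for crude overlap scoring."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, snippets, k=2):
    """Return the k snippets sharing the most tokens with the query."""
    q = tokens(query)
    return sorted(snippets, key=lambda s: len(q & tokens(s)), reverse=True)[:k]

def build_prompt(query, snippets):
    """Ground the instruction in retrieved context before sending it to the model."""
    context = "\n".join(retrieve(query, snippets))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Invented example snippets, standing in for a curated assurance-case dataset.
snippets = [
    "Property claim P3: the model's outputs are explainable to end users.",
    "Strategy S3 decomposes the fairness goal into measurable property claims.",
    "Evidence E1: results of a bias audit on the training data.",
]
print(build_prompt("What evidence can I use for property claims under strategy S3?", snippets))
```

Even a toy retriever like this shows the point: the model answers from supplied case material rather than from its parametric memory, which is what should limit the imprecisions.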
-
These are great suggestions, @cptanalatriste. One of the other reasons I want to have the TEA schema developed is so that we can be more precise with keywords such as these. With LangChain and AutoGen, you can also break the instructions down further by customising specific prompt templates for different agents and allowing them to access skills that would provide further structure.
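As a sketch of what per-agent prompt templates could look like (using plain string templates rather than LangChain's own classes, and with agent names and placeholder fields that are purely hypothetical):

```python
# Hypothetical per-agent templates, splitting a broad instruction
# ("describe", "improve") into narrower, agent-specific ones.
TEMPLATES = {
    "summariser": "Summarise the assurance case whose top-level goal is: {goal}.",
    "gap_finder": "List property claims missing from strategy {strategy}.",
    "evidence_suggester": "Suggest forms of evidence for the property claim: {claim}.",
}

def render(agent, **fields):
    """Fill the chosen agent's template; raises KeyError for unknown agents or fields."""
    return TEMPLATES[agent].format(**fields)

print(render("gap_finder", strategy="S3"))
# → List property claims missing from strategy S3.
```

Once the TEA schema is settled, the placeholder names (`goal`, `strategy`, `claim`) could be drawn from the schema itself, so each agent's prompt stays consistent with the case structure.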
-
So, here are some updates for the following prompts:

- "Please summarise this assurance case. Focus on the top-level goal and the key strategies. Identify any relevant property claims to help explain the respective strategies."
- "Are there any missing property claims that could be added to Strategy S3?"
- "Please suggest relevant forms of evidence for all of the property claims associated with S3, including the new ones you have proposed."
-
@chrisdburr these look great! Especially the "missing property claims". However, the "summary" question ignored a strategy, and I'm not convinced about its suggestion for P10 (asking users if their privacy is respected feels strange). I do believe using RAG can limit the number of hallucinations. If you agree, I can spend a couple of weeks building a PoC and see if it's an avenue worth pursuing.
-
For fine-tuning, this may be a good option: https://unsloth.ai/blog/llama3. The notebook is here: https://colab.research.google.com/drive/1mPw6P52cERr93w3CMBiJjocdTnyPiKTX#scrollTo=6bZsfBuZDeCL
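Whichever fine-tuning route we take, we would first need the dataset in an instruction-tuning format. A minimal sketch, assuming the Alpaca-style `instruction`/`input`/`output` JSONL layout commonly used in such notebooks (the example record and filename are invented for illustration):

```python
import json

# Invented example: one instruction-tuning record drawn from an assurance case.
records = [
    {
        "instruction": "What evidence can I use for property claim P5?",
        "input": "P5: The system's decisions can be contested by affected users.",
        "output": "Logs of appeal outcomes and a documented contestability procedure.",
    },
]

# Write one JSON object per line, the usual format for fine-tuning datasets.
with open("tea_finetune.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

print(sum(1 for _ in open("tea_finetune.jsonl")))  # → 1
```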
-
Opening this discussion about a possible research avenue to explore. Please review the following note to see the results of an early experiment using open-source LLMs to evaluate an assurance case:
https://hackmd.io/@tea-platform/HkjWQnQeR
Research Questions