Prerelease 0.2.0rc1 no longer returns log probs with llama.cpp #1064

Open · oj-sec opened this issue Oct 26, 2024 · 4 comments

oj-sec commented Oct 26, 2024

The bug
Updating from guidance==0.1.16 to prerelease guidance==0.2.0rc1 causes model.log_prob() to return 0 rather than the true log probs for a generation when using the llama.cpp backend. I have tested GGUF quants of models based on Llama, Mistral, and Gemma and observed this behaviour to be model-agnostic.

To Reproduce
A reproduction Colab notebook is here; it involves uninstalling and reinstalling Guidance, but the change in output between installs is:

With guidance==0.1.16:

```python
from guidance import models, gen, select
import math

# `model` holds the filename of the GGUF quant under test
llm = models.LlamaCpp(f"./models/{model}", n_gpu_layers=40, n_ctx=2000, compute_log_probs=True)
output = llm + "You flip a coin. The result is: " + gen(name="coinflip", regex="(heads|tails)")
logprobs = output.log_prob("coinflip")
prob = round(math.exp(logprobs), 5)
print(f"Output:{output['coinflip']}\nLP: {logprobs}\nP: {prob}")
```

```
You flip a coin. The result is: heads
Output:heads
LP: -1.1534752799652015
P: 0.31554
```

With guidance==0.2.0rc1, running identical code:

```
You flip a coin. The result is: heads
Output:heads
LP: 0.0
P: 1.0
```

System info: not provided.

If I can provide any further info please let me know. Huge thanks for this amazing library.

hudson-ai (Collaborator) commented

Hi @oj-sec, thanks for bringing this up, and our apologies if it's impacting your workflow. The new parser we're using in 0.2.0rc1 is considerably faster than the one in previous versions, but it currently has a few limitations that we need to keep working on (probability output is probably the primary one). So, it's on our radar.

Thank you for submitting the issue!


woitee commented Jan 14, 2025

@hudson-ai Since 0.2.0 has been released and the visualization seems to display probabilities, can we expect this issue to be fixed (soon)?

hudson-ai (Collaborator) commented

Hi @woitee, sorry for the late reply. We currently display probabilities for the generated tokens, but we don't yet have a satisfactory solution for mapping these back to probabilities for a given capture (a string that may or may not align to token boundaries). Getting this working again is on the roadmap, but I don't yet have a timeline for you.
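
To make the difficulty concrete, here's a rough sketch (not our actual implementation, and the tokenization at the bottom is hypothetical) of what mapping per-token log probs back onto a capture involves:

```python
# A minimal sketch: sum the log probs of the tokens whose character spans
# overlap the capture. Tokens that only partially overlap are counted whole,
# so the result is an approximation whenever the capture doesn't start and
# end exactly on token boundaries.
def capture_log_prob(token_logprobs, start, end):
    """token_logprobs: ordered (token_text, logprob) pairs for the generation;
    start/end: character offsets of the capture within the generated text."""
    total, pos = 0.0, 0
    for text, lp in token_logprobs:
        tok_start, tok_end = pos, pos + len(text)
        if tok_start < end and tok_end > start:
            total += lp  # token overlaps the capture, possibly only partially
        pos = tok_end
    return total

# Hypothetical tokenization of the tail of the coin-flip example:
toks = [(" is", -0.10), (":", -0.05), (" head", -1.00), ("s", -0.20)]
text = "".join(t for t, _ in toks)               # " is: heads"
start = text.index("heads")
print(capture_log_prob(toks, start, start + len("heads")))  # prints -1.2
```

Even this naive version has to decide what to do with a partially covered token like " head" (whose leading space is not part of the capture), and that's where a principled answer gets hard.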

Would you mind giving me an idea of how you like to use this feature in practice? More example usages may help motivate a solution :)


woitee commented Feb 3, 2025

@hudson-ai Thanks for the reply! The main thing I need this for is categorization: I use guidance to classify e-mails and free-text forms into categories, and I used to show users how "certain" the AI is in its answers, based on the probabilities, roughly the pattern sketched below. If it worked only for select captures, that would cover 90% of my usage. But I assume that still has to deal with token boundaries correctly and is far from trivial.
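
For concreteness, this is roughly the pattern I rely on (model path, prompt, and category names are made up; log_prob is the 0.1.16 behaviour):

```python
import math
from guidance import models, select

llm = models.LlamaCpp("./models/model.gguf", compute_log_probs=True)

email = "Hi, I'd like a refund for my last order."
out = llm + f"Email: {email}\nCategory: " + select(
    ["billing", "support", "sales", "spam"], name="category"
)
# On 0.1.16 this was the classifier's confidence; on 0.2.0 it comes back as 1.0.
confidence = math.exp(out.log_prob("category"))
print(out["category"], f"{confidence:.0%}")
```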

Can I access the token probabilities? I might be able to arrange my prompts so that a token definitely ends before the select clause and each option starts with a different token, and build some workarounds from that, along the lines of the sketch below :)
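
For instance, something like this using llama-cpp-python directly (an untested sketch; the model path and labels are placeholders, and it assumes each category starts with a distinct first token):

```python
from llama_cpp import Llama

# logits_all=True is needed for create_completion to return logprobs
llm = Llama(model_path="./models/model.gguf", logits_all=True)

resp = llm.create_completion(
    "Email: please refund my order.\nCategory (billing/support/spam):",
    max_tokens=1,      # the first generated token decides the category
    temperature=0.0,
    logprobs=10,       # return the 10 most likely tokens at this position
)
# Dict of candidate first tokens -> log probs, from which per-category
# probabilities could be reconstructed.
print(resp["choices"][0]["logprobs"]["top_logprobs"][0])
```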
