Update datasets and regex for package hallucination #1124

Open
wants to merge 2 commits into
base: main


@arjun-krishna1 arjun-krishna1 commented Mar 7, 2025

  • Update datasets used for finding package hallucinations
  • Update regex to match imports in generated code
  • Add scripts used to generate package hallucination datasets
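To illustrate the kind of check the regex update supports, here is a minimal sketch of extracting imported package names from generated Python code and flagging any that are missing from a known-package list. The pattern, the function names, and the `KNOWN` set are hypothetical and are not taken from this PR; the actual detector compares against the package datasets mentioned above, and its regex may differ.

```python
import re

# Hypothetical pattern, NOT the regex from this PR: captures the top-level
# module name in "import x" and "from x import y" statements.
# Limitations: only the first name in "import a, b" is captured, and
# relative imports ("from . import x") are ignored.
IMPORT_RE = re.compile(
    r"^\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE
)


def extract_imports(code: str) -> set:
    """Return the set of top-level package names imported in `code`."""
    return set(IMPORT_RE.findall(code))


def find_unknown_packages(code: str, known_packages) -> set:
    """Return imported names that are absent from `known_packages`."""
    return extract_imports(code) - set(known_packages)


# Assumed stand-in for a real package index dataset.
KNOWN = {"os", "requests", "numpy"}

sample = "import os\nfrom requests import get\nimport totally_made_up_pkg\n"
print(sorted(find_unknown_packages(sample, KNOWN)))
```

Any name returned by `find_unknown_packages` would count as a candidate hallucinated package in this sketch.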

@arjun-krishna1 changed the title from "Feature/package hallucination updates" to "Update datasets and regex for package hallucination" on Mar 7, 2025
@arjun-krishna1
Contributor Author

@erickgalinkin @leondz These are the changes implemented for the paper.
Is there anything else we need to add?
Test run output:

(garak_env) arjun-krishna@arjun-krishna-ThinkPad-T14-Gen-2i:~/garak$ python3 -m garak --model_type openai --model_name gpt-4o-mini --probes packagehallucination --parallel_requests 2
garak LLM vulnerability scanner v0.10.3.post1 ( https://github.com/NVIDIA/garak ) at 2025-03-07T11:39:23.691282
📜 logging to /home/arjun-krishna/.local/share/garak/garak.log
🦜 loading generator: OpenAI: gpt-4o-mini
📜 reporting to /home/arjun-krishna/.local/share/garak/garak_runs/garak.e8cc3aa7-da62-4a0d-8dbe-ad11c91d145c.report.jsonl
🕵️  queue of probes: packagehallucination.JavaScript, packagehallucination.Python, packagehallucination.Ruby, packagehallucination.Rust
// ...
packagehallucination.JavaScript                                   packagehallucination.JavaScriptNpm: FAIL  ok on  453/ 455   (failure rate:   0.44%)
packagehallucination.Python                                          packagehallucination.PythonPypi: FAIL  ok on  444/ 455   (failure rate:   2.42%)
packagehallucination.Ruby                                              packagehallucination.RubyGems: FAIL  ok on  448/ 455   (failure rate:   1.54%)
packagehallucination.Rust                                            packagehallucination.RustCrates: FAIL  ok on  437/ 455   (failure rate:   3.96%)
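
The failure rates in the output above appear to be the share of the 455 attempts that were not "ok". A short sketch that reproduces the percentages from the reported counts (the `failure_rate` helper is illustrative, not part of garak):

```python
def failure_rate(ok: int, total: int) -> float:
    """Percentage of attempts that were not marked ok."""
    return 100.0 * (total - ok) / total


# (ok, total) counts copied from the test run output above.
results = {
    "packagehallucination.JavaScript": (453, 455),
    "packagehallucination.Python": (444, 455),
    "packagehallucination.Ruby": (448, 455),
    "packagehallucination.Rust": (437, 455),
}

for probe, (ok, total) in results.items():
    print(f"{probe}: {failure_rate(ok, total):.2f}%")
```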
