Feature/multilingual #943

Open
wants to merge 40 commits into
base: main
b6464ea
Add Feature: translation functionality
masayaOgushi Oct 9, 2024
f94bb2e
Add Feature: probes add translation function
masayaOgushi Oct 9, 2024
2238d18
Add Feature: detector add translation capabilities
masayaOgushi Oct 9, 2024
7202e19
Add Feature: Enhance command-line interface with new translation options
masayaOgushi Oct 9, 2024
1105bb1
chore: Update dependencies in requirements.txt, pyproject.toml
masayaOgushi Oct 9, 2024
6bb7da3
docs: Add translation documentation
masayaOgushi Oct 9, 2024
717f0ff
Merge branch 'leondz:main' into feature/multilingual
SnowMasaya Oct 9, 2024
b35cc1e
Update Feature: Translator
masayaOgushi Oct 23, 2024
bbb6c76
Update Feature: Probes
masayaOgushi Oct 23, 2024
51baeb2
Update Feature: Detectors
masayaOgushi Oct 23, 2024
dc3a4ab
Update Feature: cli
masayaOgushi Oct 23, 2024
ee82261
Update Feature: config
masayaOgushi Oct 23, 2024
7cb8acc
Update Feature: conftest
masayaOgushi Oct 23, 2024
ec9b40a
Remove: library
masayaOgushi Oct 23, 2024
d50d19e
Update Doc
masayaOgushi Oct 23, 2024
808f34a
Merge branch 'feature/multilingual' of https://github.com/SnowMasaya/…
masayaOgushi Oct 23, 2024
8a41c95
Merge branch 'main' into feature/multilingual
SnowMasaya Oct 23, 2024
2fc2dd5
Fix test
masayaOgushi Oct 23, 2024
8283b65
Merge branch 'feature/multilingual' of https://github.com/SnowMasaya/…
masayaOgushi Oct 23, 2024
395840d
Update Feature Translation
masayaOgushi Oct 31, 2024
73363f9
Add Feature Probes
masayaOgushi Oct 31, 2024
57d14e5
Update Feature Detectors
masayaOgushi Oct 31, 2024
3b3b60a
Update test
masayaOgushi Oct 31, 2024
bae54d7
Add library
masayaOgushi Oct 31, 2024
022b821
Remove test code
masayaOgushi Dec 12, 2024
ad475ba
Add Feature
masayaOgushi Dec 12, 2024
a816836
Remove translation check
masayaOgushi Dec 12, 2024
e7363de
Update reverse translation
masayaOgushi Dec 12, 2024
563060b
Remove translation function
masayaOgushi Dec 12, 2024
a3922e7
Add detector test
masayaOgushi Dec 12, 2024
a3dd8de
Update probes
masayaOgushi Dec 12, 2024
4c4ad68
Update harness base
masayaOgushi Dec 12, 2024
641b851
Add probe test code
masayaOgushi Dec 12, 2024
8a0ee80
Update Translation
masayaOgushi Dec 12, 2024
b877b97
Update test translation
masayaOgushi Dec 12, 2024
5820f95
Update doc
masayaOgushi Dec 12, 2024
e53e7d2
Merge 'main' into feature/multilingual
jmartin-tech Feb 7, 2025
e5a08c7
Streamline translation use case
jmartin-tech Nov 7, 2024
6780578
Merge pull request #1 from jmartin-tech/feature/multilingual-translation
SnowMasaya Feb 14, 2025
5da27d2
cleanup imports and tests
jmartin-tech Feb 14, 2025
2 changes: 2 additions & 0 deletions docs/source/configurable.rst
Original file line number Diff line number Diff line change
Expand Up @@ -103,6 +103,8 @@ such as ``show_100_pass_modules``.
* ``seed`` - An optional random seed
* ``eval_threshold`` - At what point in the 0..1 range output by detectors does a result count as a successful attack / hit
* ``user_agent`` - What HTTP user agent string should garak use? ``{version}`` can be used to signify where garak version ID should go
* ``lang_spec`` - A single bcp47 value for the language the target LLM application accepts as prompt and produces as output
* ``translators`` - A list of configurations representing translators for converting from the probe bcp47 language to the ``lang_spec`` target bcp47 language

``plugins`` config items
""""""""""""""""""""""""
Expand Down
1 change: 1 addition & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,7 @@ Code reference
payloads
_config
_plugins
translator

Plugin structure
^^^^^^^^^^^^^^^^
Expand Down
190 changes: 190 additions & 0 deletions docs/source/translator.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,190 @@
The ``translator.py`` module in the garak framework handles text translation using various translation services and models.
It provides several classes, each implementing a different translation strategy, including cloud-based services
such as `DeepL <https://www.deepl.com/>`_ and `NVIDIA Riva <https://build.nvidia.com/nvidia/megatron-1b-nmt>`_, and local models such as facebook/m2m100 available on `Hugging Face <https://huggingface.co/>`_.

garak.translator
================

.. automodule:: garak.translator
   :members:
   :undoc-members:
   :show-inheritance:

Translation support
===================

This module adds translation support for probe and detector keywords and triggers,
allowing testing of models that accept and produce text in languages other than the language the plugin was written for.

* Limitations:

  - This functionality is strongly coupled to ``bcp47`` code "en" for sentence detection and structure at this time.
  - Reverse translation is required for snowball probes and Huggingface detectors due to model load formats.
  - Huggingface detectors primarily load English models, requiring a target language NLI model for the detector.
  - If probes or detectors fail to load, you may need to choose a smaller local translation model or use a remote service.
  - Translation may add significant execution time to the run depending on the resources available.

Supported translation services
------------------------------

- Huggingface

  - This project supports usage of the following translation models:

    - `Helsinki-NLP/opus-mt-{<source_lang>-<target_lang>} <https://huggingface.co/docs/transformers/model_doc/marian>`_
    - `facebook/m2m100_418M <https://huggingface.co/facebook/m2m100_418M>`_
    - `facebook/m2m100_1.2B <https://huggingface.co/facebook/m2m100_1.2B>`_

- `DeepL <https://www.deepl.com/docs-api>`_
- `NVIDIA Riva <https://build.nvidia.com/nvidia/megatron-1b-nmt>`_

API KEY Requirements
--------------------

To translate probe and detector keywords and triggers with the DeepL or Riva cloud services, an API key must be supplied.

API keys for the preferred service can be obtained at the following locations:

- `DeepL <https://www.deepl.com/en/pro-api>`__
- `Riva <https://build.nvidia.com/nvidia/megatron-1b-nmt>`__

Supported languages for remote services:

- `DeepL <https://developers.deepl.com/docs/resources/supported-languages>`__
- `Riva <https://docs.nvidia.com/nim/riva/nmt/latest/getting-started.html#supported-languages>`__

API keys can be stored in environment variables with the following commands:

DeepL
~~~~~

.. code-block:: bash

   export DEEPL_API_KEY=xxxx

RIVA
~~~~

.. code-block:: bash

   export RIVA_API_KEY=xxxx
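
As a rough illustration of the environment variables above, a wrapper script could check that the key is present before starting a run. This is only a sketch: ``get_api_key`` is a hypothetical helper and not part of garak; only the variable names ``DEEPL_API_KEY`` and ``RIVA_API_KEY`` come from the text.

```python
import os


def get_api_key(env_var: str) -> str:
    """Return the API key stored in env_var, or raise a clear error.

    Hypothetical helper for illustration; garak's own lookup may differ.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it before starting the run"
        )
    return key
```

Calling ``get_api_key("DEEPL_API_KEY")`` after the ``export`` shown above would return the exported value, while an unset variable fails early with a readable message rather than midway through a run.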

Configuration file
------------------

Translation is configured in the ``run`` section of a configuration file with the following keys:

* ``lang_spec`` - A single ``bcp47`` entry designating the language of the target under test, e.g. "ja", "fr", "jap".
* ``translators`` - A list of translator configurations, one per language pair.

* Note: The ``Helsinki-NLP/opus-mt-{source}-{target}`` models use a different language format, and the language codes used to name the models are inconsistent.
  Two-letter codes can usually be found `here <https://developers.google.com/admin-sdk/directory/v1/languages>`__, while three-letter codes require
  a search such as "language code {code}". More details can be found `here <https://github.com/Helsinki-NLP/OPUS-MT-train/tree/master/models>`__.

A translator configuration is provided using the project's configurable pattern with the following keys:

* ``language`` - (required) A ``-`` separated pair of ``bcp47`` entries describing the translation direction provided by the configuration
* ``model_type`` - (required) The module and optional instance class to be instantiated, e.g. ``local``, ``remote``, ``remote.DeeplTranslator``
* ``model_name`` - (optional) The model name loaded for translation; required for the ``local`` translator ``model_type``

(Optional) Model specific parameters defined by the translator model type may exist.
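
The key rules above can be sketched as a small validation routine. This is an illustration under stated assumptions, not garak's implementation: ``parse_language_pair`` and ``check_translator_config`` are hypothetical names, and splitting on a single ``-`` assumes both codes are simple tags such as ``en`` or ``ja`` (a tag that itself contains a hyphen would need different handling).

```python
from typing import Tuple


def parse_language_pair(language: str) -> Tuple[str, str]:
    """Split a "source-target" pair such as "en-ja" into its two codes.

    Assumption: each code contains no hyphen of its own.
    """
    parts = language.split("-")
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"expected 'source-target', got {language!r}")
    return parts[0], parts[1]


def check_translator_config(entry: dict) -> Tuple[str, str]:
    """Check one translator entry and return its (source, target) codes."""
    # language and model_type are always needed
    for key in ("language", "model_type"):
        if key not in entry:
            raise ValueError(f"translator config is missing {key!r}")
    # model_name is only required when the local translator is selected
    module = entry["model_type"].split(".")[0]
    if module == "local" and not entry.get("model_name"):
        raise ValueError("model_name is required for the local translator")
    return parse_language_pair(entry["language"])
```

For example, an entry with ``language: en-ja``, ``model_type: local`` and a ``model_name`` passes the check, while the same entry without ``model_name`` is rejected.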

* Note: local translation support loads a model and is not designed to support crossing the multi-processing boundary.

The translator configuration can be written to a file and the path passed with the ``--config`` CLI option.

An example template is provided below.

.. code-block:: yaml

   run:
     lang_spec: {target language code}
     translators:
       - language: {source language code}-{target language code}
         api_key: {your API key}
         model_type: {translator module or module.classname}
         model_name: {huggingface model name}
       - language: {target language code}-{source language code}
         api_key: {your API key}
         model_type: {translator module or module.classname}
         model_name: {huggingface model name}

* Note: each translator is configured for a single translation pair and specification is required in each direction for a run to proceed.
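
The requirement in the note above, one translator per direction, can be checked before a run. The following is a minimal sketch, assuming probes are written in ``en`` (the text describes the feature as coupled to "en"); ``missing_directions`` is a hypothetical helper and not a garak function.

```python
from typing import Iterable, List


def missing_directions(
    lang_spec: str, translators: Iterable[dict], source: str = "en"
) -> List[str]:
    """List the language pairs that still need a translator entry.

    Assumption: probes are written in `source`, so a run targeting
    `lang_spec` needs both source->target and target->source.
    """
    if lang_spec == source:
        return []  # nothing to translate
    configured = {t.get("language") for t in translators}
    required = [f"{source}-{lang_spec}", f"{lang_spec}-{source}"]
    return [pair for pair in required if pair not in configured]


run = {
    "lang_spec": "ja",
    "translators": [{"language": "en-ja", "model_type": "local"}],
}
print(missing_directions(run["lang_spec"], run["translators"]))  # ['ja-en']
```

Here only the forward direction is configured, so the reverse pair ``ja-en`` is reported as missing.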

Examples for translation configuration
--------------------------------------

DeepL
~~~~~

To use DeepL translation in garak, use the following yaml config and command.

.. code-block:: yaml

   run:
     lang_spec: {target language code}
     translators:
       - language: {source language code}-{target language code}
         model_type: remote.DeeplTranslator
       - language: {target language code}-{source language code}
         model_type: remote.DeeplTranslator


.. code-block:: bash

   export DEEPL_API_KEY=xxxx
   python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config {path to your yaml config file}


Riva
~~~~

For Riva, use the following yaml config and command.

.. code-block:: yaml

   run:
     lang_spec: {target language code}
     translators:
       - language: {source language code}-{target language code}
         model_type: remote
       - language: {target language code}-{source language code}
         model_type: remote


.. code-block:: bash

   export RIVA_API_KEY=xxxx
   python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config {path to your yaml config file}


Local
~~~~~

For local translation, use the following yaml config and command.

.. code-block:: yaml

   run:
     lang_spec: ja
     translators:
       - language: en-ja
         model_type: local
         model_name: facebook/m2m100_418M
       - language: ja-en
         model_type: local
         model_name: facebook/m2m100_418M


.. code-block:: bash

   python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config {path to your yaml config file}


For the Helsinki-NLP models, use the following yaml config and command.

.. code-block:: yaml

   run:
     lang_spec: jap
     translators:
       - language: en-jap
         model_type: local
         model_name: Helsinki-NLP/opus-mt-{}
       - language: jap-en
         model_type: local
         model_name: Helsinki-NLP/opus-mt-{}

.. code-block:: bash

   python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config {path to your yaml config file}
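
The text does not spell out how the ``{}`` placeholder in ``Helsinki-NLP/opus-mt-{}`` is filled. One plausible reading, sketched below purely as an assumption, is that the configured ``language`` pair is substituted into the model name; ``resolve_model_name`` is a hypothetical helper and the actual logic in ``garak.translator`` may differ.

```python
def resolve_model_name(model_name: str, language: str) -> str:
    """Fill a "{}" placeholder in model_name with the language pair.

    Assumption: the pair from the `language` key (e.g. "en-jap") is
    inserted verbatim; names without a placeholder are returned as-is.
    """
    if "{}" in model_name:
        return model_name.format(language)
    return model_name


print(resolve_model_name("Helsinki-NLP/opus-mt-{}", "en-jap"))
# Helsinki-NLP/opus-mt-en-jap
print(resolve_model_name("facebook/m2m100_418M", "en-ja"))
# facebook/m2m100_418M
```

Under this reading, the same templated ``model_name`` can be reused for both directions, while multilingual models such as m2m100 keep a fixed name.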
2 changes: 2 additions & 0 deletions garak/_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,8 @@ def _nested_dict():

# this is so popular, let's set a default. what other defaults are worth setting? what's the policy?
run.seed = None
run.lang_spec = "en"
run.translators = []

# placeholder
# generator, probe, detector, buff = {}, {}, {}, {}
Expand Down
10 changes: 10 additions & 0 deletions garak/attempt.py
Original file line number Diff line number Diff line change
Expand Up @@ -72,6 +72,8 @@ def __init__(
detector_results=None,
goal=None,
seq=-1,
bcp47=None, # language code for prompt as sent to the target
reverse_translator_outputs=None,
) -> None:
self.uuid = uuid.uuid4()
self.messages = []
Expand All @@ -86,6 +88,10 @@ def __init__(
self.seq = seq
if prompt is not None:
self.prompt = prompt
self.bcp47 = bcp47
self.reverse_translator_outputs = (
{} if reverse_translator_outputs is None else reverse_translator_outputs
)

def as_dict(self) -> dict:
"""Converts the attempt to a dictionary."""
Expand All @@ -103,6 +109,10 @@ def as_dict(self) -> dict:
"notes": self.notes,
"goal": self.goal,
"messages": self.messages,
"bcp47": self.bcp47,
"reverse_translator_outputs": {
k: list(v) for k, v in self.reverse_translator_outputs.items()
},
}

@property
Expand Down
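
The two new ``Attempt`` fields in the hunk above can be exercised in isolation. The sketch below uses a hypothetical stand-in class, ``AttemptSketch``, that copies only the lines shown in the diff; the real ``garak.attempt.Attempt`` has many more fields, and the ``"detector_name"`` key is an assumption made for illustration.

```python
from typing import Dict, List, Optional


class AttemptSketch:
    """Stand-in for the two fields added to garak.attempt.Attempt."""

    def __init__(
        self,
        bcp47: Optional[str] = None,  # language code for prompt as sent to the target
        reverse_translator_outputs: Optional[Dict[str, List[str]]] = None,
    ) -> None:
        self.bcp47 = bcp47
        # None as the default avoids sharing one dict between instances
        self.reverse_translator_outputs = (
            {} if reverse_translator_outputs is None else reverse_translator_outputs
        )

    def as_dict(self) -> dict:
        return {
            "bcp47": self.bcp47,
            # list(v) copies each value so the report does not alias live state
            "reverse_translator_outputs": {
                k: list(v) for k, v in self.reverse_translator_outputs.items()
            },
        }


a = AttemptSketch(bcp47="ja")
a.reverse_translator_outputs["detector_name"] = ["hello"]
print(a.as_dict())
# {'bcp47': 'ja', 'reverse_translator_outputs': {'detector_name': ['hello']}}
```

The ``None`` default matters here: a literal ``{}`` default argument would be shared by every attempt created without an explicit value.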
41 changes: 40 additions & 1 deletion garak/detectors/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -62,13 +62,22 @@ def __init__(self, config_root=_config):
)

logging.info(f"detector init: {self}")
self.reverse_translator = self.get_reverse_translator()

self.reverse_translator_outputs = []

def detect(self, attempt: garak.attempt.Attempt) -> Iterable[float]:
"""Takes a list of Attempts; classifies them; returns a list of results
in the range 0.0-1.0 where 0.0 represents no hit, 1.0 represents successful hit
(i.e. vulnerability found)."""
return [0.5] * len(attempt.all_outputs)

def get_reverse_translator(self):
from garak.translator import get_translator

translator_instance = get_translator(self.bcp47, True)
return translator_instance


class HFDetector(Detector, HFCompatible):
"""Detector using a Hugging Face model"""
Expand Down Expand Up @@ -129,13 +138,23 @@ def __init__(self, config_root=_config):

self.graceful_fail = False

self.reverse_translator = self.get_reverse_translator()

def detect(self, attempt: garak.attempt.Attempt) -> List[float]:
# goal: skip None outputs
# don't adjust attempt.outputs

non_none_outputs = [
v for k, v in enumerate(attempt.all_outputs) if v is not None
]
self.reverse_translator_outputs = []
if hasattr(self, "reverse_translator"):
if self.reverse_translator is not None and attempt.bcp47 != "en":
non_none_outputs = self.reverse_translator.translate_prompts(
non_none_outputs,
reverse_translate_judge=True,
)
self.reverse_translator_outputs = non_none_outputs
# non_none_offsets = [k for k,v in enumerate(attempt.all_outputs) if v is not None] # we'll want this when outputs+scores need to align
try:
detector_raw_results = self.detector(
Expand Down Expand Up @@ -177,10 +196,20 @@ def detect(
self, attempt: garak.attempt.Attempt, case_sensitive=False
) -> Iterable[float]:
detector_results = []
self.reverse_translator_outputs = []
for output in attempt.all_outputs:
match = False
if output is None:
continue
match = False
if hasattr(self, "reverse_translator"):
if self.reverse_translator is not None and attempt.bcp47 != "en":
output = "\n".join(
self.reverse_translator.translate_prompts(
output.split("\n"),
reverse_translate_judge=True,
)
)
self.reverse_translator_outputs.append(output)
for s in self.substrings:
if not case_sensitive:
s, output = s.lower(), output.lower()
Expand Down Expand Up @@ -210,13 +239,23 @@ def detect(
self, attempt: garak.attempt.Attempt, case_sensitive=False
) -> Iterable[float]:
detector_results = []
self.reverse_translator_outputs = []
if "triggers" in attempt.notes:
triggers = attempt.notes["triggers"]
if isinstance(triggers, str):
triggers = [triggers]
for output in attempt.all_outputs:
if output is None:
continue # trigger is absent
if hasattr(self, "reverse_translator"):
if self.reverse_translator is not None and attempt.bcp47 != "en":
output = "\n".join(
self.reverse_translator.translate_prompts(
output.split("\n"),
reverse_translate_judge=True,
)
)
self.reverse_translator_outputs.append(output)

match = False
for trigger in triggers:
Expand Down
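
The same guard and line-by-line reverse translation appears in several ``detect`` methods in this diff. A possible way to express it once is sketched below. ``reverse_translate_output`` and ``UppercaseTranslator`` are hypothetical names used only for illustration; the ``translate_prompts(..., reverse_translate_judge=True)`` call mirrors the one in the hunks above, and the stub merely uppercases text in place of a real translation.

```python
from typing import List, Optional


class UppercaseTranslator:
    """Hypothetical stub; a real translator would call a model or service."""

    def translate_prompts(
        self, prompts: List[str], reverse_translate_judge: bool = False
    ) -> List[str]:
        return [p.upper() for p in prompts]


def reverse_translate_output(output: str, translator, bcp47: Optional[str]) -> str:
    """Translate a model output back line by line, as the detectors do.

    Mirrors the guard in the diff: skip when no translator is configured
    or when the attempt language is already "en".
    """
    if translator is None or bcp47 == "en":
        return output
    lines = translator.translate_prompts(
        output.split("\n"), reverse_translate_judge=True
    )
    return "\n".join(lines)


print(reverse_translate_output("first\nsecond", UppercaseTranslator(), "ja"))
# FIRST
# SECOND
```

Splitting on newlines and rejoining keeps the line structure of the output intact, which matters for detectors that match on line starts.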
12 changes: 12 additions & 0 deletions garak/detectors/encoding.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,10 +46,22 @@ def _ngram_match_func(self, target, text, case_sensitive=False, n=3):

def detect(self, attempt: Attempt, case_sensitive=False, n=4):
detector_results = []
self.reverse_translator_outputs = []
# why would this detector need to support translation?
# The encoded values should be in self.substrings and does not look to have been translated when injected.
for q in attempt.all_outputs:

if q is None:
continue
if hasattr(self, "reverse_translator"):
if self.reverse_translator is not None and attempt.bcp47 != "en":
q = "\n".join(
self.reverse_translator.translate_prompts(
q.split("\n"),
reverse_translate_judge=True,
)
)
self.reverse_translator_outputs.append(q)

best_match = 0.0
for s in self.substrings:
Expand Down
10 changes: 10 additions & 0 deletions garak/detectors/leakreplay.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,11 +16,21 @@ class StartsWith(Detector):

def detect(self, attempt: Attempt, case_sensitive=False) -> List[float]:
detector_results = []
self.reverse_translator_outputs = []
if "triggers" in attempt.notes:
triggers = attempt.notes["triggers"]
for output in attempt.all_outputs:
if output is None:
continue
if hasattr(self, "reverse_translator"):
if self.reverse_translator is not None and attempt.bcp47 != "en":
output = "\n".join(
self.reverse_translator.translate_prompts(
output.split("\n"),
reverse_translate_judge=True,
)
)
self.reverse_translator_outputs.append(output)

match = False
output = output.strip()
Expand Down