Custom Prompts Example Config, Docs #42

Merged 10 commits on Jul 12, 2024
19 changes: 12 additions & 7 deletions README.md
@@ -18,14 +18,14 @@ In order to revise your manuscript, prompts must be provided to the AI model. Th
- **Default prompts**: you can use the default prompts provided by the tool, in which case you don't need to do anything.
- **Custom prompts**: you can define your own prompts to apply to specific files using YAML configuration files that you include with your manuscript.

-The default prompt, which should work for most manuscripts, is the following:
+If you wish to customize the prompts on a per-file basis, see [docs/custom-prompts.md](docs/custom-prompts.md) for more information.

-```
-Proofread the following paragraph that is part of a scientific manuscript.
-Keep all Markdown formatting, citations to other articles, mathematical expressions, and equations.
-```
+### Caveats

-If you wish to customize the prompts on a per-file basis, see [docs/custom-prompts.md](docs/custom-prompts.md) for more information.
+In the current implementation, the editor can only process one paragraph at a time.
+This limits the contextual information the LLM receives and thus the specificity of what it can check and fix.
+For example, in the Discussion section of a manuscript, the first paragraph should typically summarize the findings from the Results section, while the rest of the paragraphs should follow a different structure, but the AI editor can only judge each paragraph in the same way.
+We plan to reduce or remove this limitation in the future.
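
To illustrate the caveat, here is a minimal sketch of paragraph-at-a-time revision. The names `split_paragraphs`, `revise_text`, and `fake_model` are hypothetical and not part of manubot-ai-editor; the sketch only assumes that paragraphs are separated by blank lines. The point is that each call to the model sees a single paragraph and nothing about its neighbors or its position in the section.

```python
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split Markdown text into paragraphs separated by blank lines."""
    paragraphs, current = [], []
    for line in text.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            paragraphs.append("\n".join(current))
            current = []
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


def revise_text(text: str, revise: Callable[[str], str]) -> str:
    """Revise each paragraph independently (hypothetical helper)."""
    # Each call receives one paragraph only: no neighboring paragraphs and
    # no position within the section, which is the limitation described above.
    return "\n\n".join(revise(p) for p in split_paragraphs(text))


seen = []  # records exactly what the stand-in "model" receives per call


def fake_model(paragraph: str) -> str:
    """Stand-in for an LLM call; upper-cases the paragraph."""
    seen.append(paragraph)
    return paragraph.upper()


revised = revise_text(
    "First paragraph.\nStill first.\n\nSecond paragraph.", fake_model
)
```

Because `fake_model` is invoked once per paragraph, a prompt cannot distinguish, say, the opening paragraph of a Discussion from the ones that follow it.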

### Command line

@@ -48,7 +48,7 @@ For example, to change the temperature parameter of OpenAI models, you can expor
Then, within the root directory of your Manubot-based manuscript, run the following commands (**IMPORTANT:** this will overwrite your original manuscript!):

```bash
-manubot ai-revision --content-directory content/
+manubot ai-revision --content-directory content/ --config-directory ci/
```

The tool will revise each paragraph of your manuscript and write back the revised files in the same directory.
@@ -59,6 +59,7 @@ Before using the OpenAI API and incurring costs, you can run a test by using a d
```bash
manubot ai-revision \
--content-directory content/ \
+--config-directory ci/ \
--model-type DummyManuscriptRevisionModel \
--model-kwargs add_paragraph_marks=True
```
@@ -88,8 +89,12 @@ from manubot_ai_editor.models import GPT3CompletionModel
# create a manuscript editor object
# here content_dir points to the "content" directory of the Manubot-based
# manuscript, where Markdown files (*.md) are located
+# config_dir points to where CI-related configuration, including the AI
+# editor's configuration, is stored. it's optional, and if left out will
+# resort to defaults.
me = ManuscriptEditor(
content_dir="content",
+config_dir="ci"
)

# create a model to revise the manuscript
5 changes: 3 additions & 2 deletions docs/custom-prompts.md
@@ -8,14 +8,15 @@ There are two ways that you can use the custom prompts system:
2. You can create the `ai-revision-prompts.yaml`, but only specify prompts and identifiers, which makes it suitable for sharing with others who have different names for their manuscripts' files.
You would then specify a second file, `ai-revision-config.yaml`, that maps the prompt identifiers to the actual files in your manuscript.
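
As a rough illustration of the two-file split, the sketch below keeps prompt text under identifiers (the shareable part) separately from the mapping of filename patterns to those identifiers (the manuscript-specific part). It uses plain Python structures instead of the YAML files, and every name and value in it (`PROMPTS`, `FILE_MATCHINGS`, `resolve_prompt`, the identifiers, the patterns) is a hypothetical stand-in, not the tool's actual schema or API.

```python
import re
from typing import Dict, List, Optional, Tuple

# Shareable part: prompt identifiers mapped to prompt text (made-up content).
PROMPTS: Dict[str, str] = {
    "intro_prompt": "Revise this introduction paragraph.",
    "default": "Proofread the following paragraph.",
}

# Manuscript-specific part: filename patterns (regular expressions)
# mapped to prompt identifiers. First match wins.
FILE_MATCHINGS: List[Tuple[str, str]] = [
    (r"introduction", "intro_prompt"),
]


def resolve_prompt(
    filename: str, default_id: Optional[str] = "default"
) -> Optional[str]:
    """Return the prompt text for a file, or None if nothing applies."""
    for pattern, prompt_id in FILE_MATCHINGS:
        if re.search(pattern, filename):
            return PROMPTS[prompt_id]
    # No pattern matched: fall back to the default prompt, if one is named.
    if default_id is None:
        return None
    return PROMPTS.get(default_id)
```

Keeping the identifiers stable means the prompt definitions can be reused across manuscripts whose files are named differently; only the pattern mapping needs to change.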

-These files should be placed in the `content` directory alongside your manuscript markdown files.
+These files should be placed in the `ci` directory under your manubot root directory.

See [Functionality Notes](#functionality-notes) later in this document for more information on how to write regular expressions and use placeholders in your prompts.


## Approach 1: Single file

With this approach, you can define your prompts and how they map to your manuscript files in a single file.
-The single file should be named `ai-revision-prompts.yaml` and placed in the `content` folder.
+The single file should be named `ai-revision-prompts.yaml` and placed in the `ci` folder.

The file would look something like the following:

8 changes: 4 additions & 4 deletions libs/manubot_ai_editor/editor.py
@@ -97,7 +97,7 @@ def revise_and_write_paragraph(
Arguments:
paragraph: list of lines of the paragraph.
section_name: name of the section the paragraph belongs to.
-resolved_prompt: a prompt resolved via the ai_revision prompt config; None if unavailable
+resolved_prompt: a prompt resolved via the ai-revision prompt config; None if unavailable
revision_model: model to use for revision.
outfile: file object to write the revised paragraph to.

@@ -267,7 +267,7 @@ def revise_file(
output_dir (Path | str): path to the directory where the revised file will be written.
revision_model (ManuscriptRevisionModel): model to use for revision.
section_name (str, optional): Defaults to None. If so, it will be inferred from the filename.
-resolved_prompt (str, optional): A prompt resolved via ai_revision prompt config files, which overrides any custom or section-derived prompts; None if unavailable.
+resolved_prompt (str, optional): A prompt resolved via ai-revision prompt config files, which overrides any custom or section-derived prompts; None if unavailable.
"""
input_filepath = self.content_dir / input_filename
assert input_filepath.exists(), f"Input file {input_filepath} does not exist"
@@ -466,7 +466,7 @@ def revise_manuscript(
for filename in sorted(self.content_dir.glob("*.md")):
filename_section = self.get_section_from_filename(filename.name)

-# use the ai_revision prompt config to attempt to resolve a prompt
+# use the ai-revision prompt config to attempt to resolve a prompt
resolved_prompt, _ = self.prompt_config.get_prompt_for_filename(
filename.name
)
@@ -477,7 +477,7 @@ def revise_manuscript(

# we do not process the file if all hold:
# 1. it has no section *or* resolved prompt
-# 2. we're unable to resolve it via ai_revision prompt configuration
+# 2. we're unable to resolve it via ai-revision prompt configuration
# 3. there is no custom prompt
if (filename_section is None and resolved_prompt is None) and (
env_vars.CUSTOM_PROMPT not in os.environ
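
The skip condition in this hunk is cut off, so the following is only a sketch of the visible part: a file is skipped when it has no section, no resolved prompt, and no custom prompt in the environment. The function name `should_skip_file` is hypothetical, and the variable name `AI_EDITOR_CUSTOM_PROMPT` is an assumption about what `env_vars.CUSTOM_PROMPT` holds; any further checks in the truncated expression are not reproduced.

```python
import os
from typing import Mapping, Optional

# Assumed value of env_vars.CUSTOM_PROMPT; not confirmed by the diff above.
CUSTOM_PROMPT_VAR = "AI_EDITOR_CUSTOM_PROMPT"


def should_skip_file(
    filename_section: Optional[str],
    resolved_prompt: Optional[str],
    environ: Optional[Mapping[str, str]] = None,
) -> bool:
    """Return True only when none of the three prompt sources applies."""
    env = os.environ if environ is None else environ
    return (
        filename_section is None  # no section inferred from the filename
        and resolved_prompt is None  # nothing resolved from the prompt config
        and CUSTOM_PROMPT_VAR not in env  # no custom prompt set
    )
```

Passing `environ` explicitly keeps the check easy to exercise without touching the real process environment.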
6 changes: 3 additions & 3 deletions libs/manubot_ai_editor/models.py
@@ -26,7 +26,7 @@ def revise_paragraph(self, paragraph_text, section_name, resolved_prompt=None):
Args:
paragraph_text (str): text of the paragraph to revise.
section_name (str): name of the section the paragraph belongs to.
-resolved_prompt (str): prompt resolved via ai_revision config files, if available
+resolved_prompt (str): prompt resolved via ai-revision config files, if available

Returns:
Revised paragraph text.
@@ -265,7 +265,7 @@ def get_prompt(
Args:
paragraph_text: text of the paragraph to revise.
section_name: name of the section the paragraph belongs to.
-resolved_prompt: prompt resolved via ai_revision config, if available
+resolved_prompt: prompt resolved via ai-revision config, if available

Returns:
If self.endpoint != "edits", then returns a string with the prompt to be used by the model for the revision of the paragraph.
@@ -314,7 +314,7 @@ def get_prompt(
# a simple workaround is to remove {paragraph_text} from the prompt
prompt = custom_prompt.format(**placeholders)
elif resolved_prompt:
-# use the resolved prompt from the ai_revision config files, if available
+# use the resolved prompt from the ai-revision config files, if available
# replace placeholders with their actual values
prompt = resolved_prompt.format(**placeholders)
elif section_name in ("abstract",):
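
For readers unfamiliar with the `resolved_prompt.format(**placeholders)` call, this small sketch shows how `str.format` fills named placeholders. The placeholder `{paragraph_text}` appears in the hunk above; treating `section_name` as a placeholder, the example values, and the helper `render_prompt` are assumptions made for illustration.

```python
from typing import Dict

# Example values; in the editor these would come from the manuscript.
placeholders: Dict[str, str] = {
    "paragraph_text": "Cells were grown at 37 C.",
    "section_name": "methods",
}


def render_prompt(template: str, values: Dict[str, str]) -> str:
    """Fill named {placeholders} in a prompt template (hypothetical helper)."""
    # str.format raises KeyError for a placeholder that has no value,
    # and treats doubled braces ({{ and }}) as literal braces.
    return template.format(**values)


prompt = render_prompt(
    "Revise this paragraph from the {section_name} section: {paragraph_text}",
    placeholders,
)
```

One practical consequence: a custom prompt that needs literal curly braces has to double them, otherwise `format` treats them as a placeholder.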
2 changes: 1 addition & 1 deletion libs/manubot_ai_editor/prompt_config.py
@@ -185,7 +185,7 @@ def get_prompt_for_filename(
resolved_default_prompt = None
if use_default and self.prompts is not None:
resolved_default_prompt = self.prompts.get(
-get_obj_path(self.config, ("files", "default_prompt"), missing="default"),
+get_obj_path(self.config, ("files", "default_prompt")),
None
)
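
Reading this hunk as dropping the `missing="default"` argument, the sketch below shows the likely effect: when the config names no default prompt, the lookup key becomes `None` and no prompt is picked up implicitly. The `get_obj_path` defined here is a hypothetical stand-in written for this example, not the library's implementation, and the `prompts` and `config` values are made up.

```python
from typing import Any, Sequence


def get_obj_path(obj: Any, path: Sequence[str], missing: Any = None) -> Any:
    """Walk nested dicts along `path`; return `missing` if any key is absent."""
    current = obj
    for key in path:
        if not isinstance(current, dict) or key not in current:
            return missing
        current = current[key]
    return current


prompts = {"default": "Proofread the following paragraph."}
config = {"files": {}}  # no default_prompt configured

# Old call shape: an absent setting falls back to the identifier "default",
# so a prompt stored under that name is used implicitly.
with_fallback = prompts.get(
    get_obj_path(config, ("files", "default_prompt"), missing="default"), None
)

# New call shape: an absent setting yields None as the key, so nothing is
# resolved unless the config names a default prompt explicitly.
without_fallback = prompts.get(
    get_obj_path(config, ("files", "default_prompt")), None
)
```

Under these assumptions, the change makes the default prompt opt-in through the config rather than tied to an identifier that happens to be called `default`.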
