Change multiple models to Multimodel #51

Merged
merged 6 commits into from
Dec 22, 2023
8 changes: 4 additions & 4 deletions .github/workflows/ci-workflow.yml
@@ -24,10 +24,10 @@ jobs:
cd tmp
mkdir -p basic/data
mkdir -p arfi/data
- mkdir -p multiple_models/data
+ mkdir -p multimodel/data
cp ../.github/workflows/test_data/labels.csv basic/data/labels.csv
cp ../.github/workflows/test_data/labels.csv arfi/data/labels.csv
- cp ../.github/workflows/test_data/labels.csv multiple_models/data/labels.csv
+ cp ../.github/workflows/test_data/labels.csv multimodel/data/labels.csv
- name: Test makita templates
run: |
cd tmp/basic
@@ -36,8 +36,8 @@ jobs:
cd ../arfi
asreview makita template arfi | tee output.txt
grep -q "ERROR" output.txt && exit 1 || true
- cd ../multiple_models
- asreview makita template multiple_models | tee output.txt
+ cd ../multimodel
+ asreview makita template multimodel | tee output.txt
grep -q "ERROR" output.txt && exit 1 || true
- name: Run ShellCheck
if: ${{ matrix.os != 'windows-latest' }}
8 changes: 4 additions & 4 deletions README.md
@@ -156,9 +156,9 @@ optional arguments:
--stop_if STOP_IF The number of label actions to simulate. Default 'min' will stop simulating when all relevant records are found.
```

- ### Multiple models template
+ ### Multimodel template

- command: `multiple_models`
+ command: `multimodel`

The multiple model template prepares a script for running a simulation study comparing multiple models for one dataset and a fixed set of priors (one relevant and one irrelevant record; identical across models).

@@ -191,7 +191,7 @@ want to exclude the combinations of `nb` with `doc2vec` and `logistic` with
`tfidf`, use the following command:

```console
- asreview makita template multiple_models --classifiers logistic nb --feature_extractors tfidf doc2vec --impossible_models nb,doc2vec logistic,tfidf
+ asreview makita template multimodel --classifiers logistic nb --feature_extractors tfidf doc2vec --impossible_models nb,doc2vec logistic,tfidf
```
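
As a rough illustration of how these exclusions narrow the model grid, the sketch below enumerates every classifier and feature extractor pair and drops the excluded ones. The function name `model_combinations` is hypothetical and this is not necessarily how Makita builds its job list; it only mirrors the comma-separated `classifier,feature_extractor` format of the `--impossible_models` values.

```python
from itertools import product


def model_combinations(classifiers, feature_extractors, impossible_models):
    """Return (classifier, feature_extractor) pairs, skipping excluded ones.

    Hypothetical helper for illustration; not part of the Makita API.
    """
    # Each exclusion is a "classifier,feature_extractor" string,
    # matching the values passed to --impossible_models.
    excluded = {tuple(item.split(",")) for item in impossible_models}
    return [
        pair
        for pair in product(classifiers, feature_extractors)
        if pair not in excluded
    ]


combos = model_combinations(
    ["logistic", "nb"],
    ["tfidf", "doc2vec"],
    ["nb,doc2vec", "logistic,tfidf"],
)
print(combos)
```

With the arguments from the command above, two of the four possible pairs remain: `logistic` with `doc2vec`, and `nb` with `tfidf`.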

## Advanced usage
@@ -250,7 +250,7 @@ The following scripts are available:

#### Time to Discovery Tables

- The 'merge_tds.py' script creates a table of the time to discovery (TD) values for each dataset, with each row corresponding to each record ID of the relevant records in a dataset, and the columns correspond to each simulation run (e.g, for the multiple models template each column corresponds to a simualtion run with each active learning model). Additionally, the tables includes the average-record-TD values (the average of the TD values for a record across multiple simulation runs), and the average-simulation-TD values (the average of the TD values across all records for a single simulation run).
+ The 'merge_tds.py' script creates a table of the time to discovery (TD) values for each dataset, with each row corresponding to each record ID of the relevant records in a dataset, and the columns correspond to each simulation run (e.g, for the Multimodel template each column corresponds to a simualtion run with each active learning model). Additionally, the tables includes the average-record-TD values (the average of the TD values for a record across multiple simulation runs), and the average-simulation-TD values (the average of the TD values across all records for a single simulation run).

#### Run Makita via Docker

18 changes: 11 additions & 7 deletions asreviewcontrib/makita/entrypoint.py
@@ -8,7 +8,7 @@
from asreviewcontrib.makita.config import TEMPLATES_FP
from asreviewcontrib.makita.template_arfi import render_jobs_arfi
from asreviewcontrib.makita.template_basic import render_jobs_basic
- from asreviewcontrib.makita.template_multiple_models import render_jobs_multiple_models
+ from asreviewcontrib.makita.template_multimodel import render_jobs_multimodel
from asreviewcontrib.makita.utils import FileHandler


@@ -88,7 +88,7 @@ def execute(self, argv): # noqa: C901
"--n_runs",
type=int,
default=1,
- help="Number of runs. Only for templates 'basic' and 'multiple_models'. "
+ help="Number of runs. Only for templates 'basic' and 'multimodel'. "
"Default: 1.",
)
parser_template.add_argument(
@@ -149,21 +149,21 @@ def execute(self, argv): # noqa: C901
"--classifiers",
nargs="+",
default=["logistic", "nb", "rf", "svm"],
- help="Classifiers to use. Only for template 'multiple_models'. "
+ help="Classifiers to use. Only for template 'multimodel'. "
"Default: ['logistic', 'nb', 'rf', 'svm']",
)
parser_template.add_argument(
"--feature_extractors",
nargs="+",
default=["doc2vec", "sbert", "tfidf"],
- help="Feature extractors to use. Only for template 'multiple_models'. "
+ help="Feature extractors to use. Only for template 'multimodel'. "
"Default: ['doc2vec', 'sbert', 'tfidf']",
)
parser_template.add_argument(
"--impossible_models",
nargs="+",
default=["nb,doc2vec", "nb,sbert"],
- help="Model combinations to exclude. Only for template 'multiple_models'. "
+ help="Model combinations to exclude. Only for template 'multimodel'. "
"Default: ['nb,doc2vec', 'nb,sbert']",
)

@@ -194,6 +194,10 @@ def _template_cli(self, args):
def _template(self, args):
"""Generate a template."""

+ # backwards compatibility for 'multiple_models'
+ if args.name == "multiple_models":
+ args.name = "multimodel"

# check if a custom template is used, otherwise use the default template
fp_template = args.template or (args.name and _get_template_fp(args.name))
_is_valid_template(fp_template)
@@ -252,9 +256,9 @@ def _template(self, args):
platform_sys=args.platform,
)

- elif args.name in ["multiple_models"]:
+ elif args.name in ["multimodel"]:
# render jobs
- job = render_jobs_multiple_models(
+ job = render_jobs_multimodel(
datasets,
output_folder=Path(args.o),
create_wordclouds=args.no_wordclouds,
@@ -1,4 +1,4 @@
- """Render multiple_models template."""
+ """Render multimodel template."""

import os
import platform
@@ -11,7 +11,7 @@
from asreviewcontrib.makita.utils import check_filename_dataset


- def render_jobs_multiple_models(
+ def render_jobs_multimodel(
datasets,
output_folder="output",
n_runs=1,
@@ -88,7 +88,7 @@ def render_jobs_multiple_models(
"doc",
datasets=datasets,
template_name=template.name
- if template.name == "multiple_models"
+ if template.name == "multimodel"
else "custom",
template_name_long=template.name_long,
template_scripts=template.scripts,
@@ -1,5 +1,5 @@
---
- name: multiple_models
+ name: multimodel
name_long: Basic simulation for every possible combination of selected models

scripts:
@@ -12,7 +12,7 @@ docs:
- README.md
---

- {# This is a template for the multiple_models method #}
+ {# This is a template for the multimodel method #}
# version {{ version }}

# Create folder structure. By default, the folder 'output' is used to store output.
6 changes: 3 additions & 3 deletions examples/README.md
@@ -11,8 +11,8 @@ synergy_dataset get -d van_de_Schoot_2018 Smid_2020 -o examples/basic_example/da
cd examples/basic_example
asreview makita template basic
cd ../..
- synergy_dataset get -d van_de_Schoot_2018 Smid_2020 -o examples/multiple_models_example/data -l
- cd examples/multiple_models_example
- asreview makita template multiple_models
+ synergy_dataset get -d van_de_Schoot_2018 Smid_2020 -o examples/multimodel_example/data -l
+ cd examples/multimodel_example
+ asreview makita template multimodel
cd ../..
```
@@ -2,7 +2,7 @@

*This project was rendered with ASReview-Makita version 0.0.0.*

- This project was rendered from the Makita-multiple_models template. See [asreview/asreview-makita#templates](https://github.com/asreview/asreview-makita#templates) for template rules and formats.
+ This project was rendered from the Makita-multimodel template. See [asreview/asreview-makita#templates](https://github.com/asreview/asreview-makita#templates) for template rules and formats.

The template is described as: 'Basic simulation for every possible combination of selected models'.
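
The backwards-compatibility check that this PR adds to `_template` in `entrypoint.py` remaps the old template name in place. A slightly more general variant, sketched below, keeps such renames in a lookup table and warns the caller. `TEMPLATE_ALIASES` and `resolve_template_name` are hypothetical names, and the deprecation warning is an assumption of this sketch; the PR itself remaps silently.

```python
import warnings

# Hypothetical alias table: deprecated template name -> current name.
TEMPLATE_ALIASES = {"multiple_models": "multimodel"}


def resolve_template_name(name):
    """Map a deprecated template name to its current name.

    Names without an alias are returned unchanged.
    """
    new_name = TEMPLATE_ALIASES.get(name)
    if new_name is None:
        return name
    # Assumption of this sketch: tell the caller about the rename.
    warnings.warn(
        f"Template '{name}' has been renamed to '{new_name}'.",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_name
```

Under this sketch, `resolve_template_name("multiple_models")` returns `"multimodel"`, so existing scripts that pass the old name keep working, while a name such as `"basic"` passes through untouched.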
