Commit

Merge pull request #23 from one1note/master
Changed name of the dataset from 110kDBRD to DBRD.
iPieter authored May 19, 2021
2 parents c97032b + 1ad69ee commit 428990f
Showing 3 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -99,7 +99,7 @@ Using RobBERT's `model.pt`, this method allows you to use all other functionalit
All experiments are described in more detail in our [paper](https://arxiv.org/abs/2001.06286), with the code in [our GitHub repository](https://github.com/iPieter/RobBERT).

### Sentiment analysis
-Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/110kDBRD).
+Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/DBRD).

| Model | Accuracy [%] |
|-------------------|--------------------------|
@@ -234,7 +234,7 @@ In this section we describe how to use the scripts we provide to fine-tune model

#### Sentiment analysis using the Dutch Book Review Dataset

-- Download the Dutch book review dataset from [https://github.com/benjaminvdb/110kDBRD](https://github.com/benjaminvdb/110kDBRD), and save it to `data/raw/110kDBRD`
+- Download the Dutch book review dataset from [https://github.com/benjaminvdb/DBRD](https://github.com/benjaminvdb/DBRD), and save it to `data/raw/DBRD`
- Run `src/preprocess_dbrd.py` to prepare the dataset.
- To not be blind during training, we recommend to keep aside a small evaluation set from the training set. For this run `src/split_dbrd_training.sh`.
- Follow the notebook `notebooks/finetune_dbrd.ipynb` to finetune the model.
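
Because this commit changes the expected dataset folder from `data/raw/110kDBRD` to `data/raw/DBRD`, an existing checkout that already holds the data under the old name may need a one-time rename. The sketch below is not part of the repository; `migrate_dataset_dir` is a hypothetical helper, and it assumes the data sits directly under a `data/raw` folder as the steps above describe.

```python
from pathlib import Path


def migrate_dataset_dir(raw_root: Path) -> Path:
    """Return the DBRD folder under raw_root.

    Hypothetical helper (not in the RobBERT repository): if only the
    legacy `110kDBRD` folder exists, rename it to `DBRD` in place.
    """
    old_dir = raw_root / "110kDBRD"
    new_dir = raw_root / "DBRD"
    # Rename only when the old folder exists and the new one is absent,
    # so an already-migrated checkout is left untouched.
    if old_dir.is_dir() and not new_dir.exists():
        old_dir.rename(new_dir)
    return new_dir


if __name__ == "__main__":
    # Assumed location, taken from the README steps above.
    print(migrate_dataset_dir(Path("data/raw")))
```

Running this before `src/preprocess_dbrd.py` would let the script's new default path find data downloaded under the old name; re-downloading into `data/raw/DBRD` works just as well.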
2 changes: 1 addition & 1 deletion model_cards/README.md
@@ -74,7 +74,7 @@ You can then use most of [HuggingFace's BERT-based notebooks](https://huggingfac
All experiments are described in more detail in our [paper](https://arxiv.org/abs/2001.06286), with the code in [our GitHub repository](https://github.com/iPieter/RobBERT).

### Sentiment analysis
-Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/110kDBRD).
+Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/DBRD).

| Model | Accuracy [%] |
|-------------------|--------------------------|
2 changes: 1 addition & 1 deletion src/preprocess_dbrd.py
@@ -8,7 +8,7 @@ def create_arg_parser():
parser = argparse.ArgumentParser(
description="Preprocess the Dutch Book Reviews Dataset corpus for the sentiment analysis tagging task."
)
-parser.add_argument("--path", help="Path to the corpus folder.", metavar="path", default="data/raw/110kDBRD/")
+parser.add_argument("--path", help="Path to the corpus folder.", metavar="path", default="data/raw/DBRD/")

return parser
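
To illustrate the effect of the changed default, here is a minimal, self-contained sketch that rebuilds only the parser shown in this hunk. The rest of `src/preprocess_dbrd.py` is not visible in the diff and is not reproduced here; the two `parse_args` calls are illustrative usage, not code from the repository.

```python
import argparse


def create_arg_parser() -> argparse.ArgumentParser:
    # Mirrors the parser in the hunk above, using the new default path.
    parser = argparse.ArgumentParser(
        description="Preprocess the Dutch Book Reviews Dataset corpus for the sentiment analysis tagging task."
    )
    parser.add_argument(
        "--path",
        help="Path to the corpus folder.",
        metavar="path",
        default="data/raw/DBRD/",
    )
    return parser


# With no flag, the parser falls back to the renamed folder.
args = create_arg_parser().parse_args([])
print(args.path)

# Data kept under the old folder name can still be selected explicitly.
legacy_args = create_arg_parser().parse_args(["--path", "data/raw/110kDBRD/"])
print(legacy_args.path)
```

Since `--path` stays overridable, the rename only affects users who relied on the default.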
