diff --git a/README.md b/README.md
index 8ff7f84..163a1e9 100644
--- a/README.md
+++ b/README.md
@@ -99,7 +99,7 @@ Using RobBERT's `model.pt`, this method allows you to use all other functionalit
 All experiments are described in more detail in our [paper](https://arxiv.org/abs/2001.06286), with the code in [our GitHub repository](https://github.com/iPieter/RobBERT).
 
 ### Sentiment analysis
-Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/110kDBRD).
+Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/DBRD).
 
 | Model | Accuracy [%] |
 |-------------------|--------------------------|
@@ -234,7 +234,7 @@ In this section we describe how to use the scripts we provide to fine-tune model
 
 #### Sentiment analysis using the Dutch Book Review Dataset
 
-- Download the Dutch book review dataset from [https://github.com/benjaminvdb/110kDBRD](https://github.com/benjaminvdb/110kDBRD), and save it to `data/raw/110kDBRD`
+- Download the Dutch book review dataset from [https://github.com/benjaminvdb/DBRD](https://github.com/benjaminvdb/DBRD), and save it to `data/raw/DBRD`
 - Run `src/preprocess_dbrd.py` to prepare the dataset.
 - To not be blind during training, we recommend to keep aside a small evaluation set from the training set. For this run `src/split_dbrd_training.sh`.
 - Follow the notebook `notebooks/finetune_dbrd.ipynb` to finetune the model.
diff --git a/model_cards/README.md b/model_cards/README.md
index 2cbf073..403fc34 100644
--- a/model_cards/README.md
+++ b/model_cards/README.md
@@ -74,7 +74,7 @@ You can then use most of [HuggingFace's BERT-based notebooks](https://huggingfac
 All experiments are described in more detail in our [paper](https://arxiv.org/abs/2001.06286), with the code in [our GitHub repository](https://github.com/iPieter/RobBERT).
 
 ### Sentiment analysis
-Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/110kDBRD).
+Predicting whether a review is positive or negative using the [Dutch Book Reviews Dataset](https://github.com/benjaminvdb/DBRD).
 
 | Model | Accuracy [%] |
 |-------------------|--------------------------|
diff --git a/src/preprocess_dbrd.py b/src/preprocess_dbrd.py
index a756f55..6605be0 100644
--- a/src/preprocess_dbrd.py
+++ b/src/preprocess_dbrd.py
@@ -8,7 +8,7 @@ def create_arg_parser():
     parser = argparse.ArgumentParser(
         description="Preprocess the Dutch Book Reviews Dataset corpus for the sentiment analysis tagging task."
     )
-    parser.add_argument("--path", help="Path to the corpus folder.", metavar="path", default="data/raw/110kDBRD/")
+    parser.add_argument("--path", help="Path to the corpus folder.", metavar="path", default="data/raw/DBRD/")
 
     return parser