Commit

small tweaks
soldni committed Dec 30, 2024
1 parent 2b6b5f7 commit f5f1224
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions configs/cc-news/mix-deupe-by-year.yaml
@@ -5,8 +5,8 @@ streams:
attributes: &attributes
- dedupe_ngrams_20_1
output: &output
- max_size_in_bytes: 2_500_000_000
- path: ${oc.env:HOME}/ai2-llm/pretraining-data/sources/cc-news/v1-resiliparse-year-dedupe/documents
+ max_size_in_bytes: 3_814_697_265
+ path: ${oc.env:HOME}/ai2-llm/pretraining-data/sources/cc-news/v2-resiliparse-year_dedup/documents
filter: &filter
include:
- >-
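To put the two `max_size_in_bytes` values in the diff side by side, here is a minimal Python sketch that converts each to decimal gigabytes and binary gibibytes. The helper name `describe` is hypothetical and not part of the repository. That the key caps the size of each output file is an assumption based on its name; the diff itself does not say so.

```python
# Illustrative only: unit conversions for the two values shown in the diff.
OLD_MAX = 2_500_000_000  # value removed by this commit
NEW_MAX = 3_814_697_265  # value added by this commit


def describe(n_bytes: int) -> str:
    """Render a byte count in bytes, GB (10**9) and GiB (2**30)."""
    gb = n_bytes / 10**9
    gib = n_bytes / 2**30
    return f"{n_bytes:,} bytes = {gb:.2f} GB = {gib:.2f} GiB"


if __name__ == "__main__":
    print("old:", describe(OLD_MAX))
    print("new:", describe(NEW_MAX))
    print(f"ratio new/old: {NEW_MAX / OLD_MAX:.2f}")
```

The commit message ("small tweaks") does not explain why the new value was chosen, so the sketch only reports the numbers without interpreting them.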
