
Commit 973b34d
Update references to model
dirkgr committed Nov 26, 2024
1 parent d867ced
Showing 1 changed file with 5 additions and 5 deletions.
README.md: 5 additions, 5 deletions
@@ -54,8 +54,8 @@ You can utilize our Hugging Face integration to run inference on the OLMo Transf
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

-olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0724-hf")
-tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-0724-hf")
+olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B")
+tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B")

message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
@@ -67,21 +67,21 @@ Alternatively, with the Hugging Face pipeline abstraction:

```python
from transformers import pipeline
-olmo_pipe = pipeline("text-generation", model="allenai/OLMo-7B-0724-hf")
+olmo_pipe = pipeline("text-generation", model="allenai/OLMo-2-1124-7B")
print(olmo_pipe("Language modeling is"))
```

### Inference on finetuned checkpoints
After fine-tuning the model using the code in the [Fine-tuning](#fine-tuning) section, you can use the conversion script to convert a native OLMo checkpoint to a HuggingFace-compatible format.

```bash
-python scripts/convert_olmo_to_hf_new.py --input_dir /path/to/olmo/checkpoint --output_dir /path/to/hf/checkpoint/ --tokenizer_json_path tokenizers/allenai_gpt-neox-olmo-dolma-v1_5.json
+python scripts/convert_olmo_to_hf_new.py --input_dir /path/to/olmo/checkpoint --output_dir /path/to/hf/checkpoint/ --tokenizer_json_path tokenizers/allenai_dolma2.json
```
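
As a rough sketch (the path below is the same `--output_dir` placeholder used in the command above), the converted checkpoint can then be loaded with the standard Hugging Face API:

```python
# Sketch only: load a checkpoint converted by convert_olmo_to_hf_new.py.
# "/path/to/hf/checkpoint/" is the placeholder --output_dir from the command above.
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained("/path/to/hf/checkpoint/")
tokenizer = AutoTokenizer.from_pretrained("/path/to/hf/checkpoint/")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
response = olmo.generate(**inputs, max_new_tokens=50)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```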

### Quantization

```python
-olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0724-hf", torch_dtype=torch.float16, load_in_8bit=True) # requires bitsandbytes
+olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B", torch_dtype=torch.float16, load_in_8bit=True) # requires bitsandbytes
```

The quantized model is sensitive to input types and CUDA handling. To avoid potential issues, we recommend explicitly converting input IDs to CUDA using: `inputs.input_ids.to('cuda')`
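
For instance, a minimal end-to-end sketch (assuming a CUDA-capable GPU and the same prompt as the inference example above):

```python
# Minimal sketch, assuming a CUDA GPU; input IDs are moved to CUDA explicitly,
# as recommended above for the quantized model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-1124-7B", torch_dtype=torch.float16, load_in_8bit=True
)  # requires bitsandbytes
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-1124-7B")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
response = olmo.generate(input_ids=inputs.input_ids.to("cuda"), max_new_tokens=50)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```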
