I have a few questions about the technical requirements for the CodeLlama model:
What are the minimum and recommended amounts of RAM needed to run the 70B model effectively? How does the amount of RAM affect performance? For example, if the model takes up 70 GB of memory, will 96 GB of RAM give the same speed as 256 GB, or could the extra RAM actually improve performance?
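To make the question concrete, here is the back-of-envelope math I'm working from (my own assumptions about bytes per parameter, not official figures):

```python
# Rough memory estimate for a 70B-parameter model. My assumptions:
# fp16 = 2 bytes/param, int8 = 1 byte/param, 4-bit quantization = 0.5 bytes/param.
params = 70e9
for name, bytes_per_param in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB for weights alone (plus KV cache and runtime overhead)")
# fp16: ~140 GB, int8: ~70 GB, 4-bit: ~35 GB
```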
Is there a specific GPU configuration that would be optimal for this model? If I connect 5 GPUs through a mining rig, with the rig attached to my computer via four PCIe x1 slots and one PCIe x16 slot, will that improve performance significantly? I do not plan to train the model; I only need it for inference, including answering with context from additional documents.
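To put the slot speeds in perspective, here is my rough bandwidth math (assuming PCIe 3.0; the real numbers depend on the PCIe generation, and the per-GPU shard size below is hypothetical):

```python
# Rough PCIe transfer-time estimate. Assumption: PCIe 3.0, where a x1 link
# moves roughly 1 GB/s and a x16 link roughly 16 GB/s of usable bandwidth.
shard_gb = 14  # hypothetical per-GPU weight shard: 70 GB split across 5 GPUs
x1_gbps, x16_gbps = 1.0, 16.0
print(f"x16 link: ~{shard_gb / x16_gbps:.0f} s to load a shard")  # ~1 s
print(f"x1 link:  ~{shard_gb / x1_gbps:.0f} s to load a shard")   # ~14 s
# As I understand it, the narrow links mainly slow the initial weight load;
# per-token activation traffic between GPUs during inference is much smaller.
```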
Do I need to use PyTorch for this setup, or can I use another framework such as TensorFlow?
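For context, this is the kind of setup I had in mind; a minimal sketch assuming the Hugging Face transformers library with the PyTorch backend and the CodeLlama-70b-Instruct Hub ID (please correct me if the model ID or API differs):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-70b-Instruct-hf"  # Hub ID as I understand it
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # halves memory relative to fp32
    device_map="auto",          # shards layers across available GPUs (and CPU)
)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```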