Deploying on Google Cloud Run #97

jruokola · 2024-10-11T04:32:08Z

I managed to deploy to GCP Cloud Run using your example. Question: what kind of payload does the model accept? I can see traffic arriving and activity on the Cloud Run side, but I get no response to this request:

curl -X 'POST' \
  'https://lxxxx.us-central1.run.app/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/llama-3-8b-instruct-l4:1.0",
    "messages": [{"role":"user", "content":"Write a limerick about the wonders of GPU computing."}],
    "max_tokens": 64
  }'
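
For anyone who wants to reproduce this outside of curl, here is a minimal Python sketch of the same request. It only mirrors the payload above (an OpenAI-style chat completions body) and does not confirm that this is the format the deployed model accepts. The helper names `build_chat_request` and `send` and the 120-second timeout are my own assumptions for illustration, and the base URL is the redacted placeholder from the command above.

```python
import json
import urllib.request

# Redacted placeholder from the curl command; substitute your own service URL.
BASE_URL = "https://lxxxx.us-central1.run.app"


def build_chat_request(base_url, model, prompt, max_tokens=64):
    """Build (but do not send) a POST request mirroring the curl payload."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=120):
    """Send the request and return the parsed JSON body.

    The timeout is an assumed value; an explicit limit makes a hung
    request fail with an error instead of waiting indefinitely.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(build_chat_request(BASE_URL, "nvidia/llama-3-8b-instruct-l4:1.0", "Write a limerick about the wonders of GPU computing."))` issues the same request as the curl command. With an explicit timeout, a request that never gets a reply raises an exception rather than hanging, which may help distinguish a slow response from no response at all.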
