[Bug] pydantic_core._pydantic_core.ValidationError: 1 validation error for Response #328
Labels: bug
Checked other resources
Describe your current environment
I run main.py with:
$ CUDA_VISIBLE_DEVICES=3 python main.py --llm_name /mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B --max_gpu_memory '{"0": "40GB"}' --eval_device "cuda:0" --max_new_tokens 256 --use_backend vllm
Describe the bug
Main ID is: 473127
INFO 11-19 14:48:03 llm_engine.py:174] Initializing an LLM engine (v0.5.4) with config: model='/mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B', speculative_config=None, tokenizer='/mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, rope_scaling=None, rope_theta=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=8192, download_dir='/home/zhen1.zhang/2024_LLM_MA/hf_home', load_format=LoadFormat.AUTO, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=None, enforce_eager=False, kv_cache_dtype=auto, quantization_param_path=None, device_config=cuda, decoding_config=DecodingConfig(guided_decoding_backend='outlines'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None), seed=0, served_model_name=/mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B, use_v2_block_manager=False, enable_prefix_caching=False)
INFO 11-19 14:48:03 model_runner.py:720] Starting to load model /mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B...
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:00<00:01, 2.62it/s]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:00<00:00, 2.34it/s]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:00<00:00, 3.34it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:01<00:00, 3.04it/s]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [00:01<00:00, 2.93it/s]
INFO 11-19 14:48:05 model_runner.py:732] Loading model weights took 14.9595 GB
INFO 11-19 14:48:06 gpu_executor.py:102] # GPU blocks: 28092, # CPU blocks: 2048
INFO 11-19 14:48:07 model_runner.py:1024] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 11-19 14:48:07 model_runner.py:1028] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing `gpu_memory_utilization` or enforcing eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
INFO 11-19 14:48:21 model_runner.py:1225] Graph capturing finished in 14 secs.
[🤖/mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B] AIOS has been successfully initialized.
[example/academic_agent] Tell me what is the prollm paper mainly about?
[Scheduler] example/academic_agent is executing.
[🤖/mnt/nas/home/haiyu.bai/haiyu/2024/llama/llama3/Meta-Llama-3-8B] example/academic_agent is switched to executing.
Processed prompts: 100%|████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:03<00:00, 3.18s/it, est. speed input: 95.78 toks/s, output: 80.39 toks/s]
***** Result:
user[Thinking]: The workflow generated for the problem is [{"action_type": "tool_use", "action": "Search for relevant papers", "tool_use": ["arxiv/arxiv"]}, {"action_type": "chat", "action": "Provide responses based on the user's query", "tool_use": []}]. Follow the workflow to solve the problem step by step.
assistant
userAt step 1, you need to: Search for relevant papers. In and only in current step, you need to call tools. Available tools are: [{"type": "function", "function": {"name": "arxiv/arxiv", "description": "Query articles or topics in arxiv", "parameters": {"type": "object", "properties": {"query": {"type": "string", "description": "Input query that describes what to search in arxiv"}}, "required": ["query"]}}}]Must call functions that are available. To call a function, respond immediately and only with a list of JSON object of the following format:{[{"name":"function_name_value","parameters":{"parameter_name1":"parameter_value1","parameter_name2":"parameter_value2"}}]}
At step *****
Traceback (most recent call last):
File "/home/zhen1.zhang/2024_LLM_MA/AIOS/aios/scheduler/fifo_scheduler.py", line 53, in run_llm_syscall
response = self.llm.address_syscall(llm_syscall)
File "/home/zhen1.zhang/2024_LLM_MA/AIOS/aios/llm_core/llms.py", line 78, in address_syscall
return self.model.address_syscall(llm_syscall, temperature)
File "/home/zhen1.zhang/2024_LLM_MA/AIOS/aios/llm_core/llm_classes/vllm.py", line 90, in address_syscall
Response(
File "/home/zhen1.zhang/miniconda3/envs/aios_py3.10/lib/python3.10/site-packages/pydantic/main.py", line 175, in init
self.pydantic_validator.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Response
finished
Field required [type=missing, input_value={'response_message': None...', 'type': 'function'}]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.7/v/missing
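From the traceback, `vllm.py` (line 90) builds a `Response` without the required `finished` field, which pydantic rejects. A minimal sketch of the failure mode (the field names and types here are guesses inferred from the error output, not the actual AIOS `Response` model):

```python
# Minimal reproduction of the reported ValidationError.
# NOTE: field names/types are assumptions based on the traceback,
# not the real aios Response model.
from typing import Optional
from pydantic import BaseModel

class Response(BaseModel):
    response_message: Optional[str] = None
    finished: bool  # required: no default, so omitting it raises

Response(response_message=None)
# pydantic_core._pydantic_core.ValidationError: 1 validation error for Response
# finished
#   Field required [type=missing, ...]
```

If that is indeed the shape of the model, the fix would be for the vLLM backend to pass `finished` explicitly (e.g. `finished=True` once generation completes).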
^C[rank0]: Traceback (most recent call last):
[rank0]: File "/home/zhen1.zhang/2024_LLM_MA/AIOS/main.py", line 93, in
[rank0]: main()
[rank0]: File "/home/zhen1.zhang/2024_LLM_MA/AIOS/main.py", line 87, in main
[rank0]: await_agent_execution(agent_id)
[rank0]: File "/home/zhen1.zhang/2024_LLM_MA/AIOS/aios/hooks/modules/agent.py", line 74, in awaitAgentExecution
[rank0]: return future.result()
[rank0]: File "/home/zhen1.zhang/miniconda3/envs/aios_py3.10/lib/python3.10/concurrent/futures/_base.py", line 453, in result
[rank0]: self._condition.wait(timeout)
[rank0]: File "/home/zhen1.zhang/miniconda3/envs/aios_py3.10/lib/python3.10/threading.py", line 320, in wait
[rank0]: waiter.acquire()
[rank0]: KeyboardInterrupt
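The `KeyboardInterrupt` part is a secondary symptom: the scheduler thread dies on the `ValidationError`, the agent's future is never resolved, and the main thread blocks forever in `future.result()`, so the run only ends on Ctrl-C. A small sketch of that hang (the names here are illustrative, not the AIOS scheduler):

```python
# Sketch of the hang: a worker that raises without completing its
# future leaves result() waiting forever, as in awaitAgentExecution.
import concurrent.futures
import threading

fut = concurrent.futures.Future()

def worker():
    try:
        raise ValueError("simulated ValidationError")
    except Exception:
        pass  # exception never propagated to the future

threading.Thread(target=worker).start()
fut.result()  # blocks indefinitely; only Ctrl-C gets you out
```

Having the scheduler call `future.set_exception(exc)` on failure would let `result()` re-raise the error instead of hanging.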