
glm-4v-transformer-9b fails after pulling the xinference image #2536

Closed
1 of 3 tasks
Erincrying opened this issue Nov 11, 2024 · 3 comments
Comments

@Erincrying

System Info

Deployed v16.3.0 via the Docker image.

Running Xinference with Docker?

  • [x] docker
  • [ ] pip install
  • [ ] installation from source

Version info

v16.3.0

The command used to start Xinference

Deployed via the web UI:
glm-4v/transformer/9b/4-bit

Reproduction

1. Deployed the latest xinference via the Docker image.
2. Ran glm-4v from the web UI; it started successfully and the model was visible on the GPU.
3. Calling the model from Dify failed.

The error message is as follows:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 270, in __getattr__
return self.data[item]
KeyError: 'images'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/transformers/core.py", line 696, in prepare_batch_inference
r.full_prompt = self._get_full_prompt(r.prompt, tools)
File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/transformers/glm4v.py", line 228, in _get_full_prompt
"images": inputs.images.squeeze(0),
File "/usr/local/lib/python3.10/dist-packages/transformers/tokenization_utils_base.py", line 272, in __getattr__
raise AttributeError
AttributeError
Destroy generator b90554989ff511efb6660242ac640003 due to an error encountered.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 419, in xoscar_next
r = await asyncio.create_task(_async_wrapper(gen))
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 409, in _async_wrapper
return await _gen.__anext__() # noqa: F821
File "/usr/local/lib/python3.10/dist-packages/xinference/core/model.py", line 474, in _to_async_gen
async for v in gen:
File "/usr/local/lib/python3.10/dist-packages/xinference/core/model.py", line 669, in _queue_consumer
raise RuntimeError(res[len(XINFERENCE_STREAMING_ERROR_FLAG) :])
RuntimeError
2024-11-10 22:25:19,198 xinference.api.restful_api 1 ERROR Chat completion stream got an error: [address=0.0.0.0:46020, pid=344]
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/xinference/api/restful_api.py", line 1974, in stream_results
async for item in iterator:
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 340, in __anext__
return await self._actor_ref.xoscar_next(self._uid)
File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/context.py", line 231, in send
return self._process_result_message(result)
File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/context.py", line 102, in _process_result_message
raise message.as_instanceof_cause()
File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/pool.py", line 659, in send
result = await self._run_coro(message.message_id, coro)
File "/usr/local/lib/python3.10/dist-packages/xoscar/backends/pool.py", line 370, in _run_coro
return await coro
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 384, in on_receive
return await super().on_receive(message) # type: ignore
File "xoscar/core.pyx", line 558, in on_receive
raise ex
File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.on_receive
async with self._lock:
File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.on_receive
with debug_async_timeout('actor_lock_timeout',
File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.on_receive
result = await result
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 431, in xoscar_next
raise e
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 419, in xoscar_next
r = await asyncio.create_task(_async_wrapper(gen))
File "/usr/local/lib/python3.10/dist-packages/xoscar/api.py", line 409, in _async_wrapper
return await _gen.__anext__() # noqa: F821
File "/usr/local/lib/python3.10/dist-packages/xinference/core/model.py", line 474, in _to_async_gen
async for v in gen:
File "/usr/local/lib/python3.10/dist-packages/xinference/core/model.py", line 669, in _queue_consumer
raise RuntimeError(res[len(XINFERENCE_STREAMING_ERROR_FLAG) :])
RuntimeError: [address=0.0.0.0:46020, pid=344]
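For context on the two chained tracebacks: the failing line `inputs.images.squeeze(0)` uses attribute access on the processor output, and `transformers.BatchEncoding.__getattr__` resolves unknown attributes against an internal dict, converting a missing key into `AttributeError`. That is why a `KeyError: 'images'` appears first, followed by `AttributeError`, whenever the processor did not emit an `images` key. A minimal pure-Python mimic of that lookup behavior (no transformers dependency; `BatchEncodingLike` is a hypothetical name for illustration):

```python
class BatchEncodingLike:
    """Mimics how transformers.BatchEncoding resolves attributes:
    unknown attributes are looked up in an internal dict, and a
    missing key is re-raised as AttributeError."""

    def __init__(self, data):
        self.data = data

    def __getattr__(self, item):
        # Only called when normal attribute lookup fails.
        try:
            return self.data[item]
        except KeyError:
            # This KeyError -> AttributeError conversion produces the
            # two chained tracebacks seen in the log above.
            raise AttributeError


inputs = BatchEncodingLike({"input_ids": [1, 2, 3]})

print(inputs.input_ids)  # key present: returns the stored value
try:
    inputs.images        # key absent: KeyError, re-raised as AttributeError
except AttributeError:
    print("no 'images' key in the processor output")
```

So the error indicates the processor output lacked an `images` entry for this request, rather than a GPU or deployment problem; a key-presence check (e.g. `"images" in inputs`) before the `.squeeze(0)` would be one defensive option on the xinference side.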

Expected behavior

Works normally.

@XprobeBot XprobeBot added the gpu label Nov 11, 2024
@XprobeBot XprobeBot added this to the v0.16 milestone Nov 11, 2024
@qinxuye
Contributor

qinxuye commented Nov 13, 2024

Is it 100% reproducible?

@Erincrying
Author

> Is it 100% reproducible?

After updating transformers this error no longer occurs, but #2523 still reproduces 100%.

@qinxuye
Contributor

qinxuye commented Nov 13, 2024

Then let's close this issue for now.

@qinxuye qinxuye closed this as completed Nov 13, 2024
3 participants