Add Assertion Class Action #29

Hua-Wen · 2025-01-23T07:34:39Z

Feature request / 功能建议

At present, the model only provides execution actions; there is no action for judging page state. Adding assertion actions would make it possible to integrate with test frameworks.

Motivation / 动机

Integrate with automated testing frameworks to enable large-model-based automated testing.

Your contribution / 您的贡献

It seems that only fine-tuning is possible; I could not find a way to register actions through code. If such code exists, please point me to it, and I can implement this part.

zRzRzRzRzRzRzR · 2025-01-24T06:30:22Z

Yes, this model cannot perform judgment actions; all of its outputs are executions. I am not quite sure what kind of operation you mean by "page state judgment"?

leeaction · 2025-01-24T07:35:40Z

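Does this mean asserting the state of the page? For example, which page we are currently on, whether we were redirected to the correct page, whether certain elements exist on the page, whether those elements are in the correct positions, and so on.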

Hua-Wen · 2025-01-24T08:10:36Z

Yes.
For example, my task executes action1, action2, and action3, and then the LLM outputs end(), which ends the whole process. However, this does not necessarily mean that the task succeeded.

A concrete scenario, with the task "enter the account and password and log in":
action1: enter the account xxx,
action2: enter the password,
action3: click log in.
The model then outputs end(), even if the login did not actually succeed.

I would like to have the model actively check the result. For example, the page should show "login success", or the page should have jumped to the login success page. The model judges this and returns a bool result, and the code throws an assertion-failure exception at the appropriate time.
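The check requested above could be sketched roughly as follows. This is only an illustration under assumptions, not an existing CogAgent API: `PageAssertion`, `PageAssertionError`, `assert_page_state`, and the `Judge` signature are hypothetical names, and `keyword_judge` is a plain substring match standing in for a model that returns a bool.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


class PageAssertionError(AssertionError):
    """Raised when an expected page state is judged to be absent."""


@dataclass(frozen=True)
class PageAssertion:
    # Natural-language expectation, e.g. "login success".
    expectation: str


# Hypothetical judge: (observation of the page, expectation) -> bool.
# In practice this would wrap a model call; no such API is confirmed here.
Judge = Callable[[str, str], bool]


def assert_page_state(
    observation: str, checks: Sequence[PageAssertion], judge: Judge
) -> None:
    """Run every check after end() and raise if any expectation is not met."""
    failed = [c.expectation for c in checks if not judge(observation, c.expectation)]
    if failed:
        raise PageAssertionError("page state check failed: " + "; ".join(failed))


def keyword_judge(observation: str, expectation: str) -> bool:
    """Stand-in judge for this sketch: a substring match, not a model."""
    return expectation.lower() in observation.lower()


if __name__ == "__main__":
    checks = [PageAssertion("login success")]
    # Passes silently: the observed page text contains the expectation.
    assert_page_state("Welcome back! Login success.", checks, keyword_judge)
    try:
        assert_page_state("Invalid password, please retry.", checks, keyword_judge)
    except PageAssertionError as exc:
        print(exc)
```

Because `PageAssertionError` subclasses `AssertionError`, a test framework such as pytest or unittest would report it as an ordinary assertion failure, which is the integration point described in the request.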

zRzRzRzRzRzRzR · 2025-01-25T03:24:21Z

Oh, the issue you mentioned was indeed not fully resolved when we released this version of the model. Although, as the documentation states, you can use the continuation feature, it does not solve the problem well. We will continue to work on this in future model updates.
We have marked it.

jasonnoy · 2025-01-25T03:24:23Z

> Yes. For example, my task executed action1,action2,action3, and then LLM output end(). The whole process ends. However, this does not necessarily mean that my task was successfully executed.
>
> According to the actual scenario, task: enter the account password and log in, action1: enter the account xxx, Action2: enter the password, Action3: click log in. The model then outputs end. (Even if the page does not really land successfully)
>
> I hope to actively let the model check, for example, the page should appear "login success", the page should jump to the login success page, the model judges and returns bool type results, and the code throws an exception of assertion failure at the appropriate time.

Thank you for your advice. We are currently working on a feature similar to the one you mentioned, which requires more powerful general capabilities from CogAgent. Please stay tuned for our upcoming updates :)

zRzRzRzRzRzRzR self-assigned this Jan 24, 2025

zRzRzRzRzRzRzR added the enhancement label (New feature or request) Jan 25, 2025
