Add Assertion Class Action #29

Open
Hua-Wen opened this issue Jan 23, 2025 · 5 comments
Labels: enhancement (New feature or request)

Comments

@Hua-Wen

Hua-Wen commented Jan 23, 2025

Feature request

At present the model only provides execution-type actions; there is no action for judging page state. With assertion actions, it could be integrated with test frameworks.

Motivation

Integrate with automated testing frameworks to enable automated testing driven by large models.

Your contribution

It seems that new actions can only be added through fine-tuning; I could not find a way to register an action through code. If such code exists, please point me to it and I can implement this part.

zRzRzRzRzRzRzR self-assigned this Jan 24, 2025
@zRzRzRzRzRzRzR
Member

Yes, this model cannot perform judgment actions; all of its outputs are execution actions. However, I am not quite sure what kind of operation you mean by "page state judgment".

@leeaction

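Do you mean asserting the state of the page? For example: which page is currently displayed, whether navigation landed on the correct page, whether certain elements are present on the page, whether those elements are in the correct positions, and so on.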

@Hua-Wen
Author

Hua-Wen commented Jan 24, 2025

Yes.
For example, my task executes action1, action2, and action3, and then the LLM outputs end(), which ends the whole process. However, this does not necessarily mean that the task succeeded.

A concrete scenario, for the task "enter the account and password and log in":
action1: enter the account xxx,
action2: enter the password,
action3: click log in.
The model then outputs end() (even if the login did not actually succeed).

I would like the model to actively check the result: for example, that "login success" appears on the page, or that the page has navigated to the login success page. The model would make the judgment and return a bool result, and the code would throw an assertion-failure exception at the appropriate time.
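
The flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not CogAgent's API: `Judge`, `PageAssertionError`, `assert_page`, `run_task`, and `stub_judge` are all hypothetical names, and the judge here is a plain text-matching stub standing in for a model call that returns a bool.

```python
from typing import Callable, Iterable

# Hypothetical type: a judge receives the captured page (here raw bytes) and a
# natural-language expectation, and returns True or False. In a real setup this
# would wrap a model call; no such call exists in the project as far as this
# thread indicates.
Judge = Callable[[bytes, str], bool]


class PageAssertionError(AssertionError):
    """Raised when the judged page state does not match the expectation."""


def assert_page(judge: Judge, page: bytes, expectation: str) -> None:
    """Ask the judge for a bool verdict and raise if the expectation fails."""
    if not judge(page, expectation):
        raise PageAssertionError(f"page assertion failed: {expectation}")


def run_task(
    actions: Iterable[Callable[[], None]],
    capture_page: Callable[[], bytes],
    judge: Judge,
    expectation: str,
) -> None:
    """Run every action in order, then verify the final page state
    instead of treating end() as proof of success."""
    for action in actions:
        action()
    assert_page(judge, capture_page(), expectation)


def stub_judge(page: bytes, expectation: str) -> bool:
    """Stand-in for a model verdict: checks that the expected text is present."""
    return expectation.encode("utf-8") in page
```

Because `PageAssertionError` subclasses `AssertionError`, a test runner such as pytest or unittest would report a failed page check as an ordinary test failure.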

zRzRzRzRzRzRzR added the enhancement (New feature or request) label Jan 25, 2025
@zRzRzRzRzRzRzR
Member

Indeed, the issue you describe was not fully resolved when we released this version of the model. As the documentation states, you can use the continuation feature, but it does not solve this problem well. We will keep working on this in future model updates.
We have labeled this issue.

@jasonnoy

Thank you for the suggestion. We are currently working on a feature similar to the one you describe, which requires stronger general capabilities from CogAgent. Please stay tuned for our upcoming updates :)
