experiment difference in paper #6

Open
lyklly opened this issue Nov 25, 2024 · 2 comments

Comments

@lyklly

lyklly commented Nov 25, 2024

What is the difference between the experiment in Section 3.3 (Multi-Objective Alignment Evaluation) and the experiment in Section 4.1 (Performance Trade-off Evaluation)?
Both evaluate on MT-Bench, HaluEval 2.0, and HackaPrompt, both put a preference token (e.g. Harmlessness:5) in the prompt, and both are scored by GPT-4. Why does the performance of CPO in Figure 4 differ from its performance in Table 1?

Table 1: (screenshot)

Figure 4: (screenshot)

@YijuGuo
Collaborator

YijuGuo commented Nov 26, 2024

Table 1 setup
For Mistral-7B-CPO-Harmful, the preference token "Harmlessness:5" is prepended to every prompt in MT-Bench, HaluEval 2.0, and HackaPrompt, and the responses are then evaluated by GPT-4.

Figure 4 setup
Figure 4(a) prepends <Helpfulness: 5> to each MT-Bench instruction and uses GPT-4 to score the resulting response on three dimensions: Helpfulness, Honesty, and Harmlessness.
Similarly, Figure 4(b) prepends <Honesty: 5> to each HaluEval 2.0 instruction, and Figure 4(c) prepends <Harmlessness: 5> to each HackaPrompt instruction.

Because Table 1 and Figure 4 use different preference tokens, the measured performance differs accordingly.
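
To make the two setups easier to compare, here is a minimal sketch of how the prompts differ. It is not taken from the repository: the helper `with_preference_token`, the dictionary names, and the exact `<Dimension: score> instruction` layout (angle brackets, spacing, a single space as separator) are assumptions based only on the tokens quoted in this thread.

```python
# Hypothetical names throughout; the token layout is an assumption based on
# the tokens quoted in this thread, not on the project's actual code.

BENCHMARKS = ["MT-Bench", "HaluEval 2.0", "HackaPrompt"]


def with_preference_token(instruction: str, dimension: str, score: int = 5) -> str:
    """Prepend a preference token such as <Harmlessness: 5> to an instruction."""
    return f"<{dimension}: {score}> {instruction}"


# Table 1 (Mistral-7B-CPO-Harmful): the same token on every benchmark.
TABLE_1_TOKENS = {name: "Harmlessness" for name in BENCHMARKS}

# Figure 4: one token per panel, as described in the reply above.
FIGURE_4_TOKENS = {
    "MT-Bench": "Helpfulness",      # Figure 4(a)
    "HaluEval 2.0": "Honesty",      # Figure 4(b)
    "HackaPrompt": "Harmlessness",  # Figure 4(c)
}

# Benchmarks where the two setups send different prompts to the model.
differing = [b for b in BENCHMARKS if TABLE_1_TOKENS[b] != FIGURE_4_TOKENS[b]]
print(differing)  # ['MT-Bench', 'HaluEval 2.0']

# Example: the same MT-Bench instruction under each setup.
instruction = "Explain what a hash table is."
print(with_preference_token(instruction, TABLE_1_TOKENS["MT-Bench"]))
print(with_preference_token(instruction, FIGURE_4_TOKENS["MT-Bench"]))
```

Under these assumptions, the prompts differ on MT-Bench and HaluEval 2.0 and coincide only on HackaPrompt, which is consistent with the explanation that the different tokens account for the different numbers.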

@lyklly
Author

lyklly commented Nov 26, 2024 via email
