Comments and tips for the prompt #55

SalvatoreRa · 2023-07-30T14:49:53Z

Hi,

very solid and useful works.

In this repository they suggest an approach similar to tree-of-thoughts but which should be done in one prompt

an example of this type of prompt:

Imagine three different experts are answering this question. All experts will write down 1 step of their thinking, then share it with the group. Then all experts will go on to the next step, etc. If any expert realises they're wrong at any point then they leave. The question is...

Another interesting approach has been described in this paper: Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models where they have collected impressive datasets about college questions. The authors decided to test different techniques such as self-critique, the chain of thought, and few-shot to see how these impact. In addition, the authors decided to test a new approach the authors call expert prompting. In short, the authors ask the model to nominate experts for a question and what the response of these experts would be. Finally, based on these responses make a collective decision.

example of expert prompting:

# from the official repository: https://github.com/idrori/MITQ/blob/main/code/experts.py
generic_expert = f"an MIT Professor of {department} teaching the {course_name} course"
You are " + generic_expert + f". Give an educated guess of who are three experts most capable of solving the following question.
\n Question: {question}.\n Return a comma-separated list of three names."

#example from the article

E = You are an MIT Professor of Computer Science and Mathematics teaching Calculus I.
P3 = Give an educated guess of the three experts most capable of solving this question.
System: You are E.
User: Solve Q

About CoT an interesting article about CoT has just been published by anthropic [here](Measuring Faithfulness in Chain-of-Thought Reasoning) that could be interesting to include in the review

The text was updated successfully, but these errors were encountered:

EliverQ · 2023-08-02T12:18:40Z

Thank you very much for your recognition of our work. We will make revisions in the next version.
We would like to include you in the acknowledgments. Could you please provide your name?

SalvatoreRa · 2023-08-02T12:53:09Z

thank you very much, my name is Salvatore Raieli

SalvatoreRa · 2023-10-22T13:06:34Z

Thank you, I have seen the updated version and I would suggest some new research that could be interesting to mention:

Mistral 7B has been released with the technical article. It claims it has better performance than LLaMA version 2 (7B and 13B) parameters. It is interesting to note they are used for reducing inference costs. They created this prompt as Guardrail enforcement (to avoid that the model generated dangerous answer):

Always assist with care, respect, and truth. Respond with utmost utility yet securely. Avoid harmful, unethical, prejudiced, or negative content. Ensure replies promote fairness and positivity.

They claim this is not impacting performance while avoiding the model is not responding to unsafe prompts:
Ref: https://arxiv.org/pdf/2310.06825.pdf

Another interesting topic that could be incorporated is machine unlearning (which is important since many LLMs are trained on copyright data, harmful content, or personal data). Microsoft recently showed how to make “forget” Harry Potter to LLaMA 7B. The authors used a reinforced model that is trained further on the target data to better identify the data you want your model to forget (the tokens related to the topic). Then they compare the logits between the baseline model and the reinforced model for the target data. They replace idiosyncratic expressions in the target data with generic counterparts (using GPT-4 for the mapping in a dictionary) and use the model to generate alternative labels for the tokens. Lastly, they fine-tune the model on these alternative labels, erasing the memory of target data from the model. This approach seems to work without affecting the model performance on the reasoning benchmark
Ref: https://arxiv.org/pdf/2310.02238.pdf

Another interesting article from Microsoft shows that there are some surprising failure of generalization for LLM: a model trained on a sentence of the form “A is B”, often fail to generalize to the reverse direction “B is A”
Ref: https://arxiv.org/abs/2309.12288v2

The author of this work finds an approach that finds a suffix that, when attached to a wide range of queries for an LLM to produce objectionable content, aims to maximize the probability that the model produces an affirmative response. The approached worked with different models (ChatGPT, Bard, and Claude but also open-source LLaMA-2-Chat, Pythia, Falcon). They developed an algorithm able to hijack the model constraint without the need of manual engineering
Ref: https://arxiv.org/abs/2307.15043

StevenTang1998 · 2023-10-22T13:12:22Z

Thanks again for your continued interest and valuable suggestions regarding our survey. We are currently undergoing a new revision, and it is expected to be updated in nearly one month.

SalvatoreRa · 2023-10-22T13:15:05Z

your work is actually amazing and thank you for keeping updated that incredible survey.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments and tips for the prompt #55

Comments and tips for the prompt #55

SalvatoreRa commented Jul 30, 2023

EliverQ commented Aug 2, 2023 •

edited

Loading

SalvatoreRa commented Aug 2, 2023

SalvatoreRa commented Oct 22, 2023

StevenTang1998 commented Oct 22, 2023

SalvatoreRa commented Oct 22, 2023

Comments and tips for the prompt #55

Comments and tips for the prompt #55

Comments

SalvatoreRa commented Jul 30, 2023

EliverQ commented Aug 2, 2023 • edited Loading

SalvatoreRa commented Aug 2, 2023

SalvatoreRa commented Oct 22, 2023

StevenTang1998 commented Oct 22, 2023

SalvatoreRa commented Oct 22, 2023

EliverQ commented Aug 2, 2023 •

edited

Loading