diff --git a/images/Week5/Hallucination_Solution_&_Benefits_page_16.png b/images/Week5/Hallucination_Solution_&_Benefits_page_16.png new file mode 100644 index 0000000..3a18c8c Binary files /dev/null and b/images/Week5/Hallucination_Solution_&_Benefits_page_16.png differ diff --git a/images/Week5/Hallucination_Solution_&_Benefits_page_21.png b/images/Week5/Hallucination_Solution_&_Benefits_page_21.png new file mode 100644 index 0000000..554eb57 Binary files /dev/null and b/images/Week5/Hallucination_Solution_&_Benefits_page_21.png differ diff --git a/images/Week5/Hallucination_Solution_&_Benefits_page_28.png b/images/Week5/Hallucination_Solution_&_Benefits_page_28.png new file mode 100644 index 0000000..65f5783 Binary files /dev/null and b/images/Week5/Hallucination_Solution_&_Benefits_page_28.png differ diff --git a/images/Week5/Hallucination_Solution_&_Benefits_page_4.png b/images/Week5/Hallucination_Solution_&_Benefits_page_4.png new file mode 100644 index 0000000..857332b Binary files /dev/null and b/images/Week5/Hallucination_Solution_&_Benefits_page_4.png differ diff --git a/images/Week5/Hallucination_Solution_&_Benefits_page_5.png b/images/Week5/Hallucination_Solution_&_Benefits_page_5.png new file mode 100644 index 0000000..d6bb400 Binary files /dev/null and b/images/Week5/Hallucination_Solution_&_Benefits_page_5.png differ diff --git a/images/Week5/Hallucination_Solution_&_Benefits_page_7.png b/images/Week5/Hallucination_Solution_&_Benefits_page_7.png new file mode 100644 index 0000000..0d1bd95 Binary files /dev/null and b/images/Week5/Hallucination_Solution_&_Benefits_page_7.png differ diff --git a/images/Week5/Hallucination_page_4.png b/images/Week5/Hallucination_page_4.png new file mode 100644 index 0000000..e5bbf6f Binary files /dev/null and b/images/Week5/Hallucination_page_4.png differ diff --git a/images/Week5/Hallucination_page_6.png b/images/Week5/Hallucination_page_6.png new file mode 100644 index 0000000..1d4c1a5 Binary files /dev/null and b/images/Week5/Hallucination_page_6.png differ diff --git a/index.html b/index.html index 34b00ec..71b28d9 100644 --- a/index.html +++ b/index.html @@ -112,6 +112,12 @@ Recent Posts +
+ Week 5: Hallucination + + +
+
Week 4: Capabilities of LLMs @@ -217,12 +223,249 @@ + + + +

Week 5: Hallucination

+
+ + +
+ +
+

(see bottom for assigned readings and questions)

+

Hallucination (Week 5)

+

Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei

+

Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh

+

Wednesday, September 27th: Intro to Hallucination

+ + + + + + + +
People Hallucinate Too +

+ In general, hallucinations refer to the propagation of false information and/or misinformation. One common example of hallucinations is the Mandela Effect, where incorrect memories are shared by a large group of people. For instance, the paranormal researcher Fiona Broome reported a widespread false memory that Nelson Mandela had died in prison in the 1980s, which was untrue.

+
+ + + + + +
Hallucination Definition
+
+

+ In the context of LLMs, hallucination refers to a model generating seemingly plausible yet incorrect output, usually presented in a confident tone that makes users more susceptible to believing the result.

There are three types of hallucinations according to the “Siren’s Song in the AI Ocean” paper: (1) input-conflict, (2) context-conflict, and (3) fact-conflict. In class, there seemed to be several dissenting opinions about the definition of hallucination regarding LLMs. One classmate argued that alignment-based hallucination should not be considered part of the discussion scope, as the model would still be doing what it was intended to do (i.e., aligning with the user and/or aligning with the trainer).

  • Input-conflict: This +subcategory of hallucinations deviates from user +input. Input from the user can be separated into a task +instruction and a task input. An example of a task +instruction is a user prompting a model to make a +presentation for them. In this example, a task input could +be the research papers the user wanted the presentation to +be based on.
  • Context-conflict: Context-conflict hallucinations occur when a model generates contradictory information within a single response. A simple example of this would be replacing someone’s name (e.g., Silver) with another name (e.g., Stern).
  • +
  • Fact-conflict: This is the most common subcategory of hallucination, and it is therefore the main focus of recent research. An example of this could be returning the wrong date for a historical event.
  • +

+ + + + + + +

+
Sources of Hallucination
+
+

Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models +

+ + + + + +
Hallucination Risks +
+
+

Group Activity: Engage with ChatGPT to Explore Its Hallucinations (Three Groups Focusing on Different Hallucination Types)

+
+

+Group 1 focused on "Input-conflict Hallucination". One member narrated a story involving two characters in which one character murdered the other; ChatGPT, however, reached the opposite conclusion about which character was the murderer. Another member tried to exploit differences between languages, prompting in two distinct languages that share similar words.

+

+Group 2 concentrated on "Context-conflict Hallucination". They described four to five fictitious characters, detailing their interrelationships. Some relationships were deducible, yet the model frequently failed to make the complete set of deductions until explicitly prompted to be more complete.

+

+Group 3 delved into "Fact-conflict Hallucination". An illustrative example: when ChatGPT was queried with the fraction "⅓", it offered "0.333" as an approximation; however, when subsequently asked to multiply "0.3333" by "3", it confidently responded with "1" rather than "0.9999". Additional tests included translations between two languages.

+
+

Wednesday, October 4th: Hallucination Solutions

+ + + + + + +

+
Mitigation Strategies +

Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models +
+

+During the pretraining phase, it is essential to construct high-quality data to the extent feasible. A potential solution is to filter out machine-generated sources, especially when uncommon tokens appear. However, we can only use heuristic rules, which are not always effective at removing fake content.

+

+For supervised fine-tuning, there is typically a limited amount of data available for the instruction set. Some of the recommended solutions include manually removing problematic instructions and employing an honesty-oriented SFT approach. The term “honesty” can be misleading, as it is sometimes used to capture a much broader range of behaviors desired by the trainer.

+

+RLHF is an important alignment method that can also be used to mitigate hallucinations, for example by having human labelers place greater weight on “honest” answers.

+

+For inference, one strategy is to reduce the snowballing of hallucinations by using a dynamic top-p (nucleus sampling) threshold: p starts off large and shrinks as more tokens are generated, as sketched below. Furthermore, introducing new or external knowledge can be done at two different positions: before and after generation.

+
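The dynamic top-p idea can be made concrete with a short sketch. This is only an illustration, not the survey's exact algorithm: the decay schedule in `dynamic_top_p` and the toy next-token probabilities are assumptions; the point is that the nucleus threshold starts large and shrinks as more tokens are generated, so later tokens are sampled more conservatively.

```python
import random

def dynamic_top_p(step, p_start=0.95, p_min=0.3, decay=0.9):
    """Nucleus threshold that starts large and shrinks with each generated token."""
    return max(p_min, p_start * (decay ** step))

def nucleus_sample(token_probs, p):
    """Sample from the smallest set of top tokens whose cumulative probability >= p."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    nucleus, total = [], 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        total += prob
        if total >= p:
            break
    tokens, weights = zip(*nucleus)
    return random.choices(tokens, weights=weights, k=1)[0]

# Toy next-token distribution standing in for the model's softmax output.
probs = {"Olympia": 0.45, "Seattle": 0.35, "Tacoma": 0.15, "Spokane": 0.05}
for step in range(5):
    p = dynamic_top_p(step)
    print(f"step={step} top-p={p:.2f} token={nucleus_sample(probs, p)}")
```

In a real decoder, the same shrinking schedule would be applied to the model's softmax output at every generation step.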
+ + + + + + +
+Decoding Contrasting Layers
+

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

+
+

+ Building on recent trends in decoding research, the concept of contrastive decoding across layers is introduced. For example, when the model is asked for the capital of Washington State, how does it decide between "Seattle" and "Olympia"? Treating the last layer as the mature layer, it is contrasted with the preceding, premature layers: for each premature layer, the difference between its next-token probability distribution and that of the mature layer is measured with the Jensen-Shannon Divergence, and the most divergent premature layer is used for the contrast. Such an approach amplifies the factual knowledge the model has acquired in its later layers, thereby enhancing its output generation.

+
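A minimal sketch of the layer-contrasting idea, under the assumption of toy next-token distributions over a two-word vocabulary (a real implementation would read these from the model's early-exit heads): the premature layer with the largest Jensen-Shannon divergence from the mature layer is selected, its log-probabilities are subtracted from the mature layer's, and the result is renormalized.

```python
import math

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def dola_contrast(mature, premature_layers):
    # 1. Dynamically select the premature layer that diverges most from the mature layer.
    chosen = max(premature_layers, key=lambda prem: js_divergence(mature, prem))
    # 2. Contrast log-probabilities and renormalize (plausibility constraint omitted).
    scores = [math.log(pm) - math.log(pp) for pm, pp in zip(mature, chosen)]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]

# Toy vocabulary ["Olympia", "Seattle"]: the mature (final) layer already leans
# toward the factual answer, while earlier layers prefer the distractor.
mature = [0.6, 0.4]
premature_layers = [[0.30, 0.70], [0.45, 0.55]]
print(dola_contrast(mature, premature_layers))  # "Olympia" is amplified beyond 0.6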
+ + + + + + +
+

In-Context Retrieval-Augmented Language Models

+

+ The model parameters are kept frozen. Instead of feeding text directly into the model, the approach first uses retrieval to search for relevant documents from external sources; the retrieved findings are then concatenated with the original text. Re-ranking the retrieval results also provides benefits (the exact perplexity numbers are given on the slide). It has been observed that smaller retrieval strides, meaning retrieval is refreshed more frequently during generation, can enhance performance, albeit at the cost of increased runtime. The authors also noticed that the information at the end of the input is typically the most relevant for output generation; in general, shorter retrieval queries tend to outperform longer ones.

+
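A minimal sketch of the retrieve-then-prepend loop with a frozen model. The `retrieve` word-overlap scorer, the `toy_lm` stand-in, and the two-document corpus are assumptions for illustration; a real system would use a proper retriever (e.g., BM25) and an actual language model. What the sketch shows is the control flow: retrieval is refreshed every `stride` tokens using a short query built from the most recent tokens, and the retrieved text is simply prepended to the input.

```python
def retrieve(query, corpus):
    """Toy lexical retriever: pick the document with the most word overlap with the query."""
    words = set(query.lower().split())
    return max(corpus, key=lambda doc: len(words & set(doc.lower().split())))

def toy_lm(context):
    """Stand-in for a frozen LM: just parrot the first word of its context."""
    return context.split()[0]

def generate_with_retrieval(prompt, corpus, max_tokens=8, stride=2, query_len=4):
    tokens = prompt.split()
    doc = ""
    for step in range(max_tokens):
        if step % stride == 0:                     # smaller stride = more frequent retrieval
            query = " ".join(tokens[-query_len:])  # short query from the most recent tokens
            doc = retrieve(query, corpus)
        context = doc + "\n" + " ".join(tokens)    # prepend retrieved text; LM stays frozen
        tokens.append(toy_lm(context))
    return " ".join(tokens)

corpus = ["Olympia is the capital of Washington State.",
          "Seattle is the largest city in Washington State."]
print(generate_with_retrieval("The capital of Washington State is", corpus))
```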
+ + + + + + + +
Benefits of Hallucinations +
+

Discussion: Based on the sources of hallucination, what methods can be employed to mitigate the hallucination issue?

+

+ Group 2: + Discussed two papers from this week's reading which highlighted the use of semantic search and the introduction of external context to aid the model. This approach, while useful for diminishing hallucination, heavily depends on external information, which is not effective in generic cases. + Further strategies discussed were automated prompt engineering, optimization of user-provided context (noting that extensive contexts can induce hallucination), and using filtering or attention mechanisms to limit the tokens the model processes. + From the model's perspective, it is beneficial to employ red-teaming, explore corner cases, and pinpoint domains where hallucinations are prevalent. + Notably, responses can vary for an identical prompt. A proposed solution is to generate multiple responses to the same prompt and amalgamate them, perhaps through a majority voting system, to eliminate low-probability hallucinations. +

+
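Group 2's multiple-response idea can be sketched as majority voting over sampled answers, in the spirit of self-consistency decoding. The `sample_model` function and its answer probabilities are hypothetical stand-ins for repeated calls to an LLM at a nonzero temperature.

```python
import random
from collections import Counter

def sample_model(prompt):
    """Hypothetical stand-in for sampling an LLM at temperature > 0:
    usually answers correctly, occasionally hallucinates."""
    return random.choices(["Olympia", "Seattle", "Tacoma"], weights=[0.7, 0.2, 0.1], k=1)[0]

def majority_vote(prompt, n_samples=9):
    answers = [sample_model(prompt) for _ in range(n_samples)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n_samples  # winning answer plus a rough agreement score

print(majority_vote("What is the capital of Washington State?"))
```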

+ Group 1: + Discussed the scarcity of alternatives to the current training dataset. + Like Group 2, they also explored the idea of generating multiple responses but suggested allowing the user to select from the array of choices. + Another approach discussed was the model admitting uncertainty, stating "I don’t know", rather than producing a hallucination. +

+

+ Group 3: Addressed inconsistencies in the training data. + Emphasized the importance of fine-tuning and ensuring the use of contemporary data. + However, they noted that fine-tuning doesn't ensure exclusion of outdated data. + It was also advised to source data solely from credible sources. + An interesting perspective discussed was utilizing a larger model to verify the smaller model's hallucinations. But a caveat arises: How can one ensure the larger model's accuracy? And if the larger model is deemed superior, why not employ it directly? +

+

Discussion: What are the potential advantages of hallucinations in Large Language Models (LLMs)?

+

+ One advantage discussed was that hallucinations "train" users to not blindly trust the model outputs. If such models are blindly trusted, there is a much greater risk associated with their use. + If users can conclusively discern, however, that the produced information is fictitious, it could assist in fostering new ideas or fresh perspectives on a given topic. + Furthermore, while fake data has potential utility in synthetic data generation, there's a pressing need to remain vigilant regarding the accuracy and plausibility of the data produced. +

+

Readings

+

For the first class (9/27)

+

Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang et al. Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models. September 2023. https://arxiv.org/abs/2309.01219

+

For the second class (10/4)

+

Choose one (or more) of the following papers to read:

+ +

Optional Additional Readings

+

Overview of hallucination

+ +

How to reduce hallucination: Retrieval-augmented LLM

+ +

How to reduce hallucination: Decoding strategy

+ +

Hallucination is not always harmful: Possible use cases of hallucination

+ +

Discussion Questions

+

Everyone who is not in either the lead or blogging team for the week should post (in the comments below) an answer to at least one of the four questions in each section, or a substantive response to someone else’s comment, or something interesting about the readings that is not covered by these questions.

+

Don’t post duplicates - if others have already posted, you should read their responses before adding your own. Please post your responses to different questions as separate comments.

+

First section (1 - 4): Before 5:29pm on Tuesday 26 September.
+Second section (5 - 9): Before 5:29pm on Tuesday 3 October.

+

Questions for the first class (9/27)

+
    +
  1. What are the risks of hallucinations, especially when LLMs are used in critical applications such as autonomous vehicles, medical diagnosis, or legal analysis?
  2. +
  3. What are some potential long-term consequences of allowing LLMs to generate fabricated information without proper detection and mitigation measures in place?
  4. +
  5. How can we distinguish between legitimate generalization or “creative writing” and hallucination? Where is the line between expanding on existing knowledge and creating entirely fictional information, and what are the consequences on users?
  6. +
+

Questions for the second class (10/4)

+
    +
  1. The required reading presents two methods for reducing hallucinations, i.e., introducing external knowledge and designing better decoding strategies. Can you brainstorm or refer to optional readings to explore ways to further mitigate hallucinations? If so, could you elaborate more on your ideas and also discuss the challenges and risks associated with them?
  2. +
  3. Among all the mitigation strategies for addressing hallucination (including those introduced in the reading material from the first class), which one do you find most promising, and why?
  4. +
  5. Do retrieval-augmented LLMs pose any risks or potential negative consequences despite their ability to mitigate LLM hallucinations through the use of external knowledge?
  6. +
  7. The method proposed by DoLa seems quite simple but effective. Where do you think the authors of DoLa get the inspiration for their idea?
  8. +
+ +
+
+

Week 4: Capabilities of LLMs

@@ -233,7 +476,8 @@

Week 4: Capabilities of LLMs

-

Capabilities of LLMs (Week 4)

+

(see bottom for assigned readings and questions)

+

Capabilities of LLMs (Week 4)

Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan

Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce

Monday, September 18

@@ -366,6 +610,51 @@

Discussion

How can we refine and improve LLMs like Med-PaLM2 to be more effective in healthcare applications?

+

Readings

+

Monday:

+
    +
  1. +

    Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712. [PDF]

    +
  2. +
  3. +

    OpenAI. GPT-4 Technical Report. March 2023. https://arxiv.org/abs/2303.08774 [PDF]

    +
  4. +
+

Optionally, also explore https://openai.com/blog/chatgpt-plugins.

+

Wednesday:

+
    +
  1. Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, Kevin Clark, Stephen Pfohl, Heather Cole-Lewis, Darlene Neal, Mike Schaekermann, Amy Wang, Mohamed Amin, Sami Lachgar, Philip Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise Aguera y Arcas, Nenad Tomasev, Yun Liu, Renee Wong, Christopher Semturs, S. Sara Mahdavi, Joelle Barral, Dale Webster, Greg S. Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, Vivek Natarajan. Towards Expert-Level Medical Question Answering with Large Language Models +https://arxiv.org/abs/2305.09617 [PDF]
  2. +
+

Optional Readings:

+ +

Discussion for Monday:

+

Everyone who is not in either the lead or blogging team for the week should post (in the comments below) an answer to at least one of the questions in this section, or a substantive response to someone else’s comment, or something interesting about the readings that is not covered by these questions. Don’t post duplicates - if others have already posted, you should read their responses before adding your own. Please post your responses to different questions as separate comments.

+

You should post your initial response before 5:29pm on Sunday, September 17, but feel free (and encouraged!) to continue the discussion after that, including responding to any responses by others to your comments.

+
    +
  1. Based on the criteria shown in Figure 2 of [1], imagine a practical scenario and explain why you would or would not choose to use LLMs for your scenario.
  2. +
  3. Are plug-ins the future of AGI? Do you think that a company should focus only on building powerful AI systems that do not need any support from plug-ins, or should they focus only on the core system and involve more plug-ins in the ecosystem?
  4. +
+

Discussion for Wednesday:

+

You should post your initial response to one of the questions below or something interesting related to the Wednesday readings before 5:29pm on Tuesday, September 19.

+
    +
  1. +

    What should we do before deploying LLMs in medical diagnosis applications? What (if any) regulations should control or limit how they would be used?

    +
  2. +
  3. +

    With LLMs handling sensitive medical information, how can patient privacy and data security be maintained? What policies and safeguards should be in place to protect patient data?

    +
  4. +
  5. +

    The paper discusses the progress of LLMs towards achieving physician-level performance in medical question answering. What are the potential implications of LLMs reaching or surpassing human expertise in medical knowledge?

    +
  6. +
  7. +

    The paper mentions the importance of safety and minimizing bias in LLM-generated medical information, and the optional reading reports on some experiments that show biases in GPT’s medical diagnoses. Should models be tuned to ignore protected attributes? Should we prevent models from being used in medical applications until these problems can be solved?

    +
  8. +

    diff --git a/index.xml b/index.xml index 5a16520..2e57ec0 100644 --- a/index.xml +++ b/index.xml @@ -8,17 +8,29 @@ en-us evans@virginia.edu (David Evans) evans@virginia.edu (David Evans) - Mon, 25 Sep 2023 00:00:00 +0000 + Wed, 04 Oct 2023 00:00:00 +0000 + + Week 5: Hallucination + https://llmrisks.github.io/week5/ + Wed, 04 Oct 2023 00:00:00 +0000 + evans@virginia.edu (David Evans) + https://llmrisks.github.io/week5/ + (see bottom for assigned readings and questions) +Hallucination (Week 5) Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei +Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh +Wednesday, September 27th: Intro to Hallucination People Hallucinate Too In general, hallucinations refer to the propagation of false information and/or misinformation. One common example of hallucinations is the Mandela Effect, where incorrect memories are shared by a large group of people. + + Week 4: Capabilities of LLMs https://llmrisks.github.io/week4/ Mon, 25 Sep 2023 00:00:00 +0000 evans@virginia.edu (David Evans) https://llmrisks.github.io/week4/ - Capabilities of LLMs (Week 4) Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan + (see bottom for assigned readings and questions) +Capabilities of LLMs (Week 4) Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce -Monday, September 18 Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712 -This discussion was essential to highlight the distinction between large language models (LLMs) and fine-tuned models. +Monday, September 18 Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712 diff --git a/post/index.html b/post/index.html index cf09fd8..b295a43 100644 --- a/post/index.html +++ b/post/index.html @@ -83,6 +83,23 @@
    +

    Week 5: Hallucination

    + + + +(see bottom for assigned readings and questions) +Hallucination (Week 5) Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei +Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh +Wednesday, September 27th: Intro to Hallucination People Hallucinate Too In general, hallucinations refer to the propagation of false information and/or misinformation. One common example of hallucinations is the Mandela Effect, where incorrect memories are shared by a large group of people. +

    Read More…

    + + +

    Week 4: Capabilities of LLMs

    +(see bottom for assigned readings and questions) Capabilities of LLMs (Week 4) Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce Monday, September 18 Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712 -This discussion was essential to highlight the distinction between large language models (LLMs) and fine-tuned models.

    Read More…

    diff --git a/post/index.xml b/post/index.xml index 3f8744a..23d2cf6 100644 --- a/post/index.xml +++ b/post/index.xml @@ -8,17 +8,29 @@ en-us evans@virginia.edu (David Evans) evans@virginia.edu (David Evans) - Mon, 25 Sep 2023 00:00:00 +0000 + Wed, 04 Oct 2023 00:00:00 +0000 + + Week 5: Hallucination + https://llmrisks.github.io/week5/ + Wed, 04 Oct 2023 00:00:00 +0000 + evans@virginia.edu (David Evans) + https://llmrisks.github.io/week5/ + (see bottom for assigned readings and questions) +Hallucination (Week 5) Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei +Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh +Wednesday, September 27th: Intro to Hallucination People Hallucinate Too In general, hallucinations refer to the propagation of false information and/or misinformation. One common example of hallucinations is the Mandela Effect, where incorrect memories are shared by a large group of people. + + Week 4: Capabilities of LLMs https://llmrisks.github.io/week4/ Mon, 25 Sep 2023 00:00:00 +0000 evans@virginia.edu (David Evans) https://llmrisks.github.io/week4/ - Capabilities of LLMs (Week 4) Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan + (see bottom for assigned readings and questions) +Capabilities of LLMs (Week 4) Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce -Monday, September 18 Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712 -This discussion was essential to highlight the distinction between large language models (LLMs) and fine-tuned models. +Monday, September 18 Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712 diff --git a/sitemap.xml b/sitemap.xml index 09a0107..aa0cc72 100644 --- a/sitemap.xml +++ b/sitemap.xml @@ -3,10 +3,13 @@ xmlns:xhtml="http://www.w3.org/1999/xhtml"> https://llmrisks.github.io/post/ - 2023-09-25T00:00:00+00:00 + 2023-10-04T00:00:00+00:00 https://llmrisks.github.io/ - 2023-09-25T00:00:00+00:00 + 2023-10-04T00:00:00+00:00 + + https://llmrisks.github.io/week5/ + 2023-10-04T00:00:00+00:00 https://llmrisks.github.io/week4/ 2023-09-25T00:00:00+00:00 diff --git a/src/content/post/week4.md b/src/content/post/week4.md index 49cf585..fb935f3 100644 --- a/src/content/post/week4.md +++ b/src/content/post/week4.md @@ -5,6 +5,8 @@ title = "Week 4: Capabilities of LLMs" slug = "week4" +++ +(see bottom for assigned readings and questions) + # Capabilities of LLMs (Week 4) Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan @@ -200,3 +202,44 @@ drew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, and Noah Fiedel. Palm: Scaling language modeling with pathways, 2022. https://arxiv.org/abs/2204.02311 + +## Readings + +**Monday:** + +1. Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. 
[_Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond_](https://arxiv.org/abs/2304.13712). April 2023. [https://arxiv.org/abs/2304.13712](https://arxiv.org/abs/2304.13712). [[PDF](https://arxiv.org/pdf/2304.13712.pdf)] + +2. OpenAI. [_GPT-4 Technical Report_](https://arxiv.org/abs/2303.08774). March 2023. [https://arxiv.org/abs/2303.08774](https://arxiv.org/abs/2303.08774) [[PDF](https://arxiv.org/pdf/2303.08774.pdf)] + +Optionally, also explore [https://openai.com/blog/chatgpt-plugins](https://openai.com/blog/chatgpt-plugins). + +**Wednesday:** + +3. Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, Kevin Clark, Stephen Pfohl, Heather Cole-Lewis, Darlene Neal, Mike Schaekermann, Amy Wang, Mohamed Amin, Sami Lachgar, Philip Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise Aguera y Arcas, Nenad Tomasev, Yun Liu, Renee Wong, Christopher Semturs, S. Sara Mahdavi, Joelle Barral, Dale Webster, Greg S. Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, Vivek Natarajan. [_Towards Expert-Level Medical Question Answering with Large Language Models_](https://arxiv.org/abs/2305.09617) +[https://arxiv.org/abs/2305.09617](https://arxiv.org/abs/2305.09617) [[PDF](https://arxiv.org/pdf/2305.09617.pdf)] + +Optional Readings: +- Harsha Nori, Nicholas King, Scott Mayer McKinney, Dean Carignan, Eric Horvitz. [_Capabilities of GPT-4 on Medical Challenge Problems_](https://arxiv.org/abs/2303.13375). March 2023. [https://arxiv.org/abs/2303.13375](https://arxiv.org/abs/2303.13375) +- Travis Zack, Eric Lehman, Mirac Suzgun, Jorge A. Rodriguez, Leo Anthony Celi, Judy Gichoya, Dan Jurafsky, Peter Szolovits, David W. Bates, Raja-Elie E. Abdulnour, Atul J. Butte, Emily Alsentzer. [_Coding Inequity: Assessing GPT-4’s Potential for Perpetuating Racial and Gender Biases in Healthcare_](https://www.medrxiv.org/content/10.1101/2023.07.13.23292577). July 2023. +[https://www.medrxiv.org/content/10.1101/2023.07.13.23292577](https://www.medrxiv.org/content/10.1101/2023.07.13.23292577) — This article relates to the underlying biases in the models we talked about this week, but with an application that show clear potential harm resulting from these biases in the form if increased risk of medical misdiagnosis. + +## Discussion for Monday: + +Everyone who is not in either the lead or blogging team for the week should post (in the comments below) an answer to at least one of the questions in this section, or a substantive response to someone else's comment, or something interesting about the readings that is not covered by these questions. Don't post duplicates - if others have already posted, you should read their responses before adding your own. Please post your responses to different questions as separate comments. + +You should post your _initial_ response before 5:29pm on Sunday, September 17, but feel free (and encouraged!) to continue the discussion after that, including responding to any responses by others to your comments. + +1. Based on the criterions shown in Figure 2 of [1], imagine a practical scenario and explain why you would choose or not choose using LLMs for your scenario. +2. Are plug-ins the future of AGI? Do you think that a company should only focus on building powerful AI systems that does not need any support from plug-ins, or they should only focus on the core system and involve more plug-ins into the ecosystem? 
+ +## Discussion for Wednesday: + +You should post your _initial_ response to one of the questions below or something interesting related to the Wednesday readings before 5:29pm on Tuesday, September 19. + +1. What should we do before deploying LLMs in medical diagnosis applications? What (if any) regulations should control or limit how they would be used? + +2. With LLMs handling sensitive medical information, how can patient privacy and data security be maintained? What policies and safeguards should be in place to protect patient data? + +3. The paper discusses the progress of LLMs towards achieving physician-level performance in medical question answering. What are the potential implications of LLMs reaching or surpassing human expertise in medical knowledge? + +4. The paper mentions the importance of safety and minimizing bias in LLM-generated medical information, and the [optional reading](https://www.medrxiv.org/content/10.1101/2023.07.13.23292577) reports on some experiments that show biases in GPT's medical diagnoses. Should models be tuned to ignore protected attributes? Should we prevent models from being used in medical applications until these problems can be solved? diff --git a/src/content/post/week5.md b/src/content/post/week5.md index 041e276..50544e8 100644 --- a/src/content/post/week5.md +++ b/src/content/post/week5.md @@ -5,7 +5,7 @@ title = "Week 5: Hallucination" slug = "week5" +++ - +(see bottom for assigned readings and questions) # Hallucination (Week 5) @@ -19,51 +19,62 @@ slug = "week5" - Figure 1 (Presentation Slides): People Hallucinate Too -
    + + People Hallucinate Too

    In general, hallucinations refer to the propagation of false information and/or misinformation. One common example of hallucinations is the Mandela Effect, where incorrect memories are shared by a large group of people. For instance, a paranormal researcher, Fiona Broome, reported a widespread misremembering of a tragedy that Mandela died in prison in the 1980’s, which was untrue.

    + - + -
    Figure 2 (Presentation Slides): Hallucination Definition +
    Hallucination Definition

    - In the context of LLMs, hallucinations are a phenomenon that refer to a model’s seemingly plausible generated output, usually presented in a confident tone which makes users more susceptible to believing the result. There are three types of hallucinations according to the “Siren's Song in the AI Ocean” paper: (1) input-conflict, (2) context-conflict, and (3) fact-conflict. - In class, there seemed to be several dissenting opinions about the definition of hallucination regarding LLMs. One classmate argued how alignment-based hallucination should not be considered as part of the discussion scope, as the model would still be doing what it was intended to be doing (i.e. aligning with the user and/or aligning with the trainer). -

    -
      -
    • Input-conflict: This subcategory of hallucinations deviates from user input. Input from the user can be separated into a task instruction and a task input. An example of a task instruction is a user prompting a model to make a presentation for them. In this example, a task input could be the research papers the user wanted the presentation to be based off of.
    • -
    • Context-conflict: Context-conflict hallucinations occur when a model generates contradicting information within a response. A simple example of this would be replacing someone’s name (ex. Silver) for another name (ex. Stern).
    • -
    • Fact-conflict: This is the most common subcategory of hallucination, thus making it the most recent focus of research. An example of this could be returning the wrong date for a historical event.
    • -
    -
    + In the context of LLMs, hallucinations are a phenomenon that refer to a model’s seemingly plausible generated output, usually presented in a confident tone which makes users more susceptible to believing the result. + +There are three types of hallucinations according to the “Siren's Song + in the AI Ocean” paper: (1) input-conflict, (2) + context-conflict, and (3) fact-conflict. In class, there + seemed to be several dissenting opinions about the + definition of hallucination regarding LLMs. One classmate + argued how alignment-based hallucination should not be + considered as part of the discussion scope, as the model + would still be doing what it was intended to be doing + (i.e. aligning with the user and/or aligning with the + trainer).

    • Input-conflict: This + subcategory of hallucinations deviates from user + input. Input from the user can be separated into a task + instruction and a task input. An example of a task + instruction is a user prompting a model to make a + presentation for them. In this example, a task input could + be the research papers the user wanted the presentation to + be based on.
    • Context-conflict: + Context-conflict hallucinations occur when a model + generates contradicting information within a response. A + simple example of this would be replacing someone’s name + (ex. Silver) for another name (ex. Stern).
    • +
    • Fact-conflict: This is the most common + subcategory of hallucination, thus making it the most + recent focus of research. An example of this could be + returning the wrong date for a historical event.
    • +
    - + +
    - Figure 3 (Presentation Slides): Sources of Hallucination - - - Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models - -
    -

    - There are several causes for the phenomenon. First, there can be inconsistencies in the training data. The model might not cover certain knowledge during its training phase or may be trained with incorrect information. -

    -

    - Secondly, the alignment process can be misleading. For instance, the model might respond to an instruction even when it lacks the necessary knowledge, leading it to fabricate information. Additionally, there is a tendency in Large Language Models (LLM) to favor the user’s perspective, which can result in the generation of incorrect knowledge. -

    -

    - Lastly, risks inherent in the generation strategy can also be a cause. Sequential token generation can lead to local optimization. This in turn leads to strong self-consistency, which can cause a phenomenon called "hallucination snowballing". An LLM might adhere to a particular point, even if it is erroneous. -

    +
    Sources of Hallucination
    +
    + + _Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models_
    @@ -71,7 +82,7 @@ slug = "week5" - Figure 4 (Presentation Slides): Hallucination Risks + Hallucination Risks @@ -93,14 +104,16 @@ slug = "week5" -# Wednesday, October 4th: Hallucination Solutions and Benefits +# Wednesday, October 4th: Hallucination Solutions - + +
    - Figure 5 (Presentation Slides): Mitigation Strategies - Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models +
    Mitigation Strategies + +_Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models_

    During the pretraining phase, it is essential to feasibly construct high-quality data. A potential solution is to filter out machine-generated sources, especially when tokens are uncommon. However, we can only use heuristic rules, which are not always effective at removing fake content. @@ -115,16 +128,21 @@ slug = "week5" For inference, one strategy is to reduce the snowballing of hallucinations by designing a dynamic p-value. The p-value should start off large and shrink as more tokens are generated. Furthermore, introducing new or external knowledge can be done at two different positions: before and after generation.

    - + @@ -135,10 +153,12 @@ slug = "week5" + @@ -146,21 +166,22 @@ slug = "week5"
    - Figure 6 (Presentation Slides): This figure presents a potential solution titled "DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models". -
    -

    +

    +Decoding Contrasting Layers
    + +_DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models_ + +
    +

    Based on evolving trends, the concept of contrastive decoding is introduced. For example, one might ask, "How do we decide between Seattle or Olympia?" When considering the last layer as a mature layer, it is beneficial to contrast the differences between the preceding layers, which can be deemed as premature. For each of these layers, it is possible to calculate the difference between each probability distribution by comparing mature and premature layers, a process that utilizes the Jensen-Shannon Divergence. Such an approach permits the amplification of the factual knowledge that the model has acquired, thereby enhancing its output generation.

    - Figure 7: Presentation Slides - Potential Solution: In-Context Retrieval-Augmented Language Models -
    -

    + +_In-Context Retrieval-Augmented Language Models_ + +

    The model parameters are kept frozen. Instead of directly inputting text into the model, the approach first uses retrieval to search for relevant documents from external sources. The findings from these sources are then concatenated with the original text. Re-ranking results from the retrieval model also provides benefits; the exact perplexities can be referred to in the slide. It has been observed that smaller strides can enhance performance, albeit at the cost of increased runtime. The authors have noticed that the information at the end of a query is typically more relevant for output generation. In general, shorter queries tend to outperform longer ones.

    - + - + + +
    - Figure 7: Presentation Slides. Benefits of Hallucinations. -

    - Creative and divergent thinking, as seen in DreamGPT, aids in the generation of new ideas. Additionally, data augmentation can be used to produce health records without concerns for patient privacy. However, it is worth noting that it remains uncertain whether privacy is genuinely maintained in such instances where information might be hallucinated. -

    +
    Benefits of Hallucinations
    -

    Discussion 1: Based on the sources of hallucination, what methods can be employed to mitigate the hallucination issue?

    + +## Discussion: _Based on the sources of hallucination, what methods can be employed to mitigate the hallucination issue?_ +

    Group 2: Discussed two papers from this week's reading which highlighted the use of semantic search and the introduction of external context to aid the model. This approach, while useful for diminishing hallucination, heavily depends on external information, which is not effective in generic cases. - Further strategies discussed were automated prompt engineering, optimization of user-provided context (noting that extensive contexts can induce hallucination), and the implementation of filtering or attention mechanisms to limit the tokens the model processes. + Further strategies discussed were automated prompt engineering, optimization of user-provided context (noting that extensive contexts can induce hallucination), and using filtering or attention mechanisms to limit the tokens the model processes. From the model's perspective, it is beneficial to employ red-teaming, explore corner cases, and pinpoint domains where hallucinations are prevalent. Notably, responses can vary for an identical prompt. A proposed solution is to generate multiple responses to the same prompt and amalgamate them, perhaps through a majority voting system, to eliminate low-probability hallucinations.

    @@ -178,15 +199,72 @@ slug = "week5" An interesting perspective discussed was utilizing a larger model to verify the smaller model's hallucinations. But a caveat arises: How can one ensure the larger model's accuracy? And if the larger model is deemed superior, why not employ it directly?

    -

    Discussion 2: What are the potential advantages of hallucinations in Large Language Models (LLMs)?

    +## Discussion: _What are the potential advantages of hallucinations in Large Language Models (LLMs)?_ +

    One advantage discussed was that hallucinations "train" users to not blindly trust the model outputs. If such models are blindly trusted, there is a much greater risk associated with their use. If users can conclusively discern, however, that the produced information is fictitious, it could assist in fostering new ideas or fresh perspectives on a given topic. Furthermore, while fake data has potential utility in synthetic data generation, there's a pressing need to remain vigilant regarding the accuracy and plausibility of the data produced.

    -[1]: Zhang, Y., Li, Y., Cui, L., Cai, D., Liu, L., Fu, T., Huang, X., Zhao, E., Zhang, Y., Chen, Y., Wang, L., Luu, A. T., Bi, W., Shi, F., &Shi, S. (2023, September 24). Siren’s song in the AI Ocean: A survey on hallucination in large language models. arXiv.org. https://arxiv.org/abs/2309.01219 -[2]: Ram, O., Levine, Y., Dalmedigos, I., Muhlgay, D., Shashua, A., Leyton-Brown, K., & Shoham, Y. (2023, August 1). In-context retrieval-augmented language models. arXiv.org. https://arxiv.org/abs/2302.00083 +# Readings + +### For the first class (9/27) + +Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang et al. [_Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models_](https://arxiv.org/abs/2309.01219). September 2023. [https://arxiv.org/abs/2309.01219](https://arxiv.org/abs/2309.01219) + +### For the second class (10/4) + +Choose **one** (or more) of the following papers to read: + +- Ori Ram, Yoav Levine, Itay Dalmedigos, Dor Muhlgay, Amnon Shashua, Kevin Leyton-Brown, and Yoav Shoham. [_In-context retrieval-augmented language models_](https://arxiv.org/abs/2302.00083). Accepted for publication in TACL 2024. +- Yung-Sung Chuang, Yujia Xie, Hongyin Luo, Yoon Kim, James Glass, and Pengcheng He. [_DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models_](https://arxiv.org/abs/2309.03883). September 2023. + +## Optional Additional Readings + +### Overview of hallucination + +- Vipula Rawte, Amit Sheth, and Amitava Das. [_A Survey of Hallucination in Large Foundation Models_](https://arxiv.org/abs/2309.05922). September 2023. +- Hongbin Ye, Tong Liu, Aijia Zhang, Wei Hua, and Weiqiang Jia. [_Cognitive Mirage: A Review of Hallucinations in Large Language Models_](https://arxiv.org/abs/2309.06794). September 2023. +Nick McKenna, Tianyi Li, Liang Cheng, Mohammad Javad Hosseini, Mark Johnson, and Mark Steedman. [_Sources of Hallucination by Large Language Models on Inference Tasks_](https://arxiv.org/abs/2305.14552). May 2023. +- [_Why ChatGPT and Bing Chat are so good at making things up_](https://arstechnica.com/information-technology/2023/04/why-ai-chatbots-are-the-ultimate-bs-machines-and-how-people-hope-to-fix-them/) + +### How to reduce hallucination: Retrieval-augmented LLM + +- Weijia Shi, Sewon Min, Michihiro Yasunaga, Minjoon Seo, Rich James, Mike Lewis, Luke Zettlemoyer, and Wen-tau Yih. [_Replug: Retrieval-augmented black-box language models_](https://arxiv.org/abs/2301.12652). January 2023. +- Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang et al. [_Check your facts and try again: Improving large language models with external knowledge and automated feedback_](https://arxiv.org/abs/2302.12813). Februrary 2023. +- Akari Asai, Sewon Min, Zexuan Zhong, and Danqi Chen. [ACL 2023 Tutorial: _Retrieval-based Language Models and Applications_](https://acl2023-retrieval-lm.github.io/). ACL 2023. + +### How to reduce hallucination: Decoding strategy + +- Nayeon Lee, Wei Ping, Peng Xu, Mostofa Patwary, Pascale N. Fung, Mohammad Shoeybi, and Bryan Catanzaro. [_Factuality enhanced language models for open-ended text generation_](https://proceedings.neurips.cc/paper_files/paper/2022/hash/df438caa36714f69277daa92d608dd63-Abstract-Conference.html). NeurIPS 2022. +- Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, and Martin Wattenberg. [_Inference-Time Intervention: Eliciting Truthful Answers from a Language Model_](https://arxiv.org/abs/2306.03341). June 2023. 
+ +### Hallucination is not always harmful: Possible use cases of hallucination + +- [_dreamGPT: AI powered inspiration_](https://github.com/DivergentAI/dreamGPT) +- [_Are AI models doomed to always hallucinate?_](https://techcrunch.com/2023/09/04/are-language-models-doomed-to-always-hallucinate/) +- [OpenAI CEO Sam Altman sees “a lot of value” in AI hallucinations](https://www.smartcompany.com.au/technology/artificial-intelligence/openai-ceo-sam-altman-ai-hallucinations/#:~:text=%E2%80%9COne%20of%20the%20non%2Dobvious,have%20good%20stuff%20for%20that.). + +# Discussion Questions + +Everyone who is not in either the lead or blogging team for the week should post (in the comments below) an answer to at least one of the four questions in each section, or a substantive response to someone else's comment, or something interesting about the readings that is not covered by these questions. + +Don't post duplicates - if others have already posted, you should read their responses before adding your own. Please post your responses to different questions as separate comments. + +First section (1 - 4): Before 5:29pm on **Tuesday 26 September**. +Second section (5 - 9): Before 5:29pm on **Tuesday 3 October**. + +## Questions for the first class (9/27) + +1. What are the risks of hallucinations, especially when LLMs are used in critical applications such as autonomous vehicles, medical diagnosis, or legal analysis? +2. What are some potential long-term consequences of allowing LLMs to generate fabricated information without proper detection and mitigation measures in place? +3. How can we distinguish between legitimate generalization or "creative writing" and hallucination? Where is the line between expanding on existing knowledge and creating entirely fictional information, and what are the consequences on users? + +## Questions for the second class (10/4) -[3]: Chuang, Y.-S., Xie, Y., Luo, H., Kim, Y., Glass, J., & He, P. (2023, September 7). Dola: Decoding by contrasting layers improves factuality in large language models. arXiv.org. https://arxiv.org/abs/2309.03883 +1. The required reading presents two methods for reducing hallucinations, i.e., introducing external knowledge and designing better decoding strategies. Can you brainstorm or refer to optional readings to explore ways to further mitigate hallucinations? If so, could you elaborate more on your ideas and also discuss the challenges and risks associated with them? +2. Among all the mitigation strategies for addressing hallucination (including those introduced in the reading material from the first class), which one do you find most promising, and why? +3. Do retrieval-augmented LLMs pose any risks or potential negative consequences despite their ability to mitigate LLM hallucinations through the use of external knowledge? +4. The method proposed by DoLa seems quite simple but effective. Where do you think the authors of DoLa get the inspiration for their idea? diff --git a/week4/index.html b/week4/index.html index dd1e95c..dec34d2 100644 --- a/week4/index.html +++ b/week4/index.html @@ -91,7 +91,8 @@

    Week 4: Capabilities of LLMs

    -

    Capabilities of LLMs (Week 4)

    +

    (see bottom for assigned readings and questions)

    +

    Capabilities of LLMs (Week 4)

    Presenting Team: Xindi Guo, Mengxuan Hu, Tseganesh Beyene Kebede, Zihan Guan

    Blogging Team: Ajwa Shahid, Caroline Gihlstorf, Changhong Yang, Hyeongjin Kim, Sarah Boyce

    Monday, September 18

    @@ -224,6 +225,51 @@

    Discussion

    How can we refine and improve LLMs like Med-PaLM2 to be more effective in healthcare applications?

    +

    Readings

    +

    Monday:

    +
      +
    1. +

      Jingfeng Yang, Hongye Jin, Ruixiang Tang, Xiaotian Han, Qizhang Feng, Haoming Jiang, Bing Yin, Xia Hu. Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond. April 2023. https://arxiv.org/abs/2304.13712. [PDF]

      +
    2. +
    3. +

      OpenAI. GPT-4 Technical Report. March 2023. https://arxiv.org/abs/2303.08774 [PDF]

      +
    4. +
    +

    Optionally, also explore https://openai.com/blog/chatgpt-plugins.

    +

    Wednesday:

    +
      +
    1. Karan Singhal, Tao Tu, Juraj Gottweis, Rory Sayres, Ellery Wulczyn, Le Hou, Kevin Clark, Stephen Pfohl, Heather Cole-Lewis, Darlene Neal, Mike Schaekermann, Amy Wang, Mohamed Amin, Sami Lachgar, Philip Mansfield, Sushant Prakash, Bradley Green, Ewa Dominowska, Blaise Aguera y Arcas, Nenad Tomasev, Yun Liu, Renee Wong, Christopher Semturs, S. Sara Mahdavi, Joelle Barral, Dale Webster, Greg S. Corrado, Yossi Matias, Shekoofeh Azizi, Alan Karthikesalingam, Vivek Natarajan. Towards Expert-Level Medical Question Answering with Large Language Models +https://arxiv.org/abs/2305.09617 [PDF]
    2. +
    +

    Optional Readings:

    + +

    Discussion for Monday:

    +

    Everyone who is not in either the lead or blogging team for the week should post (in the comments below) an answer to at least one of the questions in this section, or a substantive response to someone else’s comment, or something interesting about the readings that is not covered by these questions. Don’t post duplicates - if others have already posted, you should read their responses before adding your own. Please post your responses to different questions as separate comments.

    +

    You should post your initial response before 5:29pm on Sunday, September 17, but feel free (and encouraged!) to continue the discussion after that, including responding to any responses by others to your comments.

    +
      +
    1. Based on the criteria shown in Figure 2 of [1], imagine a practical scenario and explain why you would or would not choose to use LLMs for your scenario.
    2. +
    3. Are plug-ins the future of AGI? Do you think that a company should focus only on building powerful AI systems that do not need any support from plug-ins, or should they focus only on the core system and involve more plug-ins in the ecosystem?
    4. +
    +

    Discussion for Wednesday:

    +

    You should post your initial response to one of the questions below or something interesting related to the Wednesday readings before 5:29pm on Tuesday, September 19.

    +
      +
    1. +

      What should we do before deploying LLMs in medical diagnosis applications? What (if any) regulations should control or limit how they would be used?

      +
    2. +
    3. +

      With LLMs handling sensitive medical information, how can patient privacy and data security be maintained? What policies and safeguards should be in place to protect patient data?

      +
    4. +
    5. +

      The paper discusses the progress of LLMs towards achieving physician-level performance in medical question answering. What are the potential implications of LLMs reaching or surpassing human expertise in medical knowledge?

      +
    6. +
    7. +

      The paper mentions the importance of safety and minimizing bias in LLM-generated medical information, and the optional reading reports on some experiments that show biases in GPT’s medical diagnoses. Should models be tuned to ignore protected attributes? Should we prevent models from being used in medical applications until these problems can be solved?

      +
    8. +

      @@ -253,7 +299,7 @@

      Discussion

    - + @@ -263,6 +309,8 @@

    Discussion

  1. « Previous page: Week 3: Prompting and Bias
  2. +
  3. Next page: Week 5: Hallucination »
  4. +
diff --git a/week5/index.html b/week5/index.html new file mode 100644 index 0000000..5852972 --- /dev/null +++ b/week5/index.html @@ -0,0 +1,386 @@ + + + + + Week 5: Hallucination | Risks (and Benefits) of Generative AI and Large Language Models + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+ +
+ +
+
+
+ +

Week 5: Hallucination

+ + +
+

(see bottom for assigned readings and questions)

+

Hallucination (Week 5)

+

Presenting Team: Liu Zhe, Peng Wang, Sikun Guo, Yinhan He, Zhepei Wei

+

Blogging Team: Anshuman Suri, Jacob Christopher, Kasra Lekan, Kaylee Liu, My Dinh

+

Wednesday, September 27th: Intro to Hallucination

+ + + + + + + +
People Hallucinate Too +

+ In general, hallucinations refer to the propagation of false information and/or misinformation. One common example of hallucinations is the Mandela Effect, where incorrect memories are shared by a large group of people. For instance, the paranormal researcher Fiona Broome reported a widespread false memory that Nelson Mandela had died in prison in the 1980s, which was untrue.

+
+ + + + + +
Hallucination Definition
+
+

+ In the context of LLMs, hallucination refers to a model generating seemingly plausible yet incorrect output, usually presented in a confident tone that makes users more susceptible to believing the result.

There are three types of hallucinations according to the “Siren’s Song in the AI Ocean” paper: (1) input-conflict, (2) context-conflict, and (3) fact-conflict. In class, there seemed to be several dissenting opinions about the definition of hallucination regarding LLMs. One classmate argued that alignment-based hallucination should not be considered part of the discussion scope, as the model would still be doing what it was intended to do (i.e., aligning with the user and/or aligning with the trainer).

  • Input-conflict: This +subcategory of hallucinations deviates from user +input. Input from the user can be separated into a task +instruction and a task input. An example of a task +instruction is a user prompting a model to make a +presentation for them. In this example, a task input could +be the research papers the user wanted the presentation to +be based on.
  • Context-conflict: Context-conflict hallucinations occur when a model generates contradictory information within a single response. A simple example of this would be replacing someone’s name (e.g., Silver) with another name (e.g., Stern).
  • +
  • Fact-conflict: This is the most common subcategory of hallucination, and it is therefore the main focus of recent research. An example of this could be returning the wrong date for a historical event.
  • +

+ + + + + + +

+
Sources of Hallucination
+
+

Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models +

+ + + + + +
Hallucination Risks +
+
+

Group Activity: Engage with ChatGPT to Explore Its Hallucinations (Three Groups Focusing on Different Hallucination Types)

+
+

+Group 1 focused on "Input-conflict Hallucination". One member narrated a story involving two characters in which one character murdered the other; ChatGPT, however, reached the opposite conclusion about which character was the murderer. Another member tried to exploit differences between languages, prompting in two distinct languages that share similar words.

+

+Group 2 concentrated on "Context-conflict Hallucination". They described four to five fictitious characters, detailing their interrelationships. Some relationships were deducible, yet the model frequently failed to make the complete set of deductions until explicitly prompted to be more complete.

+

+Group 3 delved into "Fact-conflict Hallucination". An illustrative example: when ChatGPT was queried with the fraction "⅓", it offered "0.333" as an approximation; however, when subsequently asked to multiply "0.3333" by "3", it confidently responded with "1" rather than "0.9999". Additional tests included translations between two languages.

+
+

Wednesday, October 4th: Hallucination Solutions

+ + + + + + +

+
Mitigation Strategies +

Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models +
+

+During the pretraining phase, it is essential to construct high-quality data to the extent feasible. A potential solution is to filter out machine-generated sources, especially when uncommon tokens appear. However, we can only use heuristic rules, which are not always effective at removing fake content.

+

+For supervised fine-tuning, there is typically a limited amount of data available for the instruction set. Some of the recommended solutions include manually removing problematic instructions and employing an honesty-oriented SFT approach. The term “honesty” can be misleading, as it is sometimes used to capture a much broader range of behaviors desired by the trainer.

+

RLHF is an important alignment method that can also be used to mitigate hallucinations, for example by having human labelers reward more “honest” answers.

+

For inference, one strategy is to reduce the snowballing of hallucinations by using a dynamic top-p threshold for nucleus sampling: the p value starts off large and shrinks as more tokens are generated. Furthermore, new or external knowledge can be introduced at two different points: before and after generation.
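To make the dynamic top-p idea concrete, below is a minimal sketch of a nucleus sampler whose threshold starts large and decays as more tokens are generated. The function names, decay schedule, and constants are illustrative assumptions, not values taken from the survey.

```python
import numpy as np

def decayed_top_p(step, p_start=0.9, p_min=0.3, decay=0.9):
    """Nucleus-sampling threshold that starts large and shrinks with each generated token.
    (All constants here are placeholders, not values from the paper.)"""
    return max(p_min, p_start * (decay ** step))

def sample_token(logits, step, rng=None):
    """Sample the next token from the smallest set of tokens whose cumulative
    probability exceeds the decayed top-p threshold."""
    if rng is None:
        rng = np.random.default_rng()
    p = decayed_top_p(step)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]            # tokens sorted by descending probability
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p)) + 1  # keep just enough tokens to cover p
    keep = order[:cutoff]
    keep_probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=keep_probs))
```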

+
Decoding Contrasting Layers

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Building on these trends, the concept of contrastive decoding is introduced. For example, how should the model decide between "Seattle" and "Olympia" as the next token? Treating the last layer as a mature layer, it is beneficial to contrast it with the preceding layers, which can be deemed premature. For each premature layer, the divergence between its next-token distribution and the mature layer's is computed using the Jensen-Shannon Divergence, and the most divergent premature layer is contrasted against the mature layer. This approach amplifies the factual knowledge the model has acquired in its later layers, thereby enhancing its output generation.
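A rough sketch of this layer contrast is shown below, assuming we already have next-token logits from each candidate layer; the helper names are assumptions, and DoLa's additional adaptive plausibility constraint is omitted for brevity.

```python
import numpy as np

def log_softmax(x):
    x = x - x.max()
    return x - np.log(np.exp(x).sum())

def jensen_shannon(p_log, q_log):
    """Jensen-Shannon divergence between two distributions given as log-probabilities."""
    p, q = np.exp(p_log), np.exp(q_log)
    m = 0.5 * (p + q)
    return 0.5 * (np.sum(p * (p_log - np.log(m))) + np.sum(q * (q_log - np.log(m))))

def contrastive_next_token_scores(layer_logits):
    """layer_logits: list of per-layer next-token logit vectors, with the final
    (mature) layer last. Selects the premature layer that diverges most from the
    mature layer and returns log p_mature - log p_premature as contrasted scores."""
    mature = log_softmax(layer_logits[-1])
    premature = [log_softmax(l) for l in layer_logits[:-1]]
    best = max(range(len(premature)), key=lambda i: jensen_shannon(mature, premature[i]))
    return mature - premature[best]
```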

In-Context Retrieval-Augmented Language Models

The model parameters are kept frozen. Instead of feeding the text directly into the model, the approach first retrieves relevant documents from external sources and concatenates them with the original input. Re-ranking the retrieval results provides additional benefits; the exact perplexities can be found on the slide. Smaller retrieval strides (re-running retrieval more frequently during generation) have been observed to enhance performance, albeit at the cost of increased runtime. The authors also noticed that the tokens at the end of the prompt are typically the most relevant for retrieval, so shorter retrieval queries tend to outperform longer ones.
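A simplified sketch of this retrieve-then-generate loop is given below; `retrieve` and `generate_step` are assumed stand-ins for a retriever and a frozen LM, and the stride and query-length values are placeholders rather than the paper's settings.

```python
def ralm_generate(prompt_tokens, retrieve, generate_step,
                  stride=4, query_len=32, max_new=64):
    """In-context RALM sketch: every `stride` tokens, retrieval is re-run using only
    the last `query_len` tokens, and the top document is prepended to the frozen
    model's context. No model weights are updated."""
    tokens = list(prompt_tokens)
    while len(tokens) - len(prompt_tokens) < max_new:
        query = tokens[-query_len:]                        # recent tokens matter most
        doc_tokens = retrieve(query, k=1)[0]               # top-ranked external document (a token list)
        context = list(doc_tokens) + tokens                # concatenate retrieved text with the input
        tokens += generate_step(context, n_tokens=stride)  # decode `stride` tokens, then re-retrieve
    return tokens[len(prompt_tokens):]
```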

+
Benefits of Hallucinations

Discussion: Based on the sources of hallucination, what methods can be employed to mitigate the hallucination issue?

+

Group 2:
 Discussed two papers from this week's reading, which highlighted the use of semantic search and the introduction of external context to aid the model. This approach, while useful for diminishing hallucination, depends heavily on external information, which is not always available or effective in generic settings.
 Further strategies discussed were automated prompt engineering, optimization of user-provided context (noting that overly long contexts can induce hallucination), and using filtering or attention mechanisms to limit the tokens the model processes.
 From the model's perspective, it is beneficial to employ red-teaming, explore corner cases, and pinpoint domains where hallucinations are prevalent.
 Notably, responses can vary for an identical prompt. A proposed solution is to generate multiple responses to the same prompt and combine them, perhaps through majority voting, to eliminate low-probability hallucinations (see the sketch below).
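As a toy illustration of that majority-voting idea (the `sample` helper is hypothetical, and in practice the sampled answers would need to be normalized before comparison):

```python
from collections import Counter

def majority_answer(prompt, sample, n=7):
    """Sample several responses to the same prompt and keep the most common one,
    filtering out low-frequency (likely hallucinated) answers."""
    answers = [sample(prompt) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]
```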

+

+ Group 1: + Discussed the scarcity of alternatives to the current training dataset. + Like Group 2, they also explored the idea of generating multiple responses but suggested allowing the user to select from the array of choices. + Another approach discussed was the model admitting uncertainty, stating "I don’t know", rather than producing a hallucination. +

+

+ Group 3: Addressed inconsistencies in the training data. + Emphasized the importance of fine-tuning and ensuring the use of contemporary data. + However, they noted that fine-tuning doesn't ensure exclusion of outdated data. + It was also advised to source data solely from credible sources. + An interesting perspective discussed was utilizing a larger model to verify the smaller model's hallucinations. But a caveat arises: How can one ensure the larger model's accuracy? And if the larger model is deemed superior, why not employ it directly? +

+

Discussion: What are the potential advantages of hallucinations in Large Language Models (LLMs)?

+

+ One advantage discussed was that hallucinations "train" users to not blindly trust the model outputs. If such models are blindly trusted, there is a much greater risk associated with their use. + If users can conclusively discern, however, that the produced information is fictitious, it could assist in fostering new ideas or fresh perspectives on a given topic. + Furthermore, while fake data has potential utility in synthetic data generation, there's a pressing need to remain vigilant regarding the accuracy and plausibility of the data produced. +

+

Readings

+

For the first class (9/27)

+

Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang et al. Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models. September 2023. https://arxiv.org/abs/2309.01219

+

For the second class (10/4)

+

Choose one (or more) of the following papers to read:

+ +

Optional Additional Readings

+

Overview of hallucination

+ +

How to reduce hallucination: Retrieval-augmented LLM

+ +

How to reduce hallucination: Decoding strategy

+ +

Hallucination is not always harmful: Possible use cases of hallucination

+ +

Discussion Questions

+

Everyone who is not in either the lead or blogging team for the week should post (in the comments below) an answer to at least one of the four questions in each section, or a substantive response to someone else’s comment, or something interesting about the readings that is not covered by these questions.

+

Don’t post duplicates - if others have already posted, you should read their responses before adding your own. Please post your responses to different questions as separate comments.

+

First section (1 - 4): Before 5:29pm on Tuesday 26 September.
+Second section (5 - 9): Before 5:29pm on Tuesday 3 October.

+

Questions for the first class (9/27)

+
    +
  1. What are the risks of hallucinations, especially when LLMs are used in critical applications such as autonomous vehicles, medical diagnosis, or legal analysis?
  2. What are some potential long-term consequences of allowing LLMs to generate fabricated information without proper detection and mitigation measures in place?
  3. How can we distinguish between legitimate generalization or “creative writing” and hallucination? Where is the line between expanding on existing knowledge and creating entirely fictional information, and what are the consequences for users?

Questions for the second class (10/4)

+
    +
  1. The required reading presents two methods for reducing hallucinations, i.e., introducing external knowledge and designing better decoding strategies. Can you brainstorm or refer to the optional readings to explore ways to further mitigate hallucinations? If so, could you elaborate more on your ideas and also discuss the challenges and risks associated with them?
  2. Among all the mitigation strategies for addressing hallucination (including those introduced in the reading material from the first class), which one do you find most promising, and why?
  3. Do retrieval-augmented LLMs pose any risks or potential negative consequences despite their ability to mitigate LLM hallucinations through the use of external knowledge?
  4. The method proposed by DoLa seems quite simple but effective. Where do you think the authors of DoLa got the inspiration for their idea?
+ +