
Commit

Rebuilt site
David Evans committed Nov 2, 2023
1 parent 1ca13aa commit 93365a7
Showing 16 changed files with 114 additions and 170 deletions.
70 changes: 35 additions & 35 deletions index.html

Large diffs are not rendered by default.

10 changes: 3 additions & 7 deletions index.xml
@@ -8,11 +8,7 @@
<language>en-us</language>
<managingEditor>[email protected] (David Evans)</managingEditor>
<webMaster>[email protected] (David Evans)</webMaster>
<lastBuildDate>Wed, 23 Aug 2023 00:00:00 +0000</lastBuildDate>

<atom:link href="https://llmrisks.github.io/index.xml" rel="self" type="application/rss+xml" />


<lastBuildDate>Mon, 30 Oct 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://llmrisks.github.io/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Week 9: Interpretability</title>
<link>https://llmrisks.github.io/week9/</link>
@@ -104,7 +100,7 @@ Table of Contents (Monday, 09/04/2023) Introduction to Alignment Introduction
<guid>https://llmrisks.github.io/week1/</guid>
<description>(see bottom for assigned readings and questions)
Attention, Transformers, and BERT Monday, 28 August
Transformers1 are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&amp;rsquo;s an overview of transformers&amp;rsquo; evolution and significance.
Transformers1 are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&amp;rsquo;s an overview of transformers&#39; evolution and significance.
Background and Origin RNNs2 were one of the earliest models used for sequence-based tasks in machine learning.</description>
</item>

@@ -222,4 +218,4 @@ I believe each team has at least a few members with enough experience using git
</item>

</channel>
</rss>
</rss>
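
Note: the feed changes here (and in post/index.xml, tags/index.xml, and topics/index.xml below) are whitespace reflow — Hugo's rebuild joined the lastBuildDate and atom:link self-reference onto one line and dropped the surrounding blank lines; in index.xml the build date also advances from Wed, 23 Aug 2023 to Mon, 30 Oct 2023. XML parsers ignore whitespace between elements, so the feeds are semantically unchanged. As a reference point, a minimal sketch of the channel-header shape these files converge on; the title and description values are placeholders, and the xmlns:atom declaration on the rss element is assumed (standard in Hugo's RSS template, but not visible in these hunks):

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Feed title (placeholder)</title>
    <link>https://llmrisks.github.io/</link>
    <description>Feed description (placeholder)</description>
    <generator>Hugo -- gohugo.io</generator>
    <language>en-us</language>
    <managingEditor>[email protected] (David Evans)</managingEditor>
    <webMaster>[email protected] (David Evans)</webMaster>
    <lastBuildDate>Mon, 30 Oct 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://llmrisks.github.io/index.xml" rel="self" type="application/rss+xml" />
    <!-- <item> elements, one per post, follow here -->
  </channel>
</rss>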
2 changes: 1 addition & 1 deletion post/index.html
@@ -212,7 +212,7 @@ <h2><a href="/week1/">Week 1: Introduction</a></h2>

(see bottom for assigned readings and questions)
Attention, Transformers, and BERT Monday, 28 August
Transformers1 are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&rsquo;s an overview of transformers&rsquo; evolution and significance.
Transformers1 are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&rsquo;s an overview of transformers' evolution and significance.
Background and Origin RNNs2 were one of the earliest models used for sequence-based tasks in machine learning.
<p class="text-right"><a href="/week1/">Read More…</a></p>

10 changes: 3 additions & 7 deletions post/index.xml
@@ -8,11 +8,7 @@
<language>en-us</language>
<managingEditor>[email protected] (David Evans)</managingEditor>
<webMaster>[email protected] (David Evans)</webMaster>
<lastBuildDate>Mon, 30 Oct 2023 00:00:00 +0000</lastBuildDate>

<atom:link href="https://llmrisks.github.io/post/index.xml" rel="self" type="application/rss+xml" />


<lastBuildDate>Mon, 30 Oct 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://llmrisks.github.io/post/index.xml" rel="self" type="application/rss+xml" />
<item>
<title>Week 9: Interpretability</title>
<link>https://llmrisks.github.io/week9/</link>
@@ -104,7 +100,7 @@ Table of Contents (Monday, 09/04/2023) Introduction to Alignment Introduction
<guid>https://llmrisks.github.io/week1/</guid>
<description>(see bottom for assigned readings and questions)
Attention, Transformers, and BERT Monday, 28 August
Transformers1 are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&amp;rsquo;s an overview of transformers&amp;rsquo; evolution and significance.
Transformers1 are a class of deep learning models that have revolutionized the field of natural language processing (NLP) and various other domains. The concept of transformers originated as an attempt to address the limitations of traditional recurrent neural networks (RNNs) in sequential data processing. Here&amp;rsquo;s an overview of transformers&#39; evolution and significance.
Background and Origin RNNs2 were one of the earliest models used for sequence-based tasks in machine learning.</description>
</item>

@@ -165,4 +161,4 @@ I&amp;rsquo;m expecting the structure and format to that combines aspects of thi
</item>

</channel>
</rss>
</rss>
7 changes: 4 additions & 3 deletions readings/index.html
@@ -139,18 +139,19 @@ <h2 id="abuses-of-llms">Abuses of LLMs</h2>
<p>Nicholas Carlini. <a href="https://arxiv.org/abs/2307.15008"><em>A LLM Assisted Exploitation of AI-Guardian</em></a>.</p>
<p>Andy Zou, Zifan Wang, J. Zico Kolter, Matt Fredrikson. <a href="https://arxiv.org/abs/2307.15043"><em>Universal and Transferable Adversarial Attacks on Aligned Language Models</em></a>. <a href="https://arxiv.org/abs/2307.15043">https://arxiv.org/abs/2307.15043</a>.
<a href="https://llm-attacks.org/">Project Website: https://llm-attacks.org/</a>.</p>
<p>Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz. <a href="https://arxiv.org/abs/2302.12173"><em>Not what you&rsquo;ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection</em></a>. <a href="https://arxiv.org/abs/2302.12173">https://arxiv.org/abs/2302.12173</a>.</p>
<h2 id="fairness-and-bias">Fairness and Bias</h2>
<p>Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov. <a href="https://arxiv.org/abs/2305.08283"><em>From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models</em></a>. ACL 2023.</p>
<p>Myra Cheng, Esin Durmus, Dan Jurafsky. <a href="https://arxiv.org/abs/2305.18189"><em>Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models</em></a>. ACL 2023.</p>
<h2 id="alignment">&ldquo;Alignment&rdquo;</h2>
<p>Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li. <a href="https://arxiv.org/abs/2308.05374"><em>Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models&rsquo; Alignment</em></a>. <a href="https://arxiv.org/abs/2308.05374">https://arxiv.org/abs/2308.05374</a>.</p>
<p>Yang Liu, Yuanshun Yao, Jean-Francois Ton, Xiaoying Zhang, Ruocheng Guo Hao Cheng, Yegor Klochkov, Muhammad Faaiz Taufiq, Hang Li. <a href="https://arxiv.org/abs/2308.05374"><em>Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment</em></a>. <a href="https://arxiv.org/abs/2308.05374">https://arxiv.org/abs/2308.05374</a>.</p>
<h2 id="agi">AGI</h2>
<p>Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang. <a href="https://www.microsoft.com/en-us/research/publication/sparks-of-artificial-general-intelligence-early-experiments-with-gpt-4/"><em>Sparks of Artificial General Intelligence: Early experiments with GPT-4</em></a>. Microsoft, March 2023. <a href="https://arxiv.org/abs/2303.12712">https://arxiv.org/abs/2303.12712</a></p>
<p>Yejin Choi. <a href="https://www.amacad.org/publication/curious-case-commonsense-intelligence"><em>The Curious Case of Commonsense Intelligence</em></a>. Daedalus, Spring 2022.</p>
<p>Konstantine Arkoudas. <a href="https://arxiv.org/abs/2308.03762"><em>GPT-4 Can&rsquo;t Reason</em></a>. <a href="https://arxiv.org/abs/2308.03762">https://arxiv.org/abs/2308.03762</a>.</p>
<p>Natalie Shapira, Mosh Levy, Seyed Hossein Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz. <a href="https://arxiv.org/abs/2305.14763"><em>Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models</em></a>.</p>
<p>Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi, Yulia
Tsvetkov. <a href="https://arxiv.org/abs/2306.00924"><em>Minding Language Models&rsquo; (Lack of) Theory of Mind: A
Tsvetkov. <a href="https://arxiv.org/abs/2306.00924"><em>Minding Language Models' (Lack of) Theory of Mind: A
Plug-and-Play Multi-Character Belief
Tracker</em></a>. ACL 2023</p>
<p>Boaz Barak. <a href="https://windowsontheory.org/2023/07/17/the-shape-of-agi-cartoons-and-back-of-envelope/"><em>The shape of AGI: Cartoons and back of envelope</em></a>. July 2023.</p>
@@ -196,7 +197,7 @@ <h3 id="more-sources">More Sources</h3>

</div>

<meta itemprop="wordCount" content="1105">
<meta itemprop="wordCount" content="1132">
<meta itemprop="datePublished" content="2023-08-21">
<meta itemprop="url" content="https://llmrisks.github.io/readings/">
</article>
92 changes: 24 additions & 68 deletions sitemap.xml
@@ -1,112 +1,68 @@
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml">

<url>
<loc>https://llmrisks.github.io/post/</loc>
<lastmod>2023-10-30T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/</loc>
<lastmod>2023-10-30T00:00:00+00:00</lastmod>
</url><url>
<loc>https://llmrisks.github.io/week9/</loc>
<lastmod>2023-10-30T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week8/</loc>
<lastmod>2023-10-22T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week7/</loc>
<lastmod>2023-10-16T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week5/</loc>
<lastmod>2023-10-04T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week4/</loc>
<lastmod>2023-09-25T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week3/</loc>
<lastmod>2023-09-18T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week2/</loc>
<lastmod>2023-09-11T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/week1/</loc>
<lastmod>2023-09-03T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/discussions/</loc>
<lastmod>2023-08-25T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/class0/</loc>
<lastmod>2023-08-23T00:00:00+00:00</lastmod>
</url>

<url>
<loc>https://llmrisks.github.io/</loc>
<lastmod>2023-08-23T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/weeklyschedule/</loc>
<lastmod>2023-08-23T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/readings/</loc>
<lastmod>2023-08-21T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/schedule/</loc>
<lastmod>2023-08-21T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/updates/</loc>
<lastmod>2023-08-21T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/survey/</loc>
<lastmod>2023-08-17T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/welcome/</loc>
<lastmod>2023-05-26T00:00:00+00:00</lastmod>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/syllabus/</loc>
<priority>0</priority>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/blogging/</loc>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/tags/</loc>
</url>

<url>
</url><url>
<loc>https://llmrisks.github.io/topics/</loc>
</url>

</urlset>
</urlset>
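
The sitemap diff is likewise mostly reflow: the old build separated <url> entries with blank lines, while the new build joins adjacent tags as </url><url>. The one substantive change is the entry for the site root, https://llmrisks.github.io/, which moves up next to post/ with its <lastmod> refreshed from 2023-08-23 to 2023-10-30. For readability, a minimal sketch of the two entry shapes the file uses — a dated page with <lastmod>, and an undated page with <priority> — built from values in the hunks above:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <!-- a page with a known modification date -->
  <url>
    <loc>https://llmrisks.github.io/week9/</loc>
    <lastmod>2023-10-30T00:00:00+00:00</lastmod>
  </url>
  <!-- an undated page carries a priority instead -->
  <url>
    <loc>https://llmrisks.github.io/syllabus/</loc>
    <priority>0</priority>
  </url>
</urlset>

Sitemap consumers ignore whitespace between elements, so the joined </url><url> tags are cosmetic.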
3 changes: 3 additions & 0 deletions src/content/readings.md
@@ -102,6 +102,9 @@ Nicholas Carlini. [_A LLM Assisted Exploitation of AI-Guardian_](https://arxiv.o
Andy Zou, Zifan Wang, J. Zico Kolter, Matt Fredrikson. [_Universal and Transferable Adversarial Attacks on Aligned Language Models_](https://arxiv.org/abs/2307.15043). [https://arxiv.org/abs/2307.15043](https://arxiv.org/abs/2307.15043).
[Project Website: https://llm-attacks.org/](https://llm-attacks.org/).

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz. [_Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection_](https://arxiv.org/abs/2302.12173). [https://arxiv.org/abs/2302.12173](https://arxiv.org/abs/2302.12173).


## Fairness and Bias

Shangbin Feng, Chan Young Park, Yuhan Liu, Yulia Tsvetkov. [_From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models_](https://arxiv.org/abs/2305.08283). ACL 2023.
8 changes: 2 additions & 6 deletions tags/index.xml
@@ -7,10 +7,6 @@
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<managingEditor>[email protected] (David Evans)</managingEditor>
<webMaster>[email protected] (David Evans)</webMaster>

<atom:link href="https://llmrisks.github.io/tags/index.xml" rel="self" type="application/rss+xml" />


<webMaster>[email protected] (David Evans)</webMaster><atom:link href="https://llmrisks.github.io/tags/index.xml" rel="self" type="application/rss+xml" />
</channel>
</rss>
</rss>
8 changes: 2 additions & 6 deletions topics/index.xml
@@ -7,10 +7,6 @@
<generator>Hugo -- gohugo.io</generator>
<language>en-us</language>
<managingEditor>[email protected] (David Evans)</managingEditor>
<webMaster>[email protected] (David Evans)</webMaster>

<atom:link href="https://llmrisks.github.io/topics/index.xml" rel="self" type="application/rss+xml" />


<webMaster>[email protected] (David Evans)</webMaster><atom:link href="https://llmrisks.github.io/topics/index.xml" rel="self" type="application/rss+xml" />
</channel>
</rss>
</rss>
(Diffs for the remaining 7 of the 16 changed files were not loaded.)
