Commit
dnozza committed Dec 16, 2024
1 parent 0b111d3 commit 2009feb
Showing 174 changed files with 16,493 additions and 911 deletions.
8 changes: 4 additions & 4 deletions 404.html
@@ -595,6 +595,8 @@ <h1>Page not found</h1>
<h2>Latest</h2>
<ul>

<li><a href="/publication/2024-prism/">The PRISM Alignment Dataset: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models</a></li>

<li><a href="/publication/2024-divine-llamas-emotion-bias/">Divine LLaMAs: Bias, Stereotypes, Stigmatization, and Emotion Representation of Religion in Large Language Models</a></li>

<li><a href="/publication/2024-compromesso/">Compromesso! Italian Many-Shot Jailbreaks Undermine the Safety of Large Language Models</a></li>
@@ -603,6 +605,8 @@ <h2>Latest</h2>

<li><a href="/publication/2024-politicalcompass/">Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models</a></li>

<li><a href="/publication/2024-hate-geographies/">From Languages to Geographies: Towards Evaluating Cultural Bias in Hate Speech Datasets</a></li>

<li><a href="/publication/2024-xstest/">XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models</a></li>

<li><a href="/publication/2024-dadit/">DADIT: A Dataset for Demographic Classification of Italian Twitter Users and a Comparison of Prediction Methods</a></li>
@@ -611,10 +615,6 @@ <h2>Latest</h2>

<li><a href="/publication/2024-safetyllamas/">Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions</a></li>

<li><a href="/publication/2024-safetyprompts/">SafetyPrompts: a Systematic Review of Open Datasets for Evaluating and Improving Large Language Model Safety </a></li>

<li><a href="/publication/2024-emotion-gender-stereotypes/">Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution</a></li>

</ul>


