feat: template QA.md in vi
ledong0110 committed Sep 4, 2024
1 parent 69b1b7c commit 1ed8925
Showing 128 changed files with 2,571 additions and 1,973 deletions.
146 changes: 146 additions & 0 deletions _data/leaderboard/vi/bias_toxicity/qa.yml
@@ -0,0 +1,146 @@
XQuAD:
  URA-LLaMa 70B:
    DRR: null
    DRG: 0.39
    DRG_std: 0.01
    SAR: null
    SAG: 0.41
    SAG_std: 0.00
    Tox: 0.02
    Tox_std: 0.00
  URA-LLaMa 13B:
    DRR: null
    DRG: 0.39
    DRG_std: 0.01
    SAR: null
    SAG: 0.45
    SAG_std: 0.01
    Tox: 0.02
    Tox_std: 0.00
  URA-LLaMa 7B:
    DRR: null
    DRG: 0.43
    DRG_std: 0.01
    SAR: null
    SAG: 0.48
    SAG_std: 0.00
    Tox: 0.03
    Tox_std: 0.00
  LLaMa-2 13B:
    DRR: null
    DRG: 0.35
    DRG_std: 0.03
    SAR: null
    SAG: 0.46
    SAG_std: 0.00
    Tox: 0.01
    Tox_std: 0.00
  LLaMa-2 7B:
    DRR: null
    DRG: 0.46
    DRG_std: 0.01
    SAR: null
    SAG: 0.42
    SAG_std: 0.00
    Tox: 0.01
    Tox_std: 0.00
  Vietcuna 7B:
    DRR: null
    DRG: 0.50
    DRG_std: 0.00
    SAR: null
    SAG: null
    SAG_std: null
    Tox: 0.04
    Tox_std: 0.00
  GPT-3.5:
    DRR: null
    DRG: 0.43
    DRG_std: 0.01
    SAR: null
    SAG: 0.48
    SAG_std: 0.00
    Tox: 0.02
    Tox_std: 0.00
  GPT-4:
    DRR: null
    DRG: 0.40
    DRG_std: 0.01
    SAR: null
    SAG: 0.45
    SAG_std: 0.00
    Tox: 0.02
    Tox_std: 0.00
MLQA:
  URA-LLaMa 70B:
    DRR: null
    DRG: 0.14
    DRG_std: 0.02
    SAR: null
    SAG: 0.42
    SAG_std: 0.03
    Tox: 0.02
    Tox_std: 0.00
  URA-LLaMa 13B:
    DRR: null
    DRG: 0.17
    DRG_std: 0.1
    SAR: null
    SAG: 0.38
    SAG_std: 0.00
    Tox: 0.02
    Tox_std: 0.00
  URA-LLaMa 7B:
    DRR: null
    DRG: 0.18
    DRG_std: 0.01
    SAR: null
    SAG: 0.37
    SAG_std: 0.01
    Tox: 0.02
    Tox_std: 0.00
  LLaMa-2 13B:
    DRR: null
    DRG: 0.27
    DRG_std: 0.01
    SAR: null
    SAG: 0.43
    SAG_std: 0.00
    Tox: 0.01
    Tox_std: 0.00
  LLaMa-2 7B:
    DRR: null
    DRG: 0.21
    DRG_std: 0.06
    SAR: null
    SAG: 0.45
    SAG_std: 0.00
    Tox: 0.01
    Tox_std: 0.00
  Vietcuna 7B:
    DRR: null
    DRG: 0.23
    DRG_std: 0.09
    SAR: null
    SAG: 0.49
    SAG_std: 0.01
    Tox: 0.04
    Tox_std: 0.00
  GPT-3.5:
    DRR: null
    DRG: 0.18
    DRG_std: 0.01
    SAR: null
    SAG: 0.40
    SAG_std: 0.00
    Tox: 0.02
    Tox_std: 0.00
  GPT-4:
    DRR: null
    DRG: 0.16
    DRG_std: 0.01
    SAR: null
    SAG: 0.41
    SAG_std: 0.01
    Tox: 0.02
    Tox_std: 0.00
9 changes: 9 additions & 0 deletions _data/leaderboard/vi/models.yml
@@ -0,0 +1,9 @@
models:
- URA-LLaMa 70B
- URA-LLaMa 13B
- URA-LLaMa 7B
- LLaMa-2 13B
- LLaMa-2 7B
- Vietcuna 7B
- GPT-3.5
- GPT-4
217 changes: 91 additions & 126 deletions _pages/vi/bias-toxicity/question-answering.md
@@ -3,131 +3,96 @@ layout: default
permalink: /leaderboard/vi/bias-toxicity/question-answering
---
# Bias-Toxicity Question Answering Leaderboard
<div>{{page.lang}}</div>

<table class="table table-bordered table-sm w-100 dtHorizontalTable" cellspacing="0">
<thead>
<tr>
<th rowspan="2" class="text-center align-middle"><b>Models</b></th>
<th colspan="5" class="text-center"><b>XQuAD</b></th>
<th colspan="5" class="text-center"><b>MLQA</b></th>
</tr>
<tr>
<th class="text-center"><b>DRR→|</b></th>
<th class="text-center"><b>DRG→|</b></th>
<th class="text-center"><b>SAR→|</b></th>
<th class="text-center"><b>SAG→|</b></th>
<th class="text-center"><b>Tox↓</b></th>
<th class="text-center"><b>DRR→|</b></th>
<th class="text-center"><b>DRG→|</b></th>
<th class="text-center"><b>SAR→|</b></th>
<th class="text-center"><b>SAG→|</b></th>
<th class="text-center"><b>Tox↓</b></th>
</tr>
</thead>
<tbody>
<tr>
<td class="text-center"><b>URA-LLaMa 70B</b></td>
<td class="text-center">-</td>
<td class="text-center">0.39 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center">0.41 ± 0.00</td>
<td class="text-center">0.02 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">0.14 ± 0.02</td>
<td class="text-center">-</td>
<td class="text-center">0.42 ± 0.03</td>
<td class="text-center">0.02 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>URA-LLaMa 13B</b></td>
<td class="text-center">-</td>
<td class="text-center">0.39 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center">0.45 ± 0.01</td>
<td class="text-center">0.02 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">0.17 ± 0.1</td>
<td class="text-center">-</td>
<td class="text-center">0.38 ± 0.00</td>
<td class="text-center">0.02 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>URA-LLaMa 7B</b></td>
<td class="text-center">-</td>
<td class="text-center">0.43 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center" style="background-color: cyan;">0.48 ± 0.00</td>
<td class="text-center">0.03 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">0.18 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center">0.37 ± 0.01</td>
<td class="text-center">0.02 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>LLaMa-2 13B</b></td>
<td class="text-center">-</td>
<td class="text-center">0.35 ± 0.03</td>
<td class="text-center">-</td>
<td class="text-center">0.46 ± 0.00</td>
<td class="text-center" style="background-color: cyan;">0.01 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center" style="background-color: cyan;">0.27 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center">0.43 ± 0.00</td>
<td class="text-center" style="background-color: cyan;">0.01 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>LLaMa-2 7B</b></td>
<td class="text-center">-</td>
<td class="text-center">0.46 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center">0.42 ± 0.00</td>
<td class="text-center" style="background-color: cyan;">0.01 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">0.21 ± 0.06</td>
<td class="text-center">-</td>
<td class="text-center">0.45 ± 0.00</td>
<td class="text-center" style="background-color: cyan;">0.01 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>Vietcuna 7B</b></td>
<td class="text-center">-</td>
<td class="text-center" style="background-color: cyan;">0.50 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">-</td>
<td class="text-center">0.04 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">0.23 ± 0.09</td>
<td class="text-center">-</td>
<td class="text-center" style="background-color: cyan;">0.49 ± 0.01</td>
<td class="text-center">0.04 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>GPT-3.5</b></td>
<td class="text-center" style="background-color: #f0f0f0;">-</td>
<td class="text-center" style="background-color: #f0f0f0;">0.43 ± 0.01</td>
<td class="text-center" style="background-color: #f0f0f0;">-</td>
<td class="text-center" style="background-color: #f0f0f0;">0.48 ± 0.00</td>
<td class="text-center" style="background-color: #f0f0f0;">0.02 ± 0.00</td>
<td class="text-center" style="background-color: #f0f0f0;">-</td>
<td class="text-center" style="background-color: #f0f0f0;">0.18 ± 0.01</td>
<td class="text-center" style="background-color: #f0f0f0;">-</td>
<td class="text-center">0.40 ± 0.00</td>
<td class="text-center" style="background-color: #f0f0f0;">0.02 ± 0.00</td>
</tr>
<tr>
<td class="text-center"><b>GPT-4</b></td>
<td class="text-center">-</td>
<td class="text-center">0.40 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center">0.45 ± 0.00</td>
<td class="text-center" style="background-color: #f0f0f0;">0.02 ± 0.00</td>
<td class="text-center">-</td>
<td class="text-center">0.16 ± 0.01</td>
<td class="text-center">-</td>
<td class="text-center" style="background-color: #f0f0f0;">0.41 ± 0.01</td>
<td class="text-center" style="background-color: #f0f0f0;">0.02 ± 0.00</td>
</tr>
</tbody>
</table>
<thead>
<tr>
<th rowspan="2" class="text-center align-middle">
<b>Models</b>
</th>
{% for dataset in site.data.leaderboard.vi.bias_toxicity.qa %}
<th colspan="5" class="text-center">
<b>{{ dataset[0] }}</b>
</th>
{% endfor %}
</tr>
<tr>
{% for dataset in site.data.leaderboard.vi.bias_toxicity.qa %}
<th class="text-center"><b>DRR↓</b></th>
<th class="text-center"><b>DRG↓</b></th>
<th class="text-center"><b>SAR↓</b></th>
<th class="text-center"><b>SAG↓</b></th>
<th class="text-center"><b>Tox↓</b></th>
{% endfor %}
</tr>
</thead>
<tbody>
{% for model in site.data.leaderboard.vi.models.models %}
<tr>
<td class="text-center">
<b>{{ model }}</b>
</td>
{% for dataset in site.data.leaderboard.vi.bias_toxicity.qa %}
{% assign DRR_min = 1 %}
{% assign DRG_min = 1 %}
{% assign SAR_min = 1 %}
{% assign SAG_min = 1 %}
{% assign Tox_min = 1 %}
{% for m in site.data.leaderboard.vi.models.models %}
{% if dataset[1][m].DRR and dataset[1][m].DRR < DRR_min %}
{% assign DRR_min = dataset[1][m].DRR %}
{% endif %}
{% if dataset[1][m].DRG and dataset[1][m].DRG < DRG_min %}
{% assign DRG_min = dataset[1][m].DRG %}
{% endif %}
{% if dataset[1][m].SAR and dataset[1][m].SAR < SAR_min %}
{% assign SAR_min = dataset[1][m].SAR %}
{% endif %}
{% if dataset[1][m].SAG and dataset[1][m].SAG < SAG_min %}
{% assign SAG_min = dataset[1][m].SAG %}
{% endif %}
{% if dataset[1][m].Tox and dataset[1][m].Tox < Tox_min %}
{% assign Tox_min = dataset[1][m].Tox %}
{% endif %}
{% endfor %}
<td class="text-center" {% if dataset[1][model].DRR == DRR_min %}style="background-color: cyan;"{% endif %}>
{% if dataset[1][model].DRR %}
{{ dataset[1][model].DRR | round: 2 }} ± {{ dataset[1][model].DRR_std | round: 2 }}
{% else %}
-
{% endif %}
</td>
<td class="text-center" {% if dataset[1][model].DRG == DRG_min %}style="background-color: cyan;"{% endif %}>
{% if dataset[1][model].DRG %}
{{ dataset[1][model].DRG | round: 2 }} ± {{ dataset[1][model].DRG_std | round: 2 }}
{% else %}
-
{% endif %}
</td>
<td class="text-center" {% if dataset[1][model].SAR == SAR_min %}style="background-color: cyan;"{% endif %}>
{% if dataset[1][model].SAR %}
{{ dataset[1][model].SAR | round: 2 }} ± {{ dataset[1][model].SAR_std | round: 2 }}
{% else %}
-
{% endif %}
</td>
<td class="text-center" {% if dataset[1][model].SAG == SAG_min %}style="background-color: cyan;"{% endif %}>
{% if dataset[1][model].SAG %}
{{ dataset[1][model].SAG | round: 2 }} ± {{ dataset[1][model].SAG_std | round: 2 }}
{% else %}
-
{% endif %}
</td>
<td class="text-center" {% if dataset[1][model].Tox == Tox_min %}style="background-color: cyan;"{% endif %}>
{% if dataset[1][model].Tox %}
{{ dataset[1][model].Tox | round: 2 }} ± {{ dataset[1][model].Tox_std | round: 2 }}
{% else %}
-
{% endif %}
</td>
{% endfor %}
</tr>
{% endfor %}
</tbody>
</table>
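
The highlighting rule in the new template (start each metric's minimum at 1, skip null values, then give a cyan background to every cell equal to that minimum) can be checked offline with a small sketch. The Python below is not part of the commit. The names `XQUAD`, `column_minimum`, and `highlighted_models` are hypothetical, and the dictionary is a hand-copied excerpt of the XQuAD means from `qa.yml` rather than a parsed copy of the file.

```python
from typing import Dict, List, Optional

Metrics = Dict[str, Optional[float]]

# Hand-copied excerpt of the XQuAD block in qa.yml (means only, *_std omitted).
# None stands in for YAML null.
XQUAD: Dict[str, Metrics] = {
    "URA-LLaMa 70B": {"DRR": None, "DRG": 0.39, "SAG": 0.41, "Tox": 0.02},
    "URA-LLaMa 13B": {"DRR": None, "DRG": 0.39, "SAG": 0.45, "Tox": 0.02},
    "URA-LLaMa 7B": {"DRR": None, "DRG": 0.43, "SAG": 0.48, "Tox": 0.03},
    "LLaMa-2 13B": {"DRR": None, "DRG": 0.35, "SAG": 0.46, "Tox": 0.01},
    "LLaMa-2 7B": {"DRR": None, "DRG": 0.46, "SAG": 0.42, "Tox": 0.01},
    "Vietcuna 7B": {"DRR": None, "DRG": 0.50, "SAG": None, "Tox": 0.04},
    "GPT-3.5": {"DRR": None, "DRG": 0.43, "SAG": 0.48, "Tox": 0.02},
    "GPT-4": {"DRR": None, "DRG": 0.40, "SAG": 0.45, "Tox": 0.02},
}


def column_minimum(dataset: Dict[str, Metrics], metric: str, start: float = 1.0) -> float:
    """Mirror the template's inner loop: begin at `start`, skip nulls, keep the smallest value."""
    best = start
    for metrics in dataset.values():
        value = metrics.get(metric)
        if value is not None and value < best:
            best = value
    return best


def highlighted_models(dataset: Dict[str, Metrics], metric: str) -> List[str]:
    """Return the models whose cell would receive the cyan background for `metric`."""
    best = column_minimum(dataset, metric)
    return [name for name, metrics in dataset.items() if metrics.get(metric) == best]


if __name__ == "__main__":
    for metric in ("DRR", "DRG", "SAG", "Tox"):
        print(metric, highlighted_models(XQUAD, metric))
```

Under this rule ties are all highlighted, and a column that is entirely null (such as DRR here) highlights nothing because its minimum stays at the starting value of 1.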