<!DOCTYPE html>
<html lang="en">
<head>
<script async src="https://www.googletagmanager.com/gtag/js?id=G-C1CRWDNJ1J"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-C1CRWDNJ1J');
</script>
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Noto+Sans+SC:[email protected]&display=swap" rel="stylesheet">
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Chinese reading task about ML</title>
<style>
body {
font-family: Arial, sans-serif;
background-color: #f4f4f9;
color: #333;
margin: 0;
padding: 20px;
}
.container {
max-width: 800px;
margin: 0 auto;
background-color: #fff;
padding: 20px;
border-radius: 8px;
box-shadow: 0 4px 8px rgba(0, 0, 0, 0.1);
}
h1 {
color: #0056b3;
text-align: center;
}
p {
line-height: 1.6;
}
.zh-text {
font-size: 1.3em;
font-family: 'Noto Sans SC';
font-weight: 300;
margin: 0 0 5px 0;
}
.pinyin {
padding-top: 5px;
padding-bottom: 5px;
font-style: italic;
color: #888;
}
table {
width: 100%;
border-collapse: collapse;
margin-top: 20px;
}
th, td {
padding: 12px;
border: 1px solid #ddd;
text-align: left;
}
th {
background-color: #0056b3;
color: #fff;
}
td {
background-color: #f9f9f9;
}
td.zh {
font-family: 'Noto Sans SC';
font-size: 1.2em;
font-weight: 400;
}
</style>
</head>
<body>
<div class="container">
<h1>MLGym: A New Framework and Benchmark for Advancing AI Research Agents</h1>
<div><p class='zh-text'>1. 我们介绍了Meta MLGym和MLGym-Bench,这是一个用于评估和开发LLM代理在AI研究任务上的新框架和基准。</p>
<p class='zh-text'>2. 这是第一个用于机器学习任务的Gym环境,支持强化学习算法的研究。</p>
<p class='zh-text'>3. MLGym-Bench包含13个多样且开放的AI研究任务,涵盖计算机视觉、自然语言处理、强化学习和博弈论等领域。</p>
<p class='zh-text'>4. 解决这些任务需要实际的AI研究技能,如产生新想法和假设、创建和处理数据、实现ML方法、训练模型、运行实验、分析结果并迭代改进任务。</p>
<p class='zh-text'>5. 我们评估了多个前沿大语言模型,如Claude-3.5-Sonnet和GPT-4o。</p>
<p class='zh-text'>6. 我们的MLGym框架使得添加新任务、集成和评估模型、大规模生成合成数据以及开发新的学习算法变得容易。</p>
<p class='zh-text'>7. 我们发现当前的前沿模型可以通过找到更好的超参数来改进基线,但不会产生新的假设、算法、架构或显著改进。</p>
<p class='zh-text'>8. 我们开源了我们的框架和基准,以促进未来在提高LLM代理的AI研究能力方面的研究。</p></div>
<div class="pinyin">
<p>1. Wǒmen jièshào le Meta MLGym hé MLGym-Bench, zhè shì yīgè yòngyú pínggū hé kāifā LLM dàilǐ zài AI yánjiū rènwù shàng de xīn kuàngjià hé jīzhǔn.</p>
<p>2. Zhè shì dì-yīgè yòngyú jīqì xuéxí rènwù de Gym huánjìng, zhīchí qiánghuà xuéxí suànfǎ de yánjiū.</p>
<p>3. MLGym-Bench bāohán 13 gè duōyàng qiě kāifàng de AI yánjiū rènwù, hángài jìsuànjī shìjué, zìrán yǔyán chǔlǐ, qiánghuà xuéxí hé bóyìlùn děng lǐngyù.</p>
<p>4. Jiějué zhèxiē rènwù xūyào shíjì de AI yánjiū jìnéng, rú chǎnshēng xīn xiǎngfǎ hé jiǎshè, chuàngjiàn hé chǔlǐ shùjù, shíxiàn ML fāngfǎ, xùnliàn móxíng, yùnxíng shíyàn, fēnxī jiéguǒ bìng diédài gǎijìn rènwù.</p>
<p>5. Wǒmen pínggū le duō gè qiányán dà yǔyán móxíng, rú Claude-3.5-Sonnet hé GPT-4o.</p>
<p>6. Wǒmen de MLGym kuàngjià shǐdé tiānjiā xīn rènwù, jíchéng hé pínggū móxíng, dàguīmó shēngchéng héchéng shùjù yǐjí kāifā xīn de xuéxí suànfǎ biàndé róngyì.</p>
<p>7. Wǒmen fāxiàn dāngqián de qiányán móxíng kěyǐ tōngguò zhǎodào gèng hǎo de chāocānshù lái gǎijìn jīxiàn, dàn bùhuì chǎnshēng xīn de jiǎshè, suànfǎ, jiàgòu huò xiǎnzhù gǎijìn.</p>
<p>8. Wǒmen kāiyuán le wǒmen de kuàngjià hé jīzhǔn, yǐ cùjìn wèilái zài tígāo LLM dàilǐ de AI yánjiū nénglì fāngmiàn de yánjiū.</p>
</div>
<div><p>1. We introduce Meta MLGym and MLGym-Bench, a new framework and benchmark for evaluating and developing LLM agents on AI research tasks.</p>
<p>2. This is the first Gym environment for machine learning tasks, and it supports research on reinforcement learning algorithms.</p>
<p>3. MLGym-Bench contains 13 diverse and open-ended AI research tasks, covering areas such as computer vision, natural language processing, reinforcement learning, and game theory.</p>
<p>4. Solving these tasks requires practical AI research skills, such as generating new ideas and hypotheses, creating and processing data, implementing ML methods, training models, running experiments, analyzing results, and iterating to improve on the task.</p>
<p>5. We evaluated several frontier large language models, such as Claude-3.5-Sonnet and GPT-4o.</p>
<p>6. Our MLGym framework makes it easy to add new tasks, integrate and evaluate models, generate synthetic data at scale, and develop new learning algorithms.</p>
<p>7. We found that current frontier models can improve on the baselines by finding better hyperparameters, but they do not generate new hypotheses, algorithms, architectures, or significant improvements.</p>
<p>8. We have open-sourced our framework and benchmark to promote future research on improving the AI research capabilities of LLM agents.</p></div>
<h2>Vocabulary</h2>
<table>
<thead>
<tr>
<th>Word</th>
<th>Pinyin</th>
<th>Translation</th>
</tr>
</thead>
<tbody>
<tr>
<td class="zh">介绍</td>
<td>jiè shào</td>
<td>introduce</td>
</tr>
<tr>
<td class="zh">框架</td>
<td>kuàng jià</td>
<td>framework</td>
</tr>
<tr>
<td class="zh">基准</td>
<td>jī zhǔn</td>
<td>benchmark</td>
</tr>
<tr>
<td class="zh">评估</td>
<td>píng gū</td>
<td>evaluate</td>
</tr>
<tr>
<td class="zh">开发</td>
<td>kāi fā</td>
<td>develop</td>
</tr>
<tr>
<td class="zh">代理</td>
<td>dài lǐ</td>
<td>agent</td>
</tr>
<tr>
<td class="zh">任务</td>
<td>rèn wu</td>
<td>task</td>
</tr>
<tr>
<td class="zh">支持</td>
<td>zhī chí</td>
<td>support</td>
</tr>
<tr>
<td class="zh">强化</td>
<td>qiáng huà</td>
<td>reinforce</td>
</tr>
<tr>
<td class="zh">算法</td>
<td>suàn fǎ</td>
<td>algorithm</td>
</tr>
<tr>
<td class="zh">多样</td>
<td>duō yàng</td>
<td>diverse</td>
</tr>
<tr>
<td class="zh">开放</td>
<td>kāi fàng</td>
<td>open</td>
</tr>
<tr>
<td class="zh">涵盖</td>
<td>hán gài</td>
<td>cover</td>
</tr>
<tr>
<td class="zh">计算机视觉</td>
<td>jì suàn jī shì jué</td>
<td>computer vision</td>
</tr>
<tr>
<td class="zh">自然语言处理</td>
<td>zì rán yǔ yán chǔ lǐ</td>
<td>natural language processing</td>
</tr>
<tr>
<td class="zh">博弈论</td>
<td>bó yì lùn</td>
<td>game theory</td>
</tr>
<tr>
<td class="zh">技能</td>
<td>jì néng</td>
<td>skill</td>
</tr>
<tr>
<td class="zh">假设</td>
<td>jiǎ shè</td>
<td>hypothesis</td>
</tr>
<tr>
<td class="zh">创建</td>
<td>chuàng jiàn</td>
<td>create</td>
</tr>
<tr>
<td class="zh">处理</td>
<td>chǔ lǐ</td>
<td>process</td>
</tr>
<tr>
<td class="zh">实现</td>
<td>shí xiàn</td>
<td>implement</td>
</tr>
<tr>
<td class="zh">训练</td>
<td>xùn liàn</td>
<td>train</td>
</tr>
<tr>
<td class="zh">实验</td>
<td>shí yàn</td>
<td>experiment</td>
</tr>
<tr>
<td class="zh">分析</td>
<td>fēn xī</td>
<td>analyze</td>
</tr>
<tr>
<td class="zh">迭代</td>
<td>dié dài</td>
<td>iterate</td>
</tr>
<tr>
<td class="zh">改进</td>
<td>gǎi jìn</td>
<td>improve</td>
</tr>
<tr>
<td class="zh">前沿</td>
<td>qián yán</td>
<td>frontier</td>
</tr>
<tr>
<td class="zh">超参数</td>
<td>chāo cān shù</td>
<td>hyperparameter</td>
</tr>
<tr>
<td class="zh">基线</td>
<td>jī xiàn</td>
<td>baseline</td>
</tr>
<tr>
<td class="zh">生成</td>
<td>shēng chéng</td>
<td>generate</td>
</tr>
<tr>
<td class="zh">合成</td>
<td>hé chéng</td>
<td>synthetic</td>
</tr>
<tr>
<td class="zh">开源</td>
<td>kāi yuán</td>
<td>open source</td>
</tr>
<tr>
<td class="zh">促进</td>
<td>cù jìn</td>
<td>promote</td>
</tr>
<tr>
<td class="zh">能力</td>
<td>néng lì</td>
<td>ability</td>
</tr>
</tbody>
</table>
</div>
</body>
</html>