# jemdoc: menu{MENU}{index.html}, nofooter
==Yiping Wang 王宜平
~~~
{}{img_left}{photos/bio.jpg}{alt text}{146}{200}
Yiping Wang\n
Ph.D. student\n [https://www.cs.washington.edu/ Paul G. Allen School of Computer Science & Engineering], \n
[https://www.washington.edu/ University of Washington]\n
Email: ypwang61 at cs dot washington dot edu \n
[https://scholar.google.com/citations?user=IuMFxFUAAAAJ&hl=en&oi=ao Google Scholar ]\n
[https://twitter.com/ypwang61 Twitter]
~~~
== About me
I'm a first-year Ph.D. student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington.
I feel very fortunate to have been working under the guidance of [https://simonshaoleidu.com/index.html Prof. Simon Shaolei Du] since the summer of 2022.
My main research interests broadly span *machine learning theory* and *foundation models*.
On the theoretical side, I care about understanding the foundations of deep learning and representation learning, especially the *training dynamics* of basic components like the *Transformer*.
On the empirical side, I am keen on developing efficient algorithms backed by strong theoretical guarantees or insightful observations. Currently, I'm working on *data selection/scheduling for multi-modal pretraining* and on improving model efficiency.
In addition, I have a strong enthusiasm for understanding the essence of intelligence and exploring the intersections of mathematics, physics, and AGI, such as using LLMs for mathematical proofs.
Previously, I received my bachelor's degree in [http://www.en.cs.zju.edu.cn/ Computer Science & Technology] from [https://www.zju.edu.cn/english/ Zhejiang University] in 2023, with an honors degree from [http://ckc.zju.edu.cn/ckcen/_t1906/main.psp Chu Kochen Honors College].
I also minored in [http://www.math.zju.edu.cn/mathen/main.psp Mathematics] at [https://www.zju.edu.cn/english/ Zhejiang University].
During my undergraduate studies, I was very fortunate to work closely with [http://yuandong-tian.com/ Dr. Yuandong Tian], [https://www.huaxiuyao.io/ Prof. Huaxiu Yao], and [https://linjunz.github.io/ Prof. Linjun Zhang] on several exciting research projects, from which I learned a lot.
== Selected Research
\*: indicates equal contribution or alphabetical ordering.
. [https://arxiv.org/abs/2405.19547 CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning] [https://github.com/ypwang61/negCLIPLoss_NormSim \[Code\]] \n
\**Yiping Wang*, \*Yifang Chen, Wendan Yan, Alex Fang, Wenjing Zhou, Kevin Jamieson, Simon Shaolei Du \n
Preprint. \n
tl;dr: We design universal data selection methods for CLIP pretraining that achieve near-SOTA results with less than 10% of the preprocessing resources. Combined with the current best approaches, they achieve a new SOTA on the [https://www.datacomp.ai/dcclip/leaderboard.html DataComp benchmark].
. [https://arxiv.org/abs/2310.00535 JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention] \n
Yuandong Tian, *Yiping Wang*, Zhenyu Zhang, Beidi Chen, Simon Du. \n
International Conference on Learning Representations (ICLR) 2024 \n
tl;dr: We analyze the training dynamics of multilayer Transformers, characterizing the roles of self-attention and MLP nonlinearity, as well as how hierarchical structure is learned when the data follow hierarchical generative models.
. [https://arxiv.org/abs/2305.16380 Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer] \n
Yuandong Tian, *Yiping Wang*, Beidi Chen, Simon Du. \n
Conference on Neural Information Processing Systems (NeurIPS) 2023 \n
Selected as an {{<font color="red"> Oral </font>}} presentation at the High-dimensional Learning Dynamics Workshop at ICML 2023 \n
tl;dr: We analyze a 1-layer Transformer with next-token prediction loss, rigorously characterizing its training process and revealing how tokens are combined via the self-attention layer and the nature of its inductive bias.
. [https://arxiv.org/abs/2306.02556 Improved Active Multi-Task Representation Learning via Lasso] \n
*Yiping Wang*, Yifang Chen, Kevin Jamieson, Simon Du. \n
International Conference on Machine Learning (ICML) 2023 \n
tl;dr: We improve the sample complexity of active multi-task representation learning by proposing a new LASSO-based strategy.
. [https://arxiv.org/abs/2210.05775 C-Mixup: Improving Generalization in Regression] [https://github.com/huaxiuyao/C-Mixup \[Code\]] \n
\*Huaxiu Yao, \**Yiping Wang*, Linjun Zhang, James Zou, Chelsea Finn. \n
Conference on Neural Information Processing Systems (NeurIPS) 2022 \n
tl;dr: We propose a simple yet effective data augmentation method to improve generalization on regression tasks.