<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Wang Kaixuan, 3D Vision and Robotics</title>
<link href="css.css" rel="stylesheet" type="text/css" />
</head>
<!-- <meta charset="UTF-8"> -->
<body>
<div id="main">
<!-- <div id="top-nav">
<b>Wang Kaixuan</b>
<small>, Ph.D. Student on 3D Vision and Robotics</small>
</div> -->
<div id="content">
<a name="Home"></a>
<h1>Wang Kaixuan (王凯旋)<br />Ph.D. in 3D Vision and Robotics</h1>
<br />
<p align="justify">
<img border="0" src="images/myPhoto.jpg" align="left" height="250px" style="padding-right: 20px;padding-bottom: 20px" />
I received my Ph.D. from the <a href="http://uav.ust.hk/">UAV Group</a> at HKUST, supervised by <a href="http://uav.ust.hk/group/">Prof. Shaojie Shen</a>. My research includes depth estimation using monocular cameras, deep learning, and 3D reconstruction. Over the past years, I have been developing methods that turn images into dense 3D maps for robotic navigation, especially for UAVs. Most of this work is open source on GitHub for the benefit of the community. I worked at ByteDance AILab as an intern and on the DJI perception team as an algorithm engineer. I now work at Zhuoyu Tech (ZYT, 卓驭科技), leading a team for reconstruction and simulation.
</p>
<p>
You are welcome to contact me by email if you are interested in working on the ZYT perception team as an intern or a full-time engineer!
</p>
<p>
On this page, you can find a summary of my past projects and published papers.
</p>
<p>
You may also be interested in the work of my wife, <a href="https://yokoxue.github.io/">Ye Xue</a>, a researcher dedicated to machine learning theory and wireless communication.
</p>
<p>
<b>Link: </b>
<a href="https://github.com/WANG-KX"><img border="0" src="images/gh.png" height="13px" /> GitHub</a>,
<a href="https://scholar.google.com/citations?user=mSyd3BcAAAAJ"><img border="0" src="images/scholar.png" height="13px" />
Google Scholar</a>, <a href="https://www.youtube.com/channel/UCjeo3nQ31b9SJMqMSnyUj3Q"><img border="0" src="images/yt.png"
height="13px" /> Personal</a>, <a href="https://www.youtube.com/channel/UCUc35anCXfPI5U763nWbyKw"><img border="0"
src="images/yt.png" height="13px" />Group</a>
<br />
<b>Contact: </b>
<br />
wkx1993 AT gmail DOT com
<br />
halfbullet DOT wang AT zyt DOT com (for work/intern only)
<br />
<b>CV: </b>
<a href="pdf/CV.pdf"><img border="0" src="images/pdf.png" height="13px" />MyCV</a>
<br />
<a name="Projects"></a>
<br />
<h1>Project Highlights</h1>
<h2>Monocular Dense Mapping</h2>
<p align="justify"><b>Monocular Dense Mapping</b> aims to estimated dense depth maps using images from only one monocular camera. Compared with the widely used stereo perception, the one camera solution has the advantags of sensor size, weight and no need for extrinsic calibration. Using the visual inertial odometry method (e.g. <a href="https://github.com/HKUST-Aerial-Robotics/VINS-Mono">VINS</a>), UAV autonomous navigation can be demonstrated using only one camera and one IMU (see figure for an example).
</p>
<table align='center'>
<tr>
<td><img src="images/drone.png" height="200" align="center" />
<td><iframe height="200" src="https://www.youtube.com/embed/obwV1PFuPC0?start=84" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
</tr>
</table>
<p>
Two methods were developed to calculate the depth map using belief propagation. One, <a href="https://github.com/HKUST-Aerial-Robotics/Pinhole-Fisheye-Mapping">Pinhole-Fisheye-Mapping</a>, focuses on exploiting varying baselines for robust depth estimation. The other, <a href="https://github.com/HKUST-Aerial-Robotics/open_quadtree_mapping">Quadtree-Mapping</a>, uses the quadtree structure of the image to speed up the global depth optimization (a toy sketch of this structure follows the videos below). Benefiting from the development of deep learning, we also proposed <a href="https://github.com/HKUST-Aerial-Robotics/MVDepthNet">MVDepthNet</a>, which estimates depth maps from a single monocular camera in real time. All of the above methods have been deployed on real UAV systems and demonstrated their usability.
</p>
<table align='center'>
<tr>
<td><iframe width="200" src="https://www.youtube.com/embed/5zkkEMZaono?start=102" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
<td><iframe width="200" src="https://www.youtube.com/embed/GMlOvgewfeE?start=31" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
<td><iframe width="200" src="https://www.youtube.com/embed/8jUlN-ZROl0?start=157" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
</tr>
</table>
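<p align="justify">
To give a flavor of the quadtree structure mentioned above, the following is a minimal Python/NumPy sketch of an intensity-adaptive quadtree decomposition: blocks are split while their intensity variance is high, so flat regions end up as a few coarse leaves while textured regions are subdivided finely. The variance threshold and stopping rule here are illustrative assumptions, not the parameters of the released Quadtree-Mapping code.
</p>
<pre>
# Minimal quadtree-decomposition sketch (NumPy). The variance threshold
# and minimum block size are illustrative assumptions.
import numpy as np

def quadtree(img, x, y, size, var_thresh=50.0, min_size=4, leaves=None):
    """Return (x, y, size) leaf blocks of an intensity-adaptive quadtree."""
    if leaves is None:
        leaves = []
    block = img[y:y + size, x:x + size]
    if size &lt;= min_size or block.var() &lt; var_thresh:
        leaves.append((x, y, size))       # homogeneous enough: keep one leaf
    else:
        half = size // 2                  # otherwise split into four children
        for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
            quadtree(img, x + dx, y + dy, half, var_thresh, min_size, leaves)
    return leaves

img = np.zeros((256, 256), dtype=np.float32)
img[:, 128:] = np.random.rand(256, 128) * 255    # textured right half
leaves = quadtree(img, 0, 0, 256)  # coarse leaves on the left, fine on the right
</pre>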
<h2>Surfel-based 3D Reconstruction</h2>
<p align="justify"><b>3D reconstruction</b> needs more than depth images with estimated camera poses. Since there exist noises in depth maps and estimated camera poses will drift in the long term, a proper 3d fusion method is needed to build a dense, globally consistent model for visualization and path planning. <a href="https://github.com/HKUST-Aerial-Robotics/DenseSurfelMapping">SurfelMapping</a> is a reconstruction method that can work with state-of-the-art SLAM methods (e.g., VINS, ORB-SLAM). The map is represented as a collection of surfels such that it can be deformed efficiently according to the pose graph optimization of SLAM systems. Unlike ElasticFusion which model surfel for each pixel, we estimate surfels for extracted superpixels to gain efficiency. SurfelMapping is capable of reconstructing the KITTI dataset in real time without GPU acceleration.
<table align='center'>
<tr>
<td><iframe width="560" height="315" src="https://www.youtube.com/embed/2gZNpFE_yI4?start=120" frameborder="0"
allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</tr>
</table>
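<p align="justify">
As a toy illustration of the surfel representation, the Python sketch below keeps a position, a normal, and a confidence weight per surfel, and fuses each new depth measurement by a confidence-weighted running average. This is a hypothetical minimal model; the released DenseSurfelMapping additionally attaches surfels to extracted superpixels and deforms them with the SLAM pose graph.
</p>
<pre>
# Hypothetical minimal surfel model: position, normal, confidence weight.
# Only the fusion rule is shown; superpixel association and pose-graph
# deformation from the actual system are omitted.
import numpy as np

class Surfel:
    def __init__(self, position, normal, weight=1.0):
        self.position = np.asarray(position, dtype=float)
        self.normal = np.asarray(normal, dtype=float)
        self.weight = float(weight)

    def fuse(self, position, normal, weight=1.0):
        """Fold a new measurement in by a confidence-weighted running average."""
        total = self.weight + weight
        self.position = (self.weight * self.position
                         + weight * np.asarray(position, dtype=float)) / total
        n = self.weight * self.normal + weight * np.asarray(normal, dtype=float)
        self.normal = n / np.linalg.norm(n)   # renormalize the averaged normal
        self.weight = total

# Fusing two noisy observations of the same surface patch:
s = Surfel([1.0, 0.0, 2.0], [0.0, 0.0, 1.0])
s.fuse([1.02, -0.01, 1.98], [0.0, 0.05, 1.0])
print(s.position, s.normal, s.weight)
</pre>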
<h2>Deep Learning for 3D Vision</h2>
<p align="justify"><b>Deep learning</b> has been developed a lot in recent years. Some of my work tries to solve 3D vision problems by using learning methods. <a href="https://github.com/HKUST-Aerial-Robotics/MVDepthNet">MVDepthNet</a> is one of the first methods that use networks to solve multiview stereo problems but is the only one that is designed for real-time performance. MVDepthNet is further used by Magic Leap in <a href="https://arxiv.org/pdf/1904.11595.pdf">DeepPerimeter</a> for indoor reconstruction. <a href="https://github.com/HKUST-Aerial-Robotics/Flow-Motion-Depth">Flow-Motion-Depth</a> solves the motion stereo problem by a carefully designed network that can estimate depth maps and camera motions given two images. I also take a deep study into monocular depth prediction. Since the quantity of available datasets for depth learning is quite limited, we propose <a href="https://github.com/HKUST-Aerial-Robotics/GeometricPretraining">GeometricPretraining</a> that can pretrain networks with unlimited internet videos. With the pretrained networks, we achieve new state-of-the-art performance. The performance is demonstrated in the following:
<table align='center'>
<tr>
<td><iframe width="300" src="https://www.youtube.com/embed/wlobHEpPXDU" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
<td><iframe width="300" src="https://www.youtube.com/embed/5AKuyMUk4w8" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen></iframe>
</tr>
</table>
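<p align="justify">
A core ingredient of learning-based multiview stereo is the plane-sweep cost volume: the source image is warped into the reference view at a set of depth hypotheses, and the per-depth photometric differences are stacked as network input. The PyTorch sketch below shows this construction under assumed shapes and naming; it illustrates the general technique, not the released MVDepthNet code.
</p>
<pre>
# Hypothetical plane-sweep cost volume in PyTorch; shapes and names are
# illustrative assumptions, not the released MVDepthNet implementation.
import torch
import torch.nn.functional as F

def cost_volume(ref, src, K, K_inv, R, t, depths):
    """ref, src: (B,1,H,W) grayscale; returns a (B,D,H,W) cost volume."""
    B, _, H, W = ref.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).float().view(3, -1)
    rays = K_inv @ pix                              # back-projected pixel rays
    costs = []
    for d in depths:
        cam = R @ (rays * d) + t.view(3, 1)         # hypothesis points in the source frame
        p = K @ cam                                 # project into source pixels
        u = 2.0 * (p[0] / p[2]) / (W - 1) - 1.0     # normalize to [-1, 1] for grid_sample
        v = 2.0 * (p[1] / p[2]) / (H - 1) - 1.0
        grid = torch.stack([u, v], -1).view(1, H, W, 2).expand(B, H, W, 2)
        warped = F.grid_sample(src, grid, align_corners=True)
        costs.append((warped - ref).abs().mean(1))  # per-depth photometric cost
    return torch.stack(costs, 1)                    # (B, D, H, W)
</pre>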
<br />
<a name="Publications"></a>
<h1>Publications</h1>
<p>
<img border="0" src="pdf_img/gao_fisheyes.jpg" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Autonomous aerial robot using dual-fisheye cameras</b> (W. Gao, K. Wang, W. Ding, F. Gao, T. Qin and S. Shen), In <i>Journal of Field Robotics</i>, 2020.
<br />
<b>To appear</b>
<!-- <a href="https://ieeexplore.ieee.org/document/8811597"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a> -->
<!-- <a href="bib/ding2019tro.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=sg46XT9-o1k"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://arxiv.org/abs/1906.09785"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv version]</a> -->
</p>
<p>
<img border="0" src="pdf_img/geopretrain.png" align="left" width="80px" height="70px" style="padding-right: 10px;padding-bottom: 0px" />
<b>Geometric Pretraining for Monocular Depth Estimation</b> (K. Wang and Y. Chen and H. Guo and L. Wen and S. Shen), In <i>IEEE International
Conference on Robotics and Automation (ICRA)</i>, 2020.
<br />
<!-- <a href="pdf/flow_motion_depth.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a> -->
<!-- <a href="pdf/flow_motion_depth_sup.pdf"><img border="0" src="images/pdf.png" height="13px" /> [supplementary pdf]</a> -->
<a href="bib/geopretrain.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/GeometricPretraining"><img border="0" src="images/gh.png" height="13px" />[code]</a>
<!-- <a href="https://arxiv.org/abs/1909.05452"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv -->
<!-- version]</a> -->
</p>
<p>
<img border="0" src="pdf_img/flow-motion-depth.png" align="left" width="80px" style="padding-right: 10px;padding-bottom: 0px" />
<b>Flow-Motion and Depth Network for Monocular Stereo and Beyond</b> (K. Wang and S. Shen), arXiv: 1909.05452, In <i>IEEE Robotics and Automation Letters (RA-L)</i>, 2020, presented at ICRA 2020.
<br />
<a href="pdf/flow_motion_depth.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="pdf/flow_motion_depth_sup.pdf"><img border="0" src="images/pdf.png" height="13px" /> [supplementary pdf]</a>
<a href="bib/flow_motion_depth.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/Flow-Motion-Depth"><img border="0" src="images/gh.png" height="13px" />[code]</a>
<a href="https://arxiv.org/abs/1909.05452"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv version]</a>
</p>
<p>
<img border="0" src="pdf_img/flowNorm.png" align="left" width="80px" style="padding-right: 10px;padding-bottom: 0px" />
<b>FlowNorm: A Learning-based Method for Increasing Convergence
Range of Direct Alignment</b> (K. Wang, K. Wang and S. Shen), arXiv: 1910.07217, In <i>IEEE International Conference on Robotics and Automation (ICRA)</i>, 2020.
<br />
<!-- <a href="pdf/flow_motion_depth.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a> -->
<!-- <a href="pdf/flow_motion_depth_sup.pdf"><img border="0" src="images/pdf.png" height="13px" /> [supplementary pdf]</a> -->
<a href="bib/flownorm.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<!-- <a href="https://github.com/HKUST-Aerial-Robotics/Flow-Motion-Depth"><img border="0" src="images/gh.png" height="13px" />[code]</a> -->
<a href="https://arxiv.org/abs/1910.07217"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv version]</a>
</p>
<p>
<img border="0" src="pdf_img/surfel_mapping.png" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Real-time Scalable Dense Surfel Mapping</b> (K. Wang and F. Gao and S. Shen), In <i>IEEE International
Conference on Robotics and Automation (ICRA)</i>, 2019.
<br />
<a href="pdf/surfel_fusion.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/surfel_fusion.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=2gZNpFE_yI4"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://arxiv.org/abs/1909.04250"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv version]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/DenseSurfelMapping"><img border="0" src="images/gh.png" height="13px" />[code]</a>
</p>
<p>
<img border="0" src="images/drone.png" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>An Efficient B-spline-Based Kinodynamic Replanning Framework for Quadrotors</b> (W. Ding and W. Gao and K. Wang and S. Shen), In <i>IEEE Transactions on Robotics (T-RO)</i>, 2019.
<br />
<a href="https://ieeexplore.ieee.org/document/8811597"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/ding2019tro.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=sg46XT9-o1k"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://arxiv.org/abs/1906.09785"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv version]</a>
</p>
<p>
<img border="0" src="pdf_img/teach_repeat.png" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Optimal Trajectory Generation for Quadrotor Teach-and-Repeat </b> (F. Gao and L. Wang and K. Wang and W. Wu and B. Zhou and L. Han and S. Shen), In <i>IEEE Robotics and Automation Letters</i>, 2019.
<br />
<a href="https://ieeexplore.ieee.org/document/8625495"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/teach_repeat.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=UUu_eK26EP0"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/Teach-Repeat-Replan"><img border="0" src="images/gh.png" height="13px" />[code]</a>
</p>
<p>
<img border="0" src="pdf_img/mvdepthnet.png" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>MVDepthNet: real-time multiview depth estimation neural network</b> (K. Wang and S. Shen), In <i>International
Conference on 3D Vision (3DV)</i>, 2018, <b>Oral</b>.
<br />
<a href="pdf/mvdepthnet.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/mvdepthnet.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=X540oWQnVko"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://arxiv.org/abs/1807.08563"><img border="0" src="images/arxiv.png" height="13px" /> [2018 arXiv version]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/MVDepthNet"><img border="0" src="images/gh.png" height="13px" />[code]</a>
</p>
<p>
<img border="0" src="pdf_img/pinhole_fisheye.png" align="left" width="80px" height="100px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Adaptive Baseline Monocular Dense Mapping with Inter-frame Depth Propagation</b> (K. Wang and S. Shen), In <i>The
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</i>, 2018.
<br />
<a href="pdf/pinhole_fisheye.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/pinhole_fisheye.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=5zkkEMZaono"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/Pinhole-Fisheye-Mapping"><img border="0" src="images/gh.png" height="13px" />code]</a>
<br />
<br />
<br />
</p>
<p>
<img border="0" src="pdf_img/quadtree_mapping.png" align="left" width="80px" height="60px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Quadtree-accelerated Real-time Monocular Dense Mapping</b> (K. Wang and W. Ding and S. Shen), In <i>The
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</i>, 2018.
<br />
<a href="pdf/quadtree_mapping.pdf"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/quadtree_mapping.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=GMlOvgewfeE"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/open_quadtree_mapping"><img border="0" src="images/gh.png"
height="13px" />[code]</a>
</p>
<p>
<img border="0" src="pdf_img/ling2018iros.png" align="left" width="80px" height="60px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Probabilistic dense reconstruction from a moving camera</b> (Y. Ling and K. Wang and W. Ding and S. Shen), In
<i>The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</i>, 2018.
<br />
<a href="https://ieeexplore.ieee.org/document/8593618"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/ling2018iros.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=XhEkYZujoKA"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://github.com/HKUST-Aerial-Robotics/probabilistic_mapping"><img border="0" src="images/gh.png"
height="13px" />[code]</a>
</p>
<p>
<img border="0" src="images/drone.png" align="left" width="80px" height="60px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Trajectory replanning for quadrotors using kinodynamic search and elastic optimization</b> (W. Ding and W. Gao
and K. Wang and S. Shen), In <i>IEEE International Conference on Robotics and Automation (ICRA)</i>, 2018.
<br />
<a href="https://ieeexplore.ieee.org/document/8463188"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
<a href="bib/ding2018icra.txt"><img border="0" src="images/bib.png" height="13px" /> [bib]</a>
<a href="https://www.youtube.com/watch?v=obwV1PFuPC0"><img border="0" src="images/yt.png" height="13px" /> [video]</a>
<a href="https://arxiv.org/abs/1903.01139"><img border="0" src="images/arxiv.png" height="13px" /> [2019 arXiv version]</a>
</p>
<h3>Undergraduate Projects</h3>
<p>
<img border="0" src="pdf_img/control.png" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>Spherical Formation Tracking Control of Second-Order Nonlinear Agents With Directed Communication</b> (Y. Chen
and K. Wang and Y. Zhang), In <i>12th IEEE International Conference on Control & Automation (ICCA)</i>, 2016.
<br />
<a href="https://ieeexplore.ieee.org/document/7505334"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
</p>
<p>
<img border="0" src="pdf_img/control.png" align="left" width="80px" height="50px" style="padding-right: 10px;padding-bottom: 20px" />
<b>A geometric extension design for second-order nonlinear agents formation surrounding a sphere</b> (Y. Chen and K. Wang and Y. Zhang and C. Liu and Q. Wang), In <i>Chinese Control and Decision Conference (CCDC)</i>, 2016.
<br />
<a href="https://ieeexplore.ieee.org/document/7505334"><img border="0" src="images/pdf.png" height="13px" /> [pdf]</a>
</p>
</div>
</div>
</body>
</html>