20220110-20220123.md

Opinion shares

Diagramming tool recommendation

Keenan Crane recommends this:

Penrose: Use Penrose to create diagrams from plain-text mathematical notation.

The whole system is open source, and (if you're brave!) you can run it yourself: https://github.com/penrose/penrose

DL architectures

Taco Cohen

The first law of DL architectures: "Whatever" is all you need. Any problem that can be solved by transformer / ViT can be solved by MLP / CNN, and vice versa (provided you do exhaustive tuning, and use the right inductive bias).

Like A ConvNet for the 2020s

Same for RNNs: Capacity and Trainability in Recurrent Neural Networks

paper writing

Thomas G. Dietterich The term "ablation" is widely misused lately in ML papers. An ablation is a removal: you REMOVE some component of the system (e.g., remove batchnorm). A "sensitivity analysis" is where you VARY some component (e.g., network width).
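The distinction can be made concrete with a small experiment-configuration sketch (the config keys and values here are hypothetical, chosen only to illustrate the two study designs):

```python
# Hypothetical baseline configuration for some network.
base = {"batchnorm": True, "width": 256, "depth": 4}

# Ablation: REMOVE one component entirely; everything else stays fixed.
ablation = {**base, "batchnorm": False}

# Sensitivity analysis: VARY one component across a range of values.
sensitivity = [{**base, "width": w} for w in (64, 128, 256, 512)]

print(ablation["batchnorm"])               # False: the component is gone
print([c["width"] for c in sensitivity])   # [64, 128, 256, 512]
```

An ablation study thus answers "does the system still work without X?", while a sensitivity analysis answers "how does performance change as X varies?".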

Jia-Bin Huang

Unsolicited paper writing tip:

Don’t claim “We are the FIRST to do…”

I've found many papers that falsely claim this as a contribution. (You never know what some researchers already did in the '90s!)

debate between ConvNets and Transformers

Yann LeCun

ConvNeXt: the debate heats up between ConvNets and Transformers for vision!

Very nice work from FAIR+BAIR colleagues showing that with the right combination of methods, ConvNets are better than Transformers for vision: 87.1% top-1 on ImageNet-1k.

Some of the helpful tricks make complete sense: larger kernels, layer norm, fat layer inside residual blocks, one stage of non-linearity per residual block, separate downsampling layers....

Am I going to argue that "Conv is all you need"? No! My favorite architecture is DETR-like: ConvNet (or ConvNeXt) for the first layers, then something more memory-based and permutation invariant like transformer blocks for object-based reasoning on top.

Courses and talks

Machine-learning book based on PyTorch

Great video on online presentation tips

Taichi's 2021 (https://twitter.com/TaichiGraphics/status/1481932497386881031): find out how much progress this parallel programming language has made in terms of functionality, performance, application scenarios, and its community.

PlotNeuralNet: Use LaTeX for drawing neural network diagrams!

PEPit: PEPit is a new Python package for computer-assisted worst-case analyses; Too busy/tired/lazy to find a convergence proof for your latest optimization algorithm? Let your computer do it!
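PEPit itself searches for worst-case certificates symbolically, so its real API is different from anything below. As a minimal stdlib-only sketch of what "worst-case analysis" means, here is a brute-force scan for the worst per-step contraction of gradient descent with step size 1/L over the toy class of 1-D quadratics f(x) = a·x²/2 with curvature a in [mu, L] (mu and L are assumed parameters of the function class):

```python
# One gradient step on f(x) = a*x**2/2 with step size 1/L maps
# x to (1 - a/L)*x.  The worst-case contraction over the class
# a in [mu, L] is max |1 - a/L|, attained at the flattest function a = mu.
mu, L = 0.1, 1.0
step = 1.0 / L
worst = max(abs(1.0 - step * (mu + (L - mu) * i / 1000))
            for i in range(1001))
print(worst)  # equals 1 - mu/L = 0.9 (up to float rounding)
```

The brute-force scan recovers the known rate 1 - mu/L for this toy class; PEPit automates this kind of reasoning rigorously for general first-order methods.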

This Hausdorff School is intended for motivated graduate or postdoctoral students of mathematics or computer science. Leading experts discuss geometric and probabilistic approximation techniques and their connections to learning theory.

Dates: May 23–27; apply by March 27. A CV and Letter of Intent (1 page) are required, as well as the name and contact information of a potential reference.

Kicking off our TUM AI Lecture Series in 2022 on Monday with none other than Jitendra Malik! He will be talking about "Learning to Walk With Vision and Proprioception".

In this episode I had a chat with Jing Zhang about UC-Net. If you are a student/researcher in #ComputerVision #ML #AI this is for you!!!

UC-Net proposes a probabilistic RGBD saliency detection network. They model the uncertainty in human annotation using a conditional variational autoencoder and generate multiple saliency maps for each input image by sampling the latent space.


Research highlights and discussion

Accelerating implicit neural learning: Natalya Tatarchuk: If you haven't seen the Instant Neural Graphics Primitives with a Multiresolution Hash Encoding paper, take a look: truly groundbreaking work. This changes the equation for NeRF rendering in a significant way.


ConvNeXt: Constructed entirely from standard ConvNet modules, it achieves 87.8% ImageNet top-1 accuracy and outperforms Swin Transformers (ICCV 2021 best paper) on COCO detection and ADE20K segmentation.


How does it work? You only need to install it and open a compatible LDR or HDR file. The viewer will open automatically, allowing you to view the image, move it, zoom in and out, and adjust the exposure.

The first general high-performance self-supervised algorithm for speech, vision, and text. When applied to different modalities, it matches or outperforms the best self-supervised algorithms.

Excited to share our work on Conditional Object-Centric Learning from Video!

We introduce SAVi, a slot-based model that can discover + represent visual entities in videos, using simple location cues and object motion (...or entirely unsupervised).

NVIDIA’s New AI Draws Images With The Speed of Thought!

Hiring

Hiring researchers on virtual humans (Michael Black's group)

SGI is recruiting interns