Keenan Crane recommends this:
Penrose: a system for creating diagrams just by typing mathematical notation in plain text (#UsePenrose)
The whole system is open source, and (if you're brave!) you can run it yourself: https://github.com/penrose/penrose
The first law of DL architectures: "whatever" is all you need. Any problem that can be solved by a transformer/ViT can also be solved by an MLP/CNN, and vice versa (provided you do exhaustive tuning and use the right inductive bias).
The same holds for RNNs: see "Capacity and Trainability in Recurrent Neural Networks".
Thomas G. Dietterich: The term "ablation" is widely misused in recent ML papers. An ablation is a removal: you REMOVE some component of the system (e.g., remove batchnorm). A "sensitivity analysis" is one where you VARY some component (e.g., network width).
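The distinction can be made concrete with experiment configs. This is a minimal sketch; the config keys ("batchnorm", "width", "dropout") and the idea of a `train_and_eval(cfg)` runner are hypothetical placeholders, not any real API:

```python
# Hypothetical baseline experiment configuration.
baseline = {"batchnorm": True, "width": 64, "dropout": 0.1}

# Ablation: REMOVE a component and retrain, keeping everything else fixed.
ablation_run = {**baseline, "batchnorm": False}

# Sensitivity analysis: VARY a component across a range of values.
sensitivity_runs = [{**baseline, "width": w} for w in (16, 32, 64, 128)]

for cfg in [baseline, ablation_run, *sensitivity_runs]:
    print(cfg)  # in practice: train_and_eval(cfg)
```

The point: an ablation compares "with component" vs. "without component"; a sensitivity analysis sweeps a knob while the rest of the system stays intact.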
Unsolicited paper writing tip:
Don’t claim “We are the FIRST to do…”
Many papers have falsely claimed this as a contribution. (You never know what some researchers already did back in the '90s!)
ConvNeXt: the debate between ConvNets and Transformers for vision heats up!
Very nice work from FAIR and BAIR colleagues showing that, with the right combination of methods, ConvNets can beat Transformers for vision: 87.1% top-1 on ImageNet-1k.
Some of the helpful tricks make complete sense: larger kernels, layer norm, a fat layer inside residual blocks, one stage of non-linearity per residual block, separate downsampling layers, and so on.
Am I going to argue that "Conv is all you need"? No! My favorite architecture is DETR-like: a ConvNet (or ConvNeXt) for the first layers, then something more memory-based and permutation-invariant, like transformer blocks, for object-based reasoning on top.
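A toy numpy sketch of that hybrid shape, under heavy simplifying assumptions: patch average-pooling stands in for a real conv backbone, and a single random-weight attention head stands in for trained transformer blocks. Only the data flow (conv features → tokens → permutation-invariant attention) reflects the idea above:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_backbone(img, patch=8):
    """Stand-in for the ConvNet stage: average-pool patches into tokens.
    (A real backbone would be a stack of conv layers.)"""
    H, W, C = img.shape
    pooled = img.reshape(H // patch, patch, W // patch, patch, C).mean(axis=(1, 3))
    return pooled.reshape(-1, C)  # (num_tokens, C)

def self_attention(tokens, d=16):
    """One permutation-invariant single-head attention block (random weights)."""
    C = tokens.shape[1]
    Wq, Wk, Wv = (rng.normal(size=(C, d)) for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = q @ k.T / np.sqrt(d)
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)          # softmax over tokens
    return attn @ v  # (num_tokens, d)

img = rng.normal(size=(64, 64, 3))
tokens = conv_backbone(img)       # ConvNet stage -> 64 tokens
out = self_attention(tokens)      # transformer-style reasoning on top
print(tokens.shape, out.shape)    # (64, 3) (64, 16)
```

The design point: the conv stage exploits locality and weight sharing cheaply, while the attention stage on top treats the resulting tokens as a set, which suits object-level reasoning.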
Machine-learning book based on PyTorch
Great video on online presentation tips
Taichi's 2021 year in review (https://twitter.com/TaichiGraphics/status/1481932497386881031): find out how much progress this parallel programming language has made in terms of functionality, performance, application scenarios, and its community.
PlotNeuralNet: use LaTeX to make neural-network diagrams!
PEPit: a new Python package for computer-assisted worst-case analyses. Too busy/tired/lazy to find a convergence proof for your latest optimization algorithm? Let your computer do it!
This Hausdorff School is intended for motivated graduate or postdoctoral students of mathematics or computer science. Leading experts discuss geometric and probabilistic approximation techniques and their connections to learning theory.
Dates: May 23–27; apply by March 27. A CV and Letter of Intent (1 page) are required, as well as the name and contact information of a potential reference.
Kicking off our TUM AI Lecture Series in 2022 on Monday with none other than Jitendra Malik! He will be talking about "Learning to Walk With Vision and Proprioception".
In this episode I had a chat with Jing Zhang about UC-Net. If you are a student/researcher in #ComputerVision #ML #AI, this is for you!
UC-Net proposes a probabilistic RGBD saliency detection network. They model the uncertainty in human annotation using a conditional variational autoencoder and generate multiple saliency maps for each input image by sampling the latent space.
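The sampling idea can be sketched in a few lines of numpy. This is not UC-Net's architecture; the "encoder" and "decoder" here are random projections standing in for trained networks, and only the CVAE mechanics (reparameterized latent sampling → multiple saliency maps per image) match the description above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights standing in for trained networks.
W_mu, W_lv = rng.normal(size=(2, 3, 8))   # encoder heads for mu and log-variance
Wz = rng.normal(size=(8,))                # decoder's latent projection

def encoder(image):
    """Maps an image to the mean and log-variance of a latent Gaussian."""
    feat = image.mean(axis=(0, 1))        # (3,) crude global feature
    return feat @ W_mu, feat @ W_lv       # mu, log_var, each (8,)

def decoder(image, z):
    """Combines image features with a latent sample z into one saliency map."""
    logits = image.mean(axis=2) + float(z @ Wz)
    return 1.0 / (1.0 + np.exp(-logits))  # (H, W), values in (0, 1)

image = rng.normal(size=(32, 32, 3))
mu, log_var = encoder(image)

# Sampling the latent several times yields several plausible saliency maps,
# reflecting the uncertainty in human annotation.
maps = []
for _ in range(5):
    z = mu + np.exp(0.5 * log_var) * rng.normal(size=mu.shape)  # reparameterization
    maps.append(decoder(image, z))
print(len(maps), maps[0].shape)  # 5 (32, 32)
```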
Accelerating implicit neural learning: Natalya Tatarchuk: If you haven't seen the "Instant Neural Graphics Primitives with a Multiresolution Hash Encoding" paper, take a look: truly groundbreaking work. This changes the equation for NeRF rendering in a significant way.
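The core trick is cheap to sketch. Below is a minimal 2D version of a multiresolution spatial hash in the spirit of that paper; the level count, table size, and growth factor are illustrative, and a real implementation would use the returned indices to look up trainable feature vectors (with interpolation between neighboring cells):

```python
# Large prime for the spatial hash; coordinate 0 is left unmultiplied.
PRIMES = (1, 2654435761)

def hash_cell(ix, iy, table_size):
    """Spatial hash of an integer grid cell into a fixed-size feature table."""
    return ((ix * PRIMES[0]) ^ (iy * PRIMES[1])) % table_size

def multires_indices(x, y, levels=8, base_res=16, growth=1.5, table_size=2**14):
    """Hash the cell containing (x, y) in [0,1)^2 at every resolution level."""
    out = []
    for lvl in range(levels):
        res = int(base_res * growth ** lvl)  # grid resolution at this level
        out.append(hash_cell(int(x * res), int(y * res), table_size))
    return out

idx = multires_indices(0.3, 0.7)
print(idx)  # 8 indices, one feature-table slot per resolution level
```

Because the encoding concentrates capacity in small hash tables instead of a deep MLP, lookups are fast and the remaining network can be tiny, which is where the speedup comes from.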
ConvNeXt: constructed entirely from standard ConvNet modules, it achieves 87.8% ImageNet top-1 accuracy and outperforms Swin Transformers (ICCV 2021 best paper) on COCO detection and ADE20K segmentation.
How does it work? Just install it and open a compatible LDR or HDR file. The viewer opens automatically, letting you view the image, pan, zoom in and out, and adjust the exposure.
The first general high-performance self-supervised algorithm for speech, vision, and text. Applied to these different modalities, it matches or outperforms the best existing self-supervised algorithms.
Excited to share our work on Conditional Object-Centric Learning from Video!
We introduce SAVi, a slot-based model that can discover and represent visual entities in videos, using simple location cues and object motion (or entirely unsupervised).
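The slot mechanism at the heart of models like this can be sketched in numpy. This is a stripped-down, assumption-laden version of slot attention: no learned projections, GRU, or LayerNorm, only the distinctive step where the softmax is taken over slots so that slots compete to explain each input feature:

```python
import numpy as np

rng = np.random.default_rng(0)

def slot_attention_step(slots, inputs, d):
    """One simplified slot-attention iteration.
    Softmax over SLOTS makes slots compete for input features;
    each slot is then updated to the weighted mean of the features it won."""
    scores = slots @ inputs.T / np.sqrt(d)           # (K, N)
    attn = np.exp(scores - scores.max(axis=0, keepdims=True))
    attn /= attn.sum(axis=0, keepdims=True)          # normalize over slots
    attn /= attn.sum(axis=1, keepdims=True) + 1e-8   # weighted mean over inputs
    return attn @ inputs                             # updated slots (K, d)

N, K, d = 64, 4, 16               # 64 input features, 4 slots
inputs = rng.normal(size=(N, d))  # e.g. per-pixel conv features of a frame
slots = rng.normal(size=(K, d))   # slot initializations (or location cues)
for _ in range(3):                # iterative refinement
    slots = slot_attention_step(slots, inputs, d)
print(slots.shape)  # (4, 16)
```

Initializing the slots from location cues (rather than random noise) is one natural way to read the "simple location cues" conditioning mentioned above.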
NVIDIA’s New AI Draws Images With The Speed of Thought!