# Taskonomy

March 2020

tl;dr: Answers the question: which tasks transfer well to which others?

## Overall impression

This paper proposes a large dataset of 4 million images, each labeled with ground truth for 26 tasks. This work directly inspired task grouping, which answers a different question: how to perform multi-task learning more efficiently.

The main purpose of transfer learning is to reduce the number of labeled images needed to train a task; the authors focus on supervision efficiency. Given enough images, training from scratch is also viable, per Rethinking ImageNet Pre-training.

## Key ideas

- A fully computational approach to reveal the relationships between tasks. Previously, the relationships were based on human intuition or analytical knowledge.
  - Depth -> normals is easy for humans, but the opposite holds for neural networks.
- Four main steps:
  - Task-specific modeling
  - Transfer modeling (frozen encoder, trained shallow transfer network + decoder)
  - Normalization with AHP (analytic hierarchy process): only assumes monotonicity and keeps the ordinal order (see the sketch after this list)
  - Taxonomy extraction with BIP (boolean integer programming) under constraints
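
Below is a minimal numpy sketch of the AHP normalization step. It assumes a pairwise "tournament" matrix `wins`, where `wins[i, j]` is the fraction of held-out images on which transferring from source `i` beats source `j` for a given target; the function name and toy numbers are illustrative, not the paper's.

```python
import numpy as np

def ahp_affinities(wins: np.ndarray) -> np.ndarray:
    """Turn a pairwise tournament matrix into per-source affinities.

    wins[i, j]: fraction of test images where source i yields a lower
    target loss than source j. Following AHP, build the ratio matrix
    W[i, j] = wins[i, j] / wins[j, i] and take its principal eigenvector.
    Only win/loss counts enter, so any monotonic rescaling of the raw
    losses leaves the resulting ordinal ranking unchanged.
    """
    eps = 1e-6                                  # avoid division by zero
    W = (wins + eps) / (wins.T + eps)           # pairwise preference ratios
    eigvals, eigvecs = np.linalg.eig(W)
    principal = np.abs(eigvecs[:, np.argmax(eigvals.real)].real)
    return principal / principal.sum()          # normalize to sum to 1

# Toy example: 3 candidate source tasks for one target task.
wins = np.array([[0.5, 0.7, 0.9],
                 [0.3, 0.5, 0.6],
                 [0.1, 0.4, 0.5]])
print(ahp_affinities(wins))                     # source 0 ranks highest
```

Because only win/loss counts enter the matrix, any monotonic rescaling of the raw losses leaves the ranking unchanged, which is the ordinal property the bullet above refers to.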

## Technical details

- With only 2% of the data, fine-tuning from reshading to surface normals already yields good results.
- Architecture (see the sketch after this list)
  - The backbone is ResNet-50.
  - Neck: 2 conv layers (channels are concatenated for higher-order transfers).
  - Decoder: a 15-layer fully convolutional network for pixel-level tasks, or 2-3 FC layers for low-dimensional tasks.
- Transitive transfer refers to multi-hop transfers.
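
A minimal PyTorch sketch of the transfer-modeling setup: a frozen source-task ResNet-50 encoder with a shallow trainable readout on top. Channel counts, kernel sizes, and class names here are my assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class TransferFunction(nn.Module):
    """Shallow transfer network: frozen source encoder + small trainable head."""
    def __init__(self, out_channels: int):
        super().__init__()
        backbone = resnet50(weights=None)  # load source-task weights in practice
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])  # drop pool/fc
        for p in self.encoder.parameters():
            p.requires_grad = False        # source encoder stays frozen
        # Shallow readout: 2 conv layers, as in the "neck" described above.
        self.transfer = nn.Sequential(
            nn.Conv2d(2048, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(256, out_channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        with torch.no_grad():
            feats = self.encoder(x)        # frozen source features
        return self.transfer(feats)        # only this part is trained

model = TransferFunction(out_channels=3)   # e.g. surface normals
y = model(torch.randn(1, 3, 224, 224))     # -> (1, 3, 7, 7) feature-resolution map
```

For higher-order transfers, the feature maps of several frozen source encoders would be concatenated along the channel dimension before the transfer layers, per the "concat channels" note above.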

## Notes

- Video of the presentation at CVPR.
- What if I have a new task and want to quantify which other tasks are related to it?
  - About 2k images are needed to find the best sources for it.
  - Use pretrained networks and the 2k images to fine-tune.
- Beam search (see the sketch after this list)
  - Beam search is an efficient way to explore a graph. It is not optimal: there is no guarantee that it finds the best solution.
  - Beam search with beam width B=∞ is essentially breadth-first search.
  - Beam search with beam width B=1 is essentially greedy search.
- The powerset of a set S is the set of all subsets of S, including the empty set and S itself, variously denoted $\wp(S)$.
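
A toy sketch connecting the last two notes: the powerset of n candidate sources has 2^n subsets, so choosing source sets for higher-order transfers by exhaustive enumeration quickly becomes infeasible, and beam search is the natural shortcut. The additive scoring below is a made-up stand-in for an actual transfer evaluation, and the task names are illustrative.

```python
def beam_search_sources(scores, max_order=3, beam_width=2):
    """Pick promising source-task subsets without enumerating the powerset.

    scores: per-source affinity for a fixed target task (toy stand-in for
    a real transfer evaluation). Each round extends the beam_width best
    subsets by one task and re-scores. beam_width=1 reduces to greedy
    search; an unbounded beam_width enumerates every subset of the given
    order, i.e. breadth-first search over the powerset.
    """
    beam = [(frozenset(), 0.0)]
    for _ in range(max_order):
        candidates = {}
        for subset, _ in beam:
            for task in scores:
                if task not in subset:
                    new = frozenset(subset | {task})
                    # Toy score: sum of single-source affinities.
                    candidates[new] = sum(scores[t] for t in new)
        beam = sorted(candidates.items(), key=lambda kv: -kv[1])[:beam_width]
    return beam

affinities = {"depth": 0.9, "reshading": 0.8, "edges": 0.4, "autoenc": 0.2}
print(beam_search_sources(affinities, max_order=2, beam_width=2))
```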