Gary Marcus (2018), "Deep Learning: A Critical Appraisal"
- DL needs large amounts of data to generalize well, and the test distribution must be similar to the training distribution (allowing interpolation)
- Generalization = interpolation (between known examples) + extrapolation (beyond known examples)
- 10 challenges for DL:
- DL is data-hungry: it lacks a mechanism for learning abstractions through explicit or implicit definition, so it needs lots of examples instead
- DL has trouble learning robust, abstract concepts and transferring them to different scenarios
- DL has no natural way of dealing with hierarchical structure, making generalization hard for tasks where hierarchy matters (e.g. syntax in language)
- What about CNNs? They do build hierarchies of features, though Marcus arguably means hierarchical structure in the input (e.g. syntax trees), not hierarchical feature representations
- DL struggles with open-ended inference (inference beyond what is explicit in a text: combining multiple sentences, or sentences plus background knowledge)
- DL is a black box: a problem for debuggability and for understanding how decisions were made (e.g. in the medical field)
- DL is self-contained and isolated from other useful knowledge (e.g. prior or common-sense knowledge), which is hard to incorporate
- DL uses feature correlations instead of abstractions
- DL cannot inherently distinguish correlation from causation
- DL presumes a largely stable world
- DL can be easily fooled: quite good across a large fraction of a domain, but easily spoofed at specific points (adversarial examples: tiny, targeted input perturbations that flip predictions)
- DL systems are difficult to expand while guaranteeing performance --> hard to engineer with!
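The "easily fooled" point can be sketched with a toy, hand-rolled example (my own, not from the paper). For a linear classifier, a fast-gradient-sign-style attack reduces to stepping each input coordinate against `sign(w)`; the epsilon here is large only because the toy is 2D, whereas in high-dimensional inputs a per-coordinate perturbation too small to notice can move the logit a long way:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: two Gaussian blobs in 2D, one per class.
X0 = rng.normal(-2.0, 1.0, (200, 2))
X1 = rng.normal(2.0, 1.0, (200, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

# Train a logistic-regression "network" by gradient descent.
w = np.zeros(2)
b = 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    g = p - y                      # gradient of log-loss wrt the logits
    w -= 0.1 * X.T @ g / len(X)
    b -= 0.1 * g.mean()

def prob_class1(x):
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

x = np.array([2.0, 2.0])           # confidently classified as class 1
# Gradient-sign perturbation: for a linear model the gradient of the
# class-1 logit wrt the input is just w, so step against sign(w).
eps = 3.0                          # large because the toy is only 2D
x_adv = x - eps * np.sign(w)
```

After the perturbation, `prob_class1(x_adv)` drops below 0.5 even though the clean point was classified with high confidence.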
- So: DL is much better at interpolation than at extrapolation
- See DL as a tool in a toolbox
- Better/more complete techniques:
- Unsupervised learning --> children also learn largely without labels!
- Symbolic AI + DL: incorporate knowledge, allowing for inference and abstraction
- Look at human cognition
- Go beyond the Turing test: comprehension, scientific reasoning, IKEA-set-building, game-playing as challenges