Description
Current challenges in using Neural Operators include irregular meshes, multiple inputs, multiple inputs on different meshes, and multi-scale problems. [1] The attention mechanism is promising in this regard, as it can contextualize these different inputs even for differing or irregular input locations. However, common implementations of the attention mechanism possess an overall complexity of O(n²d), i.e. quadratic in the sequence length n (with feature dimension d). [3] This becomes limiting when applying such networks to very large datasets, as is the case when learning the solution operator of partial differential equations. [2] Therefore, multiple papers propose a linear attention mechanism to tackle this issue (a small illustrative sketch of the idea follows the list below):
Transformer for partial differential equations' operator learning [2]: linear attention.
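To make the complexity argument concrete, the sketch below (PyTorch is assumed as the framework) contrasts standard softmax attention, which materializes an n×n score matrix and therefore costs O(n²d) time and O(n²) memory, with a kernelized linear attention that reorders the product as φ(Q)(φ(K)ᵀV) and scales as O(nd²) time and O(nd) memory. The ELU+1 feature map is a common illustrative choice, not the specific mechanism proposed in [1] or [2].

```python
import torch
import torch.nn.functional as F


def softmax_attention(q, k, v):
    # Standard scaled dot-product attention: materializes an (n, n) score
    # matrix, hence O(n^2 d) time and O(n^2) memory in the sequence length n.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5  # (n, n)
    return torch.softmax(scores, dim=-1) @ v               # (n, d)


def linear_attention(q, k, v, eps=1e-6):
    # Kernelized linear attention: apply a positive feature map to q and k,
    # then associate the product as phi(q) @ (phi(k)^T v). This costs
    # O(n d^2) time and O(n d) memory -- no (n, n) matrix is ever formed.
    # The ELU+1 feature map is an illustrative assumption, not the specific
    # mechanism of [1] or [2].
    phi_q = F.elu(q) + 1                                    # (n, d)
    phi_k = F.elu(k) + 1                                    # (n, d)
    kv = phi_k.transpose(-2, -1) @ v                        # (d, d)
    norm = phi_q @ phi_k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps  # (n, 1)
    return (phi_q @ kv) / norm                              # (n, d)


# Toy check on a single head: both variants map (n, d) inputs to (n, d) outputs.
n, d = 4096, 64
q, k, v = (torch.randn(n, d) for _ in range(3))
print(softmax_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```

Because only d×d and n×d intermediates are formed, memory grows linearly with the number of mesh points, which is what makes this family of mechanisms attractive for large PDE discretizations.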
Proposed Solution
Researching different proposed linear attention models: As there are many different implementations ([1], [2], and more) as well as related research in the field of NLP [4], a broader look into the proposed methods is beneficial.
Implementing the most promising candidates for linear attention: Compile a shortlist of promising candidates and implement the best of them.
Example dataset and benchmark: Evaluate the resulting operator on an established benchmark suited to this kind of problem.
Expected Benefits
Complexity: Thanks to its linear time and memory complexity in the sequence length, the proposed attention mechanism can be used to build different transformer architectures that learn operators quickly. The ability to weight different input functions and to adapt to irregular meshes is also attractive.
Scalability: Transformer models can be trained at very large scale, and linear attention significantly lowers the cost of doing so.
Implementation Steps
Implement Linear Attention.
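As a concrete starting point for this step, a minimal single-head linear attention layer could look like the sketch below. The projections, the ELU+1 feature map, and the cross-attention interface are illustrative assumptions; they would be replaced by whichever candidate mechanism the survey step selects (e.g. the attention variants proposed in [1] and [2]).

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearAttention(nn.Module):
    """Single-head linear attention layer (illustrative sketch).

    Queries come from ``x`` (batch, n, dim) and keys/values from ``context``
    (batch, m, dim), so cross-attention between functions sampled on
    different meshes is possible; with ``context=None`` it reduces to
    self-attention.
    """

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)
        self.eps = eps

    def forward(self, x: torch.Tensor, context: Optional[torch.Tensor] = None) -> torch.Tensor:
        context = x if context is None else context
        q = F.elu(self.q_proj(x)) + 1        # (b, n, d), positive feature map
        k = F.elu(self.k_proj(context)) + 1  # (b, m, d)
        v = self.v_proj(context)             # (b, m, d)

        # Associate the product as q @ (k^T v): O((n + m) d^2) instead of O(n m d).
        kv = torch.einsum("bmd,bme->bde", k, v)        # (b, d, d)
        out = torch.einsum("bnd,bde->bne", q, kv)      # (b, n, d)
        norm = torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + self.eps  # (b, n)
        return self.out_proj(out / norm.unsqueeze(-1))


# Usage: attend from 2048 query locations to an input function sampled at 1024 points.
layer = LinearAttention(dim=64)
queries = torch.randn(2, 2048, 64)
inputs = torch.randn(2, 1024, 64)
print(layer(queries, inputs).shape)  # torch.Size([2, 2048, 64])
```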
Open Questions
Which linear attention implementations are interesting to us?
How should this model be tested?
What considerations arise when applying this model to physics-constrained problems?
What are interesting benchmarks for this problem?
Literature
[1] Hao, Z. et al. GNOT: A general neural operator transformer for operator learning. In International Conference on Machine Learning, 12556–12569 (PMLR, 2023).
[2] Li, Z., Meidani, K. & Farimani, A. B. Transformer for partial differential equations' operator learning. arXiv preprint arXiv:2205.13671 (2022).
[3] Vaswani, A. et al. Attention is all you need. Advances in Neural Information Processing Systems 30 (2017).
[4] Wang, Y. & Xiao, Z. LoMA: Lossless Compressed Memory Attention. (2024).