dm.cs.tu-dortmund.de/mlbits/neural-nlp-beyond-transformers/
Beyond Transformers – Lecture Notes
language learning, CoNLL (2020), 455–475.
[DCSD22]
Dao, T., Chen, B., Sohoni, N.S., Desai, A.D., Poli, M., Grogan, J., Liu, A., Rao, A., Rudra, A. and Ré, C. 2022. Monarch: Expressive structured matrices for [...] Wattenberg, M., Viégas, F.B., Coenen, A., Pearce, A. and Kim, B. 2019. Visualizing and measuring the geometry of BERT. Neural information processing systems, NeurIPS 2019 (2019), 8592–8600.
[Wang21]
Wang [...] accurate training. Int. Conf. Machine learning, ICML (2022), 4690–4721.
[DGER19]
Dao, T., Gu, A., Eichhorn, M., Rudra, A. and Ré, C. 2019. Learning fast algorithms for linear transforms using butterfly factorizations …