#34 Scalable Second-order Optimizer for Language Model Pre-training
www.machinelearningatscale.com
Table of contents
Introduction
Scalable Stochastic Second-order Optimizer for Language Model Pre-Training
Closing thoughts

Introduction

I recently came across [1], a new second-order optimizer tailored for large language models. I find innovations in this well-explored space incredibly interesting, so let's dive right in!