Apple ML Research
ParaRNN: Large-Scale Nonlinear RNNs, Trainable in Parallel
Recurrent Neural Networks (RNNs) are naturally suited to efficient inference, requiring far less memory and compute than attention-based architectures. The sequential nature of their computation, however, has historically made it impractical to scale RNNs up to billions of parameters. A new advance from Apple researchers makes RNN training dramatically more efficient, enabling large-scale training for the first time and…
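To see where the sequential bottleneck comes from, here is a minimal sketch of a generic nonlinear RNN forward pass (an illustrative toy model, not the architecture or method the article describes): each hidden state depends on the previous one through a nonlinearity, so the time loop cannot be trivially parallelized the way attention layers can.

```python
import numpy as np

def rnn_forward(x, W_h, W_x, b):
    """Sequential forward pass of a toy nonlinear RNN.

    Because h[t] depends on h[t-1] through tanh, each iteration
    must wait for the previous one: this strict time-step
    dependency is the training bottleneck the article refers to.
    """
    T = x.shape[0]
    h = np.zeros(W_h.shape[0])
    hidden_states = []
    for t in range(T):  # strictly sequential over time steps
        h = np.tanh(W_h @ h + W_x @ x[t] + b)  # nonlinear recurrence
        hidden_states.append(h)
    return np.stack(hidden_states)

# Tiny example: 8 time steps, 4-dim inputs, 3-dim hidden state.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4))
W_h = 0.1 * rng.standard_normal((3, 3))
W_x = 0.1 * rng.standard_normal((3, 4))
b = np.zeros(3)
H = rnn_forward(x, W_h, W_x, b)
print(H.shape)  # (8, 3)
```

A linear recurrence could be rewritten as a parallel scan, but the `tanh` makes each step genuinely dependent on the fully computed previous state, which is why parallelizing nonlinear RNN training is hard.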