Main Concept

An RNN (Recurrent Neural Network) is a neural network designed to process sequential data — information where order and context matter. It has feedback loops that allow it to “remember” previous inputs while processing new ones.

How It Works

RNNs process data one element at a time, maintaining a “memory” (hidden state) of what they’ve seen before. This allows them to understand context and dependencies in sequences.
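The step-by-step update described above can be sketched in a few lines. This is a toy, hand-weighted example (a real RNN learns its weights by backpropagation); all weight values and function names here are illustrative, not a real library API.

```python
import math

def rnn_step(x, h_prev, w_x, w_h, b):
    # One recurrent update: the new hidden state mixes the current input
    # with the previous hidden state ("memory"), squashed by tanh.
    return [math.tanh(w_x[i] * x + w_h[i] * h_prev[i] + b[i])
            for i in range(len(h_prev))]

def run_rnn(sequence, hidden_size=3):
    # Toy fixed weights, purely for illustration.
    w_x = [0.5, -0.3, 0.8]
    w_h = [0.9, 0.9, 0.9]
    b = [0.0, 0.1, -0.1]
    h = [0.0] * hidden_size          # memory starts empty
    for x in sequence:               # process one element at a time
        h = rnn_step(x, h, w_x, w_h, b)
    return h                         # final state summarizes the sequence

final_state = run_rnn([1.0, 0.5, -0.2])
print(final_state)
```

Because each step feeds the previous hidden state back in, reordering the input sequence changes the final state — that is exactly the order-sensitivity that makes RNNs suited to sequential data.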

Common Uses

  • Time-series prediction — stock prices, weather forecasting
  • Speech recognition — understanding audio sequences
  • Text prediction — next word prediction, language modeling
  • Video analysis — understanding frame sequences

RNN Variants

  • LSTM (Long Short-Term Memory) — adds input, forget, and output gates so the network can retain information over long sequences and mitigate vanishing gradients
  • GRU (Gated Recurrent Unit) — a simplified LSTM with fewer gates (update and reset), cheaper to compute with similar performance
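To make the gating idea concrete, here is a scalar GRU step (real cells use vectors and weight matrices). The weight names and values are illustrative assumptions, not a real API:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h_prev, w):
    # w is a dict of toy scalar weights; names are illustrative only.
    z = sigmoid(w["z_x"] * x + w["z_h"] * h_prev)   # update gate
    r = sigmoid(w["r_x"] * x + w["r_h"] * h_prev)   # reset gate
    h_cand = math.tanh(w["c_x"] * x + w["c_h"] * (r * h_prev))
    # The update gate decides how much old memory to keep (1 - z)
    # versus how much of the new candidate state to write (z).
    return (1 - z) * h_prev + z * h_cand

weights = {"z_x": 0.5, "z_h": 0.5, "r_x": 0.5, "r_h": 0.5,
           "c_x": 1.0, "c_h": 1.0}
h = 0.0
for x in [1.0, -0.5, 0.3]:
    h = gru_step(x, h, weights)
print(h)
```

The gates are what give LSTMs and GRUs better long-term memory than a plain RNN: the cell can learn to keep old state nearly untouched instead of overwriting it at every step.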

Important: Transformers vs. RNNs

  • RNNs process sequentially — slow, especially for long sequences
  • Transformers process in parallel — much faster, better for modern LLMs

This is why modern LLMs use Transformers, not RNNs.
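The structural difference can be seen in a minimal sketch. This is a deliberately simplified contrast, not a real Transformer: the "attention" below is a toy scalar version, and all names are illustrative.

```python
import math

# Sequential (RNN-style): step t needs the hidden state from step t-1,
# so the loop is inherently serial and cannot be parallelized over time.
def rnn_style(xs, decay=0.5):
    h, out = 0.0, []
    for x in xs:
        h = decay * h + x
        out.append(h)
    return out

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Parallel (Transformer-style): each position's output is a weighted mix
# of ALL inputs at once; no position waits on another, so every output
# can be computed independently (and in parallel on a GPU).
def self_attention(xs):
    return [sum(w * xj for w, xj in zip(softmax([xi * xj for xj in xs]), xs))
            for xi in xs]

print(rnn_style([1.0, 2.0, 3.0]))       # each value depends on the one before
print(self_attention([1.0, 2.0, 3.0]))  # each value computed independently
```

The serial dependency in the loop is exactly what makes RNN training slow on long sequences, and its absence is why Transformer training parallelizes so well.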

AIF-C01 Context

RNNs are mentioned as a predecessor to Transformers. Know that RNNs are good for sequential data but slow to train compared to Transformers. The exam may ask why Transformers replaced RNNs for language modeling.