Main Concept
LLMs generate text through a probabilistic process, making them inherently non-deterministic. For each position in the output, the model produces a probability distribution over possible next tokens, conditioned on the input context.
Example: For the sentence “After the rain the streets were…”, an LLM might assign probabilities such as:
- “wet” (0.4)
- “flooded” (0.25)
- “slippery” (0.15)
- “empty” (0.10)
- “muddy” (0.05)
- “clean” (0.03)
- “blocked” (0.02)
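A minimal sketch of how sampling from this distribution works, using the example probabilities above (the values are illustrative, not from any real model):

```python
import random
from collections import Counter

# Next-token candidates and probabilities from the example above
candidates = ["wet", "flooded", "slippery", "empty", "muddy", "clean", "blocked"]
probs = [0.40, 0.25, 0.15, 0.10, 0.05, 0.03, 0.02]

# Sampling picks tokens in proportion to their probability:
# "wet" is the most likely choice, but any candidate can be selected.
random.seed(0)  # fixed seed only to make this sketch reproducible
samples = random.choices(candidates, weights=probs, k=1000)
print(Counter(samples).most_common(3))
```

Run repeatedly without the fixed seed, the individual draws differ from run to run, which is exactly the non-determinism described above.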
Key characteristics:
- Randomness in selection: The model samples from the probability distribution, so it may choose “wet” one time and “flooded” another time, even with identical input
- Context sensitivity: Probabilities are calculated based on the entire input context. For instance, if the input mentioned a strike, the probability of “blocked” would increase
- Iterative process: This probabilistic selection happens for every token generated throughout the entire output
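The iterative, context-sensitive process above can be sketched as a generation loop. The `next_token_distribution` function here is a hypothetical stand-in: a real LLM computes this distribution with a neural network over its whole vocabulary, conditioned on the entire context.

```python
import random

def next_token_distribution(context):
    # Hypothetical stand-in for a real model: returns (token, probability)
    # pairs conditioned on the context string.
    if context.endswith("streets were"):
        return [("wet", 0.40), ("flooded", 0.25), ("slippery", 0.15),
                ("empty", 0.10), ("muddy", 0.05), ("clean", 0.03),
                ("blocked", 0.02)]
    return [(".", 1.0)]  # toy fallback: end the sentence

def generate(prompt, max_tokens=5):
    context = prompt
    for _ in range(max_tokens):
        dist = next_token_distribution(context)  # recomputed every step
        tokens, weights = zip(*dist)
        token = random.choices(tokens, weights=weights, k=1)[0]  # sample
        context += " " + token
        if token == ".":
            break
    return context

print(generate("After the rain the streets were"))
```

Because a fresh sample is drawn at every step, an early choice (e.g. “flooded” instead of “wet”) changes the context for all later steps, so whole outputs can diverge from a single differing token.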
This explains why running the same prompt multiple times can produce different responses, even with identical settings.