From Prediction to Generation: the 2017→2022 Pivot
- Bill Faruki
- 5 days ago
- 3 min read
For a decade, most practical AI was about classification and prediction: label the image, flag the transaction, forecast the demand. That work still matters—but in 2017 the center of gravity began to move.
2017: A new core architecture.
The Transformer arrived and changed how we model sequences. By replacing recurrence with attention, it unlocked parallelism and scale—and quickly beat prior systems on translation. This wasn’t just an accuracy bump; it was a blueprint for models that learn flexible representations of language and beyond.
2018–2020: Pretraining + scaling.
Bidirectional pretraining (BERT) showed how a single model could be adapted to many tasks with minimal changes. At the same time, “just scale it” turned from a hunch into an empirical rule: as you increase parameters, data, and compute, loss falls as a predictable power law. GPT-3 (175B) then demonstrated few-shot behavior—the ability to generalize from just a handful of examples at inference time.
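As a rough illustration of what a scaling law looks like in practice, the sketch below fits the commonly cited power-law form L(N) = (Nc / N)^alpha to a handful of loss measurements. The parameter counts, losses, and fitted constants are invented for illustration, not taken from any published run.

```python
import numpy as np

# Hypothetical (parameter count, validation loss) pairs -- illustrative only.
params = np.array([1e7, 1e8, 1e9, 1e10])
loss = np.array([5.2, 4.1, 3.3, 2.6])

# The power-law form L(N) = (Nc / N)**alpha is linear in log-log space:
# log L = alpha * log Nc - alpha * log N
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
alpha = -slope                    # slope is negative; report the positive exponent
Nc = np.exp(intercept / alpha)    # implied scale constant

# The practical payoff: extrapolate the fit to a model you have not trained yet.
predicted = (Nc / 1e11) ** alpha
print(f"alpha = {alpha:.3f}, predicted loss at 1e11 params = {predicted:.2f}")
```

The point is not the exact numbers but the shape: a straight line in log-log space that lets teams budget parameters, data, and compute before committing to a training run.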
2022: Reasoning prompts and alignment.
Research showed that prompting models to write their chain of thought could elicit stronger multi-step reasoning, while RLHF (InstructGPT) aligned models to follow human instructions more reliably. These techniques didn’t “solve” hallucinations, but they materially reduced them and made models far more useful.
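To make the prompting idea concrete, here is a minimal sketch of the difference between a direct prompt and a chain-of-thought prompt. The questions and the worked exemplar are invented for illustration, and the resulting string would be sent to whichever LLM API you happen to use.

```python
question = (
    "A cafe bakes 7 trays of 12 muffins and 15 muffins go unsold. "
    "How many muffins were sold?"
)

# Direct prompt: ask for the answer outright.
direct_prompt = f"Q: {question}\nA:"

# Chain-of-thought prompt: one worked exemplar whose answer spells out the
# intermediate steps, nudging the model to reason before committing to an answer.
cot_prompt = (
    "Q: A shop has 4 boxes of 9 pens and sells 11 pens. How many pens remain?\n"
    "A: 4 boxes * 9 pens = 36 pens. 36 - 11 = 25 pens remain. The answer is 25.\n\n"
    f"Q: {question}\nA:"
)

print(cot_prompt)  # send this string to your model of choice
```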
November 2022: The interface moment.
GPTs existed before 2022, but ChatGPT’s public launch on Nov 30, 2022 put a conversational interface in everyone’s hands—and that changed society’s relationship to AI overnight. The mobile story followed: iOS app in May 2023 and Android app that summer, extending access from browsers to pockets.
2023–2024: Multimodality, tools, retrieval.
With GPT-4, models became multimodal (read images, output text) and reached strong performance on many professional benchmarks. GPT-4o pushed into real-time audio-vision-text, turning LLMs into truly interactive systems. Meanwhile, RAG matured as the standard way to ground answers in fresh or proprietary knowledge—pairing a retriever with a generator to reduce hallucinations and keep models up to date. On the creative front, text-to-image/video systems (e.g., Stable Diffusion; Sora) made high-quality media generation mainstream.
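Below is a minimal sketch of the retrieve-then-generate pattern behind RAG, assuming a tiny in-memory document list and a crude word-overlap retriever. Real systems swap in an embedding model, a vector database, and an actual LLM call; every name here is a placeholder.

```python
import math
from collections import Counter

# Tiny in-memory "knowledge base" -- a real system would use a vector store.
docs = [
    "The support portal resets passwords under Settings > Security.",
    "Refunds are processed within 5 business days of approval.",
    "The API rate limit is 100 requests per minute per key.",
]

def score(query: str, doc: str) -> float:
    """Crude lexical-overlap score; real RAG uses embedding similarity."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values()) / math.sqrt(len(doc.split()) + 1)

def retrieve(query: str, k: int = 2) -> list[str]:
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    # In a real pipeline this prompt goes to the generator (the LLM).
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How fast are refunds processed?"))
```

Grounding the generator in retrieved text is what keeps answers tied to fresh or proprietary knowledge instead of whatever the model memorized at training time.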
What actually changed?
From static labels to dynamic synthesis. Instead of only telling us what is, models now compose—drafting text, code, images, audio, and video that meet a goal or style.
From offline models to connected agents. Retrieval, tools, and structured prompting let models browse, call APIs, and work with user data (with guardrails), closing the loop between reasoning and action; a minimal sketch of that loop follows this list.
From experts-only to everyone. The leap wasn’t just algorithmic; it was UX. A chat box—and later native apps—democratized access.
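Here is the promised sketch of the reason-act loop, with a stubbed-out model and a single made-up tool. Every name in it (get_weather, fake_model, the message format) is hypothetical; real agents use an actual LLM's tool-calling interface and real APIs.

```python
# Hypothetical tool -- the name, signature, and message format are made up.
def get_weather(city: str) -> str:
    return f"Sunny and 21 C in {city}"   # stub; a real tool would call a weather API

TOOLS = {"get_weather": get_weather}

def fake_model(messages: list[dict]) -> dict:
    """Stand-in for an LLM that decides whether to call a tool or answer."""
    last = messages[-1]
    if last["role"] == "user" and "weather" in last["content"].lower():
        return {"tool": "get_weather", "args": {"city": "Lisbon"}}
    return {"content": f"Final answer, grounded in: {last['content']}"}

def run_agent(user_msg: str) -> str:
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(5):                    # cap iterations as a simple guardrail
        reply = fake_model(messages)
        if "tool" not in reply:
            return reply["content"]       # the model chose to answer directly
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    return "Stopped: too many tool calls."

print(run_agent("What's the weather in Lisbon?"))
```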
A careful claim about “the end of prediction.”
Classification and forecasting aren’t “over”—they’re absorbed. Generative systems contain classifiers (and sometimes train with them), and many production pipelines still rely on classic predictive components. But the frontier has moved: the most valuable new software behaviors are generative (create, plan, simulate, explain) rather than merely discriminative (sort, label, rank).
Why this keeps accelerating.
Scaling laws still offer a guiding map for improvements, even as the community supplements scaling with better data, evaluation, and efficiency.
Prompting & alignment research continues to improve factuality, safety, and usefulness—while acknowledging limitations remain.
Multimodality expands the problem space from text to the full bandwidth of human communication.
Bottom line.
2017 gave us the architectural key; 2018–2020 validated pretraining and scaling; late 2022 gave the world a handle; 2023–2025 have been about capability, grounding, and modality. We’re not merely classifying reality anymore—we’re constructing with it. That’s the shift you’re pointing to: from the Age of Prediction to the Age of Generation.