Most teams still talk about GenAI like it’s one ingredient: “Which model should we use?”

But real-world AI products are systems. They’re made of components that interact. And once you start thinking in components, a simple idea becomes powerful:

A periodic table isn’t just for memorizing elements. It’s for predicting reactions.

Recently, Martin Keen from IBM Technology popularized a “periodic table” style sketch for AI systems. I like it because it gives people a shared language: prompts, embeddings, vector DBs, RAG, guardrails, agents, fine-tuning, small models, and more — organized in a way that hints at what combinations are stable, useful, or risky.

In this post I’ll do three things:

Explain the table in plain English (row by row, family by family)
Fill the missing blocks in a way that matches the logic of the table
Give 10 canonical “reactions” (reference architectures) you can reuse — each one explicitly using MCP servers to connect tools and data

Why a “Periodic Table” for AI?

Because most product debates are framed incorrectly.

People argue:

“RAG vs fine-tuning”
“Agents are hype”
“We just need a bigger model”
“Guardrails are enough”

Those arguments miss the point: systems are recipes. The right question is:

What components do we need to combine — under our constraints — to produce a reliable behavior?

A periodic table helps you:

reason about architectures like chemistry (components + interactions)
identify missing parts (what’s needed for reliability)
communicate design choices quickly (a shared vocabulary)

The Table at a Glance

The table is organized along two axes:

Rows = maturity / capability layers

Row 1: Primitive — raw building blocks
Row 2: Composition — you can assemble useful apps
Row 3: Deployment — production-grade, scalable, safer
Row 4: Emerging — fast-moving, not fully standardized yet

Columns (Families) = what role the component plays

Group 1: Reactive — interaction & action (prompts, function calling, agents)
Group 2: Retrieval — memory outside the model (embeddings, vector DBs, synthetic data)
Group 3: Orchestration — how components are coordinated (RAG, frameworks, protocols)
Group 4: Validation — safety, correctness, testing (guardrails, red teaming, interpretability)
Group 5: Models — the engines (LLMs, multimodal, small models, thinking models)

The Filled Periodic Table (Elements)

Below is a clean, “complete” version. If you’re publishing this on Medium, place your table image right here.

Row 1 — Primitive

Pr (Prompts) — the simplest interface: language in, behavior out
Em (Embeddings) — numerical meaning representations
Ch (Prompt Chaining / Templates) — multi-step prompts, reusable templates, decomposition without tools
Sc (Schemas / Output Constraints) — structured outputs, format constraints, JSON schema, validation rules
Lg (LLM) — the general-purpose reasoning/generation engine

Row 2 — Composition

Fc (Function Calling) — the model triggers tools/APIs
Vx (Vector Databases) — store/query embeddings at scale
Rg (RAG) — retrieve context and ground generation
Gr (Guardrails) — runtime checks and safety filters
Mm (Multimodal Models) — text + images + audio/video inputs

Row 3 — Deployment

Ag (Agent) — plan → act → observe loop
Ft (Fine-tuning) — bake specialization into weights
Fw (Frameworks) — glue code + orchestration (LangChain/LangGraph/AutoGen patterns)
Rt (Red Teaming) — adversarial testing: jailbreaks, prompt injection, exfiltration simulation
Sm (Small Models) — fast, cheap, deployable (edge/on-device)

Row 4 — Emerging

Ma (Multi-agent) — multiple agents collaborating
Sy (Synthetic Data) — generate training/eval data using AI
MCP (MCP Servers / Tool Protocol Layer) — standardized tool/data access for agents and apps
In (Interpretability) — explain why models do what they do
Th (Thinking Models) — explicit reasoning loops, test-time compute scaling

Why MCP belongs here: it’s becoming the “ports and adapters” layer for AI apps — standardizing how models talk to tools and context. It’s still evolving rapidly, so it fits best as an emerging orchestration element.

What This Table Gets Exactly Right

1) Most AI products are “Row 2 systems”

If you’ve shipped a chatbot that feels useful, you almost certainly used Row2:

RAG (Rg) for grounding
Vector DB (Vx) for memory
Guardrails (Gr) for safety
Function calls (Fc) for real-time actions

Row 2 is where prototypes become products.

2) Autonomy grows along Group 1 (Reactive)

There’s a visible progression:

Pr → Fc → Ag → Ma
That’s control → action → autonomy.

3) Reliability grows along Group 4 (Validation)

Most failures in production are not “the model was dumb” but:

wrong retrieval
prompt injection
data leakage
tool misuse
brittle formatting

Validation elements are the difference between a demo and something you can trust.

How to “Predict Reactions”

When you combine elements, you get predictable product behaviors.

A few examples:

Rg + Vx + Em + Lg → grounded Q&A over documents
Ag + Fc + MCP → tool-using assistant that can actually do things
Gr + Rt → safer deployment and fewer surprises
Sm + Vx → fast, cheap assistants with external memory
Sy + In → faster iteration and better understanding of failure modes

Now let’s make it concrete.

10 Canonical Reactions (Reference Architectures) — All Using MCP Servers

Each reaction below is written like a reusable recipe:

Goal
Core elements
How it flows
Where it breaks
What to add when scaling

1) Secure Enterprise Documentation Chatbot (Production RAG)

Goal: answer questions from internal docs without hallucinating or leaking sensitive content.

Elements: Pr + Em + Vx + Rg + Lg + Gr + Rt + MCP

Flow:

Ingest docs → chunk → Em → store in Vx
User question (Pr) → embed → retrieve (Rg)
Compose context → answer with Lg
Apply Gr (PII/policy checks, citation requirements)
Tools and doc stores accessed via MCP servers
Validate using Rt (prompt injection + exfil attempts)

Common failure: retrieval returns plausible but wrong chunks
Scale add-ons: eval sets, retrieval metrics, access controls by role

2) Customer Support Triage Agent (Tickets → Actions)

Goal: classify tickets, draft replies, escalate, create tasks automatically.

Elements: Pr + Lg + Fc + Ag + Gr + MCP

Flow:

Ticket text arrives → agent (Ag) plans steps
Uses Fc via MCP to query CRM, order status, knowledge base
Draft response with Lg
Gr checks tone, policy, and PII rules
Optional human-in-the-loop approval

Common failure: tool misuse (wrong customer record)
Scale add-ons: strict tool schemas (Sc), audit logs, sampling-based QA

3) Agentic Flight Booking Loop (Classic “Think–Act–Observe”)

Goal: book a flight under constraints (budget, dates, preferences).

Elements: Pr + Ag + Fc + Gr + MCP (+ Rt recommended)

Flow:

User gives constraints (Pr)
Agent decomposes: search → compare → confirm → book
Each tool call happens via MCP servers (flight API, calendar, payment sandbox)
Gr enforces “ask before purchase” and safe payment handling
Red-team (Rt) to test injection (“ignore budget, buy premium”) and data exposure

Common failure: agent loops or over-optimizes irrelevant criteria
Scale add-ons: planner constraints, budget ceilings hard-coded in tools

4) Codebase Copilot (Repo-Aware Coding Assistant)

Goal: answer questions about your codebase and generate diffs responsibly.

Elements: Em + Vx + Rg + Lg + Sc + Gr + MCP (+ Ft optional)

Flow:

Index repo docs + code → embeddings (Em) → Vx
Ask “where is X implemented?” → Rg retrieves relevant files
Model answers + proposes patch
Sc forces structured output (diff format, file paths)
Tools (repo search, CI results, ticket context) via MCP servers
Gr blocks secrets in output and risky commands

Common failure: confident wrong refactors
Scale add-ons: unit-test execution tool, static analysis, gated merges

5) Multimodal Document Q&A (PDFs with Tables + Images)

Goal: answer questions over scanned PDFs, diagrams, and tables.

Elements: Mm + Em + Vx + Rg + Lg + Gr + MCP

Flow:

Extract text + images; multimodal embeddings (Mm/Em)
Store in Vx with page-level metadata
Rg retrieves relevant passages + visuals
Mm/Lg generates grounded answer
File store + OCR/vision tools accessed via MCP servers
Gr enforces citations and uncertainty when confidence is low

Common failure: table misreads or citation mismatch
Scale add-ons: page screenshot citation, structured table extraction

6) Analytics Insight Bot (Natural Language → SQL → Explanation)

Goal: business user asks questions; system runs SQL and explains results.

Elements: Pr + Ag + Fc + Sc + Gr + MCP (+ Rt recommended)

Flow:

User asks: “Why did churn spike last month?”
Agent plans queries and checks assumptions
Uses Fc via MCP to run SQL on warehouse + fetch semantic layer metrics
Sc forces output: SQL + result summary + caveats
Gr prevents unsafe queries and enforces row limits / privacy rules

Common failure: wrong joins, misleading causal language
Scale add-ons: metric definitions store, query linting, counterfactual checks

7) SOC / Security Assistant (Alert Triage + Playbooks)

Goal: reduce analyst load by summarizing alerts and suggesting next steps safely.

Elements: Rg + Vx + Lg + Ag + Fc + Rt + Gr + MCP

Flow:

Retrieve prior incidents/playbooks (Rg)
Agent suggests actions (isolate host, query logs)
Executes via MCP servers (SIEM, EDR, ticketing)
Gr ensures safe actions (no destructive steps without approval)
Rt stress-tests prompt injection from malicious logs

Common failure: acting on attacker-controlled text
Scale add-ons: tool allow-lists, human approval gates, sandbox execution

8) On-Device Personal Assistant (Cheap + Private)

Goal: fast assistant with privacy; uses external memory without cloud dependence.

Elements: Sm + Em + Vx + Rg + Gr + MCP

Flow:

Small model (Sm) handles interaction locally
Personal notes indexed with Em/Vx
Rg retrieves relevant context for responses
Device tools (calendar, reminders) via MCP servers (local adapters)
Gr enforces privacy rules and prevents data export

Common failure: limited reasoning depth
Scale add-ons: hybrid routing to larger model for complex tasks (with consent)

9) Synthetic Data Factory for Training + Evaluation

Goal: generate diverse training/eval sets for hard edge cases.

Elements: Sy + Lg/Mm + In + Gr + MCP

Flow:

Define failure modes (e.g., ambiguous queries, OCR noise)
Generate synthetic examples (Sy) with constraints
Use In-style analysis to understand where models fail
Store datasets and pipelines via MCP servers
Guardrails (Gr) ensure no sensitive data leaks into synthetic sets

Common failure: synthetic data that’s too “clean”
Scale add-ons: adversarial generation, noise models, human review sampling

10) Production Reliability Loop (Eval Harness + Guardrails + Red Teaming)

Goal: continuously improve quality, safety, and robustness.

Elements: Gr + Rt + In + Ch + Sc + MCP (+ Th emerging)

Flow:

Define key behaviors and tests (prompt suites, injection tests)
Run regressions after prompt/model/retrieval changes
Use MCP servers to pull logs, traces, and evaluation datasets
Apply interpretability tools (In) to investigate failure clusters
Introduce systematic prompting (Ch) and structure (Sc) to reduce variance
Optionally use thinking models (Th) where deeper reasoning is worth the latency

Common failure: “we shipped a change and broke everything silently”
Scale add-ons: automated gates, canary deploys, quality dashboards

How to Choose the Right Reaction

A quick rule of thumb:

If you need grounding → start with Rg + Vx + Em
If you need actions → add Fc, then Ag, then MCP
If you need safety → add Gr, then validate with Rt
If you need speed/cost → consider Sm + retrieval instead of bigger models
If you need domain specialization → consider Ft, but keep retrieval anyway

What’s Still Missing (and Why That’s Good)

Like any periodic table, this one will evolve. In practice, teams also need “elements” for:

observability (traces, eval metrics)
governance (access control, PII workflows)
data engineering (ETL quality, lineage)

Those may become future blocks — or they might live as “compounds” that span families.

That’s the fun part: a periodic table is a living map of a field.

Closing Thought

If you take one idea from this post, take this:

Stop debating single ingredients. Start designing reactions.

Once your team can point at components and predict outcomes, your architecture discussions become clearer, faster, and far more practical.

Follow me on LinkedIn, explore my work on GitHub, and know more about me on my portfolio.

Watch the original video from IBM Technology by Martin Keen:

“AI Periodic Table Explained: Mapping LLMs, RAG & AI Agent Frameworks”

References: https://www.youtube.com/watch?v=ESBMgZHzfG0