Most teams still talk about GenAI like it’s one ingredient: “Which model should we use?”

But real-world AI products are systems. They’re made of components that interact. And once you start thinking in components, a simple idea becomes powerful:

A periodic table isn’t just for memorizing elements. It’s for predicting reactions.

Recently, Martin Keen from IBM Technology popularized a “periodic table” style sketch for AI systems. I like it because it gives people a shared language: prompts, embeddings, vector DBs, RAG, guardrails, agents, fine-tuning, small models, and more — organized in a way that hints at what combinations are stable, useful, or risky.

In this post I’ll do three things:

Why a “Periodic Table” for AI?

Because most product debates are framed incorrectly.

People argue:

Those arguments miss the point: systems are recipes. The right question is:

What components do we need to combine — under our constraints — to produce a reliable behavior?

A periodic table helps you:

The Table at a Glance

The table is organized along two axes:

Rows = maturity / capability layers

Columns (Families) = what role the component plays

The Filled Periodic Table (Elements)

Below is a clean, “complete” version. If you’re publishing this on Medium, place your table image right here.

AI Systems Periodic Table

Row 1 — Primitive

Pr (Prompts) — the simplest interface: language in, behavior out
Em (Embeddings) — numerical meaning representations
Ch (Prompt Chaining / Templates) — multi-step prompts, reusable templates, decomposition without tools
Sc (Schemas / Output Constraints) — structured outputs, format constraints, JSON schema, validation rules
Lg (LLM) — the general-purpose reasoning/generation engine

Row 2 — Composition

Fc (Function Calling) — the model triggers tools/APIs
Vx (Vector Databases) — store/query embeddings at scale
Rg (RAG) — retrieve context and ground generation
Gr (Guardrails) — runtime checks and safety filters
Mm (Multimodal Models) — text + images + audio/video inputs

Row 3 — Deployment

Ag (Agent) — plan → act → observe loop
Ft (Fine-tuning) — bake specialization into weights
Fw (Frameworks) — glue code + orchestration (LangChain/LangGraph/AutoGen patterns)
Rt (Red Teaming) — adversarial testing: jailbreaks, prompt injection, exfiltration simulation
Sm (Small Models) — fast, cheap, deployable (edge/on-device)

Row 4 — Emerging

Ma (Multi-agent) — multiple agents collaborating
Sy (Synthetic Data) — generate training/eval data using AI
MCP (MCP Servers / Tool Protocol Layer) — standardized tool/data access for agents and apps
In (Interpretability) — explain why models do what they do
Th (Thinking Models) — explicit reasoning loops, test-time compute scaling

Why MCP belongs here: it’s becoming the “ports and adapters” layer for AI apps — standardizing how models talk to tools and context. It’s still evolving rapidly, so it fits best as an emerging orchestration element.

What This Table Gets Exactly Right

1) Most AI products are “Row 2 systems”

If you’ve shipped a chatbot that feels useful, you almost certainly used Row2:

Row 2 is where prototypes become products.

2) Autonomy grows along Group 1 (Reactive)

There’s a visible progression:

3) Reliability grows along Group 4 (Validation)

Most failures in production are not “the model was dumb” but:

Validation elements are the difference between a demo and something you can trust.

How to “Predict Reactions”

When you combine elements, you get predictable product behaviors.

A few examples:

Now let’s make it concrete.

10 Canonical Reactions (Reference Architectures) — All Using MCP Servers

Each reaction below is written like a reusable recipe:

1) Secure Enterprise Documentation Chatbot (Production RAG)

Goal: answer questions from internal docs without hallucinating or leaking sensitive content.

Elements: Pr + Em + Vx + Rg + Lg + Gr + Rt + MCP

Flow:

  1. Ingest docs → chunk → Em → store in Vx
  2. User question (Pr) → embed → retrieve (Rg)
  3. Compose context → answer with Lg
  4. Apply Gr (PII/policy checks, citation requirements)
  5. Tools and doc stores accessed via MCP servers
  6. Validate using Rt (prompt injection + exfil attempts)

Common failure: retrieval returns plausible but wrong chunks
Scale add-ons: eval sets, retrieval metrics, access controls by role

2) Customer Support Triage Agent (Tickets → Actions)

Goal: classify tickets, draft replies, escalate, create tasks automatically.

Elements: Pr + Lg + Fc + Ag + Gr + MCP

Flow:

  1. Ticket text arrives → agent (Ag) plans steps
  2. Uses Fc via MCP to query CRM, order status, knowledge base
  3. Draft response with Lg
  4. Gr checks tone, policy, and PII rules
  5. Optional human-in-the-loop approval

Common failure: tool misuse (wrong customer record)
Scale add-ons: strict tool schemas (Sc), audit logs, sampling-based QA

3) Agentic Flight Booking Loop (Classic “Think–Act–Observe”)

Goal: book a flight under constraints (budget, dates, preferences).

Elements: Pr + Ag + Fc + Gr + MCP (+ Rt recommended)

Flow:

  1. User gives constraints (Pr)
  2. Agent decomposes: search → compare → confirm → book
  3. Each tool call happens via MCP servers (flight API, calendar, payment sandbox)
  4. Gr enforces “ask before purchase” and safe payment handling
  5. Red-team (Rt) to test injection (“ignore budget, buy premium”) and data exposure

Common failure: agent loops or over-optimizes irrelevant criteria
Scale add-ons: planner constraints, budget ceilings hard-coded in tools

4) Codebase Copilot (Repo-Aware Coding Assistant)

Goal: answer questions about your codebase and generate diffs responsibly.

Elements: Em + Vx + Rg + Lg + Sc + Gr + MCP (+ Ft optional)

Flow:

  1. Index repo docs + code → embeddings (Em) → Vx
  2. Ask “where is X implemented?” → Rg retrieves relevant files
  3. Model answers + proposes patch
  4. Sc forces structured output (diff format, file paths)
  5. Tools (repo search, CI results, ticket context) via MCP servers
  6. Gr blocks secrets in output and risky commands

Common failure: confident wrong refactors
Scale add-ons: unit-test execution tool, static analysis, gated merges

5) Multimodal Document Q&A (PDFs with Tables + Images)

Goal: answer questions over scanned PDFs, diagrams, and tables.

Elements: Mm + Em + Vx + Rg + Lg + Gr + MCP

Flow:

  1. Extract text + images; multimodal embeddings (Mm/Em)
  2. Store in Vx with page-level metadata
  3. Rg retrieves relevant passages + visuals
  4. Mm/Lg generates grounded answer
  5. File store + OCR/vision tools accessed via MCP servers
  6. Gr enforces citations and uncertainty when confidence is low

Common failure: table misreads or citation mismatch
Scale add-ons: page screenshot citation, structured table extraction

6) Analytics Insight Bot (Natural Language → SQL → Explanation)

Goal: business user asks questions; system runs SQL and explains results.

Elements: Pr + Ag + Fc + Sc + Gr + MCP (+ Rt recommended)

Flow:

  1. User asks: “Why did churn spike last month?”
  2. Agent plans queries and checks assumptions
  3. Uses Fc via MCP to run SQL on warehouse + fetch semantic layer metrics
  4. Sc forces output: SQL + result summary + caveats
  5. Gr prevents unsafe queries and enforces row limits / privacy rules

Common failure: wrong joins, misleading causal language
Scale add-ons: metric definitions store, query linting, counterfactual checks

7) SOC / Security Assistant (Alert Triage + Playbooks)

Goal: reduce analyst load by summarizing alerts and suggesting next steps safely.

Elements: Rg + Vx + Lg + Ag + Fc + Rt + Gr + MCP

Flow:

  1. Retrieve prior incidents/playbooks (Rg)
  2. Agent suggests actions (isolate host, query logs)
  3. Executes via MCP servers (SIEM, EDR, ticketing)
  4. Gr ensures safe actions (no destructive steps without approval)
  5. Rt stress-tests prompt injection from malicious logs

Common failure: acting on attacker-controlled text
Scale add-ons: tool allow-lists, human approval gates, sandbox execution

8) On-Device Personal Assistant (Cheap + Private)

Goal: fast assistant with privacy; uses external memory without cloud dependence.

Elements: Sm + Em + Vx + Rg + Gr + MCP

Flow:

  1. Small model (Sm) handles interaction locally
  2. Personal notes indexed with Em/Vx
  3. Rg retrieves relevant context for responses
  4. Device tools (calendar, reminders) via MCP servers (local adapters)
  5. Gr enforces privacy rules and prevents data export

Common failure: limited reasoning depth
Scale add-ons: hybrid routing to larger model for complex tasks (with consent)

9) Synthetic Data Factory for Training + Evaluation

Goal: generate diverse training/eval sets for hard edge cases.

Elements: Sy + Lg/Mm + In + Gr + MCP

Flow:

  1. Define failure modes (e.g., ambiguous queries, OCR noise)
  2. Generate synthetic examples (Sy) with constraints
  3. Use In-style analysis to understand where models fail
  4. Store datasets and pipelines via MCP servers
  5. Guardrails (Gr) ensure no sensitive data leaks into synthetic sets

Common failure: synthetic data that’s too “clean”
Scale add-ons: adversarial generation, noise models, human review sampling

10) Production Reliability Loop (Eval Harness + Guardrails + Red Teaming)

Goal: continuously improve quality, safety, and robustness.

Elements: Gr + Rt + In + Ch + Sc + MCP (+ Th emerging)

Flow:

  1. Define key behaviors and tests (prompt suites, injection tests)
  2. Run regressions after prompt/model/retrieval changes
  3. Use MCP servers to pull logs, traces, and evaluation datasets
  4. Apply interpretability tools (In) to investigate failure clusters
  5. Introduce systematic prompting (Ch) and structure (Sc) to reduce variance
  6. Optionally use thinking models (Th) where deeper reasoning is worth the latency

Common failure: “we shipped a change and broke everything silently”
Scale add-ons: automated gates, canary deploys, quality dashboards

How to Choose the Right Reaction

A quick rule of thumb:

What’s Still Missing (and Why That’s Good)

Like any periodic table, this one will evolve. In practice, teams also need “elements” for:

Those may become future blocks — or they might live as “compounds” that span families.

That’s the fun part: a periodic table is a living map of a field.

Closing Thought

If you take one idea from this post, take this:

Stop debating single ingredients. Start designing reactions.

Once your team can point at components and predict outcomes, your architecture discussions become clearer, faster, and far more practical.

Follow me on LinkedIn, explore my work on GitHub, and know more about me on my portfolio.

Watch the original video from IBM Technology by Martin Keen:

AI Periodic Table Explained: Mapping LLMs, RAG & AI Agent Frameworks

References: https://www.youtube.com/watch?v=ESBMgZHzfG0