Technical · 4 min read

JSON-LD Recipes for Articles, Datasets, and FAQs

Copy-paste JSON-LD blocks that pass schema.org validators, render correctly in Google Rich Results, and surface in AI search engines. Annotated with what each field actually does.

GeoSalience

Published 7 June 2026 · Updated 8 June 2026

Six JSON-LD blocks cover almost everything a content site needs for AI search: Article, FAQPage, Dataset, Person, Organization, and BreadcrumbList. Every snippet below is the exact markup we run in production on GeoSalience as of 2026-06-08, annotated with what each field actually does. Copy it, swap in your values, and validate before you ship. For why structured data matters to AI search in the first place, see our explainer on what GEO is.

Article

For every editorial post:

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your article title",
  "description": "The deck — your dek, your subtitle, 1–2 sentences.",
  "image": "https://example.com/og-image.png",
  "datePublished": "2026-05-19T00:00:00Z",
  "dateModified": "2026-05-19T00:00:00Z",
  "author": [
    {
      "@type": "Person",
      "name": "Author Name",
      "url": "https://example.com/authors/author-slug"
    }
  ],
  "publisher": {
    "@type": "Organization",
    "name": "Your Publication",
    "url": "https://example.com",
    "logo": { "@type": "ImageObject", "url": "https://example.com/logo.png" }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.com/article-slug"
  }
}

Field notes:

headline — keep it concise. Google's earlier guidance capped headlines at 110 characters, and long titles can still be truncated in rich results.
image — must be ≥1200px wide for Top Stories eligibility.
dateModified — bump this every time you make a substantive edit. Some LLMs use it as a freshness signal.
author — array, even with one author. Future-proofs co-author additions.
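
The field notes above can be folded into a small builder. This is a minimal sketch, not the code in lib/seo.ts: the names buildArticleJsonLd, ArticleInput, and AuthorInput are hypothetical, and the publisher block is left out for brevity. It accepts one author or several but always emits an array, and it falls back to datePublished when no edit date is supplied.

```typescript
// Hypothetical helper for illustration; not taken from the GeoSalience repo.
interface AuthorInput {
  name: string;
  url: string;
}

interface ArticleInput {
  headline: string;
  description: string;
  image: string;
  url: string; // canonical URL of the article page
  datePublished: Date;
  dateModified?: Date; // omit for a post that has never been edited
  authors: AuthorInput | AuthorInput[];
}

function buildArticleJsonLd(input: ArticleInput): Record<string, unknown> {
  // Accept a single author or a list, but always emit an array.
  const authors = Array.isArray(input.authors) ? input.authors : [input.authors];
  // An unedited post reuses its publication date as dateModified.
  const modified = input.dateModified ?? input.datePublished;

  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: input.headline,
    description: input.description,
    image: input.image,
    datePublished: input.datePublished.toISOString(),
    dateModified: modified.toISOString(),
    author: authors.map((a) => ({ "@type": "Person", name: a.name, url: a.url })),
    mainEntityOfPage: { "@type": "WebPage", "@id": input.url },
  };
}
```

Taking Date objects rather than strings means the output is always a full ISO 8601 timestamp, at the cost of a conversion step if your CMS stores dates as text.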

FAQPage

For a page that's a list of question-answer pairs:

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is X?",
      "acceptedAnswer": { "@type": "Answer", "text": "X is …" }
    },
    {
      "@type": "Question",
      "name": "How do I Y?",
      "acceptedAnswer": { "@type": "Answer", "text": "To Y, …" }
    }
  ]
}

Field notes:

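Only apply FAQPage to pages that are actually FAQ-shaped. Since August 2023 Google shows FAQ rich results only for well-known, authoritative government and health sites, so for most sites the markup serves machine readability rather than a SERP feature.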
acceptedAnswer.text can include limited HTML. Keep it short and complete — if you need 500 words to answer, FAQPage is the wrong format.
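
If your questions and answers already live in a CMS or a Markdown file, the block can be generated rather than hand-written. The sketch below assumes a hypothetical FaqPair shape and two hypothetical helpers, buildFaqPageJsonLd and findOverlongAnswers. The 500-word cutoff only echoes the rule of thumb above; it is not a schema.org or Google limit.

```typescript
// Hypothetical helpers for illustration.
interface FaqPair {
  question: string;
  answer: string; // plain text or limited HTML
}

function buildFaqPageJsonLd(pairs: FaqPair[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: pairs.map((p) => ({
      "@type": "Question",
      name: p.question,
      acceptedAnswer: { "@type": "Answer", text: p.answer },
    })),
  };
}

// Returns the questions whose answers exceed maxWords whitespace-separated
// words. The default threshold is a rule of thumb, not an official limit.
function findOverlongAnswers(pairs: FaqPair[], maxWords = 500): string[] {
  return pairs
    .filter((p) => p.answer.trim().split(/\s+/).length > maxWords)
    .map((p) => p.question);
}
```

Running findOverlongAnswers at build time is one way to notice when a page has outgrown the FAQ format before the markup ships.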

Dataset

For a structured-data download (this is our flagship recipe for GEO):

{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Your Dataset Name — Raw Results",
  "description": "One-sentence description of the columns: e.g. prompt, brand, LLM, citation, ground truth.",
  "url": "https://example.com/datasets/your-dataset.csv",
  "encodingFormat": "text/csv",
  "creator": { "@type": "Organization", "name": "Your Publication", "url": "https://example.com" },
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "datePublished": "2026-06-07",
  "isAccessibleForFree": true,
  "keywords": ["geo", "benchmark", "ai-search", "citations"]
}

Field notes:

Dataset typing makes a structured download explicitly machine-readable and licensable — it tells an engine "this is data, here is how it's formatted, here is who may use it." We publish our own measurements this way; the citation-rate and crawler datasets behind our lab carry exactly this block, and we're measuring whether it moves our own citation rate over time rather than asserting a number we haven't earned yet.
license carries the reuse terms with the data. A permissive license like CC-BY states the attribution expectation explicitly, which is the rationale for using it on data you want cited back.
isAccessibleForFree: true tells the engine the dataset can be linked and fetched without a paywall.
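
A builder for the Dataset block can also guard the two fields that are easiest to get wrong. This is a sketch under stated assumptions: buildDatasetJsonLd and DatasetInput are hypothetical names, and the checks (license must be a URL, datePublished must be YYYY-MM-DD) mirror the block above rather than any rule enforced by schema.org.

```typescript
// Hypothetical helper for illustration; the checks are this sketch's own.
interface DatasetInput {
  name: string;
  description: string;
  url: string;
  encodingFormat: string; // MIME type, e.g. "text/csv"
  creatorName: string;
  creatorUrl: string;
  license: string; // URL of the license text
  datePublished: string; // YYYY-MM-DD
  keywords: string[];
}

function buildDatasetJsonLd(input: DatasetInput): Record<string, unknown> {
  // A bare label such as "CC-BY" is rejected: this sketch wants the license URL.
  if (!/^https?:\/\//.test(input.license)) {
    throw new Error("license should be a URL");
  }
  if (!/^\d{4}-\d{2}-\d{2}$/.test(input.datePublished)) {
    throw new Error("datePublished should look like YYYY-MM-DD");
  }

  return {
    "@context": "https://schema.org",
    "@type": "Dataset",
    name: input.name,
    description: input.description,
    url: input.url,
    encodingFormat: input.encodingFormat,
    creator: { "@type": "Organization", name: input.creatorName, url: input.creatorUrl },
    license: input.license,
    datePublished: input.datePublished,
    isAccessibleForFree: true, // this sketch only covers openly downloadable data
    keywords: input.keywords,
  };
}
```

Throwing at build time is a deliberate choice here: a missing or malformed license is cheaper to catch before deployment than after an engine has indexed the page.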

Person

For author bio pages:

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Author Name",
  "jobTitle": "Editor",
  "description": "What this person does.",
  "url": "https://example.com/authors/slug",
  "sameAs": [
    "https://twitter.com/handle",
    "https://linkedin.com/in/handle",
    "https://github.com/handle"
  ]
}

Field notes:

sameAs is the field LLMs use to disambiguate "Jane Doe at Acme" from any other Jane Doe. Always link your verified social profiles.
jobTitle is useful but not required.

Organization (for the publisher)

Goes in your site's root layout — one block per page:

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Publication",
  "url": "https://example.com",
  "logo": { "@type": "ImageObject", "url": "https://example.com/logo.png" },
  "sameAs": ["https://twitter.com/handle", "https://linkedin.com/company/handle"]
}

Field notes:

This is a publisher-identity signal. LLMs use it to assess source trustworthiness.
logo should meet Google's structured-data logo guidance: a rectangular image, ideally 600×60, and at least 112×112px.

BreadcrumbList

For navigation context — Google uses this to render breadcrumbs in SERP:

{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com" },
    { "@type": "ListItem", "position": 2, "name": "Foundations", "item": "https://example.com/foundations" },
    { "@type": "ListItem", "position": 3, "name": "What is GEO?", "item": "https://example.com/foundations/what-is-geo" }
  ]
}

Field notes:

position starts at 1.
The last item's item URL must match the canonical of the current page.
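
Both rules are easy to enforce in code. The sketch below assumes a hypothetical Crumb type and a hypothetical buildBreadcrumbJsonLd helper: positions are derived from the array index so they always start at 1, and the builder refuses a trail whose last URL differs from the canonical you pass in.

```typescript
// Hypothetical helper for illustration.
interface Crumb {
  name: string;
  url: string;
}

function buildBreadcrumbJsonLd(
  trail: Crumb[],
  canonicalUrl: string,
): Record<string, unknown> {
  const last = trail[trail.length - 1];
  if (last === undefined) {
    throw new Error("breadcrumb trail is empty");
  }
  // The final crumb must point at the page the markup is rendered on.
  if (last.url !== canonicalUrl) {
    throw new Error("last breadcrumb URL does not match the canonical URL");
  }

  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: trail.map((crumb, index) => ({
      "@type": "ListItem",
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: crumb.url,
    })),
  };
}
```

The comparison is an exact string match, so normalise trailing slashes the same way on both sides before calling it.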

Common mistakes

Schema in the wrong place — JSON-LD lives in a <script type="application/ld+json"> element, either in the <head> or inline in the article markup. Don't put it in <noscript> or hidden divs.
Mismatching dates — datePublished and dateModified should match the published and updated dates shown on the page.
Inflated dateModified — bumping this without actually editing the article is a credibility risk.
Missing mainEntityOfPage — small thing but it's how the search engine confirms which URL this article block is about.
Schema that contradicts the page — if headline differs from the visible <h1>, Google can treat the markup as misleading, and pages that violate its structured-data policies can lose rich-result eligibility.
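
For the first item in the list, here is a minimal sketch of rendering a JSON-LD object into its script element. The helper name toJsonLdScriptTag is hypothetical. The one non-obvious step is escaping "<": without it, a headline containing "</script>" would close the element early.

```typescript
// Hypothetical helper for illustration.
function toJsonLdScriptTag(data: Record<string, unknown>): string {
  // "<" can only occur inside JSON string values, so replacing it with its
  // \u003c escape keeps the JSON valid and equivalent once parsed.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return `<script type="application/ld+json">${json}</script>`;
}

const exampleTag = toJsonLdScriptTag({
  "@context": "https://schema.org",
  "@type": "Article",
  headline: "Your article title",
});
```

The resulting string can be placed in the <head> or inline in the article markup. Frameworks that escape text content automatically may need an explicit raw-HTML insertion for this one element.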

Validation

Always validate before deploying:

Schema.org Validator — catches syntax errors and types or properties that don't exist in the schema.org vocabulary. It does not check Google's required fields.
Google Rich Results Test — confirms eligibility for specific Google features.
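
Neither tool knows what your page actually displays, so several of the common mistakes listed earlier slip past both. A local pre-flight check can cover that gap. The sketch below is an assumption-laden illustration: lintArticleJsonLd and PageFacts are hypothetical names, you have to supply the visible <h1> and canonical URL yourself, and the "modified before published" check is an extra sanity test that the list above does not mention.

```typescript
// Hypothetical pre-flight check; names and messages are illustrative.
interface PageFacts {
  visibleH1: string; // text of the page's <h1>
  canonicalUrl: string; // value of <link rel="canonical">
}

function lintArticleJsonLd(
  article: Record<string, unknown>,
  page: PageFacts,
): string[] {
  const problems: string[] = [];

  if (article["headline"] !== page.visibleH1) {
    problems.push("headline differs from the visible <h1>");
  }

  const published = Date.parse(String(article["datePublished"]));
  const modified = Date.parse(String(article["dateModified"]));
  if (Number.isNaN(published) || Number.isNaN(modified)) {
    problems.push("datePublished or dateModified is missing or unparseable");
  } else if (modified < published) {
    // Extra sanity check added by this sketch.
    problems.push("dateModified is earlier than datePublished");
  }

  const main = article["mainEntityOfPage"];
  const mainId =
    typeof main === "object" && main !== null
      ? (main as Record<string, unknown>)["@id"]
      : undefined;
  if (mainId === undefined) {
    problems.push("mainEntityOfPage is missing");
  } else if (mainId !== page.canonicalUrl) {
    problems.push("mainEntityOfPage does not match the canonical URL");
  }

  return problems;
}
```

An empty array means none of these checks fired. It does not replace the two validators above, which still need to see the rendered page.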

How we wrote this

These recipes are taken directly from production code on GeoSalience, as of 2026-06-07. Inspect lib/seo.ts in our repo for the live implementations. The Article, Organization, BreadcrumbList, and Dataset blocks above all ship on real pages — the lab case study, for instance, carries the Dataset block for its citation-rate data.

Changelog

Published — 7 June 2026
Updated — 8 June 2026


