# JSON-LD Recipes for Articles, Datasets, and FAQs

> Copy-paste JSON-LD blocks that pass schema.org validators, render correctly in Google Rich Results, and surface in AI search engines. Annotated with what each field actually does.

Canonical: https://geosalience.com/technical/json-ld-recipes
Published: 2026-06-07T00:00:00.000Z
Updated: 2026-06-07T00:00:00.000Z
Pillar: technical
Authors: geosalience

---
You're not going to remember the entire schema.org spec, and you don't need to. These are the six blocks that cover most practical use on a content site, with notes on what each field is doing. Every snippet here is JSON-LD we run in production on GeoSalience as of 2026-06-07. See [what GEO is](/foundations/what-is-geo) for why structured data matters to AI search in the first place.

## Article

For every editorial post:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your article title",
  "description": "The deck — your dek, your subtitle, 1–2 sentences.",
  "image": "https://example.com/og-image.png",
  "datePublished": "2026-05-19T00:00:00Z",
  "dateModified": "2026-05-19T00:00:00Z",
  "author": [
    {
      "@type": "Person",
      "name": "Author Name",
      "url": "https://example.com/authors/author-slug"
    }
  ],
  "publisher": {
    "@type": "Organization",
    "name": "Your Publication",
    "url": "https://example.com",
    "logo": { "@type": "ImageObject", "url": "https://example.com/logo.png" }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://example.com/article-slug"
  }
}
```

Field notes:

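- `headline`: keep it concise. Google's older guidelines capped it at 110 characters; the current ones drop the hard limit but warn that long titles may be truncated on some devices.
- `image`: use an image at least 1200px wide; Google has tied that width to large-image features such as Top Stories.
- `dateModified`: bump this every time you make a substantive edit, and only then. Engines, LLM-backed ones included, may read it as a freshness signal.
- `author`: an array, even with one author, so adding a co-author later doesn't change the field's shape.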
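
The field notes above can be enforced at build time instead of remembered. This is a minimal sketch, not the code in our repo: `ArticleInput` and `buildArticleJsonLd` are hypothetical names, and `publisher` is left out to keep it short.

```typescript
// Hypothetical input shape for illustration; not taken from lib/seo.ts.
interface ArticleInput {
  headline: string;
  description: string;
  imageUrl: string;
  canonicalUrl: string;
  publishedAt: Date;
  modifiedAt?: Date; // omit when the article has never been edited
  authors: { name: string; url: string }[];
}

function buildArticleJsonLd(input: ArticleInput): Record<string, unknown> {
  if (input.authors.length === 0) {
    throw new Error("Article needs at least one author");
  }
  // Fall back to the publish date so dateModified is never missing.
  const modifiedAt = input.modifiedAt ?? input.publishedAt;
  if (modifiedAt.getTime() < input.publishedAt.getTime()) {
    throw new Error("dateModified cannot precede datePublished");
  }
  return {
    "@context": "https://schema.org",
    "@type": "Article",
    headline: input.headline,
    description: input.description,
    image: input.imageUrl,
    datePublished: input.publishedAt.toISOString(),
    dateModified: modifiedAt.toISOString(),
    // Always an array, even for a single author.
    author: input.authors.map((a) => ({
      "@type": "Person",
      name: a.name,
      url: a.url,
    })),
    mainEntityOfPage: { "@type": "WebPage", "@id": input.canonicalUrl },
  };
}
```

Taking `Date` objects rather than strings means the ISO 8601 formatting happens in one place, and a `dateModified` earlier than `datePublished` fails the build rather than reaching a crawler.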

## FAQPage

For a page that's a list of question-answer pairs:

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is X?",
      "acceptedAnswer": { "@type": "Answer", "text": "X is …" }
    },
    {
      "@type": "Question",
      "name": "How do I Y?",
      "acceptedAnswer": { "@type": "Answer", "text": "To Y, …" }
    }
  ]
}
```

Field notes:

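- Only apply FAQPage to pages that are *actually* FAQ-shaped; markup that doesn't match the visible page violates Google's structured data guidelines. Since August 2023 Google has also limited FAQ rich results to well-known, authoritative government and health sites, so on most sites the block serves parsers and AI engines, not a SERP feature.
- `acceptedAnswer.text` can include limited HTML. Keep it short and complete: if you need 500 words to answer, FAQPage is the wrong format.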
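
If your questions and answers already live in a content file, the block can be generated, not hand-written. A rough sketch with hypothetical names (`FaqEntry`, `buildFaqPageJsonLd`, `overlongAnswers`); the 500-word default is just the rule of thumb from the note above, not a limit any validator enforces.

```typescript
// Hypothetical shape for one question-answer pair.
interface FaqEntry {
  question: string;
  answer: string;
}

function buildFaqPageJsonLd(entries: FaqEntry[]): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: entries.map((e) => ({
      "@type": "Question",
      name: e.question,
      acceptedAnswer: { "@type": "Answer", text: e.answer },
    })),
  };
}

// Returns the questions whose answers run past maxWords, so they can be
// trimmed or moved to a full article before the block ships.
function overlongAnswers(entries: FaqEntry[], maxWords: number = 500): string[] {
  return entries
    .filter((e) => e.answer.trim().split(/\s+/).filter(Boolean).length > maxWords)
    .map((e) => e.question);
}
```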

## Dataset

For a structured-data download (this is our flagship recipe for GEO):

```json
{
  "@context": "https://schema.org",
  "@type": "Dataset",
  "name": "Your Dataset Name — Raw Results",
  "description": "One-sentence description of the columns: e.g. prompt, brand, LLM, citation, ground truth.",
  "url": "https://example.com/datasets/your-dataset.csv",
  "encodingFormat": "text/csv",
  "creator": { "@type": "Organization", "name": "Your Publication", "url": "https://example.com" },
  "license": "https://creativecommons.org/licenses/by/4.0/",
  "datePublished": "2026-06-07",
  "isAccessibleForFree": true,
  "keywords": ["geo", "benchmark", "ai-search", "citations"]
}
```

Field notes:

- `Dataset` typing makes a structured download explicitly machine-readable and licensable — it tells an engine "this is data, here is how it's formatted, here is who may use it." We publish our own measurements this way; the citation-rate and crawler datasets behind [our lab](/lab) carry exactly this block, and we're measuring whether it moves our own [citation rate](/glossary/citation-rate) over time rather than asserting a number we haven't earned yet.
- `license` carries the reuse terms with the data. A permissive license like CC-BY states the attribution expectation explicitly, which is the rationale for using it on data you *want* cited back.
- `isAccessibleForFree: true` tells the engine the dataset can be linked and fetched without a paywall.
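
For a site that publishes more than one dataset, a small builder keeps the blocks uniform. The sketch below is an illustration under assumptions, not our production helper: `DatasetInput` and `buildDatasetJsonLd` are made-up names, and lowercasing and de-duplicating `keywords` is a convenience choice, not something schema.org asks for.

```typescript
// Hypothetical input shape for illustration.
interface DatasetInput {
  name: string;
  description: string;
  downloadUrl: string;
  encodingFormat: string; // e.g. "text/csv"
  licenseUrl: string;
  publishedOn: Date;
  isFree: boolean;
  keywords: string[];
  creator: { name: string; url: string };
}

function buildDatasetJsonLd(input: DatasetInput): Record<string, unknown> {
  // Normalise keywords: trim, lowercase, drop blanks and repeats, keep order.
  const normalized = input.keywords.map((k) => k.trim().toLowerCase());
  const keywords = normalized.filter(
    (k, i) => k.length > 0 && normalized.indexOf(k) === i
  );
  return {
    "@context": "https://schema.org",
    "@type": "Dataset",
    name: input.name,
    description: input.description,
    url: input.downloadUrl,
    encodingFormat: input.encodingFormat,
    creator: {
      "@type": "Organization",
      name: input.creator.name,
      url: input.creator.url,
    },
    license: input.licenseUrl,
    // Date-only form (YYYY-MM-DD), matching the recipe above; taken in UTC.
    datePublished: input.publishedOn.toISOString().slice(0, 10),
    isAccessibleForFree: input.isFree,
    keywords,
  };
}
```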

## Person

For author bio pages:

```json
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Author Name",
  "jobTitle": "Editor",
  "description": "What this person does.",
  "url": "https://example.com/authors/slug",
  "sameAs": [
    "https://twitter.com/handle",
    "https://linkedin.com/in/handle",
    "https://github.com/handle"
  ]
}
```

Field notes:

- `sameAs` is the field that lets an engine disambiguate "Jane Doe at Acme" from any other Jane Doe. Always link your verified social profiles.
- `jobTitle` is useful but not required.

## Organization (for the publisher)

Goes in your site's root layout — one block per page:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Your Publication",
  "url": "https://example.com",
  "logo": { "@type": "ImageObject", "url": "https://example.com/logo.png" },
  "sameAs": ["https://twitter.com/handle", "https://linkedin.com/company/handle"]
}
```

Field notes:

- This is a publisher-identity signal: it ties the site to one named entity, which engines can weigh when assessing a source.
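- `logo` should meet Google's logo guidance: at least 112×112px, crawlable, and legible on a white background. The 600×60 rectangle comes from the older AMP article guidelines and conflicts with the 112px minimum, so don't size to it.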

## BreadcrumbList

For navigation context — Google uses this to render breadcrumbs in SERP:

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com" },
    { "@type": "ListItem", "position": 2, "name": "Foundations", "item": "https://example.com/foundations" },
    { "@type": "ListItem", "position": 3, "name": "What is GEO?", "item": "https://example.com/foundations/what-is-geo" }
  ]
}
```

Field notes:

- `position` starts at 1.
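- The last item's `item` URL should match the canonical of the current page. Google lets you omit `item` on the final element, in which case it uses the URL of the containing page.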
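
Both notes are easy to get wrong by hand and easy to guarantee in code. A minimal sketch, assuming you already have the trail as an ordered list; `Crumb` and `buildBreadcrumbJsonLd` are hypothetical names.

```typescript
// Hypothetical shape for one breadcrumb, ordered from the site root down.
interface Crumb {
  name: string;
  url: string;
}

function buildBreadcrumbJsonLd(
  trail: Crumb[],
  canonicalUrl: string
): Record<string, unknown> {
  const last = trail[trail.length - 1];
  if (last === undefined) {
    throw new Error("Breadcrumb trail is empty");
  }
  if (last.url !== canonicalUrl) {
    throw new Error("Last breadcrumb must point at the page's canonical URL");
  }
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    itemListElement: trail.map((crumb, index) => ({
      "@type": "ListItem",
      position: index + 1, // positions are 1-based
      name: crumb.name,
      item: crumb.url,
    })),
  };
}
```

Deriving `position` from the array index means reordering or inserting a level can never leave a gap or a duplicate.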

## Common mistakes

1. **Schema in the wrong container**: JSON-LD lives in `<script type="application/ld+json">`, either in the `<head>` or inline in the article markup. Don't put it in `<noscript>` or hidden divs.
2. **Mismatching dates**: `datePublished` and `dateModified` should match the dates shown on the page.
3. **Inflated `dateModified`**: bumping this without actually editing the article is a credibility risk.
4. **Missing `mainEntityOfPage`**: a small thing, but it's how the search engine confirms which URL this article block is *about*.
5. **Schema for SEO that contradicts the page**: if `headline` differs from the visible `<h1>`, the markup no longer describes what readers see. Google's structured data guidelines treat that as misleading markup, which can cost rich-result eligibility.
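
Mistakes 1 and 5 can be guarded against where the block is rendered. The following is a sketch, not our production code: `jsonLdScriptTag` and `headlineMatchesTitle` are hypothetical helpers. The `<` escape is there because a string value containing `</script>` would otherwise close the tag early.

```typescript
// Serialise a JSON-LD block into the script tag it belongs in.
function jsonLdScriptTag(block: Record<string, unknown>): string {
  // "\u003c" is a valid JSON escape for "<", so parsers read the same value
  // while the HTML tokenizer never sees a literal "</script>" in the payload.
  const json = JSON.stringify(block).replace(/</g, "\\u003c");
  return '<script type="application/ld+json">' + json + "</script>";
}

// Compare the block's headline with the visible <h1> text, ignoring
// differences in whitespace only.
function headlineMatchesTitle(
  block: Record<string, unknown>,
  visibleTitle: string
): boolean {
  const collapse = (s: string): string => s.replace(/\s+/g, " ").trim();
  const headline = block["headline"];
  return (
    typeof headline === "string" &&
    collapse(headline) === collapse(visibleTitle)
  );
}
```

Calling `headlineMatchesTitle` in a build step or test turns a silent mismatch into a failing check.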

## Validation

Always validate before deploying:

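- [Schema.org Validator](https://validator.schema.org/): catches syntax errors and properties that don't belong to the declared type. schema.org itself has no required fields, so a clean pass here says nothing about Google eligibility.
- [Google Rich Results Test](https://search.google.com/test/rich-results): confirms eligibility for specific Google features, including the required and recommended fields for each.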
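
Both tools are manual, so a cheap local pre-check in CI can catch the obvious breakage before anything reaches them. This is a rough sketch and no substitute for either validator: `precheckJsonLd` is a hypothetical function, and the `EXPECTED_KEYS` table is our own shortlist based on the recipes on this page, not an official list of required fields.

```typescript
// Assumed per-type shortlist, derived from the recipes above.
const EXPECTED_KEYS: Record<string, string[]> = {
  Article: ["headline", "datePublished", "author", "mainEntityOfPage"],
  FAQPage: ["mainEntity"],
  Dataset: ["name", "description", "license"],
  BreadcrumbList: ["itemListElement"],
};

// Returns a list of problems; an empty list means the block passed.
function precheckJsonLd(raw: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return ["not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["top level must be a JSON object"];
  }
  const block = parsed as Record<string, unknown>;
  const problems: string[] = [];
  if (block["@context"] !== "https://schema.org") {
    problems.push('"@context" should be "https://schema.org"');
  }
  const type = block["@type"];
  if (typeof type !== "string") {
    problems.push('missing "@type"');
    return problems;
  }
  // Own-property check so a type like "constructor" cannot hit the prototype.
  const expected = Object.prototype.hasOwnProperty.call(EXPECTED_KEYS, type)
    ? EXPECTED_KEYS[type] ?? []
    : [];
  for (const key of expected) {
    if (!(key in block)) {
      problems.push(type + ' is missing "' + key + '"');
    }
  }
  return problems;
}
```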

## How we wrote this

These recipes are taken directly from production code on GeoSalience, as of 2026-06-07. Inspect `lib/seo.ts` in [our repo](https://github.com/geosalience/geosalience) for the live implementations. The `Article`, `Organization`, `BreadcrumbList`, and `Dataset` blocks above all ship on real pages — the [lab case study](/case-studies/geosalience-as-its-own-case-study), for instance, carries the `Dataset` block for its citation-rate data.

## See also

- [Technical pillar](/technical) — every article we publish on the technical side of GEO.
- [The llms.txt spec: adoption and setup](/technical/llms-txt-spec-adoption-setup) — the other machine-readability layer, alongside JSON-LD.
- [What is GEO](/foundations/what-is-geo) — the concept these recipes serve.
- [This site is our GEO lab](/case-studies/geosalience-as-its-own-case-study) — where we run this exact `Dataset` markup on our own data.
- [GEO](/glossary/geo) — the one-line definition, if you landed here cold.