# Does inclusion in llms-full.txt change crawl or citation?

> llms-full.txt bundles our cornerstones as one fetchable text file. Does being in it actually get a page read and cited more than being left out? Keep one article in, hold a comparable one out, and measure. This experiment is proposed, and the hold-out arm has a real cost that we name up front.

Experiment: exp-004
Status: planned
Metric: crawler-hits
Canonical: https://geosalience.com/lab/experiments/exp-004
Window: 2026-06-21T00:00:00.000Z → 2026-07-21T00:00:00.000Z
Hypothesis: An article included in llms-full.txt is fetched and cited more by AI systems than a comparable article deliberately excluded from it.
What changed: Treatment article remains in llms-full.txt (status quo). Control article is temporarily removed from the llms-full.txt build for the window. Both pages' HTML, schema, and .md aliases are otherwise unchanged.
Treatment pages: /foundations/what-is-geo
Control pages: /technical/llms-txt-spec-adoption-setup

---
**Status: proposed.** llms-full.txt is unchanged and both articles are currently in it. The dates are proposed. The editor decides whether to run this and confirms the pairing.

## The question

We publish [llms-full.txt](/llms-full.txt), a single text bundle of our cornerstone articles, on the assumption that LLMs find one concatenated file easier to ingest than pages crawled one by one. Does it actually change anything?

## Proposed design

- **Treatment:** keep [What is GEO?](/foundations/what-is-geo) in llms-full.txt.
- **Control:** temporarily remove [the llms.txt spec article](/technical/llms-txt-spec-adoption-setup) from the llms-full.txt build for the window, then restore it.
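- **Primary metric:** per-page AI-crawler hits, plus hits on llms-full.txt itself, compared between the baseline period and the experiment window.
- **Window:** a 14-day baseline before the change, then the 30-day experiment window (2026-06-21 to 2026-07-21), reviewed when it closes.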
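
As a rough illustration of how the primary metric could be tallied, the sketch below counts AI-crawler hits per arm and per phase from server access logs. It rests on several assumptions that are not part of the design: the logs are in combined log format, the user-agent substrings in `AI_CRAWLER_TOKENS` are an illustrative and incomplete list, and the baseline is taken as the 14 days immediately before the window start. The names `count_hits`, `phase_of`, and `TRACKED_PATHS` are hypothetical. Hits on the `.md` aliases are not counted here.

```python
import re
from collections import Counter
from datetime import datetime, timezone

# Assumption: illustrative user-agent substrings; extend to match real logs.
AI_CRAWLER_TOKENS = ("GPTBot", "ClaudeBot", "PerplexityBot")

# Paths taken from the experiment metadata, mapped to their role.
TRACKED_PATHS = {
    "/foundations/what-is-geo": "treatment",
    "/technical/llms-txt-spec-adoption-setup": "control",
    "/llms-full.txt": "bundle",
}

# Window from the metadata; baseline assumed to be the 14 days before it.
WINDOW_START = datetime(2026, 6, 21, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 7, 21, tzinfo=timezone.utc)
BASELINE_START = datetime(2026, 6, 7, tzinfo=timezone.utc)

# Assumption: combined log format, e.g.
# 1.2.3.4 - - [10/Jun/2026:12:00:00 +0000] "GET /path HTTP/1.1" 200 123 "-" "UA"
LOG_LINE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def phase_of(ts):
    """Return 'baseline', 'window', or None for a timezone-aware timestamp."""
    if BASELINE_START <= ts < WINDOW_START:
        return "baseline"
    if WINDOW_START <= ts < WINDOW_END:
        return "window"
    return None


def count_hits(lines):
    """Count AI-crawler hits keyed by (arm, phase)."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue  # skip lines that do not match the assumed format
        if not any(tok in m["ua"] for tok in AI_CRAWLER_TOKENS):
            continue  # not one of the listed crawlers
        path = m["path"].split("?", 1)[0]  # drop any query string
        arm = TRACKED_PATHS.get(path)
        if arm is None:
            continue
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        phase = phase_of(ts)
        if phase is not None:
            counts[(arm, phase)] += 1
    return counts
```

Because the baseline and the window differ in length (14 and 30 days), raw counts from the two phases are only comparable after dividing by the number of days in each.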

## The cost we won't hide

This is the one candidate where the control arm has a price: the held-out cornerstone gets less machine exposure for a month. That is a deliberate, disclosed trade for a clean answer to "does the bundle matter." If the editor judges the exposure too valuable to risk, the honest alternative is to run this only once a third comparable cornerstone is live, so no flagship page bears the cost.

## What would count as a result

A result would be a drop in crawl frequency for the held-out control, a relative rise for the included treatment, or both, that reverses once the control is restored. A clean recover-on-restore pattern would be the strongest version of the signal.
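
One way to make that pattern concrete is to compare each arm's change in daily crawl rate against its own baseline, then check whether the gap between arms closes after the restore. The sketch below is a minimal version under stated assumptions: the `min_drop` threshold of 0.2 and the "at least half the gap closes" recovery rule are hypothetical choices, not part of the proposal, and the names `ArmRates`, `relative_change`, and `summarize` are invented for illustration. It also assumes a post-restore observation period exists, which the proposed window does not yet define.

```python
from dataclasses import dataclass


@dataclass
class ArmRates:
    """Mean AI-crawler hits per day for one page in each phase."""
    baseline: float  # before the change
    window: float    # while the control is held out
    restored: float  # after the control is back in the bundle


def relative_change(before, after):
    """Fractional change from `before` to `after`."""
    if before <= 0:
        raise ValueError("baseline rate must be positive")
    return (after - before) / before


def summarize(treatment, control, min_drop=0.2):
    """Compare the control's change with the treatment's change.

    `min_drop` is a hypothetical threshold: how far the control must fall
    relative to the treatment before it counts as a drop.
    """
    t_window = relative_change(treatment.baseline, treatment.window)
    c_window = relative_change(control.baseline, control.window)
    window_gap = c_window - t_window  # negative: control fell behind

    t_restored = relative_change(treatment.baseline, treatment.restored)
    c_restored = relative_change(control.baseline, control.restored)
    restored_gap = c_restored - t_restored

    dropped = window_gap <= -min_drop
    # Hypothetical recovery rule: at least half of the gap closes on restore.
    recovered = dropped and restored_gap >= window_gap / 2
    return {
        "window_gap": window_gap,
        "restored_gap": restored_gap,
        "dropped": dropped,
        "recovered": recovered,
    }
```

With one page per arm, a summary like this is descriptive rather than a significance test: it can show whether the recover-on-restore shape is present, but not how likely that shape is to arise by chance.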