GeoSalience
exp-001 · Concluded · Mixed

Baseline established — the day-zero snapshot

Before any GEO experiment can claim a lift, it needs a zero point. This records the state of GeoSalience at launch on 2026-05-31: the corpus, the crawler policy, and exactly which metrics are — and are not yet — being measured.

Result

At launch the live corpus was 2 articles (both cornerstone) plus 6 glossary terms; robots.txt welcomes the major AI crawlers and Cloudflare runs DNS-only, so bots reach the origin directly. AI-crawler hit tracking and the citation-rate harness were not yet streaming on 2026-05-31, so the measured baseline for both is 'pending first run' — expected near zero. This zero point anchors every later experiment.

Hypothesis
There is no hypothesis to test here. A baseline is the reference against which every later experiment's before/after is measured. Without a dated zero point, no future lift is credible.
What changed
None. No page was changed. This experiment is an observation of the launch state, not an intervention.
Treatment
none
Control
none
Metric
Mixed
Window
31 May 2026 to 31 May 2026

GeoSalience launched on 2026-05-31. This experiment records the zero point — the state of the site on day one — so that every later experiment has an honest "before" to measure against.

Key facts at baseline (2026-05-31):

  • Live corpus: 2 articles, both cornerstone — What is GEO? and The llms.txt spec: adoption and setup — plus 6 glossary terms. Ten further articles exist as drafts, gated on primary research, and are hidden from listings.
  • Crawler policy: robots.txt welcomes the major AI crawlers (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended and others). Cloudflare runs in DNS-only mode, so those crawlers reach the origin directly rather than being filtered at the edge.
  • GEO surface live: llms.txt, llms-full.txt, .md aliases on every article, Article/Dataset/Breadcrumb JSON-LD, canonical URLs, sitemap, and RSS/Atom/JSON feeds — all verified returning 200 over HTTPS at launch.
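
The crawler-policy claim above can be checked mechanically. The following is a minimal sketch using Python's standard-library robots.txt parser; the function name `allowed_crawlers` and the sample robots.txt body are illustrative assumptions, not the site's actual file or tooling. The crawler names come from the list above, and "and others" is not expanded.

```python
from urllib.robotparser import RobotFileParser

# Crawler names as listed in the baseline; "and others" is deliberately not guessed at.
AI_CRAWLERS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot", "Google-Extended"]


def allowed_crawlers(robots_txt: str, url: str, agents=AI_CRAWLERS) -> dict:
    """Return {agent: True/False} for whether each agent may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical robots.txt body for illustration only (NOT the live file).
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: *
Allow: /
"""

if __name__ == "__main__":
    for agent, ok in allowed_crawlers(SAMPLE_ROBOTS, "https://geosalience.com/").items():
        print(f"{agent}: {'allowed' if ok else 'blocked'}")
```

To run it against the real policy, pass the text fetched from https://geosalience.com/robots.txt in place of `SAMPLE_ROBOTS`.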

What is measured at baseline, and what is not

This is the honest part. A baseline is only useful if it says exactly what it knows.

  • AI-crawler hits — not yet measured. The nginx-log parser that turns crawler footprints into a daily dataset is scheduled but was not streaming on 2026-05-31. The day-zero record for crawler hits is therefore "pending first run," not "zero confirmed."
  • Citation rate (North Star) — not yet measured. The 50-prompt, 4-LLM harness had not run a baseline at launch. The expected day-zero value is near 0%, but we record it as "pending," not as a measured 0%.
  • Traffic — off by design. Plausible is env-gated off at launch.
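
For readers wondering what "turning crawler footprints into a daily dataset" might look like, here is a rough sketch. It is not the site's actual parser. It assumes nginx's default `combined` log format, and the names `LOG_LINE`, `CRAWLER_TOKENS`, and `daily_crawler_hits` are made up for illustration. Google-Extended is left out on the understanding that it is a robots.txt control token rather than a user agent that shows up in request logs.

```python
import re
from collections import Counter
from datetime import datetime

# Assumes nginx's default "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent
# "$http_referer" "$http_user_agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

# User-agent substrings to count (illustrative subset of the crawlers named above).
CRAWLER_TOKENS = ["GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]


def daily_crawler_hits(lines):
    """Count hits per (ISO date, crawler token); unparseable lines are skipped."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        # Example timestamp: 31/May/2026:10:00:00 +0000
        day = datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z").date().isoformat()
        user_agent = match["ua"].lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in user_agent:
                hits[(day, token)] += 1
                break  # count each request once
    return hits
```

An empty result from this function means "no crawler hits in the lines supplied." That is a different statement from "pending first run," which means the parser has not processed any logs yet.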

We do not fabricate a number we did not collect. As each pipeline produces its first run, this page's changelog is updated with the real figure.
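
The distinction between "pending" and a measured 0% is easy to lose in a spreadsheet, where an empty cell quietly becomes a zero. One way to keep it explicit is sketched below. The `Metric` class is hypothetical, and treating the denominator as 200 (50 prompts × 4 LLMs, one response each) is an assumption about how the harness defines the rate, not something recorded at launch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metric:
    name: str
    cited: Optional[int] = None  # None means the pipeline has not run yet
    total: Optional[int] = None  # number of responses checked, once it has

    def report(self) -> str:
        # Refuse to print a percentage for data that was never collected.
        if self.cited is None or self.total is None or self.total == 0:
            return f"{self.name}: pending first run"
        rate = 100 * self.cited / self.total
        return f"{self.name}: {rate:.1f}% ({self.cited}/{self.total})"


if __name__ == "__main__":
    # State at launch: nothing collected.
    print(Metric("citation rate").report())
    # Hypothetical future run, shown only to contrast with "pending".
    print(Metric("citation rate", cited=0, total=200).report())
```

The first call prints "pending first run"; the second, with hypothetical inputs, prints a measured 0.0%. Only the first describes 2026-05-31.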

Why a near-zero baseline is valuable, not embarrassing

The entire point of a build-in-public GEO lab is the curve, not the starting height. A site that can show "citation rate went from a first measured value near 0% to X% on a later date, and here is the experiment that moved it" has primary research no competitor can copy. That story is only possible if the starting state, including what was not yet measured on 2026-05-31, is recorded honestly today.

How to reproduce this snapshot

Every claim above is checkable: fetch https://geosalience.com/robots.txt, https://geosalience.com/llms.txt, and any article's .md alias; count the articles visible on the home page; read the launch entry in the project changelog dated 2026-05-31. Nothing here depends on private data.
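
The fetch step can be scripted. This is a minimal sketch, not an official checker: `http_status` and `check_snapshot` are illustrative names, only the two fixed URLs named above are included (article `.md` aliases vary per article and are left for the reader to add), and the results reflect the site at the time of the run rather than at launch.

```python
from urllib.request import urlopen

# The two fixed URLs named in the text; add any article's .md alias as needed.
SNAPSHOT_URLS = [
    "https://geosalience.com/robots.txt",
    "https://geosalience.com/llms.txt",
]


def http_status(url: str, timeout: float = 10.0) -> int:
    """Fetch `url` and return its HTTP status code."""
    with urlopen(url, timeout=timeout) as response:
        return response.status


def check_snapshot(urls=SNAPSHOT_URLS, fetch=http_status) -> dict:
    """Map each URL to its status code, or to an error string if the fetch failed."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except OSError as exc:  # urllib's URLError and HTTPError subclass OSError
            results[url] = f"error: {exc}"
    return results
```

Calling `check_snapshot()` with no arguments performs the live requests; the `fetch` parameter exists so the logic can be exercised without network access.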

Limitations

This is a snapshot of corpus and policy, not of measured traffic. The crawler and citation pipelines were not yet streaming at launch, so their day-zero values are "pending first run." The corpus is small and single-domain. Those limits are the reason exp-001 exists: to make the starting line explicit before anything is claimed to have moved.

Limitations

  • This is a corpus-and-policy snapshot, not a measured-traffic snapshot. Plausible, the AI-crawler log parser, and the citation harness are deliberately off or unrun at launch, so day-zero crawler hits and citation rate are 'not yet measured', not '0 confirmed'.
  • A small corpus (2 live articles) means later experiments have few comparable pages to use as matched control/treatment pairs.
  • Single domain. Everything here is about geosalience.com only.

Changelog

  • Published — 31 May 2026

Raw markdown: /lab/experiments/exp-001.md