# Baseline established — the day-zero snapshot

> Before any GEO experiment can claim a lift, it needs a zero point. This records the state of GeoSalience at launch on 2026-05-31: the corpus, the crawler policy, and exactly which metrics are — and are not yet — being measured.

Experiment: exp-001
Status: concluded
Metric: mixed
Canonical: https://geosalience.com/lab/experiments/exp-001
Window: 2026-05-31T00:00:00.000Z → 2026-05-31T00:00:00.000Z
Hypothesis: There is no hypothesis to test here. A baseline is the reference against which every later experiment's before/after is measured. Without a dated zero point, no future lift is credible.
What changed: None. No page was changed. This experiment is an observation of the launch state, not an intervention.
Treatment pages: none
Control pages: none
Result: At launch the live corpus was 2 articles (both cornerstone) plus 6 glossary terms; robots.txt welcomes the major AI crawlers and Cloudflare runs DNS-only, so bots reach the origin directly. AI-crawler hit tracking and the citation-rate harness were not yet streaming on 2026-05-31, so the measured baseline for both is 'pending first run' — expected near zero. This zero point anchors every later experiment.

---
GeoSalience launched on **2026-05-31**. This experiment records the zero point — the state of the site on day one — so that every later experiment has an honest "before" to measure against.

**Key facts at baseline (2026-05-31):**

- **Live corpus:** 2 articles, both cornerstone — [What is GEO?](/foundations/what-is-geo) and [The llms.txt spec: adoption and setup](/technical/llms-txt-spec-adoption-setup) — plus 6 glossary terms. Ten further articles exist as drafts, gated on primary research, and are hidden from listings.
- **Crawler policy:** `robots.txt` welcomes the major AI crawlers (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Google-Extended and others). Cloudflare runs in DNS-only mode, so those crawlers reach the origin directly rather than being filtered at the edge.
- **GEO surface live:** `llms.txt`, `llms-full.txt`, `.md` aliases on every article, Article/Dataset/Breadcrumb JSON-LD, canonical URLs, sitemap, and RSS/Atom/JSON feeds — all verified returning 200 over HTTPS at launch.

## What is measured at baseline, and what is not

This is the honest part. A baseline is only useful if it says exactly what it knows.

- **AI-crawler hits — not yet measured.** The nginx-log parser that turns crawler footprints into a daily dataset is scheduled but was not streaming on 2026-05-31. The day-zero record for crawler hits is therefore "pending first run," not "zero confirmed."
- **Citation rate (North Star) — not yet measured.** The 50-prompt, 4-LLM harness had not run a baseline at launch. The expected day-zero value is near 0%, but we record it as "pending," not as a measured 0%.
- **Traffic — off by design.** Plausible is env-gated off at launch.

We do not fabricate a number we did not collect. As each pipeline produces its first run, this page's changelog is updated with the real figure.

## Why a near-zero baseline is valuable, not embarrassing

The entire point of a build-in-public GEO lab is the curve, not the starting height. A site that can show "citation rate went from a measured 0% on 2026-05-31 to X% on a later date, and here is the experiment that moved it" has primary research no competitor can copy. That story is only possible if the zero point is recorded honestly today.

## How to reproduce this snapshot

Every claim above is checkable: fetch `https://geosalience.com/robots.txt`, `https://geosalience.com/llms.txt`, and any article's `.md` alias; count the articles visible on the home page; read the launch entry in the project changelog dated 2026-05-31. Nothing here depends on private data.

## Limitations

This is a snapshot of corpus and policy, not of measured traffic. The crawler and citation pipelines were not yet streaming at launch, so their day-zero values are "pending first run." The corpus is small and single-domain. Those limits are the reason exp-001 exists: to make the starting line explicit before anything is claimed to have moved.