AGENTUPDATE JOURNAL

1000usdinchina.com Dev Retrospective (8) - Multilingual SEO & GEO: Sitemaps, llms.txt and AI-Citation

The final step before launch wasn't a feature — it was being findable. Not just by Google, but by ChatGPT, Perplexity, and AI Overviews. This post covers multilingual SEO and GEO (generative engine optimization) for a four-language edge app: hreflang, sitemaps, robots, llms.txt, AI-citation readiness, analytics, and the maintenance discipline that keeps it all in sync.

This is post 8 — the finale — of the series.

Multilingual SEO foundations

Four languages means every page exists in four variants, and search engines need to know they're translations, not duplicates. hreflang declares the relationships:

<link rel="alternate" hreflang="en" href="https://1000usdinchina.com/en/..." />
<link rel="alternate" hreflang="ja" href="https://1000usdinchina.com/ja/..." />
<link rel="alternate" hreflang="ko" href="https://1000usdinchina.com/ko/..." />
<link rel="alternate" hreflang="x-default" href="https://1000usdinchina.com/en/..." />

The currency coupling from post 2 matters here too: each localized page shows local currency, which makes the translation genuinely useful to that audience — a real signal, not a machine-translated shell.
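
As a rough sketch of how the hreflang tags could be generated instead of hand-written, the TypeScript below builds the alternate links for one page. The function name, the constants, and the example path `/cities/shanghai` are illustrative assumptions, not the site's actual code. Only the three locales visible in the snippet above are listed, so the remaining locale would need to be added.

```typescript
// Locale codes copied from the hreflang snippet above; extend this list
// with any further locale the site serves.
const HREFLANG_LOCALES = ["en", "ja", "ko"];
const HREFLANG_DEFAULT = "en";
const HREFLANG_ORIGIN = "https://1000usdinchina.com";

// Build the <link rel="alternate"> tags for one page.
// `pagePath` is the locale-independent part of the URL (hypothetical
// example: "/cities/shanghai").
function buildHreflangLinks(pagePath: string): string[] {
  const links: string[] = [];
  for (const locale of HREFLANG_LOCALES) {
    const href = HREFLANG_ORIGIN + "/" + locale + pagePath;
    links.push('<link rel="alternate" hreflang="' + locale + '" href="' + href + '" />');
  }
  // x-default names the variant to serve when no listed locale matches.
  const fallback = HREFLANG_ORIGIN + "/" + HREFLANG_DEFAULT + pagePath;
  links.push('<link rel="alternate" hreflang="x-default" href="' + fallback + '" />');
  return links;
}
```

Every locale variant of a page should emit the same full set of links, including one that points at itself, so the relationships are declared in both directions.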

Sitemap, robots, and llms.txt

Three files tell crawlers what exists and how to read it:

flowchart LR
    C[Content: 100 cities × 4 langs] --> SM[sitemap.xml]
    C --> RB[robots.txt]
    C --> LT[llms.txt + llms-full.txt]
    SM --> G[Google / Bing]
    RB --> G
    LT --> AI[ChatGPT / Perplexity / AI Overviews]

  • sitemap.xml: every route in every language, generated automatically.
  • robots.txt: crawl rules; the middleware allow-lists the right root paths (including the search-console verification files).
  • llms.txt + llms-full.txt: a curated, machine-readable summary of the site for AI crawlers. llms.txt is the concise index; llms-full.txt is the expanded version. This is the GEO equivalent of a sitemap: it tells an LLM what your site is and what's worth citing.
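
The post doesn't show how the sitemap is generated, so here is one possible shape, sketched in TypeScript. It emits one `<url>` entry per route per locale and attaches the alternate-language links to each entry. The function names, the route list, and the locale list (only the three codes shown earlier) are assumptions for illustration.

```typescript
const SITEMAP_ORIGIN = "https://1000usdinchina.com";
// Assumed locale list: only the codes shown in the hreflang snippet.
const SITEMAP_LOCALES = ["en", "ja", "ko"];

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function localizedUrl(locale: string, pagePath: string): string {
  return escapeXml(SITEMAP_ORIGIN + "/" + locale + pagePath);
}

// `pagePaths` are locale-independent routes (hypothetical: "/cities/shanghai").
function buildSitemap(pagePaths: string[]): string {
  const lines: string[] = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml">',
  ];
  for (const pagePath of pagePaths) {
    for (const locale of SITEMAP_LOCALES) {
      lines.push("  <url>");
      lines.push("    <loc>" + localizedUrl(locale, pagePath) + "</loc>");
      // Each entry lists every language variant, including itself.
      for (const alt of SITEMAP_LOCALES) {
        lines.push(
          '    <xhtml:link rel="alternate" hreflang="' + alt +
          '" href="' + localizedUrl(alt, pagePath) + '" />'
        );
      }
      lines.push("  </url>");
    }
  }
  lines.push("</urlset>");
  return lines.join("\n");
}
```

Deriving the route list from the same data source as the pages is what keeps the sitemap in the "updates automatically" column: a new city appears in the sitemap without anyone touching it.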

GEO: getting cited by AI engines

SEO gets you ranked; GEO gets you quoted. When someone asks an AI "how much does a week in China cost," you want your numbers to be the cited answer. What helps:

  • Structured data (FAQ, Article schema) so engines can extract clean Q&A pairs — exactly what every post in this series ends with.
  • Quotable, self-contained statements — a clear sentence that answers the question without needing the surrounding page.
  • llms.txt pointing at the canonical, aggregate facts (the compliant data from post 3).

GEO and SEO reinforce each other: clean structure and clear answers rank in Google and get cited by AI.
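
To make the structured-data point concrete, here is a minimal sketch that turns Q&A pairs into schema.org FAQPage JSON-LD. The `FaqPair` type and `buildFaqJsonLd` function are hypothetical names, and the post doesn't say how the site actually renders its schema, so treat this as one reasonable approach rather than the implementation.

```typescript
interface FaqPair {
  question: string;
  answer: string;
}

// Build a <script type="application/ld+json"> tag using the schema.org
// FAQPage shape: a list of Question items, each with an acceptedAnswer.
function buildFaqJsonLd(pairs: FaqPair[]): string {
  const data = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: pairs.map(function (pair) {
      return {
        "@type": "Question",
        name: pair.question,
        acceptedAnswer: { "@type": "Answer", text: pair.answer },
      };
    }),
  };
  // Escape "<" so an answer containing "</script>" cannot end the tag early.
  const json = JSON.stringify(data).replace(/</g, "\\u003c");
  return '<script type="application/ld+json">' + json + "</script>";
}
```

Feeding the same pairs to both the visible FAQ section and this builder keeps the markup and the page text from drifting apart.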

Auditing with an SEO skill

Audits run with an SEO skill rather than ad-hoc checks, and two lessons shaped how:

  • Fetch raw HTML, not a proxy-rendered copy. A fetch that goes through a rendering proxy can drop the <head> tags, which are exactly the title, meta, and hreflang you're auditing. Read the raw HTML so the audit sees what crawlers see.
  • Measure performance with the project's own Lighthouse CI rather than an external API that needs a key. The performance gate from post 7 doubles as the SEO performance check.
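
A raw-HTML audit of the head can be sketched in a few lines. The `auditHead` function below is a hypothetical helper, not the SEO skill's real code, and it uses regular expressions instead of a full HTML parser, which is good enough for a spot check but not for arbitrary markup.

```typescript
interface HeadAudit {
  title: string | null;
  hasMetaDescription: boolean;
  hreflangs: string[];
}

// Inspect the raw HTML string exactly as the server returned it.
function auditHead(rawHtml: string): HeadAudit {
  const titleMatch = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(rawHtml);
  const hasMetaDescription =
    /<meta\b[^>]*\bname=["']description["'][^>]*>/i.test(rawHtml);

  // Collect every hreflang value declared on a <link> tag.
  const hreflangs: string[] = [];
  const linkPattern = /<link\b[^>]*\bhreflang=["']([^"']+)["'][^>]*>/gi;
  let match: RegExpExecArray | null = linkPattern.exec(rawHtml);
  while (match !== null) {
    hreflangs.push(match[1]);
    match = linkPattern.exec(rawHtml);
  }

  return {
    title: titleMatch ? titleMatch[1].trim() : null,
    hasMetaDescription: hasMetaDescription,
    hreflangs: hreflangs,
  };
}
```

In a runtime with a global `fetch` (Node 18 or later, for example), the input would come from `await (await fetch(url)).text()`, which returns the response body without executing scripts or rewriting the head.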

The hand-maintained surfaces

Here's the maintenance trap. When you add a city or a module, some SEO surfaces update automatically and some must be edited by hand:

Updates automatically | Must update by hand
--------------------- | -------------------------------------------
sitemap.xml           | llms.txt / llms-full.txt
robots.txt            | site-stats / hero counts
RSS                   | the 4 locale copy files (e.g. "100 cities")
                      | meta descriptions

Miss one and the site contradicts itself: the homepage says "100 cities" in English but a stale "67 cities" in Japanese. A CI surfaces-check catches this and turns the build red. The discipline: a checklist of hand-maintained surfaces that runs every time the city count or module set changes. (This is exactly the kind of institutional knowledge that lives in a skill.)
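
The post doesn't show the surfaces check itself, so the following is only a sketch of the idea under simple assumptions: each hand-maintained surface contributes the copy string that is supposed to state the city count, and the check flags any string that doesn't mention the real number. The `SurfaceClaim` type, the function name, and the surface labels are all hypothetical.

```typescript
interface SurfaceClaim {
  surface: string; // label for the hand-maintained file or string
  text: string;    // the copy that is expected to state the city count
}

// Return the surfaces whose copy does not mention the actual city count.
// Heuristic: any run of ASCII digits in the text counts as a claimed number.
function findStaleSurfaces(actualCityCount: number, claims: SurfaceClaim[]): string[] {
  const stale: string[] = [];
  for (const claim of claims) {
    const numbers = claim.text.match(/\d+/g) || [];
    let mentionsActual = false;
    for (const found of numbers) {
      if (Number(found) === actualCityCount) {
        mentionsActual = true;
      }
    }
    if (!mentionsActual) {
      stale.push(claim.surface);
    }
  }
  return stale;
}
```

A CI step could call this with the count derived from the city data and fail the build whenever the returned list is non-empty, which is the "red" outcome described above.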

Analytics and search-console submission

The launch checklist closes with measurement and submission:

  • Google Analytics 4 via gtag.js, loaded with a lazyOnload strategy so it doesn't blow the Total Blocking Time budget (the GA-vs-TBT fight from post 7).
  • Submit to Google Search Console and Bing Webmaster Tools, with the verification files allow-listed at the site root so the crawlers can confirm ownership.

With sitemaps submitted and analytics live, the site is officially findable — and the loop from content → sitemap/llms.txt → Google + AI engines is closed.
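
The root-path allow-list mentioned for the verification files might look something like the sketch below. Everything here is an assumption about how such a middleware check could be written: the list contents, the function name, and the pattern for Google's HTML verification file (named `google` followed by an identifier and `.html`). `BingSiteAuth.xml` is the file name Bing Webmaster Tools uses for its XML verification method.

```typescript
// Files that must be served from the site root, without a locale prefix.
// Hypothetical list; adjust to whatever the site really exposes.
const ROOT_ALLOW_LIST = [
  "/sitemap.xml",
  "/robots.txt",
  "/llms.txt",
  "/llms-full.txt",
  "/BingSiteAuth.xml",
];

// Assumed shape of Google's verification file name, e.g. /google<id>.html
const GOOGLE_VERIFICATION_FILE = /^\/google[0-9a-z]+\.html$/;

// True when the middleware should serve the path as-is instead of
// applying locale routing to it.
function isAllowedRootPath(pathname: string): boolean {
  if (ROOT_ALLOW_LIST.indexOf(pathname) !== -1) {
    return true;
  }
  return GOOGLE_VERIFICATION_FILE.test(pathname);
}
```

An exact-match list plus one narrow pattern keeps the exemption tight, so ordinary page routes still go through locale handling.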

Key takeaways

  • For multilingual SEO, declare translations with hreflang and make each locale genuinely useful (local currency), not a machine-translated shell.
  • sitemap + robots serve search engines; llms.txt + llms-full.txt serve AI crawlers.
  • GEO is SEO's sibling: structured data and quotable statements get you cited by AI engines.
  • Audit on raw HTML (not a head-swallowing proxy) and reuse Lighthouse CI for performance.
  • Keep a checklist of hand-maintained surfaces (llms.txt, locale copy, meta) so counts never contradict across languages.
  • Finish with GA4 (lazy-loaded) and Search Console / Bing submission.

FAQ

What is llms.txt and do I need one? A machine-readable file that tells AI crawlers what your site is and what's worth citing — a GEO counterpart to sitemap.xml. If you want AI engines to represent your content accurately, it's worth having, alongside an expanded llms-full.txt.

What is GEO (generative engine optimization)? Optimizing content to be cited by AI engines like ChatGPT, Perplexity, and AI Overviews — via structured data, quotable self-contained statements, and llms.txt — rather than only ranking in classic search.

How do you do SEO for a multilingual site? Declare translations with hreflang (plus x-default), generate a sitemap covering every language, localize meaningfully (including currency), and keep hand-maintained surfaces in sync across locales.

How do you submit a site to Google and Bing? Add the verification file (allow-listed at the site root), then submit your sitemap in Google Search Console and Bing Webmaster Tools.


Back to the start → The product and business story · Try it: 1000usdinchina.com