Magento glossary

What is Magento llms.txt?

Magento llms.txt means wiring the llms.txt v0.1 proposal (Jeremy Howard / Answer.AI, 2024) into your storefront via an extension such as mage2kishan/module-llms-txt. The module emits a curated markdown file at /llms.txt: an H1 site title, a blockquote tagline, one H2 section per content type, and a bulleted list of markdown links, one per page. LLMs (Perplexity, Claude, ChatGPT browsing) can read it as a structured summary instead of crawling and chunking the whole site. Its role is analogous to that of robots.txt for search crawlers and sitemap.xml for indexers.


Written by Kishan Savaliya, Adobe Certified Magento & Hyvä developer, 8 years on platform

At a glance

Standard: llms.txt v0.1 proposal (2024, Jeremy Howard / Answer.AI)
Location: /llms.txt at site root (HTTPS, no auth, no robots block)
Extension: mage2kishan/module-llms-txt (auto-emits from CMS + categories + products + URLs)

How it works

Five steps from composer-install to a live /llms.txt file

llms.txt is a plain markdown file at the site root; there is nothing exotic about it. The work is in deciding which pages to include, keeping the file under the LLM context ceiling, and wiring auto-regeneration on content change. Here is the end-to-end flow.

01. Composer-install the llms.txt module

Run composer require mage2kishan/module-llms-txt (or any compatible Magento llms.txt extension) to pull in the module, then bin/magento module:enable Panth_LlmsTxt, followed by bin/magento setup:upgrade and bin/magento setup:di:compile. The install adds an admin config section under Stores → Configuration → Panth → LLMS.txt, a panth:seo:llms-txt:generate CLI command, and a frontend route that responds at /llms.txt. The module supports Magento 2.4.4+ and is Hyvä-compatible out of the box because the output is a plain markdown file with no theme-layer dependencies.

02. Configure scope and per-type limits in admin

Open Stores → Configuration → Panth → LLMS.txt and tune max_cms (default 100), max_categories, and max_products to decide how many entries of each type go into the output. Use the manual-additions field to force-include hero URLs (your homepage, top services, key landing pages) that might otherwise be ranked lower, and the exclusions field to suppress noise (paginated category URLs, faceted-filter combinations, internal-search results). Limits matter: the rendered file should stay under 50KB so LLMs do not truncate the tail.
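
The interplay of limits, exclusions, and manual additions can be sketched in a few lines. This is an illustration in Python, not the module's actual PHP code: the function name, the entry fields (type, url, priority), the glob-style exclusion patterns, and every cap except the max_cms default of 100 are assumptions made for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical caps; only the max_cms default (100) comes from the text above.
LIMITS = {"cms": 100, "category": 30, "product": 40}


def select_entries(candidates, limits, exclusions=(), manual=()):
    """Return manual additions first, then capped, non-excluded candidates.

    candidates/manual: dicts with "type", "url" and "priority" keys
    (a higher priority is picked first). exclusions: glob patterns
    matched against the full URL.
    """
    chosen = list(manual)
    seen = {entry["url"] for entry in chosen}
    counts = {}
    for entry in sorted(candidates, key=lambda e: -e["priority"]):
        url, kind = entry["url"], entry["type"]
        if url in seen or any(fnmatchcase(url, pat) for pat in exclusions):
            continue  # duplicate or suppressed as noise
        if counts.get(kind, 0) >= limits.get(kind, 0):
            continue  # per-type cap reached
        counts[kind] = counts.get(kind, 0) + 1
        seen.add(url)
        chosen.append(entry)
    return chosen
```

In a glob pattern a literal question mark has to be written as [?], so a pagination rule would look like *[?]p=* rather than *?p=*.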

03. Generate the markdown file

Run bin/magento panth:seo:llms-txt:generate (or the equivalent CLI in your chosen module). The generator collects CMS pages, categories, and products, picks the top N of each by hand-curated priority, formats each entry as a markdown bullet (- [Page title](https://example.com/page): description), groups the bullets into H2 sections per content type (Services, Glossary, Products, Categories, About), and writes the result to pub/media/llms.txt. The file starts with an H1 site title and a one-sentence blockquote tagline, followed by the H2 section blocks, which is the layout the llms.txt v0.1 proposal specifies.
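
To make the output format concrete, here is a minimal Python sketch of a renderer that produces the layout described above. It is an illustration rather than the extension's generator: the function name, its parameters, and the shape of the sections argument are assumptions.

```python
def render_llms_txt(title, tagline, sections):
    """Render an llms.txt body: H1, blockquote, then one H2 block per section.

    sections: dict mapping a section name to a list of
    (page_title, url, description) tuples. Empty sections are skipped.
    """
    lines = [f"# {title}", "", f"> {tagline}", ""]
    for name, entries in sections.items():
        if not entries:
            continue
        lines.append(f"## {name}")
        lines.append("")
        for page_title, url, description in entries:
            lines.append(f"- [{page_title}]({url}): {description}")
        lines.append("")
    # Normalise to exactly one trailing newline.
    return "\n".join(lines).rstrip("\n") + "\n"
```

Calling render_llms_txt("Example Store", "Hand-made widgets.", {"Services": [("Audit", "https://example.com/audit", "What the audit covers.")]}) yields an H1 line, a blank line, the blockquote, a blank line, the Services heading, and one bullet.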

04. Serve the file at /llms.txt on the root domain

The module ships a frontend controller that responds at /llms.txt with Content-Type: text/markdown; charset=utf-8. Alternatively, drop in an nginx rewrite (location = /llms.txt { try_files /media/llms.txt =404; }) for cache-friendly static-file serving from pub/media/llms.txt. Either way, the file must be reachable over HTTPS, must not require authentication, must not be blocked by robots.txt, and must not sit behind a Cloudflare Bot Fight challenge: LLM crawlers will silently skip pages that return 401 / 403 / 503.
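
The reachability requirements can be turned into a small checklist function. The Python sketch below is a hypothetical helper, not part of any module; it works on values you would read off a curl -I response. Treating text/plain as acceptable is an assumption, made because a static nginx location commonly serves .txt files with that type.

```python
def check_llms_response(url, status, content_type):
    """Return a list of problems with a /llms.txt response; empty means OK."""
    problems = []
    if not url.startswith("https://"):
        problems.append("not served over HTTPS")
    if status in (401, 403):
        problems.append(f"HTTP {status}: authentication or a bot challenge is in the way")
    elif status != 200:
        problems.append(f"HTTP {status}: expected 200")
    # Compare only the media type, ignoring parameters such as charset.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type not in ("text/markdown", "text/plain"):
        problems.append(f"unexpected Content-Type: {content_type!r}")
    return problems
```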

05. Refresh on content change

Observer hooks on cms_page_save_after, catalog_category_save_after, and catalog_product_save_after trigger an asynchronous regeneration so the file stays current with editorial changes. A nightly cron entry rebuilds the file from scratch as a safety net in case an observer was suppressed during a bulk import. After every change, sanity-check with curl -s https://yoursite.com/llms.txt | head -n 25: the H1, the blockquote tagline, and the first H2 section should all appear within those first 25 lines.
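
The same sanity check can be automated. The following Python sketch is a hypothetical helper that encodes the rule of thumb above (H1, then blockquote, then first H2, all within 25 lines); it is not a full validator for the v0.1 proposal.

```python
def check_head(text, max_lines=25):
    """True if an H1, a blockquote and an H2 appear, in that order, early on."""
    head = text.splitlines()[:max_lines]

    def first_index(prefix):
        # Index of the first line starting with prefix, or None if absent.
        return next((i for i, line in enumerate(head) if line.startswith(prefix)), None)

    found = [first_index("# "), first_index("> "), first_index("## ")]
    return None not in found and found == sorted(found)
```

Because "## " does not start with "# ", an H2 is never mistaken for the H1.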

When to use

Four scenarios where shipping an llms.txt is the obvious next move

llms.txt is cheap to ship: one afternoon of work after the module install. These four scenarios are where the AI-citation upside is meaningful enough that it should be done this quarter, not next.

Magento stores with rich editorial / glossary / blog content

If your Magento site carries a meaningful glossary, blog, or knowledge-base section (the kind of editorial pages LLMs cite as authoritative when answering a user’s question), a curated llms.txt tells the LLM exactly which pages matter. Without the file, LLMs fall back to web-scale crawling and chunking, which surfaces noisier URLs (paginated archives, faceted-filter pages) ahead of the editorial gems. With the file, the editorial pages get pole position in the LLM’s context window, which is the layer that gets cited.

Sites optimising for AI Overviews, Perplexity, ChatGPT, Claude

The llms.txt file is one of the few direct signals LLMs read at the site root, analogous to robots.txt for traditional search crawlers. Perplexity and ChatGPT browsing modes both fetch it as of late 2024-2025, and Claude’s browsing tool checks for it when answering URL-grounded questions. If your AEO / GEO strategy depends on being cited by AI search experiences, shipping a tidy llms.txt is the lowest-effort, highest-leverage move on the board: a single afternoon of work for an outsized return.

Service / agency / SaaS Magento sites with focused catalogues

Magento isn’t only for big-catalogue retail. Plenty of agency, SaaS, and service-business sites run on Magento with a tightly curated product list where each item is a discrete answer to a buyer query. For those sites, llms.txt is the perfect shape: 20-40 high-signal URLs, each with a one-line description that pre-summarises the value proposition for the LLM. Compare that to a 5000-URL sitemap.xml, where the LLM has no way to know which 30 URLs actually matter without the curation layer that llms.txt provides.

Sites running a brand-mention strategy for AI training pipelines

Brand-mention frequency in LLM training corpora tends to correlate with how often the LLM cites the brand in user answers. The fastest way to “introduce” a brand to AI training pipelines without waiting for web-scale crawl coverage is a clean llms.txt file at the root domain; LLM crawlers prefer it because it lowers their parsing cost. The file is a self-contained, structured pitch for what the brand is and which pages summarise it best. Pair it with a focused content-marketing push and the citation rate can climb measurably within weeks.

Common mistakes

Three llms.txt mistakes that quietly kill the AI-citation lift

Most llms.txt failures aren’t catastrophic; they’re slow leaks. Audit your config and regeneration pipeline against these three before assuming the file is doing its job.

Letting the file balloon past 50KB

LLM context windows are finite, and most ingestion pipelines truncate long markdown contexts at a soft ceiling of around 50KB. An llms.txt that crosses that line starts losing its tail-end entries, which is usually where the long-tail glossary pages and recent blog posts live. Tune max_cms / max_categories / max_products to keep the rendered file comfortably under that ceiling, prioritising the highest-intent landing pages. If you genuinely need the verbose, exhaustive variant, use the companion /llms-full.txt file for that and keep /llms.txt tight.
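
A size check is easy to bolt onto the generation step. The Python sketch below is a hypothetical helper that measures the rendered file against the 50KB ceiling mentioned above. It counts UTF-8 bytes rather than characters, since non-ASCII titles take more than one byte each; reading 50KB as 50 × 1024 bytes is an assumption.

```python
LIMIT_BYTES = 50 * 1024  # the soft ceiling cited above, read as 50 KiB


def size_report(text, limit=LIMIT_BYTES):
    """Measure rendered llms.txt content against a byte budget."""
    size = len(text.encode("utf-8"))
    return {
        "bytes": size,
        "limit": limit,
        "over": size > limit,
        "headroom": limit - size,  # negative once the file is over budget
    }
```

A negative headroom figure says how many bytes need to be trimmed by lowering the per-type caps.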

Including session-specific or paginated URLs

Magento natively generates pagination, session, and faceted-filter URLs (?p=2, ?___SID=U, filter combinations like ?color=red&size=xl) that bloat sitemaps without adding signal. If those leak into llms.txt, the LLM’s context window fills with low-information URLs that all point to near-duplicate pages. Filter aggressively to canonical URLs only, meaning the URL that rel=canonical points to, not the raw request URI. The Panth_LlmsTxt module pulls the canonical-tag URL by default; custom implementations need to do the same explicitly.
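
For a hand-rolled implementation, the canonical-only rule might look like the following Python sketch. The function names, the (request_url, canonical_url) input shape, and the NOISE_PARAMS set are assumptions for illustration; a real store would build the parameter list from its own filterable attributes.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative parameter names taken from the examples above; extend per store.
NOISE_PARAMS = frozenset({"p", "___SID", "color", "size"})


def has_noise_params(url, noise_params=NOISE_PARAMS):
    """True if the query string carries pagination, session or filter params."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in noise_params for name in query)


def canonical_urls(pages, noise_params=NOISE_PARAMS):
    """pages: iterable of (request_url, canonical_url_or_None) pairs.

    Prefer the canonical URL, drop anything still carrying noise
    parameters, and de-duplicate while preserving order.
    """
    seen, result = set(), []
    for request_url, canonical in pages:
        url = canonical or request_url
        if has_noise_params(url, noise_params) or url in seen:
            continue
        seen.add(url)
        result.append(url)
    return result
```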

Forgetting to regenerate after content updates

A stale llms.txt is worse than no llms.txt: it tells LLMs about pages that no longer exist or have moved, which signals dead links and erodes the trust the file is meant to build. Always tie regeneration to the standard content-save observers (cms_page_save_after, catalog_category_save_after, catalog_product_save_after) or to a daily cron entry. Ideally use both, with the observer doing fast async regeneration and cron acting as a nightly safety net. After every Magento content release, curl the file and eyeball it; it takes 30 seconds and catches most drift bugs.

FAQ

Magento llms.txt, frequently asked questions

Is llms.txt the same as robots.txt?

No. The two files serve opposite purposes. robots.txt is an access-control file: it tells crawlers which paths are off-limits, and well-behaved crawlers respect those Disallow rules. llms.txt is a content-discovery file: it tells LLMs which pages are most worth reading and pre-summarises them in markdown so the LLM does not have to crawl, parse, and chunk the entire site to figure out what matters. Both live at the root domain, both are plain text, and both are public, but robots.txt is gatekeeping (what NOT to fetch) while llms.txt is curation (what TO fetch, ranked).
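
Because robots.txt is the gatekeeper, it is worth confirming that it does not accidentally block the curated file. A minimal Python sketch using the standard-library urllib.robotparser follows; the function name and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser


def llms_txt_allowed(robots_txt, user_agent, url="https://example.com/llms.txt"):
    """True if the given robots.txt body lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Pass it the body of your live robots.txt together with a crawler name from your access logs, such as ClaudeBot or PerplexityBot.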

Do LLMs actually read llms.txt yet, or is it speculative?

Adoption is real and growing. Perplexity, Claude (when browsing is enabled), and ChatGPT’s browsing mode all check for /llms.txt when answering URL-grounded questions as of late 2024 / 2025. You can verify this by watching the network panel in Perplexity Pro during an answer involving your domain, or by checking your access logs for User-Agent strings like PerplexityBot or ClaudeBot fetching /llms.txt. Not every LLM does it yet, and not every browsing model does it on every query. The cost of shipping the file is near-zero (a single CLI command after installing the module), and the upside is meaningful and growing, which makes this a textbook case for a low-risk experiment.
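
The access-log check can be scripted. The Python sketch below assumes the common combined log format, where the User-Agent is the last quoted field on the line; the regular expression and function name are illustrative and will need adjusting for a custom log format. The bot names are the ones mentioned above.

```python
import re

# Assumes combined log format: request in quotes, status, ..., "user agent".
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)
BOTS = ("PerplexityBot", "ClaudeBot")


def llms_txt_hits(log_lines, bots=BOTS):
    """Count requests for /llms.txt per bot name found in the User-Agent."""
    hits = {}
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if not match or match["path"].split("?")[0] != "/llms.txt":
            continue
        for bot in bots:
            if bot in match["agent"]:
                hits[bot] = hits.get(bot, 0) + 1
    return hits
```

Feed it an open log file, for example llms_txt_hits(open("access.log")), where access.log stands for wherever your web server writes its logs. An empty result over a few weeks is a hint that the file is not being fetched by those crawlers.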

Does Magento support llms.txt out of the box?

No. Adobe has not shipped a built-in llms.txt module as of mid-2026, and there is no native llms.txt admin section in Commerce or Open Source. The two practical paths are: (1) install the mage2kishan/module-llms-txt extension used on this site, which provides an admin config section, a CLI generator, and a frontend route at /llms.txt; or (2) write a custom controller that builds the markdown from a hard-coded list of URLs. Option 1 is what 95% of sites should do: the extension handles the canonical URL filtering, observer-triggered regeneration, and per-type limits that a hand-rolled controller will get wrong on the first revision.

Should I list every product detail page in llms.txt?

No, curate. List hero / flagship / lead products (the 20-40 SKUs that represent the brand and pull the most search demand), not the full catalogue. The whole point of llms.txt is to be the curated, signal-dense answer to “what does this site sell?” If you dump every SKU into the file, you defeat the purpose: you bloat the file past the 50KB ceiling, dilute the signal, and force the LLM to either truncate the tail or skip the file entirely. For the verbose, exhaustive variant, use the companion /llms-full.txt file. The Panth_LlmsTxt extension exposes max_products in admin precisely so you can pick the cap (somewhere between 20 and 80, depending on catalogue shape).

How is llms.txt different from a sitemap.xml?

sitemap.xml is an exhaustive list of URLs intended for indexing search crawlers. It includes every canonical URL the site wants Google / Bing to know about, in machine-readable XML, with lastmod / changefreq / priority attributes that hint at crawl scheduling. llms.txt is a markdown-curated map of “what matters” with a human-written one-line description per entry. sitemap.xml is for indexing crawlers that will visit each URL; llms.txt is for LLMs that may only read the file itself and never visit the individual URLs, so the description on each line is what gets folded into the LLM’s answer. Different audience, different format, different curation logic: both are useful, and neither replaces the other.

Can I block specific pages from appearing in llms.txt?

Yes. The mage2kishan/module-llms-txt extension has an exclusion admin config under Stores → Configuration → Panth → LLMS.txt → Exclusions where you can list URL patterns to suppress. This is useful for landing pages that are ad-campaign-specific, gated content, or deprecated URLs you have not yet 301’d. For hand-rolled implementations, simply do not include the URLs when generating the markdown. Note that excluding a page from llms.txt does not hide it from search crawlers. To keep it away from those as well, add a robots.txt Disallow (which stops crawling) or a noindex meta tag (which stops indexing); the three mechanisms target different audiences. Bear in mind that a crawler blocked by Disallow never fetches the page, so it cannot see a noindex tag on it.

llms.txt audit

Want an llms.txt + AEO/GEO audit on your Magento store?

Send your storefront URL. I will check whether /llms.txt is served, validate the format against the v0.1 spec, audit the curation logic for noise, verify that the regeneration hooks fire on content save, and confirm the file is reachable to Perplexity / Claude / ChatGPT crawlers. A written tuning plan, a fixed-price quote, and the earliest start date come back to you within 24 business hours.
