There is an awkward conversation we have been having almost daily with clients since the beginning of the year. The question comes in different words but the gist is always the same: “How do I get ChatGPT to tell the right story about what we do when someone asks about us?” It is a reasonable question. In January 2026, according to Searchlab, Google AI Mode surpassed 75 million daily users and AI Overviews appeared in 48% of searches, with CTR drops hovering around 60-65% when your site is not the cited source. Classic SEO is not dead, but it now competes with a new chapter: appearing within the answer, not below it.
In this context, a tiny plain text file has crept into every technical marketing conversation: llms.txt. And as is always the case with any standard that promises to “talk to the AI”, there is a lot of noise around it. Let's take it one step at a time, without cheating.
What exactly is llms.txt
llms.txt is a proposed standard published by Jeremy Howard at the end of 2024, hosted at llmstxt.org. The idea is simple and elegant: a file in Markdown format, placed at the root of your domain (https://tudominio.com/llms.txt), which summarises the gist of your site in a language that a large language model can read in a few tokens.
Unlike robots.txt, which tells crawlers what they can and cannot crawl, llms.txt serves another function: it gives the AI a curated summary of your site and links to the pages that you consider canonical for understanding your brand. It's not a permission or a block. It is a map.
The format is deliberately austere. An H1 heading with the name of the site, a blockquote with the most important description (that phrase is, in practice, the summary that a model can paraphrase when someone asks what your company is about) and, underneath, thematic sections with links to resources. Services, success stories, documentation, pricing policy. Whatever you decide to display.
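To make that structure concrete, here is a minimal Python sketch that renders a file with the three elements described above. The function name, the sample agency and the example.com URLs are hypothetical, and the `- [label](url): note` shape for link lines is our reading of the llmstxt.org proposal, not something to treat as the definitive format.

```python
def render_llms_txt(name, summary, sections):
    """Render a minimal llms.txt: H1, blockquote summary, one H2 per section.

    `sections` maps a section title to a list of (label, url, note) tuples.
    """
    lines = [f"# {name}", "", f"> {summary}", ""]
    for title, links in sections.items():
        lines.append(f"## {title}")
        lines.append("")
        for label, url, note in links:
            # One link per line, followed by a short plain description.
            lines.append(f"- [{label}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical site data, for illustration only.
    print(render_llms_txt(
        "Example Agency",
        "Digital marketing agency specialised in SEO for B2B services firms.",
        {
            "Services": [
                ("Technical SEO", "https://example.com/seo", "Scope and deliverables"),
            ],
            "Success stories": [
                ("Case studies", "https://example.com/cases", "Results by sector"),
            ],
        },
    ))
```

Generating the file from structured data instead of writing it by hand is a design choice, not a requirement: it simply makes it easier to keep the file in step with the site.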
What real adoption looks like in May 2026
So much for the theory, which is beautiful. What happens in practice is more nuanced and should be told plainly.
An analysis published by Search Engine Journal of some 300,000 domains found that roughly 10.13% had published an llms.txt. The same study did not detect a clear effect on the frequency with which these sites were cited in AI responses. Some analysts, such as Kai Spriestersbach, have written outright that the standard is a fiasco, pointing out that neither OpenAI, Google nor Anthropic request such a file systematically. GPTBot requests it occasionally. IDE agents such as Cursor or Continue do use it with some consistency.
And yet serious brands in the industry have implemented it. Anthropic, Hugging Face, Perplexity, Vercel, Stripe and Mintlify have llms.txt published on their domains. The reason is consistent with what we see in any emerging standard: the cost of having one is close to zero, while the cost of not having one when the ecosystem eventually consolidates may not be so low.
So the honest reading as of May 2026 is this: llms.txt is not, by itself, a shortcut to being cited by ChatGPT. But it is the first standardised convention of what we're already calling B2A (Business-to-Agent). In other words, a public surface designed so that non-human agents can understand your brand without having to digest your homepage, your menu and three animated dropdowns.
Why models don't always read it (and why to do it anyway)
To understand why it makes sense to invest time in a file whose immediate impact is uncertain, one has to look at two vectors.
The first is the actual behaviour of crawlers. Today most of the training data for the big models, and the data they retrieve live when answering, comes from your normal site: public HTML, sitemap, structured data and, increasingly, feeds that the companies themselves (Reddit, Wikipedia, media) negotiate directly. The llms.txt doesn't replace any of that. It complements it.
The second is the moment when the agents start browsing for you. And that is already happening. Perplexity's Comet, ChatGPT Atlas and Claude's agentic modes visit pages, read, decide. In that flow, finding a short Markdown file with a clear description of what you are and links to what is important is objectively useful for the model. It doesn't guarantee you a recommendation; it reduces the friction on the way to one.
As we say at Vandelay when a client asks us if it makes sense: it is like having a good Google Business listing. No one can guarantee that you will come out on top, but going out without it is playing the game barefoot.
How a useful llms.txt is constructed
It's not the length that counts, it's what you put in. These are the blocks we recommend to our customers, in order of priority.
1. A single founding sentence
The one that goes in the blockquote under the H1. This is the phrase that the model will paraphrase if someone asks “what is this company?” If your homepage says “integral solutions for your transformation”, rewrite it. You need something verifiable, specific and compact. For example, instead of “digital marketing agency in Barcelona”, something like “digital marketing agency in Barcelona specialised in Meta Ads, Google Ads and SEO campaigns for B2B professional services companies, with more than ten years in the sector”.
2. Thematic sections with canonical links
Services, success stories, team, editorial policy. One section per topic, with a maximum of two or three links in each. Each link is accompanied by a one-line description explaining what the model will find if it follows it. The clearer and less lyrical, the better.
3. An Optional section at the end
Designed for secondary material: blog, glossaries, internal documentation. It allows an agent with time and tokens to go deeper, without saturating the main reading of the file.
4. Zero hollow advertising language
Models paraphrase. If your llms.txt is full of “innovation leaders” and “disruptive solutions”, that's what will appear in the response the end customer reads. And that's exactly what you don't want.
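The four blocks above can be turned into a quick pre-publication check. What follows is a rough Python sketch, not an official validator: the function name, the phrase list (taken from the examples in this article) and the decision to exempt the Optional section from the three-link limit are all our own assumptions.

```python
import re

# Illustrative list only; extend it with your own sector's cliches.
HOLLOW_PHRASES = ("innovation leaders", "disruptive solutions")
MAX_LINKS_PER_SECTION = 3  # "two or three links" per thematic section

LINK = re.compile(r"^- \[[^\]]+\]\([^)]+\)")
LINK_WITH_NOTE = re.compile(r"^- \[[^\]]+\]\([^)]+\):\s*\S")


def lint_llms_txt(text):
    """Return a list of warnings for an llms.txt draft (empty list = no findings)."""
    warnings = []
    lines = [line.rstrip() for line in text.splitlines()]
    non_empty = [line for line in lines if line.strip()]

    if not non_empty or not non_empty[0].startswith("# "):
        warnings.append("first line should be an H1 with the site name")
    if len(non_empty) < 2 or not non_empty[1].startswith("> "):
        warnings.append("the H1 should be followed by a blockquote summary")

    section = None
    link_counts = {}
    for line in lines:
        if line.startswith("## "):
            section = line[3:].strip()
            link_counts[section] = 0
        elif section is not None and LINK.match(line):
            link_counts[section] += 1
            if not LINK_WITH_NOTE.match(line):
                warnings.append(f"link without a description in '{section}'")

    for name, count in link_counts.items():
        # Assumption: the Optional section may hold more secondary links.
        if name != "Optional" and count > MAX_LINKS_PER_SECTION:
            warnings.append(f"'{name}' has {count} links, more than {MAX_LINKS_PER_SECTION}")

    lowered = text.lower()
    for phrase in HOLLOW_PHRASES:
        if phrase in lowered:
            warnings.append(f"hollow phrase found: '{phrase}'")
    return warnings
```

Calling `lint_llms_txt(open("llms.txt", encoding="utf-8").read())` on a draft returns the list of findings. It cannot tell whether the founding sentence is verifiable or specific; that part remains an editorial judgement.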
Mistakes that can cost you visibility
We have seen them in recent audits and they are all avoidable:
- Blocking AI bots in robots.txt while boasting about having an llms.txt. It's classic incoherence. If you have decided to appear in replies, let GPTBot, ChatGPT-User, PerplexityBot, ClaudeBot and anthropic-ai through.
- Serving the file with the wrong MIME type. Some servers return application/octet-stream by default for .txt. It should be served as text/plain; charset=utf-8 and, while we're at it, check that it returns a clean HTTP 200, with no redirects in between.
- Linking to pages that require login or change URLs every few months. The model will follow those links and, if it encounters 404s or paywalls, it will simply give up.
- Forgetting to maintain it. An out-of-date llms.txt is worse than no llms.txt at all. If it says that you sell eight services and your website only offers five, the AI's answer will contradict your own site.
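The first two mistakes are easy to check mechanically. The Python sketch below uses the standard library's `urllib.robotparser` to see which of the user agents named above a given robots.txt would block, and a small helper to judge the status code and Content-Type your server returns for llms.txt. Both function names are hypothetical, and fetching the live robots.txt and response headers is deliberately left out: the functions assume you already have those values.

```python
from urllib.robotparser import RobotFileParser

# User agents named in this article; the list is not exhaustive.
AI_USER_AGENTS = ("GPTBot", "ChatGPT-User", "PerplexityBot", "ClaudeBot", "anthropic-ai")


def blocked_ai_bots(robots_txt, url="https://example.com/"):
    """Return the AI user agents that this robots.txt content forbids from `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_USER_AGENTS if not parser.can_fetch(agent, url)]


def header_problems(status, content_type):
    """Check the HTTP status and Content-Type header returned for /llms.txt."""
    problems = []
    if status != 200:
        # A redirect (301/302) also counts: the file should answer 200 directly.
        problems.append(f"expected HTTP 200, got {status}")
    media_type, _, params = content_type.partition(";")
    if media_type.strip().lower() != "text/plain":
        problems.append(f"expected text/plain, got {media_type.strip() or 'nothing'}")
    elif "charset=utf-8" not in params.replace(" ", "").lower():
        problems.append("charset should be utf-8")
    return problems
```

For instance, a robots.txt with `User-agent: GPTBot` followed by `Disallow: /` makes `blocked_ai_bots` report GPTBot as blocked, while `header_problems(200, "text/plain; charset=utf-8")` comes back empty. Broken or gated links still need a separate crawl of every URL listed in the file.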
What changes in your job the day you upload it
The most interesting thing about this whole exercise is not the file itself, but what it forces you to do to get it right. Most of the companies we work with discover, when writing their llms.txt, that they are not able to explain in a single sentence exactly what they do. Or that they have three different pages describing the same service in different words. Or that the “success stories” section has not been updated for three years.
That forced inventory is, in itself, the main deliverable. The file is uploaded by your developer in five minutes. It is the conversation beforehand - what we are, what we do, what we want the AI to repeat about us - that is worth the whole exercise.
If you want to check whether your site is ready for this new layer of visibility, at Vandelay we include the llms.txt audit and the tuning of your B2A surface within our technical SEO service. And if you just want to see how ChatGPT is responding today when someone asks about your brand, we do that too: in fifteen minutes of work you can see if you have a problem or not.