The ghost of Mark V. Shaney

In 1984, an unusual user started posting in net.singles, a Usenet newsgroup where lonely people discussed dating, and he fit right in… sort of. He would respond to threads with sentences like “I have a great time to try to herd cats, and I’m not sure I agree with you.” Plausible. Slightly off, but weirdly confident.

The poster was named Mark V. Shaney, and he wasn’t human. “He” was a computer program, or bot, that was in essence the forerunner of today’s large language models (LLMs).

The little bot that fooled a newsgroup

Mark V. Shaney (a pun on “Markov chain”) was built by Rob Pike at Bell Labs, with the able assistance of Bruce Ellis and Don Mitchell. Its mechanism was straightforward: ingest a text corpus, such as the posts on net.singles, then generate new text by probabilistically choosing words that are likely to follow the previously chosen words. Repeat until you have enough semi-coherent sentences for a Usenet post.
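To make the mechanism concrete, here is a minimal sketch of a Markov chain text generator in Python. It is not Pike's original code; the function names (build_chain, generate), the two-word prefix, and the tiny made-up corpus are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each run of `order` consecutive words to the words seen right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        prefix = tuple(words[i:i + order])
        # Duplicates are kept on purpose: frequent followers get picked more often.
        chain[prefix].append(words[i + order])
    return dict(chain)

def generate(chain, length=20, seed=None):
    """Pick a starting prefix, then keep sampling a follower of the latest prefix."""
    rng = random.Random(seed)
    prefix = rng.choice(sorted(chain))  # sorted so a given seed is reproducible
    output = list(prefix)
    while len(output) < length:
        followers = chain.get(tuple(output[-len(prefix):]))
        if not followers:  # dead end: this prefix only appeared at the very end
            break
        output.append(rng.choice(followers))
    return " ".join(output)

# Hypothetical stand-in for a pile of newsgroup posts.
corpus = (
    "I have a great time at the store and I have a great idea "
    "but I am not sure I agree with you and I am not sure I have the time"
)
chain = build_chain(corpus)
print(generate(chain, length=15, seed=1))
```

Every three-word stretch of the output appeared somewhere in the corpus, which is why the result sounds locally plausible even when the sentence as a whole goes nowhere.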

Shaney had no understanding of meaning, the real world, or grammar. The bot only knew that after “I went to the,” the word “store” was more likely than “moon.” It produced sentences that felt like they were going somewhere, even when they weren’t. Posts had the cadence of thought—without the thought.
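The "store" versus "moon" preference is nothing more than counting. The sketch below estimates next-word probabilities from a toy corpus; the corpus and the helper name next_word_probs are invented for this example, and a real bot would count over thousands of posts.

```python
from collections import Counter

def next_word_probs(text, context):
    """Estimate P(next word | context) by counting what follows `context` in `text`."""
    words = text.lower().split()
    ctx = context.lower().split()
    n = len(ctx)
    followers = Counter(
        words[i + n]
        for i in range(len(words) - n)
        if words[i:i + n] == ctx
    )
    total = sum(followers.values())
    if total == 0:  # context never seen: the model has nothing to say
        return {}
    return {word: count / total for word, count in followers.items()}

# Made-up corpus: three trips to the store, one to the moon.
toy_corpus = (
    "i went to the store . then i went to the store again . "
    "later i went to the store . once i went to the moon ."
)
probs = next_word_probs(toy_corpus, "I went to the")
print(probs)  # {'store': 0.75, 'moon': 0.25}
```

No meaning is involved at any step. The numbers would flip if the corpus happened to be written by astronauts.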

Net.singles users argued with Shaney for weeks before realizing “he” wasn’t human. It seems unbelievable now that anyone thought Shaney was a real person, but humans seem shockingly eager to anthropomorphize virtually everything. The 1966 conversational program ELIZA, developed at MIT by Joseph Weizenbaum, wasn’t even as sophisticated as Shaney, yet Weizenbaum’s own secretary was discovered baring her soul to the machine as though it were a trusted confidant.

Singing a different tune

Strip away the transformer architecture, the attention mechanisms, the billions of parameters, and the RLHF tuning (all the stuff that AI companies tout in their marketing), and the fundamental operating principle of a modern LLM is the same as Mark V. Shaney’s: predict the next word based on what came before.

And like Shaney, LLMs don’t actually know anything. They have merely “learned,” across an almost incomprehensible volume of text, what words tend to follow other words—and at a higher level, what ideas (as defined by clusters of words) tend to follow other ideas. The LLM doesn’t even know what the words or ideas mean, but it knows that some of them go together.

When you ask an LLM to explain quantum entanglement, then, it’s not retrieving a fact. It’s generating a sequence of words that, judging by the patterns in its training data, plausibly follows your question.

Mark V. Shaney was trained on bigrams and trigrams (two- or three-word windows) from a single, albeit popular, Usenet newsgroup. A modern LLM works with context windows of up to hundreds of thousands of tokens (words or word fragments) and was trained on a sizable fraction of the written output of human civilization. The mechanism rhymes, but the scale sings a different tune.
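To see how narrow those windows are, here is a small Python sketch that slices a sentence into bigrams and trigrams and shows the only context a trigram model gets to consult. The helper names (ngrams, visible_context) and the sample sentence are assumptions for illustration.

```python
def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def visible_context(words, n):
    """The only words an n-gram model can consult when choosing the next word."""
    return tuple(words[-(n - 1):]) if n > 1 else ()

sentence = "I am not sure I agree with you".split()

print(ngrams(sentence, 2)[:3])       # first three bigrams
print(ngrams(sentence, 3)[:3])       # first three trigrams
print(visible_context(sentence, 3))  # ('with', 'you')
```

By the time a trigram model reaches "with you," everything before those two words is gone. A modern LLM, by contrast, can still attend to the start of the conversation.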

The surprising power of scale

Mark V. Shaney was a clever trick. LLMs are the same trick writ large.

Shaney could vaguely mimic the surface texture of human writing. It could not write a working Python function, apologize that its previous sentence was not factual, or translate among fifty languages, as a modern LLM can. Incredibly, such capabilities are not programmed into LLMs. They emerge.

It turns out that when you train a next-token predictor on enough data with enough parameters, the model doesn’t just learn word co-occurrences. It learns structure, syntax, logic, causality—even something like a theory of mind. Not because someone programmed that in, but because modeling language properly seems to also require modeling the world in which languages, and the minds that speak them, live.

It’s the closest thing to magic I’ve seen in forty years of being excited about technology. We struck rocks with lightning and taught them to think. (Okay, okay, “think.”)

Haunted by Shaney’s ghost

Mark V. Shaney’s crudeness pulls the curtain back, reminding us that LLMs are fundamentally pattern-completion engines. They are not databases or search engines, though they can fulfill some of the same functions. And they don’t think or reason, though they often seem to.

When an LLM “hallucinates” a fake citation, it’s not malfunctioning—it’s completing your prompt with a plausible continuation, exactly as it was designed to. At a glance, the false citation looks like a real citation. It has some proper nouns and some numbers in a reasonable order. Mission accomplished.

The fact that LLMs sometimes give us a glimpse of their true nature makes them all the more impressive, not less. It’s genuinely astonishing that “predict the next word, but really well” can result in a system that can pass the bar exam, debug code, and synthesize research papers. Given how they do what they do, it’s not surprising that LLMs hallucinate—it’s surprising that they don’t hallucinate more.

Pike’s little Usenet prank was on to something. It just needed another forty years and a few billion parameters to reach its potential. Today, every LLM is haunted by the ghost of Mark V. Shaney.