All Hail the Magic Conch!

The era of "asking the model" and vibe research

Prelude

Deep in the depths of the ocean, in Bikini Bottom, there is one Club SpongeBob[1]. High up in a tall tree sits the clubhouse, occupied by a sponge and Patrick.

1. Season 3, Episode 42a. Watch the episode, please.

On this fine day, Squidward exits his moai to see these two barnacle heads laughing it up in their treehouse. Of course, there is nothing more irritating than their giggling. And when Squidward is told he wouldn't "fit in" their club, he can't take it. He has to join! And so up he goes.

But once inside, he finds out that, indeed, he can't physically "fit in" the clubhouse, and is now squished next to Bob and Patrick.

Squidward, desperate to escape, reaches for a branch and tries to pull himself down.

As he's almost down, the branch rips apart, and the treehouse is launched into the sky (sea). The clubhouse flies through the sea and comes crashing down in the thick Kelp Forest.

Oh no, Squid is trapped. Lost. Hundreds of miles from civilization.

He starts to panic. How can he escape these two morons now?

But luckily, Squidward is part of Club Spongebob, and they have one trick up their sleeve: the Magic Conch.

The Magic Conch is here to save the day!

Squidward: You've got to be kidding! That is just a stupid toy! How can that possibly help us?!

SpongeBob: [gasps] Squidward! We must never question the wisdom of the Magic Conch! The club always takes its advice before we do anything!

Patrick: The shell knows all!

SpongeBob: Oh, Magic Conch Shell. What do we need to do to get out of the Kelp Forest?

Magic Conch: Nothing.

Patrick: The shell has spoken!

Squidward: Nothing?! We can't just sit here and do nothing!

Squidward: I can't believe you two are gonna take advice from a toy!

Just do what the Conch tells you to do.

Squidward thinks he's smarter than the Magic Conch, so he runs and runs to escape the forest, only to find out he's running in circles and is right back at Club SpongeBob. He sets up a camp with the resources he has while SpongeBob and Patrick continue to listen to the Conch.

Squidward is certain he's smarter than the Conch followers

Right as Squidward is about to enjoy his roasted sea insect, a miracle falls from the sky, right into the camp of SpongeBob and Patrick! What a gift from the Magic Conch!

The gift from the Conch has arrived!

As SpongeBob and Patrick enjoy their feast, Squidward begs to be allowed to touch the food. They inform Squid that only the Conch can approve his request. And so he asks the Conch, again and again, "May I have something to eat?"

But he didn't give the Conch its due earlier, so the Conch only gives him a simple "no". Squidward goes crazy as the Conch seems to be aware of his dismissal of its powers.

And then, someone cuts a path into the forest!

The Kelp Forest ranger has arrived to save Squidward!

Squidward screams in delight — someone is here to save them!

But to Squid's dismay, the ranger also has a Conch of his own.

Kelp Forest Ranger: All right, Magic Conch... what do we do now?

Magic Conch: Nothing.

SpongeBob, Patrick, and Kelp Forest Ranger: All hail the Magic Conch!

Sound Familiar?

And so they sit, in silence, in devotion to the Conch, until another miracle falls from the sky. Perhaps, if they ask the Conch again and again, it will give them the answer to their problem. Perhaps a new version of the Conch will do better than the old one.

Whenever you have a question, don't think, just ask the Conch[2]. And whatever the Conch tells you is what you should parrot to others and follow diligently.

2. The Conch = a large language model (henceforth called "the model").

But just because it worked for SpongeBob doesn't mean it will work for you. There is a growing mass of people who believe the Conch will enable them to do nothing, and that good things will fall out of the sky.

The Situation

There has been a lot said over the past two decades about how electronics can sap away our thinking powers. When humans lose the ability to be bored and produce something from nothing, their mental faculties decay. I believe Cal Newport put this phenomenon into the public consciousness with his book "Deep Work"[3], where he discusses how electronics, especially phones, have put humans in an unprecedented situation where they can go through life and never be bored. Whatever attention-span degradation phones have already caused has been or will soon be dwarfed by the advent of the model.

3. I started hearing about the importance of boredom around 2016, when Newport's book was published.

In the past two years, in addition to the phone, which can prevent even a few seconds of boredom from setting in, we now have "the model". As people begin to use the model, they start by asking it a few questions about topics they would have used a search engine for in the past. However, as model dependency grows, people begin to outsource their thinking, and eventually whole thoughts, to the model. I need not dwell on this point too much, since others have made it much better[4].

4. See the article "The End of Thinking" by Derek Thompson.

Bimodal Undergrads

I've been in the university system for a long time. Almost certainly too long. Every year, I get a sample of the current class of undergrads to examine — both from teaching classes and from advising them on research. Year after year, the mean quality of undergrads has degraded — the average undergrad is increasingly motivated by money, status, and job prospects rather than any intrinsic interest in computer science. However, this trend about averages says nothing about the extremes.

The mediocre undergrads are crashing to the level of the barely literate ones, while the elite undergrads can rival senior grad students in programming competence, inquisitiveness, and instinct. Just in the past year, I've seen a few truly spectacular undergrads, the likes of which I haven't seen before. They can work autonomously, ask all the right questions, learn on the fly, and somehow still have enough time to be at the top of their classes and do self-driven research. I know that, when I was an undergrad, I was very far below their level.

You can probably predict why this bimodal shift has occurred. It's the model.

What's happening is that even the 80th-percentile undergrads are falling victim to model dependency. The majority of undergrads are not only preempting their boredom via their phones, but also preempting their thoughts via the model. We often see undergrads who produce reams of code from the model but are unable to explain what is going on and, more importantly, what they are trying to do.

The elite undergrads are always in control of their own thoughts. They use the model as just an enhanced search engine, which gives them confidence to enter new areas. Their knowledge compounds rapidly as they use the model to pull information, but use their own brains to synthesize it.

Thinking in "Embedding Space"

After talking about undergrads, I must now talk about those 'above' me: the professor class. If you thought professors were speaking in meaningless platitudes before, you haven't seen anything yet.[5] I swear, if I could peer inside the head of a typical professor, I'm sure all I would find is a vector database.

5. There are still some intelligent professors, but among the mediocre, the recent degradation has been extreme.

I often joke that professors (and the model) think in "embedding space". What I mean is that they think as if embeddings were semantics. I can't blame them. After all, if you ask the model, it will claim that embeddings capture semantics.

When they speak, they put together words that appear close in embedding space but are actually unrelated with respect to their real semantics. This is just a matter of training data, both for the professor and the model. Without enough good-quality data, embeddings tend to capture which words occur together rather than how they are related. Of course, a professor, unlike the model, should be able to think with grounding, so I can give the model a pass here.
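To make the joke concrete, here is a minimal sketch of that failure mode, with made-up 2-D vectors standing in for real embeddings (which have hundreds of dimensions); the words and the numbers are purely illustrative.

```python
# A minimal sketch of the co-occurrence pitfall. The 2-D vectors below are
# made up for illustration; real embeddings have hundreds of dimensions.
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# If "quantum" and "blockchain" co-occur constantly in low-quality training
# text, a co-occurrence-driven objective pulls their vectors together even
# though the concepts are unrelated.
emb = {
    "quantum":    np.array([0.90, 0.10]),
    "blockchain": np.array([0.85, 0.15]),  # near "quantum" purely via co-occurrence
    "qubit":      np.array([0.60, 0.65]),  # genuinely related, but rarer in the data
}

print(cosine(emb["quantum"], emb["blockchain"]))  # highest: buzzwords travel together
print(cosine(emb["quantum"], emb["qubit"]))       # lower, despite the real relation
```

Closeness in the vector space reflects whatever the training data rewarded, and with bad data, that is co-occurrence, not meaning.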

Biological Frontends for Mr. Model

What could be worse than people becoming dependent on the model? What if people became the model? Indeed, this is what I witness nowadays. Professors have become biological frontends for the model.[6]

6. I don't mean to bash professors too much; all professions have degraded in this way.

When a student asks for advice, the questions are forwarded to the model, and the model's response comes out of the professor's mouth. The professor asks the model for research ideas, and then recites the responses to their students. When reviewing papers for a conference, the PDFs are fed straight into ChatGPT Pro[7], and the outputs are massaged into the HotCRP boxes.

7. It's research-grade intelligence, after all.

Context Pollution

Models suffer from context pollution: garbage accumulates in the context window and distracts the model from the task it's supposed to perform. Being able to tell what the relevant context is, while discarding the rest, is an essential aspect of intelligence.
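To be clear about what "discarding the rest" means mechanically, here is a rough sketch; `chat` is a hypothetical stand-in for any LLM API, and the keyword filter is deliberately crude (a real system might use embeddings or summaries instead).

```python
# A deliberately crude sketch of keeping the context window relevant:
# filter the history down to what bears on the task, instead of letting
# every past exchange pollute the context. `chat` is a hypothetical
# stand-in for an LLM API call.

def is_relevant(message: dict, task_keywords: set[str]) -> bool:
    """True if the message mentions anything related to the task."""
    text = message["content"].lower()
    return any(keyword in text for keyword in task_keywords)

def ask_with_clean_context(chat, history: list[dict], task: str,
                           task_keywords: set[str]) -> str:
    # Discard accumulated garbage; keep only messages that bear on the task.
    kept = [m for m in history if is_relevant(m, task_keywords)]
    return chat(kept + [{"role": "user", "content": task}])
```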

If you look at the managerial class in general, they are increasingly falling victim to context pollution. They attend all kinds of useless meetings and conferences where the executive class jumbles together words that have no relation to each other. Their heads are filled with words from these continuous meetings, and when they are prompted to discuss something with their subordinates, they bring up unrelated nonsense from their context window. Buzzword-speak has become ubiquitous.

"Asking the Model" as Research

Day after day, I look at my Google Scholar notifications and see what is essentially the same paper repeated 100 times.

All of these thousands of papers just amount to "asking the model" in a loop, with some scaffolding, some tool use, some "prompt engineering", and perhaps some training data for supervised fine-tuning. Perhaps some will conduct a beam search across many samples provided to them by Mr. Model. Perhaps some will construct a new dataset from thin air, which they pose as a "benchmark".
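In code form, the typical methodology is roughly the sketch below; `ask_model` and `evaluate` are hypothetical stand-ins, not any particular paper's pipeline.

```python
# A minimal sketch of "asking the model" in a loop. `ask_model` is a
# hypothetical stand-in for any LLM API; `evaluate` stands in for whatever
# "benchmark" the paper constructs.

def vibe_research(problem: str, ask_model, evaluate, iterations: int = 10):
    best_score, best_solution = float("-inf"), None
    prompt = f"Solve this problem: {problem}"
    for _ in range(iterations):
        solution = ask_model(prompt)   # ask Mr. Model
        score = evaluate(solution)     # run the "benchmark"
        if score > best_score:
            best_score, best_solution = score, solution
        # The "prompt engineering": feed the attempt back in and ask again.
        prompt = (f"Problem: {problem}\n"
                  f"Previous attempt (score {score}):\n{solution}\n"
                  f"Do better.")
    return best_solution               # the "contribution"
```

Swap out the prompt template and the benchmark, and you have the next hundred Scholar notifications.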

Of course, this LLM mania isn't limited to computer architecture papers. We see this in all disciplines, where the 'solution' proposed by a paper is "asking Mr. Model", and the 'problem' is whatever fake problem they invent.

Furthermore, due to the high latency of conference paper deadlines and review cycles[8], by the time a paper is published, its "results" are already suspect. The authors will have run all their experiments with Mr. Model v7, and by the time the world sees their work, Mr. Model v8 has been released, rendering large parts of their specialized 'prompting' techniques and scaffolding obsolete.

8. It will usually take 6+ months from running experiments to public paper release.

It is just too easy for a professor to latch onto the "ask the model" methodology and enjoy the feeling of being part of the hype. At this point, "asking the model" has become its own area of research. Why bother using the model as a productivity booster or powerful search tool to produce impactful research, faster[9], when you could just ask Mr. Model to write a bunch of kernels and report that?

9. You will find that Mr. Model becomes useless very quickly when exploring an untouched area.

My final point is that academics should work on projects that carry substantial intellectual, logistical, and financial risk: risk so high that industry researchers would not take on such projects. What is the risk here? That Mr. Model may not always be correct? Why are academics "asking the model" when there are VCs investing billions of dollars and accepting infinite risk for startups to pursue "LLM for X"?[10]

10. I can appreciate that some academics use papers as a launching pad for a startup. That's fine, I guess.

"Asking the Model" as a Course

Not only has "asking the model" become its own area of research within engineering domains (e.g., RTL design, GPU performance engineering), but "asking the model" more generally has become an acceptable topic for courses. Let me go over three examples, from Stanford, Berkeley, and Harvard.

Stanford

CS329A: Self-Improving AI Agents is being offered for the second time at Stanford.

The course will start with self-improvement techniques for LLMs, such as constitutional AI, using verifiers, scaling test-time compute, combining search with LLMs, and train time scaling with RL. We will then discuss the latest research in augmenting LLMs with tool use, code, and memory, and orchestrating AI capabilities with multimodal interaction. We will next discuss multi-step reasoning and planning problems for agentic workflows, and the challenges in building robust evaluation frameworks.

It all sounds very fancy, but it boils down to asking the model in a loop. I believe this paper (The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery) is emblematic of the method being promoted. If this method can produce 'papers' that are so 'high-quality' that even humans would accept them at an ML conference, then we should be questioning the value of what humans are working on.

With that being said, the reading list for this class isn't bad at all. These papers are interesting and worth skimming just to understand the state of the field. All I ask is that the students don't make their class projects about getting Mr. Model to solve some problem.

UC Berkeley

CS294: Disrupting Systems Research with AI is being offered for the first time at UC Berkeley.

We are now at the beginning of a significant shift, where a new class of AI tools can autonomously generate algorithms that match and sometimes exceed the best human-designed solutions.

This course explores the frontiers of this new methodology, examining the future role of the researcher as a "strategic advisor" who guides powerful AI assistants rather than manually engineering solutions.

This is quite similar to the Stanford class, albeit a bit more hyped up. The proposed methodology boils down to "vibe research". Ask the model to propose grand changes to some existing codebase. Then, ask the model to produce some code to manipulate a repo and see what happens. Just go with the flow.

"The shell knows all! The shell has spoken!" — Patrick Star

Harvard

And now, this is the most egregious example of model-brained nonsense by far.

Presenting, CS249r: Architecture 2.0, being offered at Harvard for the first time. I urge the reader to read the "blog posts" on the website, such as: "Week 2: The Fundamental Challenges Nobody Talks About". It is very obvious that all these "blog posts" are written by Mr. Model. In fact, the entire website is generated by Mr. Model.

I couldn't have said it any better myself. See Reddi's slides here.

This class is about a concept that Prof. Vijay Janapa Reddi has coined: "Architecture 2.0". All it boils down to is generating tons of "data"[11], training some models on that data, and hoping for the best. After all, if this method has yielded good results in image classification and English emission, then surely the same method will yield fruit in computer architecture. It's just a matter of more and more data and more and more compute: this is the "Bitter Lesson" at work.

11. "Data" includes random RTL designs, gate-level netlists, PPA estimates, instruction traces, and so forth: the "corpus" of computer architecture.

Astute readers will note that Reddi's "Architecture 2.0" is just a ripoff of Andrej Karpathy's "Software 2.0". But recently, at YC's "AI Startup School", Karpathy presented his talk, Software Is Changing (Again), where he coined "Software 3.0"!

Karpathy's slides on Software 3.0

So, Prof. Reddi, Architecture 2.0 has already been superseded! It should be time for Architecture 3.0: aka "ask the model".

The model explains Architecture 2.0 (really should be Architecture 3.0)

Architecture 3.0 methodology in practice. Ask the model (in a loop).

A Better Path

As was discussed before, the focus should not be on asking the model and evaluating how it does, but rather on using the model to build better software. I recently saw a good example of this from Mark Ren's group[12]: SATLUTION. They don't present "asking the model" as the goal itself; instead, they ask the model to produce a better SAT solver, and the solver is the main contribution. They can discuss the specific elements of the model-generated code that make the solver faster. The focus is on the product, not the model.

12. Mark Ren and his team at NVIDIA Research are among the few competent actors in the field of "ML for CAD".

Our new framework, SATLUTION, autonomously evolves Boolean Satisfiability (SAT) solvers via LLM agents that outperformed the 2025 SAT Competition champions by more than 10%

It would be good if more work like this were done. Work as a domain expert who is trying to improve some algorithm or solve a specific problem, and use the model (who cares how you're using it) to help you make things better.
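The structural difference from the earlier loop is small but decisive: every candidate is checked against ground truth, and the deliverable is the improved artifact rather than the model's transcript. A hedged sketch, emphatically not SATLUTION's actual pipeline; `ask_model`, `apply_patch`, and `benchmark_runtime` are hypothetical stand-ins.

```python
# An artifact-focused variant of the loop: the model proposes patches to a
# real solver, and only measured speedups survive. All three helper
# functions are hypothetical stand-ins.

def evolve_solver(solver_src: str, ask_model, apply_patch, benchmark_runtime,
                  generations: int = 20):
    best_src, best_time = solver_src, benchmark_runtime(solver_src)
    for _ in range(generations):
        patch = ask_model(f"Propose one optimization to this SAT solver:\n{best_src}")
        candidate = apply_patch(best_src, patch)
        runtime = benchmark_runtime(candidate)  # ground truth, not vibes
        if runtime < best_time:                 # keep only measured improvements
            best_src, best_time = candidate, runtime
    return best_src  # the contribution is the faster solver, not the prompts
```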

Conclusion

Just as one should hesitate before picking up their phone to mitigate a moment of boredom, one must hesitate before shooting a query at the model. Think! Losing your boredom is bad enough, but losing your sovereignty is even worse. There is a huge risk that biological general intelligence will dry up well before AGI can come on the scene to save us.

You know what? Perhaps I'm wrong. Perhaps "asking the model" is the most useful, impactful, and important thing we can all do today.

In the next few years, I may be the one saying:

Squidward: All hail the Magic Conch