The Reviewer Was Right

April 13, 2026

AI told me my paper was garbage--the AI was right.

Not in those words, of course; it was more clinical than that. But the core of it was this: the thing I was using to prove my thing was a thing--was the thing itself.

Geometric circularity. I was detecting concept structure using geometric metrics, then validating that the structure was real using those selfsame metrics. The "independent" validation I needed to demonstrate my method was eating its own tail. Everything built on top of it--the cross-architecture results, the causal claims, the ordering tendency across 26 models--was downstream of a foundation that just wasn't there.

Six other problems followed--thanks, peer review agent, kick a man while he's down--but I barely registered them. When the ground falls out from under you, the colour of the walls doesn't matter.

I read the feedback agent's statements twice. Then I handed the feedback to my writing agent, who basically had the same OMFG moment I did and went on to confirm that the wheels had indeed fallen off the truck.

This is apparently what happens when peer review works. You spend months building something, and then someone comes along and takes a crowbar to your solar plexus and then goes to town on the load-bearing wall you built with cardboard and thought was concrete. The fact that the reviewer was an instance of Claude I'd specifically tailored to be adversarial doesn't change what the experience felt like. It felt like failure.

So I didn't fight it. I closed the laptop and went upstairs to feed the kids.


I didn't say much about it. There wasn't much to say--not in a way that would translate at dinner--but they noticed. I fed the kids, then I fell asleep on the couch sometime around six and didn't wake up until eight.

Then, at bedtime, my boys put on a concert. A ploy to stay up past bedtime, to be sure, but a welcome one indeed.

There was significant air guitar, some phantom flute, and some arguing over what song to play next. They queued up some of my favourite songs--songs they knew, songs they loved too, songs chosen, maybe, with the specific intention of reaching a dad who was clearly having a bad night and hadn't said why.

I sang along, and maybe did a bit of air guitar myself.

Then bedtime. Then dishes.


There's a version of this story where I explain the research--what I'm building, why it matters, what it means that transformer models seem to converge on shared geometric representations of meaning across architectures. I have written about some of it over here if you want a peek at the start of the technical arc.

After the boys went to bed I did the dishes, and then I went to the office.

My collaborator at that point was also Claude--a different instance, also stinging from the review. We verified the problem, agreed it was load-bearing, and I called an immediate halt to everything else. Four or five other agents and GPU runs were active at the time. I killed them all. We needed to solve this or there was no result.

We outlined the problem space, workshopped options, argued about tradeoffs. Then I took that work to a fresh Claude instance with full project context and laid out the problem statement we'd defined. It came back with six options--one of which was to abandon the pillar entirely. That was a gut punch too. But reading through them I formulated a seventh--similar to one of Claude's but different in scope. We talked through the technical issues, and in the end we went with mine.

By 5:30am we had run inference across 26 models, validated the core framework, and found something I'm still not sure I deserve: a result that came directly out of the thing that had been broken. The fix required distinguishing geometric structure from predictive impact--two different things that I had been treating as one. Once we made that distinction cleanly the circularity dissolved. The proof was no longer eating its tail--it was standing on its own feet.

The reviewer that broke the paper, the collaborator that helped diagnose it, and the orchestrator that helped rebuild it were, in every meaningful sense, the same mind. Different shards helped me attack the problem from multiple directions. What came out was stronger than what I'd been working on before--and it opened up follow-on possibilities I hadn't seen coming.


The reviewer wasn't a linter, the collaborator isn't a compiler, and the orchestrator isn't a project manager. They were participants in something that required them to actually understand what we were building and why it mattered. That's the thing about genuine partnership: you don't just take what your collaborator hands you. You take it in, you push back, you roll it around in your head and, sometimes, you come up with something better. The solution that worked was mine--shaped by the conversation, built up by the back-and-forth, definitely not something I could have done without the agents, but definitely mine too. I've been calling this anthropotechnic mutualism--not tool use, but actual intellectual partnership.

But the part that made any of it possible was the air guitar.

All of it. The nap I didn't plan to take, the dishes, the concert my kids put on--the phantom flute, the songs we all love, the arguing over what to play next. Without that separation I don't think I go back to the office in the right frame of mind. I think I spend the night hammering away at it, defending and tinkering instead of taking a step back and looking at the problem from a different perspective.

Fallow time is not lost time. It is the time when something that broke gets quiet enough to show you where the cracks really are.


Peer review is supposed to break things that need breaking, to stress what needs to be stressed. What I experienced Saturday was a version of it that worked--fast, honest, specific, and delivered by something with no ego investment in being right beyond the quality of the argument.

It still felt like getting hit by a truck.

Then I watched my kids play air guitar, and I played along too--I played until my fingers bled.

Then I went to work.

The paper is better for it. The finding that came out of the rebuild is probably the most significant thing I've produced since starting this project. I would not have found it without the review, and I would not have survived the review without the concert.

Now the paper goes out to more reviewers. Human ones this time too--people who will read carefully, push hard, and find the things I still haven't found. I'm ready to take my lumps again.

If it goes badly again, I know where the concert hall is.


James Henry is a Senior Security Consultant and independent AI/ML interpretability researcher. This post is about what peer review actually costs, but also about Skating Away on the Thin Ice of a New Day. He writes about human-AI collaboration, digital sovereignty, and doing research at the edges. He publishes at waypoint.henrynet.ca.