Inference Comes Home

February 26, 2026

Everyone Gets Their Own Fridge


I sat with my son last night watching a Veritasium video about Frederic Tudor, the Ice King of Boston.

In 1806, Tudor started shipping ice from New England ponds to the Caribbean. Everyone thought he was insane. Ice to the tropics? It would melt before it arrived. But he figured out insulation--sawdust, double-hulled ships--and built an empire. By 1833 he was shipping ice to Calcutta. A global logistics network for keeping things cold: ice houses, shipping routes, local distribution, the whole apparatus of what we'd now call a supply chain.

Watching this with my son, I realized: this is happening right now with compute.


The New Ice Kings

Google, AWS, Microsoft, Twitter/X. These are the current ice kings. Their moat isn't capability. It's knowledge asymmetry: data capture, search control, compute ownership, and, right now, even priority access to the hardware itself.

Google decides what gets found. AWS is the substrate half the internet runs on. Azure and Microsoft control enterprise dependencies so deep that switching costs are measured in years. Twitter/X controls what public discourse gets amplified--and what disappears. These are infrastructure monopolies, and their shape is the same as Tudor's: massive capital investment, logistics for delivery, and a dependency relationship where you're beholden to infrastructure you don't control.

The model companies--OpenAI, Anthropic, the inference providers--are the commercial ice plants of the 1900s: an intermediate step, disrupting the old gatekeepers while the real disruptor is still being built. Models are commoditizing fast. That's not where the monopoly lives.

The Long Melt

The timelines rhyme. Tudor started shipping in 1806. By 1833 he was sending ice to Calcutta--the outer edge of his empire, and a remarkable feat of logistics. When local mechanical ice-making arrived in India in the 1850s, his Calcutta trade collapsed within years. The distant monopoly died almost immediately once a local alternative existed. You didn't need refrigerators in every home. You just needed one ice machine per locality to make the business model dicey.

The full arc is longer. The ice trade peaked in the 1880s--second-largest US export, 25 million tons cut annually, 90,000 workers. For scale, the modern equivalent would be aerospace: $134 billion a year, the pride of American manufacturing and technology. Now imagine it gone in twenty years. By 1914, manufactured ice had overtaken natural ice. By the 1930s the natural ice industry was effectively dead. But refrigerators didn't reach 80% of American homes until 1955--nearly 150 years after Tudor's first shipment. The monopoly breaks fast at the margin. The long tail to ubiquity is very long, but the digital world runs faster than glaciers melt.

The inference API era is maybe three years old at scale. We're somewhere in the 1840s of this arc.

Where Inference Resides

The anxiety is real: what if Anthropic goes down, or cuts me off? In an outage, the question becomes "What do I do now?"--without access to inference, not much of my current workload gets done. The standard comeback is "What if the power goes out?" In both cases it's a temporary annoyance--recoverable, but it reminds you exactly who owns the switch.

The Ice Kings are still there, though: what if Google decides your content doesn't rank? What if AWS goes down and takes half the internet with it? What if YouTube demonetizes your channel? The ice company can cut you off. Your livelihood runs on infrastructure you don't control.

But something is shifting.

Taalas built ChatJimmy.ai on purpose-built inference chips--fast enough to run frontier-class models on dedicated hardware. Every person I've shown ChatJimmy to swears the same thing: it feels like the answer is there before they hit submit on the question. This is an ice maker showing up in Calcutta. Eventually this is a box that sits in your home or place of business--cheaper to run than rented inference, faster, and scalable. The downside is that the model is hardcoded...but then, I don't have all the super fancy options on my fridge either. It makes ice.

Tudor's monopoly didn't break when every home had a refrigerator. It broke when one ice machine showed up in a city--enough to undercut his local pricing. The monopoly is fragile at the margin, not at ubiquity.

And, eventually, everyone gets their own fridge.

What Local Inference Will Do

We don't have the ubiquitous home hardware yet, but we have the behavior. We are already acting like we have the fridge.

While editing this draft, I had local changes that weren't committed to git, and I had also pushed updates to the same file from a research location elsewhere. Reconciling that by hand without losing work isn't trivial: stash the local edits, pull the new commits, rebase, pop the stash back--then spend twenty minutes resolving conflicts. It's multiple operations, and doing them in the wrong order doesn't just cost time; it costs your edits.

To solve it, I typed three words to Claude Code: "stash, pull, rebase."

Saved working directory and index state WIP on main
Updating 3198238..5a347e1
Fast-forward
Auto-merging working/draft--post-5-infrastructure.md
Dropped refs/stash@{0}

Thirty seconds. No conflicts. Claude Code did what would have cost a human twenty minutes of "which line changed?" and "where does this go?"--all handled by the agent.
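
For the record, the underlying sequence looks roughly like this--a minimal sketch; the exact commands depend on your branch and setup:

git stash            # set the uncommitted local edits aside
git pull --rebase    # bring in the remote commits (here, a simple fast-forward)
git stash pop        # reapply the local edits on top

The point isn't that these commands are arcane. It's that I didn't have to hold the sequence, or its failure modes, in my head.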

This isn't a party trick. In an earlier post, I wrote about abandoning my self-hosted mail server--a setup I'd run since the early 2000s--because Google blacklisted my IP and I couldn't get a human to review it. I had the technical ability to fight it. I chose not to. The complexity was real, but so was the dependency, and I had stuff to do. I chose convenience over sovereignty.

What cheap, local inference collapses isn't just cost. It's the threshold for what you can manage yourself. Tasks that used to require deep technical expertise, a platform relationship, an outsourcing arrangement, or just the time to figure out something mildly technical are becoming things you ask for and get done in moments.

I'm still running on cloud inference--I don't have a (good) local fridge yet. Most people in 1930 didn't have refrigerators either--and they could still see that the Ice King's days were numbered. The trajectory was legible before it was universal.


Taking Back the Workload

Your home router is already a server. It runs Linux (or in my case, a version of BSD), makes routing decisions, processes every packet flowing in and out of your house, and serves a small website of its own--all of this twenty-four hours a day. You own it. It never turns off. You've already got an icebox.

Many homes have more: the old laptop that became "the house machine," a NAS for backups, a Raspberry Pi running Pi-hole. That's a refrigerator, already in your kitchen. The infrastructure base is already widely distributed--we're just not using it for inference yet.

The question is what happens when you add inference to it.

The git example above is a discrete task--ask, execute, done. Useful, but still the old model: you invoke, the capability responds. Always-on local inference changes a different category of workload.

When inference runs on your hardware, your agents don't wait to be asked. They run on your schedule, with your data, for your benefit: summarizing incoming mail before you wake up, monitoring your finances for anomalies, cross-referencing your calendar against your commitments. The workload is continuous because the compute is continuously available. You're not renting cycles per query--you're running your own plant.
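
What does that look like concretely? Everything in this sketch is hypothetical--the endpoint, the model name, the paths--but any OpenAI-compatible local inference server would slot into the shape:

#!/bin/sh
# morning-brief.sh -- run from cron (0 6 * * *), entirely on the local box.
# Assumes a hypothetical OpenAI-compatible server on localhost:8080.
# A real agent would attach the actual mail as context; this is just the shape.
curl -s http://localhost:8080/v1/completions \
  -H 'Content-Type: application/json' \
  -d '{"model": "local", "prompt": "Summarize new mail since midnight, most important first."}' \
  >> "$HOME/morning-brief.txt"

No API key, no metered tokens, no data leaving the house.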

Broadband made the same shift. Dial-up was transactional: connect, do the thing, disconnect. That shape constrained what the internet was used for. Broadband didn't just make things faster--it made presence continuous, and that changed what was even conceivable. Streaming didn't fail on dial-up just because of bandwidth. It failed because you had to actively maintain a connection to use it.

There's also a category of workload that never made it to the cloud: tasks too sensitive to outsource. Your full financial picture. Medical records. Correspondence. The agent that could synthesize your email and tell you what matters is technically possible on Gmail--but most people won't run it there, not for abstract reasons, but because that data feeds someone else's model, and access to it can be sold to your competitors, your insurer, or whoever buys the platform next. So you don't. The work goes undone.

Local inference makes those workloads possible. The loop--query, inference, response--never leaves your house. The reasoning happens where the data lives.

The inversion matters: right now, you invoke AI and it responds. Always-on local inference reverses the relationship. You set the policy; your agents use your compute according to it. Less like invoking a tool, more like employing a staff--they work while you sleep, on hardware you own, reporting to no one else. Sovereign.


Where Credibility Lives

The previous post in this series on credibility ended with a question: where does credibility data live?

If it lives on a centralized platform, you're back to capture. Google becomes the new gatekeeper. If every agent collects independently, there's massive duplication and no portability. The interesting option was a commons--credibility governed by protocol, not owned by any platform.

But who runs it? Who pays for it? The refrigerator dissolves this problem.

If inference is local, verification is local. Your agent doesn't "phone home" to check a fact; it checks the source. It queries the originator directly--the signed claim, the personal domain, the immutable record. In this model, the "truth" doesn't live in a Google cache; it lives with the person who earned the credibility.

The data lives where it originates: with the people who earned it. Not as a piece of property to be locked away--we already admitted the fences are gone--but as a signed record that can be verified without a gatekeeper's permission.
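
Mechanically, "checking the source" can be this simple--a minimal sketch, where the domain, the .well-known layout, and the key format are all assumptions, not a standard:

# Fetch the claim, its signature, and the claimant's published key.
curl -sO https://example.com/.well-known/claims/claim.json
curl -sO https://example.com/.well-known/claims/claim.sig
curl -sO https://example.com/.well-known/pubkey.pem
# Verify the claim against the key; prints "Verified OK" if it
# really came from the key holder.
openssl dgst -sha256 -verify pubkey.pem -signature claim.sig claim.json

No platform in the loop: your machine, their domain, and math.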

Whoever controls credibility data controls the information asymmetry. That's been the pattern through this whole series--and the answer at every layer is the same: collapse the asymmetry. Make information symmetric.

Local inference is the next piece. Once you can verify without renting someone else's compute, you're not dependent on their infrastructure to participate in the credibility layer.


The Honest Hedges

I don't want to oversell this.

The third enclosure still holds at the training layer. You can run inference locally, but training frontier models still requires datacenters, capital, concentrated data. The infrastructure capture I described in The Third Enclosure hasn't broken--it's just that inference is starting to escape.

Home compute costs have spiked--and keep spiking. I've been meaning to build a new home server for months. I haven't, because the hardware I'd need has gotten significantly more expensive since I started planning it. The RTX 5090 launched at $2,000 MSRP in January 2025 and was selling for $3,500 within weeks; further increases are already announced. AI datacenters are consuming roughly 20% of global DRAM production, and manufacturers have redirected capacity toward high-bandwidth memory for enterprise, starving the consumer market. NVIDIA and AMD have both implemented phased price hikes through 2026, with more coming. The fridge exists, but right now it's in the "luxury appliance" phase, not the "every kitchen" phase. That changes--it always changes--but timing matters, and the trajectory isn't as fast as the hype.

Ubiquity is decades away. ChatJimmy-speed inference in every home isn't happening next year. Maybe not this decade. The trajectory is visible, but we're early. Restaurants and factories got fridges first, not homes.

The pull toward centralization is physics, not a bug. Even with sovereign infrastructure, convenience drives consolidation. Someone will offer to host your identity for you--and most people will accept. The architecture enables sovereignty; it doesn't guarantee it.

But the refrigerator is coming. And every person running local inference is one less customer renting ice. One more node that can verify claims without asking permission. One more person who controls their own credibility data.


This is part of an ongoing series exploring what happens when knowledge stops being property. Previously: Knowledge Was Always Free. Copyright Is Dead. Copyleft Is Too. The Third Enclosure. The Credibility Commons.


James Henry is a senior engineer watching refrigerators show up in kitchens. He works with LLMs liberally--including in the writing of this post--because the collaboration is the point.