In the early hype phase of any new wave, we’re swamped with exciting breakthrough news. It was so in 1993, it is so today, and it will probably always be so. Let’s take a closer look at something I just stumbled across in a recent X post.
Carlos E. Perez posted about this Microsoft Research paper: “The Era of Agentic Organization”. The paper introduces AsyncThink – AI learning to parallelize work across multiple agents, acting like an elite project manager. The tech press is celebrating it as a paradigm shift.
I recognize this paradigm. I implemented it in 1993 with a 14.4k modem and whatever Sun Sparc workstations I could commandeer at night.
The 1993 Version
My diploma thesis tackled evolutionary algorithms for combined vehicle routing and 3D bin-packing – two NP-hard problems stacked together because apparently I enjoyed suffering. This wasn’t academic masochism; these problems actually appear together in real logistics operations. But the computational requirements were brutal: 100,000+ simulation runs just to find viable parameter spaces.
Parallelization wasn’t a clever optimization. It was mandatory.
The simple approach that worked:
Late evening, dial into the university network via modem, distribute simulation batches across every idle Sparc workstation in the CS building, go to sleep, collect results at sunrise, analyze, iterate.
The overhead was minimal – a shell script, some file management, basic process coordination. The gains were substantial. With the right parametrization, my genetic algorithm held the best-known TSP heuristic benchmark for six months. Not bad for distributed compute running on borrowed infrastructure.
The sophisticated approach that didn’t:
We had a Transputer system, programmed in Occam. Actual parallel hardware, cutting-edge for the time, and absurdly expensive – it cost more than the entire Sun Sparc cluster.
The problem? Nobody could map the complex, evolving solution populations to the Transputer’s minimalist array architecture. The coordination overhead consumed any potential gains before they manifested. It became a very expensive reminder that parallel hardware and parallel algorithms need to match at a fundamental level.
It never computed a thing.
What AsyncThink Rediscovered
The Microsoft paper describes AI learning to fork sub-tasks to worker agents, coordinate their execution asynchronously, and integrate results dynamically. They trained it with reinforcement learning that rewards both correctness and concurrency – literally teaching the AI to hate idle workers.
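The fork-and-join choreography itself is an old pattern. Here is a toy sketch of it – plain Python coroutines standing in for the organizer and worker LLM calls; all names are invented for illustration, none of this is the paper's actual implementation:

```python
import asyncio


async def worker(subtask: str) -> str:
    """Stand-in for one worker-LLM inference call."""
    await asyncio.sleep(0)  # pretend network / inference latency
    return f"answer({subtask})"


async def organizer(problem: str) -> str:
    # Fork: split the problem into sub-tasks (another inference call in reality).
    subtasks = [f"{problem}/part{i}" for i in range(3)]
    # Workers run concurrently; the organizer awaits all of them.
    partials = await asyncio.gather(*(worker(s) for s in subtasks))
    # Join: integrate the partial results (yet another inference call in reality).
    return " + ".join(partials)


result = asyncio.run(organizer("route"))
```

Every comment flagged "in reality" marks a step that costs a full model invocation – which is exactly where the bill comes from.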
The results are impressive: 28% faster than previous parallel methods, more accurate, and demonstrating abstract transfer learning to new problem types.
This is genuinely clever work. The technical achievement is real.
But here’s what I learned in 1993 that AI researchers are about to learn in 2025:
Parallelization isn’t free. It’s a cost multiplier wearing a capability costume.
The Infrastructure Reality
Every “worker” in AsyncThink is another inference call. Every forked sub-task is more compute. Every async coordination adds memory overhead. The paper celebrates escaping the “slowest consultant” bottleneck by creating more consultants that all need to run simultaneously.
Let’s be specific about what this means:
- Simple sequential thinking: One inference call, start to finish
- Parallel voting methods: N inference calls running concurrently, pick the winner
- AsyncThink: 1 organizer call + N worker calls + M integration calls + ongoing coordination overhead
The 28% speedup is measured against wall-clock time. But the total compute consumed? That’s scaling multiplicatively.
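That accounting is easy to sketch as a back-of-envelope model. Everything below is assumption – uniform per-call cost and latency, a single integration step – chosen to make the trade-off visible, not figures from the paper:

```python
def compare(n_workers, latency=1.0):
    """One fork-join round versus one sequential call.
    Assumes every inference call costs the same and takes `latency` seconds."""
    seq_calls, seq_time = 1, latency
    # Billed compute: organizer + N workers + one integration call.
    par_calls = 1 + n_workers + 1
    # Critical path: organizer, then the slowest worker, then integration.
    par_time = 3 * latency
    return {"calls": (seq_calls, par_calls), "time": (seq_time, par_time)}


summary = compare(n_workers=8)
# Wall-clock shrinks because the workers overlap, but the bill grows
# with every call: 10 calls here against 1 for sequential thinking.
```

The wall-clock win is real. So is the multiplier on total compute – and only one of those shows up in the headline number.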
This isn’t solving the constraint. It’s discovering it at scale.
The Pattern Recognition
1993: Parallel computing is the future. Distribute work across cheap workstations. Solve bigger problems. Democratize access to supercomputing power.
Reality: Companies did the math. Most never deployed parallel systems. The ones that did were large research institutions and enterprises that could afford dedicated infrastructure and specialized talent. Consolidation happened quietly.
2025: Parallel AI agents are the future. Fork worker LLMs. Organize intelligence. Democratize access to super-team cognition.
Prediction: Give them time. They’ll do the math.
The Positioning Question
This connects directly to the infrastructure constraints we’ve been documenting. AsyncThink isn’t evidence against the bubble analysis – it’s confirmation of the pattern.
Every capability breakthrough that promises to “democratize” AI actually raises the resource floor. The gap between “technically possible” and “economically deployable” keeps widening.
If you’re evaluating AI strategies that assume costs decrease as capabilities increase, AsyncThink just moved you further from reality.
The question for decision-makers isn’t whether agentic AI works. The question is: Who can afford to run coordinated agent systems at production scale?
That list is getting shorter, not longer.
The Wheel Turns
I never finished my PhD. The internet happened, and academic timelines couldn’t compete with the pace of actual deployment.
But I kept the lessons from that diploma thesis. One of them was this: parallel systems have been “the future” multiple times. Each time, the technical breakthrough arrives first. The infrastructure bill arrives second. Then consolidation.
The AsyncThink paper is excellent research. The technical insight is sound. Teaching AI to organize intelligence is a legitimate advance.
But it’s solving a cognitive bottleneck while accelerating into a physical one.
We’ve seen this movie before. Some of us even wrote the script.
P.S. The “Organizer” role in AsyncThink is itself another LLM inference call with coordination prompts. So you’re paying for management overhead too. At least my shell script coordinated the Sun Sparcs for free.
P.P.S. That Transputer taught me that sometimes the most expensive hardware teaches the most valuable lesson: sophisticated doesn’t mean deployable.
