Vibecoding Three.js in 2025: when AI hit the wall of 3D

By autumn 2025 the AI coding wave was in full swing, but it had a shape. Everybody was vibecoding CRUD apps, landing pages, scripts that glue two APIs together. These were solved problems, with a million examples on the internet for the models to have eaten. What almost nobody was doing was pointing an AI at 3D in the browser: Three.js, WebGL, shaders. That was still considered specialist territory, the kind of thing you hand to the one person on the team who "does the graphics."

I was doing it anyway. Around October 2025 I built an interactive 3D LED wall visualizer, a tool to lay out LED panels in space and actually see the result. That project has its own story and this isn't it. This is about the experience of vibecoding 3D back when 3D was the part the models were worst at.

Where the models were genuinely strong

Let me be fair first, because the honest version isn't "AI can't do 3D." It could do a lot, and fast.

The boilerplate was a gift. Setting up a scene, a camera, a renderer, a resize handler, an animation loop, orbit controls. That ceremony is the same every time and the models knew it cold. I'd describe what I wanted and a working scaffold appeared in seconds. Same with the surrounding app: UI panels, state, file import, the React glue around the canvas. All the stuff that isn't really 3D, just software, was as smooth as any other vibecoding.

They were also good at the vocabulary. Ask how to instance a few thousand identical meshes without killing the framerate and you'd get InstancedMesh with a sensible explanation. The API surface of Three.js is well documented and well represented in training data, so for "which method do I call" questions, the models were a faster manual than the manual.
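
To make the instancing idea concrete: an instanced draw reuses one geometry and one material and varies only a per-instance transform. The sketch below is dependency-free TypeScript, not the visualizer's code. It packs one column-major 4x4 matrix per panel into a single Float32Array, which is the layout I understand InstancedMesh to keep in its instanceMatrix attribute. The function name and the grid parameters are made up for illustration.

```typescript
// Hypothetical helper (not a Three.js API): build per-instance transforms
// for a flat grid of identical panels, centered on the origin in the XY plane.
// Each instance gets 16 floats: a column-major 4x4 translation matrix.
function buildGridInstanceMatrices(
  cols: number,
  rows: number,
  spacing: number
): Float32Array {
  const data = new Float32Array(cols * rows * 16);
  for (let row = 0; row < rows; row++) {
    for (let col = 0; col < cols; col++) {
      const offset = (row * cols + col) * 16;
      // Identity on the diagonal (no rotation, no scale).
      data[offset + 0] = 1;
      data[offset + 5] = 1;
      data[offset + 10] = 1;
      data[offset + 15] = 1;
      // In column-major order the translation sits at indices 12, 13, 14.
      data[offset + 12] = (col - (cols - 1) / 2) * spacing;
      data[offset + 13] = (row - (rows - 1) / 2) * spacing;
      data[offset + 14] = 0;
    }
  }
  return data;
}

// Example: a 3 x 2 grid of panels spaced half a unit apart.
const gridMatrices = buildGridInstanceMatrices(3, 2, 0.5);
```

In a real scene you would more likely build each matrix with Three.js and hand it over through setMatrixAt, then set instanceMatrix.needsUpdate. The point here is only that a few thousand panels cost one buffer, not a few thousand meshes.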

Where it fell down: the math doesn't lie

And then you ask for something that requires the model to actually reason about space, and the floor gives way.

Coordinate systems and orientation. This was the daily fight. Y is up, except when something is rotated, except when the imported geometry assumed Z up, except when the parent transform already flipped it. The model would confidently hand me code that put a panel facing the wrong way, or mirrored, or rotated ninety degrees off, and it would explain the wrong answer with total fluency. It didn't have a mental picture of the scene. It had a plausible-sounding pattern.

3D math. Anything with quaternions, matrix order, or converting between local and world space was a coin flip. The code compiled, ran, and produced something subtly, infuriatingly wrong. The bug was never a crash. It was an object two units to the left, or a rotation that drifted, and you only catch that with your eyes.

Shaders. This was the deep end. GLSL is its own language with its own gotchas, and the models could write something that looked like a shader and even compiled, but the logic inside, the actual math turning coordinates into colors, was frequently nonsense. Debugging a shader is already miserable because you cannot console.log a fragment. Debugging a shader the AI wrote, that it does not understand either, is a special kind of lonely.

Spatial reasoning in general. "Make the panels curve around a center point." "Tilt the whole array so it faces the audience." Those are one sentence to a human standing in a room and a genuinely hard prompt for a 2025 model, because it has no room to stand in. It is pattern-matching geometry it cannot see.
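
Here is a small illustration of the "mirrored" failure, as a sketch rather than anything lifted from the project. It assumes both coordinate systems are right-handed: a Z-up source (x right, y forward, z up) and a Y-up target like the one Three.js uses. The helper names are hypothetical. The tempting conversion is to swap y and z, which puts "up" in the right place and quietly mirrors everything. The correct conversion is a rotation of -90 degrees about X, and a cross-product check tells the two apart without rendering anything.

```typescript
type Vec3 = readonly [number, number, number];

// Correct Z-up to Y-up conversion for right-handed systems:
// a rotation of -90 degrees about the X axis.
function zUpToYUp([x, y, z]: Vec3): Vec3 {
  return [x, z, -y];
}

// The plausible-looking shortcut: swap y and z.
// "Up" lands correctly, but this is a reflection, so the scene is mirrored.
function naiveSwap([x, y, z]: Vec3): Vec3 {
  return [x, z, y];
}

function cross(a: Vec3, b: Vec3): Vec3 {
  return [
    a[1] * b[2] - a[2] * b[1],
    a[2] * b[0] - a[0] * b[2],
    a[0] * b[1] - a[1] * b[0],
  ];
}

function nearlyEqual(a: Vec3, b: Vec3, eps: number = 1e-9): boolean {
  return (
    Math.abs(a[0] - b[0]) < eps &&
    Math.abs(a[1] - b[1]) < eps &&
    Math.abs(a[2] - b[2]) < eps
  );
}

// A conversion keeps handedness if the converted X cross the converted Y
// still equals the converted Z. A mirror flips the sign of that product.
function preservesHandedness(convert: (v: Vec3) => Vec3): boolean {
  const x = convert([1, 0, 0]);
  const y = convert([0, 1, 0]);
  const z = convert([0, 0, 1]);
  return nearlyEqual(cross(x, y), z);
}

const rotationIsSafe = preservesHandedness(zUpToYUp);  // true
const swapIsSafe = preservesHandedness(naiveSwap);     // false: mirrored
```

Both functions send the source "up" vector to Y, which is exactly why the mirrored version survives a quick glance. A check like this is cheap to run before you ever look at the canvas.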

So you herd it

The mode that actually worked was not "describe it and accept the output." It was herding. I would let the model do the plumbing, then take the wheel for anything spatial. The loop looked like this: ask for a small piece, run it, look at the result, and feed back exactly what was wrong in physical terms. "The panel is facing away from the camera." "It is rotated, the top edge is on the left." Concrete, visual corrections, not "fix the math."

The other survival trick was keeping the model's job small. Do not ask for "the curved wall layout." Ask for "given this panel position, return the rotation that makes its normal point at the origin," verify that one function in isolation, then move on. The smaller the spatial step, the better the odds. The bigger the spatial leap, the more confidently wrong the answer.
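
A minimal sketch of what one of those small, checkable functions can look like. It is not the visualizer's code and it rests on assumptions I am choosing for the example: Y is up, the panel's local normal is +Z, and the rotation is a yaw about Y followed by a pitch about X with no roll (which I believe matches Euler order 'YXZ' in Three.js). The names faceOrigin and rotatedNormal are hypothetical.

```typescript
interface Point3 {
  x: number;
  y: number;
  z: number;
}

interface YawPitch {
  yaw: number;   // radians, rotation about the Y axis
  pitch: number; // radians, rotation about the X axis
}

// Given a panel position, return the yaw and pitch that make its
// local +Z normal point at the origin.
function faceOrigin(position: Point3): YawPitch {
  const len = Math.sqrt(
    position.x * position.x + position.y * position.y + position.z * position.z
  );
  if (len === 0) {
    throw new Error("Panel is at the origin; there is no direction to face.");
  }
  // Unit direction from the panel toward the origin.
  const dx = -position.x / len;
  const dy = -position.y / len;
  const dz = -position.z / len;
  const horizontal = Math.sqrt(dx * dx + dz * dz);
  return {
    yaw: Math.atan2(dx, dz),
    pitch: Math.atan2(-dy, horizontal),
  };
}

// Apply pitch about X, then yaw about Y, to the vector (0, 0, 1).
// Used only to verify faceOrigin in isolation.
function rotatedNormal(rotation: YawPitch): Point3 {
  const cosPitch = Math.cos(rotation.pitch);
  return {
    x: cosPitch * Math.sin(rotation.yaw),
    y: -Math.sin(rotation.pitch),
    z: cosPitch * Math.cos(rotation.yaw),
  };
}

// Example: a panel at (3, 4, 0) should end up with a normal of (-0.6, -0.8, 0).
const exampleNormal = rotatedNormal(faceOrigin({ x: 3, y: 4, z: 0 }));
```

The second function is the important one. It lets you confirm the answer numerically, by checking that the rotated normal is the negated, normalized position, before you trust it in a scene where the only feedback is "that looks a bit off."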

And you had to know enough to catch it. This is the part the hype skips. I could only herd the model because I could read the scene and tell when an object was subtly off. Someone with zero 3D intuition vibecoding the same tool in 2025 would have shipped something that looked almost right and was quietly broken, with no idea why. I catch those because I have stood in front of enough real LED walls on site to know which way a panel is supposed to face.

Why it was worth doing anyway

Rough as it was, I would make the same call again. Even with all the herding, the math fights, and the shader misery, the net result was that one person built an interactive 3D tool in a fraction of the time it would have taken without the models. They could not do the spatial reasoning, but they erased everything around it, and that everything is usually what stops a side project before it starts.

The deeper point: in autumn 2025, doing this at all meant working slightly ahead of where the tooling was comfortable. The whole conversation about AI coding was flat and two-dimensional: web apps and scripts. 3D was a corner everyone assumed was too hard to bother with. It was harder. It was also completely doable if you were willing to be the spatial reasoning the model did not have.

That is usually where the interesting stuff is: not where the models are already good, but one step past it, where you still have to bring something they cannot. If you have a 3D idea sitting in a drawer because "AI cannot really do that yet," that is exactly the reason to take it out.