Arun Pandian M

Android Dev | Full-Stack & AI Learner

Left Null Space — The Error Your Model Cannot Learn

At some point a model stops improving, but not in a dramatic way. The loss doesn’t blow up. It doesn’t fluctuate. It simply settles.

You tweak the learning rate. You train longer. You restart with better initialization.

Nothing changes.

The model isn’t unstable — it has just reached the edge of what it can represent.

What remains isn’t a training issue anymore. It’s a structural limitation.

Linear algebra gives that leftover error a name: the left null space.

What’s Actually Happening

[Figure: left_nullspace.svg]

We usually picture training as a simple loop:

change the weights → predictions change → error reduces

But here the pattern shifts:

change the weights → predictions move slightly → the error remains

The optimizer is still running, updates are still happening, yet progress feels cosmetic.

At this point you’re not really improving the model anymore — you’re moving inside the space it already understands.

The remaining difference isn’t due to bad training. It exists because the model has no way to represent it.

A tiny math example (no fear)

[Figure: left_nullspace_sample.svg]

When Ax Can’t Reach b

Let’s imagine we collect two features for every data point:

the first value x and another value y

At first it looks like we have two independent measurements. But after looking closer, we notice something interesting:

the second feature is always twice the first.

y = 2x

So although the data appears two-dimensional, it really isn’t. All points sit along a single direction, like dots drawn along a straight path.

Writing the data as a matrix

We can place a few samples together into a matrix:

A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \\ 3 & 6 \end{bmatrix}

This matrix represents the inputs given to the model.

The model learns weights:

x = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}

And predictions come from multiplying them:

Ax

What the model is actually capable of

Carrying out the multiplication:

Ax = \begin{bmatrix} w_1 + 2w_2 \\ 2w_1 + 4w_2 \\ 3w_1 + 6w_2 \end{bmatrix}

If you look carefully, every row shares the same pattern. We can rewrite it as:

Ax = (w_1 + 2w_2) \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}

This reveals an important limitation:

no matter how the weights change, the model can only move along one direction.

Training can slide predictions forward or backward along that line, but it cannot leave the line.
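You can check this directly. Here is a small NumPy sketch using the matrix from the example above; the random weights are arbitrary:

```python
import numpy as np

# The data matrix from the example: the second column is 2x the first.
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

# The columns are linearly dependent, so the column space is one-dimensional.
print(np.linalg.matrix_rank(A))  # 1

# No matter which weights we pick, Ax is always a multiple of [1, 2, 3].
rng = np.random.default_rng(0)
for _ in range(3):
    w = rng.normal(size=2)
    scale = w[0] + 2 * w[1]
    assert np.allclose(A @ w, scale * np.array([1.0, 2.0, 3.0]))
```

The rank tells the whole story: two columns, but only one independent direction for predictions to move in.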

When reality asks for more

Now suppose the true output is:

b = \begin{bmatrix} 5 \\ 11 \\ 20 \end{bmatrix}

We would like the equation

Ax = b

to have a solution.

But that would require b to lie on the same line as the model’s predictions. It doesn’t.

So no choice of weights can reproduce it exactly.

What training really does

Instead, the model settles for the nearest possible output — we’ll call it b̂, the best approximation it can produce.

The remaining difference

r = b - \hat{b}

never disappears — not because training failed, but because the model has no way to express it.

That leftover gap is the part of reality outside the model’s language. In linear algebra, it lives in the left null space.
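Here "training" the linear model is just least squares, so we can watch the leftover gap appear. A NumPy sketch with the numbers from above:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
b = np.array([5.0, 11.0, 20.0])

# "Training" here is least squares: find weights whose predictions
# b_hat = A @ w come as close to b as possible.
w, *_ = np.linalg.lstsq(A, b, rcond=None)
b_hat = A @ w
r = b - b_hat

# The residual is nonzero: b is not on the line the model can reach.
print(np.linalg.norm(r))  # ~2.31

# And it is invisible to the model: A^T r = 0, meaning r lies in the
# left null space of A. No weight update can shrink it.
print(A.T @ r)  # ~[0, 0]
```

The second check is the defining property: the residual of a least-squares fit is orthogonal to every column of A, which is exactly what "living in the left null space" means.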

Real life version

Imagine predicting salary using:

  • years of experience
  • months of experience

But:

months = 12 × years

Your model really only has one degree of freedom.

Now HR introduces a performance bonus.

Your training keeps running — but the error never disappears.

The model isn’t lazy. It literally has no way to represent “bonus”.

That missing expressiveness is the left null space.
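The same story in code. This is a toy sketch: the salary formula and the bonus pattern are invented purely for illustration:

```python
import numpy as np

years = np.array([1.0, 2.0, 3.0, 4.0])
months = 12 * years                      # perfectly collinear feature
X = np.column_stack([years, months])

# Salaries that depend only on experience: the model fits them exactly.
salary = 5.0 * years
w, *_ = np.linalg.lstsq(X, salary, rcond=None)
print(np.linalg.norm(salary - X @ w))    # ~0

# Now HR adds a bonus that does not depend on experience.
bonus = np.array([0.0, 2.0, 0.0, 2.0])   # hypothetical bonus pattern
w, *_ = np.linalg.lstsq(X, salary + bonus, rcond=None)
residual = (salary + bonus) - X @ w
print(np.linalg.norm(residual))          # stays > 0, however long you "train"
```

Adding more solver iterations, restarts, or regularization cannot help here; only a new feature (an actual bonus column) changes what the model can express.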

A calmer analogy

Fitting a straight ruler onto a curved road.

You slide it. Rotate it. Press harder.

Eventually nothing improves.

You didn’t stop optimizing — you reached the limit of what a straight line can do.

The remaining gap is the left null space.

Why this matters in ML

This explains moments where:

  • loss plateaus early
  • the model underfits no matter what you tune
  • adding epochs changes nothing
  • bigger models suddenly work

Training adjusts parameters, but architecture decides possibility. The left null space is where reality lives outside your model’s language.

The quiet takeaway

Null space means:

the weights can change without the model’s behavior changing.

Left null space means:

reality can change without the model being able to follow.

The optimizer stops when gradients vanish. But learning stops when you hit the left null space.

#LinearAlgebra #MathBehindAI #MachineLearning #AIFoundations #VectorSpaces #LeftNullSpace #IrreducibleError #ModelCapacity #LeastSquares #DataGeometry #LearningLimits #AIConcepts