Anyone who's done machine learning knows gradient descent. You have a function you're trying to minimize. You can't see the whole landscape. You can only feel the slope under your feet. So you take a step downhill. Check the slope again. Step again. Repeat a few million times and you end up in a valley. Not necessarily the deepest valley. Just the one you found by following the local slope from where you started.
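If you haven't written it out before, the loop is only a few lines. Here is a minimal sketch, assuming a made-up one-dimensional landscape `f` with two valleys; the function, the step size, and the starting points are all illustrative choices, not anything canonical.

```python
def f(x):
    # Hypothetical landscape with two valleys of different depth.
    return x**4 - 3 * x**2 + x


def slope(x):
    # Derivative of f: the only thing the walker can "feel".
    return 4 * x**3 - 6 * x + 1


def descend(x, step_size=0.01, steps=2000):
    for _ in range(steps):
        x -= step_size * slope(x)  # take a small step downhill
    return x


right = descend(2.0)   # settles in the shallower valley, near x = 1.13
left = descend(-2.0)   # settles in the deeper valley, near x = -1.30
print(right, f(right))
print(left, f(left))
```

Same rule, same landscape, different starting point, different valley. The walker that starts on the right never finds out that the left valley is deeper.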

Finding product-market fit works the same way. You have a product and a market. You can't see the whole landscape either. Nobody can. You only know what's happening with your current users, your current positioning, your current price. So you make a change. Check the signal. Revenue moved. Retention didn't. Try something else. Check again.

Every founder does this. Most don't think of it as optimization, but it is. You're searching for a minimum of a function you can't see, guided by noisy signals, one step at a time. And just like in gradient descent, the step size matters. Pivot every week and you never learn from where you are. Polish the same feature for six months and you're optimizing a dead end.
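The step size point is easy to see on the simplest possible landscape. A rough sketch, assuming a plain bowl `f(x) = x**2` and three arbitrary step sizes I picked to make the contrast obvious:

```python
def descend_bowl(step_size, start=10.0, steps=50):
    # f(x) = x**2, so the slope is 2 * x and the minimum sits at x = 0.
    x = start
    for _ in range(steps):
        x -= step_size * 2 * x
    return x


timid = descend_bowl(0.001)   # steps too small: still close to the start
steady = descend_bowl(0.1)    # reasonable steps: ends very close to 0
frantic = descend_bowl(1.1)   # steps too large: overshoots further every time
print(timid, steady, frantic)
```

The frantic walker is the weekly pivot: every step jumps past the valley and lands higher on the other side. The timid one is six months of polish: technically moving downhill, practically standing still.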


For gradient descent to work, the function needs to be differentiable, or at least close enough that a usable slope exists almost everywhere. You need to be able to feel the slope. In machine learning, people spend enormous effort making sure this property holds. Smooth loss functions, careful architectures, tricks to keep gradients flowing.

The startup function is barely differentiable. The signal is incredibly noisy. You change the pricing and nothing happens. You change it again and half your users leave. You tweak the onboarding and retention jumps, but you also changed the landing page that week so you don't know which one did it. There are flat regions where nothing you do seems to matter, and cliffs where everything moves at once. And the landscape itself keeps shifting under you. Markets change, competitors move, a pandemic hits.
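To get a feel for how bad a noisy slope is, here is a toy sketch. It assumes a hypothetical metric whose true value is `x**2` plus Gaussian noise, and it estimates the slope by comparing two nearby measurements. It only models measurement noise, not confounded changes or a shifting landscape, so real life is worse than this.

```python
import random


def measure(x, noise=1.0):
    # Hypothetical noisy metric: true value x**2 plus Gaussian noise.
    return x**2 + random.gauss(0.0, noise)


def estimated_slope(x, samples, h=0.1):
    # Central finite difference, averaging `samples` noisy readings per side.
    up = sum(measure(x + h) for _ in range(samples)) / samples
    down = sum(measure(x - h) for _ in range(samples)) / samples
    return (up - down) / (2 * h)


def wrong_direction_rate(samples, trials=1000):
    # The true slope at x = 1 is +2; count how often the estimate comes out negative.
    wrong = sum(estimated_slope(1.0, samples) < 0 for _ in range(trials))
    return wrong / trials


random.seed(0)
one_look = wrong_direction_rate(samples=1)
many_looks = wrong_direction_rate(samples=100)
print(one_look, many_looks)
```

With a single look per side, the estimate points the wrong way a large fraction of the time. Averaging many looks fixes that, but in a startup each look costs weeks, which is the whole problem.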

In ML, that's a nightmare scenario. In startups, that's Tuesday.


But this isn't just about startups. Zoom out and the same structure is everywhere.

A company is a function. Thousands of input variables: who you hire, what you build, how you price it, which market you target, how you talk about it. Thousands of output dimensions: revenue, retention, morale, reputation, speed, culture. The relationships between inputs and outputs are non-linear and non-obvious. Hiring a great engineer doesn't add one unit of output. It might change nothing, or it might unblock three teams and shift the whole trajectory. You won't know which until months later.

A person is a function too. What you work on, who you spend time with, what you read, how you sleep, what you say no to. The outputs are your career, your health, your relationships, your sense of whether any of it means anything. Same properties. Non-linear. Non-obvious. High-dimensional. You're adjusting inputs all the time, hoping the outputs move in the right direction.

And in every case, you're doing the same thing. Feeling for the slope. Taking a step. Checking what changed. Gradient descent on a function you can barely see, let alone understand.


Here's where it gets hard. In machine learning, you know your loss function. You defined it. You can compute it exactly, at every step, for every input. The whole system is built around this: you know what you're optimizing for, and you can measure it directly.

In real life, you almost never can. The thing you actually care about lags behind. Did we build the right product? Ask again in eighteen months. Was that a good hire? Give it a year. Is this the right strategy? You'll find out when the market moves.

You can't wait that long. So you pick proxies. Revenue instead of value created. Engagement instead of satisfaction. Quarterly growth instead of long-term compounding. And you optimize those, because they're what you can measure now.

This works until it doesn't. The proxy drifts from the truth, slowly, and the moment it stops meaning what you thought it meant is usually invisible. Because you're watching the proxy. Revenue is up but users are quietly leaving. Engagement is up but people hate the product and can't stop using it. The numbers look like convergence. The reality is divergence. You're descending beautifully into the wrong valley.
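A toy version of that drift, as a sketch under invented assumptions: `true_loss` stands in for the thing you care about and is best at `x = 1`, while `proxy_loss` stands in for the metric you can measure today and always rewards more `x`. Both functions are hypothetical; the point is only the shape of the two curves.

```python
def true_loss(x):
    # What you actually care about; best at x = 1 (hypothetical).
    return (x - 1.0) ** 2


def proxy_loss(x):
    # What you can measure today; it always rewards more x.
    return -x


x, step_size = 0.0, 0.1
history = []
for _ in range(30):
    x -= step_size * (-1.0)  # the slope of proxy_loss is -1 everywhere
    history.append((proxy_loss(x), true_loss(x)))

best_true = min(t for _, t in history)
final_proxy, final_true = history[-1]
print(best_true, final_proxy, final_true)
```

For the first ten steps the two losses fall together, so the proxy looks trustworthy. After that the proxy keeps improving at exactly the same rate while the true loss climbs, and nothing in the proxy's own numbers marks the step where they parted ways.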


The gradient descent analogy is useful but it flatters the process. It makes finding product-market fit sound systematic. Feel the slope, take a step, converge. In practice, half the landscape is dark. The signals are lagging. The proxies are lying. And sometimes the ground shifts between one step and the next through no fault of your own.

There is no playbook. If there were, everyone would follow it and PMF would be an engineering problem. It's not. Luck is a real input variable and anyone who's found PMF and tells you otherwise is either lying or didn't notice.

But luck is not the whole story either. Great artists are luckier than average. They are also more prolific, more disciplined, and more systematic about their process than most engineers. They produce volume. They iterate fast. They throw away more than they keep. They have taste about what to keep and what to kill. And they've built that taste through years of paying attention to what works and what doesn't.

Finding product-market fit is the same kind of work. Part engineering, part art, part luck. More similar to making something beautiful than most founders would admit. And more similar to gradient descent than most artists would admit.