Binding Affinity Is Not a Context-Free Label

By Blaise AI Team

One of the stranger habits in molecular ML is treating binding affinity as a clean scalar function of ligand plus protein sequence.

As if the reported number exists independently of everything around it.

In real drug discovery, binding is measured inside an experimental setup shaped by substrate or cofactor concentration, competitive context, protein construct, buffer conditions, incubation time, assay format, and readout artefacts. The observed affinity is not just a property of the molecule and the protein. It is a property of the molecule, protein, and assay context together.

That distinction matters because it changes what a useful model should aim to predict.

The question projects actually ask

Medicinal chemistry programmes are usually not asking: What is the universal true affinity of this ligand-protein pair?

They are asking: Within this assay and this series, which compound is better, and what should we make next?

That is a ranking problem far more often than it is an absolute-value problem. The team has a specific assay, a specific protein construct, a specific set of conditions. The compounds being compared all went through the same experimental regime. What matters is relative ordering inside that regime, not the recovery of some context-free ground truth.

The finance analogy is useful here

Useful financial models usually do not try to predict the absolute stock price directly. They model returns, excess returns, or relative moves conditional on market context. Absolute price is entangled with a huge amount of background state that makes direct prediction unreliable and hard to act on.

Binding data are similar. A pIC50, Ki, or Kd is not a context-free truth handed down from nature. It is a measurement produced under a particular experimental regime. The reported number carries the fingerprint of the assay as much as it carries information about the molecule.

That is why relative ordering within assay is often more meaningful than pretending there is a single universal affinity value to recover from sequence and structure alone.

What the useful model actually does

For real discovery work, what matters is usually more specific than headline accuracy on a heterogeneous benchmark. Can the model rank analogues correctly within this series? Can it identify which substitution improved potency in this assay? Can it help the team choose the next compounds to make?

Those questions are local, conditional, and decision-driven. They are also the questions that determine whether the model changes behaviour on a project or just publishes well.

In molecular ML, we often reward models for claiming universality — broad coverage, mixed datasets, cross-target transfer. That sounds impressive. But medicinal chemistry is local. The assay context shapes the measurement. The series constrains the question. The decision sits inside a specific experimental regime with its own biases and quirks.

The useful model is not necessarily the one that predicts a supposedly absolute affinity best across heterogeneous datasets. It may be the one that understands context well enough to rank compounds correctly where decisions are actually being made.

You Might Also Like