Better Assay Sequencing May Matter More Than Better Affinity Prediction

By Blaise AI Team

There is a familiar kind of ambition in molecular ML.

Make the affinity model a little better. Squeeze out another gain in correlation. Improve ranking accuracy by a few points. Publish the comparison table.

None of that is pointless.

But in a lot of live projects, a bigger lever sits somewhere else entirely: which assays get run, in what order, on which compounds.

That is assay sequencing.

It often matters more than people admit.

A project does not run one assay in isolation

Real programmes move through a sequence of measurements: potency, selectivity, permeability, solubility, microsomal stability, binding panels, PK, and eventually in vivo studies. The order in which those questions get asked changes what the team learns, how much it spends, and which compounds keep moving.

That means the decision problem is not simply “predict affinity better”.

It is also “which measurement buys the most useful next update right now?”

Those are different problems. The second one is often closer to the operational bottleneck.

Bad sequencing can hide the truth even when the model is decent

Suppose you only push compounds forward after a potency cutoff. Now your ADME understanding becomes heavily biased toward survivors. Suppose you delay a cheap permeability readout until after several chemistry rounds. Now you may spend weeks sharpening a series whose scaffold was never going to achieve adequate exposure. Suppose you run an expensive assay because it is next in the ritual order rather than because it will change a decision. Now you have paid for a result that moves nothing.

All of those are sequencing failures.

They can slow a project more than a modest shortfall in affinity accuracy ever would.

A small modelling gain on the wrong assay can be strategically irrelevant

This is the uncomfortable part.

The field can get very excited about improving prediction quality inside one assay regime while ignoring whether that assay is even the main bottleneck at the stage where decisions are being made.

If the current project is really struggling with assay ordering, liability triage, or deciding which sparse measurements to buy next, then a marginally better affinity model may not change behaviour much. It improves one local estimate inside a workflow whose larger logic remains badly optimised.

That does not make affinity modelling useless.

It just means the biggest source of project acceleration may sit elsewhere.

Sequencing determines both spend and learning rate

This is why assay order should be treated as a first-class design variable.

Run the wrong assay too early and you burn budget before narrowing the right uncertainty. Run the wrong assay too late and you spend synthesis cycles on chemistry that should already have been deprioritised. Sequence a cheap proxy early and you may save an entire month of work. Sequence a rich assay on a deliberately chosen borderline compound and you may sharpen the next ten design decisions, not just the one in front of you.
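To make the spend side concrete, here is a minimal sketch of how order alone changes expected cost per compound. It assumes a simple stop-on-failure cascade with independent pass rates, which real assays rarely satisfy exactly. The assay names, costs, and pass probabilities are hypothetical numbers chosen for illustration, not data from any project.

```python
from itertools import permutations

# Hypothetical assays: (name, cost per compound, probability a compound passes).
# All numbers are illustrative assumptions.
ASSAYS = [
    ("potency", 50.0, 0.30),
    ("permeability", 10.0, 0.50),
    ("microsomal_stability", 200.0, 0.60),
]


def expected_cost(order):
    """Expected spend per compound when a failed assay stops further testing.

    Assumes pass/fail outcomes are independent across assays.
    """
    total = 0.0
    p_reach = 1.0  # probability the compound is still alive at this step
    for _name, cost, p_pass in order:
        total += p_reach * cost
        p_reach *= p_pass
    return total


def best_order(assays):
    """Brute-force the cheapest ordering; fine for a handful of assays."""
    return min(permutations(assays), key=expected_cost)


if __name__ == "__main__":
    print("ritual order:", round(expected_cost(ASSAYS), 2))
    cheapest = best_order(ASSAYS)
    print("cheapest order:", [name for name, _, _ in cheapest])
    print("cheapest cost:", round(expected_cost(cheapest), 2))
```

Under these assumed numbers, moving the cheap permeability readout ahead of potency lowers expected spend without changing which compounds survive all three filters. The sketch only captures cost. It says nothing about what each ordering teaches you, which is the other half of the argument.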

Assay sequencing is therefore not a logistics detail.

It is part of the intelligence of the project.

The order of questions shapes the dataset too

This also matters because sequencing changes what your future models get to learn from.

If you always measure only the obvious winners, your models become experts on the chemistry that survived your process. They become weaker on the boundary cases, the failures, the rescued directions, and the compounds that would have revealed the trade-offs earlier.
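A toy simulation shows how this censoring works. It is a sketch under loud assumptions: potency and permeability are drawn from made-up Gaussian distributions, the negative coupling between them is hypothetical, and permeability is only "measured" for compounds that clear an arbitrary potency cutoff.

```python
import random
from statistics import mean


def simulate(n=5000, cutoff=1.0, seed=0):
    """Compare permeability across all compounds vs. only potency survivors.

    All distributions and the potency/permeability trade-off are assumptions.
    """
    rng = random.Random(seed)
    everyone, survivors = [], []
    for _ in range(n):
        potency = rng.gauss(0.0, 1.0)
        # Assumed trade-off: more potent compounds tend to be less permeable.
        permeability = -0.6 * potency + rng.gauss(0.0, 0.8)
        everyone.append(permeability)
        if potency > cutoff:
            # Only survivors of the potency gate ever get a permeability readout.
            survivors.append(permeability)
    return mean(everyone), mean(survivors), len(survivors)


if __name__ == "__main__":
    pop_mean, measured_mean, n_measured = simulate()
    print("true mean permeability:", round(pop_mean, 2))
    print("mean among measured compounds:", round(measured_mean, 2))
    print("compounds measured:", n_measured)
```

In this setup the measured subset is a small, shifted slice of the chemistry. A model trained only on those readouts sees a narrow potency range and never observes the compounds that would have exposed the trade-off.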

Better sequencing can therefore improve not only the current cycle but the quality of the local training signal the project accumulates over time.

That is a compounding effect.

The useful system should help choose assays, not only score molecules

A serious AI system in this setting should be able to do more than rank compounds by one endpoint. It should help answer questions like these.

What is the next cheapest measurement that would materially change the decision? Which compound should get a richer panel because the result would generalise across the series? Where are we blind because the current ordering censors too much chemistry too early?
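The first of those questions can be sketched as a value-of-information calculation. This is one possible framing, not a description of any particular system: it assumes a binary advance-or-drop decision, a binary assay readout summarised by sensitivity and specificity, and hypothetical payoffs and costs in arbitrary units.

```python
def decision_value(p_good, gain, loss):
    """Best expected payoff now: advance (risking `loss`) or drop (payoff 0)."""
    return max(0.0, p_good * gain - (1.0 - p_good) * loss)


def expected_value_of_assay(p_good, sens, spec, gain, loss):
    """Expected improvement in the decision from seeing one binary readout."""
    value_now = decision_value(p_good, gain, loss)
    value_after = 0.0
    # Each tuple: P(readout | good compound), P(readout | bad compound).
    for hit_if_good, hit_if_bad in ((sens, 1.0 - spec), (1.0 - sens, spec)):
        joint_good = p_good * hit_if_good
        joint_bad = (1.0 - p_good) * hit_if_bad
        # Take the best action given this readout, weighted by its probability.
        value_after += max(0.0, joint_good * gain - joint_bad * loss)
    return value_after - value_now


def rank_assays(p_good, gain, loss, candidates):
    """Sort hypothetical (name, sens, spec, cost) assays by net expected value."""
    scored = [
        (expected_value_of_assay(p_good, sens, spec, gain, loss) - cost, name)
        for name, sens, spec, cost in candidates
    ]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical candidates: (name, sensitivity, specificity, cost).
    candidates = [("cheap_proxy", 0.8, 0.8, 5.0), ("rich_panel", 0.95, 0.95, 60.0)]
    for net, name in rank_assays(0.5, 100.0, 100.0, candidates):
        print(name, round(net, 2))
```

Two behaviours are worth noticing. An assay whose result cannot flip the decision scores zero however accurate it is, which is the formal version of running something only because it is next in the ritual order. And a cheaper, noisier proxy can outrank a richer panel once cost is subtracted.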

That is much closer to project value than one more round of point-estimate worship on a static benchmark.

The field likes elegant subproblems because they are easier to publish

Affinity prediction is a clean modelling story. Assay sequencing is messier. It requires economics, workflow knowledge, uncertainty, and an honest view of how teams actually move compounds through the system.

That messiness is exactly why it matters.

Projects are not lost because somebody failed to improve a retrospective metric by 0.03. They are often lost because the wrong questions got asked in the wrong order for too long.

Better assay sequencing will not sound as glamorous as a new molecular foundation model.

It may still move programmes faster.