Stop Promoting Compounds, Start Buying Information
Rigid assay cascades are a convenience. Discovery lives on information, and your tooling should treat every assay as a purchase with a price and a payoff, not a rung on a ladder.
Assay cascades are one of those things everyone accepts because they’re familiar.
You start with cheap screens, then “promote” compounds through increasingly expensive assays: potency → selectivity → ADME → safety → in vivo.
It feels disciplined.
It’s also increasingly outdated — because it assumes the only sensible strategy is dense measurement on a shrinking set of winners.
AI makes a different strategy viable: sparse measurement on a broader set + targeted triage on what’s actually worth learning from.
The classic cascade was designed around two constraints: assays were expensive and slow, and models were too weak to predict what you hadn't measured.
So you used rigid rules to ration assays: only advance if potency clears a cutoff, only run PK if ADME passes, only collect extra endpoints on the top N compounds.
That works if your goal is to avoid wasting money.
But it quietly wastes something far more valuable: information.
When you run a cascade, you systematically don’t measure properties on compounds that fail early.
That creates two problems: your data only ever covers the survivors, and the selection rule itself gets baked into everything you train downstream.
Example: you never learn how permeability behaves across a scaffold family, because you only measured permeability on the handful that happened to be potent.
So your ADME models become “models of the survivors”, not models of chemistry.
If ADME is only measured conditional on potency passing a threshold, then the potency gate becomes a selection bias baked into the dataset.
Your model learns patterns like “potent-looking molecules tend to have decent ADME” — not because it’s mechanistically true, but because that’s what you chose to measure.
Your uncertainty estimates become meaningless too, because the training distribution is censored.
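A toy simulation makes the damage concrete. Everything below is invented for illustration: permeability depends non-linearly on a lipophilicity-like descriptor, potency rises with the same descriptor, and permeability is only recorded for compounds that clear the potency gate. A model fitted to the censored set looks fine on the survivors and falls apart everywhere else:

```python
# Toy illustration of "models of the survivors" (all numbers invented).
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(0.0, 1.0, n)                       # a logP-like descriptor
potency = 1.5 * x + rng.normal(0.0, 1.0, n)       # potency rises with x
perm = -(x - 0.5) ** 2 + rng.normal(0.0, 0.3, n)  # true permeability: inverted U

measured = potency > 1.0   # the cascade: only potency survivors get ADME data

# Fit a straight line to the censored data, as a stand-in for any model.
slope, intercept = np.polyfit(x[measured], perm[measured], 1)
pred = slope * x + intercept

rmse_survivors = float(np.sqrt(np.mean((pred[measured] - perm[measured]) ** 2)))
rmse_unmeasured = float(np.sqrt(np.mean((pred[~measured] - perm[~measured]) ** 2)))
print(f"RMSE on measured survivors:   {rmse_survivors:.2f}")   # looks acceptable
print(f"RMSE on the unmeasured space: {rmse_unmeasured:.2f}")  # much worse
```

The straight line is a stand-in for any model you like; the failure mode lives in the data, not the architecture.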
Instead of a rigid ladder, you treat assays as a portfolio allocation problem:
What is the next best piece of data to buy, given cost, latency, and value-of-information?
In practice, AI triage means you measure a thin slice of multiple endpoints early — potency plus one or two cheap ADME proxies plus basic physchem — then selectively “zoom in” with expensive assays only for compounds where the additional data will actually change a decision.
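As a control-flow sketch, that strategy fits in a page. Nothing here is a real API: the assay names, costs, threshold, and the `evoi` scorer are placeholders, with random numbers standing in for measurements and value-of-information estimates (the scoring itself is formalised below):

```python
# Sketch of sparse-then-zoom triage; all names, costs, and scores are invented.
from dataclasses import dataclass, field
import random

CHEAP_PANEL = {"potency": 1, "solubility_proxy": 1, "logD": 1}  # cost units
EXPENSIVE = {"microsomal_stability": 20, "rat_pk": 100}

@dataclass
class Compound:
    name: str
    results: dict = field(default_factory=dict)

def run_assay(compound: Compound, assay: str) -> float:
    return random.random()  # stand-in for the real measurement

def evoi(compound: Compound, assay: str) -> float:
    # Placeholder: a real scorer would estimate how likely this readout
    # is to flip the advance/park/kill decision (see the EVOI sketch below).
    return random.random()

def triage(compounds: list, budget: float, threshold: float = 0.4) -> float:
    # Thin slice: every compound gets the cheap panel.
    for c in compounds:
        for assay, cost in CHEAP_PANEL.items():
            c.results[assay] = run_assay(c, assay)
            budget -= cost
    # Zoom in: expensive assays ranked by value of information per cost.
    scored = sorted(
        ((evoi(c, a), c, a) for a in EXPENSIVE for c in compounds),
        key=lambda t: t[0] / EXPENSIVE[t[2]],
        reverse=True,
    )
    for value, c, assay in scored:
        if value < threshold:          # result would not move any decision
            continue
        if budget < EXPENSIVE[assay]:  # cannot afford it this round
            continue
        c.results[assay] = run_assay(c, assay)
        budget -= EXPENSIVE[assay]
    return budget

leftover = triage([Compound(f"CMPD-{i}") for i in range(8)], budget=300.0)
```

The point is the shape: cheap panel on everyone, expensive assays ranked and gated by what the result might change.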
This gives you two wins at once: you stop wasting money, and you stop starving your models of information.
Most cascades have huge implicit spend because advancement is rule-based, not value-based.
AI triage stops the dumb spend — no more repeated expensive assays on compounds whose fate is already determined, no more automatic “standard panel” testing just because “that’s what we do next”.
You only pay for assays when they buy a decision.
The counterintuitive move is this:
Assay some compounds that fail potency.
Not to rescue them — to learn.
Because the compounds that “fail early” are usually chemically diverse, closer to the decision boundary, and the ones that teach you what doesn’t work.
If you never collect ADME/selectivity on failures, your models miss activity cliffs, liability cliffs, series-specific tradeoffs, and where the real constraints actually are.
A small, strategic budget of “assay the failures” massively improves the model’s grasp of the design space.
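One way to spend that budget, sketched below, is a greedy max-min pick over the failures so the extra measurements spread across chemotypes rather than clustering around one series. The random vectors stand in for whatever fingerprints or descriptors you actually use:

```python
# Sketch: choose a diverse slice of potency failures for follow-up assays.
import numpy as np

def pick_diverse_failures(descriptors: np.ndarray, n_pick: int) -> list:
    """Greedy max-min selection: each pick is the failure farthest from
    everything chosen so far, so coverage spreads across chemotypes."""
    chosen = [0]  # seed with an arbitrary failure
    dists = np.linalg.norm(descriptors - descriptors[0], axis=1)
    while len(chosen) < n_pick:
        nxt = int(np.argmax(dists))  # farthest from the chosen set
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(descriptors - descriptors[nxt], axis=1))
    return chosen

rng = np.random.default_rng(1)
failed = rng.normal(size=(200, 64))   # 200 potency failures, 64-dim mock descriptors
follow_up = pick_diverse_failures(failed, n_pick=12)  # ~6% still get full endpoints
```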
Every assay has a cost, a latency, a noise profile, and a decision impact. You can roll those into one score: expected value of information per unit cost.
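Worked through for the simplest possible case, a binary advance/kill decision with a noisy binary assay, it looks like this. All utilities, error rates, and costs are invented; the shape of the calculation is the point:

```python
# Toy expected-value-of-information score for one assay on one compound.
def evoi_per_cost(p_good, sens, spec, v_good, v_bad, assay_cost):
    # Best action with no new data: advance iff expected value is positive.
    ev_no_data = max(0.0, p_good * v_good + (1 - p_good) * v_bad)

    # With the assay: marginalise over the two readouts, update p_good by
    # Bayes' rule, and act optimally in each branch.
    p_pos = p_good * sens + (1 - p_good) * (1 - spec)
    branches = (
        (p_pos, p_good * sens / p_pos),
        (1 - p_pos, p_good * (1 - sens) / (1 - p_pos)),
    )
    ev_with_data = sum(
        p * max(0.0, q * v_good + (1 - q) * v_bad) for p, q in branches
    )
    return (ev_with_data - ev_no_data) / assay_cost

# Borderline compound: the assay is worth buying.
print(evoi_per_cost(0.50, 0.9, 0.9, v_good=100, v_bad=-80, assay_cost=5))  # ~6.2
# Fate already determined: the same assay buys nothing.
print(evoi_per_cost(0.05, 0.9, 0.9, v_good=100, v_bad=-80, assay_cost=5))  # 0.0
```

Note the second call: for a compound whose fate is already determined, no readout changes the decision, so the assay is worth exactly nothing. That is the whole triage logic in one function.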
Instead of “potency first”, you run potency alongside a cheap permeability or solubility proxy, microsomal stability if it’s fast in your org, and a quick selectivity counter-screen if off-target risk is known.
Sparse doesn’t mean sloppy. It means intentional coverage.
This is the big shift. Before every expensive measurement, the model asks one question: will this result change what we do next? If not, the assay is skipped, whatever the cascade says.
You always reserve 10–20% budget for borderline potency compounds, diverse scaffolds, apparent outliers, and deliberate negatives. These become your anti-bias anchors.
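As a sketch, assuming each candidate already carries an EVOI-per-cost score from a step like the one above (the `Candidate` fields are hypothetical, and a shuffle crudely stands in for a real diversity pick):

```python
# Sketch of the budget split: most spend chases value per cost, but a fixed
# exploration slice is reserved for anti-bias anchors.
from dataclasses import dataclass
import random

@dataclass
class Candidate:
    name: str
    evoi_per_cost: float  # from a scoring step like the one above
    cost: float
    is_anchor: bool       # borderline, outlier, or deliberate negative

def spend(ordered, budget):
    picked = []
    for c in ordered:
        if c.cost <= budget:
            picked.append(c)
            budget -= c.cost
    return picked

def allocate(cands, total_budget, explore_frac=0.15):
    explore_budget = explore_frac * total_budget
    by_value = sorted(
        (c for c in cands if not c.is_anchor),
        key=lambda c: c.evoi_per_cost,
        reverse=True,
    )
    anchors = [c for c in cands if c.is_anchor]
    random.shuffle(anchors)  # crude stand-in for a proper diversity pick
    return (
        spend(by_value, total_budget - explore_budget),  # ~85%: highest value per cost
        spend(anchors, explore_budget),                  # ~15%: anti-bias anchors
    )

cands = [Candidate(f"CMPD-{i}", random.random(), 10.0, i % 5 == 0) for i in range(40)]
exploit, explore = allocate(cands, total_budget=300.0)
```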
Cascades optimize assay spend locally.
AI triage optimizes learning rate globally.
And the killer metric isn’t “how many compounds advanced”. It’s how quickly you mapped SAR and liabilities, how quickly you eliminated dead directions, and how confidently you can make the next design set smaller and better.
Rigid cascades made sense when data was expensive and models were weak.
Now they’re often just a habit — one that censors your dataset, biases your models toward survivors, and burns cash on expensive tests just because they’re next in the queue.
AI triage turns testing into a market. You buy an assay because the data is worth the price, not because it’s the next hoop to jump through.
You measure sparsely to see the space. Then you spend heavily only where the data buys a decision.
You cut the waste, and your models finally stop learning from a rigged dataset.