Most ADME ML Models Are Confidently Wrong
Most ADME ML models are confidently wrong more often than people admit.
The failure mode is predictable:
- Train on a big “global” dataset.
- Report flattering offline metrics.
- Deploy on molecules that sit outside the model’s applicability domain.
- Ship point predictions with the emotional confidence of a weather app.
Then the program hits reality: distribution shift is the rule, not the exception.
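Much of the “flattering offline metrics” step comes down to random splits that let near-neighbours of the test molecules leak into training. A minimal sketch of a scaffold-grouped split, assuming RDKit; the grouping heuristic and split fraction are illustrative, not a prescription:

```python
# Group molecules by Bemis-Murcko scaffold, then assign whole scaffolds to either
# train or test, so the test set is forced onto chemotypes the model has not seen.
from collections import defaultdict
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_split(smiles_list, test_fraction=0.2):
    """Scaffold-grouped train/test split (illustrative, not a full protocol)."""
    by_scaffold = defaultdict(list)
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is None:
            continue  # skip unparsable structures rather than crash
        by_scaffold[MurckoScaffold.MurckoScaffoldSmiles(mol=mol)].append(smi)

    train, test = [], []
    train_target = (1.0 - test_fraction) * len(smiles_list)
    # Largest scaffold families fill train first; a family never straddles the split.
    for family in sorted(by_scaffold.values(), key=len, reverse=True):
        (train if len(train) < train_target else test).extend(family)
    return train, test
```

Scores under a split like this are usually less flattering, and closer to what the program will actually experience.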
The hidden tax: false certainty erodes trust
Wet-lab teams don’t hate models because they’re imperfect. They hate models because they sound certain when they’re guessing.
In drug discovery, trust is cumulative. Every overconfident miss burns political capital:
- chemists stop listening
- project teams stop budgeting time for modelling
- the model becomes a checkbox instead of a decision tool
Why “global ADME models” break in practice
Small-molecule datasets are sparse, biased, and scaffold-skewed.
A solubility or clearance model trained on legacy chemotypes does not generalise to a novel scaffold just because you used a GNN and a million datapoints.
The real question is always:
How similar is this molecule to what the model actually learned?
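One concrete way to ask that question is a nearest-neighbour check against the training set. A minimal sketch, assuming RDKit, Morgan fingerprints, and Tanimoto similarity; the threshold and example molecules are illustrative, not how Blaise implements its applicability domain:

```python
# Nearest-neighbour similarity between a query molecule and the training set,
# as a crude applicability-domain check. Threshold and molecules are illustrative.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def morgan_fp(smiles, radius=2, n_bits=2048):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles}")
    return AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)

def nearest_neighbour_similarity(query_smiles, train_smiles):
    """Max Tanimoto similarity between the query and any training molecule."""
    train_fps = [morgan_fp(s) for s in train_smiles]
    return max(DataStructs.BulkTanimotoSimilarity(morgan_fp(query_smiles), train_fps))

SIM_THRESHOLD = 0.4  # illustrative; the right cut-off is model- and endpoint-specific

train_smiles = ["CCO", "c1ccccc1O", "CC(=O)Nc1ccc(O)cc1"]  # stand-in for the real training set
query_smiles = "FC(F)(F)c1ccc(CN2CCC(O)CC2)cc1"            # a hypothetical novel design

sim = nearest_neighbour_similarity(query_smiles, train_smiles)
if sim < SIM_THRESHOLD:
    print(f"nearest-neighbour similarity {sim:.2f}: outside the applicability domain, treat the prediction as a guess")
else:
    print(f"nearest-neighbour similarity {sim:.2f}: the prediction is at least grounded in similar chemistry")
```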
What Blaise does differently: uncertainty as a first-class output
At Blaise AI, we surface the things that matter in the room where decisions get made:
- what data a model was trained on
- how similar your molecule is to that data
- when a prediction is worth acting on
- when the right next step is wet-lab work, not more modelling
This is the difference between “an ML model exists” and “a model is usable in a live program.”
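As a sketch of the shape this takes, one option is to make every prediction travel with its context; the field names below are hypothetical, not Blaise’s actual schema:

```python
# Illustrative shape for a prediction that carries its own context.
# Field names are hypothetical, not Blaise's schema.
from dataclasses import dataclass
from typing import Tuple

@dataclass
class PropertyCall:
    endpoint: str                  # e.g. "kinetic solubility (uM)"
    value: float                   # the point prediction
    interval: Tuple[float, float]  # a calibrated range (ensemble or conformal)
    nn_similarity: float           # nearest-neighbour similarity to the training data
    training_data: str             # provenance of what the model actually learned from
    recommendation: str            # "act on it", "weak evidence", or "measure at the bench"

call = PropertyCall(
    endpoint="kinetic solubility (uM)",
    value=12.0,
    interval=(3.0, 55.0),
    nn_similarity=0.31,
    training_data="public solubility sets (illustrative)",
    recommendation="measure at the bench",
)
print(call.recommendation)
```

The point of the structure is simple: the number never travels alone.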
Why this matters even more for agentic design
We’re building a Small Molecule Agent: something that proposes designs, weighs evidence, and recommends actions.
An agent cannot be fed overconfident signals.
If the property model lies with confidence, the agent will:
- optimise the wrong objective
- converge on brittle chemotypes
- recommend work that looks rational on paper and fails at the bench
An agent needs calibrated opinions. It needs to distinguish “this is informative” from “this is guessing.”
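A minimal sketch of what that gating can look like, assuming an ensemble of property models and using the ensemble spread as a rough proxy for confidence; the threshold and names are illustrative:

```python
# Gate agent actions on ensemble spread, used here as a rough proxy for confidence.
# Thresholds and names are illustrative.
import numpy as np

def ensemble_opinion(predictions: np.ndarray, max_actionable_spread: float = 0.5):
    """Collapse per-model predictions for one molecule into a value, a spread, and a call."""
    mean, spread = predictions.mean(), predictions.std()
    if spread <= max_actionable_spread:
        action = "informative: use in scoring and ranking"
    else:
        action = "guessing: request a wet-lab measurement before committing"
    return mean, spread, action

# Five hypothetical model predictions (e.g. logS) for the same molecule
preds = np.array([-4.1, -4.3, -3.9, -4.2, -5.6])
mean, spread, action = ensemble_opinion(preds)
print(f"mean={mean:.2f}, spread={spread:.2f} -> {action}")
```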
The product stance: no incentive to oversell point predictions
Blaise doesn’t exist to impress users with a single predicted number.
Blaise exists to reason correctly about molecules.
That means down-weighting weak models, flagging uncertainty, and saying “we don’t know yet” before the bench forces the issue.
AI helps drug discovery only when it can admit uncertainty faster than the lab can expose it.