Design of Experiments: Labs Confront the Challenge of Capturing Experimental Intent
At the core of every chemical synthesis and formulation improvement is the systematic testing and optimization of reaction and formulation parameters through a series of experiments. These experiments are not random; rather, they are structured attempts to understand how different reaction factors influence outcomes such as yield, purity, stability or efficacy.
And yet, despite this structure, a familiar frustration often persists in the lab: results that don’t quite add up, patterns that remain just out of reach, and iterations that feel necessary but not always illuminating.
Traditionally, experimental design has relied heavily on scientific expertise and domain intuition, leading to incremental trial and error: altering one variable at a time (OVAT) and studying its impact on a reaction. While this approach has led to many breakthroughs, it can be slow and inefficient, especially when more than three factors are involved. More importantly, it risks overlooking how variables interact, which is often where the most critical insights lie.
This led to the emergence of Design of Experiments (DoE), a structured methodology that helps scientists learn more from fewer experiments.

Figure 1: Decision workflow for reaction optimization Design of Experiments adapted from Wall et al. (2025).
What is Design of Experiments (DoE)?
Design of Experiments is a branch of applied statistics that helps to systematically plan experiments so that multiple variables can be studied simultaneously. DoE represents a shift from ad hoc experimentation to intentional experimental design by prompting the following questions:
- Which factors might matter?
- What ranges should we test?
- How can we combine them efficiently to understand both individual effects and interactions?
This shift is subtle, but powerful — it moves experimentation from sequential exploration to structured discovery.
Let’s consider a simple hypothetical study to identify conditions that improve the yield of a reaction. Based on prior knowledge and scientific domain expertise, a few factors would be identified as critical:
- Temperature: Low / High
- Reaction Time: Short / Long
- Catalyst Amount: Low / High
While other variables may exist, they are excluded as unlikely to influence yield based on previous knowledge and understanding. This highlights that DoE can be used as a powerful tool to improve efficiency and accelerate experimentation, but not as a substitute for scientific judgement.
At the same time, it also introduces an important caveat: the quality of a DoE is only as good as the assumptions that guide it.
If a lab employs DoE before studying all key variables, the result can be misleading. For example, if humidity wasn’t recognized as a meaningful variable before designing the DoE, an unexpected yield at a particular factor combination might be misinterpreted. The deviation could simply be due to uncontrolled ambient humidity rather than the factors under study.
The hypothetical DoE matrix for this example study is shown below:
| Experimental Condition | Temperature (°C) | Reaction Time (min) | Catalyst (mol %) |
|---|---|---|---|
| 1 | 60 | 30 | 0.5 |
| 2 | 60 | 30 | 2.0 |
| 3 | 70 | 60 | 1.25 |
| 4 | 60 | 90 | 0.5 |
| 5 | 60 | 90 | 2.0 |
| 6 | 70 | 60 | 1.25 |
| 7 | 80 | 30 | 0.5 |
| 8 | 80 | 30 | 2.0 |
| 9 | 80 | 90 | 0.5 |
| 10 | 80 | 90 | 2.0 |
| 11 | 70 | 60 | 1.25 |
| 12 | 70 | 60 | 1.25 |
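
As a rough illustration, the matrix above can be reproduced as a two-level full factorial (eight corner runs) plus four center points. The sketch below uses only the Python standard library; the function name `full_factorial_with_centers` and the dictionary keys are hypothetical names chosen for this example, and the runs come out in a different order than in the table (in practice, run order is usually randomized anyway).

```python
from itertools import product

# Low/high levels taken from the matrix above.
# Key names are illustrative, not part of any standard.
levels = {
    "temperature_c": (60, 80),
    "time_min": (30, 90),
    "catalyst_mol_pct": (0.5, 2.0),
}


def full_factorial_with_centers(levels, n_center):
    """Return all low/high combinations plus n_center center-point runs."""
    names = list(levels)
    # Every combination of low and high across the factors (2^k corner runs).
    corners = [dict(zip(names, combo)) for combo in product(*levels.values())]
    # The center point is the midpoint of every factor simultaneously.
    center = {name: (low + high) / 2 for name, (low, high) in levels.items()}
    return corners + [dict(center) for _ in range(n_center)]


design = full_factorial_with_centers(levels, n_center=4)
for run_number, run in enumerate(design, start=1):
    print(run_number, run)
```

With three factors this yields 8 corner runs and 4 center runs, 12 in total, matching the number of rows in the table.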
A DoE can reveal many things about an experimental system. One of them is whether observed variation in results reflects true chemical effects or simply the natural variability of the system, and the tool for answering that is the center point replicate.
In practice, this answers a deceptively simple but critical question: are we observing a real signal, or just noise?
As seen above, each factor has a defined range in a designed experiment. The center point is simply the midpoint of every factor simultaneously: in this case, 70°C, 60 minutes, and 1.25 mol% catalyst. Runs 3, 6, 11, and 12 are center point replicates, which help in understanding the variability of the experiment.
The repetition is deliberate. When a scientist runs the same conditions three times and obtains yields of 74%, 76%, and 75%, the results suggest that the experimental system has roughly one percentage point of natural variability. This matters enormously because without that baseline, it is difficult to assert whether a factor actually affects yield or whether the effect is just noise.
What may appear as an improvement — or degradation — can sometimes simply be randomness in disguise.
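The noise baseline in the example above can be put into numbers with a few lines of standard-library Python. This is a minimal sketch: the yields are the three values quoted in the text, while the helper `exceeds_noise` and its threshold of two standard deviations are assumptions made for illustration, a rough screening rule rather than a formal significance test.

```python
from statistics import mean, stdev

# Center point yields (%) from the example in the text.
center_yields = [74.0, 76.0, 75.0]

center_mean = mean(center_yields)   # average yield at the center point
noise_sd = stdev(center_yields)     # sample standard deviation of the replicates


def exceeds_noise(observed_difference, noise_sd, k=2.0):
    """Rough screen: is a yield difference larger than k standard deviations?

    The default k=2.0 is an illustrative assumption, not a statistical test.
    """
    return abs(observed_difference) > k * noise_sd


print(f"center mean = {center_mean:.1f}%, noise SD = {noise_sd:.1f} points")
print(exceeds_noise(1.5, noise_sd))  # a 1.5-point change sits within the noise
print(exceeds_noise(6.0, noise_sd))  # a 6-point change stands out from it
```

A proper analysis would fit a model to all runs and use the replicate variance as the error estimate; the point here is only that the replicates give every other comparison a yardstick.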
As such, center point replicates also give auditors and reviewers evidence that results are reproducible when they appear in downstream reports.
Why Design of Experiments?
Design of Experiments offers multiple benefits in the scientific realm. Within the context of chemical formulations and synthesis reactions, let’s focus on some of the benefits to understand its importance.
Beyond efficiency, DoE fundamentally improves the quality of decisions derived from experimental data. It enables scientists to surface interactions, explore experimental spaces more effectively, and draw conclusions with greater confidence.
But this raises a less obvious question: What happens to that insight once the experiment is over?
DoE doesn’t just change how experiments are run — it changes what needs to be remembered.
Evolving DoE design through Cross-Reaction Pattern Recognition
DoE studies are often designed by utilizing existing knowledge to decide which factors to include, what ranges to explore, and how many runs to allocate.
If yield consistently plateaus above 75°C across multiple studies for a particular compound class, the next design can narrow the temperature range for finer resolution. If reaction time shows negligible impact across multiple studies, runs are freed up to introduce a new variable such as solvent ratio.
Each of these becomes a more informed design decision — not just based on intuition, but on accumulated evidence.
This is only possible if DoE data is captured in a structured manner that makes it easily accessible and comparable across experiments — and crucially, if the intent of those experiments is preserved alongside the results.
Predictive Modelling and Advanced Analytics
Every data point captured in a DoE study carries meaning beyond the individual experiment. Collectively, these structured data points reveal trends, boundaries, and relationships that single experiments cannot.
However, the value of this data depends not just on what was recorded, but how completely the experimental context was captured.
This is where a well-thought-out, DoE-integrated data capture design in ELN/LIMS systems becomes critical. It is not an add-on, but the natural continuation of DoE thinking.
When this structured data accumulates over time, it becomes the foundation upon which predictive models and advanced analytics are built.
A machine learning model trained on hundreds of structured DoE datasets from an ELN could estimate the probability of a reaction succeeding before a single additional experiment is run.
Instead of starting from a blank slate, scientists can begin with direction.
Integrating DoE Thinking into ELNs and LIMS
This gap — between what was observed and what was intended in an experiment — becomes especially visible in how experimental data is captured in ELNs and LIMS.
Minimum viable product requirements are often based on data capture needs such as recording recipes, inputs, outputs, and associated calculations. This structure works well for documenting what went into each batch, but it completely misses the experimental considerations that led to the formulation recipe.
In effect, the results are remembered, but the reasoning that led to them is not.
Not every component in a formulation is a variable under study, and not every variable is a formulation component.
Temperature, reaction time, and catalyst loading could be factors a scientist intentionally varies to understand how they influence yield. A standard reaction record can tell you that run 1 was executed at 60°C, 30 minutes, and 0.5 mol% catalyst.
What this record doesn’t specify is that these three factors were chosen as DoE factors, that 60°C was the low level of a deliberately set temperature range, or that runs 1 and 7 were placed at opposite ends of that range in order to estimate the temperature effect on yield.
This DoE awareness can be factored into an ELN/LIMS by introducing a separate definition table and a design run table that sit between the formulation, process parameters, and results. These tables explicitly define which variables are being studied, what ranges or levels they take, and how they are combined across planned runs.
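One possible shape for those two tables is sketched below with Python dataclasses. Everything here is hypothetical: the class names `FactorDefinition` and `DesignRun`, their fields, and the `role` labels are assumptions made for illustration, not the schema of any particular ELN or LIMS. The factor ranges and the settings for runs 1 and 7 come from the example matrix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactorDefinition:
    """One row of a hypothetical factor definition table."""
    name: str
    unit: str
    low: float
    high: float

    @property
    def center(self) -> float:
        return (self.low + self.high) / 2

    def coded(self, value: float) -> float:
        """Map an actual setting to coded units: -1 low, 0 center, +1 high."""
        half_range = (self.high - self.low) / 2
        return (value - self.center) / half_range


@dataclass(frozen=True)
class DesignRun:
    """One row of a hypothetical design run table."""
    run_id: int
    settings: dict[str, float]  # factor name -> actual value
    role: str                   # e.g. "factorial" or "center" (illustrative labels)


factors = {
    "temperature": FactorDefinition("temperature", "°C", 60, 80),
    "time": FactorDefinition("time", "min", 30, 90),
    "catalyst": FactorDefinition("catalyst", "mol %", 0.5, 2.0),
}

run_1 = DesignRun(1, {"temperature": 60, "time": 30, "catalyst": 0.5}, "factorial")
run_7 = DesignRun(7, {"temperature": 80, "time": 30, "catalyst": 0.5}, "factorial")


def coded_settings(run, factors):
    """Express a run's settings relative to the declared factor ranges."""
    return {name: factors[name].coded(value) for name, value in run.settings.items()}


print(coded_settings(run_1, factors))  # temperature at its low level
print(coded_settings(run_7, factors))  # temperature at its high level
```

Stored this way, the record says not only that run 1 was executed at 60°C, but that 60°C was the low end of a declared range and that run 7 is its high-temperature counterpart with the other two factors unchanged.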
This is what makes the difference between documenting experiments and learning from them.
This structure could, in turn, be leveraged by LLMs or other AI models to uncover hidden patterns and generate meaningful recommendations for scientists starting on new projects.
Instead of a scientist starting from a blank page, an AI-enabled system might be able to open a new ELN entry and immediately suggest:
“Based on similar crystallization steps, a fractional factorial design with these four factors is likely to give you the most insight,”
or
“For this type of tablet formulation, a mixture DoE with these excipient constraints has historically produced robust design spaces.”
Capturing What Experiments Actually Teach Us
Introducing DoE awareness into your ELN/LIMS system can not only elevate your data quality, but also lay a foundation for a smarter, AI-augmented scientific workflow.
More importantly, it shifts experimentation from a series of isolated trials to a connected system of learning — where every experiment not only answers a question but also contributes to a growing body of reusable knowledge.
And that is the real promise of DoE: not just better experiments, but experiments whose value extends far beyond a single run.
Where Experimental Design Meets Institutional Memory
As DoE adoption grows, the conversation is naturally shifting from how experiments are designed to how their insights are preserved.
Designing better experiments is only part of the equation. The real value lies in ensuring that the intent behind those experiments — the factors considered, the ranges explored, and the structure of the design — is captured in a way that can be reused and built upon.
In practice, however, this is not a trivial shift. Most ELN and LIMS implementations are built around recording experimental inputs and outputs, not the reasoning that shaped them. Retrofitting these systems to capture DoE intent requires not just changes in data models, but also alignment across scientists, workflows, and organizational priorities. What to capture, how consistently to capture it, and how much structure is “enough” are decisions that rarely have straightforward answers.
This raises a few important questions for modern labs:
- At what point does additional structure begin to add more value than friction?
- How can experimental intent be captured without slowing down the scientist at the bench?
- And how do organizations ensure that this data remains usable — not just stored — as systems and teams evolve?
When this intent is lost, experiments remain isolated outcomes. When it is preserved, they become part of a larger system of learning.
In that sense, DoE is not just a methodology for improving individual studies — it is a foundation for creating experiments that continue to generate value beyond a single run. And when thoughtfully integrated with ELNs and LIMS, it enables a shift from experimentation as a sequence of trials to experimentation as an accumulating, compounding source of insight.
Because the real advantage isn’t just in running better experiments — it’s in remembering them well enough to not have to start from scratch.
