In a previous post, I talked about the importance of collecting implementation data and the benefits it brings to program evaluation. Today, as part of the Stats Series, I want to go back to that topic and take it one step further.

This post walks through six ways implementation data can be used in statistical analysis to help make sense of what is happening in the program.

Before diving in, a quick recap on what implementation data is. Simply put, implementation data is captured in any structured record that shows what was delivered, who received services, who provided the service, and how frequently. Implementation data can be found in session logs, attendance records, fidelity checklists, dosage counts, or observation rubrics, to mention a few examples. That is the raw material for everything that follows.

Read a prior post here.

Let us look at the different ways implementation data can be included in statistical analysis. These are drawn from my own work. You might find them applicable to your own line of work, or they might spark ideas for doing it differently.

1. For subgroup analysis

This is the most accessible entry point. Once you have implementation records, you can split participants into groups based on their level of exposure and compare outcomes between them. It can look like high implementers versus low implementers, students who attended at least a certain percentage of sessions versus those who attended less, or schools that delivered the program with fidelity versus those that did not. To identify cutoff thresholds, I typically look to the existing literature to see what has been used in similar studies.

Once you have a good cutoff point, you just need a clear definition of your groups and a comparison of results. If the high-implementation group consistently outperforms the low-implementation group, that is evidence that the program, when delivered as intended, produces results. It also tells you something important about where to focus support.
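To make the split concrete, here is a minimal sketch in Python. The records, the 80 percent cutoff, and the function name are all made up for illustration; in practice the cutoff would come from the literature and the records from your own attendance logs.

```python
from statistics import mean

# Hypothetical records: (sessions_attended, sessions_offered, outcome_score)
records = [
    (24, 25, 82), (22, 25, 78), (23, 25, 80), (21, 25, 76),
    (12, 25, 70), (9, 25, 66), (15, 25, 71), (5, 25, 65),
]

CUTOFF = 0.80  # assumed threshold, for illustration only


def split_by_exposure(rows, cutoff):
    """Return outcome lists for high- and low-exposure participants."""
    high, low = [], []
    for attended, offered, outcome in rows:
        group = high if attended / offered >= cutoff else low
        group.append(outcome)
    return high, low


high, low = split_by_exposure(records, CUTOFF)
gap = mean(high) - mean(low)

print(f"High exposure: n={len(high)}, mean outcome={mean(high):.1f}")
print(f"Low exposure:  n={len(low)}, mean outcome={mean(low):.1f}")
print(f"Difference in means: {gap:.1f}")
```

A real analysis would follow this with a significance test and a look at whether the two groups differed before the program started, since students who attend more may differ in other ways too.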

2. As a covariate

When you enter implementation data into a statistical model as a control variable, you are telling the model to account for the fact that not everyone received the same amount of the program. A student who attended 5 sessions and a student who attended 25 sessions both appear in the program group. If you treat them the same, your outcome estimate is muddy. Adding dosage or attendance as a covariate helps clean that up. The model holds implementation level constant, which makes the comparison between program participants and the comparison group more precise. The effect you are estimating is closer to the true program effect than an average that mixes high and low recipients.
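One way this can look in practice is sketched below. Everything here is an assumption for illustration: the data are invented, the small `ols` helper stands in for whatever statistics package you normally use, and dosage is centered at the participants' average (and set to zero for the comparison group) so that the program coefficient reads as the effect at average dosage.

```python
from statistics import mean


def ols(rows, y):
    """Least-squares coefficients via the normal equations.
    Adequate for tiny illustrative data; use a stats package for real work."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def sum_sq_resid(rows, y, coefs):
    """Sum of squared differences between observed and fitted outcomes."""
    return sum((yi - sum(c * x for c, x in zip(coefs, r))) ** 2
               for r, yi in zip(rows, y))


# Hypothetical data: (in_program, sessions_attended, outcome)
data = [
    (1, 5, 62), (1, 10, 66), (1, 15, 69), (1, 20, 75), (1, 25, 78),
    (0, 0, 60), (0, 0, 63), (0, 0, 58), (0, 0, 61), (0, 0, 64),
]
y = [outcome for _, _, outcome in data]
avg_dose = mean([d for p, d, _ in data if p == 1])

# Model without the covariate: intercept + program indicator
x_plain = [[1, p] for p, _, _ in data]
# Model with the covariate: add dosage centered at the participants' mean
x_adj = [[1, p, p * (d - avg_dose)] for p, d, _ in data]

b_plain = ols(x_plain, y)
b_adj = ols(x_adj, y)
ssr_plain = sum_sq_resid(x_plain, y, b_plain)
ssr_adj = sum_sq_resid(x_adj, y, b_adj)

print(f"Program effect at average dosage: {b_adj[1]:.2f}")
print(f"Change in outcome per extra session: {b_adj[2]:.2f}")
print(f"Unexplained variation without covariate: {ssr_plain:.1f}")
print(f"Unexplained variation with covariate:    {ssr_adj:.1f}")
```

In this particular setup the program coefficient reads as the effect at average dosage, and the leftover unexplained variation shrinks once dosage is in the model, which is where the added precision comes from.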

3. As a predictor

Here, implementation data moves from the background of the model to the foreground. Instead of using program participation as a simple yes or no variable, you use the actual dosage number: hours of exposure, number of sessions attended, or a fidelity score from an observation. Think of it as a continuous scale starting at zero, where zero represents no implementation at all. This way you move from a binary variable (participant or non-participant) to a ratio variable that captures the level of implementation. You enter that variable directly into the model as the independent variable and ask a more precise question: for each additional unit of implementation, how much does the outcome change?

This approach is particularly useful when participation varied widely across students or sites. It lets the data tell you whether there is a dose-response relationship, meaning whether outcomes improve as implementation increases. If they do, that is strong evidence that the program ingredient matters.
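A minimal sketch of that question, using invented data and a hand-written one-predictor regression (the function name and the numbers are assumptions for illustration):

```python
from statistics import mean

# Hypothetical records: (sessions_attended, outcome_score); zero means no exposure
records = [(0, 60), (0, 62), (5, 63), (10, 67), (15, 68), (20, 73), (25, 76)]


def simple_regression(pairs):
    """Return (intercept, slope) from ordinary least squares with one predictor."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


intercept, slope = simple_regression(records)
print(f"Expected outcome with zero sessions: {intercept:.2f}")
print(f"Change in outcome per additional session: {slope:.3f}")
```

The slope is the answer to the "for each additional unit" question. A positive slope is consistent with a dose-response relationship, although a straight line is itself an assumption worth checking against a plot of the data.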

4. For sensitivity analysis

Sensitivity analysis is about testing how robust your findings are to different analytical decisions. With implementation data, you can run your primary model and then rerun it under different implementation thresholds. What happens to the effect when you include only students who attended at least 80 percent of sessions? What about 60 percent? Does the finding hold up, or does it change significantly depending on where you draw the line?

If the finding is stable across different thresholds, that is a sign the finding is robust. If it shifts dramatically depending on who you include, that tells you the definition of participation matters enormously for how you interpret the outcome. That is important information for both the evaluator and the program manager.
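Here is a small sketch of rerunning the same comparison under different thresholds. The data, the thresholds, and the use of a simple difference in means as the "primary model" are all assumptions made to keep the example short.

```python
from statistics import mean

# Hypothetical data: participants as (attendance_rate, outcome),
# comparison group as outcomes only
participants = [
    (0.95, 80), (0.90, 78), (0.85, 79), (0.75, 74),
    (0.70, 75), (0.65, 72), (0.50, 68), (0.40, 66),
]
comparison = [64, 66, 65, 67, 63]


def effect_at_threshold(threshold):
    """Return (n kept, difference in mean outcome) for participants
    at or above the attendance threshold versus the comparison group."""
    kept = [outcome for rate, outcome in participants if rate >= threshold]
    return len(kept), mean(kept) - mean(comparison)


for threshold in (0.0, 0.6, 0.8):
    n, effect = effect_at_threshold(threshold)
    print(f"Threshold {threshold:.0%}: n={n}, estimated effect={effect:.1f}")
```

Reading the rows side by side shows how much the estimate moves as the definition of participation tightens. Keep an eye on the sample size too, because stricter thresholds leave fewer students and a more selected group.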

5. As a moderator

A moderation analysis asks whether the relationship between the program and the outcome changes depending on a third variable. In this case, that third variable can be implementation quality or fidelity. In plain terms: did the program work differently in schools where it was implemented well versus schools where it was not?

I find this one particularly useful because it asks under what conditions the program worked. That is a question for program improvement: if high-fidelity implementation is associated with significantly stronger effects, that finding has direct implications for training, coaching, and support structures.

Technically, this involves adding an interaction term to the model, which tests whether the program effect varies across levels of implementation. The interpretation requires care, but the concept is accessible: the program may work well when delivered well, and less well when delivery falls short.
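The simplest version of an interaction can be sketched with group averages. The example assumes a hypothetical design in which every school has both program and comparison students, and each school carries a yes or no fidelity rating. With two yes or no variables and all four groups present, the interaction coefficient in a model of the form outcome = b0 + b1(program) + b2(fidelity) + b3(program x fidelity) equals the difference between the two program effects computed below.

```python
from statistics import mean

# Hypothetical student records: (in_program, high_fidelity_school, outcome)
students = [
    (1, 1, 82), (1, 1, 80), (1, 1, 84),
    (0, 1, 70), (0, 1, 72), (0, 1, 71),
    (1, 0, 74), (1, 0, 72), (1, 0, 73),
    (0, 0, 69), (0, 0, 71), (0, 0, 70),
]


def cell_mean(program, fidelity):
    """Mean outcome for one combination of program status and school fidelity."""
    return mean([y for p, f, y in students if p == program and f == fidelity])


# Program effect within each type of school
effect_high = cell_mean(1, 1) - cell_mean(0, 1)
effect_low = cell_mean(1, 0) - cell_mean(0, 0)

# The interaction: how much the program effect differs by fidelity
interaction = effect_high - effect_low

print(f"Program effect in high-fidelity schools: {effect_high:.1f}")
print(f"Program effect in low-fidelity schools:  {effect_low:.1f}")
print(f"Interaction (difference between the two): {interaction:.1f}")
```

With a continuous fidelity score the same idea applies, but the interaction term is estimated in a regression model rather than read off group averages, and it needs a standard error before you interpret it.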

6. In economic evaluation

The connection between implementation data and economic evaluation is direct and practical. Dosage records are essential for calculating accurate cost per participant. A student who received 25 sessions cost more to serve than one who received 5. Without implementation records, those differences collapse into an average that misrepresents both the cost and the effect.

In a cost-effectiveness analysis, implementation data allows you to calculate the cost per unit of outcome for different levels of program delivery. Did higher-fidelity sites produce better outcomes at a lower cost per student? Did some sites deliver the program more efficiently than others?
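A rough sketch of how dosage records feed these calculations is below. The flat cost per session, the site names, and the outcome gains are all invented, and a real costing exercise would include fixed costs such as training and materials.

```python
# Assumed flat delivery cost per session, for illustration only
COST_PER_SESSION = 40.0

# Hypothetical site records: sessions received and outcome gain per student
sites = {
    "Site A": {"sessions": [25, 24, 22, 25], "gain": [12, 11, 10, 13]},
    "Site B": {"sessions": [10, 5, 12, 8], "gain": [4, 2, 5, 3]},
}


def cost_effectiveness(site):
    """Return (cost per participant, cost per one-point outcome gain)."""
    n = len(site["sessions"])
    cost_per_participant = COST_PER_SESSION * sum(site["sessions"]) / n
    mean_gain = sum(site["gain"]) / n
    return cost_per_participant, cost_per_participant / mean_gain


for name, site in sites.items():
    per_participant, per_point = cost_effectiveness(site)
    print(f"{name}: cost per participant={per_participant:.0f}, "
          f"cost per point gained={per_point:.2f}")
```

In this made-up example the site that delivered more sessions costs more per student but less per point of outcome gained, which is exactly the kind of pattern an average across sites would hide.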

The bottom line

Implementation data is an analytical asset. The six uses described above range from something any program manager can understand to techniques that require an evaluator's hand. But all of them start from the same place: a clean, consistent record of what was delivered. If your program is currently collecting implementation data, ask your evaluator which of these uses applies to your current analysis. If your program is not collecting it yet, the earlier post on implementation data is a good place to start.

Stats Series | Implementation Data in Statistical Analysis: Six Ways to Put It to Work

Contact Me:

Email: kami.ojedarodriguez@gmail.com
