What I’m Learning About Psychometrics as an Evaluator
This fall, I’ve been immersed in one of the most challenging and unexpectedly meaningful parts of my doctoral program: psychometrics. Psychometrics is the science of measuring concepts that cannot be observed directly. Before this course, I understood reliability, validity, and measurement error in the general sense that most evaluators do: I could define them, apply them, and teach others about them.
But learning psychometrics in depth has changed how I see measurement, and it is already reshaping how I think about the evaluation of the programs I serve.
So here are some of the lessons I’ve learned so far in my PhD journey.
Understanding Measurement Through a New Lens
One of the biggest lessons from psychometrics has been realizing how much happens behind every number we use in evaluation. No measure is ever perfect, because every score is an approximation shaped by the instrument itself, the people responding, the context in which the data were collected, and the decisions made during the measurement process. I wrote about this in more detail in my post on measurement error and reliability, so I won’t repeat it here. If you’re curious, you can read that reflection.
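For readers who want just the skeleton of that idea, classical test theory captures it in a single line. This is the standard textbook formulation, not anything unique to my coursework:

$$X = T + E, \qquad \text{reliability} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}$$

Every observed score X is a true score T plus error E, and reliability is simply the share of observed-score variance that is not error. Every source of noise in the paragraph above ends up somewhere in that E.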
What psychometrics is helping me do now is move beyond those foundational ideas and examine more critically the structure, quality, and purpose of the tools we use in real evaluations. I am beginning to look at measurement not as a quick step in the process, but as a central component that shapes every interpretation that follows.
Validity as a Process, Not a Label
Another idea that is transforming my thinking is understanding validity as an ongoing process rather than a fixed property of a tool: validity is the body of evidence supporting the interpretations we want to make from scores. This shift is changing how I think about the instruments used in federal program evaluations.
Now, instead of asking, “Is this a valid instrument?” I find myself asking more reflective and precise questions such as:
• What interpretation are we trying to make from these scores?
• What evidence supports making that interpretation?
• Does the tool actually capture the construct we care about?
Looking at validity this way makes me more careful when selecting tools, because it pushes me to consider whether the data can support the decisions that stakeholders want to make. It has also made me more transparent about the limitations of the instruments we use and more willing to explain how those limitations shape the conclusions we draw.
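To make “evidence” a little more concrete, consider one familiar form: criterion-related evidence, which asks whether scores relate to an outcome they are supposed to reflect. Here is a minimal sketch in Python; the survey, the ratings, and every number below are hypothetical, invented only to show the kind of check involved:

```python
import numpy as np

# Hypothetical data: total scores on a teacher self-efficacy survey,
# paired with later observation ratings the scores are claimed to predict.
survey_scores = np.array([12, 18, 15, 22, 9, 20, 14, 17])
observation_ratings = np.array([2.5, 3.8, 3.0, 4.2, 2.1, 3.9, 2.8, 3.4])

# Criterion-related evidence: the correlation between the scores and the
# criterion. A weak correlation would undercut the interpretation
# "higher scores indicate stronger practice."
r = np.corrcoef(survey_scores, observation_ratings)[0, 1]
print(f"score-criterion correlation: r = {r:.2f}")
```

A strong correlation is one piece of evidence for the interpretation, never proof of it; content review, response processes, and internal structure each contribute their own pieces.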
Becoming More Intentional with the Measures I Use
Because I serve as the evaluator for multiple federally funded programs, measurement decisions show up in the smallest details of my daily work. Psychometrics has pushed me to slow down, pay closer attention, and be more disciplined in this part of the process. I now find myself pausing to consider:
• What exactly are we trying to measure?
• Does this tool align with how the construct is defined?
• What kinds of decisions will be made from these data?
• Does this instrument provide enough evidence to support those decisions?
This added intentionality has strengthened my evaluation practice. The measures I select, the data I interpret, and the conclusions I share feel more grounded and deliberate because I am thinking more carefully about how well each tool fits the purpose of the evaluation.
Connecting Coursework to Real Evaluation Practice
The more I learn about measurement, the more I see its fingerprints across every evaluation project I touch. When I review surveys, observation tools, implementation checklists, and district-created assessments, I notice things I did not notice before. I see when items fail to align with the construct they claim to measure. I recognize design choices that introduce unnecessary noise. I catch scoring approaches that do not match the stated purpose of the tool. I spot assumptions that were never examined. And I pay closer attention to decisions that rely on data with unclear connections between program activities and intended outcomes.
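That “noise” is also something I can now quantify instead of just sense. Internal consistency, for example, is commonly estimated with Cronbach’s alpha; below is a minimal sketch in Python, with made-up responses rather than data from any real evaluation:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Estimate internal consistency for a set of survey items.

    items: 2D array with rows = respondents, columns = items.
    """
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Made-up responses: 5 respondents x 4 Likert-type items.
responses = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [4, 4, 4, 5],
])
print(f"alpha = {cronbach_alpha(responses):.2f}")
```

A low alpha does not tell me which item is the problem, but it tells me to look closer, which is exactly the habit this course is building.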
Psychometrics has also made me more transparent with stakeholders. When I share findings now, I do not focus solely on what the data show. I explain how the quality of measurement shapes the story we are able to tell, and I make sure program staff understand what the data can and cannot support.
A Personal Reflection from the Doctoral Journey
This course has been difficult. It has pushed my thinking, challenged my assumptions, and required me to slow down and question my habits as an evaluator. It has also become one of the most meaningful parts of my program so far. I love how psychometrics is teaching me to be a more careful, intentional, and thoughtful evaluator, one who does not rush to interpret data but instead considers how the instruments, the conditions, and the purpose all shape what those numbers truly represent.
I am still learning. I am still in the middle of the transformation. But I can already see how this deeper understanding of measurement will strengthen my work for years to come.