IS RELIABILITY THE SAME AS VALIDITY?
Assessment involves trade-offs no matter how you cut the cake. More reliability and less validity. More validity and less reliability.
ASSESSMENT: The big RELIABILITY – VALIDITY trade-off

It’s common to hear definitions of reliability and validity (see graphic above) when discussing assessment. Most of the time they leave me thinking: well, so what? What are the implications of this?
Not familiar with the terms? Have a look at this.
So, let’s dig a little deeper. What does this mean for the coach developer, and for how we go about assessment?
All assessment involves trade-offs (1,2). Too much time on summative assessments (3) robs the coach of practice time. If we assess one area of a course, we forgo the opportunity to assess another area in more depth. Aim for a more rigorous assessment and there is a danger of making it too complicated, time-consuming, and costly.
The nature of assessment forces the coach developer to accept a trade-off of some kind, no matter what assessment path they take. There are no solutions, only trade-offs (1).
MENTAL MODELS TO THE RESCUE
Given this dilemma (only trade-offs available), it’s important that we revisit the mental models (MM) that underpin coach learning. These models are based on the observations, beliefs, and values we hold about the kind of learning experiences and outcomes we want for our coaches.
For example, our mental models might be underpinned by:
- A desire for simplicity.
- An understanding of how volunteers would like to receive their training and their attitude towards being assessed.
- A consideration of whether summative assessments add value for particular groups of coaches.
- The time available and the suitability of the coach developer (CD) workforce.
- The opportunity cost of one method over another. Where do we focus to get the biggest bang for the buck?
- Beliefs about the importance of coaches learning in a practical context with an emphasis on ‘whole coaching’.
A theme running through these points is summed up by the saying: Don’t let the perfect be the enemy of the good. Be open to balancing competing forces regarding the choice of assessment.
The mental models you choose might be underpinned by different assumptions, beliefs, and values to those in the example above, particularly as you take your own context into consideration.
What would you include in your list of MM attributes?
THE TENSION BETWEEN RELIABILITY & VALIDITY
The graphic below draws together some of the ideas outlined above.
Multiple combinations of reliability and validity are possible. In the graphic, all four arrows are within striking distance of the bull’s eye, which indicates relatively high validity. But notice that the spread is broad, which indicates relatively low reliability.
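For readers who like to see the target analogy in numbers, here is a minimal sketch in Python. It is only one way of reading the analogy, not a formal psychometric method. The function name `summarise_arrows` and the arrow positions are made up for illustration. It treats the distance from the bull’s eye to the centre of the group as a rough stand-in for validity, and the scatter of the arrows around that centre as a rough stand-in for reliability.

```python
import math
from statistics import mean


def summarise_arrows(arrows, bullseye=(0.0, 0.0)):
    """Return (offset, spread) for a list of (x, y) arrow positions.

    offset: distance from the bull's eye to the centre of the group.
            Smaller suggests higher validity in the target analogy.
    spread: average distance of each arrow from the centre of the group.
            Smaller suggests higher reliability in the target analogy.
    """
    cx = mean(x for x, _ in arrows)
    cy = mean(y for _, y in arrows)
    offset = math.hypot(cx - bullseye[0], cy - bullseye[1])
    spread = mean(math.hypot(x - cx, y - cy) for x, y in arrows)
    return offset, spread


# Hypothetical positions, invented for illustration only.
# Arrows scattered around the bull's eye: on target on average, but loose.
scattered = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]
# Arrows bunched tightly together, but well away from the bull's eye.
clustered = [(3.0, 3.1), (3.1, 3.0), (2.9, 3.0), (3.0, 2.9)]

for name, arrows in (("scattered", scattered), ("clustered", clustered)):
    offset, spread = summarise_arrows(arrows)
    print(f"{name}: offset={offset:.2f}, spread={spread:.2f}")
```

The scattered group has a small offset and a large spread, much like the four arrows in the graphic. The clustered group is the reverse: very consistent, but consistently off target.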

“The key thing in assessment is being clear about why you are assessing, what conclusions you want to draw and how well the evidence supports the conclusions.” Dylan Wiliam
TAKE OUTS
- Reliability is a pre-requisite of validity.
- There is always a tension between reliability and validity.
- Trade-offs between the two are commonplace.
- More reliability is not necessarily better!
- There is no such thing as a valid assessment task. Validity is a property of the inferences drawn from the assessment outcomes, just as a picture cannot tell the whole story of an event (2).
- A lot of what we do in coach development involves outcomes that are not specific, measurable, or easily observable.
- Coaching is a people business, and this calls for CDs and coaches to exercise their professional judgement. This often means making judgements by stepping back to look at the big picture – as ‘fuzzy’ as it may be.
- The context and prior achievements of the coaches will determine the type of assessment adopted.
- In some instances (e.g., volunteers, advanced coaches), the assessment might be embedded (4) into the learning with no formal summative assessments.
References
1. Christodoulou, Daisy. (2024) Designing the Perfect Assessment System, Part 3, There are no solutions, only trade-offs. The expression was originally used by the writer and critic Thomas Sowell. Daisy’s article is available here.
2. Wiliam, Dylan. (2020) How to Think About Assessment, in Donarski, Sarah, and Bennett, Tom, Assessment: An Evidence-Informed Guide for Teachers, John Catt.
3. Northern Illinois University, Formative and Summative Assessment. Available here.
4. Bjerede, Marie. (2015) Embedded Formative Assessment: Tests without Stress. Available here.
Acknowledgments: Thanks to reviewers Lawrie Woodman, Andrea Woodburn, Melanie Schembri Waite, and Dawn Ho.