Research has shown that comparative judgement leads to very reliable results. This is because comparing is cognitively a simpler task than, say, grading with a criteria list. When you, as a teacher, compare pieces of work in pairs, your expertise lets you tell almost effortlessly which of the two is better.
Because comparing is easier, you will also make more consistent decisions. The same piece of work will therefore come out as the better one in a given comparison every time, regardless of the time of day or which tasks you’ve seen before. So as an evaluator, you can be pretty confident in your judgement.
Validity is about actually assessing the competency that needs to be assessed. The problem with complex competencies, however, is that they usually cannot be captured in a tight framework. Research finds that with analytical assessment methods, there is a lot of overlap between the aspects that are supposed to be distinguished. Moreover, by zooming in on the parts, you neglect the bigger picture. All this makes assessment difficult as well as inefficient.
By assessing more holistically, as is the case with comparative judgement, assessors automatically take more criteria into account, including criteria that are not made explicit in criteria lists or rubrics but are nonetheless relevant. Moreover, when you assess together with some colleagues and combine different perspectives, the different (minor) aspects of a competency are scrutinized more thoroughly. Thus, you assess the competency as a whole.
Another advantage of comparative judgement is that it saves you time. Don’t get me wrong: grading is a time-consuming job, no matter how you approach it. But if you think that comparative judgement doubles your workload because you’re grading in pairs, that’s not right either. For pairs with a clear difference in quality, it is quickly obvious which product is the better one. If the products are of similar quality, then making a decision logically takes more time. But the overall assessment time will usually be no longer than with a criteria list assessment.
The biggest time saving in comparative judgement? No criteria lists need to be developed, validated and calibrated. Comparative judgement relies on the expertise of assessors, which has proven to be very reliable. Moreover, it works intuitively: assessors do not need training to learn to look at (the same) aspects.
More learning opportunities for students
However you use the method, it provides great learning opportunities for students. For example, after all the products have been compared, you could have students look at the rank order of the works. This will help them better estimate where they themselves stand. After all, they get a chance to look at stronger and weaker examples of the task, so they can discover why those differ from their own work in competency level.
When you use the method for peer assessment, even more learning opportunities are added. For one, (anonymously) assessing peer work is a learning opportunity in itself. By weighing the works in a comparison, students learn bottom-up to recognize the key aspects of high-quality work. By explicitly naming these in the feedback they give their fellow students, they activate this knowledge in themselves as well, which (hopefully) benefits their subsequent work.
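If you are curious how a set of pairwise decisions can turn into the rank order mentioned above, here is a minimal sketch. It assumes a Bradley-Terry-style model, a common statistical choice for paired comparisons; this article does not say which model any particular tool uses. The function name `rank_works`, the example works A to D and the small "virtual comparison" that keeps the estimates stable are all illustrative assumptions.

```python
from collections import defaultdict


def rank_works(comparisons, iterations=200):
    """Return works ordered from strongest to weakest.

    comparisons: list of (winner, loser) pairs, one per judgement.
    """
    works = sorted({work for pair in comparisons for work in pair})
    wins = defaultdict(int)
    met = defaultdict(int)  # how often two works were compared with each other
    for winner, loser in comparisons:
        wins[winner] += 1
        met[(winner, loser)] += 1
        met[(loser, winner)] += 1

    # Every work starts with the same estimated quality.
    strength = {work: 1.0 for work in works}
    for _ in range(iterations):
        updated = {}
        for work in works:
            # Assumption: one virtual comparison against a reference work of
            # strength 1.0, counted as half a win, keeps the estimates finite
            # for works that won or lost every comparison.
            expected = 1.0 / (strength[work] + 1.0)
            for other in works:
                if other != work and met[(work, other)]:
                    expected += met[(work, other)] / (strength[work] + strength[other])
            updated[work] = (wins[work] + 0.5) / expected
        strength = updated

    return sorted(works, key=lambda work: strength[work], reverse=True)


# Hypothetical judgements: the first work in each pair was judged the better one.
judgements = [
    ("A", "B"), ("A", "C"), ("A", "D"),
    ("B", "C"), ("B", "D"), ("C", "D"),
]
print(rank_works(judgements))
```

Each judgement only records which of two works was better. The estimated strengths, and with them the rank order, emerge from all the judgements combined, which is why no assessor ever has to assign a score directly.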