design limits claims of broad generalisability, the argument produces clear, testable propositions for
comparative and experimental work, most pressingly whether exemplar-based calibration and targeted
supervisor development measurably reduce evaluative variance and unequal student burdens.
For policymakers, the study suggests that fairness requires both rule specification and capacity building:
regulatory instruments (rubrics, examiner guidelines) should be coupled with funded examiner
calibration and supervisor training. Implementing such paired reforms at both departmental and national
levels is likely to be the clearest route to aligning written expectations with enacted judgements.
References
Bastola, N., & Hu, G. (2020). Supervisory feedback across disciplines: Does it meet students’ expectations?
Assessment and Evaluation in Higher Education, 46, 407–423.
https://doi.org/10.1080/02602938.2020.1780562
Belcher, B., Rasmussen, K., Kemshaw, M., & Zornes, D. (2016). Defining and assessing research quality in a
transdisciplinary context. Research Evaluation, 25(1), 1–17. https://doi.org/10.1093/reseval/rvv025
Benbouabdallah, H., & Benmekhlouf, I. (2023). Teachers’ opinions regarding the main standards for evaluating a
master thesis: The case of EFL teachers at the Department of English, Batna 2 University [Unpublished
master’s dissertation]. University of Batna 2, Batna, Algeria.
Bourdieu, P. (1988). Homo academicus. Stanford, United States: Stanford University Press.
Bourke, S., & Holbrook, A. (2013). Examining PhD and research masters theses. Assessment and Evaluation in
Higher Education, 38(4), 407–416. https://doi.org/10.1080/02602938.2011.638738
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2),
77–101. https://doi.org/10.1191/1478088706qp063oa
Bukhari, N., Jamal, J., Ismail, A., & Shamsuddin, J. (2021). Assessment rubric for research report writing: A tool
for supervision. Malaysian Journal of Learning and Instruction, 18(2), 1–43.
https://doi.org/10.32890/mjli2021.18.2.1
Cheung, K. K. C. (2023). The use of intercoder reliability in qualitative interview data analysis in science
education. International Journal of Science Education. https://doi.org/10.1080/02635143.2021.1993179
Chugh, R., Macht, S., & Harreveld, B. (2021). Supervisory feedback to postgraduate research students: A literature
review. Assessment and Evaluation in Higher Education, 47(5), 683–697.
https://doi.org/10.1080/02602938.2021.1955241
Crowe, M., Slater, P., & McKenna, H. (2024). Demonstrating research quality. Journal of Psychiatric and
Mental Health Nursing, 32(3), 686–688. https://doi.org/10.1111/jpm.13145
Goodman, P., Robert, R., & Johnson, J. (2020). Rigor in PhD dissertation research. Nursing Forum, 55(4).
https://doi.org/10.1111/nuf.12477
Holbrook, A., Bourke, S., Lovat, T., & Dally, K. (2004). Investigating PhD thesis examination reports.
International Journal of Educational Research, 41, 98–120.
Homer, M., & Ababei, V. (2026). Evidencing improvement in examiner calibration in OSCEs. Medical Teacher,
1–11. Advance online publication. https://doi.org/10.1080/0142159X.2026.2621959
Hsiao, Y. P. A. (2024). Ensuring bachelor’s thesis assessment quality: A case study at a Dutch technical university.
Higher Education Evaluation & Development, 18(1), 2–16. https://doi.org/10.1108/HEED-08-2022-0033
Knorr-Cetina, K. (1999). Epistemic cultures: How the sciences make knowledge. Cambridge, United States:
Harvard University Press.
Kumar, V., & Stracke, E. (2011). Examiners’ reports on theses: Feedback or assessment? Journal of English for
Academic Purposes, 10, 211–222. https://doi.org/10.1016/j.jeap.2011.06.001
Lee, A. (2018). How can we develop supervisors for the modern doctorate? Studies in Higher Education, 43, 878–
890. https://doi.org/10.1080/03075079.2018.1438116
Mafora, P., & Lessing, A. (2016). The voice of the external examiner: Experiences from South African higher
education. South African Journal of Higher Education, 28, 1295–1314.
Man, D., Xu, Y., Chau, M., O’Toole, J., & Shunmugam, K. (2020). Assessment feedback in examiner reports on
master’s dissertations in translation studies. Studies in Educational Evaluation, 64, 100823.
https://doi.org/10.1016/j.stueduc.2019.100823
Morse, J. M. (2015). Critical analysis of strategies for determining rigor in qualitative inquiry. Qualitative Health
Research, 25, 1212–1222.
Mullins, G., & Kiley, M. (2002). “It’s a PhD, not a Nobel Prize”: How experienced examiners assess research
theses. Studies in Higher Education, 27(3), 369–386. https://doi.org/10.1080/0307507022000011507
O’Donovan, B., Sadler, I., & Reimann, N. (2024). Social moderation and calibration versus codification: A way
forward for academic standards in higher education? Studies in Higher Education, 49(12), 2693–2706.
https://doi.org/10.1080/03075079.2024.2321504