This document examines the challenges and methodologies of achieving consensus in crowdsourcing, particularly in human computation settings such as Amazon Mechanical Turk (MTurk), and stresses the importance of evaluating both objective and subjective tasks. It outlines the roles of benchmarking, qualitative measures, and psychometrics in understanding and improving relevance judgments, and emphasizes the need for systematic quality assurance in subjective evaluations. Finally, it proposes structural equation modeling and exploratory factor analysis as a framework for analyzing and modeling multi-dimensional relevance judgments.
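To make the proposed framework concrete, here is a minimal sketch of exploratory factor analysis applied to multi-dimensional relevance judgments, using scikit-learn's FactorAnalysis. The source does not specify an implementation, so everything here is an assumption for illustration: the dimension names, the simulated worker ratings, and the choice of two latent factors are all hypothetical.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical data: 200 crowd workers each rate a document on four
# relevance dimensions (e.g., topicality, novelty, reliability,
# understandability). Here the ratings are simulated from two latent
# relevance factors plus noise, standing in for real judgment data.
rng = np.random.default_rng(0)
true_loadings = np.array([[0.9, 0.1],   # topicality loads on factor 1
                          [0.8, 0.2],   # novelty loads on factor 1
                          [0.1, 0.9],   # reliability loads on factor 2
                          [0.2, 0.7]])  # understandability loads on factor 2
latent = rng.normal(size=(200, 2))
ratings = latent @ true_loadings.T + rng.normal(scale=0.3, size=(200, 4))

# Exploratory factor analysis with varimax rotation: recover the latent
# structure underlying the observed multi-dimensional judgments.
efa = FactorAnalysis(n_components=2, rotation="varimax")
efa.fit(ratings)

# Estimated loadings: rows = relevance dimensions, columns = factors.
# Dimensions that load together suggest a shared underlying construct,
# which could then be modeled confirmatorily with structural equation modeling.
print(efa.components_.T)
```

In this sketch, a clean two-factor loading pattern would support grouping the four rated dimensions into two higher-level relevance constructs, which is the kind of measurement model a subsequent structural equation model would build on.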