5 − 4 ≠ 4 − 3: On the Uneven Gaps between Different Levels of Graded User Satisfaction in Interactive Information Retrieval Evaluation (2023-01-03)
Like other ground-truth measures, graded user satisfaction is frequently treated as a continuous variable in information retrieval evaluation, on the assumption that the intervals between adjacent grades are quantitatively equal. To examine the validity of this equal-gap assumption and to explore the dynamic perceptual thresholds that trigger grade changes in search evaluation, we investigate the extent to which users are sensitive to changes in search efforts and outcomes across different gaps of graded satisfaction. Experiments on four user study datasets (15,337 queries) indicate that 1) user satisfaction sensitivity, especially to offline evaluation metrics, changes significantly across gaps in the satisfaction scale; and 2) the size and direction of the changes in sensitivity vary across study settings, search types, and intentions, especially within the “3-5” subrange of the scale. This study speaks to the fundamentals of user-centered evaluation and advances knowledge of the heterogeneity in satisfaction sensitivity to search efforts and gains, and of implicit changes in evaluation thresholds.