Discovering User Perceptions of Semantic Similarity in Near-duplicate Multimedia Files

Title: Discovering User Perceptions of Semantic Similarity in Near-duplicate Multimedia Files
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Vliegendhart, R, Larson, MA, Pouwelse, JA
Conference Name: CrowdSearch 2012: First International Workshop on Crowdsourcing Web Search
Pagination: 54-58
Date Published: 04/2012
Conference Location: Lyon, France
Abstract

We address the problem of discovering new notions of user-perceived similarity between near-duplicate multimedia files. We focus on file-sharing, since in this setting, users have a well-developed understanding of the available content, but what constitutes a near-duplicate is nonetheless nontrivial. We elicited judgments of semantic similarity by implementing triadic elicitation as a crowdsourcing task and ran it on Amazon Mechanical Turk. We categorized the judgments and arrived at 44 different dimensions of semantic similarity perceived by users. These discovered dimensions can be used for clustering items in search result lists. The challenge in performing elicitations in this way is to ensure that workers are encouraged to answer seriously and remain engaged.

URL: http://ceur-ws.org/Vol-842/crowdsearch-vliegendhart.pdf
Attachment: crowdsearch2012-vliegendhart.pdf (99.35 KB)