Abstract
The proliferation of the Web 2.0 and the ubiquitousness of social media yield a huge flood of heterogenous data that is voluntarily published and shared by billions of individual users all over the world. As a result, the representation of an entity (such as a real person) in this data may consist of various data types, including location and other numeric attributes, textual descriptions, images, videos, social network information and other types of information. Searching similar entities in this multi-enriched data exploiting the information of multiple representations simultaneously promises to yield more interesting and relevant information than searching among each data type individually. While efficient similarity search on single representations is a well studied problem, existing studies lacks appropriate solutions for multi-enriched data taking into account the combination of all representations as a whole. In this paper, we address the problem of index-supported similarity search on multi-enriched (a.k.a. multi-represented) objects based on a set of metrics, one metric for each representation. We define multi-metric similarity search queries by employing user-defined weight function specifying the impact of each metric at query time. Our main contribution is an index structure which combines all metrics into a single multi-dimensional access method that works for arbitrary weights preferences. The experimental evaluation shows that our proposed index structure is more efficient than existing multi-metric access methods considering different cost criteria and tremendously outperforms traditional approaches when querying very large sets of multi-enriched objects.
Dokumententyp: | Zeitschriftenartikel |
---|---|
Fakultät: | Mathematik, Informatik und Statistik > Informatik |
Themengebiete: | 000 Informatik, Informationswissenschaft, allgemeine Werke > 004 Informatik |
ISSN: | 1084-4627 |
Sprache: | Englisch |
Dokumenten ID: | 47463 |
Datum der Veröffentlichung auf Open Access LMU: | 27. Apr. 2018, 08:13 |
Letzte Änderungen: | 13. Aug. 2024, 12:55 |