Paper

Authors: Luiz Olmes Carvalho ; Lucio F. D. Santos ; Willian D. Oliveira ; Agma J. M. Traina and Caetano Traina Jr.

Affiliation: University of São Paulo, Brazil

ISBN: 978-989-758-187-8

Keyword(s): Similarity Search, Similarity Join, Query Operators, Wide-join, Near-duplicate Detection.

Related Ontology Subjects/Areas/Topics: Databases and Information Systems Integration ; Enterprise Information Systems ; Query Languages and Query Processing

Abstract: Crowdsourced information is increasingly employed to support and improve decision making in emergency situations. However, the gathered records quickly become too similar to one another, and handling many similar reports adds no valuable knowledge for the personnel at the control center in their decision-making tasks. The usual approaches to detecting and handling this so-called near-duplicate data rely on costly twofold processing. Aiming to reduce the cost and improve the ability to detect duplicates, we developed a framework model based on the similarity wide-join database operator. We extended the wide-join definition so that it overcomes its restrictions and also accomplishes the near-duplicate detection task. In this paper, we also provide an efficient pivot-based algorithm that speeds up the entire process and enables retrieving the most similar elements in a single pass. Experiments using real datasets show that our framework is up to three orders of magnitude faster than the competing techniques in the literature, while also improving the quality of the result by about 35 percent.
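The pivot-based idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a generic triangle-inequality filter over a self-similarity range join that keeps only the k closest pairs, and it assumes Euclidean feature vectors. The function name `range_self_join_top_k` and the parameters `radius`, `k`, and `pivots` are hypothetical names chosen for illustration.

```python
import heapq
import math
from itertools import combinations


def range_self_join_top_k(points, radius, k, pivots, dist=math.dist):
    """Illustrative sketch: k closest pairs (i, j, d) with i < j and d <= radius.

    Returns the pairs sorted by distance and the number of full distance
    computations that the pivot filter could not avoid.
    """
    # Precompute the distance from every element to every pivot once.
    pivot_dists = [[dist(p, piv) for piv in pivots] for p in points]

    heap = []      # max-heap emulated with negated distances, at most k entries
    computed = 0   # full distance evaluations actually performed
    for i, j in combinations(range(len(points)), 2):
        # Triangle inequality: |d(a, p) - d(b, p)| <= d(a, b) for any pivot p,
        # so the largest such gap is a lower bound on the true distance.
        lower = max(
            (abs(x - y) for x, y in zip(pivot_dists[i], pivot_dists[j])),
            default=0.0,
        )
        if lower > radius:
            continue  # the pair cannot be within the radius; skip it
        d = dist(points[i], points[j])
        computed += 1
        if d > radius:
            continue
        if len(heap) < k:
            heapq.heappush(heap, (-d, i, j))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i, j))

    ranked = sorted((-neg_d, i, j) for neg_d, i, j in heap)
    return [(i, j, d) for d, i, j in ranked], computed


# Toy data: two tight clusters and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.2), (10.0, 0.0)]
pairs, computed = range_self_join_top_k(
    points, radius=1.0, k=2, pivots=[(0.0, 0.0), (10.0, 0.0)]
)
```

On this toy input the filter discards eight of the ten candidate pairs before any full distance is computed, and the bounded heap keeps the result limited to the k closest pairs within a single pass over the candidates.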

Paper citation in several formats:
Carvalho L., Santos L., Oliveira W., Traina A. and Traina Jr. C. (2016). Efficient Self-similarity Range Wide-joins Fostering Near-duplicate Image Detection in Emergency Scenarios. In Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS, ISBN 978-989-758-187-8, pages 81-91. DOI: 10.5220/0005868900810091

@conference{iceis16,
author={Luiz Olmes Carvalho and Lucio F. D. Santos and Willian D. Oliveira and Agma J. M. Traina and Caetano Traina Jr.},
title={Efficient Self-similarity Range Wide-joins Fostering Near-duplicate Image Detection in Emergency Scenarios},
booktitle={Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS},
year={2016},
pages={81-91},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005868900810091},
isbn={978-989-758-187-8},
}

TY - CONF
JO - Proceedings of the 18th International Conference on Enterprise Information Systems - Volume 1: ICEIS
TI - Efficient Self-similarity Range Wide-joins Fostering Near-duplicate Image Detection in Emergency Scenarios
SN - 978-989-758-187-8
AU - Carvalho L.
AU - Santos L.
AU - Oliveira W.
AU - Traina A.
AU - Traina Jr. C.
PY - 2016
SP - 81
EP - 91
DO - 10.5220/0005868900810091
