Abstract
Tremendous amounts of data are generated every day by different sources and stored in heterogeneous databases. Providing an integrated view by fusion of data is essential to enhance data utilization. An indispensable type of data is spatial data, with diverse application domains, including GIS, e-commerce, military, and tourism. The concept of location forms a key part of user-generated data with serious challenges, including uncertainty. A particular location may have different names, and conversely, various locations may have the same name. Furthermore, geographical coordinates of locations may not be expressed accurately in datasets. More challenges also exist that have received less attention. Various data sources might describe locations in different levels of detail. This increases data inconsistency and decreases the quality of data fusion. This paper focuses on spatial data granulation to deal with this variety. If these diversities are not taken into consideration, the different descriptions of a location may be interpreted differently and, in turn, not be fused. The contribution of this paper are: (a) Introducing a granular approach to measure the similarity between two place description for managing apparent differences. The proposed method improves the quality of the geocoding and data fusion phases, (b) Introducing a novel data blocking method to decrease pairwise comparisons based on geographical features. For result evaluation, we developed a dataset from two real aviation accident datasets. The evaluation shows that the quality of entity recognition and data fusion improved by using our proposed data granulation technique.
| Original language | English |
|---|---|
| Pages (from-to) | 6104-6123 |
| Number of pages | 20 |
| Journal | Applied Intelligence |
| Volume | 51 |
| Issue number | 8 |
| DOIs | |
| Publication status | Published - Aug 2021 |
Bibliographical note
Publisher Copyright:© 2021, Springer Science+Business Media, LLC, part of Springer Nature.
Fingerprint
Dive into the research topics of 'A novel similarity measure for spatial entity resolution based on data granularity model : managing inconsistencies in place descriptions'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver