An End-to-end Point of Interest (POI) Conflation Framework
arXiv - CS - Databases, Pub Date: 2021-09-13, DOI: arxiv-2109.06073
Raymond Low, Zeynep D. Tekler, Lynette Cheah
Point of interest (POI) data serves as a valuable source of semantic
information for places of interest and has many geospatial applications in real
estate, transportation, and urban planning. With the availability of different
data sources, POI conflation serves as a valuable technique for enriching data
quality and coverage by merging the POI data from multiple sources. This study
proposes a novel end-to-end POI conflation framework consisting of six steps:
data procurement, schema standardisation, taxonomy mapping, POI matching, POI
unification, and data verification. The feasibility of the
proposed framework was demonstrated in a case study conducted in the eastern
region of Singapore, where the POI data from five data sources was conflated to
form a unified POI dataset. Based on the evaluation conducted, the resulting
unified dataset was found to be more comprehensive and complete than any of the
five POI data sources alone. Furthermore, the proposed approach for identifying
POI matches between different data sources outperformed all baseline approaches,
achieving a matching accuracy of 97.6% with an average run time below 3 minutes
when matching over 12,000 POIs, which resulted in 8,699 unique POIs. These
results demonstrate the framework's scalability for large-scale implementation
in dense urban contexts.
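
To make the POI matching and unification steps concrete, here is a minimal Python sketch. It is not the authors' method, which the abstract does not describe. It assumes that two records refer to the same place when they lie within a distance threshold and their names are similar. The function names, the record fields (`name`, `lat`, `lon`), the 50 m and 0.8 thresholds, and the sample records are all hypothetical.

```python
import math
from difflib import SequenceMatcher


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def is_match(p, q, max_dist_m=50.0, min_name_sim=0.8):
    """Assumed rule: match when two POIs are both nearby and similarly named."""
    close = haversine_m(p["lat"], p["lon"], q["lat"], q["lon"]) <= max_dist_m
    similar = name_similarity(p["name"], q["name"]) >= min_name_sim
    return close and similar


def conflate(sources):
    """Greedy unification: attach each POI to the first matching unified record,
    otherwise start a new record. `sources` maps source name -> list of POIs."""
    unified = []
    for source_name, pois in sources.items():
        for poi in pois:
            for record in unified:
                if is_match(record, poi):
                    record["sources"].append(source_name)
                    break
            else:
                unified.append({**poi, "sources": [source_name]})
    return unified


# Illustrative records only; these are not data from the study.
sources = {
    "A": [{"name": "Changi Airport Terminal 3", "lat": 1.3554, "lon": 103.9864}],
    "B": [
        {"name": "changi airport terminal 3", "lat": 1.3555, "lon": 103.9865},
        {"name": "Tampines Mall", "lat": 1.3525, "lon": 103.9447},
    ],
}
unified = conflate(sources)
print(len(unified))  # two unified records
```

The pairwise loop is quadratic in the number of POIs. A real implementation at the scale reported above would likely need a spatial index or blocking step, and a less naive merge rule than "first match wins".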
Updated: 2021-09-14