Postal address extraction from the web: a comprehensive survey
Artificial Intelligence Review (IF 12.0), Pub Date: 2021-03-14, DOI: 10.1007/s10462-021-09983-1
Mohammed Kayed, Sara Dakrory, A. A. Ali

The Web is a source of information for Location-Based Service (LBS) applications. These applications lack postal addresses for users' Points of Interest (POIs), such as schools, hospitals, and restaurants, because these locations are annotated manually, either from the yellow pages or by the location owners (users/companies). Our study confirms that Google Maps, a common LBS application, contains only about 32.5% of the public schools officially registered in the documents provided by the Directorate of Education in Egypt. However, the missing school addresses could be harvested from the Web (e.g., from social media). To the best of our knowledge, no prior survey has compared the existing Web postal address extraction approaches. Additionally, all proposed approaches for address extraction are local: they work in specific countries/locations with particular languages and cannot be used, or even adapted, to work in other countries/locations with other languages. Furthermore, the problem of Web postal address extraction has not been addressed in many countries, such as Arab countries (e.g., Egypt). This paper discusses the problem of address extraction, and highlights and compares the recent techniques used to extract addresses from Web pages. In addition, it investigates the discrepancy of knowledge among existing systems. Moreover, it provides a comprehensive review of the geographical gazetteers used in Web postal address extraction approaches and compares their data quality dimensions.




Updated: 2021-03-15