Learning Neighborhood Representation from Multi-Modal Multi-Graph: Image, Text, Mobility Graph and Beyond
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-05-06, DOI: arxiv-2105.02489
Tianyuan Huang, Zhecheng Wang, Hao Sheng, Andrew Y. Ng, Ram Rajagopal

Recent urbanization has coincided with the enrichment of geotagged data, such as street view and point-of-interest (POI). Region embedding enhanced by the richer data modalities has enabled researchers and city administrators to understand the built environment, socioeconomics, and the dynamics of cities better. While some efforts have been made to simultaneously use multi-modal inputs, existing methods can be improved by incorporating different measures of 'proximity' in the same embedding space - leveraging not only the data that characterizes the regions (e.g., street view, local businesses pattern) but also those that depict the relationship between regions (e.g., trips, road network). To this end, we propose a novel approach to integrate multi-modal geotagged inputs as either node or edge features of a multi-graph based on their relations with the neighborhood region (e.g., tiles, census block, ZIP code region, etc.). We then learn the neighborhood representation based on a contrastive-sampling scheme from the multi-graph. Specifically, we use street view images and POI features to characterize neighborhoods (nodes) and use human mobility to characterize the relationship between neighborhoods (directed edges). We show the effectiveness of the proposed methods with quantitative downstream tasks as well as qualitative analysis of the embedding space: The embedding we trained outperforms the ones using only unimodal data as regional inputs.
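
As a rough illustration of the idea described in the abstract, the sketch below represents neighborhoods as nodes with fused features, human mobility as directed weighted edges, and draws contrastive samples from those edges. The abstract does not specify the sampling scheme or the loss, so the trip-count-weighted positive sampling, the triplet margin loss, the toy data, and all names (`node_features`, `trips`, `sample_triplet`, `triplet_loss`) are assumptions made for illustration, not the authors' implementation.

```python
import random

# Hypothetical toy data: each neighborhood (node) carries a fused feature
# vector, e.g. a street-view image embedding concatenated with POI counts.
node_features = {
    "A": [0.9, 0.1, 0.0],
    "B": [0.8, 0.2, 0.1],
    "C": [0.0, 0.9, 0.7],
    "D": [0.1, 0.8, 0.9],
}

# Directed mobility edges: (origin, destination) -> trip count (made up).
trips = {
    ("A", "B"): 120,
    ("B", "A"): 80,
    ("A", "C"): 5,
    ("C", "D"): 60,
    ("D", "C"): 40,
}


def sample_triplet(trips, nodes, rng):
    """Sample (anchor, positive, negative) from the directed mobility graph.

    Assumed scheme: an edge is drawn in proportion to its trip count, giving
    the anchor (origin) and positive (destination). The negative is drawn
    uniformly from nodes the anchor has no outgoing trips to.
    """
    edges = list(trips)
    weights = [trips[e] for e in edges]
    anchor, positive = rng.choices(edges, weights=weights, k=1)[0]
    destinations = {d for (o, d) in trips if o == anchor}
    candidates = [n for n in nodes if n != anchor and n not in destinations]
    if not candidates:
        return None  # the anchor reaches every other node; no negative exists
    return anchor, positive, rng.choice(candidates)


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor_vec, positive_vec, negative_vec, margin=1.0):
    """Triplet margin loss: pull the positive closer than the negative."""
    gap = squared_distance(anchor_vec, positive_vec) - squared_distance(
        anchor_vec, negative_vec
    )
    return max(0.0, gap + margin)


if __name__ == "__main__":
    rng = random.Random(0)
    anchor, positive, negative = sample_triplet(trips, list(node_features), rng)
    # The raw features stand in for the output of a learned encoder here.
    loss = triplet_loss(
        node_features[anchor], node_features[positive], node_features[negative]
    )
    print(anchor, positive, negative, loss)
```

In a trained model, the feature vectors would pass through a learned encoder, and the loss would be minimized over many sampled triplets so that neighborhoods linked by frequent trips end up close together in the embedding space.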

Updated: 2021-05-07