Cross-device matching approaches: word embedding and supervised learning
Cluster Computing (IF 4.4), Pub Date: 2021-06-02, DOI: 10.1007/s10586-021-03313-4
Frank Yeong-Sung Lin, Chiu-Han Hsiao, Si-Yuan Zhang, Yi-Ping Rung, Yu-Xuan Chen

With the rapid development of diversified technology, people use multiple electronic devices, such as personal computers, tablets, and smartphones, to connect to the Internet in their daily lives. Switching between devices lets a user access e-commerce on various platforms. The complexity of consumer behavior is directly proportional to the number of devices involved. In addition, as personal privacy regulations become stricter, user data on the Internet is increasingly anonymized. Determining how devices are related is therefore an indispensable step toward precision marketing and customized applications. This research uses the dataset provided by the CIKM Cup 2016 Challenge. A representation of each device is created by extracting features from its browsing logs. Computation cost is reduced by filtering the candidates for a target device instead of comparing all devices in pairs. Latent semantic indexing representations and supervised learning techniques are used for the filtering. Word embedding turns the semantics of text into vectors through an unsupervised neural ensemble. Adding feature engineering to the input vectors of the supervised classifier enhances its discrimination. The classifier estimates the probability that any two instances belong to the same user. The main benefit of the implementation is that a cross-device linking mechanism forms the sequences mentioned above, providing a baseline that respects the computation limits while boosting performance.
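
The candidate-filtering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain TF-IDF cosine similarity stands in for the latent semantic indexing representations, and the device ids, URL tokens, and function names (`tfidf_vectors`, `cosine`, `top_k_candidates`) are hypothetical. The point is only that a cheap similarity pass shortlists a few candidates per target device, so a costlier pairwise classifier would run on the shortlist rather than on every pair.

```python
import math
from collections import Counter


def tfidf_vectors(logs):
    """Map each device id to a sparse TF-IDF vector over its browsing tokens."""
    n_devices = len(logs)
    doc_freq = Counter()
    for tokens in logs.values():
        doc_freq.update(set(tokens))
    vectors = {}
    for device, tokens in logs.items():
        term_freq = Counter(tokens)
        # Smoothed IDF keeps every weight strictly positive.
        vectors[device] = {
            t: c * (math.log((1 + n_devices) / (1 + doc_freq[t])) + 1.0)
            for t, c in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_candidates(target, vectors, k=2):
    """Return the k devices most similar to `target`, best first."""
    scored = [
        (cosine(vectors[target], vec), other)
        for other, vec in vectors.items()
        if other != target
    ]
    # Sort by descending similarity, breaking ties by device id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:k]]


# Hypothetical browsing logs: device id -> visited URL tokens.
logs = {
    "dev_a": ["news/sports", "shop/shoes", "shop/cart"],
    "dev_b": ["shop/shoes", "shop/cart", "mail/inbox"],
    "dev_c": ["video/cats", "video/dogs"],
    "dev_d": ["news/sports", "video/cats"],
}
vectors = tfidf_vectors(logs)
shortlist = top_k_candidates("dev_a", vectors, k=2)
```

Under these assumptions, only the devices in `shortlist` would be passed to a supervised classifier that estimates whether each one belongs to the same user as `dev_a`.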




Updated: 2021-06-02