Current journal: arXiv - CS - Information Retrieval
  • The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
    arXiv.cs.IR Pub Date : 2021-01-14
    Ronak Pradeep; Rodrigo Nogueira; Jimmy Lin

    We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains. At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture. "Expando" refers to the use of document expansion techniques to enrich keyword representations

    Updated: 2021-01-15
  • $C^3DRec$: Cloud-Client Cooperative Deep Learning for Temporal Recommendation in the Post-GDPR Era
    arXiv.cs.IR Pub Date : 2021-01-13
    Jialiang Han; Yun Ma

    Mobile devices enable users to retrieve information at any time and any place. Considering the occasional requirements and fragmentation usage pattern of mobile users, temporal recommendation techniques are proposed to improve the efficiency of information retrieval on mobile devices by means of accurately recommending items via learning temporal interests with short-term user interaction behaviors

    Updated: 2021-01-15
  • Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter
    arXiv.cs.IR Pub Date : 2021-01-09
    Sarah Alqurashi; Btool Hamoui; Abdulaziz Alashaikh; Ahmad Alhindi; Eisa Alanazi

    The rapid growth of social media content during the current pandemic provides useful tools for disseminating information which has also become a root for misinformation. Therefore, there is an urgent need for fact-checking and effective techniques for detecting misinformation in social media. In this work, we study the misinformation in the Arabic content of Twitter. We construct a large Arabic dataset

    Updated: 2021-01-15
  • TrNews: Heterogeneous User-Interest Transfer Learning for News Recommendation
    arXiv.cs.IR Pub Date : 2021-01-12
    Guangneng Hu; Qiang Yang

    We investigate how to solve the cross-corpus news recommendation for unseen users in the future. This is a problem where traditional content-based recommendation techniques often fail. Luckily, in real-world recommendation services, some publisher (e.g., Daily news) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher (e.g., Political news). To

    Updated: 2021-01-15
  • Learning Student Interest Trajectory for MOOCThread Recommendation
    arXiv.cs.IR Pub Date : 2021-01-10
    Shalini Pandey; Andrew Lan; George Karypis; Jaideep Srivastava

    In recent years, Massive Open Online Courses (MOOCs) have witnessed immense growth in popularity. Now, due to the recent Covid19 pandemic situation, it is important to push the limits of online education. Discussion forums are primary means of interaction among learners and instructors. However, with growing class size, students face the challenge of finding useful and informative discussion forums

    Updated: 2021-01-15
  • Analysis of E-commerce Ranking Signals via Signal Temporal Logic
    arXiv.cs.IR Pub Date : 2021-01-14
    Tommaso Dreossi (Amazon Search); Giorgio Ballardin (Amazon Search); Parth Gupta (Amazon Search); Jan Bakus (Amazon Search); Yu-Hsiang Lin (Amazon Search); Vamsi Salaka (Amazon Search)

    The timed position of documents retrieved by learning to rank models can be seen as signals. Signals carry useful information such as drop or rise of documents over time or user behaviors. In this work, we propose to use the logic formalism called Signal Temporal Logic (STL) to characterize document behaviors in ranking accordingly to the specified formulas. Our analysis shows that interesting document

    Updated: 2021-01-15
  • Knowledge-Enhanced Top-K Recommendation in Poincaré Ball
    arXiv.cs.IR Pub Date : 2021-01-13
    Chen Ma; Liheng Ma; Yingxue Zhang; Haolun Wu; Xue Liu; Mark Coates

    Personalized recommender systems are increasingly important as more content and services become available and users struggle to identify what might interest them. Thanks to the ability for providing rich information, knowledge graphs (KGs) are being incorporated to enhance the recommendation performance and interpretability. To effectively make use of the knowledge graph, we propose a recommendation

    Updated: 2021-01-14
  • Heterogeneous Network Embedding for Deep Semantic Relevance Match in E-commerce Search
    arXiv.cs.IR Pub Date : 2021-01-13
    Ziyang Liu; Zhaomeng Cheng; Yunjiang Jiang; Yue Shang; Wei Xiong; Sulong Xu; Bo Long; Di Jin

    Result relevance prediction is an essential task of e-commerce search engines to boost the utility of search engines and ensure smooth user experience. The last few years eyewitnessed a flurry of research on the use of Transformer-style models and deep text-match models to improve relevance. However, these two types of models ignored the inherent bipartite network structures that are ubiquitous in

    Updated: 2021-01-14
  • Probabilistic Metric Learning with Adaptive Margin for Top-K Recommendation
    arXiv.cs.IR Pub Date : 2021-01-13
    Chen Ma; Liheng Ma; Yingxue Zhang; Ruiming Tang; Xue Liu; Mark Coates

    Personalized recommender systems are playing an increasingly important role as more content and services become available and users struggle to identify what might interest them. Although matrix factorization and deep learning based methods have proved effective in user preference modeling, they violate the triangle inequality and fail to capture fine-grained preference information. To tackle this

    Updated: 2021-01-14
  • Discrete Knowledge Graph Embedding based on Discrete Optimization
    arXiv.cs.IR Pub Date : 2021-01-13
    Yunqi Li; Shuyuan Xu; Bo Liu; Zuohui Fu; Shuchang Liu; Xu Chen; Yongfeng Zhang

    This paper proposes a discrete knowledge graph (KG) embedding (DKGE) method, which projects KG entities and relations into the Hamming space based on a computationally tractable discrete optimization algorithm, to solve the formidable storage and computation cost challenges in traditional continuous graph embedding methods. The convergence of DKGE can be guaranteed theoretically. Extensive experiments

    Updated: 2021-01-14
  • Distributed storage algorithms with optimal tradeoffs
    arXiv.cs.IR Pub Date : 2021-01-13
    Michael Luby; Thomas Richardson

    One of the primary objectives of a distributed storage system is to reliably store large amounts of source data for long durations using a large number $N$ of unreliable storage nodes, each with $c$ bits of storage capacity. Storage nodes fail randomly over time and are replaced with nodes of equal capacity initialized to zeroes, and thus bits are erased at some rate $e$. To maintain recoverability

    Updated: 2021-01-14
  • LaDiff ULMFiT: A Layer Differentiated training approach for ULMFiT
    arXiv.cs.IR Pub Date : 2021-01-13
    Mohammed Azhan; Mohammad Ahmad

    In our paper, we present Deep Learning models with a layer differentiated training method which were used for the SHARED TASK@ CONSTRAINT 2021 sub-tasks COVID19 Fake News Detection in English and Hostile Post Detection in Hindi. We propose a Layer Differentiated training procedure for training a pre-trained ULMFiT arXiv:1801.06146 model. We used special tokens to annotate specific parts of the tweets

    Updated: 2021-01-14
  • On the Calibration and Uncertainty of Neural Learning to Rank Models
    arXiv.cs.IR Pub Date : 2021-01-12
    Gustavo Penha; Claudia Hauff

    According to the Probability Ranking Principle (PRP), ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad-hoc retrieval. The PRP holds when two conditions are met: [C1] the models are well calibrated, and, [C2] the probabilities of relevance are reported with certainty. We know however that deep neural networks (DNNs) are often not well

    Updated: 2021-01-13
  • Neural News Recommendation with Negative Feedback
    arXiv.cs.IR Pub Date : 2021-01-12
    Chuhan Wu; Fangzhao Wu; Yongfeng Huang; Xing Xie

    News recommendation is important for online news services. Precise user interest modeling is critical for personalized news recommendation. Existing news recommendation methods usually rely on the implicit feedback of users like news clicks to model user interest. However, news click may not necessarily reflect user interests because users may click a news due to the attraction of its title but feel

    Updated: 2021-01-13
  • AI- and HPC-enabled Lead Generation for SARS-CoV-2: Models and Processes to Extract Druglike Molecules Contained in Natural Language Text
    arXiv.cs.IR Pub Date : 2021-01-12
    Zhi Hong; J. Gregory Pauloski; Logan Ward; Kyle Chard; Ben Blaiszik; Ian Foster

    Researchers worldwide are seeking to repurpose existing drugs or discover new drugs to counter the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). A promising source of candidates for such studies is molecules that have been reported in the scientific literature to be drug-like in the context of coronavirus research. We report here on a project that leverages both human

    Updated: 2021-01-13
  • Toward Effective Automated Content Analysis via Crowdsourcing
    arXiv.cs.IR Pub Date : 2021-01-12
    Jiele Wu; Chau-Wai Wong; Xinyan Zhao; Xianpeng Liu

    Many computer scientists use the aggregated answers of online workers to represent ground truth. Prior work has shown that aggregation methods such as majority voting are effective for measuring relatively objective features. For subjective features such as semantic connotation, online workers, known for optimizing their hourly earnings, tend to deteriorate in the quality of their responses as they

    Updated: 2021-01-13
  • Measuring Recommender System Effects with Simulated Users
    arXiv.cs.IR Pub Date : 2021-01-12
    Sirui Yao; Yoni Halpern; Nithum Thain; Xuezhi Wang; Kang Lee; Flavien Prost; Ed H. Chi; Jilin Chen; Alex Beutel

    Imagine a food recommender system -- how would we check if it is causing and fostering unhealthy eating habits or merely reflecting users' interests? How much of a user's experience over time with a recommender is caused by the recommender system's choices and biases, and how much is based on the user's preferences and biases? Popularity bias and filter bubbles are two of the most well-studied

    Updated: 2021-01-13
  • Locality Sensitive Hashing for Efficient Similar Polygon Retrieval
    arXiv.cs.IR Pub Date : 2021-01-12
    Haim Kaplan; Jay Tenenbaum

    Locality Sensitive Hashing (LSH) is an effective method of indexing a set of items to support efficient nearest neighbors queries in high-dimensional spaces. The basic idea of LSH is that similar items should produce hash collisions with higher probability than dissimilar items. We study LSH for (not necessarily convex) polygons, and use it to give efficient data structures for similar shape retrieval

    Updated: 2021-01-13
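
The basic LSH idea stated in the abstract above (similar items collide more often than dissimilar ones) can be illustrated with a small sketch. This is not the polygon scheme studied in the paper; it assumes plain numeric vectors and uses random-hyperplane (sign) hashing for cosine similarity. The helper names `make_hyperplanes`, `signature`, and `hamming` are hypothetical and chosen for illustration only.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random Gaussian hyperplane normals of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(p * x for p, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def hamming(sig_a, sig_b):
    """Number of differing bits; fewer differences suggest a smaller angle."""
    return sum(a != b for a, b in zip(sig_a, sig_b))


planes = make_hyperplanes(num_bits=16, dim=3)
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]     # same direction as a, so the same signature
c = [-1.0, -2.0, -3.0]  # opposite direction, so every bit flips

print(hamming(signature(a, planes), signature(b, planes)))
print(hamming(signature(a, planes), signature(c, planes)))
```

Vectors whose signatures agree on many bits would land in the same buckets of a hash table keyed on (parts of) the signature, which is what lets a nearest-neighbor query inspect only a few candidates.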
  • Quantum Mathematics in Artificial Intelligence
    arXiv.cs.IR Pub Date : 2021-01-12
    Dominic Widdows; Kirsty Kitto; Trevor Cohen

    In the decade since 2010, successes in artificial intelligence have been at the forefront of computer science and technology, and vector space models have solidified a position at the forefront of artificial intelligence. At the same time, quantum computers have become much more powerful, and announcements of major advances are frequently in the news. The mathematical techniques underlying both these

    Updated: 2021-01-13
  • Disentangled Self-Attentive Neural Networks for Click-Through Rate Prediction
    arXiv.cs.IR Pub Date : 2021-01-11
    Yanqiao Zhu; Yichen Xu; Feng Yu; Qiang Liu; Shu Wu; Liang Wang

    Click-through rate (CTR) prediction, which aims to predict the probability that a user will click on an item, is an essential task for many online applications. Due to the nature of data sparsity and high dimensionality in CTR prediction, a key to making effective prediction is to model high-order feature interactions among feature fields. To explicitly model high-order feature interactions

    Updated: 2021-01-12
  • Transfer Learning and Augmentation for Word Sense Disambiguation
    arXiv.cs.IR Pub Date : 2021-01-10
    Harsh Kohli

    Many downstream NLP tasks have shown significant improvement through continual pre-training, transfer learning and multi-task learning. State-of-the-art approaches in Word Sense Disambiguation today benefit from some of these approaches in conjunction with information sources such as semantic relationships and gloss definitions contained within WordNet. Our work builds upon these systems and uses data

    Updated: 2021-01-12
  • Towards Long-term Fairness in Recommendation
    arXiv.cs.IR Pub Date : 2021-01-10
    Yingqiang Ge; Shuchang Liu; Ruoyuan Gao; Yikun Xian; Yunqi Li; Xiangyu Zhao; Changhua Pei; Fei Sun; Junfeng Ge; Wenwu Ou; Yongfeng Zhang

    As Recommender Systems (RS) influence more and more people in their daily life, the issue of fairness in recommendation is becoming more and more important. Most of the prior approaches to fairness-aware recommendation have been situated in a static or one-shot setting, where the protected groups of items are fixed, and the model provides a one-time fairness solution based on fairness-constrained optimization

    Updated: 2021-01-12
  • Context-Aware Target Apps Selection and Recommendation for Enhancing Personal Mobile Assistants
    arXiv.cs.IR Pub Date : 2021-01-09
    Mohammad Aliannejadi; Hamed Zamani; Fabio Crestani; W. Bruce Croft

    Users install many apps on their smartphones, raising issues related to information overload for users and resource management for devices. Moreover, the recent increase in the use of personal assistants has made mobile devices even more pervasive in users' lives. This paper addresses two research problems that are vital for developing effective personal mobile assistants: target apps selection and

    Updated: 2021-01-12
  • Generate Natural Language Explanations for Recommendation
    arXiv.cs.IR Pub Date : 2021-01-09
    Hanxiong Chen; Xu Chen; Shaoyun Shi; Yongfeng Zhang

    Providing personalized explanations for recommendations can help users to understand the underlying insight of the recommendation results, which is helpful to the effectiveness, transparency, persuasiveness and trustworthiness of recommender systems. Current explainable recommendation models mostly generate textual explanations based on pre-defined sentence templates. However, the expressiveness power

    Updated: 2021-01-12
  • Selection of Optimal Parameters in the Fast K-Word Proximity Search Based on Multi-component Key Indexes
    arXiv.cs.IR Pub Date : 2021-01-09
    Alexander B. Veretennikov

    Proximity full-text search is commonly implemented in contemporary full-text search systems. Let us assume that the search query is a list of words. It is natural to consider a document as relevant if the queried words are near each other in the document. The proximity factor is even more significant for the case where the query consists of frequently occurring words. Proximity full-text search requires

    Updated: 2021-01-12
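
The proximity notion in the abstract above (a document is relevant when the queried words occur near each other) can be made concrete with a minimal sketch. It does not use the multi-component key indexes of the paper; it assumes a document already tokenized into a list of words and scans it with a sliding window. The function name `min_window` is hypothetical.

```python
from collections import Counter


def min_window(tokens, query):
    """Return (start, end) indices of the shortest span of tokens that
    contains every query word at least once, or None if no span exists."""
    need = set(query)
    if not need:
        return None
    counts = Counter()  # occurrences of each query word inside the window
    covered = 0         # number of distinct query words inside the window
    best = None
    left = 0
    for right, tok in enumerate(tokens):
        if tok in need:
            counts[tok] += 1
            if counts[tok] == 1:
                covered += 1
        # Shrink from the left while the window still covers the query.
        while covered == len(need):
            if best is None or right - left < best[1] - best[0]:
                best = (left, right)
            dropped = tokens[left]
            if dropped in need:
                counts[dropped] -= 1
                if counts[dropped] == 0:
                    covered -= 1
            left += 1
    return best


doc = "to be or not to be that is the question".split()
print(min_window(doc, ["not", "to"]))       # (3, 4)
print(min_window(doc, ["be", "question"]))  # (5, 9)
print(min_window(doc, ["be", "xyz"]))       # None
```

A shorter window would typically translate into a higher proximity score; a linear scan like this one is only practical per document, which is why real systems rely on precomputed indexes instead.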
  • An Unsupervised Normalization Algorithm for Noisy Text: A Case Study for Information Retrieval and Stance Detection
    arXiv.cs.IR Pub Date : 2021-01-09
    Anurag Roy; Shalmoli Ghosh; Kripabandhu Ghosh; Saptarshi Ghosh

    A large fraction of textual data available today contains various types of 'noise', such as OCR noise in digitized documents, noise due to informal writing style of users on microblogging sites, and so on. To enable tasks such as search/retrieval and classification over all the available data, we need robust algorithms for text normalization, i.e., for cleaning different kinds of noise in the text

    Updated: 2021-01-12
  • Evaluating Deep Learning Approaches for Covid19 Fake News Detection
    arXiv.cs.IR Pub Date : 2021-01-11
    Apurva Wani; Isha Joshi; Snehal Khandve; Vedangi Wagh; Raviraj Joshi

    Social media platforms like Facebook, Twitter, and Instagram have enabled connection and communication on a large scale. It has revolutionized the rate at which information is shared and enhanced its reach. However, another side of the coin dictates an alarming story. These platforms have led to an increase in the creation and spread of fake news. The fake news has not only influenced people in the

    Updated: 2021-01-12
  • Investigating the Vision Transformer Model for Image Retrieval Tasks
    arXiv.cs.IR Pub Date : 2021-01-11
    Socratis Gkelios; Yiannis Boutalis; Savvas A. Chatzichristofis

    This paper introduces a plug-and-play descriptor that can be effectively adopted for image retrieval tasks without prior initialization or preparation. The description method utilizes the recently proposed Vision Transformer network while it does not require any training data to adjust parameters. In image retrieval tasks, the use of Handcrafted global and local descriptors has been very successfully

    Updated: 2021-01-12
  • Summaformers @ LaySumm 20, LongSumm 20
    arXiv.cs.IR Pub Date : 2021-01-10
    Sayar Ghosh Roy; Nikhil Pinnaparaju; Risubh Jain; Manish Gupta; Vasudeva Varma

    Automatic text summarization has been widely studied as an important task in natural language processing. Traditionally, various feature engineering and machine learning based systems have been proposed for extractive as well as abstractive text summarization. Recently, deep learning based, specifically Transformer-based systems have been immensely popular. Summarization is a cognitively challenging

    Updated: 2021-01-12
  • Leveraging Multilingual Transformers for Hate Speech Detection
    arXiv.cs.IR Pub Date : 2021-01-08
    Sayar Ghosh Roy; Ujwal Narayan; Tathagata Raha; Zubair Abid; Vasudeva Varma

    Detecting and classifying instances of hate in social media text has been a problem of interest in Natural Language Processing in the recent years. Our work leverages state of the art Transformer language models to identify hate speech in a multilingual setting. Capturing the intent of a post or a comment on social media involves careful evaluation of the language style, semantic content and additional

    Updated: 2021-01-12
  • Application of Knowledge Graphs to Provide Side Information for Improved Recommendation Accuracy
    arXiv.cs.IR Pub Date : 2021-01-07
    Yuhao Mao; Serguei A. Mokhov; Sudhir P. Mudur

    Personalized recommendations are popular in these days of Internet driven activities, specifically shopping. Recommendation methods can be grouped into three major categories, content based filtering, collaborative filtering and machine learning enhanced. Information about products and preferences of different users are primarily used to infer preferences for a specific user. Inadequate information

    Updated: 2021-01-11
  • Spatial Object Recommendation with Hints: When Spatial Granularity Matters
    arXiv.cs.IR Pub Date : 2021-01-08
    Hui Luo; Jingbo Zhou; Zhifeng Bao; Shuangli Li; J. Shane Culpepper; Haochao Ying; Hao Liu; Hui Xiong

    Existing spatial object recommendation algorithms generally treat objects identically when ranking them. However, spatial objects often cover different levels of spatial granularity and thereby are heterogeneous. For example, one user may prefer to be recommended a region (say Manhattan), while another user might prefer a venue (say a restaurant). Even for the same user, preferences can change at different

    Updated: 2021-01-11
  • Dynamic Graph Collaborative Filtering
    arXiv.cs.IR Pub Date : 2021-01-08
    Xiaohan Li; Mengqi Zhang; Shu Wu; Zheng Liu; Liang Wang; Philip S. Yu

    Dynamic recommendation is essential for modern recommender systems to provide real-time predictions based on sequential data. In real-world scenarios, the popularity of items and interests of users change over time. Based on this assumption, many previous works focus on interaction sequences and learn evolutionary embeddings of users and items. However, we argue that sequence-based models are not able

    Updated: 2021-01-11
  • Multistage BiCross Encoder: Team GATE Entry for MLIA Multilingual Semantic Search Task 2
    arXiv.cs.IR Pub Date : 2021-01-08
    Iknoor Singh; Carolina Scarton; Kalina Bontcheva

    The Coronavirus (COVID-19) pandemic has led to a rapidly growing 'infodemic' online. Thus, the accurate retrieval of reliable relevant data from millions of documents about COVID-19 has become urgently needed for the general public as well as for other stakeholders. The COVID-19 Multilingual Information Access (MLIA) initiative is a joint effort to ameliorate exchange of COVID-19 related information

    Updated: 2021-01-11
  • Scalable Cross-lingual Document Similarity through Language-specific Concept Hierarchies
    arXiv.cs.IR Pub Date : 2020-12-15
    Carlos Badenes-Olmedo; Jose-Luis Redondo García; Oscar Corcho

    With the ongoing growth in number of digital articles in a wider set of languages and the expanding use of different languages, we need annotation methods that enable browsing multi-lingual corpora. Multilingual probabilistic topic models have recently emerged as a group of semi-supervised machine learning models that can be used to perform thematic explorations on collections of texts in multiple

    Updated: 2021-01-11
  • Towards Meaningful Statements in IR Evaluation. Mapping Evaluation Measures to Interval Scales
    arXiv.cs.IR Pub Date : 2021-01-07
    Marco Ferrante; Nicola Ferro; Norbert Fuhr

    Recently, it was shown that most popular IR measures are not interval-scaled, implying that decades of experimental IR research used potentially improper methods, which may have produced questionable results. However, it was unclear if and to what extent these findings apply to actual evaluations and this opened a debate in the community with researchers standing on opposite positions about whether

    Updated: 2021-01-08
  • Metric Learning for Session-based Recommendations
    arXiv.cs.IR Pub Date : 2021-01-07
    Bartłomiej Twardowski; Paweł Zawistowski; Szymon Zaborowski

    Session-based recommenders, used for making predictions out of users' uninterrupted sequences of actions, are attractive for many applications. Here, for this task we propose using metric learning, where a common embedding space for sessions and items is created, and distance measures dissimilarity between the provided sequence of users' events and the next action. We discuss and compare metric learning

    Updated: 2021-01-08
  • Attitudes toward Open Access, Open Peer Review, and Altmetrics among Contributors to Spanish Scholarly Journals
    arXiv.cs.IR Pub Date : 2021-01-07
    Francisco Segado-Boj; Juan Martin-Quevedo; Juan Jose Prieto-Gutierrez

    This paper aims to gain a better understanding of the perspectives of contributors to Spanish academic journals regarding open access, open peer review, and altmetrics. It also explores how age, gender, professional experience, career history, and perception and use of social media influence authors' opinions toward these developments in scholarly publishing. A sample of contributors (n = 1254) to Spanish

    Updated: 2021-01-08
  • Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity
    arXiv.cs.IR Pub Date : 2021-01-07
    Ankush Chopra; Shruti Agrawal; Sohom Ghosh

    Search is one of the most common platforms used to seek information. However, users mostly get overloaded with results whenever they use such a platform to resolve their queries. Nowadays, direct answers to queries are being provided as a part of the search experience. The question-answer (QA) retrieval process plays a significant role in enriching the search experience. Most off-the-shelf Semantic

    Updated: 2021-01-08
  • Transformer-based approach towards music emotion recognition from lyrics
    arXiv.cs.IR Pub Date : 2021-01-06
    Yudhik Agrawal; Ramaguru Guru Ravi Shanker; Vinoo Alluri

    The task of identifying emotions from a given music track has been an active pursuit in the Music Information Retrieval (MIR) community for years. Music emotion recognition has typically relied on acoustic features, social tags, and other metadata to identify and classify music emotions. The role of lyrics in music emotion recognition remains under-appreciated in spite of several studies reporting

    Updated: 2021-01-07
  • A Multilayer Correlated Topic Model
    arXiv.cs.IR Pub Date : 2021-01-02
    Ye Tian

    We proposed a novel multilayer correlated topic model (MCTM) to analyze how the main ideas inherit and vary between a document and its different segments, which helps understand an article's structure. The variational expectation-maximization (EM) algorithm was derived to estimate the posterior and parameters in MCTM. We introduced two potential applications of MCTM, including the paragraph-level document

    Updated: 2021-01-07
  • Investigating the efficacy of music version retrieval systems for setlist identification
    arXiv.cs.IR Pub Date : 2021-01-06
    Furkan Yesiler; Emilio Molina; Joan Serrà; Emilia Gómez

    The setlist identification (SLI) task addresses a music recognition use case where the goal is to retrieve the metadata and timestamps for all the tracks played in live music events. Due to various musical and non-musical changes in live performances, developing automatic SLI systems is still a challenging task that, despite its industrial relevance, has been under-explored in the academic literature

    Updated: 2021-01-07
  • COVID-19: Comparative Analysis of Methods for Identifying Articles Related to Therapeutics and Vaccines without Using Labeled Data
    arXiv.cs.IR Pub Date : 2021-01-05
    Mihir Parmar; Ashwin Karthik Ambalavanan; Hong Guan; Rishab Banerjee; Jitesh Pabla; Murthy Devarakonda

    Here we proposed an approach to analyze text classification methods based on the presence or absence of task-specific terms (and their synonyms) in the text. We applied this approach to study six different transfer-learning and unsupervised methods for screening articles relevant to COVID-19 vaccines and therapeutics. The analysis revealed that while a BERT model trained on search-engine results generally

    Updated: 2021-01-07
  • SF-QA: Simple and Fair Evaluation Library for Open-domain Question Answering
    arXiv.cs.IR Pub Date : 2021-01-06
    Xiaopeng Lu; Kyusong Lee; Tiancheng Zhao

    Although open-domain question answering (QA) draws great attention in recent years, it requires large amounts of resources for building the full system and is often difficult to reproduce previous results due to complex configurations. In this paper, we introduce SF-QA: simple and fair evaluation framework for open-domain QA. SF-QA framework modularizes the pipeline open-domain QA system, which makes

    Updated: 2021-01-07
  • Taxonomy Completion via Triplet Matching Network
    arXiv.cs.IR Pub Date : 2021-01-06
    Jieyu Zhang; Xiangchen Song; Ying Zeng; Jiaze Chen; Jiaming Shen; Yuning Mao; Lei Li

    Automatically constructing taxonomy finds many applications in e-commerce and web search. One critical challenge is as data and business scope grow in real applications, new concepts are emerging and needed to be added to the existing taxonomy. Previous approaches focus on the taxonomy expansion, i.e. finding an appropriate hypernym concept from the taxonomy for a new query concept. In this paper,

    Updated: 2021-01-07
  • Contrastive Learning for Recommender System
    arXiv.cs.IR Pub Date : 2021-01-05
    Zhuang Liu; Yunpu Ma; Yuanxin Ouyang; Zhang Xiong

    Recommender systems, which analyze users' preference patterns to suggest potential targets, are indispensable in today's society. Collaborative Filtering (CF) is the most popular recommendation model. Specifically, Graph Neural Network (GNN) has become a new state-of-the-art for CF. In the GNN-based recommender system, message dropout is usually used to alleviate the selection bias in the user-item

    Updated: 2021-01-06
  • Generating Informative CVE Description From ExploitDB Posts by Extractive Summarization
    arXiv.cs.IR Pub Date : 2021-01-05
    Jiamou Sun; Zhenchang Xing; Hao Guo; Deheng Ye; Xiaohong Li; Xiwei Xu; Liming Zhu

    ExploitDB is one of the important public websites, which contributes a large number of vulnerabilities to official CVE database. Over 60% of these vulnerabilities have high- or critical-security risks. Unfortunately, over 73% of exploits appear publicly earlier than the corresponding CVEs, and about 40% of exploits do not even have CVEs. To assist in documenting CVEs for the ExploitDB posts, we

    Updated: 2021-01-06
  • Presenting a Dataset for Collaborator Recommending Systems in Academic Social Network: a Case Study on ReseachGate
    arXiv.cs.IR Pub Date : 2020-12-29
    Zahra Roozbahani; Jalal Rezaeenour; Roshan Shahrooei; Hanif Emamgholizadeh

    Collaborator finding systems are a special type of expert finding models. There is a long-lasting challenge for research in the collaborator recommending research area, which is the lack of a structured dataset to be used by the researchers. We introduce two datasets to fill this gap. The first dataset is prepared for designing a consistent, collaborator finding system. The next one, called a co-author

    Updated: 2021-01-05
  • Improving reference mining in patents with BERT
    arXiv.cs.IR Pub Date : 2021-01-04
    Ken Voskuil; Suzan Verberne

    References in patents to scientific literature provide relevant information for studying the relation between science and technological inventions. These references allow us to answer questions about the types of scientific work that leads to inventions. Most prior work analysing the citations between patents and scientific publications focussed on the front-page citations, which are well structured

    Updated: 2021-01-05
  • Coreference Resolution in Research Papers from Multiple Domains
    arXiv.cs.IR Pub Date : 2021-01-04
    Arthur Brack; Daniel Uwe Müller; Anett Hoppe; Ralph Ewerth

    Coreference resolution is essential for automatic text understanding to facilitate high-level information retrieval tasks such as text summarisation or question answering. Previous work indicates that the performance of state-of-the-art approaches (e.g. based on BERT) noticeably declines when applied to scientific papers. In this paper, we investigate the task of coreference resolution in research

    Updated: 2021-01-05
  • Scalable representation learning and retrieval for display advertising
    arXiv.cs.IR Pub Date : 2021-01-04
    Olivier Koch; Amine Benhalloum; Guillaume Genthial; Denis Kuzin; Dmitry Parfenchik

    Over the past decades, recommendation has become a critical component of many online services such as media streaming and e-commerce. Recent advances in algorithms, evaluation methods and datasets have led to continuous improvements of the state-of-the-art. However, much work remains to be done to make these methods scale to the size of the internet. Online advertising offers a unique testbed for recommendation

    Updated: 2021-01-05
  • Recommending Accurate and Diverse Items Using Bilateral Branch Network
    arXiv.cs.IR Pub Date : 2021-01-04
    Yile Liang; Tieyun Qian

    Recommender systems have played a vital role in online platforms due to the ability of incorporating users' personal tastes. Beyond accuracy, diversity has been recognized as a key factor in recommendation to broaden user's horizons as well as to promote enterprises' sales. However, the trading-off between accuracy and diversity remains to be a big challenge, and the data and user biases have not been

    Updated: 2021-01-05
  • An Elo-like System for Massive Multiplayer Competitions
    arXiv.cs.IR Pub Date : 2021-01-02
    Aram Ebtekar; Paul Liu

    Rating systems play an important role in competitive sports and games. They provide a measure of player skill, which incentivizes competitive performances and enables balanced match-ups. In this paper, we present a novel Bayesian rating system for contests with many participants. It is widely applicable to competition formats with discrete ranked matches, such as online programming competitions, obstacle

    Updated: 2021-01-05
  • CRSLab: An Open-Source Toolkit for Building Conversational Recommender System
    arXiv.cs.IR Pub Date : 2021-01-04
    Kun Zhou; Xiaolei Wang; Yuanhang Zhou; Chenzhan Shang; Yuan Cheng; Wayne Xin Zhao; Yaliang Li; Ji-Rong Wen

    In recent years, conversational recommender systems (CRS) have received much attention in the research community. However, existing studies on CRS vary in scenarios, goals and techniques, lacking unified, standardized implementation or comparison. To tackle this challenge, we propose an open-source CRS toolkit CRSLab, which provides a unified and extensible framework with highly-decoupled modules to

    Updated: 2021-01-05
  • Searching Personalized $k$-wing in Large and Dynamic Bipartite Graphs
    arXiv.cs.IR Pub Date : 2021-01-04
    Aman Abidi; Lu Chen; Rui Zhou; Chengfei Liu

    There are extensive studies focusing on the application scenario in which all the bipartite cohesive subgraphs need to be discovered in a bipartite graph. However, we observe that, for some applications, one is interested in finding bipartite cohesive subgraphs containing a specific vertex. In this paper, we study a new query-dependent bipartite cohesive subgraph search problem based on the $k$-wing model

    Updated: 2021-01-05
  • A multi-modal approach towards mining social media data during natural disasters -- a case study of Hurricane Irma
    arXiv.cs.IR Pub Date : 2021-01-02
    Somya D. Mohanty; Brown Biggers; Saed Sayedahmed; Nastaran Pourebrahim; Evan B. Goldstein; Rick Bunch; Guangqing Chi; Fereidoon Sadri; Tom P. McCoy; Arthur Cosby

    Streaming social media provides a real-time glimpse of extreme weather impacts. However, the volume of streaming data makes mining information a challenge for emergency managers, policy makers, and disciplinary scientists. Here we explore the effectiveness of data learned approaches to mine and filter information from streaming social media data from Hurricane Irma's landfall in Florida, USA. We use

    Updated: 2021-01-05
  • Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
    arXiv.cs.IR Pub Date : 2021-01-02
    Omar Khattab; Christopher Potts; Matei Zaharia

    Multi-hop reasoning (i.e., reasoning across two or more documents) at scale is a key step toward NLP models that can exhibit broad world knowledge by leveraging large collections of documents. We propose Baleen, a system that improves the robustness and scalability of multi-hop reasoning over current approaches. Baleen introduces a per-hop condensed retrieval pipeline to mitigate the size of the search

    Updated: 2021-01-05
  • Assessing Emoji Use in Modern Text Processing Tools
    arXiv.cs.IR Pub Date : 2021-01-02
    Abu Awal Md Shoeb; Gerard de Melo

    Emojis have become ubiquitous in digital communication, due to their visual appeal as well as their ability to vividly convey human emotion, among other factors. The growing prominence of emojis in social media and other instant messaging also leads to an increased need for systems and tools to operate on text containing emojis. In this study, we assess this support by considering test sets of tweets

    Updated: 2021-01-05
  • Reader-Guided Passage Reranking for Open-Domain Question Answering
    arXiv.cs.IR Pub Date : 2021-01-01
    Yuning Mao; Pengcheng He; Xiaodong Liu; Yelong Shen; Jianfeng Gao; Jiawei Han; Weizhu Chen

    Current open-domain question answering (QA) systems often follow a Retriever-Reader (R2) architecture, where the retriever first retrieves relevant passages and the reader then reads the retrieved passages to form an answer. In this paper, we propose a simple and effective passage reranking method, Reader-guIDEd Reranker (Rider), which does not involve any training and reranks the retrieved passages

    Updated: 2021-01-05
  • De-identifying Hospital Discharge Summaries: An End-to-End Framework using Ensemble of De-Identifiers
    arXiv.cs.IR Pub Date : 2021-01-01
    Leibo Liu; Oscar Perez-Concha; Anthony Nguyen; Vicki Bennett; Louisa Jorm

    Objective: Electronic Medical Records (EMRs) contain clinical narrative text that is of great potential value to medical researchers. However, this information is mixed with Protected Health Information (PHI) that presents risks to patient and clinician confidentiality. This paper presents an end-to-end de-identification framework to automatically remove PHI from hospital discharge summaries. Materials

    Updated: 2021-01-05
Contents have been reproduced by permission of the publishers.