-
Sentiment-aware Enhancements of PageRank-based Citation Metric, Impact Factor, and H-index for Ranking the Authors of Scholarly Articles arXiv.cs.DL Pub Date : 2024-03-13 Shikha Gupta, Animesh Kumar
Heretofore, the only way to evaluate an author has been frequency-based citation metrics that assume citations to be of a neutral sentiment. However, considering the sentiment behind citations aids in a better understanding of the viewpoints of fellow researchers for the scholarly output of an author.
-
BrainKnow -- Extracting, Linking, and Associating Neuroscience Knowledge arXiv.cs.DL Pub Date : 2024-03-07 Cunqing Huangfu, Yi Zeng, Yuwei Wang, Dongsheng Wang, Zizhe Ruan
The vast accumulation of neuroscience knowledge presents a challenge for researchers to timely and accurately locate the specific information they require. Constructing a knowledge engine that automatically extracts and organizes information from academic papers can provide researchers with timely and accurate informational services. We present the Brain Knowledge Engine (BrainKnow), which extracts
-
PaperWeaver: Enriching Topical Paper Alerts by Contextualizing Recommended Papers with User-collected Papers arXiv.cs.DL Pub Date : 2024-03-05 Yoonjoo Lee, Hyeonsu B. Kang, Matt Latzke, Juho Kim, Jonathan Bragg, Joseph Chee Chang, Pao Siangliulue
With the rapid growth of scholarly archives, researchers subscribe to "paper alert" systems that periodically provide them with recommendations of recently published papers that are similar to previously collected papers. However, researchers sometimes struggle to make sense of nuanced connections between recommended papers and their own research context, as existing systems only present paper titles
-
AceMap: Knowledge Discovery through Academic Graph arXiv.cs.DL Pub Date : 2024-03-05 Xinbing Wang, Luoyi Fu, Xiaoying Gan, Ying Wen, Guanjie Zheng, Jiaxin Ding, Liyao Xiang, Nanyang Ye, Meng Jin, Shiyu Liang, Bin Lu, Haiwen Wang, Yi Xu, Cheng Deng, Shao Zhang, Huquan Kang, Xingli Wang, Qi Li, Zhixin Guo, Jiexing Qi, Pan Liu, Yuyang Ren, Lyuwen Wu, Jungang Yang, Jianping Zhou, Chenghu Zhou
The exponential growth of scientific literature requires effective management and extraction of valuable insights. While existing scientific search engines excel at delivering search results based on relational databases, they often neglect the analysis of collaborations between scientific entities and the evolution of ideas, as well as the in-depth analysis of content within scientific publications
-
Preserving Tangible and Intangible Cultural Heritage: the Cases of Volterra and Atari arXiv.cs.DL Pub Date : 2024-03-05 Maciej Grzeszczuk, Kinga Skorupska, Paweł Grabarczyk, Władysław Fuchs, Paul F. Aubin, Mark E. Dietrick, Barbara Karpowicz, Rafał Masłyk, Pavlo Zinevych, Wiktor Stawski, Stanisław Knapiński, Wiesław Kopeć
At first glance, the ruins of the Roman Theatre in the Italian town of Volterra have little in common with cassette tapes containing Atari games. One is certainly considered an important historical landmark, while the consensus on the importance of the other is partial at best. Still, both are remnants of times vastly different from the present and are at risk of oblivion. Unearthed architectural structures
-
Astronomy in Colombia: a bibliometric perspective arXiv.cs.DL Pub Date : 2024-03-04 Sofía Guevara-Montoya, Felipe Ortiz-Ferreira, María Paula Silva-Arévalo, Paola A. Niño-Muñoz, Jaime E. Forero-Romero
In Colombia, astronomical research is experiencing accelerated growth. In order to better understand its evolution and current state, we conducted a bibliometric study using data from the Astrophysics Data System (ADS) and Web of Science (WoS). In ADS, we identified 422 peer-reviewed publications from 1980, the year of the first publication, until 2023, which was the cutoff date for our study. Among
-
It Takes a Village: A Distributed Training Model for AI-based Chatbots arXiv.cs.DL Pub Date : 2024-03-03 Colleen Estes, Beth Twomey, Annie Johnson
In Summer 2023, staff from the information technology and reference departments at the University of Delaware Library, Museums and Press came together in a unique partnership to pilot a low-cost AI-powered chatbot. The goal of the pilot is to learn more about student and faculty interest in engaging with this tool, and to better understand the labor required on the staff side. Reference librarians
-
Talent hat, cross-border mobility, and career development in China arXiv.cs.DL Pub Date : 2024-02-29 Yurui Huang, Xuesen Cheng, Chaolin Tian, Xunyi Jiang, Langtian Ma, Yifang Ma
This study aims to investigate the influence of a cross-border recruitment program in China, which confers on scientists a 'talent hat' including a startup package comprising significant bonuses, pay, and funding, on their future performance and career development. By curating a unique dataset from China's 10-year talent recruitment program, we employed multiple matching designs to quantify the effects
-
A Randomized Controlled Trial on Anonymizing Reviewers to Each Other in Peer Review Discussions arXiv.cs.DL Pub Date : 2024-03-01 Charvi Rastogi, Xiangchen Song, Zhijing Jin, Ivan Stelmakh, Hal Daumé III, Kun Zhang, Nihar B. Shah
Peer review often involves reviewers submitting their independent reviews, followed by a discussion among reviewers of each paper. A question among policymakers is whether the reviewers of a paper should be anonymous to each other during the discussion. We shed light on this by conducting a randomized controlled trial at the UAI 2022 conference. We randomly split the reviewers and papers into two conditions--one
-
How open are hybrid journals included in transformative agreements? arXiv.cs.DL Pub Date : 2024-02-28 Najko Jahn
The ongoing controversy surrounding transformative agreements, which aim to transition journal publishing to full open access, highlights the need for large-scale studies assessing the uptake of open access in hybrid journals. This includes evaluating the extent to which transformative agreements enabled open access. By combining publicly available data from various sources, including cOAlition S Journal
-
Handling Open Research Data within the Max Planck Society -- Looking Closer at the Year 2020 arXiv.cs.DL Pub Date : 2024-02-28 Martin Boosen, Michael Franke, Yves Vincent Grossmann, Sy Dat Ho, Larissa Leiminger, Jan Matthiesen
This paper analyses the practice of publishing research data within the Max Planck Society in the year 2020. The central finding of the study is that up to 40% of the empirical text publications had research data available. The aggregation of the available data is predominantly analysed. There are differences between the sections of the Max Planck Society but they are not as great as one might expect
-
PST-Bench: Tracing and Benchmarking the Source of Publications arXiv.cs.DL Pub Date : 2024-02-25 Fanjin Zhang, Kun Cao, Yukuo Cen, Jifan Yu, Da Yin, Jie Tang
Tracing the source of research papers is a fundamental yet challenging task for researchers. The billion-scale citation relations between papers hinder researchers from understanding the evolution of science efficiently. To date, there is still a lack of an accurate and scalable dataset constructed by professional researchers to identify the direct source of their studied papers, based on which automatic
-
Mapping Literacies in the Tourism Labor Market: A Cross-Database Comparison arXiv.cs.DL Pub Date : 2024-02-23 Eddy Soria Leyva, Ana Beatriz Hernandez Lara
This book chapter conducts a comparative bibliometric analysis of literacies in the tourism labor market, drawing from the Web of Science (WoS) and Scopus databases. The objective is to assess scientific outputs and identify key patterns of scientific collaboration. Findings suggest a statistically significant difference between the two databases with an overlap level of 35.71%. However, there is a
-
Analyzing the Dynamics of COVID-19 Lockdown Success: Insights from Regional Data and Public Health Measures arXiv.cs.DL Pub Date : 2024-02-25 Md. Motaleb Hossen Manik, Md. Ahsan Habib, Md. Zabirul Islam, Tanim Ahmed, Fabliha Haque
The COVID-19 pandemic caused by the coronavirus had a significant effect on social, economic, and health systems globally. The virus emerged in Wuhan, China, and spread worldwide resulting in severe disease, death, and social interference. Countries implemented lockdowns in various regions to limit the spread of the virus. Some of them were successful and some failed. Here, several factors played a
-
Enhancing Cloud-Based Large Language Model Processing with Elasticsearch and Transformer Models arXiv.cs.DL Pub Date : 2024-02-24 Chunhe Ni, Jiang Wu, Hongbo Wang, Wenran Lu, Chenwei Zhang
Large Language Models (LLMs) are a class of generative AI models built using the Transformer network, capable of leveraging vast datasets to identify, summarize, translate, predict, and generate language. LLMs promise to revolutionize society, yet training these foundational models poses immense challenges. Semantic vector search within large language models is a potent technique that can significantly
-
EOSC CZ: Towards the development of Czech national ecosystem for FAIR research data arXiv.cs.DL Pub Date : 2024-02-20 Matej Antol, Jiri Marek, Michaela Capandova, Jaroslav Juracek, Ludek Matyska
This short paper presents a compact overview of the Czech approach to implementing the European Open Science Cloud and plans for developing a Czech national infrastructure for FAIR research data. Its purpose is to provide an all-encompassing summary of the near future of research data management in Czechia. As such, we deliberately attempt to explain complicated concepts in minimum words, sacrificing
-
A Literature Review of Literature Reviews in Pattern Analysis and Machine Intelligence arXiv.cs.DL Pub Date : 2024-02-20 Penghai Zhao, Xin Zhang, Ming-Ming Cheng, Jian Yang, Xiang Li
By consolidating scattered knowledge, the literature review provides a comprehensive understanding of the investigated topic. However, excessive reviews, especially in the booming field of pattern analysis and machine intelligence (PAMI), raise concerns for both researchers and reviewers. In response to these concerns, this Analysis aims to provide a thorough review of reviews in the PAMI field from
-
Citation Amnesia: NLP and Other Academic Fields Are in a Citation Age Recession arXiv.cs.DL Pub Date : 2024-02-19 Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad
This study examines the tendency to cite older work across 20 fields of study over 43 years (1980-2023). We put NLP's propensity to cite older work in the context of these 20 other fields to analyze whether NLP shows similar temporal citation patterns to these other fields over time or whether differences can be observed. Our analysis, based on a dataset of approximately 240 million papers, reveals
-
Thinking Outside the Black Box: Insights from a Digital Exhibition in the Humanities arXiv.cs.DL Pub Date : 2024-02-19 Sebastian Barzaghi, Alice Bordignon, Bianca Gualandi, Silvio Peroni
One of the main goals of Open Science is to make research more reproducible. There is no consensus, however, on what exactly "reproducibility" is, as opposed for example to "replicability", and how it applies to different research fields. After a short review of the literature on reproducibility/replicability with a focus on the humanities, we describe how the creation of the digital twin of the temporary
-
Research status of the Mendeleev Periodic Table: a bibliometric analysis arXiv.cs.DL Pub Date : 2024-02-18 Kamna Sharma, Deepak Kumar Das, Saibal Ray
In this paper, we present a bibliometric analysis of the Mendeleev Periodic Table. We have conducted a comprehensive analysis of the Scopus-based database using the keyword "Mendeleev Periodic Table". Our findings suggest that the Mendeleev Periodic Table is an influential topic in the field of Inorganic as well as Organic Chemistry. Future researchers may focus on expanding our analysis to include
-
Towards Development of Automated Knowledge Maps and Databases for Materials Engineering using Large Language Models arXiv.cs.DL Pub Date : 2024-02-17 Deepak Prasad, Mayur Pimpude, Alankar Alankar
In this work, a Large Language Model (LLM) based workflow is presented that uses the OpenAI ChatGPT model GPT-3.5-turbo-1106 and the Google Gemini Pro model to create summaries of text, data, and images from research articles. It is demonstrated that, through a series of processing steps, the key information can be arranged in tabular form and knowledge graphs to capture underlying concepts. Our method offers efficiency
-
HTML papers on arXiv -- why it is important, and how we made it happen arXiv.cs.DL Pub Date : 2024-02-14 Charles Frankston, Jonathan Godfrey, Shamsi Brinn, Alison Hofer, Mark Nazzaro
In October 2023, arXiv made HTML formatted papers available to readers. This was the exciting outcome of over a year of accessibility research and development with the scientific community. Currently, only 2.4% of research outputs meet accessibility guidelines. Informed by scientists who rely on assistive technology, our analysis demonstrates that offering HTML is the most impactful step arXiv can
-
Interleaved snowballing: Reducing the workload of literature curators arXiv.cs.DL Pub Date : 2024-02-13 Ralf Stephan
We formally define the literature (reference) snowballing method and present a refined version of it. We show that the improved algorithm can substantially reduce curator work, even before application of text classification, by reducing the number of candidates to classify. We also present a desktop application named LitBall that implements this and other literature collection methods, through access
-
Cultural gems linked open data: Mapping culture and intangible heritage in European cities arXiv.cs.DL Pub Date : 2024-02-12 Sergio Consoli, Valentina Alberti, Cinzia Cocco, Francesco Panella, Valentina Montalto
The recovery and resilience of the cultural and creative sectors after the COVID-19 pandemic is a current topic with priority for the European Commission. Cultural gems is a crowdsourced web platform managed by the Joint Research Centre of the European Commission aimed at creating community-led maps as well as a common repository for cultural and creative places across European cities and towns. More
-
Ontology Engineering to Model the European Cultural Heritage: The Case of Cultural Gems arXiv.cs.DL Pub Date : 2024-02-12 Valentina Alberti, Cinzia Cocco, Sergio Consoli, Valentina Montalto, Francesco Panella
Cultural gems is a web application conceived by the European Commission's Joint Research Centre (DG JRC), which aims at engaging people and organisations across Europe to create a unique repository of cultural and creative places. The main goal is to provide a vision of European culture in order to strengthen a sense of identity within a single European cultural realm. Cultural gems maps more than
-
A Maturity Model for Urban Dataset Meta-data arXiv.cs.DL Pub Date : 2024-02-07 Mark S. Fox, Bart Gajderowicz, Dishu Lyu
In the current environment of data generation and publication, there is an ever-growing number of datasets available for download. This growth precipitates an existing challenge: sourcing and integrating relevant datasets for analysis is becoming more complex. Despite efforts by open data platforms, obstacles remain, predominantly rooted in inadequate metadata, unsuitable data presentation, complications
-
The Howard-Harvard effect: Institutional reproduction of intersectional inequalities arXiv.cs.DL Pub Date : 2024-02-06 Diego Kozlowski, Thema Monroe-White, Vincent Larivière, Cassidy R. Sugimoto
The US higher education system concentrates the production of science and scientists within a few institutions. This has implications for minoritized scholars and the topics with which they are disproportionately associated. This paper examines topical alignment between institutions and authors of varying intersectional identities, and the relationship with prestige and scientific impact. We observe
-
[Citation needed] Data usage and citation practices in medical imaging conferences arXiv.cs.DL Pub Date : 2024-02-05 Théo Sourget, Ahmet Akkoç, Stinna Winther, Christine Lyngbye Galsgaard, Amelia Jiménez-Sánchez, Dovile Juodelyte, Caroline Petitjean, Veronika Cheplygina
Medical imaging papers often focus on methodology, but the quality of the algorithms and the validity of the conclusions are highly dependent on the datasets used. As creating datasets requires a lot of effort, researchers often use publicly available datasets. There is, however, no adopted standard for citing the datasets used in scientific papers, leading to difficulty in tracking dataset usage. In
-
HERITRACE: Tracing Evolution and Bridging Data for Streamlined Curatorial Work in the GLAM Domain arXiv.cs.DL Pub Date : 2024-02-01 Arcangelo Massari (Digital Humanities Advanced Research Centre; Research Centre for Open Scholarly Metadata, Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy), Silvio Peroni (Digital Humanities Advanced Research Centre; Research Centre for Open Scholarly Metadata, Department of Classical Philology and Italian Studies, University of Bologna, Bologna, Italy)
HERITRACE is a semantic data management system tailored for the GLAM sector. It is engineered to streamline data curation for non-technical users while also offering an efficient administrative interface for technical staff. The paper compares HERITRACE with other established platforms such as OmekaS, Semantic MediaWiki, Research Space, and CLEF, emphasizing its advantages in user friendliness, provenance
-
University Students Motives and Challenges in Utilising Institutional Repository Resources arXiv.cs.DL Pub Date : 2024-01-31 Suzan Masawe, Paul Muneja, Vincent Msonge
One of the core functions of an academic institution is to generate knowledge, disseminate it to the intended audiences, and preserve it for future use. Academic institutions are now establishing Institutional Repositories (IRs) to collect produced resources to facilitate accessibility, dissemination, utilization, and management of intellectual materials produced within an institution. This study aimed
-
Reading yesterday's news. Layout recognition by segmentation of historical newspaper pages arXiv.cs.DL Pub Date : 2024-01-30 Christian Schultze (High-Performance Computing and Analytics), Niklas Kerkfeld (High-Performance Computing and Analytics), Kara Kuebart (Institut für Geschichtswissenschaft Universität Bonn), Princilia Weber (Institut für Geschichtswissenschaft Universität Bonn), Moritz Wolter (High-Performance Computing and Analytics), Felix Selgert (Institut für Geschichtswissenschaft Universität Bonn)
Newspapers are important sources for historians interested in past societies' cultural values, social structures, and their changes. Since the 19th century, newspapers have been widely available and spread regionally. Today, historical newspapers are digitized but unavailable in a separate metadata-enhanced form. Machine-readable metadata, however, is a prerequisite for a mass statistical analysis
-
WikiTexVC: MediaWiki's native LaTeX to MathML converter for Wikipedia arXiv.cs.DL Pub Date : 2024-01-30 Johannes Stegmüller, Moritz Schubotz
MediaWiki and Wikipedia authors usually use LaTeX to define mathematical formulas in the wiki text markup. In the Wikimedia ecosystem, these formulas were processed by a long cascade of web services and finally delivered to users' browsers in rendered form for visually readable representation as SVG. With the latest developments of supporting MathML Core in Chromium-based browsers, MathML continues
-
Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus arXiv.cs.DL Pub Date : 2024-01-29 Jack Culbert, Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt, Paul Donner, Philipp Mayr
OpenAlex is a promising open source of scholarly metadata, and competitor to the established proprietary sources, the Web of Science and Scopus. As OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced in the community without licensing barriers. However, as OpenAlex is a rapidly evolving source and the data contained within is expanding
-
Textual Entailment for Effective Triple Validation in Object Prediction arXiv.cs.DL Pub Date : 2024-01-29 Andrés García-Silva, Cristian Berrío, José Manuel Gómez-Pérez
Knowledge base population seeks to expand knowledge graphs with facts that are typically extracted from a text corpus. Recently, language models pretrained on large corpora have been shown to contain factual knowledge that can be retrieved using cloze-style strategies. Such an approach enables zero-shot recall of facts, showing competitive results in object prediction compared to supervised baselines
-
ChemDFM: Dialogue Foundation Model for Chemistry arXiv.cs.DL Pub Date : 2024-01-26 Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Xin Chen, Kai Yu
Large language models (LLMs) have established great success in the general domain of natural language processing. Their emerging task generalization and free-form dialogue capabilities can greatly help to design Chemical General Intelligence (CGI) to assist real-world research in chemistry. However, the existence of specialized language and knowledge in the field of chemistry, such as the highly informative
-
Visualization of rank-citation curves for fast detection of h-index anomalies in university metrics arXiv.cs.DL Pub Date : 2024-01-24 Serhii Nazarovets
University rankings, despite facing criticism, continue to maintain their popularity. In the 2023 Scopus Ranking of Ukrainian Universities, certain institutions stood out due to their high h-index, despite modest publication and citation numbers. This phenomenon can be attributed to influential research topics or involvement in international collaborative research. However, these results may also be
-
Organizing Scientific Knowledge From Energy System Research Using the Open Research Knowledge Graph arXiv.cs.DL Pub Date : 2024-01-24 Oliver Karras, Jan Göpfert, Patrick Kuckertz, Tristan Pelser, Sören Auer
Engineering sciences, such as energy system research, play an important role in developing solutions to technical, environmental, economic, and social challenges of our modern society. In this context, the transformation of energy systems into climate-neutral systems is one of the key strategies for mitigating climate change. For the transformation of energy systems, engineers model, simulate and analyze
-
Decoding University Hierarchy and Prestige in China through Domestic Ph.D. Hiring Network arXiv.cs.DL Pub Date : 2024-01-23 Chaolin Tian, Xunyi Jiang, Yurui Huang, Langtian Ma, Yifang Ma
The academic job market for fresh Ph.D. students to pursue postdoctoral and junior faculty positions plays a crucial role in shaping the future orientations, developments, and status of the global academic system. In this work, we focus on the domestic Ph.D. hiring network among universities in China by exploring the doctoral education and academic employment of nearly 28,000 scientists across all
-
From Knowledge Organization to Knowledge Representation and Back arXiv.cs.DL Pub Date : 2024-01-22 Fausto Giunchiglia, Mayukh Bagchi, Subhashis Das
Knowledge Organization (KO) and Knowledge Representation (KR) have been the two mainstream methodologies of knowledge modelling in the Information Science community and the Artificial Intelligence community, respectively. The facet-analytical tradition of KO has developed an exhaustive set of guiding canons for ensuring quality in organising and managing knowledge but has remained limited in terms
-
A multi-dimensional analysis of usage counts, Mendeley readership, and citations for journal and conference papers arXiv.cs.DL Pub Date : 2024-01-19 Wencan Tian, Zhichao Fang, Xianwen Wang, Rodrigo Costas
This study analyzed 16,799 journal papers and 98,773 conference papers published by IEEE Xplore in 2016 to investigate the relationships among usage counts, Mendeley readership, and citations through descriptive, regression, and mediation analyses. Differences in the relationship among these metrics between journal and conference papers are also studied. Results showed that there is no significant
-
Towards a Quality Indicator for Research Data publications and Research Software publications -- A vision from the Helmholtz Association arXiv.cs.DL Pub Date : 2024-01-16 Wolfgang zu Castell, Doris Dransch, Guido Juckeland, Marcel Meistring, Bernadette Fritzsch, Ronny Gey, Britta Höpfner, Martin Köhler, Christian Meeßen, Hela Mehrtens, Felix Mühlbauer, Sirko Schindler, Thomas Schnicke, Roland Bertelmann
Research data and software are widely accepted as an outcome of scientific work. However, in comparison to text-based publications, there is not yet an established process to assess and evaluate quality of research data and research software publications. This paper presents an attempt to fill this gap. Initiated by the Working Group Open Science of the Helmholtz Association the Task Group Helmholtz
-
The extension of zbMATH Open by arXiv preprints arXiv.cs.DL Pub Date : 2024-01-16 Isabel Beckenbach, Klaus Hulek, Olaf Teschke
zbMATH Open has started a new feature -- relevant preprints posted at arXiv will also be displayed in the database. In this article we introduce this new feature and the underlying editorial policy. We also describe some of the technical issues involved and discuss the challenges this presents for future developments.
-
Streamlining the Selection Phase of Systematic Literature Reviews (SLRs) Using AI-Enabled GPT-4 Assistant API arXiv.cs.DL Pub Date : 2024-01-14 Seyed Mohammad Ali Jafari
The escalating volume of academic literature presents a formidable challenge in staying updated with the newest research developments. Addressing this, this study introduces a pioneering AI-based tool, configured specifically to streamline the efficiency of the article selection phase in Systematic Literature Reviews (SLRs). Utilizing the robust capabilities of OpenAI's GPT-4 Assistant API, the tool
-
Cited But Not Archived: Analyzing the Status of Code References in Scholarly Articles arXiv.cs.DL Pub Date : 2024-01-10 Emily Escamilla, Martin Klein, Talya Cooper, Vicky Rampin, Michele C. Weigle, Michael L. Nelson
One in five arXiv articles published in 2021 contained a URI to a Git Hosting Platform (GHP), which demonstrates the growing prevalence of GHP URIs in scholarly publications. However, GHP URIs are vulnerable to the same reference rot that plagues the Web at large. The disappearance of software hosting platforms, like Gitorious and Google Code, and the source code they contain threatens research reproducibility
-
Identifying Fabricated Networks within Authorship-for-Sale Enterprises arXiv.cs.DL Pub Date : 2024-01-08 Simon J. Porter, Leslie D. McIntosh
Fabricated papers do not just need text, images, and data; they also require a fabricated or partially fabricated network of authors. Most 'authors' on a fabricated paper have not been associated with the research, but rather are added through a transaction. This lack of deeper connection means that there is a low likelihood that co-authors on fabricated papers will ever appear together on the same
-
A Content-Based Novelty Measure for Scholarly Publications: A Proof of Concept arXiv.cs.DL Pub Date : 2024-01-08 Haining Wang
Novelty, akin to gene mutation in evolution, opens possibilities for scientific advancement. Despite peer review being the gold standard for evaluating novelty in scholarly communication and resource allocation, the vast volume of submissions necessitates an automated measure of scientific novelty. Adopting a perspective that views novelty as the atypical combination of existing knowledge, we introduce
-
Effective Communication of Scientific Results arXiv.cs.DL Pub Date : 2024-01-09 José Nelson Amaral
Communication is essential for the advancement of Science. Technology advances and the proliferation of personal devices have changed the ways in which people communicate in all aspects of life. Scientific communication has also been profoundly affected by such changes, and thus it is important to reflect on effective ways to communicate scientific results to scientists that are flooded with information
-
Application of Module to Coding Theory: A Systematic Literature Review arXiv.cs.DL Pub Date : 2024-01-03 Muhammad Faldiyan, Sisilia Sylviani
A systematic literature review is a research process that identifies, evaluates, and interprets all relevant study findings connected to specific research questions, topics, or phenomena of interest. In this work, a thorough review of the literature on the issue of the link between module structure and coding theory was done. A literature search yielded 470 articles from the Google Scholar, Dimensions
-
Dimensionality Reduced Clustered Data and Order Partition and Stepwise Dimensionality Increasing Indices arXiv.cs.DL Pub Date : 2024-01-05 Alexander Thomasian
One of the goals of a NASA-funded project at the IBM T. J. Watson Research Center was to build an index for similarity searching of satellite images, which were characterized by high-dimensional feature image texture vectors. Reviewed is our effort on data clustering, dimensionality reduction via Singular Value Decomposition (SVD), and indexing to build a smaller index and more efficient k-Nearest Neighbor -
-
Examining the Challenges in Archiving Instagram arXiv.cs.DL Pub Date : 2024-01-04 Rachel Zheng, Michele C. Weigle
To prevent the spread of disinformation on Instagram, we need to study the accounts and content of disinformation actors. However, due to their malicious nature, Instagram often bans accounts that are responsible for spreading disinformation, making these accounts inaccessible from the live web. The only way we can study the content of banned accounts is through public web archives such as the Internet
-
Φ index: A standardized scale-independent citation indicator arXiv.cs.DL Pub Date : 2024-01-02 Manolis Antonoyiannakis
The sensitivity of Impact Factors (IFs) to journal size causes systematic bias in IF rankings, in a process akin to stacking the cards: A random "journal" of $n$ papers can attain a range of IF values that decreases rapidly with size, as $\sim 1/\sqrt{n}$. The Central Limit Theorem, which underlies this effect, also allows us to correct it by standardizing citation averages for scale
-
Trends in Practical Student Peer-review arXiv.cs.DL Pub Date : 2024-01-02 Helen C. Purchase, John Hamer
While much of the literature on student peer-review focusses on the success (or otherwise) of individual activities in specific classes (often implemented as part of scholarly research projects) there is little by way of published data giving an overview of the range and variety of such activities as used in practice. As the creators, administrators and maintainers of the Aropa Peer review tool, we
-
Mapping bibliographic metadata collections: the case of OpenCitations Meta and OpenAlex arXiv.cs.DL Pub Date : 2023-12-27 Elia Rizzetto, Silvio Peroni
This study describes the methodology and analyses the results of the process of mapping entities between two large open bibliographic metadata collections, OpenCitations Meta and OpenAlex. The primary objective of this mapping is to integrate OpenAlex internal identifiers into the existing metadata of bibliographic resources in OpenCitations Meta, thereby interlinking and aligning these collections
-
Recursos lexicográficos electrónicos multilingües y plurilingües: definición y clasificación tipológico-descriptiva arXiv.cs.DL Pub Date : 2023-12-26 María José Domínguez Vázquez
The aim of this paper is to provide a classification of multilingual and plurilingual electronic lexicographic resources which would enable, on the one hand, the implementation of quantitative and qualitative criteria to produce a typological taxonomy of lexicographical tools, such as dictionaries, as opposed to platforms and websites and, on the other, the distinction of multilingual and plurilingual
-
Implementation of the IIIF Presentation API 3.0 based on Software Support: Use Case of an Incremental IIIF Deployment within a Citizen Science Project arXiv.cs.DL Pub Date : 2023-12-18 Julien Antoine Raemy, Adrian Demleitner
As part of the Participatory Knowledge Practices in Analogue and Digital Image Archives (PIA) research project, we have been implementing Linked Open Usable Data (LOUD) standards including the International Image Interoperability Framework (IIIF) specifications to disseminate digital objects, their related metadata and streamline our processes. We have taken an incremental approach to IIIF deployment
-
Sustainable Data Management: Indefinite Static Data at Rest with Machine-Readable Printed Optical Data Sheets (MRPODS) arXiv.cs.DL Pub Date : 2023-12-16 Artem Doll
In an era where both commercial and private sectors place a premium on the longevity of digital data storage, the imperative to bolster resilience of digital information while simultaneously curbing costs and reducing failure rates becomes paramount. This study delves into the unique attributes of optical encoding methodologies, which are poised to offer enduring stability for digital data. Despite
-
A Comprehensive Approach to Ensuring Quality in Spreadsheet-Based Metadata arXiv.cs.DL Pub Date : 2023-12-14 Martin J. O'Connor, Marcos Martínez-Romero, Mete Ugur Akdogan, Josef Hardi, Mark A. Musen
While scientists increasingly recognize the importance of metadata in describing their data, spreadsheets remain the preferred tool for supplying this information despite their limitations in ensuring compliance and quality. Various tools have been developed to address these limitations, but they suffer from their own shortcomings, such as steep learning curves and limited customization. In this paper
-
Recording provenance of workflow runs with RO-Crate arXiv.cs.DL Pub Date : 2023-12-13 Simone Leo, Michael R. Crusoe, Laura Rodríguez-Navas, Raül Sirvent, Alexander Kanitz, Paul De Geest, Rudolf Wittner, Luca Pireddu, Daniel Garijo, José M. Fernández, Iacopo Colonnelli, Matej Gallo, Tazro Ohta, Hirotaka Suetake, Salvador Capella-Gutierrez, Renske de Wit, Bruno P. Kinoshita, Stian Soiland-Reyes
Recording the provenance of scientific computation results is key to the support of traceability, reproducibility and quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to
-
Who Are Tweeting About Academic Publications? A Cochrane Systematic Review and Meta-Analysis of Altmetric Studies arXiv.cs.DL Pub Date : 2023-12-11 Ashraf Maleki, Kim Holmberg
Previous studies have developed different categorizations of Twitter users who interact with scientific publications online, reflecting the difficulty in creating a unified approach. Using Cochrane Review meta-analysis to analyse earlier research (including 79,014 Twitter users, over twenty million tweets, and over five million tweeted publications from 23 studies), we created a consolidated robust
-
Web of Science Core Collection's coverage expansion: The forgotten Arts & Humanities Citation Index? arXiv.cs.DL Pub Date : 2023-12-09 Weishu Liu, Rong Ni, Guangyuan Hu
The expansion of Web of Science Core Collection (WoSCC) over the recent years has partially accounted for the "norm" of growth of research output in many bibliometric analysis studies. However, the expansion patterns of different citation indexes may be different, which may benefit some disciplines but hinder others. Utilizing Science Citation Index Expanded (SCIE), Social Sciences Citation Index (SSCI)