-
Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization arXiv.cs.SE Pub Date : 2024-03-15 Ratnadira Widyasari, Jia Wei Ang, Truong Giang Nguyen, Neil Sharma, David Lo
Fault localization is a critical process that involves identifying specific program elements responsible for program failures. Manually pinpointing these elements, such as classes, methods, or statements, which are associated with a fault is laborious and time-consuming. To overcome this challenge, various fault localization tools have been developed. These tools typically generate a ranked list of
-
An Empirical Study on Developers Shared Conversations with ChatGPT in GitHub Pull Requests and Issues arXiv.cs.SE Pub Date : 2024-03-15 Huizi Hao, Kazi Amit Hasan, Hong Qin, Marcos Macedo, Yuan Tian, Steven H. H. Ding, Ahmed E. Hassan
ChatGPT has significantly impacted software development practices, providing substantial assistance to developers in a variety of tasks, including coding, testing, and debugging. Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored. In this paper, we analyze a dataset of 210 and 370 developers shared conversations with ChatGPT in
-
A Conceptual Model for the Analysis of Investigation Elements in Games arXiv.cs.SE Pub Date : 2024-03-15 Pedro Marques, Marcus Parreiras, Joshua Kritz, Geraldo Xexeo
This paper presents the 4E conceptual model, developed to formally analyze investigation games from a game design perspective. The model encompasses four components: Exploration, Elicitation, Experimentation, and Evaluation. Grounded Theory was employed as the methodology for constructing the model, allowing for an in-depth understanding of the underlying concepts. The resulting model was then compared
-
A Vocabulary of Board Game Dynamics arXiv.cs.SE Pub Date : 2024-03-15 Joshua Kritz, Geraldo Xexéo
In recent years, significant advances have been made in the field of game research. However, there has been a noticeable dearth of scholarly research focused on the domain of dynamics, despite the widespread recognition among researchers of its existence and importance. The objective of this paper is to address this research gap by presenting a vocabulary dedicated to board game dynamics. To achieve
-
Large Language Models to Generate System-Level Test Programs Targeting Non-functional Properties arXiv.cs.SE Pub Date : 2024-03-15 Denis Schwachhofer, Peter Domanski, Steffen Becker, Stefan Wagner, Matthias Sauer, Dirk Pflüger, Ilia Polian
System-Level Test (SLT) has been a part of the test flow for integrated circuits for over a decade and still gains importance. However, no systematic approaches exist for test program generation, especially targeting non-functional properties of the Device under Test (DUT). Currently, test engineers manually compose test suites from off-the-shelf software, approximating the end-user environment of
-
Repoformer: Selective Retrieval for Repository-Level Code Completion arXiv.cs.SE Pub Date : 2024-03-15 Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma
Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion. However, the invariable use of retrieval in existing methods exposes issues in both efficiency and robustness, with a large proportion of the retrieved contexts proving unhelpful or harmful to code language models (code LMs). To tackle the challenges, this paper proposes a selective
-
Reality Bites: Assessing the Realism of Driving Scenarios with Large Language Models arXiv.cs.SE Pub Date : 2024-03-14 Jiahui Wu, Chengjie Lu, Aitor Arrieta, Tao Yue, Shaukat Ali
Large Language Models (LLMs) are demonstrating outstanding potential for tasks such as text generation, summarization, and classification. Given that such models are trained on a humongous amount of online knowledge, we hypothesize that LLMs can assess whether driving scenarios generated by autonomous driving testing techniques are realistic, i.e., being aligned with real-world driving conditions.
-
Gamified GUI testing with Selenium in the IntelliJ IDE: A Prototype Plugin arXiv.cs.SE Pub Date : 2024-03-14 Giacomo Garaccione, Tommaso Fulcini, Paolo Stefanut Bodnarescul, Riccardo Coppola, Luca Ardito
Software testing is a crucial phase in software development, enabling the detection of issues and defects that may arise during the development process. Addressing these issues enhances software applications' quality, reliability, user experience, and performance. Graphical User Interface (GUI) testing, one such technique, involves mimicking a regular user's interactions with an application to identify
-
Teaching Machines to Code: Smart Contract Translation with LLMs arXiv.cs.SE Pub Date : 2024-03-13 Rabimba Karanjai, Lei Xu, Weidong Shi
The advent of large language models (LLMs) has marked a significant milestone in the realm of artificial intelligence, with their capabilities often matching or surpassing human expertise in various domains. Among these achievements, their adeptness in translation tasks stands out, closely mimicking the intricate and preliminary processes undertaken by human translators to ensure the fidelity and quality
-
ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment arXiv.cs.SE Pub Date : 2024-03-15 Xiaofeng Wu, Jia Rao, Wei Chen
The advent of the Transformer architecture has propelled the growth of natural language processing (NLP) models, leading to remarkable achievements in numerous NLP tasks. Yet, the absence of specialized hardware like expansive GPU memory and high-speed interconnects poses challenges for training large-scale models. This makes it daunting for many users to experiment with pre-training and fine-tuning
-
A Tale of Two Communities: Exploring Academic References on Stack Overflow arXiv.cs.SE Pub Date : 2024-03-14 Run Huang, Souti Chattopadhyay
Stack Overflow is widely recognized by software practitioners as the go-to resource for addressing technical issues and sharing practical solutions. While it is not typically seen as a forum for scholarly discourse, users on Stack Overflow often refer to academic sources in their discussions. Yet, little is known about these referenced works from the academic community and how they intersect the needs
-
Welcome Your New AI Teammate: On Safety Analysis by Leashing Large Language Models arXiv.cs.SE Pub Date : 2024-03-14 Ali Nouri, Beatriz Cabrero-Daniel, Fredrik Törner, Håkan Sivencrona, Christian Berger
DevOps is a necessity in many industries, including the development of Autonomous Vehicles. In those settings, there are iterative activities that reduce the speed of SafetyOps cycles. One of these activities is "Hazard Analysis & Risk Assessment" (HARA), which is an essential step to start the safety requirements specification. As a potential approach to increase the speed of this step in SafetyOps
-
Analyzing and Mitigating (with LLMs) the Security Misconfigurations of Helm Charts from Artifact Hub arXiv.cs.SE Pub Date : 2024-03-14 Francesco Minna, Fabio Massacci, Katja Tuma
Background: Helm is a package manager that allows defining, installing, and upgrading applications with Kubernetes (K8s), a popular container orchestration platform. A Helm chart is a collection of files describing all dependencies, resources, and parameters required for deploying an application within a K8s cluster. Objective: The goal of this study is to mine and empirically evaluate the security
-
Code Revert Prediction with Graph Neural Networks: A Case Study at J.P. Morgan Chase arXiv.cs.SE Pub Date : 2024-03-14 Yulong Pei, Salwa Alamir, Rares Dolga, Sameena Shah
Code revert prediction, a specialized form of software defect detection, aims to forecast or predict the likelihood of code changes being reverted or rolled back in software development. This task is very important in practice because by identifying code changes that are more prone to being reverted, developers and project managers can proactively take measures to prevent issues, improve code quality
-
LLM-based agents for automating the enhancement of user story quality: An early report arXiv.cs.SE Pub Date : 2024-03-14 Zheying Zhang, Maruf Rayhan, Tomas Herda, Manuel Goisauf, Pekka Abrahamsson
In agile software development, maintaining high-quality user stories is crucial, but also challenging. This study explores the use of large language models to automatically improve the user story quality in Austrian Post Group IT agile teams. We developed a reference model for an Autonomous LLM-based Agent System and implemented it at the company. The quality of user stories in the study and the effectiveness
-
An Extensible Framework for Architecture-Based Data Flow Analysis for Information Security arXiv.cs.SE Pub Date : 2024-03-14 Nicolas Boltz, Sebastian Hahner, Christopher Gerking, Robert Heinrich
The growing interconnection between software systems increases the need for security already at design time. Security-related properties like confidentiality are often analyzed based on data flow diagrams (DFDs). However, manually analyzing DFDs of large software systems is bothersome and error-prone, and adjusting an already deployed software is costly. Additionally, closed analysis ecosystems limit
-
Leveraging the Crowd for Dependency Management: An Empirical Study on the Dependabot Compatibility Score arXiv.cs.SE Pub Date : 2024-03-14 Benjamin Rombaut, Filipe R. Cogo, Ahmed E. Hassan
Dependabot, a popular dependency management tool, includes a compatibility score feature that helps client packages assess the risk of accepting a dependency update by leveraging knowledge from "the crowd". For each dependency update, Dependabot calculates this compatibility score as the proportion of successful updates performed by other client packages that use the same provider package as a dependency
-
Bugs in Large Language Models Generated Code arXiv.cs.SE Pub Date : 2024-03-13 Florian Tambon, Arghavan Moradi Dakhel, Amin Nikanjam, Foutse Khomh, Michel C. Desmarais, Giuliano Antoniol
Large Language Models (LLMs) for code have gained significant attention recently. They can generate code in different programming languages based on provided prompts, fulfilling a long-lasting dream in Software Engineering (SE), i.e., automatic code generation. Similar to human-written code, LLM-generated code is prone to bugs, and these bugs have not yet been thoroughly examined by the community.
-
Loop unrolling (for test coverage): formal definition arXiv.cs.SE Pub Date : 2024-03-13 Bertrand Meyer
Techniques to achieve various forms of test coverage, such as branch coverage, typically do not iterate loops; in other words, they treat a loop as a conditional, executed zero or one time. Existing work by the author and collaborators produces test suites guaranteeing full branch coverage. More recent work has shown that by unrolling loops the approach can find significantly more bugs. The present
-
QCSHQD: Quantum computing as a service for Hybrid classical-quantum software development: A Vision arXiv.cs.SE Pub Date : 2024-03-13 Arif Ali Khan, Maryam Tavassoli Sabzevari, Davide Taibi, Matteo Esposito
Quantum Computing (QC) is transitioning from theoretical frameworks to an indispensable powerhouse of computational capability, resulting in extensive adoption across both industrial and academic domains. QC presents exceptional advantages, including unparalleled processing speed and the potential to solve complex problems beyond the capabilities of classical computers. Nevertheless, academic researchers
-
CAM: A Collection of Snapshots of GitHub Java Repositories Together with Metrics arXiv.cs.SE Pub Date : 2024-03-13 Yegor Bugayenko
Even though numerous researchers require stable datasets along with source code and basic metrics calculated on them, neither GitHub nor any other code hosting platform provides such a resource. Consequently, each researcher must download their own data, compute the necessary metrics, and then publish the dataset somewhere to ensure it remains accessible indefinitely. Our CAM (stands for "Classes
-
Understanding and Evaluating Developer Behaviour in Programming Tasks arXiv.cs.SE Pub Date : 2024-03-13 Martin Schröer, Rainer Koschke
To evaluate how developers perform differently in solving programming tasks, i.e., which actions and behaviours are more beneficial to them than others and if there are any specific strategies and behaviours that may indicate good versus poor understanding of the task and program given to them, we used the MIMESIS plug-in to record developers' interactions with the IDE. In a series of three studies
-
Search-based Optimisation of LLM Learning Shots for Story Point Estimation arXiv.cs.SE Pub Date : 2024-03-13 Vali Tawosi, Salwa Alamir, Xiaomo Liu
One of the ways Large Language Models (LLMs) are used to perform machine learning tasks is to provide them with a few examples before asking them to produce a prediction. This is a meta-learning process known as few-shot learning. In this paper, we use available Search-Based methods to optimise the number and combination of examples that can improve an LLM's estimation performance, when it is used
-
Software Vulnerability and Functionality Assessment using LLMs arXiv.cs.SE Pub Date : 2024-03-13 Rasmus Ingemann Tuffveson Jensen, Vali Tawosi, Salwa Alamir
While code review is central to the software development process, it can be tedious and expensive to carry out. In this paper, we investigate whether and how Large Language Models (LLMs) can aid with code reviews. Our investigation focuses on two tasks that we argue are fundamental to good reviews: (i) flagging code with security vulnerabilities and (ii) performing software functionality validation
-
System for systematic literature review using multiple AI agents: Concept and an empirical evaluation arXiv.cs.SE Pub Date : 2024-03-13 Abdul Malik Sami, Zeeshan Rasheed, Kai-Kristian Kemell, Muhammad Waseem, Terhi Kilamo, Mika Saari, Anh Nguyen Duc, Kari Systä, Pekka Abrahamsson
Systematic Literature Reviews (SLRs) have become the foundation of evidence-based studies, enabling researchers to identify, classify, and combine existing studies based on specific research questions. Conducting an SLR is largely a manual process. Over the previous years, researchers have made significant progress in automating certain phases of the SLR process, aiming to reduce the effort and time
-
A Picture Is Worth a Thousand Words: Exploring Diagram and Video-Based OOP Exercises to Counter LLM Over-Reliance arXiv.cs.SE Pub Date : 2024-03-13 Bruno Pereira Cipriano, Pedro Alves, Paul Denny
Much research has highlighted the impressive capabilities of large language models (LLMs), like GPT and Bard, for solving introductory programming exercises. Recent work has shown that LLMs can effectively solve a range of more complex object-oriented programming (OOP) exercises with text-based specifications. This raises concerns about academic integrity, as students might use these models to complete
-
Log Summarisation for Defect Evolution Analysis arXiv.cs.SE Pub Date : 2024-03-13 Rares Dolga, Ran Zmigrod, Rui Silva, Salwa Alamir, Sameena Shah
Log analysis and monitoring are essential aspects of software maintenance and defect identification. In particular, the temporal nature and vast size of log data lead to an interesting and important research question: How can logs be summarised and monitored over time? While this has been a fundamental topic of research in the software engineering community, work has typically focused on heuristic-
-
When Code Smells Meet ML: On the Lifecycle of ML-specific Code Smells in ML-enabled Systems arXiv.cs.SE Pub Date : 2024-03-13 Gilberto Recupito, Giammaria Giordano, Filomena Ferrucci, Dario Di Nucci, Fabio Palomba
Context. The adoption of Machine Learning (ML)-enabled systems is steadily increasing. Nevertheless, there is a shortage of ML-specific quality assurance approaches, possibly because of the limited knowledge of how quality-related concerns emerge and evolve in ML-enabled systems. Objective. We aim to investigate the emergence and evolution of specific types of quality-related concerns known as ML-specific
-
AutoDev: Automated AI-Driven Development arXiv.cs.SE Pub Date : 2024-03-13 Michele Tufano, Anisha Agarwal, Jinu Jang, Roshanak Zilouchian Moghaddam, Neel Sundaresan
The landscape of software development has witnessed a paradigm shift with the advent of AI-powered assistants, exemplified by GitHub Copilot. However, existing solutions are not leveraging all the potential capabilities available in an IDE such as building, testing, executing code, git operations, etc. Therefore, they are constrained by their limited capabilities, primarily focusing on suggesting code
-
Assessing the Influence of Toxic and Gender Discriminatory Communication on Perceptible Diversity in OSS Projects arXiv.cs.SE Pub Date : 2024-03-12 Sayma Sultana, Gias Uddin, Amiangshu Bosu
The presence of toxic and gender-identity derogatory language in open-source software (OSS) communities has recently become a focal point for researchers. Such comments not only lead to frustration and disengagement among developers but may also influence their departure from OSS projects. Despite ample evidence suggesting that diverse teams enhance productivity, the existence of toxic or gender identity
-
Lessons from a Pioneering Software Engineering Environment: Design Principles of Software through Pictures arXiv.cs.SE Pub Date : 2024-03-12 Anthony I. (Tony) Wasserman
This paper describes the historical background that led to the development of the innovative Software through Pictures multi-user development environment, and the principles for its integration with other software products to create a software engineering environment covering multiple tasks in the software development lifecycle.
-
Bus Factor Explorer arXiv.cs.SE Pub Date : 2024-03-12 Egor Klimov, Muhammad Umair Ahmed, Nikolai Sviridov, Pouria Derakhshanfar, Eray Tüzün, Vladimir Kovalenko
Bus factor (BF) is a metric that tracks knowledge distribution in a project. It is the minimal number of engineers that have to leave for a project to stall. Despite the fact that there are several algorithms for calculating the bus factor, only a few tools allow easy calculation of bus factor and convenient analysis of results for projects hosted on Git-based providers. We introduce Bus Factor Explorer
-
DevBench: A Comprehensive Benchmark for Software Development arXiv.cs.SE Pub Date : 2024-03-13 Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, Zhiyin Yu, He Du, Ping Yang, Dahua Lin, Chao Peng, Kai Chen
Recent advancements in large language models (LLMs) have significantly enhanced their coding capabilities. However, existing benchmarks predominantly focused on simplified or isolated aspects of programming, such as single-file code generation or repository issue debugging, falling short of measuring the full spectrum of challenges raised by real-world programming activities. To this end, we propose
-
An Integrated Usability Framework for Evaluating Open Government Data Portals: Comparative Analysis of EU and GCC Countries arXiv.cs.SE Pub Date : 2024-03-13 Fillip Molodtsov, Anastasija Nikiforova
This study explores the critical role of open government data (OGD) portals in fostering transparency and collaboration between diverse stakeholders. Recognizing the challenges of usability, communication with diverse populations, and strategic value creation, this paper develops an integrated framework for evaluating OGD portal effectiveness that accommodates user diversity (regardless of their data
-
Translating between SQL Dialects for Cloud Migration arXiv.cs.SE Pub Date : 2024-03-13 Ran Zmigrod, Salwa Alamir, Xiaomo Liu
The migration of systems from on-site premises to the cloud has been a fundamental endeavor for many industrial institutions. A crucial component of such cloud migrations is the transition of databases to be hosted online. In this work, we consider the difficulties of this migration for SQL databases. While SQL is one of the prominent methods for storing database procedures, there are a plethora of different
-
Augmenting Interpolation-Based Model Checking with Auxiliary Invariants (Extended Version) arXiv.cs.SE Pub Date : 2024-03-12 Dirk Beyer, Po-Chun Chien, Nian-Ze Lee
Software model checking is a challenging problem, and generating relevant invariants is a key factor in proving the safety properties of a program. Program invariants can be obtained by various approaches, including lightweight procedures based on data-flow analysis and intensive techniques using Craig interpolation. Although data-flow analysis runs efficiently, it often produces invariants that are
-
Supporting Error Chains in Static Analysis for Precise Evaluation Results and Enhanced Usability arXiv.cs.SE Pub Date : 2024-03-12 Anna-Katharina Wickert, Michael Schlichtig, Marvin Vogel, Lukas Winter, Mira Mezini, Eric Bodden
Context: Static analyses are well-established to aid in understanding bugs or vulnerabilities during the development process or in large-scale studies. A low false-positive rate is essential for adoption in practice and for precise results of empirical studies. Unfortunately, static analyses tend to report where a vulnerability manifests rather than the fix location. This can cause presumed false
-
SATDAUG -- A Balanced and Augmented Dataset for Detecting Self-Admitted Technical Debt arXiv.cs.SE Pub Date : 2024-03-12 Edi Sutoyo, Andrea Capiluppi
Self-admitted technical debt (SATD) refers to a form of technical debt in which developers explicitly acknowledge and document the existence of technical shortcuts, workarounds, or temporary solutions within the codebase. Over recent years, researchers have manually labeled datasets derived from various software development artifacts: source code comments, messages from the issue tracker and pull request
-
A Flexible Cell Classification for ML Projects in Jupyter Notebooks arXiv.cs.SE Pub Date : 2024-03-12 Miguel Perez, Selin Aydin, Horst Lichter
Jupyter Notebook is an interactive development environment commonly used for rapid experimentation of machine learning (ML) solutions. Describing the ML activities performed along code cells improves the readability and understanding of Notebooks. Manual annotation of code cells is time-consuming and error-prone. Therefore, tools have been developed that classify the cells of a notebook concerning
-
Process Modeling With Large Language Models arXiv.cs.SE Pub Date : 2024-03-12 Humam Kourani, Alessandro Berti, Daniel Schuster, Wil M. P. van der Aalst
In the realm of Business Process Management (BPM), process modeling plays a crucial role in translating complex process dynamics into comprehensible visual representations, facilitating the understanding, analysis, improvement, and automation of organizational processes. Traditional process modeling methods often require extensive expertise and can be time-consuming. This paper explores the integration
-
Robustness, Security, Privacy, Explainability, Efficiency, and Usability of Large Language Models for Code arXiv.cs.SE Pub Date : 2024-03-12 Zhou Yang, Zhensu Sun, Terry Zhuo Yue, Premkumar Devanbu, David Lo
Large language models for code (LLM4Code), which demonstrate strong performance (e.g., high accuracy) in processing source code, have significantly transformed software engineering. Many studies separately investigate the non-functional properties of LLM4Code, but there is no systematic review of how these properties are evaluated and enhanced. This paper fills this gap by thoroughly examining 146 relevant
-
Fixing Smart Contract Vulnerabilities: A Comparative Analysis of Literature and Developer's Practices arXiv.cs.SE Pub Date : 2024-03-12 Francesco Salzano, Simone Scalabrino, Rocco Oliveto, Remo Pareschi
Smart Contracts are programs running logic in the Blockchain network by executing operations through immutable transactions. The Blockchain network validates such transactions, storing them in sequential blocks whose integrity is ensured. Smart Contracts deal with value stakes: if a damaging transaction is validated, it may never be reverted, leading to unrecoverable losses. To prevent this, security
-
Digital Twin Evolution for Sustainable Smart Ecosystems arXiv.cs.SE Pub Date : 2024-03-11 Istvan David, Judith Michael, Dominik Bork
Smart ecosystems are the drivers of modern society. They control critical infrastructures, ensuring their stable and sustainable operation. Smart ecosystems are governed by digital twins -- real-time virtual representations of physical infrastructure. To support the open-ended and reactive traits of smart ecosystems, digital twins need to be able to evolve in reaction to changing conditions. However
-
Exploring Safety Generalization Challenges of Large Language Models via Code arXiv.cs.SE Pub Date : 2024-03-12 Qibing Ren, Chang Gao, Jing Shao, Junchi Yan, Xin Tan, Wai Lam, Lizhuang Ma
The rapid advancement of Large Language Models (LLMs) has brought about remarkable capabilities in natural language processing but also raised concerns about their potential misuse. While strategies like supervised fine-tuning and reinforcement learning from human feedback have enhanced their safety, these methods primarily focus on natural languages, which may not generalize to other domains. This
-
PROSKILL: A formal skill language for acting in robotics arXiv.cs.SE Pub Date : 2024-03-12 Félix Ingrand (LAAS-CNRS, Université de Toulouse, Toulouse, France)
Acting is an important decisional function for autonomous robots. Acting relies on skills to implement and to model the activities it oversees: refinement, local recovery, temporal dispatching, external asynchronous events, and commands execution, all done online. While sitting between planning and the robotic platform, acting often relies on programming primitives and an interpreter which executes
-
Comparison of Static Analysis Architecture Recovery Tools for Microservice Applications arXiv.cs.SE Pub Date : 2024-03-11 Simon Schneider, Alexander Bakhtin, Xiaozhou Li, Jacopo Soldani, Antonio Brogi, Tomas Cerny, Riccardo Scandariato, Davide Taibi
Architecture recovery tools help software engineers obtain an overview of their software systems during all phases of the software development lifecycle. This is especially important for microservice applications because their distributed nature makes it more challenging to oversee the architecture. Various tools and techniques for this task are presented in academic and grey literature sources. Practitioners
-
NLP4RE Tools: Classification, Overview, and Management arXiv.cs.SE Pub Date : 2024-03-11 Julian Frattini, Michael Unterkalmsteiner, Davide Fucci, Daniel Mendez
Tools constitute an essential contribution to natural language processing for requirements engineering (NLP4RE) research. They are executable instruments that make research usable and applicable in practice. In this chapter, we first introduce a systematic classification of NLP4RE tools to improve the understanding of their types and properties. Then, we extend an existing overview with a systematic
-
SmartML: Towards a Modeling Language for Smart Contracts arXiv.cs.SE Pub Date : 2024-03-11 Adele Veschetti, Richard Bubel, Reiner Hähnle
Smart contracts codify real-world transactions and automatically execute the terms of the contract when predefined conditions are met. This paper proposes SmartML, a modeling language for smart contracts that is platform independent and easy to comprehend. We detail the formal semantics and the type system, focusing on its role in addressing security vulnerabilities and attacks. Through case studies
-
Technical Debt Management: The Road Ahead for Successful Software Delivery arXiv.cs.SE Pub Date : 2024-03-11 Paris Avgeriou, Ipek Ozkaya, Alexander Chatzigeorgiou, Marcus Ciolkowski, Neil A. Ernst, Ronald J. Koontz, Eltjo Poort, Forrest Shull
Technical Debt, considered by many to be the 'silent killer' of software projects, has undeniably become part of the everyday vocabulary of software engineers. We know it compromises the internal quality of a system, either deliberately or inadvertently. We understand Technical Debt is not all derogatory, often serving the purpose of expediency. But it is associated with a clear risk, especially for
-
LLMs Still Can't Avoid Instanceof: An Investigation Into GPT-3.5, GPT-4 and Bard's Capacity to Handle Object-Oriented Programming Assignments arXiv.cs.SE Pub Date : 2024-03-10 Bruno Pereira Cipriano, Pedro Alves
Large Language Models (LLMs) have emerged as promising tools to assist students while solving programming assignments. However, object-oriented programming (OOP), with its inherent complexity involving the identification of entities, relationships, and responsibilities, is not yet mastered by these tools. Contrary to introductory programming exercises, there exists a research gap with regard to the
-
RepoHyper: Better Context Retrieval Is All You Need for Repository-Level Code Completion arXiv.cs.SE Pub Date : 2024-03-10 Huy N. Phan, Hoang N. Phan, Tien N. Nguyen, Nghi D. Q. Bui
Code Large Language Models (CodeLLMs) have demonstrated impressive proficiency in code completion tasks. However, they often fall short of fully understanding the extensive context of a project repository, such as the intricacies of relevant files and class hierarchies, which can result in less precise completions. To overcome these limitations, we present RepoHyper, a multifaceted framework designed
-
Integrating Static Code Analysis Toolchains arXiv.cs.SE Pub Date : 2024-03-09 Matthias Kern, Ferhat Erata, Markus Iser, Carsten Sinz, Frederic Loiret, Stefan Otten, Eric Sax
This paper proposes an approach for a tool-agnostic and heterogeneous static code analysis toolchain in combination with an exchange format. This approach enhances both traceability and comparability of analysis results. State-of-the-art toolchains support features for either test execution and build automation or traceability between tests, requirements and design information. Our approach combines
-
A Novel Refactoring and Semantic Aware Abstract Syntax Tree Differencing Tool and a Benchmark for Evaluating the Accuracy of Diff Tools arXiv.cs.SE Pub Date : 2024-03-09 Pouria Alikhanifard, Nikolaos Tsantalis
Software undergoes constant changes to support new requirements, address bugs, enhance performance, and ensure maintainability. Thus, developers spend a great portion of their workday trying to understand and review the code changes of their teammates. Abstract Syntax Tree (AST) diff tools were developed to overcome the limitations of line-based diff tools, which are used by the majority of developers
-
LEGION: Harnessing Pre-trained Language Models for GitHub Topic Recommendations with Distribution-Balance Loss arXiv.cs.SE Pub Date : 2024-03-09 Yen-Trang Dang, Thanh-Le Cong, Phuc-Thanh Nguyen, Anh M. T. Bui, Phuong T. Nguyen, Bach Le, Quyet-Thang Huynh
Open-source development has revolutionized the software industry by promoting collaboration, transparency, and community-driven innovation. Today, a vast amount of various kinds of open-source software, which form networks of repositories, is often hosted on GitHub - a popular software development platform. To enhance the discoverability of the repository networks, i.e., groups of similar repositories
-
Engineering Formality and Software Risk in Debian Python Packages arXiv.cs.SE Pub Date : 2024-03-08 Matthew Gaughan, Kaylea Champion, Sohyeon Hwang
While free/libre and open source software (FLOSS) is critical to global computing infrastructure, the maintenance of widely-adopted FLOSS packages is dependent on volunteer developers who select their own tasks. Risk of failure due to the misalignment of engineering supply and demand -- known as underproduction -- has led to code base decay and subsequent cybersecurity incidents such as the Heartbleed
-
Mining Issue Trackers: Concepts and Techniques arXiv.cs.SE Pub Date : 2024-03-08 Lloyd Montgomery, Clara Lüders, Walid Maalej
An issue tracker is a software tool used by organisations to interact with users and manage various aspects of the software development lifecycle. With the rise of agile methodologies, issue trackers have become popular in open and closed-source settings alike. Internal and external stakeholders report, manage, and discuss "issues", which represent different information such as requirements and maintenance
-
Explaining Code with a Purpose: An Integrated Approach for Developing Code Comprehension and Prompting Skills arXiv.cs.SE Pub Date : 2024-03-10 Paul Denny, David H. Smith IV, Max Fowler, James Prather, Brett A. Becker, Juho Leinonen
Reading, understanding and explaining code have traditionally been important skills for novices learning programming. As large language models (LLMs) become prevalent, these foundational skills are more important than ever given the increasing need to understand and evaluate model-generated code. Brand new skills are also needed, such as the ability to formulate clear prompts that can elicit intended
-
Digital Wellbeing Redefined: Toward User-Centric Approach for Positive Social Media Engagement arXiv.cs.SE Pub Date : 2024-03-08 Yixue Zhao, Tianyi Li, Michael Sobolev
The prevalence of social media and its escalating impact on mental health has highlighted the need for effective digital wellbeing strategies. Current digital wellbeing interventions have primarily focused on reducing screen time and social media use, often neglecting the potential benefits of these platforms. This paper introduces a new perspective centered around empowering positive social media
-
Scalable Software as a Service Architecture arXiv.cs.SE Pub Date : 2024-03-08 Ardy Dedase
This paper explores the architecture of Software as a Service (SaaS) platforms, emphasizing scalability and maintainability. SaaS, a flexible software distribution model suitable for individuals and organizations, has become prevalent with the advent of Cloud services. This paper aims to provide a high-level design reference for establishing a scalable and maintainable SaaS architecture.
-
Bug Priority Change: An Empirical Study on Apache Projects arXiv.cs.SE Pub Date : 2024-03-08 Zengyang Li, Guangzong Cai, Qinyi Yu, Peng Liang, Ran Mo, Hui Liu
In issue tracking systems, each bug is assigned a priority level (e.g., Blocker, Critical, Major, Minor, or Trivial in JIRA from highest to lowest), which indicates the urgency level of the bug. In this sense, understanding bug priority changes helps to arrange the work schedule of participants reasonably, and facilitates a better analysis and resolution of bugs. According to the data extracted from