-
NLP Verification: Towards a General Methodology for Certifying Robustness arXiv.cs.PL Pub Date : 2024-03-15 Marco Casadio, Tanvi Dinkar, Ekaterina Komendantskaya, Luca Arnaboldi, Omri Isac, Matthew L. Daggitt, Guy Katz, Verena Rieser, Oliver Lemon
Deep neural networks have exhibited substantial success in the field of Natural Language Processing (NLP) and ensuring their safety and reliability is crucial: there are safety critical contexts where such models must be robust to variability or attack, and give guarantees over their output. Unlike Computer Vision, NLP lacks a unified verification methodology and, despite recent advancements in literature
-
StarMalloc: A Formally Verified, Concurrent, Performant, and Security-Oriented Memory Allocator arXiv.cs.PL Pub Date : 2024-03-14 Antonin Reitz, Aymeric Fromherz, Jonathan Protzenko
In this work, we present StarMalloc, a verified, security-oriented, concurrent memory allocator that can be used as a drop-in replacement in real-world projects. Using the Steel separation logic framework, we show how to specify and verify StarMalloc, relying on dependent types and modular abstractions to enable efficient verification. As part of StarMalloc, we also develop several generic datastructures
-
Formalizing Date Arithmetic and Statically Detecting Ambiguities for the Law arXiv.cs.PL Pub Date : 2024-03-13 Raphaël Monat, Aymeric Fromherz, Denis Merigoux
Legal expert systems routinely rely on date computations to determine the eligibility of a citizen to social benefits or whether an application has been filed on time. Unfortunately, date arithmetic exhibits many corner cases, which are handled differently from one library to the other, making faithfully transcribing the law into code error-prone, and possibly leading to heavy financial and legal consequences
-
Flexible Non-intrusive Dynamic Instrumentation for WebAssembly arXiv.cs.PL Pub Date : 2024-03-12 Ben L. Titzer, Elizabeth Gilbert, Bradley Wei Jie Teo, Yash Anand, Kazuyuki Takayama, Heather Miller
A key strength of managed runtimes over hardware is the ability to gain detailed insight into the dynamic execution of programs with instrumentation. Analyses such as code coverage, execution frequency, tracing, and debugging, are all made easier in a virtual setting. As a portable, low-level bytecode, WebAssembly offers inexpensive in-process sandboxing with high performance. Yet to date, Wasm engines
-
A bargain for mergesorts (functional pearl) -- How to prove your mergesort correct and stable, almost for free arXiv.cs.PL Pub Date : 2024-03-13 Cyril Cohen, Kazuhiko Sakaguchi
We present a novel characterization of stable mergesort functions using relational parametricity, and show that it implies the correctness of mergesort. As a result, one can prove the correctness of several variations of mergesort (e.g., top-down, bottom-up, tail-recursive, non-tail-recursive, smooth, and non-smooth mergesorts) by proving the characterization property for each variation. To further
-
Deriving Dependently-Typed OOP from First Principles -- Extended Version with Additional Appendices arXiv.cs.PL Pub Date : 2024-03-11 David Binder, Ingo Skupin, Tim Süberkrüb, Klaus Ostermann
The expression problem describes how most types can easily be extended with new ways to produce the type or new ways to consume the type, but not both. When abstract syntax trees are defined as an algebraic data type, for example, they can easily be extended with new consumers, such as print or eval, but adding a new constructor requires the modification of all existing pattern matches. The expression
-
A Minority of C++ Objects Account for the Majority of Allocation CPU Time arXiv.cs.PL Pub Date : 2024-03-11 Eugene Darashkevich, Roman Rusyaev, Roman Korostinskiy, Yegor Bugayenko
In C++, an object can be allocated in static memory, on the stack, or on the heap, where the last is an order of magnitude more expensive, performance-wise, than the first two. However, it is not clear how much overall performance loss may be attributed to the use of on-heap objects in C++ applications. This study aims to fill this gap by analyzing object allocation practices in open-source
-
Towards Fixed-Point Formats Determination for Faust Programs arXiv.cs.PL Pub Date : 2024-03-11 Agathe Herrou (GRAME), Florent de Dinechin (INSA Lyon), Stéphane Letz (GRAME), Yann Orlarey, Anastasia Volkova
Modern programmable digital signal processing relies on floating-point numbers for their ease of use. Fixed-point number formats have the potential to save resources and improve execution time, but realising this potential burdens the programmer with the need to define each format, at every step of the computation. This article reviews existing methods to automatically determine fixed-point formats
-
A Tool for Automated Reasoning About Traces Based on Configurable Formal Semantics arXiv.cs.PL Pub Date : 2024-03-09 Ferhat Erata, Arda Goknil, Bedir Tekinerdogan, Geylani Kardas
We present Tarski, a tool for specifying configurable trace semantics to facilitate automated reasoning about traces. Software development projects require that various types of traces be modeled between and within development artifacts. For any given artifact (e.g., requirements, architecture models and source code), Tarski allows the user to specify new trace types and their configurable semantics
-
Realizability in Semantics-Guided Synthesis Done Eagerly arXiv.cs.PL Pub Date : 2024-03-08 Roland Meyer, Jakob Tepe, Sebastian Wolff
We present realizability and realization logic, two program logics that jointly address the problem of finding solutions in semantics-guided synthesis. What is new is that we proceed eagerly and not only analyze a single candidate program but a whole set. Realizability logic computes information about the set of candidate programs in a forward fashion. Realization logic uses this information as guidance
-
We Know I Know You Know; Choreographic Programming With Multicast and Multiply Located Values arXiv.cs.PL Pub Date : 2024-03-08 Mako Bates, Joe Near
Concurrent distributed systems are notoriously difficult to construct and reason about. Choreographic programming is a recent paradigm that describes a distributed system in a single global program called a choreography. Choreographies simplify reasoning about distributed systems and can ensure deadlock freedom by static analysis. In previous choreographic programming languages, each value is located
-
Sound and Complete Witnesses for Template-based Verification of LTL Properties on Polynomial Programs arXiv.cs.PL Pub Date : 2024-03-08 Krishnendu Chatterjee, Amir Kafshdar Goharshady, Ehsan Kafshdar Goharshady, Mehrdad Karrabi, Đorđe Žikelić
We study the classical problem of verifying programs with respect to formal specifications given in the linear temporal logic (LTL). We first present novel sound and complete witnesses for LTL verification over imperative programs. Our witnesses are applicable to both universal (all runs) and existential (some run) settings. We then consider LTL formulas in which atomic propositions can be polynomial
-
Modeling Dynamic (De)Allocations of Local Memory for Translation Validation arXiv.cs.PL Pub Date : 2024-03-08 Abhishek Rose, Sorav Bansal
End-to-End Translation Validation is the problem of verifying the executable code generated by a compiler against the corresponding input source code for a single compilation. This becomes particularly hard in the presence of dynamically-allocated local memory where addresses of local memory may be observed by the program. In the context of validating the translation of a C procedure to executable
-
LLM4Decompile: Decompiling Binary Code with Large Language Models arXiv.cs.PL Pub Date : 2024-03-08 Hanzhuo Tan, Qi Luo, Jing Li, Yuqun Zhang
Decompilation aims to restore compiled code to human-readable source code, but struggles with details like names and structure. Large language models (LLMs) show promise for programming tasks, motivating their application to decompilation. However, there does not exist any open-source LLM for decompilation. Moreover, existing decompilation evaluation systems mainly consider token-level accuracy and
-
Cedar: A New Language for Expressive, Fast, Safe, and Analyzable Authorization (Extended Version) arXiv.cs.PL Pub Date : 2024-03-07 Joseph Cutler, Craig Disselkoen, Aaron Eline, Shaobo He, Kyle Headley, Michael Hicks, Kesha Hietala, Elefterios Ioannidis, John Kastner, Anwar Mamat, Darin McAdams, Matt McCutchen, Neha Rungta, Emina Torlak, Andrew Wells
Cedar is a new authorization policy language designed to be ergonomic, fast, safe, and analyzable. Rather than embed authorization logic in an application's code, developers can write that logic as Cedar policies and delegate access decisions to Cedar's evaluation engine. Cedar's simple and intuitive syntax supports common authorization use-cases with readable policies, naturally leveraging concepts
-
Message-Observing Sessions arXiv.cs.PL Pub Date : 2024-03-07 Ryan Kavanagh, Brigitte Pientka
We present Most, a process language with message-observing session types. Message-observing session types extend binary session types with type-level computation to specify communication protocols that vary based on messages observed on other channels. Hence, Most allows us to express global invariants about processes, rather than just local invariants, in a bottom-up, compositional way. We give Most
-
Strong Priority and Determinacy in Timed CCS arXiv.cs.PL Pub Date : 2024-03-07 Luigi Liquori, Michael Mendler
Building on the classical theory of process algebra with priorities, we identify a new scheduling mechanism, called "sequentially constructive reduction" which is designed to capture the essence of synchronous programming. The distinctive property of this evaluation strategy is to achieve determinism-by-construction for multi-cast concurrent communication. In particular, it permits us to model shared
-
Conjugate operators for transparent, explorable research outputs arXiv.cs.PL Pub Date : 2024-03-07 Joseph Bond, Cristina David, Minh Nguyen, Dominic Orchard, Roly Perera
Charts, figures, and text derived from data play an important role in decision making, from data-driven policy development to day-to-day choices informed by online articles. Making sense of, or fact-checking, outputs means understanding how they relate to the underlying data. Even for domain experts with access to the source code and data sets, this poses a significant challenge. In this paper we introduce
-
QRtree -- Decision Tree dialect specification of QRscript arXiv.cs.PL Pub Date : 2024-03-07 Stefano Scanzio, Matteo Rosani, Mattia Scamuzzi, Gianluca Cena
This specification document specifies the syntax and semantics of QRtree, which is a specific dialect of QRscript particularly suited to represent decision trees without chance nodes. The term dialect identifies one of the possible sub-languages that can be encoded inside of an eQR code via QRscript. This specification will describe an intermediate representation of QRtree, made through a language
-
QRscript specification arXiv.cs.PL Pub Date : 2024-03-07 Stefano Scanzio, Matteo Rosani, Mattia Scamuzzi, Gianluca Cena
This specification document specifies the syntax and semantics of QRscript. The current document only shows the part related to the QRscript header, i.e., the first part of the binary code that must be inserted into the QR code. A QR code containing an executable code is called an executable QR code (eQR code). QRscript supports different dialects, i.e., sublanguages with implementation characteristics
-
Generative Explanations for Program Synthesizers arXiv.cs.PL Pub Date : 2024-03-06 Amirmohammad Nazari, Souti Chattopadhyay, Swabha Swayamdipta, Mukund Raghothaman
Despite great advances in program synthesis techniques, they remain algorithmic black boxes. Although they guarantee that when synthesis is successful, the implementation satisfies the specification, they provide no additional information regarding how the implementation works or the manner in which the specification is realized. One possibility to answer these questions is to use large language models
-
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators arXiv.cs.PL Pub Date : 2024-03-06 Indraneil Paul, Jun Luo, Goran Glavaš, Iryna Gurevych
Code understanding and generation have fast become some of the most popular applications of language models (LMs). Nonetheless, research on multilingual aspects of Code-LMs (i.e., LMs for code generation) such as cross-lingual transfer between different programming languages, language-specific data augmentation, and post-hoc LM adaptation, alongside exploitation of data sources other than the original
-
Automated Software Verification of Hyperliveness arXiv.cs.PL Pub Date : 2024-03-05 Raven Beutner
Hyperproperties relate multiple executions of a program and are commonly used to specify security and information-flow policies. Most existing work has focused on the verification of $k$-safety properties, i.e., properties that state that all $k$-tuples of execution traces satisfy a given property. In this paper, we study the automated verification of richer properties that combine universal and existential
-
VeriEQL: Bounded Equivalence Verification for Complex SQL Queries with Integrity Constraints arXiv.cs.PL Pub Date : 2024-03-05 Yang He, Pinhan Zhao, Xinyu Wang, Yuepeng Wang
The task of SQL query equivalence checking is important in various real-world applications (including query rewriting and automated grading) that involve complex queries with integrity constraints; yet, state-of-the-art techniques are very limited in their capability of reasoning about complex features (e.g., those that involve sorting, case statement, rich integrity constraints, etc.) in real-life
-
Mars 2.0: A Toolchain for Modeling, Analysis, Verification and Code Generation of Cyber-Physical Systems arXiv.cs.PL Pub Date : 2024-03-05 Bohua Zhan, Xiong Xu, Qiang Gao, Zekun Ji, Xiangyu Jin, Shuling Wang, Naijun Zhan
We introduce Mars 2.0 for modeling, analysis, verification and code generation of Cyber-Physical Systems. Mars 2.0 integrates Mars 1.0 with several important extensions and improvements, allowing the design of cyber-physical systems using the combination of AADL and Simulink/Stateflow, which provide a unified graphical framework for modeling the functionality, physicality and architecture of the system
-
Abstracting Denotational Interpreters arXiv.cs.PL Pub Date : 2024-03-05 Sebastian Graf, Simon Peyton Jones, Sven Keidel
We explore denotational interpreters: denotational semantics that produce coinductive traces of a corresponding small-step operational semantics. By parameterising our denotational interpreter over the semantic domain and then varying it, we recover dynamic semantics with different evaluation strategies as well as summary-based static analyses such as type analysis, all from the same generic interpreter
-
Broadening the View of Live Programmers: Integrating a Cross-Cutting Perspective on Run-Time Behavior into a Live Programming Environment arXiv.cs.PL Pub Date : 2024-03-04 Patrick Rein (Hasso Plattner Institute - University of Potsdam, Germany), Christian Flach (Hasso Plattner Institute - University of Potsdam, Germany), Stefan Ramson (Hasso Plattner Institute - University of Potsdam, Germany), Eva Krebs (Hasso Plattner Institute - University of Potsdam, Germany), Robert Hirschfeld (Hasso Plattner Institute - University of Potsdam, Germany)
Live programming provides feedback on run-time behavior by visualizing concrete values of expressions close to the source code. When using such a local perspective on run-time behavior, programmers have to mentally reconstruct the control flow if they want to understand the relation between observed values. As this requires complete and correct knowledge of all relevant code, this reconstruction is
-
Dr Wenowdis: Specializing dynamic language C extensions using type information arXiv.cs.PL Pub Date : 2024-03-04 Maxwell Bernstein, CF Bolz-Tereick
C-based interpreters such as CPython make extensive use of C "extension" code, which is opaque to static analysis tools and faster runtimes with JIT compilers, such as PyPy. Not only are the extensions opaque, but the interface between the dynamic language types and the C types can introduce impedance. We hypothesise that frequent calls to C extension code introduce significant overhead that is often
-
Arrays in Practice: An Empirical Study of Array Access Patterns on the JVM arXiv.cs.PL Pub Date : 2024-03-04 Beatrice Åkerblom (Stockholm University, Sweden), Elias Castegren (Uppsala University, Sweden)
The array is a data structure used in a wide range of programs. Its compact storage and constant-time random access make it highly efficient, but arbitrary indexing complicates the analysis of code containing array accesses. Such analyses are important for compiler optimisations such as bounds check elimination. The aim of this work is to gain a better understanding of how arrays are used in real-world
-
Privacy-Respecting Type Error Telemetry at Scale arXiv.cs.PL Pub Date : 2024-03-04 Ben Greenman (Brown University, USA / University of Utah, USA), Alan Jeffrey (Roblox, USA), Shriram Krishnamurthi (Brown University, USA), Mitesh Shah (Roblox, USA)
Context: Roblox Studio lets millions of creators build interactive experiences by programming in a variant of Lua called Luau. The creators form a broad group, ranging from novices writing their first script to professional developers; thus, Luau must support a wide audience. As part of its efforts to support all kinds of programmers, Luau includes an optional, gradual type system and goes to great
-
Let a Thousand Flowers Bloom: An Algebraic Representation for Edge Graphs arXiv.cs.PL Pub Date : 2024-03-04 Jack Liell-Cock (University of Oxford, United Kingdom), Tom Schrijvers (KU Leuven, Belgium)
Context: Edge graphs are graphs whose edges are labelled with identifiers, and nodes can have multiple edges between them. They are used to model a wide range of systems, including networks with distances or degrees of connection and complex relational data. Inquiry: Unfortunately, the homogeneity of this graph structure prevents an effective representation in (functional) programs. Either their interface
-
Reactive Programming without Functions arXiv.cs.PL Pub Date : 2024-03-04 Bjarno Oeyen (Vrije Universiteit Brussel, Belgium), Joeri De Koster (Vrije Universiteit Brussel, Belgium), Wolfgang De Meuter (Vrije Universiteit Brussel, Belgium)
Context: Reactive programming (RP) is a declarative programming paradigm suitable for expressing the handling of events. It enables programmers to create applications that react automatically to changes over time. Whenever a time-varying signal changes -- for example, in response to values produced by an event stream (e.g., sensor data, user input) -- the program state is updated automatically in tandem with
-
Scheduling Garbage Collection for Energy Efficiency on Asymmetric Multicore Processors arXiv.cs.PL Pub Date : 2024-03-04 Marina Shimchenko (Uppsala University, Sweden), Erik Österlund (Oracle, Sweden), Tobias Wrigstad (Uppsala University, Sweden)
The growing concern for energy efficiency in the Information and Communication Technology (ICT) sector has prompted the exploration of resource management techniques. While hardware architectures, such as single-ISA asymmetric multicore processors (AMP), offer potential energy savings, there is still untapped potential for software optimizations. This paper aims to bridge this gap by investigating
-
Collective Allocator Abstraction to Control Object Spatial Locality in C++ arXiv.cs.PL Pub Date : 2024-03-04 Takato Hideshima (University of Tokyo, Japan), Shigeyuki Sato (University of Electro-Communications, Japan), Tomoharu Ugawa (University of Tokyo, Japan)
Disaggregated memory is promising for improving memory utilization in computer clusters in which memory demands significantly vary across computer nodes under utilization. It allows applications with high memory demands to use memory in other computer nodes. However, disaggregated memory is not easy to use for implementing data structures in C++ because the C++ standard does not provide an adequate
-
LiveRec: Prototyping Probes by Framing Debug Protocols arXiv.cs.PL Pub Date : 2024-03-04 Jean-Baptiste Döderlein (ENS Rennes, France), Riemer van Rozen (CWI, Netherlands), Tijs van der Storm (CWI, Netherlands / University of Groningen, Netherlands)
Context: In the first part of his 2012 presentation "Inventing on Principle", Bret Victor gives a demo of a live code editor for JavaScript which shows the dynamic history of values of variables in real time. This form of live programming has become known as "probes". Probes provide the programmer with permanent and continuous insight into the dynamic evolution of function or method variables, thus
-
Circular Programs and Self-Referential Structures arXiv.cs.PL Pub Date : 2024-03-04 Lloyd Allison
A circular program creates a data structure whose computation depends upon itself or refers to itself. The technique is used to implement the classic data structures circular and doubly-linked lists, threaded trees and queues, in a functional programming language. These structures are normally thought to require updatable variables found in imperative languages. For example, a functional program to
-
Making Hybrid Languages: A Recipe arXiv.cs.PL Pub Date : 2024-03-02 Leif Andersen, Cameron Moy, Stephen Chang, Matthias Felleisen
The dominant programming languages support only linear text to express ideas. Visual languages offer graphical representations for entire programs, when viewed with special tools. Hybrid languages, with support from existing tools, allow developers to express their ideas with a mix of textual and graphical syntax tailored to an application domain. This mix puts both kinds of syntax on equal footing
-
CatCode: A Comprehensive Evaluation Framework for LLMs On the Mixture of Code and Text arXiv.cs.PL Pub Date : 2024-03-04 Zhenru Lin, Yiqun Yao, Yang Yuan
Large language models (LLMs) such as ChatGPT are increasingly proficient in understanding and generating a mixture of code and text. Evaluation based on such a mixture can lead to a more comprehensive understanding of the models' abilities in solving coding problems. However, in this context, current evaluation methods are either limited in task coverage or lack standardization. To address
-
SoD$^2$: Statically Optimizing Dynamic Deep Neural Network arXiv.cs.PL Pub Date : 2024-02-29 Wei Niu, Gagan Agrawal, Bin Ren
Though many compilation and runtime systems have been developed for DNNs in recent years, the focus has largely been on static DNNs. Dynamic DNNs, where tensor shapes and sizes and even the set of operators used are dependent upon the input and/or execution, are becoming common. This paper presents SoD$^2$, a comprehensive framework for optimizing Dynamic DNNs. The basis of our approach is a classification
-
Data Transfer Optimizations for Host-CPU and Accelerators in AXI4MLIR arXiv.cs.PL Pub Date : 2024-02-29 Jude Haris, Nicolas Bohm Agostini, Antonino Tumeo, David Kaeli, José Cano
As custom hardware accelerators become more prevalent, it becomes increasingly important to automatically generate efficient host-driver code that can fully leverage the capabilities of these accelerators. This approach saves time and reduces the likelihood of errors that can occur during manual implementation. AXI4MLIR extends the MLIR compiler framework to generate host-driver code for custom accelerators
-
Algorithmically Expressive, Always-Terminating Model for Reversible Computation arXiv.cs.PL Pub Date : 2024-02-29 Matteo Palazzo (Università di Torino), Luca Roversi (Università di Torino)
Concerning classical computational models able to express all the Primitive Recursive Functions (PRF), there are interesting results regarding limits on their algorithmic expressiveness or, equivalently, efficiency, namely the ability to express algorithms with minimal computational cost. By introducing the reversible programming model Forest, we provide, to our knowledge, a first study of analogous
-
Verification of Neural Networks' Global Robustness arXiv.cs.PL Pub Date : 2024-02-29 Anan Kabaha, Dana Drachsler-Cohen
Neural networks are successful in various applications but are also susceptible to adversarial attacks. To show the safety of network classifiers, many verifiers have been introduced to reason about the local robustness of a given input to a given perturbation. While successful, local robustness cannot generalize to unseen inputs. Several works analyze global robustness properties, however, neither
-
Bluebell: An Alliance of Relational Lifting and Independence For Probabilistic Reasoning arXiv.cs.PL Pub Date : 2024-02-28 Jialu Bao, Emanuele D'Osualdo, Azadeh Farzan
We present Bluebell, a program logic for reasoning about probabilistic programs where unary and relational styles of reasoning come together to create new reasoning tools. Unary-style reasoning is very expressive and is powered by foundational mechanisms to reason about probabilistic behaviour like independence and conditioning. The relational style of reasoning, on the other hand, naturally shines
-
Rose: Efficient and Extensible Autodiff on the Web arXiv.cs.PL Pub Date : 2024-02-27 Sam Estep (Carnegie Mellon University), Raven Rothkopf (Barnard College, Columbia University), Wode Ni (Carnegie Mellon University), Joshua Sunshine (Carnegie Mellon University)
Automatic differentiation (AD) has become the backbone for a new wave of optimization-driven domains such as computer graphics and machine learning over the past decade. However, existing AD systems face limitations, either lacking support for in-browser development or failing to harness more recent, compiler-based approaches to achieve both expressiveness and size-preserving differentiation. This work
-
A Constraint-based Mathematical Modeling Library in Prolog with Answer Constraint Semantics arXiv.cs.PL Pub Date : 2024-02-27 François Fages (Lifeware)
Constraint logic programming emerged in the late 1980s as a highly declarative class of programming languages based on first-order logic and theories with decidable constraint languages, thereby subsuming Prolog restricted to equality constraints over the Herbrand term domain. This approach has proven extremely successful in solving combinatorial problems in industry, which quickly led to the
-
Synthesizing Tight Privacy and Accuracy Bounds via Weighted Model Counting arXiv.cs.PL Pub Date : 2024-02-26 Lisa Oakley, Steven Holtzen, Alina Oprea
Programmatically generating tight differential privacy (DP) bounds is a hard problem. Two core challenges are (1) finding expressive, compact, and efficient encodings of the distributions of DP algorithms, and (2) state space explosion stemming from the multiple quantifiers and relational properties of the DP definition. We address the first challenge by developing a method for tight privacy and accuracy
-
Less is More Revisit arXiv.cs.PL Pub Date : 2024-02-26 Nobuko Yoshida, Ping Hou
Multiparty session types (MPST) is a type discipline where a programmer or architect specifies a whole view of communications as a global protocol, and each distributed program is locally type-checked against its end-point projection. Ten years after the birth of MPST, Scalas and Yoshida discovered that the proofs of type safety in the literature which use the end-point projection with mergeability
-
Weak-linearity, globality and in-place update arXiv.cs.PL Pub Date : 2024-02-26 Hector Gramaglia
Computational interpretations of linear logic allow static control of memory resources: the data produced by the program are endowed through its type with attributes that determine its life cycle. This has promoted numerous investigations into safe introduction of in-place update. Various type systems have been proposed for this aim, but linearity and correctness of in-place update are properties that
-
Formally Verified C Code Generation from Hybrid Communicating Sequential Processes arXiv.cs.PL Pub Date : 2024-02-24 Shuling Wang, Zekun Ji, Bohua Zhan, Xiong Xu, Qiang Gao, Naijun Zhan
Hybrid Communicating Sequential Processes (HCSP) is a formal model for hybrid systems, including primitives for evolution along an ordinary differential equation (ODE), communication, and parallel composition. Code generation is needed to convert HCSP models into code that can be executed in practice, and the correctness of this conversion is essential to ensure that the generated code accurately reflects
-
Equational Bit-Vector Solving via Strong Gröbner Bases arXiv.cs.PL Pub Date : 2024-02-26 Jiaxin Song, Hongfei Fu, Charles Zhang
Bit-vectors, which are integers in a finite number of bits, are ubiquitous in software and hardware systems. In this work, we consider the satisfiability modulo theories (SMT) of bit-vectors. Unlike normal integers, the arithmetics of bit-vectors are modular upon integer overflow. Therefore, the SMT solving of bit-vectors needs to resolve the underlying modular arithmetics. In the literature, two prominent
-
How Do Humans Write Code? Large Models Do It the Same Way Too arXiv.cs.PL Pub Date : 2024-02-24 Long Li
Large Language Models (LLMs) often make errors when performing numerical calculations. In contrast to traditional chain-of-thought reasoning, the program-of-thoughts approach involves generating executable code to solve problems. By executing this code, it achieves more precise results. Using generated executable code instead of natural language can reduce computational errors. However, we observe
-
Getting into the Flow: Towards Better Type Error Messages for Constraint-Based Type Inference arXiv.cs.PL Pub Date : 2024-02-20 Ishan Bhanuka, Lionel Parreaux, David Binder, Jonathan Immanuel Brachthäuser
Creating good type error messages for constraint-based type inference systems is difficult. Typical type error messages reflect implementation details of the underlying constraint-solving algorithms rather than the specific factors leading to type mismatches. We propose using subtyping constraints that capture data flow to classify and explain type errors. Our algorithm explains type errors as faulty
-
LTL learning on GPUs arXiv.cs.PL Pub Date : 2024-02-19 Mojtaba Valizadeh, Nathanaël Fijalkow, Martin Berger
Linear temporal logic (LTL) is widely used in industrial verification. LTL formulae can be learned from traces. Scaling LTL formula learning is an open problem. We implement the first GPU-based LTL learner using a novel form of enumerative program synthesis. The learner is sound and complete. Our benchmarks indicate that it handles traces at least 2048 times more numerous, and on average at least 46
-
Weak-Linear Types arXiv.cs.PL Pub Date : 2024-02-19 Hector Gramaglia
Computational interpretations of linear logic allow static control of memory resources: the data produced by the program are endowed through its type with attributes that determine its life cycle, and guarantee safe deallocation. The use of linear types encounters limitations in practice, since linear data, in the traditional sense, do not so often appear in actual programs. Several alternatives have
-
A Cartesian Closed Category for Random Variables arXiv.cs.PL Pub Date : 2024-02-18 Pietro Di Gianantonio, Abbas Edalat
We present a novel, yet rather simple construction within the traditional framework of Scott domains to provide semantics to probabilistic programming, thus obtaining a solution to a long-standing open problem in this area. Unlike current main approaches that employ some probability measures or continuous valuations on non-standard or rather complex structures, we use the Scott domain of random variables
-
SPML: A DSL for Defending Language Models Against Prompt Attacks arXiv.cs.PL Pub Date : 2024-02-19 Reshabh K Sharma, Vinayak Gupta, Dan Grossman
Large language models (LLMs) have profoundly transformed natural language applications, with a growing reliance on instruction-based definitions for designing chatbots. However, post-deployment the chatbot definitions are fixed and are vulnerable to attacks by malicious users, emphasizing the need to prevent unethical applications and financial losses. Existing studies explore user prompts' impact
-
Theoretical foundations for programmatic reinforcement learning arXiv.cs.PL Pub Date : 2024-02-18 Guruprerana Shabadi, Nathanaël Fijalkow, Théo Matricon
The field of Reinforcement Learning (RL) is concerned with algorithms for learning optimal policies in unknown stochastic environments. Programmatic RL studies representations of policies as programs, that is, policies involving higher-order constructs such as control loops. Despite attracting a lot of attention at the intersection of the machine learning and formal methods communities, very little is known
-
The Vienna Architecture Description Language arXiv.cs.PL Pub Date : 2024-02-14 Simon Himmelbauer, Christoph Hochrainer, Benedikt Huber, Niklas Mischkulnig, Philipp Paulweber, Tobias Schwarzinger, Andreas Krall
The Vienna Architecture Description Language (VADL) is a powerful processor description language (PDL) that enables the concise formal specification of processor architectures. By utilizing a single VADL processor specification, the VADL system exhibits the capability to automatically generate a range of artifacts necessary for rapid design space exploration. These include assemblers, compilers, linkers
-
An Executable Specification of Oncology Dose-Escalation Protocols with Prolog arXiv.cs.PL Pub Date : 2024-02-13 David C. Norris, Markus Triska
We present, as a pure Prolog program, the first executable specification of the 3 + 3 dose-escalation protocol commonly used in early-phase oncology drug development. In this program, the imperative operations of the protocol emerge as consequences of clinically meaningful anticipatory-regret scenarios that are declared as CLP(Z) constraints. This 'regret-constrained' (RC) specification yields a robust
-
Three Subtyping Algorithms for Binary Session Types and their Complexity Analyses arXiv.cs.PL Pub Date : 2024-02-10 Thien Udomsrirungruang, Nobuko Yoshida
Session types are a type discipline for describing and specifying communication behaviours of concurrent processes. Session subtyping, first introduced by Gay and Hole, is widely used for enlarging typability of session programs. This paper gives the complexity analysis of three algorithms for subtyping of synchronous binary session types. First, we analyse the complexity of the algorithm from the