Reproducible execution of POSIX programs with DiOS

  • Special Section Paper
  • Software and Systems Modeling

Abstract

In this paper, we describe DiOS, a lightweight model operating system, which can be used to execute programs that make use of POSIX APIs. Such executions are fully reproducible: running the same program with the same inputs twice will result in two exactly identical instruction traces, even if the program uses threads for parallelism. DiOS is implemented almost entirely in portable C and C++: although its primary platform is DiVM, a verification-oriented virtual machine, it can be configured to also run in KLEE, a symbolic executor. Finally, it can be compiled into machine code to serve as a user-mode kernel. Additionally, DiOS is modular and extensible. Its components can be combined both to match the capabilities of the underlying platform and to provide the services required by a particular program. Components can be added to cover additional system calls or APIs, or removed to reduce overhead. The experimental evaluation has three parts. DiOS is first evaluated as a component of a program verification platform based on DiVM. In the second part, we consider its portability and modularity by combining it with the symbolic executor KLEE. Finally, we consider its use as a standalone user-mode kernel.
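
The reproducibility claim can be illustrated with a minimal sketch. It is not DiOS code: the names `Step`, `make_thread`, `run` and `example_run` are hypothetical, and the fixed round-robin policy is an assumption chosen only to keep the example short. The point is that when every scheduling decision depends only on program state, two runs on the same inputs produce identical traces.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// A "thread" is modelled as a function that performs one step, records it
// in the trace, and reports whether it still has work left.
// (Hypothetical model for illustration, not a DiOS interface.)
using Step = std::function<bool(std::vector<std::string>&)>;

// Build a toy thread that appends "name:i" to the trace for i in [0, steps).
Step make_thread(std::string name, int steps) {
    assert(steps >= 0);
    int next = 0;
    return [name, steps, next](std::vector<std::string>& trace) mutable {
        if (next >= steps) return false;
        trace.push_back(name + ":" + std::to_string(next++));
        return next < steps;
    };
}

// Run all threads under a fixed round-robin policy. No decision depends on
// wall-clock time or on the host scheduler, so the trace is a pure function
// of the inputs.
std::vector<std::string> run(std::vector<Step> threads) {
    std::vector<std::string> trace;
    std::vector<bool> alive(threads.size(), true);
    bool progress = true;
    while (progress) {
        progress = false;
        for (std::size_t i = 0; i < threads.size(); ++i) {
            if (!alive[i]) continue;
            alive[i] = threads[i](trace);
            progress = true;
        }
    }
    return trace;
}

// Two toy threads: "a" takes two steps, "b" takes three.
std::vector<std::string> example_run() {
    return run({make_thread("a", 2), make_thread("b", 3)});
}
```

Calling `example_run()` twice and comparing the results with `==` yields equal traces. A verification tool would typically explore many interleavings rather than a single fixed one, but each explored interleaving can be replayed in the same way.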

Notes

  1. One exception to this rule is the Go programming language, which at least in some configurations bypasses the C interface and interacts directly with the operating system kernel using system-specific conventions.

  2. https://divine.fi.muni.cz/2020/dios/.

  3. Ideally, since reproducibility is the main motivation for DiOS, a given program on given inputs would lead to an identical state space across all supported platforms. Considering the difficulty of the problem, this was not a major priority. Instead, we aim for a less ambitious variant of this goal, that is, the program exhibits the same higher-level semantics. In particular, considering any pair of platforms, the program contains the same set of errors (as long as all errors in the program fall into the intersection of error classes detected on both). Alas, it is much harder to demonstrate this modified thesis rigorously, though our evaluation does support it.

  4. Single instruction, multiple data.

  5. Non-determinism is required to implement two important classes of features: scheduling (threads, processes) and fault injection (which is, however, always optional). Additionally, a number of modules can transparently pass through indeterminate values (e.g. in the context of symbolic model checking), but do not directly make any use of non-determinism themselves.

  6. If execution A creates a file and leaves it around, execution B might deviate from its expected course when it tries to create the same file, or might detect its presence and behave differently.

  7. Here, we primarily refer to the C language, as opposed to the standard C library. In most use-cases, the C library is, in fact, provided by DiOS and for this reason, DiOS itself cannot rely on library facilities like setjmp. An exception to this rule is the ‘native’ port of DiOS, which uses a small set of functions from the host C library (see also Sect. 4.3).

  8. The main exception is KLEE, where the execution stack is completely inaccessible to the program under test and only the virtual machine can access the information stored in it. See also Sect. 4.2.

  9. A version of KLEE with fixes for those problems is available online, along with other supplementary material, from https://divine.fi.muni.cz/2020/dios/.

  10. In DIVINE [1], a model checker based on DiVM, interrupt points are dynamically enabled when the executing thread performs a visible action. Thread identification is supplied by the scheduler in DiOS using a platform-specific (hypercall) interface.

  11. The basis of this claim is largely empirical, based on years of experience in writing, reading and verifying multi-threaded code. The stubbed-out functions are designed for highly specialized scenarios; all the functions intended for general use are implemented. An example of a lesser-used function would be pthread_barrierattr_getpshared, used to obtain the value of the process-shared attribute of its argument.

  12. For instance, on contemporary x86-64 processors, this interface is available via the syscall and sysret instructions.

  13. The list of system calls is only fixed relative to the host operating system. To allow the system call proxy component to function properly, the list needs to match what is available on the host. For instance, creat, uname or fdatasync are system calls on Linux but standard libc functions on OpenBSD.

  14. Of the 16,000 lines of code in libc++, the version in DiOS differs from upstream in 29 lines (19 lines added, 10 removed). In the case of libc++abi, 7 preprocessor directives have been added to make the library build in C++17 mode.

  15. Unless DiOS is configured for system call proxying, in which case it has no control over the outcomes of interactions with the host operating system. However, in this case, those interactions are recorded, and the recorded execution can then be replayed at will.

  16. In a preemptive system, the executing thread does not need to perform any special action to be interrupted and removed from the processor (i.e. preempted). Systems based on this approach are more robust against misbehaving threads, at the expense of reduced efficiency and less intuitive behaviour.

  17. LART is a comprehensive tool for transforming and instrumenting LLVM bitcode and is described in more detail in [20]. Appropriate calls to LART are performed automatically by compilation scripts included with DiOS.

  18. There are typically more opportunities for state space reductions in inter-process concurrency (when compared to thread-based concurrency) due to less shared state, but this is currently outside the scope of DiOS.

  19. Specifically, hash tables and binary search trees that use pointers as keys are vulnerable to this problem.

  20. Of course, the effects of multiple concurrent calls to write may be ordered arbitrarily, and the scheduler will in fact ensure that all possible orderings of writes are explored.

  21. The existence and semantics of /dev/null, along with /dev/zero, are mandated by POSIX, but currently not available on DiOS. This is expected to be fixed in a future revision.

  22. Not all aspects of the ABI are relevant at the bitcode level. For example, the function calling convention used by a given platform is specified in terms of low-level, architecture-specific notions, like the names of CPU registers or the minutiae of stack management. These do not affect DiOS directly, and we simply rely on the LLVM native code generator to deal with this part of the ABI correctly.

  23. Transparent (or non-opaque) types in the sense that user programs are allowed to directly access their fields by name or via macro expansion. In those cases, the compiler computes field offsets into the struct at compile time and hard-codes the results into the generated bitcode or machine code.

  24. All test programs are available online at http://divine.fi.muni.cz/2020/dios/, including scripts to reproduce the results reported in this and in the following sections.

  25. Each test program contains the list of its assigned tags near the top (first or second line) embedded in a comment in a machine-readable format. Names of all parent directories in which the test cases are stored are appended to this list. Please note that the tags are assigned and reviewed mostly manually, and hence it is possible that minor inaccuracies have crept in.
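
Note 19 can be illustrated with a short sketch. The assumption here is that the problem in question is address-dependent behaviour: the numeric values of pointers may differ between executions or platforms, so any container ordered or hashed by pointer value may be traversed in a different order each time. The type `Node` and both functions are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Node {
    int id;            // stable identifier assigned by the program
    std::string name;
};

// Index keyed by address: iteration order follows the numeric values of
// the pointers, which the program does not control.
std::vector<std::string> names_by_address(const std::vector<Node*>& nodes) {
    std::map<const Node*, std::string> index;
    for (const Node* n : nodes) {
        assert(n != nullptr);
        index[n] = n->name;
    }
    std::vector<std::string> out;
    for (const auto& entry : index) out.push_back(entry.second);
    return out;
}

// Index keyed by a stable identifier: iteration order depends only on
// values chosen by the program, so it is the same in every execution.
std::vector<std::string> names_by_id(const std::vector<Node*>& nodes) {
    std::map<int, std::string> index;
    for (const Node* n : nodes) {
        assert(n != nullptr);
        index[n->id] = n->name;
    }
    std::vector<std::string> out;
    for (const auto& entry : index) out.push_back(entry.second);
    return out;
}
```

With nodes allocated independently on the heap, the result of `names_by_address` depends on where the allocator happened to place each node, while `names_by_id` returns the names sorted by `id` regardless of placement.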

References

  1. Baranová, Z., Barnat, J., Kejstová, K., Kučera, T., Lauko, H., Mrázek, J., Ročkai, P., Štill, V.: Model checking of C and C++ with DIVINE 4. In: ATVA 2017, Volume 10482 of LNCS, pp. 201–207. Springer. https://divine.fi.muni.cz/2017/divine4 (2017)

  2. Beyer, D.: Reliable and reproducible competition results with BenchExec and witnesses (report on SV-COMP 2016). In: TACAS, pp. 887–904. Springer (2016). https://doi.org/10.1007/978-3-662-49674-9_55

  3. Cadar, C., Dunbar, D., Engler, D.R.: KLEE: Unassisted and automatic generation of high-coverage tests for complex systems programs. In: OSDI, pp. 209–224. USENIX Association (2008)

  4. Chipounov, V., Kuznetsov, V., Candea, G.: The S2E platform: design, implementation, and applications. ACM Trans. Comput. Syst. 30(1), 1–49 (2012). https://doi.org/10.1145/2110356.2110358

  5. Chirigati, F., Shasha, D., Freire, J.: Reprozip: using provenance to support computational reproducibility. In: Theory and Practice of Provenance, pp. 1:1–1:4. USENIX Association, Berkeley (2013). http://dl.acm.org/citation.cfm?id=2482949.2482951

  6. Daum, M., Schirmer, N., Schmidt, M.: From operating-system correctness to pervasively verified applications. In: Integrated Formal Methods, vol. 10, pp. 105–120 (2010). https://doi.org/10.1007/978-3-642-16265-7_9

  7. Frew, J., Metzger, D., Slaughter, P.: Automatic capture and reconstruction of computational provenance. Concurr. Comput. Pract. Exp. 20(5), 485–496 (2008). https://doi.org/10.1002/cpe.v20:5

  8. Goel, S., Hunt, W.A., Kaufmann, M., Ghosh, S.: Simulation and formal verification of x86 machine-code programs that make system calls. In: 2014 Formal Methods in Computer-Aided Design (FMCAD), pp. 91–98 (2014)

  9. Inverso, O., Nguyen, T.L., Fischer, B., Torre, S.L., Parlato, G.: Lazy-CSeq: a context-bounded model checking tool for multi-threaded C-programs. In: Automated Software Engineering, pp. 807–812 (2015). https://doi.org/10.1109/ASE.2015.108

  10. Joshi, S., Orso, A.: SCARPE: a technique and tool for selective capture and replay of program executions. In: International Conference on Software Maintenance, pp. 234–243 (2007). ISBN 978-1-4244-1256-3, https://doi.org/10.1109/ICSM.2007.4362636

  11. Kejstová, K.: Model Checking with System Call Traces. Master’s Thesis, Masarykova univerzita, Fakulta informatiky, Brno (2019). http://is.muni.cz/th/tukvk/

  12. Kejstová, K., Ročkai, P., Barnat, J.: From model checking to runtime verification and back. In: Runtime Verification, Volume 10548 of LNCS, pp. 225–240. Springer (2017). https://doi.org/10.1007/978-3-319-67531-2_14

  13. Kong, S., Tillmann, N., de Halleux, J.: Automated testing of environment-dependent programs: a case study of modeling the file system for Pex. In: International Conference on Information Technology: New Generations, pp. 758–762 (2009). https://doi.org/10.1109/ITNG.2009.80

  14. Krekel, H., Oliveira, B., Pfannschmidt, R., Bruynooghe, F., Laugher, B., Bruhin, F.: pytest 4.5 (2004). https://github.com/pytest-dev/pytest

  15. Lauko, H., Štill, V., Ročkai, P., Barnat, J.: Extending DIVINE with symbolic verification using SMT. In: TACAS, pp. 204–208. Springer, Cham (2019)

  16. Leungwattanakit, W., Artho, C., Hagiya, M., Tanabe, Y., Yamamoto, M., Takahashi, K.: Modular software model checking for distributed systems. IEEE Trans. Softw. Eng. 40, 483–501 (2014). https://doi.org/10.1109/TSE.2013.49

  17. Mackinnon, T., Freeman, S., Craig, P.: Extreme programming examined. In: Chapter Endo-Testing: Unit Testing with Mock Objects, pp. 287–301. Addison-Wesley Longman Publishing Co., Inc., Boston (2001). ISBN 0-201-71040-4. http://dl.acm.org/citation.cfm?id=377517.377534

  18. Mostafa, S., Wang, X.: An empirical study on the usage of mocking frameworks in software testing. In: International Conference on Quality Software, pp. 127–132 (2014). https://doi.org/10.1109/QSIC.2014.19

  19. Musuvathi, M., Qadeer, S., Ball, T., Basler, G., Nainar, P.A., Neamtiu, I.: Finding and reproducing Heisenbugs in concurrent programs. In: Symposium on Operating Systems Design and Implementation. USENIX (2008)

  20. Ročkai, P., Štill, V., Černá, I., Barnat, J.: DiVM: model checking with LLVM and graph memory. J. Syst. Softw. 143, 1–13 (2018). https://doi.org/10.1016/j.jss.2018.04.026

  21. Štill, V., Ročkai, P., Barnat, J.: Using off-the-shelf exception support components in C++ verification. In: Software Quality, Reliability and Security, pp. 54–64. IEEE (2017). https://doi.org/10.1109/QRS.2017.15

  22. Wachter, B., Kroening, D., Ouaknine, J.: Verifying multi-threaded software with Impact. In: Formal Methods in Computer-Aided Design, pp. 210–217. IEEE (2013). https://doi.org/10.1109/FMCAD.2013.6679412

  23. Yang, Y., Chen, X., Gopalakrishnan, G.: A Runtime Model Checker for Multithreaded C Programs. Technical Report, Inspect (2008)

Author information

Corresponding author

Correspondence to Petr Ročkai.

Additional information

Communicated by Gwen Salaün and Peter Csaba Ölveczky.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work has been partially supported by the Czech Science Foundation grant No. 18-02177S and by Red Hat, Inc.

About this article

Cite this article

Ročkai, P., Baranová, Z., Mrázek, J. et al. Reproducible execution of POSIX programs with DiOS. Softw Syst Model 20, 363–382 (2021). https://doi.org/10.1007/s10270-020-00837-y

  • DOI: https://doi.org/10.1007/s10270-020-00837-y
