Keras2c: A library for converting Keras neural networks to real-time compatible C

https://doi.org/10.1016/j.engappai.2021.104182

Abstract

With the growth of machine learning models and neural networks in measurement and control systems comes the need to deploy these models in a way that is compatible with existing systems. Existing options for deploying neural networks either introduce very high latency, require expensive and time-consuming work to integrate into existing code bases, or support only a very limited subset of model types. We have therefore developed a new method called Keras2c, a simple library for converting Keras/TensorFlow neural network models into real-time compatible C code. It supports a wide range of Keras layers and model types, including multidimensional convolutions, recurrent layers, multi-input/output models, and shared layers. Keras2c re-implements the core components of Keras/TensorFlow required for predictive forward passes through neural networks in pure C, relying only on standard library functions considered safe for real-time use. The core functionality consists of 1500 lines of code, making it lightweight and easy to integrate into existing codebases. Keras2c has been successfully tested in experiments and is currently in use on the plasma control system at the DIII-D National Fusion Facility at General Atomics in San Diego.


Motivation

TensorFlow (Abadi et al., 2015) is one of the most popular libraries for developing and training neural networks. It contains a high-level Python API called Keras (Chollet et al., 2015) that has gained popularity due to its ease of use and rich feature set. An example of using Keras to make a simple neural net is shown in Listing 1. As the use of machine learning and neural networks grows in the field of diagnostic and control systems (Hunt et al., 1992, Jin et al., 2018, Liu et al., 2017,

Method

Keras2c is based around the “layer” API of Keras, which treats each layer of a neural network as a function. This makes calculating the forward pass through the network a simple matter of calling the functions in the correct order with the correct inputs. The process of converting a model using Keras2c is shown in Fig. 1. The functionality can be broken into four primary components: weight and parameter extraction, graph parsing, a small C backend, and automatic testing.
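The layer-as-function idea described above can be sketched in plain C. This is an illustrative toy, not the actual Keras2c backend: the `k2c_`-prefixed helper names and the hard-coded weights are assumptions for the example, but the structure (each layer is a pure function, and the forward pass is the calls in graph order) mirrors the approach the text describes.

```c
#include <stddef.h>
#include <assert.h>

/* A dense (fully connected) layer: out[j] = sum_i in[i]*W[i*n_out+j] + b[j] */
static void k2c_dense(float *out, const float *in, size_t n_in,
                      const float *W, const float *b, size_t n_out) {
    for (size_t j = 0; j < n_out; ++j) {
        float acc = b[j];
        for (size_t i = 0; i < n_in; ++i)
            acc += in[i] * W[i * n_out + j];
        out[j] = acc;
    }
}

/* In-place ReLU activation. */
static void k2c_relu(float *x, size_t n) {
    for (size_t i = 0; i < n; ++i)
        if (x[i] < 0.0f) x[i] = 0.0f;
}

/* A "generated" net function: extracted weights baked in as constants,
   layers called in topological order. */
void tiny_net(const float *input, float *output) {
    static const float W1[2 * 2] = {1.0f, 0.0f, 0.0f, 1.0f};
    static const float b1[2]     = {-1.0f, 0.5f};
    static const float W2[2 * 1] = {1.0f, 1.0f};
    static const float b2[1]     = {0.0f};
    float hidden[2];
    k2c_dense(hidden, input, 2, W1, b1, 2); /* layer 1 */
    k2c_relu(hidden, 2);                    /* activation */
    k2c_dense(output, hidden, 2, W2, b2, 1); /* layer 2 */
}
```

Because the weights are extracted at conversion time and compiled in as constants, the generated function needs no file I/O or dynamic allocation at call time.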

Usage

An example of using Keras2c from within Python to convert a trained model is shown below in Listing 4. Here my_model is the Keras model to be converted (or a path to a saved model on disk in HDF5 format) and “my_converted_model” is the name that will be used for the generated C function and source files.

The command shown will generate three files: my_converted_model.c containing the main neural net function, my_converted_model.h containing the necessary declarations for including the neural
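From the C side, deployment then amounts to compiling my_converted_model.c into the target program and calling the generated function. The snippet below sketches that pattern; the plain-array signature and placeholder body are stand-ins for illustration only (the real argument types are declared in the generated my_converted_model.h, which this example does not reproduce).

```c
#include <assert.h>

/* Stand-in for the function that my_converted_model.c would define.
   In real use this definition comes from linking the generated source;
   the placeholder body here just doubles its input. */
void my_converted_model(const float *input, float *output) {
    output[0] = 2.0f * input[0];
}

/* Caller-side pattern: allocate input/output buffers, invoke the net. */
void run_inference(const float *in, float *out) {
    my_converted_model(in, out);
}
```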

Benchmarks

Though the current C backend is not designed explicitly for speed, Keras2c has been benchmarked against Python Keras/TensorFlow for single-CPU performance, and the generated code has been shown to be significantly faster for small- to medium-sized models while being competitive against other methods of implementing neural networks in C such as FANN and TensorFlow Lite. Results for several generic network types are shown in Fig. 2. They show that for fully connected, 1-dimensional convolutions,

Real-time applications

While not designed explicitly for speed, Keras2c was designed with real-time applications in mind, so efforts were made to ensure the generated code is deterministic and thread-safe, segmenting all non-deterministic behavior such as system calls to dedicated initialization and cleanup routines. The core of the neural net function is fully deterministic. All functions are re-entrant, and all mutable data is explicitly passed into and out of each function, allowing multiple calls to the neural
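The re-entrancy property described above can be illustrated with a toy stateful layer. This is an assumed example, not Keras2c's actual recurrent-layer code: the point is the pattern in which all mutable state is owned by the caller and passed in explicitly, so no function keeps internal static data and concurrent calls with separate state buffers cannot interfere.

```c
#include <stddef.h>
#include <assert.h>

/* A toy recurrent step: h <- 0.5*h + x.
   h_state is caller-owned; the function holds no hidden mutable state,
   so it is re-entrant and safe to call from multiple threads as long as
   each caller supplies its own h_state buffer. */
void toy_rnn_step(float *h_state, const float *x, size_t n) {
    for (size_t i = 0; i < n; ++i)
        h_state[i] = 0.5f * h_state[i] + x[i];
}
```

A non-re-entrant version would instead keep `h_state` in a `static` array inside the function, which would make concurrent or interleaved calls silently corrupt each other's state.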

Discussion

As discussed in Section 4, the backend code as currently implemented is not optimized for execution speed, especially for large models. Convolutions are currently implemented using a direct calculation, as opposed to the “im2col + GEMM” approach more commonly used in deep learning libraries (Chetlur et al., 2014). Planned future work will include implementing this and other modifications to improve calculation speed and better support the larger model sizes becoming common in state of the art
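The “direct calculation” mentioned above can be sketched for the simplest case, a 1-D valid convolution (cross-correlation, following Keras's convention): each output element is computed as an explicit dot product with the kernel, with no intermediate im2col buffer. The function name and single-channel simplification are assumptions for illustration; Keras2c's real convolution routines handle channels, strides, and padding.

```c
#include <stddef.h>
#include <assert.h>

/* Direct 1-D valid convolution: out has n_in - k + 1 elements.
   The im2col + GEMM alternative would first copy each length-k window
   into a row of a matrix, then compute all outputs as one matrix
   multiply, trading memory for better cache use and BLAS throughput. */
void conv1d_direct(float *out, const float *in, size_t n_in,
                   const float *kernel, size_t k) {
    for (size_t o = 0; o + k <= n_in; ++o) {
        float acc = 0.0f;
        for (size_t j = 0; j < k; ++j)
            acc += in[o + j] * kernel[j];
        out[o] = acc;
    }
}
```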

Conclusion

Keras2c allows the straightforward conversion of Keras neural networks to pure C code, in a form that can be easily deployed to real-time control systems, or anywhere that a C executable can be run. By relying only on C standard library functions, it avoids the complicated dependencies that can make deploying a program a challenge. The generated code is designed to be human readable and editable, to allow for custom modifications and extensions if necessary. Despite not making

CRediT authorship contribution statement

Rory Conlin: Conceptualization, Methodology, Software, Validation, Writing - original draft, Visualization, Investigation. Keith Erickson: Methodology, Software, Validation, Writing - review & editing. Joseph Abbate: Conceptualization, Methodology, Validation, Investigation. Egemen Kolemen: Conceptualization, Supervision, Project administration.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors thank Mitchell Clement, Ge Dong, and Mark D. Boyer for their help in beta testing and bug fixing. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility, under Awards DE-FC02-04ER54698, DE-SC0015878, DE-AR0001166, and Field Work Proposal No. 1903.

References (30)

  • Chetlur, S., et al. cuDNN: Efficient primitives for deep learning (2014)
  • Chollet, F. Keras (2015)
  • Curtin, R.R., et al. Mlpack 3: a fast, flexible machine learning library. J. Open Source Softw. (2018)
  • Ferron, J.R., et al. Flexible software architecture for tokamak discharge control systems
  • Fu, Y., et al. Machine learning control for disruption and tearing mode avoidance. Phys. Plasmas (2020)