Keras2c: A library for converting Keras neural networks to real-time compatible C
Motivation
TensorFlow (Abadi et al., 2015) is one of the most popular libraries for developing and training neural networks. It contains a high-level Python API called Keras (Chollet et al., 2015) that has gained popularity due to its ease of use and rich feature set. An example of using Keras to build a simple neural net is shown in Listing 1. As the use of machine learning and neural networks grows in the field of diagnostic and control systems (Hunt et al., 1992; Jin et al., 2018; Liu et al., 2017), …
Method
Keras2c is built around the “layer” API of Keras, which treats each layer of a neural network as a function. Computing the forward pass through the network is then simply a matter of calling the layer functions in the correct order with the correct inputs. The process of converting a model with Keras2c is shown in Fig. 1. The functionality can be broken into four primary components: weight and parameter extraction, graph parsing, a small C backend, and automatic testing.
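The layer-as-function idea can be sketched as follows. This is an illustrative example only, not Keras2c's actual backend: the function names (`dense`, `forward`), the weight layout, and the fixed layer sizes are assumptions made for the sketch.

```c
#include <stddef.h>
#include <math.h>

/* Hypothetical dense layer: out = activation(W*in + b).
 * W is stored row-major as (n_in x n_out). Illustrative only. */
static void dense(float *out, const float *in, size_t n_in, size_t n_out,
                  const float *W, const float *b,
                  float (*activation)(float)) {
    for (size_t j = 0; j < n_out; ++j) {
        float acc = b[j];
        for (size_t i = 0; i < n_in; ++i)
            acc += W[i * n_out + j] * in[i];
        out[j] = activation(acc);
    }
}

static float relu(float x) { return x > 0.0f ? x : 0.0f; }
static float identity(float x) { return x; }

/* The forward pass is just the layer functions called in order,
 * with each layer's output feeding the next layer's input. */
void forward(float *out, const float *in,
             const float *W1, const float *b1,
             const float *W2, const float *b2,
             float *hidden) {
    dense(hidden, in, 2, 3, W1, b1, relu);      /* layer 1: 2 -> 3 */
    dense(out, hidden, 3, 1, W2, b2, identity); /* layer 2: 3 -> 1 */
}
```

Because each layer is an ordinary function, the generated network is a flat sequence of calls that mirrors the topologically sorted layer graph.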
Usage
An example of using Keras2c from within Python to convert a trained model is shown in Listing 4. Here my_model is the Keras model to be converted (or a path to a model saved on disk in HDF5 format) and “my_converted_model” is the name that will be used for the generated C function and source files.
The command shown will generate three files: my_converted_model.c containing the main neural net function, my_converted_model.h containing the declarations needed to call the neural net function from other code, …
Benchmarks
Though the current C backend is not designed explicitly for speed, Keras2c has been benchmarked against Python Keras/TensorFlow for single-CPU performance. The generated code is significantly faster for small-to-medium-sized models while remaining competitive with other methods of implementing neural networks in C, such as FANN and TensorFlow Lite. Results for several generic network types are shown in Fig. 2. They show that for fully connected layers, 1D convolutions, …
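A minimal single-CPU timing harness of the kind such a comparison needs might look like the following. This is a sketch, not the paper's actual benchmark code; the `net` stub stands in for a generated network function, and POSIX `clock_gettime` is assumed to be available.

```c
#define _POSIX_C_SOURCE 199309L
#include <time.h>

/* Stand-in for a generated network function; in a real benchmark,
 * replace this with the call to the converted model. */
static void net(const float *in, float *out) { out[0] = in[0] * 2.0f; }

/* Average wall-clock seconds per forward pass over n_calls calls,
 * using a monotonic clock so system time changes cannot skew results. */
double benchmark_seconds_per_call(int n_calls) {
    float in[1] = {1.0f}, out[1];
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < n_calls; ++i)
        net(in, out);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return ((t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9)
           / n_calls;
}
```

Averaging over many calls amortizes timer resolution, which matters because a small model's forward pass can complete in microseconds.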
Real-time applications
While not designed explicitly for speed, Keras2c was designed with real-time applications in mind, so efforts were made to ensure the generated code is deterministic and thread-safe: all non-deterministic behavior such as system calls is segregated into dedicated initialization and cleanup routines, and the core of the neural net function is fully deterministic. All functions are re-entrant, and all mutable data is explicitly passed into and out of each function, allowing multiple calls to the neural net function …
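The re-entrancy pattern described above can be sketched like this. The names (`net_workspace`, `net_forward`) and the placeholder computation are illustrative assumptions, not Keras2c's API; the point is the structure: no static state, no allocation, no system calls in the core routine, with all scratch memory owned by the caller.

```c
#include <stddef.h>

/* Caller-owned scratch memory: one workspace per thread or per
 * network instance makes concurrent calls safe. */
typedef struct {
    float hidden[8];
} net_workspace;

/* Re-entrant, deterministic core: touches only memory passed in. */
void net_forward(float *out, const float *in, size_t n,
                 net_workspace *ws) {
    out[0] = 0.0f;
    for (size_t i = 0; i < n && i < 8; ++i) {
        ws->hidden[i] = in[i] * 0.5f;  /* placeholder layer */
        out[0] += ws->hidden[i];       /* placeholder reduction */
    }
}
```

Because every byte the function mutates is reached through its arguments, two threads calling it with separate workspaces and buffers cannot interfere, and the same inputs always yield the same outputs.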
Discussion
As discussed in Section 4, the backend code as currently implemented is not optimized for execution speed, especially for large models. Convolutions are currently implemented as a direct calculation, as opposed to the “im2col + GEMM” approach more commonly used in deep learning libraries (Chetlur et al., 2014). Planned future work includes implementing this and other modifications to improve calculation speed and better support the larger model sizes becoming common in state-of-the-art …
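For concreteness, a direct 1D convolution of the kind described (as opposed to im2col + GEMM) looks like the nested loop below. This is a simplified sketch, single channel, stride 1, valid padding, not Keras2c's actual implementation.

```c
#include <stddef.h>

/* Direct ("naive") valid-mode 1D convolution: each output element is
 * an explicit dot product of the kernel with a window of the input.
 * Output length is n_in - k + 1. */
void conv1d_direct(float *out, const float *in, size_t n_in,
                   const float *kernel, size_t k) {
    for (size_t o = 0; o + k <= n_in; ++o) {
        float acc = 0.0f;
        for (size_t j = 0; j < k; ++j)
            acc += in[o + j] * kernel[j];
        out[o] = acc;
    }
}
```

The im2col + GEMM alternative trades memory for speed: it copies each input window into a row of a matrix so the whole convolution becomes one matrix multiply, which highly tuned BLAS routines execute far faster than this direct loop for large channel counts.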
Conclusion
Keras2c allows straightforward conversion of Keras neural networks to pure C code, in a form that can be easily deployed to real-time control systems, or anywhere a C executable can run. By relying only on C standard library functions, it avoids the complicated dependencies that can make deploying a program a challenge. The generated code is designed to be human readable and editable, to allow custom modifications and extensions if necessary. Despite not making …
CRediT authorship contribution statement
Rory Conlin: Conceptualization, Methodology, Software, Validation, Writing - original draft, Visualization, Investigation. Keith Erickson: Methodology, Software, Validation, Writing - review & editing. Joseph Abbate: Conceptualization, Methodology, Validation, Investigation. Egemen Kolemen: Conceptualization, Supervision, Project administration.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
The authors thank Mitchell Clement, Ge Dong, and Mark D. Boyer for their help in beta testing and bug fixing. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, using the DIII-D National Fusion Facility, a DOE Office of Science user facility, under Awards DE-FC02-04ER54698, DE-SC0015878, DE-AR0001166, and Field Work Proposal No. 1903.
References
- et al. (2018). State-of-the-art in artificial neural network applications: A survey. Heliyon.
- Hunt et al. (1992). Neural networks for control systems—a survey. Automatica.
- Jin et al. (2018). Robot manipulator control using neural networks: A survey. Neurocomputing.
- Liu et al. (2017). A survey of deep neural network architectures and their applications. Neurocomputing.
- et al. (2009). Extending the capabilities of the DIII-D plasma control system for worldwide fusion research collaborations. Fusion Eng. Des.
- Abadi et al. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems.
- et al. (2021). Fully data-driven profile prediction for DIII-D. Nucl. Fusion.
- (2020). Amazon SageMaker.
- et al. (2019). ONNX: Open neural network exchange.
- et al. (2011). Cython: The best of both worlds. Comput. Sci. Eng.
- Chetlur et al. (2014). CuDNN: Efficient primitives for deep learning.
- Chollet et al. (2015). Keras.
- Mlpack 3: A fast, flexible machine learning library. J. Open Source Softw.
- Flexible software architecture for tokamak discharge control systems.
- Machine learning control for disruption and tearing mode avoidance. Phys. Plasmas.