Accelerating DES and AES Algorithms for a Heterogeneous Many-core Processor

International Journal of Parallel Programming

Abstract

Data security is central to information security, and file encryption is a primary means of ensuring it. Encryption algorithms that implement the Data Encryption Standard (DES) and the Advanced Encryption Standard (AES) are widely used in a variety of systems. These algorithms are computationally intensive, so encrypting or decrypting large files can be slow. To address this, we propose an optimized algorithm that efficiently encrypts and decrypts large files by parallelizing the processing tasks on a single heterogeneous many-core processor of the Sunway TaihuLight computer system. First, we port the serial DES and AES programs to our experimental platform. Then we implement a task assignment strategy to test the ported algorithms. Finally, to optimize the parallelized algorithms and improve data transmission performance, we apply master-slave communication optimization, a three-stage parallel pipeline, and vectorization. Extensive experiments demonstrate that our optimized algorithm is faster than state-of-the-art open-source implementations of DES and AES. Compared with the serial algorithms, our parallelized DES and AES run nearly 40 times and 72 times faster, respectively. This work builds on existing methods and provides a sound basis for future research in data encryption.
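The task assignment step mentioned in the abstract can be pictured with a minimal sketch. It is not the authors' implementation: it assumes the file is processed as independent fixed-size blocks, uses a thread pool as a stand-in for the processor's slave cores, and replaces the cipher with a toy XOR transform. The names `toy_block_transform`, `split_ranges`, `process_range`, and `parallel_transform`, and the default of 64 workers, are hypothetical choices made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 16  # AES block size in bytes (DES uses 8-byte blocks)


def toy_block_transform(block):
    """Stand-in for a real block cipher. XOR with 0x5A is NOT encryption."""
    return bytes(b ^ 0x5A for b in block)


def split_ranges(n_blocks, n_workers):
    """Assign contiguous block ranges [start, end) to workers as evenly as possible."""
    base, extra = divmod(n_blocks, n_workers)
    ranges = []
    start = 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional block each.
        count = base + (1 if worker < extra else 0)
        ranges.append((start, start + count))
        start += count
    return ranges


def process_range(data, start, end):
    """Transform blocks start..end-1 of `data`; each block is independent."""
    out = bytearray()
    for i in range(start, end):
        out += toy_block_transform(data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
    return bytes(out)


def parallel_transform(data, n_workers=64):
    """Split `data` into per-worker block ranges and process them concurrently."""
    if len(data) % BLOCK_SIZE != 0:
        raise ValueError("data length must be a multiple of BLOCK_SIZE")
    ranges = split_ranges(len(data) // BLOCK_SIZE, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in submission order, so the output stays ordered.
        parts = pool.map(lambda r: process_range(data, r[0], r[1]), ranges)
        return b"".join(parts)
```

Because each block is transformed independently, the ranges can be handed to workers in any order and concatenated afterwards. Chained modes of operation would not allow this kind of split, which is why the sketch assumes independent blocks.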



Acknowledgements

This work is supported by the National Key Research and Development Program of China under Grant No. 2017YFB0202105.

Author information

Corresponding author

Correspondence to Yongquan Yang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Xing, B., Wang, D., Yang, Y. et al. Accelerating DES and AES Algorithms for a Heterogeneous Many-core Processor. Int J Parallel Prog 49, 463–486 (2021). https://doi.org/10.1007/s10766-021-00692-4

  • DOI: https://doi.org/10.1007/s10766-021-00692-4
