PipeArch: Generic and Context-Switch Capable Data Processing on FPGAs

Published: 05 November 2020

Abstract

Data processing systems based on FPGAs offer high performance and energy efficiency for a variety of applications. However, these advantages are achieved through highly specialized designs. The high degree of specialization leads to accelerators with narrow functionality and designs adhering to a rigid execution flow. For multi-tenant systems, this limits the applicability of FPGA-based accelerators: first, supporting a single operation is unlikely to have any significant impact on the overall performance of the system, and, second, serving multiple users satisfactorily is difficult due to the simplistic scheduling policies enforced when using the accelerator. Standard operating system and database management system features that would help address these limitations, such as context-switching, preemptive scheduling, and thread migration, are practically non-existent in current FPGA accelerator efforts.

In this work, we propose PipeArch, an open-source project for developing FPGA-based accelerators that combine the high efficiency of specialized hardware designs with the generality and functionality known from conventional CPU threads. PipeArch provides programmability and extensibility in the accelerator without losing the advantages of SIMD-parallelism and deep pipelining. PipeArch supports context-switching and thread migration, thereby enabling for the first time new capabilities such as preemptive scheduling in FPGA accelerators within a high-performance data processing setting. We have used PipeArch to implement a variety of machine learning methods for generalized linear model training and recommender systems, showing empirically their advantages over a high-end CPU and even over fully specialized FPGA designs.

Published in

ACM Transactions on Reconfigurable Technology and Systems, Volume 14, Issue 1
March 2021
138 pages
ISSN: 1936-7406
EISSN: 1936-7414
DOI: 10.1145/3418746
Editor: Deming Chen

Copyright © 2020 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

• Published: 5 November 2020
• Accepted: 1 August 2020
• Revised: 1 June 2020
• Received: 1 April 2020

Qualifiers

• research-article
• Research
• Refereed