Signal processing for a reverse-GPS wildlife tracking system: CPU and GPU implementation experiences
Concurrency and Computation: Practice and Experience (IF 1.5), Pub Date: 2021-07-22, DOI: 10.1002/cpe.6506
Yaniv Rubinpur, Sivan Toledo

We present robust high-performance implementations of signal-processing tasks performed by a high-throughput wildlife tracking system called ATLAS. The system tracks radio transmitters attached to wild animals by estimating the time of arrival of radio packets at multiple receivers (base stations). Time-of-arrival estimation of wideband radio signals is computationally expensive, especially in acquisition mode (when the time of transmission is not known, not even approximately). These computations are a bottleneck that limits the throughput of the system. We developed a sequential high-performance CPU implementation of the computations a few years ago, and more recently a GPU implementation. Both strive to balance performance with simplicity, maintainability, and development effort, as most real-world codes do. The article reports on the two implementations and carefully evaluates their performance. The evaluation indicates that the GPU implementation dramatically improves performance and power-performance relative to the sequential CPU implementation running on a desktop CPU typical of the computers in current base stations. Performance improves by more than 50X on a high-end GPU and by more than 4X on a GPU platform that consumes almost 5 times less power than the CPU platform. Performance-per-Watt ratios also improve (by more than 16X), and so do the price-performance ratios.
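
The abstract does not say how the arrival times are estimated. As a rough illustration of why acquisition mode is expensive, the sketch below uses plain sliding cross-correlation against a known template, which is a standard technique and not necessarily the one ATLAS uses. The Barker-13 code, the noise level, and the names `estimate_toa`, `template`, and `received` are all assumptions made for the example.

```python
import random

# Barker-13 code: a stand-in for a known transmitted packet (assumption;
# the real ATLAS signals are not described in the abstract).
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def estimate_toa(received, template):
    """Return the sample offset where the template correlates best.

    With no prior knowledge of the transmission time (acquisition mode),
    every candidate offset has to be tested, so the cost grows as
    len(received) * len(template) for this direct method.
    """
    n, m = len(received), len(template)
    best_offset, best_score = 0, float("-inf")
    for offset in range(n - m + 1):
        # Correlation of the template with the window starting at offset.
        score = sum(received[offset + k] * template[k] for k in range(m))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset


# Synthetic received buffer: bounded noise plus one packet at a known offset.
rng = random.Random(0)
true_offset = 300
received = [rng.uniform(-0.2, 0.2) for _ in range(1000)]
for k, chip in enumerate(BARKER_13):
    received[true_offset + k] += chip

estimated = estimate_toa(received, BARKER_13)
print("true offset:", true_offset, "estimated offset:", estimated)
```

Each candidate offset is scored independently of the others, which is the kind of data parallelism that maps well onto a GPU. Production implementations typically compute the same correlation with FFTs rather than the direct loop shown here.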

Updated: 2021-07-22