Policy-Based Reinforcement Learning for Through Silicon Via Array Design in High-Bandwidth Memory Considering Signal Integrity, IEEE Transactions on Electromagnetic Compatibility



IEEE Transactions on Electromagnetic Compatibility (IF 2.1), Pub Date: 2024-01-17, DOI: 10.1109/temc.2023.3343700
Keunwoo Kim₁, Hyunwook Park₂, Seongguk Kim₁, Youngwoo Kim₃, Kyungjune Son₁, Daehwan Lho₁, Keeyoung Son₁, Taein Shin₁, Boogyo Sim₁, Joonsang Park₁, Shinyoung Park₄, Joungho Kim₁


This article proposes a policy-based reinforcement learning (RL) method for optimizing through silicon via (TSV) array design in high-bandwidth memory (HBM) with signal integrity taken into account. The method provides an optimized TSV-array signal/ground pattern design that maximizes the eye opening (EO), which determines the bandwidth of the high-speed TSV channel. It adopts the proximal policy optimization (PPO) algorithm, which trains the policy directly and therefore handles large action spaces more efficiently than value-based RL. A convolutional neural network serves as a feature extractor that captures the location information of the TSV array. To reduce the computational cost of reward estimation, a fast EO estimation method is developed based on equivalent circuit modeling and peak distortion analysis. Applied to a 1-byte TSV array in a 16-high HBM, the method increases the EO by 18.2% compared with the initial design. Compared with a deep Q-network (DQN) and a random search algorithm, it achieves 3.4% and 9.6% better optimality, respectively.
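The abstract names peak distortion analysis as one ingredient of the fast EO estimate but gives no formula. As a rough illustration only, the textbook form of the technique for NRZ signaling takes the single-bit (pulse) response sampled once per unit interval and subtracts the magnitudes of all non-cursor samples from the main cursor; crosstalk from neighboring channels is subtracted the same way. The sketch below assumes that textbook form. The function name, parameters, and sample values are hypothetical and are not taken from the paper, whose estimator also relies on an equivalent circuit model that is not reproduced here.

```python
from typing import Sequence


def worst_case_eye_opening(
    pulse_response: Sequence[float],
    cursor_index: int,
    aggressor_responses: Sequence[Sequence[float]] = (),
) -> float:
    """Worst-case vertical eye opening via peak distortion analysis (NRZ).

    Hypothetical helper for illustration, not the paper's implementation.

    pulse_response: victim single-bit response, one sample per unit interval.
    cursor_index: position of the main cursor within pulse_response.
    aggressor_responses: crosstalk pulse responses from neighboring channels,
        sampled at the same phase (assumed input format).
    """
    if not 0 <= cursor_index < len(pulse_response):
        raise ValueError("cursor_index is outside the pulse response")

    cursor = pulse_response[cursor_index]

    # Worst-case inter-symbol interference: every pre/post-cursor sample
    # is assumed to act against the current bit.
    isi = sum(abs(h) for i, h in enumerate(pulse_response) if i != cursor_index)

    # Worst-case crosstalk: every aggressor sample is assumed to act against
    # the victim bit as well.
    crosstalk = sum(abs(h) for response in aggressor_responses for h in response)

    return cursor - isi - crosstalk


# Made-up sample values in volts, for illustration only.
victim = [0.02, 0.60, 0.10, -0.05]
aggressor = [0.01, -0.03, 0.02]
print(worst_case_eye_opening(victim, 1, [aggressor]))  # roughly 0.37
```

A negative return value means the worst-case eye is closed. Because this estimate needs only sampled pulse responses rather than a long transient simulation, it is cheap enough to call once per candidate pattern, which is the property an RL reward function needs.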


Updated: 2024-01-17
