IRHT: An SDC detection and recovery architecture based on value locality of instruction binary codes
Microprocessors and Microsystems (IF 1.9), Pub Date: 2020-06-02, DOI: 10.1016/j.micpro.2020.103159
Alireza Tajary, Hamid R. Zarandi, Nader Bagherzadeh

Silent Data Corruptions (SDCs), errors that escape detection methods, are critical for system designers because they may result in system failures. To catch SDCs, mechanisms should focus on the behavioural aspects of errors in addition to their physical location or error patterns. Therefore, protection codes such as parity, Hamming, and Reed-Solomon codes, which depend heavily on the physical location of data bits, are not sufficient for detecting computing errors in processors. By characterizing data behaviour during program execution, we have observed value locality in the destination-register results of each instruction binary code (instruction opcode and operand codes). This locality exists not only in the results of each instruction, but also in the results of instructions at different memory locations that have the same binary code. Based on this observation, an architecture called the Instruction Result History Table (IRHT) is presented, which is indexed by the instruction binary code. The IRHT stores a history of the values produced by the same instruction binary code and consults it during each instruction execution cycle. Any mismatch between the values stored in the IRHT and the value generated by the current execution indicates an SDC syndrome. To confirm the SDC with a high level of confidence, a second execution of the current instruction is issued; this duplicate execution confirms whether an SDC occurred. If so, a third execution of the instruction, combined with majority voting, removes the SDC from the system. Extensive simulations showed that up to 83.54% of SDCs are detectable with the help of this locality. Moreover, with a small IRHT of 16 kB, 80.66% of SDCs can be detected on average. Note that the presented method can detect errors that escape conventional detection mechanisms, so it can be used in conjunction with other conventional methods.
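
The detect, re-execute, and vote flow described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the class name `IRHT`, the `commit` method, the table size, the history depth, the LRU replacement policy, and the decision to treat a cold miss as a mismatch are all hypothetical choices made for illustration, since the abstract does not specify them.

```python
from collections import OrderedDict, deque


def majority(a, b, c):
    """Return the value produced by at least two of three executions."""
    if a == b or a == c:
        return a
    if b == c:
        return b
    # Assumption: the abstract does not say what happens without a majority.
    raise RuntimeError("no majority among three executions")


class IRHT:
    """Toy model of an Instruction Result History Table.

    Assumptions (not from the paper): LRU replacement, a fixed-depth
    value history per entry, and cold misses treated as mismatches.
    """

    def __init__(self, entries=1024, history_depth=4):
        self.entries = entries
        self.history_depth = history_depth
        self.table = OrderedDict()  # instruction binary code -> deque of values
        self.reexecutions = 0
        self.sdcs_corrected = 0

    def _seen_before(self, code, value):
        history = self.table.get(code)
        return history is not None and value in history

    def _record(self, code, value):
        history = self.table.get(code)
        if history is None:
            if len(self.table) >= self.entries:
                self.table.popitem(last=False)  # evict least recently used entry
            history = deque(maxlen=self.history_depth)
            self.table[code] = history
        else:
            self.table.move_to_end(code)  # mark entry as most recently used
        if value in history:
            history.remove(value)  # keep each value once, newest last
        history.append(value)

    def commit(self, code, execute):
        """Run `execute` for instruction `code` and return a checked result."""
        result = execute()
        if not self._seen_before(code, result):
            # Mismatch with the stored history: suspected SDC, so re-execute.
            self.reexecutions += 1
            second = execute()
            if second != result:
                # The two executions disagree: run a third time and vote.
                third = execute()
                self.sdcs_corrected += 1
                result = majority(result, second, third)
        self._record(code, result)
        return result
```

In this model, calling `commit` with an instruction code and a callable that computes its result returns the voted value whenever the first execution disagrees with both the history and a second execution. A corrupted result that happens to match a value already in the history would pass unchecked, which is one plausible reason locality-based detection covers most but not all SDCs.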




Updated: 2020-06-02