Variation-Tolerant Hierarchical Voltage Monitoring Circuit for Soft Error Detection

Ashay Narsale and Michael Huang
University of Rochester


Abstract

As device feature sizes continue to scale into the nanometer regime, the decreasing critical charge fundamentally reduces device noise margins and in turn increases the susceptibility of ICs to external noise sources such as particle strikes. While protection techniques for memory, such as ECC, are mature and effective, protection against logic errors remains imperfect. Full-blown redundancy solutions for microprocessors, such as mirrored cores and triple-modular redundancy, incur significant overhead and are limited to the niche market of mission-critical servers. The fundamental inefficiency of such redundancy lies in repeating every operation to detect discrepancies caused by events that are much rarer than cycle-to-cycle activity. For the vast majority of general-purpose systems, a detection mechanism with low standby energy consumption is called for. In this paper, we propose a circuit-level solution that detects errors by monitoring the supply-rail disturbance caused by a particle strike. Combined with checkpointing and rollback support, such a circuit can provide a high level of protection against particle-strike-induced soft errors. At 17%, the power overhead of the design is reasonable and much lower than that of prior art. The design is also tolerant to process, voltage, and temperature (PVT) variations.