NBTI Aware Workload Balancing in Multi-core Systems

Jin Sun1,  Avinash Kodi2,  Ahmed Louri1,  Janet Wang1
1Univ. of Arizona, 2Ohio University


Abstract

As device feature size continues to shrink, reliability becomes a severe issue due to process variation, particle-induced transient errors, and transistor wear-out/stress such as Negative Bias Temperature Instability (NBTI). Unless this problem is addressed, chip multi-processor (CMP) systems face low yields and short mean-time-to-failure (MTF). This paper proposes a new design framework for multi-core systems that accounts for the impact of device wear-out. Based on a device-level fractional NBTI model, we propose a new NBTI-aware system workload model and develop a new dynamic tile partition (DTP) algorithm that balances workload among active cores while relaxing stressed ones. Experimental results on 64 cores show that by allowing a small number of cores (10$\%$) to relax for a short time period (10 seconds), the proposed methodology improves CMP system yield by 20$\%$ and extends MTF by 30$\%$ with little degradation in performance (less than 6$\%$).