Xiaotong Zhuang, Mauricio J. Serrano, et al.
ACM SIGPLAN Notices
This paper presents a framework for analyzing the performance of multithreaded programs using a model called a constraint graph. We review previous constraint graph definitions for sequentially consistent systems, and extend these definitions for use in analyzing other memory consistency models. Using this framework, we present two constraint graph analysis case studies using several commercial and scientific workloads running on a full system simulator. The first case study illustrates how a constraint graph can be used to determine the necessary conditions for implementing a memory consistency model, rather than conservative sufficient conditions. Using this method, we classify coherence misses as either required or unnecessary. We determine that on average over 30% of all load instructions that suffer cache misses due to coherence activity are unnecessarily stalled because the original copy of the cache line could have been used without violating the memory consistency model. Based on this observation, we present a novel delayed consistency implementation that uses stale cache lines when possible. The second case study demonstrates the effects of memory consistency constraints on the fundamental limits of instruction level parallelism, compared to previous estimates that did not include multiprocessor constraints. Using this method we determine the differences in exploitable ILP across memory consistency models for processors that do not perform consistency-related speculation.
Xiaotong Zhuang, Mauricio J. Serrano, et al.
ACM SIGPLAN Notices
Ravi Nair
IEEE TC
Ravi Nair
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Ravi Nair, Se June Hong, et al.
DAC 1982