
My Dissertation


Factors Affecting False Sharing on Page-Granularity Cache-Coherent Shared-Memory Multiprocessors

Abstract

Efficiently supporting a shared memory paradigm in a large-scale multiprocessor generally involves some form of data caching. One of the drawbacks of caching shared data is the cost of keeping the multiple copies coherent. One source of unnecessary coherency overhead is a problem known as false sharing. Unfortunately, the lack of a precise, universally accepted definition of false sharing hinders research to detect and eliminate the problem.

We articulate our intuitive notion of false sharing and address the problems encountered in previous attempts at defining false sharing. We motivate the importance of a concrete measure by demonstrating that false-sharing-related coherence overhead comprises a significant portion of the coherence costs in real applications, especially when page-granularity coherence is required. We propose and experimentally evaluate an architecture-independent measure of the false sharing exhibited in a reference trace for cache lines of a specified size.

The proposed measure attempts to summarize the false sharing impact by approximating some factors and discarding others. The evaluation of this formulation reveals that such summary statistics lose too much information to be of practical use in predicting performance. We use this work to motivate experiments to determine the relative importance of the various workload and architectural factors that affect coherence data traffic. The conclusion from these experiments is that the precise memory reference interleaving order is the most significant factor affecting false sharing coherence data traffic.

Our methodology is to use an execution-driven simulation of specific architectures and applications to generate memory reference traces. The traces are then analyzed off-line.
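To make the off-line trace analysis concrete, here is a minimal sketch of how a reference trace could be scanned for false sharing. It is not the measure proposed in the dissertation. The function name `classify_coherence_misses`, the `(processor, operation, word_address)` trace format, the infinite per-processor caches, and the write-invalidate protocol are all assumptions made for illustration. A coherence miss is counted as true sharing if the word being accessed was written by another processor since the copy was invalidated, and as false sharing otherwise.

```python
from collections import defaultdict


def classify_coherence_misses(trace, block_size):
    """Count (true_sharing, false_sharing) coherence misses in a trace.

    trace: iterable of (proc, op, word_addr) with op "R" or "W".
    block_size: words per coherence block (cache line or page).
    Assumes infinite caches and a write-invalidate protocol.
    """
    holders = defaultdict(set)   # block -> processors holding a valid copy
    stale = defaultdict(dict)    # block -> {invalidated proc: words written since}
    true_misses = 0
    false_misses = 0

    for proc, op, addr in trace:
        block = addr // block_size
        if proc not in holders[block]:
            written = stale[block].pop(proc, None)
            if written is not None:          # lost the copy earlier: coherence miss
                if addr in written:
                    true_misses += 1         # the accessed word really changed
                else:
                    false_misses += 1        # only other words in the block changed
            holders[block].add(proc)         # cold misses are not counted
        if op == "W":
            # Invalidate every other copy and start tracking what they miss.
            for other in holders[block] - {proc}:
                stale[block][other] = set()
            holders[block] = {proc}
            for words in stale[block].values():
                words.add(addr)

    return true_misses, false_misses


# Two processors, each writing its own word of the same block.
interleaved = [(0, "W", 0), (1, "W", 1), (0, "W", 0), (1, "W", 1)]
grouped = [(0, "W", 0), (0, "W", 0), (1, "W", 1), (1, "W", 1)]

print(classify_coherence_misses(interleaved, block_size=4))  # (0, 2)
print(classify_coherence_misses(grouped, block_size=4))      # (0, 0)
print(classify_coherence_misses(interleaved, block_size=1))  # (0, 0)
```

The two traces contain the same references and differ only in interleaving order, yet under this toy model only the interleaved one incurs false sharing misses. Shrinking the block to one word removes them as well.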


Getting Your Very Own Copy

If you would like a copy of my dissertation, you can download a compressed PostScript copy from the Duke web site or a converted PDF file, both of which need to be printed two-sided. Alternatively, you can request a printed copy by e-mail. Ask for Technical Report CS-1994-37. It is 123 pages long.

Vivek Khera
Author