A. A Formal Definition

Consider the CFG subgraph Gs that contains all blocks along all paths that are rooted at a divergent
branch Bd and terminate at any post-dominator Bpd of that branch. Pick a path Pi and a block Bi along
that path. The pre-schedule thread frontier of Bi is Gs − {Bd → Bi}, where {Bd → Bi} is the set of
blocks along Pi from Bd to Bi. Simply put, the pre-schedule thread frontier identifies all other blocks that
may contain divergent threads that were created at Bd, without any knowledge of the thread scheduling
order. This information is not very useful in and of itself, since it may contain most blocks in the program.
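The pre-schedule definition can be illustrated with a minimal sketch. It assumes the CFG is given as an adjacency dictionary, that a single post-dominator Bpd has already been chosen, and that Gs can be obtained as the blocks reachable from Bd that can also reach Bpd. The function names and the example graph are hypothetical and do not come from the paper.

```python
def reachable(graph, start):
    """Return every block reachable from start, including start itself."""
    seen, stack = set(), [start]
    while stack:
        block = stack.pop()
        if block not in seen:
            seen.add(block)
            stack.extend(graph.get(block, []))
    return seen


def subgraph_between(cfg, b_d, b_pd):
    """Blocks that lie on some path from b_d to b_pd (assumed construction of Gs)."""
    reverse_cfg = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            reverse_cfg.setdefault(dst, []).append(src)
    return reachable(cfg, b_d) & reachable(reverse_cfg, b_pd)


def pre_schedule_frontier(cfg, b_d, b_pd, path_to_b_i):
    """Gs minus the blocks along the chosen path from b_d to b_i."""
    return subgraph_between(cfg, b_d, b_pd) - set(path_to_b_i)


# Hypothetical CFG: Bd diverges to B1 and B2, and all paths rejoin at Bpd.
cfg = {
    "Bd": ["B1", "B2"],
    "B1": ["B3"],
    "B2": ["B3", "B4"],
    "B3": ["Bpd"],
    "B4": ["Bpd"],
    "Bpd": [],
}

print(sorted(pre_schedule_frontier(cfg, "Bd", "Bpd", ["Bd", "B1"])))
# ['B2', 'B3', 'B4', 'Bpd']
```

Even in this small graph the pre-schedule frontier of B1 covers four of the six blocks, which is consistent with the observation that it may contain most blocks in the program.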
A stronger form is produced by assigning a priority to each block in Gs and asserting that blocks are
scheduled in priority order. In this case, the post-schedule thread frontier, which is simply referred to as
the thread frontier in the rest of the paper, is defined as Gs − {Bd → Bpd} − Glp − Gth, where Glp is the
set of blocks in Gs with priority less than that of Bi and Gth is the set of blocks with a predecessor in any
path that is not in Glp. In this case, Bi contains the threads currently being executed, and its thread frontier
contains the locations of all other divergent threads. A re-convergence point exists when a successor of
Bi is contained in its thread frontier. The size of the thread frontier is an upper bound on the number
of blocks that may contain divergent threads at any point in the program, and it may be computed
statically, at compile time, to bound the resources required to manage divergent threads.
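The set expression and the re-convergence test can be sketched as follows. Several choices here are assumptions rather than statements from the paper: a smaller priority value is taken to mean "scheduled earlier", the subtracted path is passed in explicitly as the blocks from Bd to Bi, and Gth is read as the blocks none of whose predecessors is in Glp (blocks that no thread can have reached yet). All function names and the example graph are hypothetical.

```python
def post_schedule_frontier(cfg, g_s, path_blocks, priority, b_i):
    """Evaluate Gs - path - Glp - Gth for block b_i under the stated assumptions."""
    preds = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)
    # Glp: blocks whose priority value is less than that of b_i.
    g_lp = {b for b in g_s if priority[b] < priority[b_i]}
    # Gth (assumed reading): blocks with predecessors, none of which is in Glp.
    g_th = {b for b in g_s if preds.get(b) and preds[b].isdisjoint(g_lp)}
    return set(g_s) - set(path_blocks) - g_lp - g_th


def has_reconvergence_point(cfg, b_i, frontier):
    """True when some successor of b_i is contained in its thread frontier."""
    return any(succ in frontier for succ in cfg.get(b_i, []))


# Hypothetical CFG and schedule: blocks run in the order Bd, B1, B2, B3, B4, Bpd.
cfg = {
    "Bd": ["B1", "B2"],
    "B1": ["B3"],
    "B2": ["B3", "B4"],
    "B3": ["Bpd"],
    "B4": ["Bpd"],
    "Bpd": [],
}
g_s = set(cfg)
priority = {"Bd": 0, "B1": 1, "B2": 2, "B3": 3, "B4": 4, "Bpd": 5}

frontier_b1 = post_schedule_frontier(cfg, g_s, {"Bd", "B1"}, priority, "B1")
frontier_b2 = post_schedule_frontier(cfg, g_s, {"Bd", "B2"}, priority, "B2")

print(sorted(frontier_b1), has_reconvergence_point(cfg, "B1", frontier_b1))
# ['B2'] False
print(sorted(frontier_b2), has_reconvergence_point(cfg, "B2", frontier_b2))
# ['B3'] True
```

Under these assumptions, the threads that left B1 wait at B3 while B2 executes, so B3 is both in the frontier of B2 and a successor of B2, which marks it as a re-convergence point. Taking the largest frontier size over all blocks would give the static resource bound described above.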
Thread frontiers depend on both the CFG structure and the scheduling order of blocks; a different
schedule will create a different thread frontier for each block. This property may be used to optimize the
thread schedule to meet certain criteria, for example, to minimize the size of all thread frontiers, to
minimize the size of thread frontiers for blocks along hot paths, or to minimize the average path length
between blocks in the same thread frontier. These criteria are expected to influence program behavior by,
respectively, limiting the possibility and degree of divergence in the program, limiting it along a hot path,
or encouraging threads to execute in near lock-step to improve instruction and data locality. The advantage
of this technique is that these properties can be enforced statically, before the program is executed.
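As an illustration of schedule selection, the following sketch searches exhaustively for a schedule that minimizes a cost over frontier sizes. It is not the paper's algorithm. It assumes an acyclic Gs, considers only topological orders (so every block on the path before Bi already has lower priority and the path term reduces to Bi itself), treats an earlier position in the order as lower priority, and reads Gth as the blocks none of whose predecessors is in Glp. All names are hypothetical, and the brute-force search is only practical for very small graphs.

```python
from itertools import permutations


def frontier_sizes(cfg, order):
    """Thread frontier size of each block under the given schedule (assumed model)."""
    priority = {block: i for i, block in enumerate(order)}
    preds = {}
    for src, dsts in cfg.items():
        for dst in dsts:
            preds.setdefault(dst, set()).add(src)
    blocks = set(order)
    sizes = {}
    for b_i in order:
        g_lp = {b for b in blocks if priority[b] < priority[b_i]}
        g_th = {b for b in blocks if preds.get(b) and preds[b].isdisjoint(g_lp)}
        sizes[b_i] = len(blocks - {b_i} - g_lp - g_th)
    return sizes


def is_topological(cfg, order):
    """True when every edge goes from an earlier block to a later one."""
    priority = {block: i for i, block in enumerate(order)}
    return all(priority[s] < priority[d] for s, dsts in cfg.items() for d in dsts)


def best_schedule(cfg, cost):
    """Try every topological order and keep one with the lowest cost."""
    candidates = (list(p) for p in permutations(cfg) if is_topological(cfg, p))
    return min(candidates, key=lambda order: cost(frontier_sizes(cfg, order)))


def total_size(sizes):
    """Criterion: the sum of all thread frontier sizes."""
    return sum(sizes.values())


# Hypothetical CFG: Bd diverges to B1 and B2, and all paths rejoin at Bpd.
cfg = {
    "Bd": ["B1", "B2"],
    "B1": ["B3"],
    "B2": ["B3", "B4"],
    "B3": ["Bpd"],
    "B4": ["Bpd"],
    "Bpd": [],
}

order = best_schedule(cfg, total_size)
print(order, total_size(frontier_sizes(cfg, order)))
```

Swapping `total_size` for a function that sums only the entries of blocks on a hot path would express the second criterion mentioned above; the search itself is unchanged.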
