Tuesday, August 31, 2010

p13 Dynamic Code Mapping for Limited Local Memory Systems

Abstract—This paper presents heuristics for dynamic management
of application code on limited local memories present
in high-performance multi-core processors.

I. INTRODUCTION
Multicore architectures are becoming popular since they
provide a way to improve peak performance without much
increase in power consumption.

In processor cores with limited local memories, if the whole
application fits into the limited local memory, it executes extremely
efficiently.

(page 1 col 2)

The inputs to this code mapping problem are traditionally
i) the maximum size of the code region, and ii) a call graph,
in which the nodes represent functions, and a directed edge
between two functions denotes a caller-callee relationship.
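
As a rough illustration of these two inputs, here is a minimal Python sketch. The class names (CallGraph, MappingProblem), the per-function code size, and the numbers in the usage lines are assumptions made for this note, not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class CallGraph:
    # Hypothetical representation: nodes are function names, and each node
    # carries a code size in bytes (the size attribute is an assumption).
    sizes: dict = field(default_factory=dict)
    # Directed edges: caller -> list of callees.
    callees: dict = field(default_factory=dict)

    def add_function(self, name, size):
        self.sizes[name] = size
        self.callees.setdefault(name, [])

    def add_call(self, caller, callee):
        # A directed edge denotes a caller-callee relationship.
        self.callees[caller].append(callee)


@dataclass
class MappingProblem:
    max_code_region_size: int  # input i): maximum size of the code region
    graph: CallGraph           # input ii): the call graph


# Usage with made-up sizes.
g = CallGraph()
g.add_function("main", 512)
g.add_function("helper", 256)
g.add_call("main", "helper")
problem = MappingProblem(max_code_region_size=1024, graph=g)
```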

The approach in this paper removes several limitations of
previous approaches.

II. MOTIVATING EXAMPLE
This section provides an example to illustrate two ideas:
i) the interference cost between two functions depends on where
the other functions are mapped, and ii) updating interference
costs can lead to a code mapping that minimizes the data transfers
between the limited local memory and the global memory.

Figure 1 (a) shows a simple call graph in which function
F1 calls F2, F2 calls F3, and F3 calls F4, and then they all
return.
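
To make the first idea concrete, the sketch below replays this call chain under different function-to-region mappings and counts the bytes copied into local memory. It is only an illustration under stated assumptions: every function is given a made-up size of 1000 bytes, each region holds one function at a time, and a function is reloaded whenever control reaches it (by call or by return) while another function occupies its region. The helper names execution_trace and bytes_transferred are hypothetical, and this is not the paper's cost model.

```python
def execution_trace(callees, root):
    """Order in which functions become active, counting both calls and returns."""
    trace = [root]
    for callee in callees.get(root, []):
        trace.extend(execution_trace(callees, callee))
        trace.append(root)  # control returns to the caller
    return trace


def bytes_transferred(trace, region_of, size_of):
    """Bytes loaded into local memory for a given function-to-region mapping."""
    resident = {}  # region id -> function currently loaded there
    total = 0
    for fn in trace:
        region = region_of[fn]
        if resident.get(region) != fn:
            total += size_of[fn]  # fn must be (re)loaded, evicting the occupant
            resident[region] = fn
    return total


# Call graph from the example: F1 calls F2, F2 calls F3, F3 calls F4.
CALLEES = {"F1": ["F2"], "F2": ["F3"], "F3": ["F4"], "F4": []}
# Hypothetical sizes in bytes, chosen only for illustration.
SIZES = {"F1": 1000, "F2": 1000, "F3": 1000, "F4": 1000}

trace = execution_trace(CALLEES, "F1")

one_region = {f: 0 for f in SIZES}
own_regions = {"F1": 0, "F2": 1, "F3": 2, "F4": 3}
two_regions = {"F1": 0, "F3": 0, "F2": 1, "F4": 1}

for name, mapping in [("one region", one_region),
                      ("two regions", two_regions),
                      ("own regions", own_regions)]:
    print(name, bytes_transferred(trace, mapping, SIZES))
```

In the two-region mapping, F1 and F3 share a region, yet F3 is loaded only once: its return from F4 finds it still resident because F4 was mapped elsewhere. Moving F4 into the same region would force a reload, which is the sense in which the cost between two functions depends on where the others are placed.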

For the indirect edges, the weight calculation is slightly
tricky.

(page 2)
Previous approaches computed the worst-case interference
cost, i.e., 2.4 KB for F1 - F3, and never updated it, and
therefore obtained an inferior mapping.

(page 2 col 2)
Clearly there is a discrepancy in computing the interference
cost between region 2 and function F4.
