The free/libre software license under which
Scotch 5.1
is distributed is
the CeCILL-C license [6], which has basically the same features as the GNU
LGPL (“
Lesser General Public License
”): ability to link the code as a library
to any free/libre or even proprietary software, ability to modify the code and to
redistribute these modifications. Version
4.0
of
Scotch
was distributed under the
LGPL itself.
Please refer to section 8 to see how to obtain the free/libre distribution of
Scotch
.
3
Algorithms
3.1
Static mapping by Dual Recursive Bipartitioning
For a detailed description of the mapping algorithm and an extensive analysis of its
performance, please refer to [41, 44]. In the next sections, we will only outline the
most important aspects of the algorithm.
3.1.1
Static mapping
The parallel program to be mapped onto the target architecture is modeled by a val-
uated unoriented graph
S
called
source graph
or
process graph
, the vertices of which
represent the processes of the parallel program, and the edges of which the commu-
nication channels between communicating processes. Vertex- and edge- valuations
associate with every vertex
v
S
and every edge
e
S
of
S
integer numbers
w
S
(
v
S
) and
w
S
(
e
S
) which estimate the computation weight of the corresponding process and
the amount of communication to be transmitted on the channel, respectively.
The target machine onto which is mapped the parallel program is also modeled
by a valuated unoriented graph
T
called
target graph
or
architecture graph
. Vertices
v
T
and edges
e
T
of
T
are assigned integer weights
w
T
(
v
T
) and
w
T
(
e
T
), which
estimate the computational power of the corresponding processor and the cost of
traversal of the inter-processor link, respectively.
A
mapping
from
S
to
T
consists of two applications
τ
S,T
:
V
(
S
)
−→
V
(
T
) and
ρ
S,T
:
E
(
S
)
−→ P
(
E
(
T
)), where
P
(
E
(
T
)) denotes the set of all simple loopless
paths which can be built from
E
(
T
).
τ
S,T
(
v
S
) =
v
T
if process
v
S
of
S
is mapped
onto processor
v
T
of
T
, and
ρ
S,T
(
e
S
) =
{
e
1
T
, e
2
T
, . . . , e
n
T
}
if communication channel
e
S
of
S
is routed through communication links
e
1
T
,
e
2
T
, . . . ,
e
n
T
of
T
.
|
ρ
S,T
(
e
S
)
|
denotes the dilation of edge
e
S
, that is, the number of edges of
E
(
T
) used to route
e
S
.
3.1.2
Cost function and performance criteria
The computation of efficient static mappings requires an
a priori
knowledge of the
dynamic behavior of the target machine with respect to the programs which are
run on it. This knowledge is synthesized in a
cost function
, the nature of which
determines the characteristics of the desired optimal mappings. The goal of our
mapping algorithm is to minimize some communication cost function, while keeping
the load balance within a specified tolerance. The communication cost function
f
C
that we have chosen is the sum, for all edges, of their dilation multiplied by their
weight:
f
C
(
τ
S,T
, ρ
S,T
)
def
=
X
e
S
∈
E
(
S
)
w
S
(
e
S
)
|
ρ
S,T
(
e
S
)
|
.
8