ParaStation5 Administrator's Guide
UseMCast statement.
If Multicast is enabled, the ParaStation daemons exchange status information using multicast messages.
Thus, a Linux kernel supporting multicast is required on all nodes of the cluster. This is usually not a problem,
since the standard kernels of all common distributions are compiled with multicast support. If a customized
kernel is used, multicast support must be enabled in the kernel configuration. To learn more
about multicast, take a look at the Multicast over TCP/IP HOWTO.
In addition, the hardware has to support multicast packets. Since all modern Ethernet switches support
multicast and the nodes of a cluster typically live in a private subnet, this should not be a problem. If the
cluster nodes are connected through a gateway, it has to be configured to allow multicast packets
from every node to reach all other nodes of the cluster.
Using a gateway in order to link parts of a cluster is not a recommended configuration.
On nodes with more than one Ethernet interface, typically frontend or head nodes, or on systems where the
default route does not point to the private cluster subnet, a proper route for the multicast traffic must be
set up. This is done with the command
route add -net 224.0.0.0 netmask 240.0.0.0 dev ethX
where ethX should be replaced by the actual name of the interface connecting to all other nodes. To
enable this route at system startup, a corresponding entry has to be added to /etc/route.conf or
/etc/sysconfig/networks/routes, depending on the type of Linux distribution in use.
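Before adding the route, it can be useful to check whether it is already present. The following is a minimal sketch, not part of ParaStation: the helper names has_mcast_route and mcast_route_cmd are hypothetical, eth1 stands in for ethX, and the check assumes the eight-column table layout printed by the net-tools command route -n (Destination, Gateway, Genmask, Flags, Metric, Ref, Use, Iface).

```shell
#!/bin/sh
# Hypothetical helper: reads a routing table in `route -n` format on stdin
# and succeeds if a 224.0.0.0/240.0.0.0 route exists on the given interface.
has_mcast_route() {
    awk -v dev="$1" '
        $1 == "224.0.0.0" && $3 == "240.0.0.0" && $8 == dev { found = 1 }
        END { exit !found }
    '
}

# Hypothetical helper: prints the route command from the text for the
# given interface, without executing it.
mcast_route_cmd() {
    printf 'route add -net 224.0.0.0 netmask 240.0.0.0 dev %s\n' "$1"
}
```

A possible use, run as root on a head node, would be: route -n | has_mcast_route eth1 || eval "$(mcast_route_cmd eth1)"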
5.17. Copying files in parallel
To copy large files to many or all nodes of a cluster at once, pscp is very handy. It overlaps storing data
to disk with transferring data over the network and therefore scales very well with the number of
nodes. Files of arbitrary size may be copied, and even archives containing large lists of files may be created
and unpacked on the fly.
Pscp uses the ParaStation pscom library for data transfers, which automatically selects the most
effective communication channel available. If required, the communication layer may be controlled using
environment variables; refer to ps_environment(7) for details. The client process on each node is spawned
using the ParaStation process management.
As pscp uses administrative ParaStation tasks to spawn the client processes, the user must be a member
of the adminuser list or the user's group must be a member of the admingroup list. By default, only root
is a member of the adminuser list and therefore allowed to use pscp. Refer to ParaStation5 User's Guide
and psiadmin(8) for details.
For more details refer to ParaStation5 User's Guide and pscp(8).
5.18. Using ParaStation accounting
ParaStation may write accounting information about each finished job run on the cluster to
/var/account/yyyymmdd, where yyyymmdd denotes the date of the current accounting file in the form
year, month and day.
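The file naming scheme above can be sketched in a few lines of shell. This is only an illustration under the stated naming convention; the helper name acct_file is hypothetical and not part of ParaStation.

```shell
#!/bin/sh
# Hypothetical helper: prints the accounting file path for a given day
# (yyyymmdd); defaults to today's date if no argument is given.
acct_file() {
    day="${1:-$(date +%Y%m%d)}"
    printf '/var/account/%s\n' "$day"
}
```

For example, less "$(acct_file)" would open today's accounting file on a node where accounting is enabled.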
To enable accounting, the special hardware accounter must be set within the ParaStation configuration
file for at least one node. On each node configured this way, an accounting daemon collects information
on all jobs within the cluster and stores it in the accounting file.
Summary of Contents for PARASTATION5 V5
Page 1: ...Administrator s Guide Release 5 0 5 Published April 2010...