6 – SHMEM Description and Configuration
Progress Model
IB0054606-02 A 6-11
Alternatively, if $SHMEM_SHMALLOC_BASE_ADDR is specified as 0, then each
SHMEM process will independently choose its own base virtual address for the
global shared memory segment. In this case, the pointer values returned by
shmalloc() for a symmetric allocation are no longer guaranteed to be identical
across the PEs. The QLogic SHMEM implementation takes care of this asymmetry
by using offsets relative to the base of the symmetric heap in its protocols.
However, applications that interpret symmetric heap pointer values or exchange
symmetric heap pointer values between PEs will not behave as expected.
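To illustrate why raw pointer values should not be exchanged between PEs when
$SHMEM_SHMALLOC_BASE_ADDR is 0, the following minimal sketch models the
offset-based translation described above. It is plain arithmetic and not the
QLogic SHMEM implementation; the base addresses and the helper names to_offset
and to_local are hypothetical and chosen only for illustration.

```python
# Toy model: each PE maps the symmetric heap at its own base virtual address.
# The base addresses below are invented for illustration only.
PE_HEAP_BASE = {0: 0x7F0000000000, 1: 0x7F5500000000}


def to_offset(pe, addr):
    """Convert a symmetric heap address on `pe` to a heap-relative offset."""
    offset = addr - PE_HEAP_BASE[pe]
    if offset < 0:
        raise ValueError("address is below the symmetric heap base of this PE")
    return offset


def to_local(pe, offset):
    """Convert a heap-relative offset back to an address valid on `pe`."""
    return PE_HEAP_BASE[pe] + offset


# The same symmetric object sits at the same offset on every PE,
# but at a different virtual address.
addr_on_pe0 = PE_HEAP_BASE[0] + 0x1000
offset = to_offset(0, addr_on_pe0)
addr_on_pe1 = to_local(1, offset)

print(hex(offset))                 # 0x1000
print(addr_on_pe0 == addr_on_pe1)  # False
```

An application that needs to pass a reference to symmetric data between PEs
could exchange the offset in this way and rebuild the address locally, rather
than sending the pointer value itself.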
It is possible for SHMEM to fail at start-up or while allocating global shared
memory due to limits placed by the operating system on the amount of *local*
shared memory that SHMEM can use. Since SHMEM programs can use very
large amounts of memory, their requirements can exceed the limits in typical OS
configurations. As long as there is sufficient physical memory for the program,
the following steps can be used to solve local shared memory allocation problems:
Check for low ulimits on memory:
    ulimit -l : max locked memory (important for PSM not SHMEM)
    ulimit -v : max virtual memory
Check the contents of these sysctl variables:
    sysctl kernel.shmmax ; maximum size of a single shm allocation in bytes
    sysctl kernel.shmall ; maximum size of all shm allocations in “pages”
    sysctl kernel.shmmni ; maximum number of shm segments
Check the size of /dev/shm:
    df /dev/shm
Check for stale files in /dev/shm:
    ls /dev/shm
If any of these checks indicate a problem, ask the cluster administrator to increase
the limit.
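The checks above can also be gathered in one pass. The following is a minimal
sketch, assuming a Linux host where the kernel.* sysctl variables are readable
under /proc/sys/kernel; the function names read_sysctl, fmt_limit and
collect_checks are hypothetical helpers and are not part of QLogic SHMEM. It
only reports values; deciding whether a limit is too low is left to the reader.

```python
import os
import resource
import shutil


def read_sysctl(name):
    """Return the integer value of a sysctl such as kernel.shmmax, or None."""
    path = "/proc/sys/" + name.replace(".", "/")
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def fmt_limit(value):
    """Render an rlimit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def collect_checks(shm_dir="/dev/shm"):
    """Gather the values inspected by the manual checks above."""
    report = {
        # getrlimit reports bytes; bash's `ulimit -l` and `ulimit -v` print KiB.
        "max locked memory (soft, bytes)": fmt_limit(
            resource.getrlimit(resource.RLIMIT_MEMLOCK)[0]),
        "max virtual memory (soft, bytes)": fmt_limit(
            resource.getrlimit(resource.RLIMIT_AS)[0]),
    }
    for name in ("kernel.shmmax", "kernel.shmall", "kernel.shmmni"):
        report[name] = read_sysctl(name)
    if os.path.isdir(shm_dir):
        report["shm free bytes"] = shutil.disk_usage(shm_dir).free
        report["shm files"] = sorted(os.listdir(shm_dir))
    return report


if __name__ == "__main__":
    for key, value in collect_checks().items():
        print(f"{key}: {value}")
```

Files listed under "shm files" are not necessarily stale; whether a file can be
removed safely depends on which jobs are still running on the node.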
Progress Model
QLogic SHMEM supports active and passive progress models. Active progress
means that the PE must actively call into SHMEM for progress to be made on
SHMEM one-sided operations. Passive progress means that progress on
SHMEM one-sided operations can occur without the application needing to call
into SHMEM. Active progress is the default mode of operation for QLogic
SHMEM. Passive progress can be selected using an environment variable where
required.
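The difference between the two models can be pictured with a toy simulation.
The sketch below is purely conceptual and assumes nothing about how QLogic
SHMEM is implemented: the ToyPE class, its methods, and the use of a helper
thread for passive progress are all hypothetical. It shows a target PE whose
queued one-sided puts are applied either when the application calls in (active)
or without any call from the application (passive).

```python
import threading
from collections import deque


class ToyPE:
    """Toy target PE; this is NOT the QLogic SHMEM API."""

    def __init__(self, passive=False):
        self.memory = {}
        self._pending = deque()
        self._cond = threading.Condition()
        self._stop = False
        self._worker = None
        if passive:
            # Passive progress: a helper thread applies incoming operations.
            self._worker = threading.Thread(target=self._run, daemon=True)
            self._worker.start()

    def incoming_put(self, key, value):
        """A remote PE's one-sided put arrives and is queued."""
        with self._cond:
            self._pending.append((key, value))
            self._cond.notify_all()

    def progress(self):
        """Active progress: the application calls in to apply queued puts."""
        with self._cond:
            self._drain()

    def _drain(self):
        # Caller must hold self._cond.
        while self._pending:
            key, value = self._pending.popleft()
            self.memory[key] = value
        self._cond.notify_all()

    def _run(self):
        with self._cond:
            while not self._stop:
                self._drain()
                self._cond.wait()

    def wait_idle(self, timeout=5.0):
        """Block until the queue is empty, without applying anything."""
        with self._cond:
            return self._cond.wait_for(lambda: not self._pending, timeout)

    def close(self):
        with self._cond:
            self._stop = True
            self._cond.notify_all()
        if self._worker:
            self._worker.join()


active = ToyPE(passive=False)
active.incoming_put("x", 1)
print(active.memory)   # {} until the application calls progress()
active.progress()
print(active.memory)   # {'x': 1}

passive = ToyPE(passive=True)
passive.incoming_put("x", 1)
passive.wait_idle()
print(passive.memory)  # {'x': 1}, no progress() call needed
passive.close()
```

In the active case the put stays queued for as long as the application does
not call in, which is the behavior to keep in mind when a PE spends long
periods in computation.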