2015-08-03

Abaqus integration for Univa Grid Engine (update)

My last post about Abaqus integration with Univa Grid Engine (UGE) had one disadvantage: it did not use qrsh to launch the slave MPI processes. As a result, job resource usage accounting was inaccurate. To fix this, certain parallel environment (PE) settings need to be corrected, and the rsh command that Abaqus uses for launching MPI slaves needs to be set to the wrapper rsh script.

The PE settings which worked for me (see also sge_pe(5)):

pe_name                abaqus
slots                  99999
user_lists             NONE
xuser_lists            NONE
start_proc_args        /cm/shared/apps/sge/var/default/common/pescripts/abaqus.py
stop_proc_args         NONE
allocation_rule        $round_robin
control_slaves         TRUE
job_is_first_task      FALSE
urgency_slots          min
accounting_summary     TRUE
daemon_forks_slaves    TRUE
master_forks_slaves    FALSE
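One way to sanity-check a PE like this before submitting jobs is to paste the output of `qconf -sp abaqus` into a small script and compare the values that matter. The sketch below is only an illustration: it works on pasted text rather than calling qconf, and the helper name `parse_pe`, the abbreviated sample text, and the choice of which keys to check are my assumptions. I picked `control_slaves` because it has to be TRUE for `qrsh -inherit` to start the slave tasks under the execution daemon's control.

```python
def parse_pe(text):
    """Parse 'key   value' lines (qconf -sp style) into a dict."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


# Abbreviated, hypothetical sample; paste the real qconf output here.
pe_text = """\
pe_name                abaqus
control_slaves         TRUE
job_is_first_task      FALSE
accounting_summary     TRUE
"""

pe = parse_pe(pe_text)

# Assumed set of values to verify; adjust to taste.
expected = {"control_slaves": "TRUE", "accounting_summary": "TRUE"}
mismatches = {k: pe.get(k) for k, v in expected.items() if pe.get(k) != v}

print(mismatches or "PE settings look as expected")
```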

Here is the updated PE script (again, setting mp_host_list is optional). The rsh command it configures is actually the rsh wrapper shell script, which in turn calls qrsh.

#!/usr/bin/env python
import sys, os

### PE startup script aka prologue to set up Abaqus MPI "hostfile"
### Based on documented env file format
###     http://www.simulia.com/support/v67/books/sgb67EF/default.htm?startat=ch04s01.html

machinefile = os.environ['PE_HOSTFILE']
abaqenvfile = "abaqus_v6.env"

machinelines = []
with open(machinefile, "r") as mf:
    for l in mf:
        lsplit = l.split()
        machinelines.append( [lsplit[0], int(lsplit[1])] )

with open(abaqenvfile, "w") as envfile:
    envfile.write("mp_mode=MPI\n")
    envfile.write("mp_rsh_command='/cm/shared/apps/sge/univa/mpi/rsh -n -l %U %H %C'\n")
    envfile.write("mp_host_list=%s\n" % (str(machinelines)))
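
To see what the script writes for mp_host_list, here is a small self-contained sketch of the same parsing step run on made-up hostfile contents. The host names, slot counts, queue names, and the helper name `format_host_list` are hypothetical; the line layout (host, slots, queue, processor range) follows the description of $pe_hostfile in sge_pe(5).

```python
def format_host_list(hostfile_text):
    """Build the mp_host_list line from PE_HOSTFILE-style text."""
    hosts = []
    for line in hostfile_text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # First column is the host name, second is the slot count.
        hosts.append([fields[0], int(fields[1])])
    return "mp_host_list=%s" % hosts


# Hypothetical hostfile contents for a two-node, eight-slot job.
sample = (
    "node001 4 all.q@node001 UNDEFINED\n"
    "node002 4 all.q@node002 UNDEFINED\n"
)

print(format_host_list(sample))
# prints: mp_host_list=[['node001', 4], ['node002', 4]]
```

The Abaqus environment file is itself Python syntax, which is why writing the list with plain string formatting is enough.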