ARSC HPC Users' Newsletter 409, December 23, 2009

Watch Those Compiler Wrappers!

[ By Ed Kornkven ]

It is typical for high performance systems to "wrap" their compilers so that site-specific options and libraries can be used invisibly to the user. For example, on Midnight, when compiling a Fortran+MPI program with the PathScale compiler, instead of using the PathScale compiler command "pathf90", we advise users to use the wrapper named "mpif90." The wrapper will make sure that the compiler can find the MPI header and library files without requiring the user to add any additional options. One can see what the wrapper is doing by adding the "-show" option, e.g.,


    mpif90 -show
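
As a rough illustration of what such a wrapper does, here is a toy shell function. It is not the real "mpif90": the function name, the include and library paths, and the "-lmpi" library are all assumptions invented for this sketch. It simply prepends site settings to the real compiler command and, when given "-show", prints that command instead of running it.

```shell
# Toy stand-in for a compiler wrapper. The function name, the include and
# library paths, and the -lmpi library are all made up for illustration.
toy_mpif90() {
    real_compiler="pathf90"
    site_flags="-I/opt/mpi/include -L/opt/mpi/lib -lmpi"
    if [ "$1" = "-show" ]; then
        shift
        # Print the underlying command instead of running it.
        echo "$real_compiler $site_flags $*"
        return 0
    fi
    # Normal use: run the real compiler with the site flags added.
    # $site_flags is deliberately unquoted so it splits into separate options.
    "$real_compiler" $site_flags "$@"
}

toy_mpif90 -show -o a.out pgm.f90
```

The last line prints the full command the toy wrapper would have run, which is the kind of information "-show" gives for the real wrapper.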

The situation is similar on the Cray XT5. Cray has long named their Fortran compiler "ftn" (and their C compiler "cc") and it is no different on Pingo. However, on the XT5, the name "ftn" wraps an underlying compiler, depending on the module that is loaded. So if the PGI environment is loaded, "ftn" will invoke the PGI compiler:


    module load PrgEnv-pgi
    ftn -o a.out pgm.f90

That command not only invokes the correct compiler, it also wraps it with the appropriate path and environment settings. It is still possible to invoke the PGI compiler directly ("pgf90"), but that is rarely advisable.

As an example of what can go wrong without the wrapper: recently I was porting a program and its Makefile to Pingo and forgot to change the compiler name to "ftn." All was fine until I wanted to profile the code with CrayPat. When I invoked the "pat_build" command to instrument the executable


    pat_build -O apa a.out

I got this error:


    ERROR: Missing required ELF section 'link information' from the program.
    >> Load the correct 'craypat' module and rebuild the program.

That message had me scratching my head for a while until I sought help from our able consultants. Moral: use the wrapped compiler for MPI codes on Midnight and probably always on Pingo.
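
One way to catch this kind of slip before it costs any head scratching is to search a Makefile for direct compiler names. The following is only a sketch: the function name is hypothetical, and the list of names to look for (here "pgf90" and "pathf90") should be adjusted to the compilers on your system.

```shell
# Hypothetical helper: print the Makefile lines (with line numbers) that
# name an underlying compiler directly instead of a wrapper.
find_unwrapped() {   # usage: find_unwrapped MAKEFILE
    grep -nE '(^|[^[:alnum:]_])(pgf90|pathf90)([^[:alnum:]_]|$)' "$1"
}

# Example:
#   find_unwrapped Makefile
```

No output means none of the listed names were found.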

Using OpenMP with the PathScale and PGI Fortran Compilers

[ By Oralee Nudson ]

A user recently ran into an interesting situation while running her OpenMP program on Midnight: the program executed flawlessly when compiled with PathScale, but did not pick up the thread identities when compiled with PGI.

As a test, I typed in the following simple.f90 program found in "Parallel Programming in OpenMP" by Chandra, Dagum, et al.:


% cat simple.f90
  program hello
  Integer :: myid, nthreads
  print *, "Hello parallel world from threads:"
  !$omp parallel private (myid, nthreads)
  myid = omp_get_thread_num()
  nthreads = omp_get_num_threads()
  print *, 'I am thread ', myid, 'of ', nthreads,' total '
  !$omp end parallel
  print *, "Back to the sequential world."
  end

Compiling simple.f90 with version 3.2 of the PathScale compiler gave me the expected results:


% pathf90 -mp -o simple simple.f90
% export OMP_NUM_THREADS=4
% ./simple
 Hello parallel world from threads:
 I am thread  0 of  4  total
 I am thread  1 of  4  total
 I am thread  2 of  4  total
 I am thread  3 of  4  total
 Back to the sequential world.

However, when I switched to the PGI compiler (I tried a few versions), the threads seemed to be created, but each one reported thread number 0 and a total of 0 threads:


% module purge
% module load PrgEnv.pgi
% module list
Currently Loaded Modulefiles:
  1) voltairempi-S-1.pgi   2) pgi-7.2.2             3) PrgEnv.pgi
% pgf90 -mp -o simple simple.f90
% export OMP_NUM_THREADS=4
% ./simple
 Hello parallel world from threads:
 I am thread             0 of             0  total
 I am thread             0 of             0  total
 I am thread             0 of             0  total
 I am thread             0 of             0  total
 Back to the sequential world.

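This problem was quite puzzling. After some thought and discussion with my colleagues, I learned that with the PGI compiler the program needs a "use omp_lib" statement. Without it, omp_get_thread_num and omp_get_num_threads have no declared type, and under Fortran's implicit typing rules a name beginning with "o" is treated as REAL, so the integer results are most likely being misinterpreted. After adding the use statement underneath "program hello" and above the Integer declarations, the PGI-compiled version of the program ran as expected. Problem Solved! :)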


% pgf90 -mp -o simple simple.f90
% export OMP_NUM_THREADS=4
% ./simple
 Hello world from threads:
 I am thread             0 of             4  total
 I am thread             1 of             4  total
 I am thread             2 of             4  total
 I am thread             3 of             4  total
 Returning from the parallel world.
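
To look for the same omission in other sources, a quick scan along these lines may help. It is a sketch only: the function name is hypothetical, and it assumes that a file calling any omp_get_* routine should contain either "use omp_lib" or an include of omp_lib.h.

```shell
# Hypothetical helper: report Fortran files that call omp_get_* routines
# but contain neither "use omp_lib" nor a reference to omp_lib.h.
check_omp_lib() {   # usage: check_omp_lib FILE...
    for f in "$@"; do
        if grep -qi 'omp_get_' "$f" &&
           ! grep -qiE 'use[[:space:]]+omp_lib|omp_lib\.h' "$f"; then
            echo "${f}: calls omp_get_* without omp_lib"
        fi
    done
}

# Example:
#   check_omp_lib *.f90
```

The check is purely textual, so it will not notice a "use omp_lib" that has been commented out.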

Make and tar for Source File Management

[ By Lawrence Murakami ]

I have decided that including a tar rule in my Makefile is a good idea. In some cases I may create a Makefile just to have the tar rule. The following is my latest Makefile that creates a tar file by:

  • finding all files in the current directory and any sub-directories
  • removing the leading "./" characters from the names
  • selecting the files I want included
  • omitting some I don't want included
  • creating the archive from the resulting file list.

.PHONY : tar

tar : ; find . -type f | sed 's|^./||' \
 | grep 'f90$$\|Makefile\|README\|tester\|results\|hwk.txt$$\|,v$$' \
 | grep -v "~$$" \
 | xargs tar -cvjf phys693-OpenMP-hwk.tbz2

The above rule includes all Fortran 90 source files, all the Makefiles, README files, tester files, results files and all RCS archive files. It excludes the temp files (names ending in "~") made by the vi or kwrite editors. Note that each $ sign needs to be doubled so that make passes it to the shell as an ordinary character.

The files in this example live in a set of subdirectories. With this rule in a Makefile in the root folder, I can use "make tar" to package all the source and some other important files into a single file that can be transmitted to another machine and compiled there. Object files, temp files, compiled programs and output data are not included.
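
Before creating an archive it can be useful to preview the file list from the shell. The sketch below repeats the selection steps of the rule outside of make, so each $ appears only once. The function name is hypothetical, and it uses "grep -E" with plain "|" alternation in place of the "\|" form, which is an assumption about what is most portable rather than part of the original rule.

```shell
# Hypothetical helper: list the files the tar rule would select under DIR.
list_sources() {   # usage: list_sources DIR
    ( cd "$1" && find . -type f | sed 's|^\./||' \
        | grep -E 'f90$|Makefile|README|tester|results|hwk\.txt$|,v$' \
        | grep -v '~$' )
}

# Example:
#   list_sources .
```

Piping the output through "sort" makes it easier to compare against what you expect to ship.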

Floating Point Exceptions and the PathScale Compilers

[ By Ed Kornkven ]

In issues 376 and 377 of this newsletter, Don Bahls wrote a couple of helpful articles on how to debug floating point exceptions. In issue 377, he discussed the PGI -Ktrap compiler option for enabling floating point exception trapping without additional code modifications. Since the matter arose for a PathScale user recently, this is a follow-up article to show similar options for PathScale.

Like the PGI suite, PathScale compilers offer several options for choosing the appropriate balance between IEEE arithmetic standard conformance and high performance. Users should be aware that the default behavior for compilers (including PGI and PathScale) is not strict IEEE conformance, but "safe" optimizations that may not be strictly standard-conforming. One of those behaviors, as we are discussing here, is to disable floating point exception trapping. Rather than attempt exhaustive treatment of this subject, I will simply refer the interested reader to the PathScale Fortran options:

  • -O and -OPT optimization options
  • -OPT:IEEE_arithmetic=N
  • -fno-math-errno
  • -TENV:X=N
  • -TENV:simd_Xmask=OFF

It is this last class of options that is our focus here. The "X" in the option name specifies a SIMD floating point exception mask that can be turned OFF, thereby enabling floating point exceptions in machines with SSE SIMD instructions (like the AMD Opterons on Pingo and Midnight). The options for disabling various masks and the exceptions they enable are:

  • -TENV:simd_imask=OFF (traps invalid operation exceptions)
  • -TENV:simd_dmask=OFF (traps denormalized operand exceptions)
  • -TENV:simd_zmask=OFF (traps divide by zero exceptions)
  • -TENV:simd_omask=OFF (traps overflow exceptions)
  • -TENV:simd_umask=OFF (traps underflow exceptions)
  • -TENV:simd_pmask=OFF (traps precision exceptions)
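
If several exceptions are to be trapped at once, the option strings can be generated rather than typed. This sketch assumes that more than one of these -TENV options may be given on the same command line, which the list above does not state explicitly. The function name is hypothetical, and the final line only prints a compile command; it does not run the compiler.

```shell
# Hypothetical helper: turn mask letters (i d z o u p) into PathScale options.
trap_flags() {   # usage: trap_flags LETTER...
    flags=""
    for m in "$@"; do
        flags="$flags -TENV:simd_${m}mask=OFF"
    done
    # Drop the leading space before printing.
    echo "${flags# }"
}

# Print (not run) a compile line that would trap invalid, divide-by-zero
# and overflow exceptions.
echo "pathf90 $(trap_flags i z o) -g exceptions.f90 -o exception"
```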

Revisiting Don's divide-by-zero example,


    program exception
         real*4 a
         real*4 b
         call enable_exceptions()
         a=1
         b=0
         a=a/b             !  with exceptions enabled the code should crash here.

         print *,"a=", a
    end program

we can compile with PathScale like this


    pathf90 -TENV:simd_zmask=OFF -g exceptions.f90 -lm -o exception

and then execute, getting our floating point exception and core dump.


    mg57> ./exception
    Floating exception(coredump)

Holiday Greetings

Your editors would like to thank our readers for reading, contributing and making suggestions for this newsletter. We write this to be a resource for our fellow HPC users, at ARSC and elsewhere, and it is important to us that our readers find it relevant and helpful. Over the past year we have received clarifications, follow-ups, insights and ideas for future articles and we want to both thank you and encourage continued feedback. Please receive our best wishes and blessings for a happy holiday season and a prosperous 2010.

Quick-Tip Q & A


A:[[ I have an input file with space and newline delimited ASCII input.  
  [[ The first few lines have 1 to 5 values each, but the next several
  [[ thousand lines should all have 20 values per line.  I recently found
  [[ a file that had the wrong number of values on one of those lines so I
  [[ need to start checking these files.  Obviously, visual inspection is
  [[ not my preferred option.  How can I do this check and find any lines
  [[ that don't have 20 values?  There's got to be an easy way to do this!
  [[

#
# We have many proficient awk users in our readership, and several of them
# contributed awk-based solutions to this problem.  Thanks to Jed Brown,
# Ken Irving, Ryan Czerwiec and Scott Kajihara for sending us awk answers.
#
# Ken replied:
#
    awk does this sort of thing pretty succinctly, e.g.:

        $ awk 'NF!=20' FILE

    or
        $ cat FILE | awk 'NF!=20{print NR, NF}'

    are among the innumerable ways to do this.

#
# Ryan adds:
#

    If you want to see the line numbers as well so you know where to
    look in the file for them, something like this will work.  It
    displays line number, a colon, then the offending line:

        gawk 'NF != 20 {printf "%s:%s\n", FNR, $0}' file

    In your case, you happen to know the first few lines will fail the
    test, so you can exclude them if you wish:

        gawk '(NF != 20) && (FNR > 3) {printf "%s:%s\n", FNR, $0}' file

    This example skips the first 3 lines, if that is how long your
    header is.

#
# Scott skips the initial lines like this:
#

    The number of initial lines to skip (N) can be set in the loop in
    the BEGIN clause as the querent does not specify this. However, the
    following will print out the line (actually record) numbers of the
    errant lines as well as the actual lines:

        awk '\
        BEGIN { for (i = 0; i < N; ++i) { getline; }; } \
        NF != 20 { printf "%d: %s\n", NR, $0; }' \
        filename

#
# and Jed's approach infers, rather than specifies the number of
# initial lines to skip:
#

    $ awk 'NR>5 && NF!=20 {print NR ": " $0}' file

#
# For extra credit, Jed suggests using find to process many such files:
#

    $ find /path/of/interest -name '*.suf' -mtime -2 \
        | xargs -n1 awk 'NR>5 && NF!=20 {print FILENAME ":" NR ": " $0}'

        (with find, you could also use -exec awk ARGS '{}' \;)

#
# We also heard from some of our Perl users.  Tom Baring's suggestion 
# was short-and-sweet:
#

    perl -ne 'print "$. :$_" unless (m/^ *([^ ]+ +){19}[^ ]+ *$/);' file.txt

#
# and Derek Bastille added some extra bells and whistles:
#

    #!/usr/bin/perl

    # This script checks for the number of values in a line
    # after a specified number of 'short' lines, each line
    # thereafter needs to have exactly N values.

    # vars
    $totalShortLines = 5;   # total number of special lines
    $valsPerLine = 20;      # each line needs this many values

    # loop-o-rama
    while (<>) {
      chomp();
      if( $. <= $totalShortLines ) {
        # this is a short line
        print "[$.] Short Line => $_\n";
      } else {
        @valueArray = split(/\s+/, $_);
        if( $#valueArray != ($valsPerLine - 1)) {
          # print error
          print "ERR: wrong number of values in line [$.] (" . ($#valueArray + 1) . " values) \n";
        }
      }
    } 
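
To try any of these answers without a real data file, a small test file with one deliberately short line can be generated. The sketch below is one way to do that, using the awk approach from above with the header length and expected count passed in as variables. The function names and the layout of the sample file are assumptions made for this example.

```shell
# Print the integers 1..N on one line, separated by single spaces.
make_line() {   # usage: make_line N
    awk -v n="$1" 'BEGIN { for (i = 1; i <= n; i++) printf "%d%s", i, (i < n ? " " : "\n") }'
}

# Report every line after the first SKIP lines that does not have WANT values.
check_fields() {   # usage: check_fields SKIP WANT FILE
    awk -v skip="$1" -v want="$2" 'NR > skip && NF != want { print NR ": " NF " values" }' "$3"
}

# Sample file: two short header lines, then 20, 19 and 20 values.
sample=$(mktemp)
{ make_line 3; make_line 5; make_line 20; make_line 19; make_line 20; } > "$sample"
check_fields 2 20 "$sample"
rm -f "$sample"
```

On this sample file the check reports only line 4, the one with 19 values.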


Q: I need to generate some test data for a program I am writing.
The input to the program is a list of integers.  How can I generate
all the permutations of a list of n integers?  For example, for
n=3, my generator would have an input of 3 and it would output
something like:

    1 2 3
    1 3 2
    2 1 3
    2 3 1
    3 1 2
    3 2 1
 

[[ Answers, Questions, and Tips Graciously Accepted ]]


Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
Archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.
