ARSC T3D Users' Newsletter 83, April 19, 1996

T3E Information


> PSC's Cray T3E Is Installed and Running Parallel Applications        04.19.96
> NEWS BRIEFS                                                           HPCwire
> =============================================================================
> 
>   Eagan, Minn. -- The first CRAY T3E scalable parallel system has been
> installed at the Pittsburgh Supercomputing Center (PSC) and is already
> running parallel applications, Cray Research announced earlier this week.
> 
>   Since installing the early-access system only three weeks ago, six
> applications are now running in parallel mode on the new Cray system, said
> PSC officials, and a number of additional applications are targeted for
> near-term deployment. PSC scientific co-director Michael Levine said that 
> all applications running to date on the new system have produced correct 
> results and that PSC experts have verified hardware and software stability 
> with repeated long overnight runs.
> 
>   Applications selected for early migration to the CRAY T3E system include 
> several developed by PSC staff and users, as well as frequently used off-
> the-shelf packages ported and optimized for scalable parallel computing 
> under PSC's Parallel Applications Technology Program partnership with 
> Cray. Applications already running include major quantum chemistry and 
> biomedical applications (CHARMM, GAMESS and AMBER). Others to be 
> added will include software for crystallographic structure determination, 
> properties of advanced materials, leading-edge environmental and finite- 
> element multiphysics codes and Msearch, PSC's parallel genome sequence 
> database searching and alignment package.
> 
>   According to a Cray representative, volume shipments of the new
> supercomputer are slated to begin in third quarter. PSC's system will be
> upgraded over time and ultimately scale to 512 processors. The center will
> continue operating its prior-generation CRAY T3D system for production
> problems and will replace it when the CRAY T3E reaches 512 processors,
> according to Levine, who made a presentation at an Executive Cray User Group
> meeting April 17 on the status of the CRAY T3E system and PSC's applications
> migration progress.
> 
>   "On behalf of the national scientific community, we at PSC are looking 
> forward to the substantially increased capability of the new CRAY T3E," 
> Levine said. "Its faster processing and communications speeds, coupled with
> a much larger memory, will enhance our ability to attack leading-edge
> problems while maintaining our multi-year investment in highly optimized
> applications programs. This will keep PSC, the NSF supercomputer centers
> program and American researchers in a world leadership position."
> 
>   Levine said that over the course of the last three years, PSC users have
> run a wide range of industrial and scientific problems on the current CRAY
> T3D supercomputer, consuming about 4.6 million processing hours. "When we 
> get into full production with the CRAY T3E," said Levine, "we expect
> scientific productivity to improve by a factor of three to four."
> 
>   "We are pleased that this early access system is enabling PSC application 
> experts to make such rapid advances in migrating their applications to the 
> CRAY T3E environment," said Robert H. Ewald, Cray Research president and 
> chief operating officer. "The CRAY T3E preserves the macroarchitecture and
> programming environment of the CRAY T3D. This consistency protects PSC's
> parallel applications investment and contributes to this exceptional progress
> by PSC and Cray personnel. This early applications progress will enable the
> production use of the CRAY T3E at PSC and other customers later this year."
> 
>   Cray said that it had more than $160 million in advance orders for the 
> CRAY T3E system at year-end 1995.

Comparing PVM, MPI and SHMEM

In newsletter #81 (4/5/96), I presented some preliminary results comparing PVM and MPI. I have updated that table with newer results. The changes from that table to this one are:
  1. On the SGI workstations, I am now running the most recent version of PVM, version 3.3 release 10.
  2. On both the SGI workstations and the T3D, I am now running the most recent version of the Argonne/Mississippi State implementation of MPI, MPICH 1.0.12.
  3. I've corrected the original source code to count ints on the T3D as 8 bytes, not 4 bytes (oops, sorry about that); the short check after this list shows the message sizes this produces.
  4. I've expanded the table to include SHMEM timings for an operation similar to the sends and receives in the PVM and MPI versions.
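For reference, here is a small check (not part of either timing program) of how the message sizes below follow from 8-byte ints; it reuses only the bandwidth-test loop bounds from the listing at the end of this newsletter:

  /* Small check of the message sizes: not part of the timing programs.
   * On the T3D, sizeof(int) is 8, so the bandwidth-test loop in the
   * listing produces messages of 96, 960, 9600, 96000 and 960000 bytes. */
  #include <stdio.h>

  main()
  {
    long numint;

    for( numint = 100 / 8; numint < 1000000; numint *= 10 )
      printf( "numint %ld -> %ld bytes\n", numint, numint * 8 );
    return( 0 );
  }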

                    Preliminary results for PVM/MPI/SHMEM comparison

                  <------------T3D---------->         <----SGI COW---->

                   CRI    CRI    EPCC   Argonne      Oak Ridge  Argonne
                  SHMEM   PVM    MPI      MPI           PVM       MPI
                 PE 1.2  3.3.4   1.4a    1.0.12       3.3.10     1.0.12
                        (2.1.1)

  ping test          5     259     81      80           2567       2858
  (microseconds)

  bandwidth test 
  (Mbytes/second)
  length of message 
  (bytes) 
     ~100        16.00     .44    1.03    1.20          .04        .03
    ~1000        73.84    3.51    7.68   10.66          .24        .28
   ~10000       123.07   14.24   19.91   48.97          .54        .77
  ~100000       126.81   24.24   28.37   95.52          .62        .99
 ~1000000       127.32   25.95   29.44  103.07          .48        .88

Timing SHMEMs

To time shmem_put used in a manner similar to the PVM and MPI send/receive pairs, I used the same control structure as the MPI timing program of Newsletter #81, but replaced the send of the data and the receive of the acknowledgment with a shmem_put and a shmem_wait. On PE0, where the timing is done, I replaced:

<   if( MPI_Send( iarray,numint,MPI_INT,other,tagsend,MPI_COMM_WORLD ) ) {
<     printf( "can't send to bandwidth test\n" );
<     goto bail;
<   } 
<   if( MPI_Recv( iarray,1,MPI_INT,other,tagrecv,MPI_COMM_WORLD,&stat)){
<     printf( "recv error in bandwidth test\n" );
<     goto bail;
<   }
with:

>   ack[ 0 ] = 0;
>   shmem_put( jarray, iarray, numint, 1 );
>   if( ack[ 0 ] == 0 ) {
>       shmem_wait( ack, 0 );
>   }
and on PE1 where the receive was done and an acknowledgment was sent, I replaced:

<   MPI_Recv( iarray, numint, MPI_INT,other,tagsend,MPI_COMM_WORLD,&stat );
<   MPI_Send( iarray, 1, MPI_INT, other, tagrecv, MPI_COMM_WORLD );
with:

>   testpos = numint - 1;
>   if( jarray[ testpos ] == 0 ) {
>      shmem_wait( &jarray[ testpos ], 0 );
>   }
>   if( jarray[ testpos ] == 1 ) {
>      shmem_put( ack, ack, 1, 0 );
>   }
I think this substitution duplicates the functionality of the send/receive pairs in the PVM and MPI versions: shmem_wait( ptr, value ) spins until the word at ptr no longer equals value, so PE0 blocks until PE1's put of the acknowledgment arrives, just as it previously blocked in the MPI_Recv of the acknowledgment. The complete program for measuring shmem_put is given at the end of this newsletter. A sample output is shown below (note that 1 byte per microsecond is the same as 1 Mbyte per second, the unit used in the table above):

  RTT Avg uSec 5 RTT Min uSec 5
  Message size 96
  Avg Byte/uSec 13.714286 Max Byte/uSec 13.714286
  Message size 960
  Avg Byte/uSec 73.846154 Max Byte/uSec 73.846154
  Message size 9600
  Avg Byte/uSec 117.073171 Max Byte/uSec 117.073171
  Message size 96000
  Avg Byte/uSec 125.326371 Max Byte/uSec 126.149803
  Message size 960000
  Avg Byte/uSec 126.465551 Max Byte/uSec 127.270317
  Done on PE0
  Done on PE1
All of the timings in Newsletter #81 and in this newsletter were between PE0 and PE1, but with SHMEM having such low latency and high bandwidth, maybe I can use it to detect the number of hops, or changes in dimension, for messages between PE0 and PEn. We'll see in a future newsletter.
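Below is a minimal, untested sketch of how such a measurement might be set up: PE0 repeats the one-word put/ack handshake from the listing at the end of this newsletter against each other PE in turn, so the average and minimum round-trip times can be compared across targets. The NPES constant and the pvm3.h include are assumptions of this sketch; the handshake, the timer and the shmem calls all mirror the listing.

  /* Hypothetical hop-latency sketch: NOT part of the timing program below.
   * PE0 times a one-word put/ack round trip to each other PE in turn,
   * reusing the handshake and timer from the listing.  NPES is an assumed
   * partition size; set it to match the run.                             */
  #include <stdio.h>
  #include <pvm3.h>                          /* pvm_mytid(), pvm_get_PE() */
  #include <mpp/shmem.h>
  #define MIN( a, b )  (( a < b ) ? a : b )

  #define NPES 8                             /* assumed partition size    */
  #define REPS 100

  long flag[ 1 ];                            /* word PE0 puts to          */
  long ack[ 1 ];                             /* word the target puts back */
  long val[ 1 ];                             /* value carried by each put */

  main()
  {
    double t1, t2, second();
    int me, pe, n;
    int dt, at, mt;

    me = pvm_get_PE( pvm_mytid() );          /* my PE number              */
    if( me == 0 ) {                          /* PE0 does all the timing   */
      for( pe = 1; pe < NPES; pe++ ) {
        ack[ 0 ] = 0;
        at = 0;
        mt = 10000000;
        for( n = 1; n <= REPS; n++ ) {
          t1 = second( );
             val[ 0 ] = n;
             shmem_put( flag, val, 1, pe );  /* send the rep number       */
             shmem_wait( ack, n - 1 );       /* wait until the ack changes*/
          t2 = second( );
          dt = ( t2 - t1 ) * 1000000.0;      /* to microseconds           */
          at += dt;
          mt = MIN( dt, mt );
        }
        printf( "PE0 <-> PE%d RTT Avg uSec %d Min uSec %d\n",
                pe, at / REPS, mt );
      }
    } else if( me < NPES ) {                 /* each target answers its reps */
      for( n = 1; n <= REPS; n++ ) {
        shmem_wait( flag, n - 1 );           /* wait for rep n to arrive  */
        val[ 0 ] = n;
        shmem_put( ack, val, 1, 0 );         /* return it as the ack      */
      }
    }
    printf( "Done on PE%d\n", me );
    exit( 0 );
  }
  double second()                            /* same timer as the listing */
  {
    fortran irtc();
    return( irtc( ) / 150000000.0 );
  }

Because shmem_wait returns as soon as the word differs from the given value, each rep can carry the rep number itself, which avoids having to reset the flag words between reps.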

Announcement from PSC


> ---------------------------------------------------------------------------
>                      Pittsburgh Supercomputing Center
>      Supercomputing Techniques: Parallel Processing on CRAY MPP Systems
>                              May 20-23,  1996
> ---------------------------------------------------------------------------
>                  REGISTRATION DEADLINE:  May 1, 1996
> ---------------------------------------------------------------------------
> 
> 
> PURPOSE:
> 
>         The purpose of this four day workshop is to introduce participants to
>         parallel processing on the CRAY T3D and explore more advanced topics,
>         including performance monitoring and optimization techniques.
> 
> AGENDA:
> 
>         The first two days of this workshop have been designed to introduce
>         participants to PSC's supercomputing environment, compiling, debugging,
>         job submission, and parallel programming concepts. Participants will
>         learn to write parallel code using message passing calls.
> 
>         The third and fourth days are designed to cover more advanced topics,
>         including advanced parallel programming techniques, how to monitor
>         code performance and optimization strategies. There will also be
>         presentations on scientific applications which have been parallelized.
> 
>         ==> A working knowledge of FORTRAN or C, and of UNIX, is required.
>         ==> Parallel computing experience is not necessary.
> 
> REGISTRATION FEES:
> 
>         Admission to this training workshop is free to the United States
>         academic community.
> 
>         Interested corporate and government applicants, as well as applicants
>         from academic institutions outside the United States should contact
>         Anne Marie Zellner at (412)268-4960 for information on attendance fees.
> 
> HOUSING AND TRAVEL:
> 
>         Housing and travel are the responsibility of participants, but we will
>         provide information on local hotels at your request. Group rates for
>         local hotels are available on a first-come, first-served basis.
> 
>         A list of local hotels is included on the Web page referenced below.
> 
> REGISTRATION:
> 
>         To register for this workshop, please complete and return the
>         registration form below by May 1, 1996 to:
> 
>                 Workshop Application Committee,
>                 ATTN: Anne Marie Zellner
>                 Pittsburgh Supercomputing Center
>                 4400 Fifth Avenue,
>                 Pittsburgh, PA  15213.
> 
>         You may also apply for this workshop by sending requested information
>         via electronic mail to workshop@psc.edu or via fax to (412/268-5832).
> 
>         All applicants will be notified of acceptance on May 2, 1996.
> 
> For additional online information, please visit the workshop's Web page at
> http://www.psc.edu/training/T3D_May_96/welcome.html
> 
> ==============================================================================
>                             Registration Form
>      Supercomputing Techniques: Parallel Processing on CRAY MPP Systems
>                              May 20-23, 1996
> 
> Name:
> 
> Department:
> 
> Univ/Ind/Gov Affiliation:
> 
> Address:
> 
> Telephone:  W (   )               H(   )
> 
> Electronic Mail Address:
> 
> Social Security Number:
> 
> Citizenship:
> 
> Are you a PSC user (yes/no)?
> If yes, please give your PSC username:
> 
> Academic Standing (please check one):
>   F - Faculty          UG - Undergraduate                   I - Industrial
>  PD - Postdoctorate    UR - University Research Staff      GV - Government
>  GS - Graduate Student UN - University Non-Research Staff   O - Other
> 
> Please explain why you are interested in attending this workshop and what
> you hope to gain from it:
> 
> 
> Briefly describe your computing background (scalar, vector, and parallel
> programming experience; platforms; languages) and research interests:
> 
> 
> All applicants will be notified of acceptance on May 2, 1996.
> 
> 

SHMEM Timing Source


/*************** Timing program for SHMEMs by Mike Ess, ARSC ******************/
  #include <stdio.h>
  #include <time.h>
  #include <pvm3.h>           /* pvm_mytid(), pvm_get_PE() */
  #include <mpp/shmem.h>
  #define MIN( a, b )  (( a < b ) ? a : b )
  
  #define MAXSIZE 250000
  long iarray[ MAXSIZE ];
  long jarray[ MAXSIZE ];
  main(argc, argv)
    int argc;
    char *argv[];
  {
    double t1, t2, second();
    int reps = 100;           /* number of samples per test */
    struct timeval tv1, tv2;  /* for timing */
    int dt1, dt2;             /* time for one iter */
    int at1, at2;             /* accum. time */
    int mt1, mt2;             /* minimum times */
    int numint;               /* message length */
    int n;
    int i;
    long ack[ 1 ];            /* acknowledgment signal */
    int size;                 /* number of PEs */
    int rank;                 /* my PE number */
    int other;                /* the other guy's PE */
    long psync[ 2 ];          /* space for shmem_barrier synchronization */
    int testpos;              /* last position changed by send */
  
    rank = pvm_get_PE( pvm_mytid() );                /* who I am */
    if( rank == 0 ) other = 1;                       /* who he is */
    if( rank == 1 ) other = 0;
    for( i = 0; i < MAXSIZE; i++ ) iarray[ i ] = 1;  /* initialize send buffer */
    psync[ 0 ] = _SHMEM_SYNC_VALUE;
    psync[ 1 ] = _SHMEM_SYNC_VALUE;
    shmem_barrier( 0, 1, 2, psync );                 /* sync the PEs */
    if( rank == 0 ) {                                /* On PE 0 */
      at1 = 0;
      mt1 = 10000000;
      for (n = 1; n <= reps; n++) {                  /* do rep timings */
        t1 = second( );                              /* latency test */
           ack[ 0 ] = 0;
           shmem_put( jarray, iarray, 1, 1 );        /* send a word */
           if( ack[ 0 ] == 0 ) {       
              shmem_wait( ack, 0 );                  /* wait for acknowledge */
           }
        t2 = second( );
        dt1 = ( t2 - t1 ) * 1000000.0;               /* to microseconds */
        at1 += dt1;                                  /* the running sum */
        mt1 = MIN( dt1, mt1 );                       /* best timing */
      }
      printf("RTT Avg uSec %d ", at1 / reps);
      printf("RTT Min uSec %d\n", mt1 );
      for (numint = 100 / sizeof( int ); numint < 1000000; numint *= 10) {
        printf("Message size %d\n", numint * sizeof( int ));
        at2 = 0;                                     /* bandwidth test */
        mt2 = 10000000;                              /* numint = 12, 120, ... */ 
        for (n = 1; n <= reps; n++) {                /* do rep timings */
          t1 = second();
             ack[ 0 ] = 0;
             shmem_put( jarray, iarray, numint, 1 ); /* send numint ints */
             if( ack[ 0 ] == 0 ) {
                shmem_wait( ack, 0 );                /* wait for acknowledgment */
             }
          t2 = second();
          dt2 = ( t2 - t1 ) * 1000000.0;             /* to microseconds */
          at2 += dt2;                                /* the running sum */
          mt2 = MIN( mt2, dt2 );                     /* best timing */
        }
        at2 /= reps;
        printf("Avg Byte/uSec %8f ", (numint * sizeof( int )) / (double)at2);
        printf("Max Byte/uSec %8f\n", (numint * sizeof( int )) / (double)mt2);
      }
    } else {                                         /* On PE1 */
      ack[ 0 ] = 1;
      jarray[ 0 ] = 0;
      for ( n = 1; n <= reps; n++ ) {                /* mimic PE0's control */
        if( jarray[ 0 ] == 0 ) {
          shmem_wait( jarray, 0 );                   /* wait for change */
        }
        if( jarray[ 0 ] == 1 ) {
           shmem_put( ack, ack, 1, 0 );              /* send an ack */
        }
        jarray[ 0 ] = 0;
      }
      for (numint = 100 / sizeof( int ); numint < 1000000; numint *= 10) {
        testpos = numint - 1;
        jarray[ testpos ] = 0;
        for (n = 1; n <= reps; n++) {                /* mimic PE0's control */
          if( jarray[ testpos ] == 0 ) {
             shmem_wait( &jarray[ testpos ], 0 );    /* wait for last element */
          }                                          /*  to change */
          if( jarray[ testpos ] == 1 ) {
             shmem_put( ack, ack, 1, 0 );            /* send an ack */
          }
          jarray[ testpos ] = 0;
        }
      }
    }
    printf( "Done on PE%d\n", rank );
    exit( 0 );
  bail:
    printf( "Bailing out on PE%d\n", rank );
    exit( -1 );
  }
  double second()
  {
    double junk;
    fortran irtc();
    junk = irtc( ) / 150000000.0;
    return( junk );
  }

Current Editors:
Ed Kornkven ARSC HPC Specialist ph: 907-450-8669
Kate Hedstrom ARSC Oceanographic Specialist ph: 907-450-8678
Arctic Region Supercomputing Center
University of Alaska Fairbanks
PO Box 756020
Fairbanks AK 99775-6020
E-mail subscriptions and archives:
    Back issues of the ASCII e-mail edition of the ARSC T3D/T3E/HPC Users' Newsletter are available by request. Please contact the editors.