Working with Cray Files on the IBM SP

Binary data files written on the Crays are not readable on the IBM SP, except in certain special circumstances. This document describes some of the strategies available for dealing with these files. NERSC has written and collected a number of utilities that can be used to convert a file for use on the IBM SP.

Introduction

There are three main issues when converting Cray files for use on the SP: blocking, data alignment, and data representation. Blocking and data alignment refer to the insertion of control data and pad data into the file in addition to the regular user data. Data representation refers to converting the Cray floating-point and integer representations to IEEE format. The latter is only a concern for files written on the PVP machines; the T3E uses an IEEE data representation, just like the IBM SP. The following table summarizes how most common files are written.
Blocking

Files that are COS blocked have 64-bit control words every 512 words, plus additional control words at the end of every Fortran record and at the end of the file. The format of these control words is described in Chapter 7 of the Application Programmer's I/O Guide, available on the Craydoc website. Files that are IBM blocked have a 32-bit control word at the beginning and end of each record. The control word is the length of the record in bytes. The Cray control words can be processed correctly by reading with the NCARU library, or the file may be converted with the crayconv utility.

Padding

Padding refers to extra bytes inserted into the user data. These extra bytes are inserted to maintain a certain alignment relation between the data written out. Because the lengths of the default integer, logical, and real data types are all a multiple of 8 bytes on the Crays, padding will only occur if you have used character variables whose lengths are not a multiple of 8, or have used real*4 or integer*4 data on the T3E (on the PVP systems 8 bytes are used even if the data is declared as real*4 or integer*4). In cases where padding occurs, bytes are inserted so that any datum of length 8 bytes is at a byte offset, from the beginning of the record, that is a multiple of 8 bytes (characters are considered to have length 1 byte, and complex values are treated as two reals). The end of the record is then padded so that the whole record length is a multiple of 8.

For example, suppose the following Fortran record is written on the PVP machines:

real a(50)
integer n(50)
character*17 label
write(50) n, a, label

The lengths of n, a, and label are 50 × 8 bytes, 50 × 8 bytes, and 17 bytes respectively. Within the Fortran record, n starts at offset 0, a at offset 400, and label at offset 800. The only padding that occurs is at the end of the record, where 7 bytes are added to the 817 bytes of data to make the total record length 824 bytes, a multiple of 8.
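The padding rule above can be checked mechanically. The following is a minimal Python sketch, not part of the NERSC or NCARU tools: the function name `cray_record_layout` and the `(name, element_size, count)` field description are hypothetical, chosen only for illustration. It applies the two rules as stated in this section (8-byte data start on a multiple of 8, and the record length is rounded up to a multiple of 8) and reproduces the example just given, where 800 + 17 = 817 bytes of data round up to 824.

```python
def cray_record_layout(fields):
    """Compute byte offsets inside one Cray Fortran record.

    fields: list of (name, element_size_in_bytes, element_count).
    Characters count as size 1; a complex array of n elements can be
    described as 2*n elements of size 8.
    Returns (offsets, total_record_length).
    """
    offsets = {}
    pos = 0
    for name, size, count in fields:
        # Rule 1: any 8-byte datum must start on a multiple of 8.
        if size == 8 and pos % 8 != 0:
            pos += 8 - pos % 8
        offsets[name] = pos
        pos += size * count
    # Rule 2: pad the end so the record length is a multiple of 8.
    if pos % 8 != 0:
        pos += 8 - pos % 8
    return offsets, pos


# The PVP record "write(50) n, a, label" from the text above.
offsets, total = cray_record_layout(
    [("n", 8, 50), ("a", 8, 50), ("label", 1, 17)]
)
print(offsets, total)
```

Running the same function on a record description before writing the extraction code gives the offsets at which each variable has to be picked out of the raw record.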
In the following case, the situation is more complicated:

real a(50)
integer n(50)
character*17 label
write(50) label, a, n

In this case, the lengths of the data are the same as in the previous example. However, without padding, the alignments are label at offset 0, a at offset 17, and n at offset 417. Since a has elements of length 8 bytes, it must be written at an offset that is a multiple of 8 bytes, so a pad of 7 bytes is inserted between the end of label and the beginning of a. In the record that is written to the file, the alignments are label at offset 0, a at offset 24, and n at offset 424.

In the final case, assume the following record is written on the T3E:

real a(40), b(40)
integer*4 n(13), m(13)
character*12 label
write(50) label, n, a, m, b

Without padding, the alignments are label at offset 0, n at offset 12, a at offset 64, m at offset 384, and b at offset 436. Both a and b need to be at offsets that are a multiple of 8 bytes. The offset of a is already correct, but 4 bytes must be inserted before b so that it starts at offset 440.

There is no automatic or easy way to deal with padding. If you have files that have padded records, the only solution is to pick out the data manually, avoiding the bytes that have been inserted.

Cray Data Representation

As noted in the table above, files written on the PVP systems will contain Cray real and integer data. This data must be converted to IEEE format before it is usable on the IBM SP. Character data does not need to be converted. The NCARU library contains routines to convert various Cray datatypes to IEEE format.

Utilities

All the utilities described in this section can be accessed using "module load ncaru".

The crayconv utility, developed at NERSC, makes use of the NCARU library to read a Cray dataset and write the data out in a format compatible with the IBM SP. The format of the command is:

crayconv -i input -o output

where input is the input file and output is the output file.
The default is to assume a sequential-access unformatted file written on the PVP, with no data truncation. The following options are also available:
Notes

The utility cannot remove pad data.

For files from the PVP machines, the utility cannot distinguish between a real variable with the value zero and an integer variable with the value zero. The -zr or -zi options can be used to select how a zero value is converted. This is only relevant when you select different values for the -r and -i options. For example, if you run crayconv with the options "-r8 -i4 -zi" and the file contains real data with the value zero, crayconv will treat it as an integer and convert it to a 4-byte IEEE integer. This will misalign all subsequent data in that record, which will likely show up as garbage when read by a program on the IBM SP. For this reason, use mismatched -r and -i options with care.

If your file was written on the PVP using a Fortran program compiled with the option "-Onfastint", the utility may not work.

In addition, provided with NCARU are the following commands:
NCARU

NCARU is a library and set of utilities written at the Scientific Computing Division (SCD) of the National Center for Atmospheric Research in Boulder, Colorado. The NCARU library is used to read Cray-style datasets on non-Cray computers. It unblocks Cray datasets and converts from Cray-style data representation to IEEE data representation. The user interface consists of routines to open, close, read, and write data.

The library is not available via standard Fortran I/O statements; you must change your source code to read in Cray datasets. In addition, the NCARU library has limited functionality when dealing with Fortran records containing different types of data. These limitations mean that it is likely to be less work to convert your data on the Cray using FFIO than to use NCARU on the SP. Furthermore, the library may not easily work for files not written with standard Fortran or C language I/O. Further limitations are: no support for 64-bit address space programs, not thread safe, and no support for Cray double precision (128-bit real).

Using the NCARU Library

The basic input/output routines are as follows. For further information, see NCARU Library Documentation.
Both the crayread and craywrite routines can convert data between Cray format and IEEE format as it is transferred to and from the file. This option only works when the entire record is composed of data of a single type. For records containing multiple data types, you must read the record into a buffer, then translate each datatype individually. The data conversion routines are as follows. For further information, see NCARU Library Documentation.
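To make concrete what converting a Cray real involves, here is a minimal Python sketch. It is not the NCARU implementation, and the function names are hypothetical. It assumes the classic Cray PVP 64-bit single-precision layout: 1 sign bit, a 15-bit exponent biased by 16384 (octal 40000), and a 48-bit mantissa with no hidden bit, interpreted as a binary fraction. It also shows why an all-zero word is ambiguous: it decodes to zero both as a real and as an integer.

```python
import math
import struct

CRAY_EXPONENT_BIAS = 16384  # octal 40000 (assumed Cray PVP format)


def cray_real_to_float(word):
    """Decode one 64-bit Cray PVP real, given as an unsigned integer."""
    sign = -1.0 if (word >> 63) & 1 else 1.0
    exponent = (word >> 48) & 0x7FFF
    mantissa = word & 0xFFFFFFFFFFFF  # 48-bit fraction, no hidden bit
    if mantissa == 0:
        return 0.0
    # value = 0.mantissa * 2**(exponent - bias); the mantissa is an
    # integer here, so shift by a further 48 bits.  math.ldexp raises
    # OverflowError for Cray values beyond the IEEE double range.
    return sign * math.ldexp(mantissa, exponent - CRAY_EXPONENT_BIAS - 48)


def cray_int_to_int(word):
    """Decode one 64-bit two's-complement Cray integer."""
    return word - (1 << 64) if word >> 63 else word


def words_from_bytes(data):
    """Split raw record bytes into big-endian unsigned 64-bit words."""
    return struct.unpack(">%dQ" % (len(data) // 8), data)


print(cray_real_to_float(0x4001800000000000))  # Cray bit pattern for 1.0
print(cray_real_to_float(0), cray_int_to_int(0))  # zero is ambiguous
```

Because the zero word carries no type information, a converter has to be told whether to emit a real or an integer of the chosen output width.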
Examples

For cases where different datatypes are present in one record, there are multiple strategies for converting the data. The simple case, in which all the data are of length 8 bytes, is the most straightforward. Consider the following record written on the PVP machines:

real a(50)
integer n(50)
...
open(20,file='data')
write(20) a, n

The easiest way to convert this record for the IBM SP is:
real x(50)
integer n(50)
real*8 buffer(100)
...
! open in blocked mode
ifc = crayopen('data',10,0)
! read record without converting
nwds = crayread(ifc,buffer,100,0)
! convert data
call ctospf(buffer,x,50)
call ctospi(buffer(51),n,50)
Note that the buffer to hold the intermediate data must be big enough to hold 100 64-bit words, even though 100 32-bit words are eventually produced.

If the data has a mixture of lengths (particularly character data), some non-standard Fortran is necessary. Consider the following record written on the T3E:

character*9 label
integer*4 n(10)
real x(50)
...
open(20,file='data')
write(20) label, n, x

One way to read this is to use a non-standard data type, byte:
character*9 label
integer n(10)
real*8 x(50)
byte buffer(456)
...
! open in blocked mode
ifc = crayopen('data',10,0)
! read record without converting
! 9 + 40 bytes, 7 pad bytes, then 400 bytes: 456 bytes is 57 words
nwds = crayread(ifc,buffer,57,0)
! convert data
call bcopy(buffer, label, 9)
call bcopy(buffer(10), n, 40)
! padding makes x start at byte offset 56, i.e. buffer(57)
call bcopy(buffer(57),x,400)
...
subroutine bcopy(a, b, n)
byte a(n), b(n)
b=a
return
end
In this case it is probably better to write a free-standing program, based on the application, that simply reads in the file and converts it to IBM format. This would avoid introducing non-standard Fortran into the application program.

To use the NCARU library:

% module load ncaru
% xlf -o a.out b.o c.f $NCARU
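After a conversion, it can be useful to confirm that the output really has the IBM record structure described under Blocking: a 32-bit length word before and after each record. The following Python sketch is a hypothetical checker, not one of the NCARU utilities. It assumes big-endian control words, as on the POWER-based IBM SP, and the function name `read_ibm_blocked_records` is illustrative only.

```python
import struct


def read_ibm_blocked_records(data):
    """Split the bytes of an IBM-blocked sequential file into records.

    Each record is framed as <length><payload><length>, where length is
    a 32-bit big-endian byte count (byte order is an assumption here).
    Raises ValueError if the two length words of a record disagree.
    """
    records = []
    pos = 0
    while pos < len(data):
        (length,) = struct.unpack_from(">i", data, pos)
        start = pos + 4
        end = start + length
        (trailer,) = struct.unpack_from(">i", data, end)
        if trailer != length:
            raise ValueError("control words disagree at byte %d" % pos)
        records.append(data[start:end])
        pos = end + 4
    return records


# Build a one-record sample holding two IEEE doubles, then read it back.
payload = struct.pack(">2d", 1.0, 2.0)
sample = struct.pack(">i", len(payload)) + payload + struct.pack(">i", len(payload))
records = read_ibm_blocked_records(sample)
print(len(records), struct.unpack(">2d", records[0]))
```

For a real file, pass the result of `open(path, "rb").read()`. A mismatch between the leading and trailing length words usually means the file is not IBM blocked or was truncated.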
Page last modified: Thu, 24 Jan 2008 19:07:21 GMT Page URL: http://www.nersc.gov/nusers/systems/bassi/crayconv.php Web contact: webmaster@nersc.gov Computing questions: consult@nersc.gov Privacy and Security Notice |