
2 Creating and Running the Executable

The CLM2.0 model can be built to run in one of three modes. It can run as a stand-alone executable in which atmospheric forcing data is periodically read in (e.g., using the data in NCEPDATA); this will be referred to as offline mode. It can also run as part of the Community Atmosphere Model (CAM), where communication between the atmospheric and land models occurs via subroutine calls; this will be referred to as cam mode. Finally, it can run as the land component of the Community Climate System Model (CCSM), in which the atmosphere, land, ocean and sea-ice models run as separate executables that communicate with each other via the CCSM flux coupler; this will be referred to as csm mode.

The following table lists the supported target architectures for the different modes:


Mode      Supported platforms
offline   IBM SP (AIX), SGI (IRIX64)
cam       IBM SP (AIX), SGI (IRIX64), Linux, Compaq (OSF1), Sun (SunOS)
csm       IBM SP (AIX), SGI (IRIX64)


The IBM SP is a distributed-memory machine consisting of multiple compute nodes; each node contains multiple shared-memory processors (currently four) and the network connections that attach it to the other nodes. When running on the IBM SP, CLM2.0 uses Open Multi Processing (OpenMP) directives within a single (shared-memory) node and the Message Passing Interface (MPI) across (distributed-memory) nodes, taking full advantage of parallelism both within and across nodes.

The SGI is a shared-memory RISC machine. The optimal way to run CLM2.0 on the SGI is with OpenMP directives alone, taking advantage of shared-memory parallelism. If OpenMP directives are used (the default), MPI should not be invoked (i.e., the cpp token SPMD should not be defined).

The method of building and running CLM2.0 depends on the selected mode as well as the target architecture. A general discussion of the various aspects of building and running CLM2.0 follows.

2.1 Offline mode: Using jobscript.csh

To build and run CLM2.0 in offline mode, a sample script, jobscript.csh, and a corresponding Makefile are provided in the bld/offline directory. The script creates a model executable at 3x3 degree resolution, determines the necessary input datasets, constructs the input model namelist, and runs the model for one day.

The user must edit this script appropriately in order to build and run the executable for their particular requirements. This script is provided only as an example to help the novice user get CLM2.0 up and running as quickly as possible.

jobscript.csh can be run with minimal user modification, provided the user resets several environment variables at the top of the script. In particular, MODEL_DATADIR must point to the full disk pathname of the directory containing the untarred input data subdirectories NCEPDATA, inidata, pftdata, rawdata, rtmdata and srfdata. MODEL_SRCDIR must point to the full disk pathname of the directory containing the untarred source code subdirectories: biogeophys, camclm_share, csm_share, ecosysdyn, ecosysdyndgvm, main, mksrfdata, riverroute and utils. Finally, MODEL_EXEDIR must point to the directory where the executable is to be built and run.
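
As an illustration, these settings at the top of jobscript.csh might look like the following; the pathnames are hypothetical and must be replaced with the actual locations on the user's system:

  # example settings only -- substitute actual local pathnames
  setenv MODEL_SRCDIR  /home/user/clm2/src        # untarred source code subdirectories
  setenv MODEL_EXEDIR  /home/user/clm2/run        # executable built and run here
  setenv MODEL_DATADIR /home/user/clm2/inputdata  # untarred input data subdirectories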

The script can be divided into five functional sections: 1) specification of script environment variables; 2) creation of two header files (misc.h and preproc.h) and a directory search path file (Filepath) needed to build the model executable; 3) creation of the model input namelist; 4) creation of the model executable; and 5) execution of the model. Each of these functional sections is discussed in what follows.

2.1.1 Specification of environment variables

The following environment variables are set from within the script. The script provides tentative settings for all of them; however, these values must be edited by the user, as illustrated in the sketch following the table.


Environment variable   Synopsis
MODEL_SRCDIR   Full pathname of the source code directory hierarchy
MODEL_EXEDIR   Full pathname of the directory where the model executable will reside;
               object files are built in the directory $MODEL_EXEDIR/obj
MODEL_DATADIR  Full pathname of the directory where the input datasets reside
DEBUG          If set, turns on debugging flags in the Makefile
NTHRDS         Number of OpenMP multitasking threads; should not exceed the number of
               physical CPUs (i.e., processors) on a shared-memory machine, or the
               number of CPUs per node on a distributed-memory machine
NTASKS         Number of MPI tasks for the distributed-memory implementation;
               if NTASKS = 1, distributed memory is disabled;
               if NTASKS > 1, distributed memory is enabled on NTASKS MPI processes
LIB_NETCDF     Full pathname of the directory containing the netCDF library;
               setting depends on the user's target architecture
INC_NETCDF     Full pathname of the directory containing the netCDF include files;
               setting depends on the user's target architecture
LIB_MPI        Full pathname of the directory containing the MPI library;
               setting depends on the user's target architecture;
               only needed if NTASKS > 1
INC_MPI        Full pathname of the directory containing the MPI include files;
               setting depends on the user's target architecture;
               only needed if NTASKS > 1
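
As a sketch, the remaining settings for a four-processor shared-memory run without MPI might be the following; all pathnames are examples and depend on where netCDF is installed locally:

  setenv NTHRDS 4                               # OpenMP threads; <= physical CPUs
  setenv NTASKS 1                               # 1 => distributed memory (MPI) disabled
  setenv LIB_NETCDF /usr/local/netcdf/lib       # netCDF library (example path)
  setenv INC_NETCDF /usr/local/netcdf/include   # netCDF include files (example path)
  # LIB_MPI and INC_MPI are only needed if NTASKS > 1, e.g.:
  # setenv LIB_MPI /usr/local/mpich/lib
  # setenv INC_MPI /usr/local/mpich/include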


2.1.2 Creation of header files and directory search path

The script creates the header files misc.h and preproc.h and the directory search path file Filepath. These files are placed in the directory $MODEL_EXEDIR/obj. To modify these files, the user should edit their contents from within the script rather than editing the files directly, since the script overwrites them each time it is run. The use of these files by gnumake is discussed in section 2.1.4. The contents of each file are summarized below.

The file misc.h contains a list of resolution- and model-independent C preprocessor (cpp) tokens; a sketch of how the script writes this file follows the table below.


misc.h cpp token   Synopsis
SPMD     If defined, enables the distributed-memory (single program multiple data,
         SPMD) implementation; automatically defined if the environment variable
         NTASKS > 1
PERGRO   If defined, enables modifications that test for reasonable perturbation
         error growth; only applicable in cam mode
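
For orientation, the script writes misc.h with a here-document along the following lines; this is a sketch only, and the exact contents generated by jobscript.csh may differ:

  # sketch: write misc.h into the build directory
  cat >! $MODEL_EXEDIR/obj/misc.h << EOF
  #undef PERGRO
  EOF
  # SPMD is added automatically when NTASKS > 1
  if ($NTASKS > 1) echo "#define SPMD" >> $MODEL_EXEDIR/obj/misc.h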


The file preproc.h contains a list of resolution- and model-dependent cpp tokens; a sketch of this file for an offline run follows the table below.


preproc.h cpp token   Synopsis
OFFLINE    If defined, offline mode is invoked
COUP_CSM   If defined, csm mode is invoked
COUP_CAM   If defined, cam mode is invoked
LSMLON     Number of model longitudes
LSMLAT     Number of model latitudes
RTM        If defined, RTM river routing is invoked
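
By analogy, a sketch of the preproc.h the script would generate for an offline run at the default 3x3 degree resolution (120 longitudes by 60 latitudes, matching the example in section 2.1.4) is shown below; defining RTM is a user choice and is shown here for illustration only:

  # sketch: write preproc.h into the build directory
  cat >! $MODEL_EXEDIR/obj/preproc.h << EOF
  #define OFFLINE
  #define LSMLON 120
  #define LSMLAT 60
  #define RTM
  EOF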


The file Filepath contains a list of directories used by gnumake to resolve the location of source files and to determine dependencies. Users can add new search directories by editing jobscript.csh under "build Filepath". The default Filepath directory hierarchy for CLM2 is as follows (an example file appears after the table):


Source directory              Functionality
$MODEL_SRCDIR/main            control routines (history, restart, etc.)
$MODEL_SRCDIR/biogeophys      biogeophysics routines
$MODEL_SRCDIR/ecosysdyn       ecosystem dynamics routines
$MODEL_SRCDIR/riverroute      river routing routines
$MODEL_SRCDIR/camclm_share    code shared between CAM and CLM2
$MODEL_SRCDIR/csm_share       code shared by all CCSM geophysical model components
$MODEL_SRCDIR/utils/timing    timing routines
$MODEL_SRCDIR/mksrfdata       routines for generating surface datasets
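
As an example, a Filepath for an offline build might contain the lines below (with $MODEL_SRCDIR expanded to the actual source pathname); the first entry, a hypothetical directory of user-modified code, is searched before the standard directories:

  /home/user/clm2/mymods
  /home/user/clm2/src/main
  /home/user/clm2/src/biogeophys
  /home/user/clm2/src/ecosysdyn
  /home/user/clm2/src/riverroute
  /home/user/clm2/src/camclm_share
  /home/user/clm2/src/csm_share
  /home/user/clm2/src/utils/timing
  /home/user/clm2/src/mksrfdata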


2.1.3 Setting the Namelist

Before building and running the model, the user must specify the variables in the CLM2 namelist, clmexp. A default namelist is generated by jobscript.csh; this namelist results in a one-day model run using the provided datasets. Namelist input is written to the file lnd.stdin and can be divided into several main categories: run definitions, datasets, history and restart file settings, and land model physics settings. A sketch of such a namelist follows; a full discussion of possible namelist settings is given in Section 3.
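
For illustration, the script builds lnd.stdin with a here-document along these lines; the variables shown are a small illustrative subset (see Section 3 for the complete set), and the values are examples only:

  # sketch: nsrest = 0 requests an initial run; with a 1800 s time step,
  # nelapse = 48 integrates one model day
  cat >! lnd.stdin << EOF
   &clmexp
   caseid         = 'clmrun'
   nsrest         = 0
   dtime          = 1800.
   nelapse        = 48
   offline_atmdir = '$MODEL_DATADIR/NCEPDATA'
   /
  EOF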

2.1.4 Building the model

The script jobscript.csh invokes gnumake to build (compile) the model. The file Makefile, located in the bld/offline directory, contains the commands used by gnumake for each of the supported target architectures. The executable produced by the build procedure is named "clm". The result of the build procedure is documented in the log file compile_log.clm; any problems encountered during the build are also recorded there.

Gnumake generates a list of source and object files using each directory listed in Filepath. For each source file, gnumake invokes cpp to create a dependency file in the directory $MODEL_EXEDIR/obj; for example, routine.F90 will have the dependency file routine.d. If a file listed as a target of a dependency does not exist in $MODEL_EXEDIR/obj, gnumake searches the directories in Filepath, in the order given, for a file with that name; the first file found satisfies the dependency. Consequently, if user-modified code is to be introduced, Filepath should contain the directory with the user code as its first entry.

A parallel build is achieved in the script by invoking gnumake with the -j option, which specifies the number of job commands to run in parallel, as in the example below.
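
For example, on a four-processor machine the script might invoke the following; the number of parallel jobs is a user choice:

  gnumake -j 4 >&! compile_log.clm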

To obtain a model executable, the environment variables LIB_NETCDF and INC_NETCDF must be specified; these provide the pathnames of the netCDF library and include files. Furthermore, if CLM2.0 is run under MPI (the environment variable NTASKS is greater than 1 in the script, so the cpp token SPMD is defined), the directories containing the MPI library and MPI include files must also be specified as environment variables in the script. The exception is the IBM SP, where the MPI library and include files are obtained directly through the choice of compiler command.

C preprocessor directives of the form #include, #if defined, etc., are used to enhance code portability and to allow distinct blocks of functionality (such as the different modes) to be implemented within a single file. Header files such as misc.h are included with #include statements within the source code. When gnumake is invoked, the C preprocessor includes or excludes blocks of code depending on which cpp tokens have been defined. C preprocessor directives are also used to perform textual substitution of resolution-specific parameters in the code. Following standard cpp convention, these tokens are all-uppercase versions of the Fortran variables they define. Thus, a code statement like


parameter(lsmlon = LSMLON); parameter(lsmlat = LSMLAT)


will result in the following processed line (for 3x3 model resolution):


parameter(lsmlon = 120) ; parameter(lsmlat = 60)


where LSMLON and LSMLAT are set in preproc.h via the jobscript.

2.1.5 Running the executable

jobscript.csh will execute the commands required to run the model on the supported target architectures. The settings of the environment variables NTASKS and NTHRDS in the script determine the CLM2.0 runtime environment. If NTHRDS is greater than 1, OpenMP multitasking will be used with the specified number of threads. If NTASKS is greater than 1, CLM2.0 will run under MPI with the specified number of tasks. If both are greater than 1 (this should only be done on the IBM SP), hybrid OpenMP/MPI mode is enabled.

If the model is run under MPI, most MPI implementations provide a startup command that takes the MPI executable as a command-line argument; additional arguments allow the user to specify details such as the machine architecture or the number of processes to use for the run. Once MPI has created the specified number of processes, model execution begins: the active tasks compute locally and exchange messages with each other to integrate the model.
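
As a sketch, assuming an MPICH-style mpirun startup command and an OpenMP runtime that honors OMP_NUM_THREADS, a hybrid run might be launched as follows; the exact command and flags vary by MPI implementation and site:

  setenv OMP_NUM_THREADS $NTHRDS   # threads per MPI task
  mpirun -np $NTASKS clm < lnd.stdin >&! clm.log.`date '+%y%m%d-%H%M%S'`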

Upon successful completion of the model run, several files will be generated in the executable directory. These include history, restart, and initialization files (see section 3.3 for more details), as well as log files that document the execution of the model. The log files are located in the directory given by the script environment variable $MODEL_EXEDIR and have the form clm.log.YYMMDD-HHMMSS, where YY is the last two digits of the year, MM the month, DD the day of the month, HH the hour, MM the minutes, and SS the seconds at the start of the model run. These log files may be referred to as "standard out". A timing file, timing.0, containing model performance statistics is also generated in the executable directory.

2.2 Cam mode

When running CLM2.0 as part of the CAM executable, the CAM build and run scripts must be used; the user should refer to the CAM User's Guide for specific details on building and running the CAM executable. Here we discuss only some essential points of the CAM build and run scripts.

The header files, preproc.h and misc.h, as well as the directory search path file, Filepath, are needed for the CAM build procedure in a manner analogous to the CLM2.0 build procedure. The user should keep in mind that the CLM2.0 directory hierarchy MUST appear after the CAM directory hierarchy in Filepath, because CLM2.0 contains several files with the same names as corresponding CAM files (e.g., time_manager.F90), and in cam mode the CAM versions must be used. The CAM build and run scripts ensure this.

The CLM2.0 namelist, clmexp, must also be specified. By default, RTM river routing is not enabled in cam mode (i.e., the cpp token RTM is not defined). Furthermore, when running in cam mode, CLM2.0 does not permit the user to independently set several namelist variables (in particular, those dealing with history file logic and run control logic); CLM2.0 will override any user input for these variables with the corresponding values used by the CAM model. This is discussed in more detail in section 3.6.

2.3 Csm mode

When running CLM2.0 as the land component of CCSM, CCSM build and run scripts must be utilized. The user should refer to the CCSM2.0 Quick Start Guide and associated documentation for a complete description. We will only briefly outline some of the key points associated with executing CLM2.0 as part of CCSM2.0.

The master CCSM script, test.a1.run, coordinates the building and running of the complete system of CCSM executables. The land component setup script, lnd.setup.csh (the equivalent of jobscript.csh in the offline case), builds the CLM2.0 executable and creates the input CLM2.0 namelist, lnd.stdin. The ending time step need not be specified in the CLM2.0 namelist, since the model responds to flags set in the flux coupler script, cpl.setup.csh, that determine when to terminate the run. For a full discussion of CLM2.0 namelist variables, csm mode included, refer to section 3.


Mariana Vertenstein 2002-05-16