Chapter 2. Building and Running CAM

Table of Contents
Sample Interactive Session
Sample Run Scripts
Running CAM via the CESM scripts

This chapter describes how to build and run CAM in its standalone configuration. We do not provide scripts that are set up to work out of the box on a particular set of platforms. If you would like this level of support then consider running CAM from the CESM scripts (see CESM-1.0 User's Guide). We do, however, provide some examples of simple run scripts which should provide a useful starting point for writing your own scripts (see the Section called Sample Run Scripts).

In order to build and run CAM the following are required:

To build CAM for SPMD execution it will also be necessary to have an MPI library (version 1 or later). As with the NetCDF library, the Fortran API should be built using the same Fortran90 compiler that is used to build the rest of CAM. Otherwise linking to the library may encounter difficulties, usually due to inconsistencies in Fortran name mangling.

Building and running CAM takes place in the following steps:

  1. Configure model

  2. Build model

  3. Build namelist

  4. Execute model

Configure model. This step is accomplished by running the configure utility to set the compile-time parameters such as the dynamical core (Eulerian Spectral, Semi-Lagrangian Spectral, Finite-Volume, or HOMME), horizontal grid resolution, and the type of parallelism to employ (shared-memory and/or distributed memory). The configure utility is discussed in Appendix A.

Build model. This step includes compiling and linking the executable using the GNU make command (gmake). configure creates a Makefile in the directory where the build is to take place. The user then need only change to this directory and execute the gmake command.

Build namelist. This step is accomplished by running the build-namelist utility, which supports a variety of options to control the run-time behavior of the model. Any namelist variable recognized by CAM can be changed by the user via the build-namelist interface. There is also a high level "use case" functionality which makes it easy for the user to specify a consistent set of namelist variable settings for running particular types of experiments. The build-namelist utility is discussed in Appendix B.

Execute model. This step includes the actual invocation of the executable. When running with distributed memory parallelism this step requires knowledge of how your machine invokes MPI executables. When using shared-memory parallelism via OpenMP you may also set the number of OpenMP threads. On most HPC platforms access to the compute resource is through a batch queue system. The sample run scripts discussed in the Section called Sample Run Scripts show how to set the batch queue resources on several HPC platforms.
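
The four steps above can be collected into a small driver. The following Python sketch is not part of the CAM distribution: the function name and its three directory arguments are assumptions made for illustration, and it only assembles the command lines used in the interactive session later in this chapter without running them.

```python
import posixpath


def cam_workflow(camcfg, bld_dir, run_dir):
    """Return the four steps as (working directory, argv) pairs.

    camcfg, bld_dir and run_dir are hypothetical inputs: the directory that
    holds configure and build-namelist, the build directory, and the run
    directory.
    """
    cache = posixpath.join(bld_dir, "config_cache.xml")
    return [
        # 1. Configure model (serial FV build on the 10x15 grid).
        (bld_dir, [posixpath.join(camcfg, "configure"),
                   "-dyn", "fv", "-hgrid", "10x15", "-nospmd", "-nosmp"]),
        # 2. Build model using the Makefile written by configure.
        (bld_dir, ["gmake"]),
        # 3. Build namelist, pointing at the cache file from step 1.
        (run_dir, [posixpath.join(camcfg, "build-namelist"),
                   "-config", cache]),
        # 4. Execute model (serial, so no MPI launcher is involved).
        (run_dir, [posixpath.join(bld_dir, "cam")]),
    ]
```

Keeping the working directory next to each command makes the dependence between steps explicit: configure and gmake act on the build directory, while build-namelist and the executable act on the run directory.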

Sample Interactive Session

The following sections present an interactive C shell session to build and run a default version of CAM. Most often these steps will be encapsulated in shell scripts. An important advantage of using a script is that it acts to document the run you've done. Knowing the source code tree and the configure and build-namelist commands provides all the information needed to exactly replicate a run.

For the interactive session the shell variable camcfg is set to the directory in the source tree that contains the CAM configure and build-namelist utilities ($CAM_ROOT/models/atm/cam/bld).

Configuring CAM for serial execution

We start by changing into the directory in which the CAM executable will be built, and then setting the environment variables INC_NETCDF and LIB_NETCDF which specify the locations of the NetCDF include files and library. This information is required by configure in order for it to produce the Makefile. The NetCDF library is required by all CAM builds. The directories given are just examples; the locations of the NetCDF include files and library are system dependent. The information provided by these environment variables could alternatively be provided via the commandline arguments -nc_inc and -nc_lib.

% cd /work/user/cam_test/bld
% setenv INC_NETCDF /usr/local/include
% setenv LIB_NETCDF /usr/local/lib
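
For scripts that prefer the commandline form, the translation from the two environment variables to the -nc_inc and -nc_lib arguments might look like the Python sketch below. The helper name is hypothetical and the sketch does not check that the directories exist.

```python
import os


def netcdf_configure_args(env=None):
    """Translate INC_NETCDF/LIB_NETCDF into -nc_inc/-nc_lib arguments.

    A variable that is unset or empty contributes no arguments, so the
    caller can see which location still has to be supplied.
    """
    env = os.environ if env is None else env
    args = []
    for var, flag in (("INC_NETCDF", "-nc_inc"), ("LIB_NETCDF", "-nc_lib")):
        value = env.get(var)
        if value:
            args += [flag, value]
    return args
```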

Next we issue the configure command. The argument -dyn fv specifies using the FV dynamical core which is the default for CAM5, but we recommend always adding the dynamical core (aka dycore) argument to configure commands for clarity. The argument -hgrid 10x15 specifies the horizontal grid. This is the coarsest grid available for the FV dycore in CAM and is often useful for testing purposes.

We recommend using the -test option the first time CAM is built on any machine. This will check that the environment is properly set up so that the Fortran compiler works and can successfully link to the NetCDF and MPI (if SPMD is enabled) libraries. Furthermore, if the configuration is for serial execution, then the tests will include both build and run phases which may be useful in exposing run time problems that don't show up during the build, for example when libraries are linked dynamically. If any tests fail then it is useful to rerun the configure command and add the -v option which will produce verbose output of all aspects of the configuration process including the tests. If the configuration is for an SPMD build, then no attempt to run the tests will be made. Typically MPI runs must be submitted to a batch queue and are not enabled from interactive sessions. But the build and static linking will still be tested.

% $camcfg/configure -dyn fv -hgrid 10x15 -nospmd -nosmp -test 
Issuing command to the CICE configure utility:
  $CAM_ROOT/models/ice/cice/bld/configure -hgrid 10x15 -cice_mode prescribed \
  -ntr_aero 0 0 -ntasks 1 -nthreads 1 -cache config_cache_cice.xml \
  -cachedir /work/user/cam_test
configure done.
creating /work/user/cam_test/bld/Filepath
creating /work/user/cam_test/bld/misc.h
creating /work/user/cam_test/bld/preproc.h
creating /work/user/cam_test/bld/Makefile
creating /work/user/cam_test/bld/config_cache.xml
Looking for a valid GNU make... using gmake
Testing for Fortran 90 compatible compiler... using pgf90
Test linking to NetCDF library... ok
CAM configure done.

The first line of output from the configure command is an echo of the system command that CAM's configure issues to invoke the CICE configure utility. This brings up a major difference from the CAM3 build. The thermodynamic sea ice model used by CAM3 was a version of CSIM4 that was modified to run using CAM's decomposition and data structures. It was essentially a part of CAM and was built just as any other physics parameterization. The thermodynamic sea ice model used by CAM5 is a special configuration of CICE4 which is a completely independent component with its own build requirements. A major build requirement of the CICE model is that its grid decomposition (which is independent of CAM's decomposition even when the two models are using the same horizontal grid) must be specified at build time. The CICE configure utility is responsible for setting the values of the CPP macros that are needed to build the CICE code. These settings include the specification of the CICE decomposition. Note that the line "configure done." immediately after the CICE configure commandline is being issued by the CICE configure, not by CAM's configure.

The next five lines of output inform the user of the files being created by configure. All the files produced by configure except for the cache file are required to be in the CAM build directory, so it is generally easiest to be in that directory when configure is invoked. Note: the files misc.h and preproc.h are no longer used by CAM, but are required for the CLM build.

The output from the -test option tells us that gmake is a GNU Make on this machine; that the Fortran compiler is pgf90; and that code compiled with the Fortran compiler can be successfully linked to the NetCDF library. The CAM Makefile is where the default compiler is specified. On Linux systems the default is pgf90. Finally, since this is a serial configuration no test for linking to the MPI library was done.

Configuring CAM for parallel execution

Before moving on to building CAM we address configuring the executable for parallel execution. But before talking about configuration specifics let's briefly discuss the parallel execution capabilities of CAM.

CAM makes use of both distributed memory parallelism implemented using MPI (referred to throughout this document as SPMD), and shared memory parallelism implemented using OpenMP (referred to as SMP). Each of these parallel modes may be used independently of the other, or they may be used at the same time which we refer to as "hybrid mode". When talking about the SPMD mode we usually refer to the MPI processes as "tasks", and when talking about the SMP mode we usually refer to the OpenMP processes as "threads". A feature of CAM which is very helpful in code development work is that the simulation results are independent of the number of tasks and threads being used.

Now consider configuring CAM to run in pure SPMD mode. With CAM3, SPMD was turned on using the -spmd option. But with CAM5 if we try that we find the following:

% $camcfg/configure -dyn fv -hgrid 10x15 -spmd -nosmp
**    ERROR: If CICE decomposition parameters are not specified, then
**    -ntasks must be specified to determine a default decomposition
**    for a pure MPI run.  The setting was:  ntasks=

This error results from the fact discussed in the Section called Configuring CAM for serial execution that the CICE model needs to set its decomposition at build time, and in order to set the decomposition it needs to know how much parallelism is going to be used. If you know how the CICE decomposition works then you're free to set it explicitly using the configure options provided for that purpose. Otherwise it's best to let the CICE configure set the decomposition for you and just specify the number of MPI tasks that the job will use via setting the -ntasks option as follows:

% $camcfg/configure -dyn fv -hgrid 10x15 -ntasks 6 -nosmp
Issuing command to the CICE configure utility:
  $CAM_ROOT/models/ice/cice/bld/configure -hgrid 10x15 -cice_mode prescribed \
  -ntr_aero 0 -ntasks 6 -nthreads 1 -cache config_cache_cice.xml \
  -cachedir /work/user/cam_test
configure done.

Notice that the number of tasks specified to CAM's configure is passed through to the commandline that invokes the CICE configure. Generally any number of tasks that is appropriate for CAM to use for a particular horizontal grid will also work for CICE. But it is possible to get an error from CICE at this point in which case either the number of tasks requested should be adjusted, or the options that set the CICE decomposition explicitly will need to be used.

Note: The use of the -ntasks argument to configure implies building for SPMD. This means that an MPI library will be required. Hence, the specification -ntasks 1 is not the same as building for serial execution which is done via the -nospmd option and does not require a full MPI library. (Implementation detail: when building for serial mode a special serial MPI library is used which basically provides a complete MPI API, but doesn't do any message passing.)

Next consider configuring CAM to run in pure SMP mode. With CAM3, SMP was turned on using the -smp option. But with CAM5 that will result in the same error from CICE that we obtained above from attempting to use -spmd. If we are going to run the CICE code in parallel, we need to specify up front how much parallelism will be used so that the CICE configure utility can set the CPP macros that determine the grid decomposition. We specify the amount of SMP parallelism by setting the -nthreads option as follows:

% $camcfg/configure -dyn fv -hgrid 10x15 -nospmd -nthreads 6
Issuing command to the CICE configure utility:
  $CAM_ROOT/models/ice/cice/bld/configure -hgrid 10x15 -cice_mode prescribed \
  -ntr_aero 0 -ntasks 1 -nthreads 6 -cache config_cache_cice.xml \
  -cachedir /work/user/cam_test
configure done.

We see that the number of threads has been passed through to the CICE configure command.

Note: The use of the -nthreads argument to configure implies building for SMP. This means that the OpenMP directives will be compiled. Hence, the specification -nthreads 1 is not the same as building for serial execution which is done via the -nosmp option and does not require a compiler that supports OpenMP.

Finally, to configure CAM for hybrid mode, simply specify both the -ntasks and -nthreads arguments to configure.
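
The four cases discussed in this section (serial, pure SPMD, pure SMP, and hybrid) can be summarized as a mapping from the requested task and thread counts to configure arguments. The Python helper below is an illustrative sketch with a hypothetical name; it uses only the options shown above and does not cover the options that set the CICE decomposition explicitly.

```python
def parallel_args(ntasks=None, nthreads=None):
    """Map a requested parallel layout to configure arguments.

    ntasks=None selects -nospmd and nthreads=None selects -nosmp.
    Passing 1 is not the same as passing None: -ntasks 1 still builds
    for SPMD and -nthreads 1 still compiles the OpenMP directives.
    """
    for name, value in (("ntasks", ntasks), ("nthreads", nthreads)):
        if value is not None and value < 1:
            raise ValueError("%s must be a positive integer" % name)
    spmd = ["-nospmd"] if ntasks is None else ["-ntasks", str(ntasks)]
    smp = ["-nosmp"] if nthreads is None else ["-nthreads", str(nthreads)]
    return spmd + smp
```

For example, parallel_args(ntasks=6) yields the arguments of the pure SPMD command above, and parallel_args(nthreads=6) yields those of the pure SMP command.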

Building CAM

Once configure is successful, build CAM by issuing the make command:

% gmake -j2  >&! make.out

The argument -j2 is given to allow a parallel build using 2 processes. The optimal number of processes to use depends on the compute resource available.

It is useful to redirect the output from make to a file for later reference. This file contains the exact commands that were issued to compile each file and the final command which links everything into an executable file. Relevant information from this file should be included when posting a bug report concerning a build failure.
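
When the build is driven from a script rather than an interactive shell, the same redirection can be reproduced programmatically. The following Python sketch uses a hypothetical helper, run_logged, and is only one of several reasonable ways to capture the output.

```python
import subprocess


def run_logged(argv, logfile, cwd=None):
    """Run argv with stdout and stderr sent to logfile; return exit status.

    This mirrors the C shell redirection ">&! logfile": both streams go to
    one file, and an existing file is overwritten. A relative logfile is
    resolved against the caller's directory, not against cwd.
    """
    with open(logfile, "w") as log:
        return subprocess.call(argv, stdout=log,
                               stderr=subprocess.STDOUT, cwd=cwd)
```

A call such as run_logged(["gmake", "-j2"], "/work/user/cam_test/bld/make.out", cwd="/work/user/cam_test/bld") would then correspond to the command above, with a nonzero return value indicating that the build failed.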

Building the Namelist

The first step in the run procedure is to generate the namelist files. The only safe way to generate consistent namelist settings is via the build-namelist utility. Even in the case where only a slight modification to the namelist is desired, the best practice is to provide the modified value as an argument to build-namelist and allow it to actually generate the namelist files.

The following interactive C shell session builds a default namelist for CAM. We assume that a successful execution of configure was performed in the build directory as discussed in the previous section. This is an essential prerequisite because the config_cache.xml file produced by configure is a required input file to build-namelist. One of the responsibilities of build-namelist is to set appropriate default values for many namelist variables, and it can only do this if it knows how the CAM executable was configured. That information is present in the cache file. As in the previous section the shell variable camcfg is set to the CAM configuration directory ($CAM_ROOT/models/atm/cam/bld).

We begin by changing into the directory where CAM will be run. It is usually convenient to have the run directory be separate from the build directory. Possibly a number of different runs will be done that each need to have a separate run directory for the output files, but will all use the same executable file from a common build directory. It is, of course, possible to execute build-namelist in the build directory since that's where the cache file is and so you don't need to specify to build-namelist where to find that file (it looks in the current working directory by default). But then, assuming you plan to run CAM in a different directory, all the files produced by build-namelist need to be copied to the run directory. If you're running configure and build-namelist from a script, then you need to know how to generate the filenames for the files that need to be copied. For this reason it's more robust to change to the run directory and execute build-namelist there. That way if there's a change to the files that are produced, your script doesn't break because the files haven't all been copied to the run directory.

Next we set the CSMDATA environment variable to point to the root directory of the tree containing the input data files. Note that this is a required input for build-namelist (this information may alternatively be provided using the -csmdata argument). If not provided then build-namelist will fail with an informative message. The information is required because many of the namelist variables have values that are absolute filepaths. These filepaths are resolved by build-namelist by prepending the CSMDATA root to the relative filepaths that are stored in the default values database.
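
The path arithmetic described here is simple enough to show directly. The Python sketch below illustrates only the prepending step; it is not the build-namelist implementation, and the treatment of paths that are already absolute is an assumption made for the example.

```python
import posixpath


def resolve_dataset(csmdata, relative_path):
    """Prepend the CSMDATA root to a relative dataset path."""
    if posixpath.isabs(relative_path):
        # Assumption: a path that is already absolute is left unchanged.
        return relative_path
    return posixpath.join(csmdata, relative_path)
```

With CSMDATA set to /fs/cgd/csm/inputdata, the relative directory atm/cam/inic/fv resolves to /fs/cgd/csm/inputdata/atm/cam/inic/fv.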

The build-namelist commandline contains the -config argument which is used to point to the cache file which was produced in the build directory. It also contains the -test argument, explained further below.

% cd /work/user/cam_test
% setenv CSMDATA /fs/cgd/csm/inputdata
% $camcfg/build-namelist -test -config /work/user/cam_test/bld/config_cache.xml
Writing CICE namelist to ./ice_in 
Writing DOCN namelist to ./docn_ocn_in 
Writing DOCN stream file to ./ 
Writing CLM namelist to ./lnd_in 
Writing driver namelist to ./drv_in 
Writing dry deposition namelist to ./drv_flds_in 
Writing ocean component namelist to ./docn_in 
Writing CAM namelist to ./atm_in 
Checking whether input datasets exist locally...
OK -- found depvel_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart/dvel/
OK -- found tracer_cnst_filelist = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/oxid/oxid_1.9x2.5_L26_clim_list.c090805.txt
OK -- found depvel_lnd_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart/dvel/
OK -- found tracer_cnst_datapath = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/oxid
OK -- found xs_long_file = /fs/cgd/csm/inputdata/atm/waccm/phot/
OK -- found rsf_file = /fs/cgd/csm/inputdata/atm/waccm/phot/
OK -- found clim_soilw_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart/dvel/
OK -- found exo_coldens_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart/phot/
OK -- found tracer_cnst_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/oxid/
OK -- found season_wes_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart/dvel/
OK -- found solar_data_file = /fs/cgd/csm/inputdata/atm/cam/solar/
OK -- found soil_erod = /fs/cgd/csm/inputdata/atm/cam/dst/
OK -- found ncdata = /fs/cgd/csm/inputdata/atm/cam/inic/fv/
OK -- found modal_optics = /fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found bnd_topo = /fs/cgd/csm/inputdata/atm/cam/topo/
OK -- found bndtvs = /fs/cgd/csm/inputdata/atm/cam/sst/
OK -- found focndomain = /fs/cgd/csm/inputdata/atm/cam/ocnfrac/
OK -- found tropopause_climo_file = /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart/ub/
OK -- found fpftcon = /fs/cgd/csm/inputdata/lnd/clm2/pftdata/pft-physiology.c100226
OK -- found fsnowaging = /fs/cgd/csm/inputdata/lnd/clm2/snicardata/
OK -- found fatmlndfrc = /fs/cgd/csm/inputdata/lnd/clm2/griddata/
OK -- found fsnowoptics = /fs/cgd/csm/inputdata/lnd/clm2/snicardata/
OK -- found fsurdat = /fs/cgd/csm/inputdata/lnd/clm2/surfdata/
OK -- found fatmgrid = /fs/cgd/csm/inputdata/lnd/clm2/griddata/
OK -- found liqopticsfile = /fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found iceopticsfile = /fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found prescribed_ozone_datapath = /fs/cgd/csm/inputdata/atm/cam/ozone
OK -- found prescribed_ozone_file = /fs/cgd/csm/inputdata/atm/cam/ozone/
OK -- found ext_frc_specifier for SO2         -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found ext_frc_specifier for bc_a1       -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found ext_frc_specifier for num_a1      -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found ext_frc_specifier for num_a2      -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found ext_frc_specifier for pom_a1      -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found ext_frc_specifier for so4_a1      -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found ext_frc_specifier for so4_a2      -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for DMS       -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for SO2       -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for SOAG      -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for bc_a1     -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for num_a1    -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for num_a2    -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for pom_a1    -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for so4_a1    -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found srf_emis_specifier for so4_a2    -> /fs/cgd/csm/inputdata/atm/cam/chem/trop_mozart_aero/emis/
OK -- found rad_climate for P_so4_a1:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_dst_a1:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_dst_a3:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_bc_a1:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_pom_a1:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_soa_a1:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_ncl_a1:/fs/cgd/csm/inputdata/atm/cam/physprops/
OK -- found rad_climate for P_ncl_a3:/fs/cgd/csm/inputdata/atm/cam/physprops/

The first eight lines of output from build-namelist inform the user of the files that have been created. There are namelist files for the ice component (ice_in), the land component (lnd_in), the data ocean component (docn_in, docn_ocn_in), the atmosphere component (atm_in), the driver (drv_in), and a file that is read by both the atmosphere and land components (drv_flds_in). There is also a "stream file" which is read by the data ocean component. Note that these filenames are hardcoded in the components and may not be changed without source code modifications.

The next section of output is the result of using the -test argument to build-namelist. As with configure we recommend using this argument whenever a model configuration is being run for the first time. It checks that each of the files that are present in the generated namelists can be found in the input data tree whose root is given by the CSMDATA environment variable. If a file is not found then the user will need to take steps to make that file accessible to the executing model before a successful run will be possible. The following is a list of possible actions:

  1. Acquire the missing file. If this is a default file supplied by the CESM project then you will be able to download the file from the project's svn data repository (see the Section called Acquiring Input Datasets).

  2. If you have write permissions in the directory under $CSMDATA then add the missing file to the appropriate location there.

  3. If you don't have write permissions under $CSMDATA then put the file in a place where you can (for example, your run directory) and rerun build-namelist with an explicit setting for the file using your specific filepath.

Expanding a bit on rerunning build-namelist: let's say for example that the -test option informed you that the ncdata file was not found. You acquire the file from the data repository, but don't have permissions to write in the $CSMDATA tree. So you put the file in your run directory and issue a build-namelist command that looks like this:

% $camcfg/build-namelist -config /work/user/cam_test/bld/config_cache.xml \
  -namelist "&atm ncdata='/work/user/cam_test/' /"

Now the namelist in atm_in will contain an initial file (specified by namelist variable ncdata) which will be found by the executing CAM model.
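
When several file overrides are needed, the string given to -namelist can be assembled rather than typed by hand. The Python sketch below is a hypothetical helper that handles only string-valued variables such as file paths; the comma separator between variables follows ordinary Fortran namelist syntax, and the filename used in the example that follows is invented for illustration.

```python
def namelist_override(group, settings):
    """Format string-valued settings for the build-namelist -namelist option.

    Values are wrapped in single quotes, as in the ncdata example above.
    Non-string values are rejected because this sketch does not attempt to
    format Fortran logicals or numbers.
    """
    parts = []
    for name in sorted(settings):
        value = settings[name]
        if not isinstance(value, str):
            raise TypeError("only string values are supported: %s" % name)
        parts.append("%s='%s'" % (name, value))
    return "&%s %s /" % (group, ", ".join(parts))
```

For instance, namelist_override("atm", {"ncdata": "/work/user/cam_test/my_initial_file.nc"}) produces a string of the same form as the -namelist argument shown above.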

A final note: this particular configuration of CAM which is using the default cam5 physics package requires that 52 datasets be specified in order to run correctly. Trying to manage namelists of that complexity by hand editing files is extremely error prone and is strongly discouraged. User modifications to the default namelist settings can be made in a number of ways while still letting build-namelist actually generate the final namelist. In particular, the -namelist, -infile, and -use_case arguments to build-namelist are all mechanisms by which the user can override default values or specify additional namelist variables and still allow build-namelist to do the error and consistency checking which makes the namelist creation process more robust.
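
Because the -test output is long, a script may want to tally it instead of relying on visual inspection. The Python sketch below recognizes only the "OK -- found" prefix that appears in the output shown earlier in this section. The format of any other message is not assumed: such lines are simply returned separately for the user to read.

```python
def tally_dataset_checks(output):
    """Split build-namelist -test output into found and other lines.

    Lines before the "Checking whether input datasets exist" banner are
    ignored. After it, lines starting with "OK -- found " are collected in
    found (with the prefix removed); any other non-empty line goes to other.
    """
    prefix = "OK -- found "
    found, other = [], []
    checking = False
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("Checking whether input datasets exist"):
            checking = True
        elif checking and line.startswith(prefix):
            found.append(line[len(prefix):])
        elif checking and line:
            other.append(line)
    return found, other
```

For the default configuration discussed here, one would expect found to hold 52 entries and other to be empty when every dataset is available.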

Acquiring Input Datasets

Note: If you are doing a standard production run that is supported in the CCSM scripts, then using those scripts will automatically invoke a utility to acquire needed input datasets. The information in this section is to aid developers using CAM standalone scripts.

The input datasets required to run CAM are available from a Subversion repository located here: The user name and password for the input data repository will be the same as for the code repository (which are provided to users when they register to acquire access to the CCSM source code repository).


If you have a list of files that you need to acquire before running CAM, then you can either just issue commands interactively, or if your list is rather long then you may want to put the commands into a shell script. For example, suppose after running build-namelist with the -test option you find that you need to acquire the file /fs/cgd/csm/inputdata/atm/cam/inic/fv/ And let's assume that /fs/cgd/csm/inputdata/ is the root directory of the inputdata tree, and that you have permissions to write there. If the subdirectory atm/cam/inic/fv/ doesn't already exist, then create it. Finally, issue the following commands at an interactive C shell prompt:

% set svnrepo=''
% cd /fs/cgd/csm/inputdata/atm/cam/inic/fv
% svn export $svnrepo/atm/cam/inic/fv/
Error validating server certificate for '':
 - The certificate is not issued by a trusted authority. Use the
   fingerprint to validate the certificate manually!
 - The certificate hostname does not match.
 - The certificate has expired.
Certificate information:
 - Hostname: localhost.localdomain
 - Valid: from Feb 20 23:32:25 2008 GMT until Feb 19 23:32:25 2009 GMT
 - Issuer: SomeOrganizationalUnit, SomeOrganization, SomeCity, SomeState, --
 - Fingerprint: 86:01:bb:a4:4a:e8:4d:8b:e1:f1:01:dc:60:b9:96:22:67:a4:49:ff
(R)eject, accept (t)emporarily or accept (p)ermanently? p
Export complete.

The messages about validating the server certificate will only occur for the first file that you export if you answer "p" to the question as in the example above.
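
For a longer list of files, the cd and svn export commands can be generated rather than typed. The Python sketch below is illustrative only: the function name is hypothetical, the repository URL and inputdata root must be supplied by the caller, and the commands are returned rather than executed.

```python
import posixpath


def svn_export_commands(svnrepo, csmdata, relative_files):
    """Return (directory, argv) pairs that export each file into the tree.

    svnrepo is the repository URL and csmdata the local inputdata root.
    relative_files are file paths below that root; each file is exported
    from within the matching local subdirectory, as in the session above.
    """
    commands = []
    for rel in relative_files:
        target_dir = posixpath.join(csmdata, posixpath.dirname(rel))
        url = svnrepo.rstrip("/") + "/" + rel
        commands.append((target_dir, ["svn", "export", url]))
    return commands
```

Each target directory must exist before the corresponding command is run, which matches the instruction above to create the subdirectory if it is missing.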

Running CAM

Once the namelist files have successfully been produced, and the necessary input datasets are available, the model is ready to run. Usually CAM will be run with SPMD parallelization enabled, and this requires setting up MPI resources and possibly dealing with batch queues. These issues will be addressed briefly in the Section called Sample Run Scripts. But for a simple test in serial mode executed from an interactive shell, we only need to issue the following command:

% /work/user/cam_test/bld/cam >&! cam.log

The commandline above redirects STDOUT and STDERR to the file cam.log. The CAM logfile contains a substantial amount of information from all components that can be used to verify that the model is running as expected. Things like namelist variable settings, input datasets used, and output datasets created are all echoed to the log file. This is the first place to look for problems when a model run is unsuccessful. It is also very useful to include relevant information from the logfile when submitting bug reports.