How are cice and pop decompositions set and how do I override them?

The pop and cice models both have similar decompositions and strategies for specifying the decomposition. Both models support decomposition of the horizontal grid into two-dimensional blocks, and these blocks are then allocated to individual processors inside each component. The decomposition must be specified when the models are built. There are four environment variables in env_build.xml for each model that specify the decomposition used. These variables are POP or CICE followed by _BLCKX, _BLCKY, _MXBLCKS, and _DECOMP. BLCKX and BLCKY specify the size of the local block in grid cells in the "x" and "y" direction. MXBLCKS specifies the maximum number of blocks that might be on any given processor, and DECOMP specifies the strategy for laying out the blocks on processors.

The values for these environment variables are set automatically by scripts in the cice and pop "bld" directories when "configure -case" is run. The scripts that generate the decompositions are


Those tools leverage decompositions stored in xml files,


to set the decomposition for a given resolution and total processor count. The decomposition used can have a significant effect on the model performance, and the decompositions specified by the tools above generally provide optimum or near optimum values for the given resolution and processor count. More information about cice and pop decompositions can be found in each of those user guides.

The decompositions can be specified manually by setting the environment variable POP_AUTO_DECOMP or CICE_AUTO_DECOMP to false in env_mach_pes.xml (which turns off use of the scripts above) and then setting the four BLCKX, BLCKY, MXBLCKS, and DECOMP environment variables in env_build.xml.

In general, relatively square and evenly divided Cartesian decompositions work well for pop at low to moderate resolution. Cice performs best with "tall and narrow" blocks because of the load imbalance for most global grids between the low and high latitudes. At high resolutions, more than one block per processor can result in land block elimination and non-Cartesian decompositions sometimes perform better. Testing of several decompositions is always recommended for performance and validation before a long run is started.