Wrapping Up

My summer job as a research student with Steve Easterbrook is nearing its end. All of a sudden, I only have a few days left, and the weather is (thankfully) cooling down as autumn approaches. It feels like just a few weeks ago that this summer was beginning!

Over the past three months, I examined seven different GCMs from Canada, the United States, and Europe. Based on the source code, documentation, and correspondence with scientists, I uncovered the underlying architecture of each model and represented it in a set of diagrams; you can view full-sized versions here.

The component bubbles are to scale (based on the size of the code base) within each model, but not between models. The size and complexity of each GCM vary greatly, as the diagrams show. UVic is by far the least complex model – it is arguably closer to an EMIC than a full GCM.

Comparing GCM architectures yielded many insights: how modular the components are, how extensively the coupler is used, and how complexity is distributed among the components. I wrote up some of these observations in the poster I presented to the computer science department last week. My references can be seen here.

A big thanks to the scientists who answered questions about their work developing GCMs: Gavin Schmidt (Model E); Michael Eby (UVic); Tim Johns (HadGEM3); Arnaud Caubel, Marie-Alice Foujols, and Anne Cozic (IPSL); and Gary Strand (CESM). Michael Eby’s advice was also instrumental in improving the diagram design.

Although the summer is nearly over, our research certainly isn’t. I have started writing a more in-depth paper that Steve and I plan to develop during the year. We are also hoping to present our work at the upcoming AGU Fall Meeting, if our abstract gets accepted. Beyond this project, we are also looking at a potential experiment to run on CESM.

I guess I am sort of a scientist now. The line between “student” and “scientist” is blurry. I am taking classes, but also writing papers. Where does one end and the other begin? Regardless of where I am on the spectrum, I think I’m moving in the right direction. If this is what Doing Science means – investigating whatever little path interests me – I’m certainly enjoying it.


Modularity

I’ve now taken a look at the code and structure of four different climate models: Model E, CESM, UVic ESCM, and the Met Office Unified Model (which contains all the Hadley models). I’m noticing all sorts of similarities and differences, many of which I didn’t expect.

For example, I didn’t anticipate any overlap in climate model components. I thought that every modelling group would build their own ocean, their own atmosphere, and so on, from scratch. In fact, what I think of as a “model” – a self-contained, independent piece of software – applies to components more accurately than it does to an Earth system model. The latter is more accurately described as a collection of models, each representing one piece of the climate system. Each modelling group has a different collection of models, but not every one of these models is unique to their lab.
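
To make that distinction concrete, here is a minimal sketch – in Python, purely for illustration, with class names I invented rather than anything from a real code base – of an Earth system model as a collection of component models, one of which is shared between two hypothetical labs:

    # Illustrative sketch only; these classes do not come from any real model.
    class OceanModel:
        """Stand-in for a self-contained ocean model like MOM or NEMO."""
        def step(self, dt):
            pass  # advance the ocean state by one time step

    class AtmosphereModel:
        """Each lab typically writes its own atmosphere."""
        def step(self, dt):
            pass  # advance the atmospheric state

    class EarthSystemModel:
        """A 'model of models': a collection of components plus glue code."""
        def __init__(self, atmosphere, ocean):
            self.atmosphere = atmosphere
            self.ocean = ocean

        def step(self, dt):
            self.atmosphere.step(dt)
            self.ocean.step(dt)

    # Two different labs, two different collections - but the same ocean component.
    lab_a = EarthSystemModel(AtmosphereModel(), OceanModel())
    lab_b = EarthSystemModel(AtmosphereModel(), OceanModel())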

Ocean models are a particularly good example. The Modular Ocean Model (MOM) is built by GFDL, but it’s also used in NASA’s Model E and the UVic Earth System Climate Model. Another popular ocean model is the Nucleus for European Modelling of the Ocean (NEMO, what a great acronym), which is used by the newer Hadley climate models, as well as the IPSL model from France (which is sitting on my desktop as my next project!).

Aside: Speaking of clever acronyms, I don’t know what the folks at NCAR were thinking when they created the Single Column Atmosphere Model. Really, how did they not see their mistake? And why haven’t Marc Morano et al latched onto this acronym and spread it all over the web by now?

In most cases, an Earth system model has a unique architecture to fit all the component models together – a different coupling process. However, with the rise of standard interfaces like the Earth System Modeling Framework, even couplers can be reused between modelling groups. For example, the Hadley Centre and IPSL both use the OASIS coupler.
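
The idea of a standard interface is easiest to see in a toy example. The sketch below is loosely inspired by frameworks like ESMF, but the method and field names are my own invention and do not correspond to the actual ESMF or OASIS APIs:

    # Toy coupler around an invented standard interface (not the real ESMF/OASIS API).
    class Component:
        def run(self, dt): ...
        def export_fields(self):          # e.g. the ocean exports {"sst": ...}
            return {}
        def import_fields(self, fields):  # e.g. the atmosphere receives "sst"
            pass

    class Coupler:
        def __init__(self, components):
            self.components = components  # {"atmosphere": ..., "ocean": ..., ...}

        def step(self, dt):
            # Gather every component's exports, hand them to the others
            # (regridding would happen here), then advance each component.
            exports = {name: c.export_fields() for name, c in self.components.items()}
            for name, c in self.components.items():
                for other, fields in exports.items():
                    if other != name:
                        c.import_fields(fields)
                c.run(dt)

Because every component speaks the same interface, swapping one ocean model for another – or reusing the whole coupler in a different lab – is, in principle, just a change to the dictionary of components.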

There are benefits and drawbacks to the rising overlap and “modularity” of Earth system models. One could argue that it makes the models less independent: if they all agree closely, how much of that agreement is due to their physical grounding in reality, and how much is due to the fact that they all share a lot of the same code? However, modularity clearly makes model development more efficient. It allows larger communities of scientists to form around each sub-discipline of Earth system modelling and – in the case of MOM and NEMO – to build two or three really good ocean models, instead of a dozen mediocre ones. Concentrating our effort and reducing unnecessary duplication of code make modularity an attractive strategy, if an imperfect one.

The least modular of all the Earth system models I’ve looked at is Model E. The documentation mentions different components for the atmosphere, sea ice, and so on, but these components aren’t separated into subdirectories, and the lines between them are blurry. Nearly all the Fortran files sit in the same directory, “model”, and some of them deal with two or more components. For example, how would you categorize a file that calculates surface-atmosphere fluxes? Even where Model E uses code from other institutions, such as the MOM ocean model, that code is usually adapted and integrated into Model E’s own files, rather than kept in a separate directory.
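
That surface-flux question is worth spelling out. In the invented sketch below (Python for readability; Model E itself is Fortran, and none of these names are real), a single routine reads state from the atmosphere and writes back to both the atmosphere and the ocean, so there is no obviously correct component to file it under:

    # Invented example, not Model E code: a routine that straddles the
    # atmosphere/ocean boundary.
    def surface_heat_flux(atmos, ocean):
        # A bulk-formula-style flux depends on state from BOTH components...
        flux = atmos.exchange_coeff * atmos.wind_speed * (ocean.sst - atmos.surface_air_temp)
        # ...and its result feeds back into both of them.
        atmos.surface_heating += flux
        ocean.heat_loss_to_atmosphere += flux
        return flux

In a flat directory like Model E’s “model”, the question of where such a file belongs never has to be answered.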

The most modular Earth system model is probably the Met Office Unified Model. They don’t appear to have adapted NEMO, CICE (the sea ice model from Los Alamos National Laboratory), or OASIS at all – in fact, they’re not present in the code repository they gave us. I was a bit confused when I discovered that their “ocean” directory, left over from the years when they wrote their own ocean code, was now completely empty! I didn’t expect encapsulation to go so far that a component model could be stored completely outside the structural code.

An interesting example of the challenges of modularity appears in sea ice. Do you create a separate, independent sea ice component, like CESM did? Do you consider it part of the ocean, like NEMO? Or do you lump lake ice in with sea ice and let the component float between the surface and the ocean, like Model E?
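
As a sketch of the design choice (invented names again, not any model’s actual code), the three options amount to wiring the same pieces together differently:

    # Sketch of the design choice only; nothing here comes from a real code base.
    class SeaIce:
        pass

    class Ocean:
        def __init__(self, sea_ice=None):
            self.sea_ice = sea_ice

    # 1. Sea ice as an independent, fully coupled component (the CESM route):
    cesm_style = {"ocean": Ocean(), "sea_ice": SeaIce()}

    # 2. Sea ice owned by the ocean model (the NEMO route):
    nemo_style = {"ocean": Ocean(sea_ice=SeaIce())}

    # 3. One generic ice component, lake ice included, shared between the
    #    surface scheme and the ocean (roughly the Model E route):
    shared_ice = SeaIce()
    model_e_style = {"surface": shared_ice, "ocean": Ocean(sea_ice=shared_ice)}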

The real world isn’t modular. There are no clear boundaries between components on the physical Earth. But then, there’s only one physical Earth, whereas there are many virtual Earths in the form of climate models, and limited resources for developing the code in each component. On this spectrum between interconnection and encapsulation, is one end or the other our best bet? Or is there a healthy balance somewhere in the middle?