Modularity

I’ve now taken a look at the code and structure of four different climate models: Model E, CESM, UVic ESCM, and the Met Office Unified Model (which contains all the Hadley models). I’m noticing all sorts of similarities and differences, many of which I didn’t expect.

For example, I didn’t anticipate any overlap in climate model components. I thought that every modelling group would build their own ocean, their own atmosphere, and so on, from scratch. In fact, what I think of as a “model” – a self-contained, independent piece of software – applies to components more accurately than it does to an Earth system model. The latter is more accurately described as a collection of models, each representing one piece of the climate system. Each modelling group has a different collection of models, but not every one of these models is unique to their lab.

Ocean models are a particularly good example. The Modular Ocean Model (MOM) is built by GFDL, but it’s also used in NASA’s Model E and the UVic Earth System Climate Model. Another popular ocean model is the Nucleus for European Modelling of the Ocean (NEMO, what a great acronym) which is used by the newer Hadley climate models, as well as the IPSL model from France (which is sitting on my desktop as my next project!)

Aside: Speaking of clever acronyms, I don’t know what the folks at NCAR were thinking when they created the Single Column Atmosphere Model. Really, how did they not see their mistake? And why haven’t Marc Morano et al. latched onto this acronym and spread it all over the web by now?

In most cases, an Earth system model has a unique architecture to fit all the component models together – a different coupling process. However, with the rise of standard interfaces like the Earth System Modeling Framework, even couplers can be reused between modelling groups. For example, the Hadley Centre and IPSL both use the OASIS coupler.
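To make the idea of a coupler concrete, here is a toy sketch in Python. It is not taken from OASIS, ESMF, or any of the models above: the class names (`ToyOcean`, `ToyAtmosphere`, `Coupler`), the field names, and the relaxation coefficients are all made up for illustration. The only point is that once every component exposes the same `step` interface, the coupler doesn’t need to know whose ocean or atmosphere it is driving.

```python
from typing import Dict, Iterable, Protocol

Fields = Dict[str, float]  # named boundary fields passed between components


class Component(Protocol):
    """Hypothetical common interface that every component agrees to."""

    name: str

    def step(self, imports: Fields, dt: float) -> Fields:
        """Advance by dt using imported fields; return exported fields."""
        ...


class ToyOcean:
    name = "ocean"

    def __init__(self, sst: float = 15.0) -> None:
        self.sst = sst  # sea surface temperature, deg C

    def step(self, imports: Fields, dt: float) -> Fields:
        # Relax slowly toward the air temperature (made-up coefficient).
        air = imports.get("air_temperature", self.sst)
        self.sst += 0.1 * dt * (air - self.sst)
        return {"sea_surface_temperature": self.sst}


class ToyAtmosphere:
    name = "atmosphere"

    def __init__(self, air_temperature: float = 10.0) -> None:
        self.air_temperature = air_temperature  # deg C

    def step(self, imports: Fields, dt: float) -> Fields:
        # Relax quickly toward the sea surface temperature (made-up coefficient).
        sst = imports.get("sea_surface_temperature", self.air_temperature)
        self.air_temperature += 0.5 * dt * (sst - self.air_temperature)
        return {"air_temperature": self.air_temperature}


class Coupler:
    """Passes fields between components without knowing their internals."""

    def __init__(self, components: Iterable[Component]) -> None:
        self.components = list(components)
        self.fields: Fields = {}  # fields exported on the previous step

    def step(self, dt: float) -> None:
        exports: Fields = {}
        for component in self.components:
            # Every component sees the fields exported on the previous step.
            exports.update(component.step(dict(self.fields), dt))
        self.fields = exports


ocean = ToyOcean()
atmosphere = ToyAtmosphere()
coupler = Coupler([ocean, atmosphere])
for _ in range(5):
    coupler.step(dt=1.0)
print(ocean.sst, atmosphere.air_temperature)
```

Swapping in a different ocean would mean writing another class with the same `step` signature and handing it to `Coupler`; nothing else changes. Real couplers also handle regridding, time averaging, and parallel communication, none of which appears here.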

There are benefits and drawbacks to the rising overlap and “modularity” of Earth system models. One could argue that it makes the models less independent. If they all agree closely, how much of that agreement is due to their physical grounding in reality, and how much is due to the fact that they all use a lot of the same code? However, modularity is clearly a more efficient process for model development. It allows larger communities of scientists from each sub-discipline of Earth system modelling to form, and – in the case of MOM and NEMO – make two or three really good ocean models, instead of a dozen mediocre ones. Concentrating our effort, and reducing unnecessary duplication of code, makes modularity an attractive strategy, if an imperfect one.

The least modular of all the Earth system models I’ve looked at is Model E. The documentation mentions different components for the atmosphere, sea ice, and so on, but these components aren’t separated into subdirectories, and the lines between them are blurry. Nearly all the Fortran files sit in the same directory, “model”, and some of them deal with two or more components. For example, how would you categorize a file that calculates surface-atmosphere fluxes? Even where Model E uses code from other institutions, such as the MOM ocean model, it’s usually adapted and integrated into their own files, rather than in a separate directory.

The most modular Earth system model is probably the Met Office Unified Model. They don’t appear to have adapted NEMO, CICE (the sea ice model from Los Alamos National Laboratory) and OASIS at all – in fact, they’re not present in the code repository they gave us. I was a bit confused when I discovered that their “ocean” directory, left over from the years when they wrote their own ocean code, was now completely empty! Encapsulation to the point where a component model can be stored completely externally to the structural code was unexpected.

An interesting example of the challenges of modularity appears in sea ice. Do you create a separate, independent sea ice component, like CESM did? Do you consider it part of the ocean, like NEMO? Or do you lump in lake ice along with sea ice and subsequently allow the component to float between the surface and the ocean, like Model E?
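The three choices can be sketched in a few lines of Python. This is purely a toy of my own, not how CESM, NEMO, or Model E actually implement ice: the `ToyIce` and `OceanWithIce` names and the growth rate are invented, and the seawater freezing point is only a rough value for typical salinity.

```python
from dataclasses import dataclass

FREEZING_POINT_SEAWATER = -1.8   # deg C, rough value for typical salinity
FREEZING_POINT_FRESHWATER = 0.0  # deg C


@dataclass
class ToyIce:
    """One set of ice rules, parameterized by the freezing point."""

    freezing_point: float
    thickness: float = 0.0  # metres

    def update(self, water_temperature: float, dt: float) -> float:
        # Grow ice when the water is below freezing, melt it otherwise.
        growth_rate = 0.01  # metres per degree per time unit (made up)
        change = growth_rate * (self.freezing_point - water_temperature) * dt
        self.thickness = max(0.0, self.thickness + change)
        return self.thickness


class OceanWithIce:
    """Choice 2: the ocean owns its sea ice and steps it internally."""

    def __init__(self, sst: float) -> None:
        self.sst = sst  # sea surface temperature, deg C
        self.ice = ToyIce(FREEZING_POINT_SEAWATER)

    def step(self, dt: float) -> float:
        return self.ice.update(self.sst, dt)


# Choice 1: sea ice as its own component, handed the SST from outside.
standalone_sea_ice = ToyIce(FREEZING_POINT_SEAWATER)
standalone_sea_ice.update(water_temperature=-3.8, dt=10.0)

# Choice 3: the same ice rules reused for lakes, with a different constant.
lake_ice = ToyIce(FREEZING_POINT_FRESHWATER)
lake_ice.update(water_temperature=-1.0, dt=10.0)
```

The physics is identical in all three cases. What changes is who owns the ice object and who is responsible for calling `update`, which is exactly the architectural question.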

The real world isn’t modular. There are no clear boundaries between components on the physical Earth. But then, there’s only one physical Earth, whereas there are many virtual Earths in the form of climate models, and limited resources for developing the code in each component. In this spectrum of interconnection and encapsulation, is one end or the other our best bet? Or is there a healthy balance somewhere in the middle?

10 thoughts on “Modularity”

  1. I feel like modularity is the way to go, even in the case where one group has the resources to do it all themselves. It is just good programming practice… as long as the interfaces between the modules are good, the modularity helps with the programming and shouldn’t interfere with the functionality.

    Well, I guess there’s a limit: I’m not going to make a “grid cell X,Y,Z” module: but all atmospheric grid cells should have approximately the same rules, and therefore should be in one module (though maybe stratosphere and troposphere could be separate), all ocean grid cells have a 2nd set of rules, so 2nd module for those. To the extent that lake ice and sea ice obey the same rules, then one module could be used there, but if freshwater ice and saltwater ice have different properties, then different modules would also work.

    Of course, the interfaces are very key: does the ocean-atmosphere interface handle hurricanes and their effects on heat transport into the depths and to the top of the atmosphere? Does the ecosystem module couple appropriately with the soil module and the atmosphere module, especially when it comes to the water, carbon, and nitrogen cycles? etc. etc.

    -M

  2. Is there such a thing as a model programmed using the Object Oriented method?

    Would seem like a good idea.

    Interest has been expressed, such as by Michael Tobis, to write a model in Python, but I’m not sure how much has happened yet. Any news, mt? -Kate

  3. While the goal may be research – the meta goal is mass communication.

    I wish there was a simulation game engine that could use the most current, standardized data.

    It sure would do much to promote greater understanding of climate science – the greatest game idea = simulated reality.

  4. > Single Column Atmosphere Model

    UKMO used to have the Single Column Unified Model :-)

    And indeed, it was rumoured that the Hadley Centre for Climate Prediction and Research was going to be for Climate Research And Prediction until…

  5. Modularity makes perfect sense. There’s no point reinventing the wheel, to use a hackneyed phrase. If you’re building a mallet and you have designed a perfectly good shaft, you don’t redesign a new one.

    Naturally, if you can’t recognise a good wheel, or mallet shaft, when it hits you in the face, but you can use the phrase ‘bad design methodology’ in an assertive, confident way, you’re perfectly positioned to increase global FUD.

    As for games: I was thinking just the other day how when I used to play Civilization many moons ago, that the game always got heavy towards the end, when vacant land was hard to find, and dealing with the pollution from the mass of humanity was… not a lot of fun. Some missed lessons there, I feel. I do find it a bit odd that Sid Meier hasn’t taken more advantage of current knowledge (as far as I know) — I would think the time’s about right for a ‘Fall of Civilization’ variant!

    Fate of The World is based on the research of Dr Myles Allen. That game is up to version 1.0.7 now. Robin Tregaskis (at Red Redemption) told me a couple of weeks ago that a major update is imminent, though I don’t know the details.

  6. Theoretically, modularity is certainly the way to go — while the Earth is very complex, you’re not really trying to recreate all the complexities, but merely implementing a sort of simplified view of how the Earth works (and even this simplified view is already very complex).

    But at a practical level, there’s a potential conflict between modularity and computational speed. I suspect the way you mentally divide up a system won’t necessarily be the same way you’d organize the computation for optimal speed (and may well be very different), especially when you’re also using specialized hardware and/or software facilities like SciPy or Pentium’s MMX or what not. This is a problem because climate models need lots of cycles to run, and scientists do want their models to run fast so that they can get the results in before the conference deadline or IPCC cut-off date.

    Though maybe there’s already some sort of über-powerful compiler technology out there that already addresses most of this problem…

    (Totally unrelated: via MT’s blog, saw a blog post on the Craigslist Reverse Programmer Troll. :) )

    — frank

  7. Colin Reynolds regarding Civilisation.

    Although I class myself as being ‘green’, whenever I play or played Civilisation I always went for world military domination :-)

    I have Fate of the World, but only played the first scenario so far. The graphics don’t have the same refinement of most modern games, but then it is quite cheap.

  8. Well, to me the world is somewhat modular. The trouble is that the boundaries of the modules are not static: things like the stratopause or a halocline may move, and that makes applying a rigid grid to an Earth model a somewhat harder task, I’d imagine? On the other hand, if one were to model the moving boundaries, one would have to calculate the grid point volumes again and again, and that doesn’t sound like too happy a task either. Anyway, even some genes are modular in structure, so that’s the natural way to go, imho. But you’re definitely going into the deep end with this job! Looking forward to reports of it and anything else you might write. :-)
