Climate Models on Ubuntu

Part 1: Model E

I felt a bit over my head attempting to port CESM, so I asked a grad student, who had done his Master’s on climate modelling, for help. He looked at the documentation, scratched his head, and suggested I start with NASA’s Model E instead, because it was easier to install. And was it ever! We had it up and running within an hour or so. It was probably so much easier because Model E comes with gfortran support, while CESM only has scripts written for commercial compilers like Intel or PGI.

Strangely, when using Model E, no matter what dates the rundeck sets for the simulation start and end, the subsequently generated I file always has December 1, 1949 as the start date and December 2, 1949 as the end date. We edited the I files after they were created, which seemed to fix the problem, but it was still kind of weird.

I set up Model E to run a ten-year simulation with fixed atmospheric concentrations (really, I just picked a rundeck at random) over the weekend. It took about 3 days to complete, so just over 7 hours per year of simulation time…not bad for a 32-bit desktop!

However, I’m having some weird problems with the output – after configuring the model to output files in NetCDF format and opening them in Panoply, only the file with all the sea ice variables worked. All the others either gave a blank map (array full of N/A’s) or threw errors when Panoply tried to read them. Perhaps the model isn’t enjoying having the I file edited?

Part 2: CESM

After exploring Model E, I felt like trying my hand at CESM again. Steve managed to port it onto his MacBook last year, and took detailed notes. Editing the scripts didn’t seem so daunting this time!

The CESM code can be downloaded using Subversion (instructions here) after a quick registration. Using the Ubuntu Software Center, I installed some necessary packages: libnetcdf-dev, mpich2, and torque-scheduler. I already had gfortran, which is sort of essential.
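
If you prefer the terminal to the Software Center, the same packages can be installed with apt-get; something like this should do it (same package names as above):

sudo apt-get install gfortran libnetcdf-dev mpich2 torque-scheduler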

I used the “Porting via user defined machine files” method to configure the model for my machine, using the Hadley scripts as a starting point. Variables for config_machines.xml are explained in Appendices D through H of the user’s guide (links in chapter 7). Mostly, you’re just pointing to folders where you want to store data and files. Here are a few exceptions:

  • DOUT_L_HTAR: I stuck with "TRUE", as that was the default.
  • CCSM_CPRNC: this tool already exists in the CESM source code, in /models/atm/cam/tools/cprnc.
  • BATCHQUERY and BATCHSUBMIT: the Hadley entry had “qstat” and “qsub”, respectively, so I Googled these terms to find out which batch submission software they referred to (Torque, which is freely available in the torque-scheduler package) and downloaded it so I could keep the commands the same!
  • GMAKE_J: this sets how many parallel jobs the build runs at once (the -j flag to make), and I wasn’t sure how many processors this machine had, so I just put “1”.
  • MAX_TASKS_PER_NODE: I chose "8", which the user’s guide had mentioned as an example.
  • MPISERIAL_SUPPORT: the default is “FALSE”.

The only file that I really needed to edit was Macros.<machine name>. The env_machopts.<machine name> file ended up being empty for me. I spent a while confused by the module declarations, which turned out to refer to the Environment Modules software. Once I realized that, for this software to be helpful, I would have to write five or six modulefiles in a language I didn’t know, I decided that it probably wasn’t worth the effort, and took these declarations out. I left mkbatch.<machine name> alone, except for the first line, which sets the machine, and then turned my attention to Macros.

“Getting this to work will be an iterative process”, the user’s guide says, and it certainly was (and still is). It’s never a good sign when the installation guide reminds you to be patient! Here is the sequence of each iteration (the commands are collected into a single terminal sketch after the list):

  1. Edit the Macros file as best I can.
  2. Open up the terminal, cd to cesm1_0/scripts, and create a new case as follows: ./create_newcase -case test -res f19_g16 -compset X -mach <machine name>
  3. If this works, cd to test, and run configure: ./configure -case
  4. If all is well, try to build the case: ./test.<machine name>.build
  5. See where it fails and read the build log file it refers to for ideas as to what went wrong. Search on Google for what certain errors mean. Do some other work for a while, to let the ideas simmer.
  6. Set up for the next case: run ./test.<machine name>.clean_build, cd .., and rm -rf test. This clears out old files so you can safely build a new case with the same name.
  7. See step 1.
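
For reference, here is what one pass through that loop looks like as a terminal session. This is only a sketch: <machine name> stands in for whatever you called your machine, and "test" is just the case name I used.

cd cesm1_0/scripts
./create_newcase -case test -res f19_g16 -compset X -mach <machine name>
cd test
./configure -case
./test.<machine name>.build
# (read the build logs, tweak Macros, then clean up for the next attempt)
./test.<machine name>.clean_build
cd ..
rm -rf test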

I wasn’t really sure what the program paths were, as I couldn’t find a nicely contained folder for each one (like Windows has in “Program Files”), but I soon stumbled upon a nice little trick: look up the package on the Ubuntu Packages website, and click on “list of files” under the Download section. That shows where each of the package’s files ends up on disk, which tells you the path the program uses as its root.
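
If you’d rather stay in the terminal, dpkg can tell you the same thing for a package that is already installed. For example:

dpkg -L libnetcdf-dev

lists every file the package put on the system, which makes it easy to spot the include and lib directories.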

I also discovered that setting FC and CC to gfortran and gcc, respectively, in the Macros file will throw errors. Instead, leave the variables as mpif90 and mpicc, which are wrappers around the GNU compilers. For example, when I type mpif90 in the terminal, the result is “gfortran: no input files”, just as if I had typed gfortran. For some reason, though, the errors go away.
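
If you’re curious what the wrapper adds, the MPICH compiler wrappers accept a -show option (at least on my install) that prints the underlying command without running it:

mpif90 -show

Presumably the extra MPI include and library flags it reveals are what a bare gfortran call is missing.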

As soon as I made it past building the mct and pio libraries, the build logs for each component (e.g. atm, ice) started saying gmake: command not found. This is one of the pitfalls of Ubuntu: it provides GNU Make only under the name make, whereas many other Unix-like systems also offer it as gmake. So I needed to find and edit all the scripts that called gmake, or generated other scripts that called it, and so on. “There must be a way to automate this,” I thought, and from this article I found out how. In the terminal, cd to the CESM source code folder, and type the following:

grep -lr -e 'gmake' * | xargs sed -i 's/gmake/make/g'

You should only have to do this once. The search is case sensitive, so it will leave the XML variable GMAKE_J alone.
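
If you want to double-check that nothing was missed, running the same search again (without piping it to sed) should now come back empty:

grep -lr -e 'gmake' *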

Then I turned my attention to compiler flags, which Steve chronicled quite well in his notes (see link above). I made most of the same changes that he did, except I didn’t need to change -DLINUX to -DDarwin. I still needed a few more compiler flags, though. In the terminal, man gfortran brings up the full list of gfortran options, which was helpful.

The ccsm build log had hundreds of undefined reference errors as soon as it started to compile Fortran. The way I understand it, gfortran appends an underscore to external symbol names (the names of subroutines and functions), so if a library was compiled with a different naming convention, the linker can’t match up the references. You can suppress the underscores with the flag -fno-underscoring.
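
The flag goes wherever your Macros file sets the Fortran compile flags; depending on how the file is laid out, the line might end up looking something like this (the exact variable name will vary):

FFLAGS += -fno-underscoring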

Now I am stuck on a new error. It looks like the ccsm script is almost reaching the end, as it’s using ld, the GNU linker that gcc calls to tie all the object files together. Then the build log says:

/usr/bin/ld: seq_domain_mct.o(.debug_info+0x1c32): unresolvable R_386_32 relocation against symbol 'mpi_fortran_argv_null'
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: ld returned 1 exit status

I’m having trouble finding articles on the internet about similar errors, and the gcc and ld manpages are so long that trying every compiler flag isn’t really an option. Any ideas?

Update: Fixed it! In scripts/ccsm_utils/Build/Makefile, I changed LD := $(F90) to LD := gcc -shared. The build was finally successful! Now off to try and run it…

The good thing is that, since I re-started this project a few days ago, I haven’t spent very long stuck on any one error. I’m constantly having problems, but I move through them pretty quickly! In the meantime, I’m learning a lot about the model and how everything fits together during installation. I’ve also come a long way with Linux programming in general. Considering that when I first installed Ubuntu a few months ago I sheepishly called my friend to ask where to find the command line, I’m quite proud of my progress!

I hope this article will help future Ubuntu users install CESM, as it seems to have a few quirks that even Mac OS X doesn’t experience (e.g. make vs. gmake). For the rest of you, apologies if I have bored you to tears!

Tornadoes and Climate Change

Cross-posted from NextGen Journal

It has been a bad season for tornadoes in the United States. In fact, this April shattered the previous record for the most tornadoes in a single month. Even though the count isn’t finalized yet, nobody doubts that it will come out on top.

In a warming world, certain questions come up again and again, and quite reasonably so. Is this a sign of climate change? Will we experience more, or stronger, tornadoes as the planet warms further?

In fact, these are very difficult questions to answer. First of all, attributing a specific weather event, or even a series of weather events, to a change in the climate is extremely difficult. Scientists can do statistical analysis to estimate the probability of the event with and without the extra energy available in a warming world, but this kind of study takes years. Even so, nobody can say for certain whether an event wasn’t just a fluke. The recent tornadoes very well might have been caused by climate change, but they also might have happened anyway.

Will tornadoes become more common in the future, as global warming progresses? Tornado formation is complicated, and forecasting tornadoes requires an awful lot of calculation. Many processes in the climate system are this way, so scientists simulate them using computer models, which can do detailed calculations at an increasingly impressive speed.

However, individual tornadoes are relatively small compared to other kinds of storms, such as hurricanes or ordinary rainstorms. They are, in fact, smaller than a single grid cell in the highest-resolution climate models around today. Therefore, it’s just not possible to simulate individual tornadoes directly in these models.

However, we can project the conditions necessary for tornadoes to form. These conditions don’t always lead to a tornado, but they make one more likely. Two main factors are involved: high wind shear and high convective available potential energy (CAPE). Climate change is making the atmosphere warmer and increasing specific humidity (but not relative humidity); both of these add to CAPE, so that factor will make conditions more favourable to tornadoes. However, climate change warms the poles faster than the equator, which will decrease the temperature difference between them, thereby lowering wind shear and making tornadoes less likely (Diffenbaugh et al, 2008). Which factor will win out? Is there another factor involved that climate change could affect? Will we get more tornadoes in some areas and fewer in others? Will tornadoes become weaker or stronger? It’s very difficult to tell.

In 2007, NASA scientists used a climate model to project changes in severe storms, including tornadoes. (Remember, even though an individual tornado can’t be represented in a model, the conditions likely to cause one can.) They predicted that the future will bring fewer storms overall, but that the ones that do form will be stronger. A plausible answer to the question, although not a very comforting one.

With uncertain knowledge, how should we approach this issue? Should we focus on the comforting possibility that the devastation in the United States might have nothing to do with our species’ actions? Or should we acknowledge that we might bear responsibility? Dr. Kevin Trenberth, a top climate scientist at the National Center for Atmospheric Research (NCAR), thinks that ignoring this possibility until it’s proven is a bad idea. “It’s irresponsible not to mention climate change,” he writes.

Learning Experiences

I apologize for my brief hiatus – it’s been almost two weeks since I’ve posted. I have been very busy recently, but for a very exciting reason: I got a job as a summer student of Dr. Steve Easterbrook! You can read more about Steve and his research on his faculty page and blog.

This job required me to move cities for the summer, so my mind has been consumed with thoughts such as “Where am I and how do I get home from this grocery store?” rather than “What am I going to write a post about this week?” However, I have had a few days on the job now, and as Steve encourages all of his students to blog about their research, I will use this outlet to periodically organize my thoughts.

I will be doing some sort of research project about climate modelling this summer – we’re not yet sure exactly what, so I am starting by taking a look at the code for some GCMs. The NCAR Community Earth System Model is one of the easiest to access, as it is largely an open source project. I’ve only read through a small piece of their atmosphere component, but I’ve already seen more physics calculations in one place than ever before.

I quickly learned that trying to understand every line of the code is a silly goal, as much as I may want to. Instead, I’m trying to get a broader picture of what the programs do. It’s really neat to have my knowledge about different subjects converge so completely. Multi-dimensional arrays, which I have previously only used to program games of Sudoku and tic-tac-toe, are now being used to represent the entire globe. Electric potential, a property I last studied in the circuitry unit of high school physics, somehow impacts atmospheric chemistry. The polar regions, which I was previously fascinated with mainly for their wildlife, also present interesting mathematical boundary cases for a climate model.

It’s also interesting to see how the collaborative nature of CESM, written by many different authors and designed for many different purposes, impacts its code. Some of the modules have nearly a thousand lines of code, and some have only a few dozen – it all depends on the programming style of the various authors. The commenting ranges from extensive to nonexistent. Every now and then one of the files will be written in an older version of Fortran, where EVERYTHING IS IN UPPER CASE.

I am bewildered by most of the variable names. They seem to be collections of abbreviations I’m not familiar with. Some examples are “mxsedfac”, “lndmaxjovrdmdni”, “fxdd”, and “vsc_knm_atm”.

When we get a Linux machine set up (I have heard too many horror stories to attempt a dual-boot with Windows) I am hoping to get a basic CESM simulation running, as well as EdGCM (this could theoretically run on my laptop, but I prefer to bring that home with me each evening, and the simulation will probably take over a day).

I am also doing some background reading on the topic of climate modelling, including this book, which led me to the story of PHONIAC. The first weather prediction done on a computer (the ENIAC machine) was recreated as a smartphone application, and ran approximately 3 million times faster. Unfortunately, I can’t find anyone with a smartphone that supports Java (argh, Apple!) so I haven’t been able to try it out.

I hope everyone is having a good summer so far. A more traditional article about tornadoes will be coming at the end of the week.

What Kevin Trenberth Has to Say

A comment from Steve Bloom several months ago got me thinking about a new kind of post that would be a lot of fun: interviewing top climate scientists, both on their research and their views of climate science journalism and communication. When I emailed Dr. Kevin Trenberth to see if he would be interested in such an interview, he responded with an entire essay that he had written about recent events in climate change communication. Although this essay is unpublished as of yet, he graciously suggested that I quote it for a post here.

It’s no surprise that Dr. Trenberth, head of the Climate Analysis Section at the National Center for Atmospheric Research in Colorado, is angry about the way stolen emails between researchers were trumpeted around the world in an attempt to make them seem like something they were not. He was “involved in just over 100” of the emails, and from the looks of things, hasn’t heard the end of it since they were stolen.

One oft-quoted statement of his went viral: “The fact is that we can’t account for the lack of warming at the moment and it is a travesty that we can’t.” Climate change deniers portrayed this quote as an admission that the world wasn’t warming after all, or even that scientists were trying to cover up a cooling trend. Taken in the full context of the email in which it was written, however, it’s clear that Trenberth was referring to a recent paper of his, which discussed our incomplete understanding of the factors affecting short-term variability in the Earth’s temperature. There were a couple of years between 2004 and 2008 that weren’t quite as warm as scientists expected after looking at all the forcings, such as solar irradiance and ENSO. The paper and the subsequent email in no way mean that global warming has stopped. In fact, we’re well on our way to the warmest year on record. “It is amazing to see this particular quote lambasted so often,” says Trenberth.

Another quote, this time from a stolen email he was not even a recipient of, was written by Phil Jones, the director of CRU. “I can’t see either of these papers being in the next IPCC report,” wrote Jones, referring to two studies that were not regarded very highly by the climate science community, one of which was later retracted. “Kevin and I will keep them out somehow – even if we have to redefine what the peer-reviewed literature is!”

Dr. Trenberth offers an insight into this comment that was previously unknown to me. The IPCC’s 2007 report “was the first time Jones was on the writing team of an IPCC Assessment,” he says. “The comment was naive and sent before he understood the process and before any lead author meetings were held…As a veteran of 3 previous IPCC assessments, I was well aware that we do not keep any papers out, and none were kept out.” Indeed, both studies were discussed in the 2007 report, offering proof that the private emails of scientists do not always correspond to their ultimate actions.

To date, four independent investigations (five if you count the two Penn State reports as separate) “have confirmed what climate scientists have never seriously doubted: established scientists depend on their credibility and have no motivation in purposely misleading the public and their colleagues.” Referring to the only major criticism that the investigations had for CRU, Trenberth notes that scientists “are also understandably, but inadvisably, reluctant to share complex data sets with non-experts that they perceive as charlatans.”

Despite the complete absence of evidence for scientific fraud, the fact that no papers were changed or retracted due to these emails, and the obvious innocence of scientists like Dr. Trenberth, public confusion over climate change has grown in recent months. Almost everyone who keeps up with the news will remember hearing something about climate researchers accused of malpractice. “There should be condemnation of the abuse, misuse and downright lies about the emails,” says Trenberth. “That should be the real ClimateGate!”

After all this experience as the subject of libelous attacks and campaigns of misinformation, Kevin Trenberth can offer suggestions for other scientists in the same position. He does not recommend debating the conclusions of climate change research in the public sphere, as “scientific facts are not open to debate and opinion because they are evidence and/or physically based.” He has learned, like so many of us here at ClimateSight, that “in a debate it is impossible to counter lies [and] loudly proclaimed confident statements that often have little or no basis.”

“Moreover,” he adds, “a debate actually gives alternative views credibility,” something that climate change deniers haven’t earned. He and his colleagues “find it disturbing that blogs by uninformed members of the public are given equal weight with carefully researched information backed up with extensive observational facts and physical understanding.”

Much of the online climate change community has lost faith in climate journalism in recent months, and Dr. Trenberth is no exception. He asserts that the mass media has been “complicit in this disinformation campaign of the deniers”, and has some explanations as to why. “Climate varies slowly,” he says, “and so the message remains similar, year after year — something not exciting for journalists as it is not ‘news’.” He also notes the stubborn phenomenon of artificial balance, as “controversy is the fodder of the media, not truth, and so the media amplify the view that there are two sides and give unwarranted attention to views of a small minority or those with vested interests or ideologies.”

“The media are a part of the problem,” says Trenberth. “But they have to be part of the solution.”