Feeds:
Posts
Comments

Posts Tagged ‘programming’

Almost four years ago I took a job as a summer student of Dr. Steve Easterbrook, in the software engineering lab of the University of Toronto. This was my first time taking part in research, but also my first time living away from home and my first time using a Unix terminal (both of which are challenging, but immensely rewarding, life skills).

While working with Steve I discovered that climate model output is really pretty (an opinion which hasn’t changed in the four years since) and that climate models are really hard to install (that hasn’t changed either).

At Steve’s suggestion I got a hold of the code for various climate models and started to pick it apart. By the end of the summer I had created a series of standardised diagrams showing the software architecture of each model.

These diagrams proved to be really useful communication tools: we presented our work at AGU the following December, and at NCAR about a year after that, to very positive feedback. Many climate modellers we met at these conferences were pleased to have a software diagram of the model they used (which is very useful to show during presentations), but they were generally more interested in the diagrams for other models, to see how other research groups used different software structures to solve the same problems. “I had no idea they did it like that,” was a remark I heard more than a few times.

Between my undergrad and PhD, I went back to Toronto for a month where I analysed the model code more rigorously. We made a new set of diagrams which was more accurate: the code was sorted into components based on dependency structure, and the area of each component in a given diagram was exactly proportional to the line count of its source code.

Here is the diagram we made for the GFDL-ESM2M model, which is developed at the Geophysical Fluid Dynamics Laboratory in Princeton:

We wrote this all up into a paper, submitted it to GMD, and after several months and several rounds of revision it was published just yesterday! The paper is open access, and can be downloaded for free here. It’s my first paper as lead author which is pretty exciting.

I could go on about all the interesting things we discovered while comparing the diagrams, but that’s all in the paper. Instead I wanted to talk about what’s not in the paper: the story of the long and winding journey we took to get there, from my first day as a nervous summer student in Toronto to the final publication yesterday. These are the stories you don’t read about in scientific papers, which out of necessity detail the methodology as if the authors knew exactly where they were going and got there using the shortest possible path. Science doesn’t often work like that. Science is about messing around and exploring and getting a bit lost and eventually figuring it out and feeling like a superhero when you do. And then writing it up as if it was easy.

I also wanted to express my gratitude to Steve, who has been an amazing source of support, advice, conversations, book recommendations, introductions to scientists, and career advice. I’m so happy that I got to be your student. See you before long on one continent or another!

Read Full Post »

This winter I took a course in computational physics, which has probably been my favourite undergraduate course to date. Essentially it was an advanced numerical methods course, but from a very practical point of view. We got a lot of practice using numerical techniques to solve realistic problems, rather than just analysing error estimates and proving conditions of convergence. As a math student I found this refreshing, and incredibly useful for my research career.

We all had to complete a term project of our choice, and I decided to build a small climate model. I was particularly interested in the stochastic techniques taught in the course, and given that modern GCMs and EMICs are almost entirely deterministic, it was possible that I could contribute something original to the field.

The basic premise of my model is this: All anthropogenic forcings are deterministic, and chosen by the user. Everything else is determined stochastically: parameters such as climate sensitivity are sampled from probability distributions, whereas natural forcings are randomly generated but follow the same general pattern that exists in observations. The idea is to run this model with the same anthropogenic input hundreds of times and build up a probability distribution of future temperature trajectories. The spread in possible scenarios is entirely due to uncertainty in the natural processes involved.

This approach mimics the real world, because the only part of the climate system we have full control over is our own actions. Other influences on climate are out of our control, sometimes poorly understood, and often unpredictable. It is just begging to be modelled as a stochastic system. (Not that it is actually stochastic, of course; in fact, I understand that nothing is truly stochastic, even random number generators – unless you can find a counterexample using quantum mechanics? But that’s a discussion for another time.)

A word of caution: I built this model in about eight weeks. As such, it is highly simplified and leaves out a lot of processes. You should never ever use it for real climate projections. This project is purely an exercise in numerical methods, and an exploration of the possible role of stochastic techniques in climate modelling.

Over the coming weeks, I will write a series of posts that explains each component of my simple stochastic climate model in detail. I will show the results from some sample simulations, and discuss how one might apply these stochastic techniques to existing GCMs. I also plan to make the code available to anyone who’s interested – it’s written in Matlab, although I might translate it to a free language like Python, partly because I need an excuse to finally learn Python.

I am very excited to finally share this project with you all! Check back soon for the next installment.

Read Full Post »

Today my very first scientific publication is appearing in Geophysical Research Letters. During my summer at UVic, I helped out with a model intercomparison project regarding the effect of climate change on Atlantic circulation, and was listed as a coauthor on the resulting paper. I suppose I am a proper scientist now, rather than just a scientist larva.

The Atlantic meridional overturning circulation (AMOC for short) is an integral part of the global ocean conveyor belt. In the North Atlantic, a massive amount of water near the surface, cooling down on its way to the poles, becomes dense enough to sink. From there it goes on a thousand-year journey around the world – inching its way along the bottom of the ocean, looping around Antarctica – before finally warming up enough to rise back to the surface. A whole multitude of currents depend on the AMOC, most famously the Gulf Stream, which keeps Europe pleasantly warm.

Some have hypothesized that climate change might shut down the AMOC: the extra heat and freshwater (from melting ice) coming into the North Atlantic could conceivably lower the density of surface water enough to stop it sinking. This happened as the world was coming out of the last ice age, in an event known as the Younger Dryas: a huge ice sheet over North America suddenly gave way, drained into the North Atlantic, and shut down the AMOC. Europe, cut off from the Gulf Stream and at the mercy of the ice-albedo feedback, experienced another thousand years of glacial conditions.

A shutdown today would not lead to another ice age, but it could cause some serious regional cooling over Europe, among other impacts that we don’t fully understand. Today, though, there’s a lot less ice to start with. Could the AMOC still shut down? If not, how much will it weaken due to climate change? So far, scientists have answered these two questions with “probably not” and “something like 25%” respectively. In this study, we analysed 30 climate models (25 complex CMIP5 models, and 5 smaller, less complex EMICs) and came up with basically the same answer. It’s important to note that none of the models include dynamic ice sheets (computational glacial dynamics is a headache and a half), which might affect our results.

Models ran the four standard RCP experiments from 2006-2100. Not every model completed every RCP, and some extended their simulations to 2300 or 3000. In total, there were over 30 000 model years of data. We measured the “strength” of the AMOC using the standard unit Sv (Sverdrups), where each Sv is 1 million cubic metres of water per second.

Only two models simulated an AMOC collapse, and only at the tail end of the most extreme scenario (RCP8.5, which quite frankly gives me a stomachache). Bern3D, an EMIC from Switzerland, showed a MOC strength of essentially zero by the year 3000; CNRM-CM5, a GCM from France, stabilized near zero by 2300. In general, the models showed only a moderate weakening of the AMOC by 2100, with best estimates ranging from a 22% drop for RCP2.6 to a 40% drop for RCP8.5 (with respect to preindustrial conditions).

Are these somewhat-reassuring results trustworthy? Or is the Atlantic circulation in today’s climate models intrinsically too stable? Our model intercomparison also addressed that question, using a neat little scalar metric known as Fov: the net amount of freshwater travelling from the AMOC to the South Atlantic.

The current thinking in physical oceanography is that the AMOC is more or less binary – it’s either “on” or “off”. When AMOC strength is below a certain level (let’s call it A), its only stable state is “off”, and the strength will converge to zero as the currents shut down. When AMOC strength is above some other level (let’s call it B), its only stable state is “on”, and if you were to artificially shut it off, it would bounce right back up to its original level. However, when AMOC strength is between A and B, both conditions can be stable, so whether it’s on or off depends on where it started. This phenomenon is known as hysteresis, and is found in many systems in nature.

This figure was not part of the paper. I made it just now in MS Paint.

Here’s the key part: when AMOC strength is less than A or greater than B, Fov is positive and the system is monostable. When AMOC strength is between A and B, Fov is negative and the system is bistable. The physical justification for Fov is its association with the salt advection feedback, the sign of which is opposite Fov: positive Fov means the salt advection feedback is negative (i.e. stabilizing the current state, so monostable); a negative Fov means the salt advection feedback is positive (i.e. reinforcing changes in either direction, so bistable).

Most observational estimates (largely ocean reanalyses) have Fov as slightly negative. If models’ AMOCs really were too stable, their Fov‘s should be positive. In our intercomparison, we found both positives and negatives – the models were kind of all over the place with respect to Fov. So maybe some models are overly stable, but certainly not all of them, or even the majority.

As part of this project, I got to write a new section of code for the UVic model, which calculated Fov each timestep and included the annual mean in the model output. Software development on a large, established project with many contributors can be tricky, and the process involved a great deal of head-scratching, but it was a lot of fun. Programming is so satisfying.

Beyond that, my main contribution to the project was creating the figures and calculating the multi-model statistics, which got a bit unwieldy as the model count approached 30, but we made it work. I am now extremely well-versed in IDL graphics keywords, which I’m sure will come in handy again. Unfortunately I don’t think I can reproduce any figures here, as the paper’s not open-access.

I was pretty paranoid while coding and doing calculations, though – I kept worrying that I would make a mistake, never catch it, and have it dredged out by contrarians a decade later (“Kate-gate”, they would call it). As a climate scientist, I suppose that comes with the job these days. But I can live with it, because this stuff is just so darned interesting.

Read Full Post »

Near the end of my summer at the UVic Climate Lab, all the scientists seemed to go on vacation at the same time and us summer students were left to our own devices. I was instructed to teach Jeremy, Andrew Weaver’s other summer student, how to use the UVic climate model – he had been working with weather station data for most of the summer, but was interested in Earth system modelling too.

Jeremy caught on quickly to the basics of configuration and I/O, and after only a day or two, we wanted to do something more exciting than the standard test simulations. Remembering an old post I wrote, I dug up this paper (open access) by Damon Matthews and Ken Caldeira, which modelled geoengineering by reducing incoming solar radiation uniformly across the globe. We decided to replicate their method on the newest version of the UVic ESCM, using the four RCP scenarios in place of the old A2 scenario. We only took CO2 forcing into account, though: other greenhouse gases would have been easy enough to add in, but sulphate aerosols are spatially heterogeneous and would complicate the algorithm substantially.

Since we were interested in the carbon cycle response to geoengineering, we wanted to prescribe CO2 emissions, rather than concentrations. However, the RCP scenarios prescribe concentrations, so we had to run the model with each concentration trajectory and find the equivalent emissions timeseries. Since the UVic model includes a reasonably complete carbon cycle, it can “diagnose” emissions by calculating the change in atmospheric carbon, subtracting contributions from land and ocean CO2 fluxes, and assigning the residual to anthropogenic sources.

After a few failed attempts to represent geoengineering without editing the model code (e.g., altering the volcanic forcing input file), we realized it was unavoidable. Model development is always a bit of a headache, but it makes you feel like a superhero when everything falls into place. The job was fairly small – just a few lines that culminated in equation 1 from the original paper – but it still took several hours to puzzle through the necessary variable names and header files! Essentially, every timestep the model calculates the forcing from CO2 and reduces incoming solar radiation to offset that, taking changing planetary albedo into account. When we were confident that the code was working correctly, we ran all four RCPs from 2006-2300 with geoengineering turned on. The results were interesting (see below for further discussion) but we had one burning question: what would happen if geoengineering were suddenly turned off?

By this time, having completed several thousand years of model simulations, we realized that we were getting a bit carried away. But nobody else had models in the queue – again, they were all on vacation – so our simulations were running three times faster than normal. Using restart files (written every 100 years) as our starting point, we turned off geoengineering instantaneously for RCPs 6.0 and 8.5, after 100 years as well as 200 years.

Results

Similarly to previous experiments, our representation of geoengineering still led to sizable regional climate changes. Although average global temperatures fell down to preindustrial levels, the poles remained warmer than preindustrial while the tropics were cooler:

Also, nearly everywhere on the globe became drier than in preindustrial times. Subtropical areas were particularly hard-hit. I suspect that some of the drying over the Amazon and the Congo is due to deforestation since preindustrial times, though:

Jeremy also made some plots of key one-dimensional variables for RCP8.5, showing the results of no geoengineering (i.e. the regular RCP – yellow), geoengineering for the entire simulation (red), and geoengineering turned off in 2106 (green) or 2206 (blue):

It only took about 20 years for average global temperature to fall back to preindustrial levels. Changes in solar radiation definitely work quickly. Unfortunately, changes in the other direction work quickly too: shutting off geoengineering overnight led to rates of warming up to 5 C / decade, as the climate system finally reacted to all the extra CO2. To put that in perspective, we’re currently warming around 0.2 C / decade, which far surpasses historical climate changes like the Ice Ages.

Sea level rise (due to thermal expansion only – the ice sheet component of the model isn’t yet fully implemented) is directly related to temperature, but changes extremely slowly. When geoengineering is turned off, the reversals in sea level trajectory look more like linear offsets from the regular RCP.

Sea ice area, in contrast, reacts quite quickly to changes in temperature. Note that this data gives annual averages, rather than annual minimums, so we can’t tell when the Arctic Ocean first becomes ice-free. Also, note that sea ice area is declining ever so slightly even with geoengineering – this is because the poles are still warming a little bit, while the tropics cool.

Things get really interesting when you look at the carbon cycle. Geoengineering actually reduced atmospheric CO2 concentrations compared to the regular RCP. This was expected, due to the dual nature of carbon cycle feedbacks. Geoengineering allows natural carbon sinks to enjoy all the benefits of high CO2 without the associated drawbacks of high temperatures, and these sinks become stronger as a result. From looking at the different sinks, we found that the sequestration was due almost entirely to the land, rather than the ocean:

In this graph, positive values mean that the land is a net carbon sink (absorbing CO2), while negative values mean it is a net carbon source (releasing CO2). Note the large negative spikes when geoengineering is turned off: the land, adjusting to the sudden warming, spits out much of the carbon that it had previously absorbed.

Within the land component, we found that the strengthening carbon sink was due almost entirely to soil carbon, rather than vegetation:

This graph shows total carbon content, rather than fluxes – think of it as the integral of the previous graph, but discounting vegetation carbon.

Finally, the lower atmospheric CO2 led to lower dissolved CO2 in the ocean, and alleviated ocean acidification very slightly. Again, this benefit quickly went away when geoengineering was turned off.

Conclusions

Is geoengineering worth it? I don’t know. I can certainly imagine scenarios in which it’s the lesser of two evils, and find it plausible (even probable) that we will reach such a scenario within my lifetime. But it’s not something to undertake lightly. As I’ve said before, desperate governments are likely to use geoengineering whether or not it’s safe, so we should do as much research as possible ahead of time to find the safest form of implementation.

The modelling of geoengineering is in its infancy, and I have a few ideas for improvement. In particular, I think it would be interesting to use a complex atmospheric chemistry component to allow for spatial variation in the forcing reduction through sulphate aerosols: increase the aerosol optical depth over one source country, for example, and let it disperse over time. I’d also like to try modelling different kinds of geoengineering – sulphate aerosols as well as mirrors in space and iron fertilization of the ocean.

Jeremy and I didn’t research anything that others haven’t, so this project isn’t original enough for publication, but it was a fun way to stretch our brains. It was also a good topic for a post, and hopefully others will learn something from our experiments.

Above all, leave over-eager summer students alone at your own risk. They just might get into something like this.

Read Full Post »

Cross-posted from NextGen Journal

Ask most people to picture a scientist at work, and they’ll probably imagine someone in a lab coat and safety goggles, surrounded by test tubes and Bunsen burners. If they’re fans of The Big Bang Theory, maybe they’ll picture complicated equations being scribbled on whiteboards. Others might think of the Large Hadron Collider, or people wading through a swamp taking water samples.

All of these images are pretty accurate – real scientists, in one field or another, do these things as part of their job. But a large and growing approach to science, which is present in nearly every field, replaces the lab bench or swamp with a computer. Mathematical modelling, which essentially means programming the complicated equations from the whiteboard into a computer and solving them many times, is the science of today.

Computer models are used for all sorts of research questions. Epidemiologists build models of an avian flu outbreak, to see how the virus might spread through the population. Paleontologists build biomechanical models of different dinosaurs, to figure out how fast they could run or how high they could stretch their necks. I’m a research student in climate science, where we build models of the entire planet, to study the possible effects of global warming.

All of these models simulate systems which aren’t available in the real world. Avian flu hasn’t taken hold yet, and no sane scientist would deliberately start an outbreak just so they could study it! Dinosaurs are extinct, and playing around with their fossilized bones to see how they might move would be heavy and expensive. Finally, there’s only one Earth, and it’s currently in use. So models don’t replace lab and field work – rather, they add to it. Mathematical models let us perform controlled experiments that would otherwise be impossible.

If you’re interested in scientific modelling, spend your college years learning a lot of math, particularly calculus, differential equations, and numerical methods. The actual application of the modelling, like paleontology or climatology, is less important for now – you can pick that up later, or read about it on your own time. It might seem counter-intuitive to neglect the very system you’re planning to spend your life studying, but it’s far easier this way. A few weeks ago I was writing some computer code for our lab’s climate model, and I needed to calculate a double integral of baroclinic velocity in the Atlantic Ocean. I didn’t know what baroclinic velocity was, but it only took a few minutes to dig up a paper that defined it. My work would have been a lot harder if, instead, I hadn’t known what a double integral was.

It’s also important to become comfortable with computer programming. You might think it’s just the domain of software developers at Google or Apple, but it’s also the main tool of scientists all over the world. Two or three courses in computer science, where you’ll learn a multi-purpose language like C or Java, are all you need. Any other languages you need in the future will take you days, rather than months, to master. If you own a Mac or run Linux on a PC, spend a few hours learning some basic UNIX commands – it’ll save you a lot of time down the road. (Also, if the science plan falls through, computer science is one of the only majors which will almost definitely get you a high-paying job straight out of college.)

Computer models might seem mysterious, or even untrustworthy, when the news anchor mentions them in passing. In fact, they’re no less scientific than the equations that Sheldon Cooper scrawls on his whiteboard. They’re just packaged together in a different form.

Read Full Post »

  1. Scientists do not blindly trust their own models of global warming. In fact, nobody is more aware of a model’s specific weaknesses than the developers themselves. Most of our time is spent comparing model output to observations, searching for discrepancies, and hunting down bugs.
  2. If 1.5 C global warming above preindustrial temperatures really does represent the threshold for “dangerous climate change” (rather than 2 C, as some have argued), then we’re in trouble. Stabilizing global temperatures at this level isn’t just climatically difficult, it’s also mathematically difficult. Given current global temperatures, and their current rate of change, it’s nearly impossible to smoothly extend the curve to stabilize at 1.5 C without overshooting.
  3. Sometimes computers do weird things. Some bugs appear for the most illogical reasons (last week, the act of declaring a variable altered every single metric of the model output). Other bugs show up once, then disappear before you can track down the source, and you’re never able to reproduce them. It’s not uncommon to fix a problem without ever understanding why the problem occurred in the first place.
  4. For anyone working with climate model output, one of the best tools to have in your arsenal is the combination of IDL and NetCDF. Hardly an hour of work goes by when I don’t use one or both of these programming tools in some way.
  5. Developing model code for the first time is a lot like moving to a new city. At first you wander around aimlessly, clutching your map and hesitantly asking for directions. Then you begin to recognize street names and orient yourself around landmarks. Eventually you’re considered a resident of the city, as your little house is there on the map with your name on it. You feel inordinately proud of the fact that you managed to build that house without burning the entire city down in the process.
  6. The RCP 8.5 scenario is really, really scary. Looking at the output from that experiment is enough to give me a stomachache. Let’s just not let that scenario happen, okay?
  7. It’s entirely possible to get up in the morning and just decide to be enthusiastic about your work. You don’t have to pretend, or lie to yourself – all you do is consciously choose to revel in the interesting discoveries, and to view your setbacks as challenges rather than chores. It works really well, and everything is easier and more fun as a result.
  8. Climate models are fabulous experimental subjects. If you run the UVic model twice with the same code, data, options, and initial state, you get exactly the same results. (I’m not sure if this holds for more complex GCMs which include elements of random weather variation.) For this reason, if you change one factor, you can be sure that the model is reacting only to that factor. Control runs are completely free of external influences, and deconstructing confounding variables is only a matter of CPU time. Most experimental scientists don’t have this element of perfection in their subjects – it makes me feel very lucky.
  9. The permafrost is in big trouble, and scientists are remarkably calm about it.
  10. Tasks that seem impossible at first glance are often second nature by the end of the day. No bug lasts forever, and no problem goes unsolved if you exert enough effort.

Read Full Post »

I recently started working for the summer, with Andrew Weaver’s research group at the University of Victoria. If you’re studying climate modelling in Canada, this is the place to be. They are a fairly small group, but continually churn out world-class research.

Many of the projects here use the group’s climate model, the UVic ESCM (Earth System Climate Model). I am working with the ESCM this summer, and have previously read most of the code, so I feel pretty well acquainted with it.

The climate models that most people are familiar with are the really complex ones. GCMs (General Circulation Models or Global Climate Models, depending on who you talk to) use high resolution, a large number of physical processes, and relatively few parameterizations to emulate the climate system as realistically as possible. These are the models that take weeks to run on the world’s most powerful supercomputers.

EMICs (Earth System Models of Intermediate Complexity) are a step down in complexity. They run at a lower resolution than GCMs and have more paramaterizations. Individual storms and wind patterns (and sometimes ocean currents as well) typically are not resolved – instead, the model predicts the statistics of these phenomena. Often, at least one component (such as sea ice) is two-dimensional.

The UVic ESCM is one of the most complex EMICs – it really sits somewhere between a GCM and an EMIC. It has a moderately high resolution, with a grid of 3.6° longitude by 1.8° latitude (ten thousand squares in all), and 19 vertical layers in the ocean. Its ocean, land, and sea ice component would all belong in a GCM. It even has a sediment component, which simulates processes that most GCMs ignore.

The only reason that the UVic model is considered an EMIC is because of its atmosphere component. This part of the model is two-dimensional and parameterizes most processes. For example, clouds aren’t explicitly simulated – instead, as soon as the relative humidity of a region reaches 85%, the atmospheric moisture falls out as rain (or snow). You would never see this kind of atmosphere in a GCM, and it might seem strange for scientists to deliberately build an unrealistic model. However, this simplified atmosphere gives the UVic ESCM a huge advantage over GCMs: speed.

For example, today I tested out the model with an example simulation. It ran on a Linux cluster with 32 cores, which I accessed remotely from a regular desktop. It took about 7 minutes of real time to simulate each year and record annual averages for several dozen variables. In comparison, many GCMs take an entire day of real time to simulate a year, while running on a machine with thousands of cores. Most of this work is coming from the atmospheric component, which requires short time steps. Consequently, cutting down on complexity in the atmosphere gives the best return on model efficiency.

Because the UVic model is so fast, it’s suitable for very long runs. Simulating a century is an “overnight job”, and several millennia is no big deal (especially if you run it on WestGrid). As a result, long-term processes have come to dominate the research in this lab: carbon cycle feedbacks, sensitivity studies, circulation in the North Atlantic. It simply isn’t feasible to simulate these millennial-scale processes on a GCM – so, by sacrificing complexity, we’re able to open up brand new areas of research. Perfectly emulating the real world isn’t actually the goal of most climate modelling.

Of course, the UVic ESCM is imperfect. Like all models, it has its quirks – an absolute surface temperature that’s a bit too low, projections of ocean heat uptake that are a bit too high. It doesn’t give reliable projections of regional climate, so you can only really use globally or hemispherically averaged quantities. It’s not very good at decadal-scale projection. However, other models are suitable for these short-term and small-scale simulations: the same GCMs that suffer when it comes to speed. In this way, climate models perform “division of labour”. By developing many different models of varying complexity, we can make better use of the limited computer power available to us.

I have several projects lined up for the summer, and right now I’m reading a lot of papers to familiarize myself with the relevant sub-fields. There have been some really cool discoveries in the past few years that I wasn’t aware of. I have lots of ideas for posts to write about these papers, as well as the projects I’m involved in, so check back often!

Read Full Post »

Older Posts »

Follow

Get every new post delivered to your Inbox.

Join 399 other followers