Progress?

I have made slight headway regarding my installation of CESM. It still isn’t running, but now it’s not running for a different reason than previously! Progress!

It appears that, at some point while porting, I mangled the scripts/ccsm_utils/Machines/mkbatch.kate file for my machine such that the actual call to launch the model wasn’t getting copied from mkbatch.kate to test.kate.run. A bit of trial and error fixed that problem.

I finally got Torque working. The only reason that jobs were getting stuck in the queue was that I didn’t start the pbs_sched daemon! It turns out that qsub isn’t related to the problems I was having, and isn’t necessary to run the model, but it’s nice to have it working just in case I need it in the future.

So, with the relevant call in test.kate.run as

mpiexec -n 16 ./ccsm.exe >&! ccsm.log.$LID

the command line output is

Wed July 6 11:02:33 EDT 2011 -- CSM EXECUTION BEGINS HERE
Wed July 6 11:02:34 EDT 2011 -- CSM EXECUTION HAS FINISHED
ls: No match.
Model did not complete - no cpl.log file present - exiting

The only log file created is ccsm.log, and it is completely empty.

I have MPICH2 installed, the command mpiexec seems to work fine, and I have mpd running. Regardless, I tried taking out mpiexec and calling the executable directly in test.kate.run:

./ccsm.exe >&! ccsm.log.$LID

The command line output becomes

Wed July 6 11:02:33 EDT 2011 -- CSM EXECUTION BEGINS HERE
Segmentation fault.
Wed July 6 11:02:34 EDT 2011 -- CSM EXECUTION HAS FINISHED
ls: No match.
Model did not complete - no cpl.log file present - exiting

Again, ccsm.log is empty, and there seems to be no trace of why the model is failing to launch beyond Segmentation fault. The CESM guide recommends setting the stack size to unlimited, which I did to no avail. Submitting test.kate.run using qsub produces the same messages, but in the output and error files, rather than the terminal.

Thoughts?

Advertisement