I have made slight headway regarding my installation of CESM. It still isn’t running, but now it’s not running for a different reason than previously! Progress!
It appears that, at some point while porting, I mangled the scripts/ccsm_utils/Machines/mkbatch.kate file for my machine such that the actual call to launch the model wasn’t getting copied from mkbatch.kate to test.kate.run. A bit of trial and error fixed that problem.
I finally got Torque working. The only reason that jobs were getting stuck in the queue was that I didn’t start the pbs_sched daemon! It turns out that qsub isn’t related to the problems I was having, and isn’t necessary to run the model, but it’s nice to have it working just in case I need it in the future.
So, with the relevant call in test.kate.run as
mpiexec -n 16 ./ccsm.exe >&! ccsm.log.$LID
the command line output is
Wed July 6 11:02:33 EDT 2011 -- CSM EXECUTION BEGINS HERE
Wed July 6 11:02:34 EDT 2011 -- CSM EXECUTION HAS FINISHED
ls: No match.
Model did not complete - no cpl.log file present - exiting
The only log file created is ccsm.log, and it is completely empty.
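An empty log after a one-second run suggests the process died before it wrote anything, so it might help to record at least how the launch exited. Below is a minimal sketch of such a wrapper. It is written in plain sh rather than the csh of test.kate.run, and run_and_report is a hypothetical helper name, not part of the CESM scripts. It leans on the usual shell convention that a status above 128 means the command was killed by signal (status - 128), although a program can also produce such a status by exiting with that code itself.

```shell
#!/bin/sh
# run_and_report LOG CMD [ARGS...]
# Hypothetical helper (not part of CESM): run CMD with stdout and stderr
# captured in LOG, then print how it exited and whether LOG is empty.
run_and_report() {
    log=$1
    shift
    "$@" > "$log" 2>&1
    status=$?
    if [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N usually means "killed by signal N".
        echo "killed by signal $((status - 128))"
    else
        echo "exit status $status"
    fi
    if [ ! -s "$log" ]; then
        echo "log $log is empty"
    fi
    # Pass the original status on to the caller.
    return "$status"
}
```

For example, run_and_report ccsm.log.test mpiexec -n 16 ./ccsm.exe would run the same launch line (ccsm.log.test is just a placeholder name). A reported signal 11 is SIGSEGV on Linux.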
I have MPICH2 installed, the command mpiexec seems to work fine, and I have mpd running. Regardless, I tried taking out mpiexec and calling the executable directly in test.kate.run:
./ccsm.exe >&! ccsm.log.$LID
The command line output becomes
Wed July 6 11:02:33 EDT 2011 -- CSM EXECUTION BEGINS HERE
Segmentation fault.
Wed July 6 11:02:34 EDT 2011 -- CSM EXECUTION HAS FINISHED
ls: No match.
Model did not complete - no cpl.log file present - exiting
Again, ccsm.log is empty, and there seems to be no trace of why the model is failing to launch beyond Segmentation fault. The CESM guide recommends setting the stack size to unlimited, which I did to no avail. Submitting test.kate.run using qsub produces the same messages, but in the output and error files, rather than the terminal.
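One thing that might be worth double-checking is whether the unlimited stack size is actually in effect in the shell that launches ccsm.exe, since resource limits are inherited from the launching process and a batch job may not see the limit set in an interactive terminal. Here is a rough sh sketch for printing the limit a launched command would inherit. The name report_stack_limit is hypothetical, and ulimit -s is a common shell extension (bash, dash) rather than strict POSIX. In a csh script such as test.kate.run, the corresponding builtin is limit stacksize unlimited.

```shell
#!/bin/sh
# Hypothetical helper: print the soft stack limit that commands started
# from this shell will inherit ("unlimited" or a size in kilobytes).
report_stack_limit() {
    limit=$(ulimit -s)
    if [ "$limit" = "unlimited" ]; then
        echo "stack limit: unlimited"
    else
        echo "stack limit: ${limit} KB (try: ulimit -s unlimited)"
    fi
}
```

Calling report_stack_limit just before the launch line, in the same script, would show whether the setting survived into that shell.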
Thoughts?