If you have problems during the execution of MRCC, please attach the output with an adequate description of your case, as well as the following:
  • the way mrcc was invoked
  • the way build.mrcc was invoked
  • the output of build.mrcc
  • compiler version (for example: ifort -V, gfortran -v)
  • blas/lapack versions
  • gcc and glibc versions

This information really helps us during troubleshooting :)
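
The version details in the list above can be gathered in one pass. The following is a minimal sketch, not an official MRCC tool: the function name collect_versions is hypothetical, the compiler commands are the ones named in the list, and "ldd --version" is assumed to report the glibc version, which holds on glibc-based Linux systems. BLAS/LAPACK versions depend on the library in use and are not covered.

```python
import shutil
import subprocess

# Commands from the list above. "gcc --version" and "ldd --version" are
# assumptions; the latter reports glibc only on glibc-based Linux systems.
VERSION_COMMANDS = {
    "ifort": ["ifort", "-V"],
    "gfortran": ["gfortran", "-v"],
    "gcc": ["gcc", "--version"],
    "glibc": ["ldd", "--version"],
}


def collect_versions(commands=VERSION_COMMANDS, timeout=30):
    """Run each version command and return {label: output text or None}."""
    report = {}
    for label, argv in commands.items():
        # Skip tools that are not installed instead of raising.
        if shutil.which(argv[0]) is None:
            report[label] = None
            continue
        try:
            result = subprocess.run(
                argv,
                capture_output=True,
                text=True,
                stdin=subprocess.DEVNULL,
                timeout=timeout,
                check=False,
            )
        except (OSError, subprocess.TimeoutExpired) as exc:
            report[label] = f"failed: {exc}"
            continue
        # Several compilers print their version banner to stderr.
        report[label] = (result.stdout + result.stderr).strip()
    return report


if __name__ == "__main__":
    for label, text in collect_versions().items():
        print(f"== {label} ==")
        print(text if text is not None else "not found on PATH")
```

The printed report can be pasted into a post together with the MRCC output.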

Problems requesting large memory for CCSDT(Q)

8 years 11 months ago #134 by bakowies
Dear developers,

I have been running into problems with CCSDT(Q) calculations
on closed-shell, neutral molecules, using the stand-alone MRCC
code (V. 2014-07-10; compile log appended; for comments on
the 2015-02-04 version see below) when requesting large amounts
of memory.

If, on one particular machine with 256 GB memory and 24 cores, I
run the sample input of acetylene (below) requesting 60000 MB of
memory, it just runs fine. If I increase the request up to 123000 MB,
it runs fine as well. However, if I go to 124000 MB or beyond, then
the calculation runs into a problem at the very beginning (when
generating an initial SCF guess). Below I include the relevant section
of the output (Sample output 1).

Note that the last 5 lines (2 empty, 3 with messages "Fatal error in...",
"echo ...", "Program will stop.") are repeated over and over again,
at a fast pace. The output file easily grows to hundreds of MB
within minutes!

Obviously this much memory is not needed for the small example demonstrated
here, but the very same problem occurs for all larger cases for which I need
a lot of memory. The problem always appears to occur at the same cutoff on
that particular cluster (up to ca. 123 GB fine, problems beyond). I experience
the same problems on other machines, but at different cutoffs (usually smaller,
around 40 GB). Note that the memory I request in the input is always less than
both what I request in the batch queuing script and what the machine actually has.
Obviously this problem prevents me from running certain large calculations, as the
memory requirements scale (almost) linearly with the number of OMP cores used and
I wish to use 16-24 cores to keep the total wall-clock time within queuing system limits.
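
Since the cutoff differs from machine to machine, it can be located with a simple bisection over the requested memory. The sketch below is only an illustration: find_cutoff and the runs_ok callback are hypothetical names, runs_ok stands for "launch a short test job with this memory request and report whether it got past the initial guess", and the code assumes there is a single threshold below which every request works.

```python
def find_cutoff(runs_ok, low_mb, high_mb, resolution_mb=1000):
    """Return the largest tested memory request (in MB) that still works.

    runs_ok is a hypothetical callback: runs_ok(mb) -> bool.
    Assumes runs_ok(low_mb) is True, runs_ok(high_mb) is False, and that
    there is exactly one threshold between them.
    """
    if not runs_ok(low_mb):
        raise ValueError("lower bound must be a working request")
    if runs_ok(high_mb):
        raise ValueError("upper bound must be a failing request")
    # Invariant: low_mb works, high_mb fails.
    while high_mb - low_mb > resolution_mb:
        mid_mb = (low_mb + high_mb) // 2
        if runs_ok(mid_mb):
            low_mb = mid_mb
        else:
            high_mb = mid_mb
    return low_mb


if __name__ == "__main__":
    # Stand-in predicate mimicking the behavior reported above
    # (requests up to 123000 MB work, larger ones fail).
    print(find_cutoff(lambda mb: mb <= 123000, 60000, 124000))
```

With the bounds reported above (60000 MB works, 124000 MB fails) this needs only a handful of test runs per machine.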

I have tested the same example with the new version of MRCC (2015-02-04, compiled
the same way on the same machine), and I experience similar problems, actually
a little worse:

If I request 65000 MB or less, the job runs fine. If I request 124000 MB, it
shows behavior very similar to that described above, but the output is a little
different (Sample output 2).

Apparently the error appears even earlier now and the message
"Fatal error in pwd > junkscript."
is replaced by
"Fatal error in which dmrcc > mrccjunk1".

If I request 85000 MB, 100000 MB, or 120000 MB, the program proceeds until
"Generating initial guess for the SCF calculation..." but produces no further
output. I waited at least 10 minutes before killing the job.
If I request 123000 MB, I get the final message:

Fatal error in cd mrccjunk/; ./junkscript > /dev/null.
Program will stop.

but no endless repetition of the 5 lines mentioned above.


Do you have any idea what the problem might be and how it may be
fixed?

Thank you,
best regards,
Dirk Bakowies

File attachment: mail.attach.txt (5 KB)

8 years 11 months ago #135 by kallay
Dear Dirk,
Please set scfiguess=ao in the input. A better solution will come soon.
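
As a sketch of how this setting could be applied to many inputs at once, the helper below adds or replaces a keyword=value line in the text of an input file. The function name set_keyword is hypothetical, the sample input is an illustrative placeholder rather than the acetylene input from the attachment, and the code assumes one keyword=value pair per line.

```python
def set_keyword(input_text, key, value):
    """Return input_text with key=value set.

    An existing "key=..." line is replaced; otherwise the new line is
    placed at the top so it cannot end up inside a geometry block.
    """
    new_line = f"{key}={value}"
    lines = input_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower().startswith(key.lower() + "="):
            lines[i] = new_line
            return "\n".join(lines) + "\n"
    return "\n".join([new_line] + lines) + "\n"


if __name__ == "__main__":
    # Illustrative placeholder input, not the one from the attachment.
    sample = "calc=CCSDT(Q)\nmem=123000MB\n"
    print(set_keyword(sample, "scfiguess", "ao"))
```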

Best regards,
Mihaly Kallay

8 years 11 months ago #136 by bakowies
Dear Mihaly,

I didn't quite expect a working solution within 10 minutes! Thanks a lot,

Dirk

8 years 11 months ago #137 by kallay
Dear Dirk,
You can find a file integ.f in the download area. If you download it and recompile the program, it will fix the problem.
Thank you very much for reporting this.

Best regards,
Mihaly Kallay

7 years 6 months ago #286 by Nike
Dear Colleagues,
I am running into the same problem with the 2016-07-15 version.

On a machine with 768GB of RAM, I get:
Code:
 ************************ 2016-08-30 19:15:55 *************************
 Executing integ...
 Allocation of 700.0 Gbytes of memory...
 Fatal error in which dmrcc > mrccjunk1.
 Program will stop.
 Fatal error in echo " ************************ "`date +"%F %T"`" *************************".
 Program will stop.

With the last 3 lines repeating ad infinitum.

I looked on the Download MRCC page for "integ.f" but I do not see it there.

Is there a different workaround?
With best wishes,
Nike Dattani

7 years 6 months ago #287 by kallay
Dear Nike,
That is strange. Perhaps the machine's disk is full, or the Unix "which" command does not work for some reason. Please check both.
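
Both checks can be scripted. This is a minimal sketch under stated assumptions: check_environment is a hypothetical helper, it looks up dmrcc on the PATH the same way "which dmrcc" would, and it reports the free space of the directory the job runs in (pass the scratch directory if it is different).

```python
import shutil


def check_environment(program="dmrcc", workdir="."):
    """Report where `program` is found on PATH and the free disk space."""
    usage = shutil.disk_usage(workdir)
    return {
        # None means the lookup failed, as "which" would fail.
        "program_path": shutil.which(program),
        "free_bytes": usage.free,
    }


if __name__ == "__main__":
    info = check_environment()
    print("dmrcc found at:", info["program_path"])
    print("free space (GB):", info["free_bytes"] / 1e9)
```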

Best regards,
Mihaly Kallay
