OpenMP performance for CCSDTQ, etc.
1 year 5 months ago - 1 year 5 months ago #1253
by TiborGY
Dear all,
For what it's worth, on systems with plenty of RAM and a fast NVMe SSD, the parallel scaling does not appear to be limited by disk access speed, but by the amount of time spent outside of parallel regions (or inside critical sections, which is serial either way).
Running a fairly large CCSDT calculation on 12 OMP threads, I observed the CPU usage alternating between 1 core and 12 cores.
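To put a rough number on this, here is a minimal sketch of Amdahl's law, which bounds the speedup when some fraction of the runtime stays serial. The serial fractions below are hypothetical illustration values, not measurements from MRCC, and the function name `amdahl_speedup` is made up for this example.

```python
def amdahl_speedup(serial_fraction: float, threads: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the runtime is serial."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # The serial part takes the same time; only the parallel part is divided up.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


if __name__ == "__main__":
    # Hypothetical serial fractions, chosen only to show the trend.
    for s in (0.0, 0.05, 0.10, 0.25):
        print(f"serial fraction {s:.2f}: "
              f"at most {amdahl_speedup(s, 12):.2f}x on 12 threads")
```

Even a modest serial fraction caps the achievable speedup well below the thread count, which would be consistent with CPU usage dropping to a single core for part of the run.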
I have not looked at the code yet, but as a first guess, finding a way to parallelize the remaining serial sections should improve parallel scaling, even if heavy IO is involved. Modern SSDs not only tolerate parallel IO, many of them can only reach their advertised peak throughput when servicing multiple IO requests in parallel (queue depth > 1).
Of course, as always, this is easier said than done :)
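As an illustration of the queue depth point, here is a minimal sketch of keeping several read requests in flight at once from Python. It has nothing to do with how MRCC does its IO; the names `read_parallel` and `_pread_full`, the chunk size, and the worker count are all assumptions made for this example. It relies on `os.pread`, which is only available on Unix-like systems.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def _pread_full(fd: int, length: int, offset: int) -> bytes:
    """Read up to `length` bytes at `offset`, retrying on short reads."""
    parts = []
    while length > 0:
        data = os.pread(fd, length, offset)
        if not data:  # end of file reached
            break
        parts.append(data)
        offset += len(data)
        length -= len(data)
    return b"".join(parts)


def read_parallel(path: str, chunk_size: int = 1 << 20, workers: int = 8) -> bytes:
    """Read a whole file with several positional reads in flight at once."""
    size = os.path.getsize(path)
    offsets = range(0, size, chunk_size)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # os.pread does not move a shared file position, so threads can
            # safely share one descriptor; map() returns chunks in offset order.
            chunks = pool.map(lambda off: _pread_full(fd, chunk_size, off), offsets)
            return b"".join(chunks)
    finally:
        os.close(fd)
```

Whether this is actually faster than a single sequential read depends on the drive, the filesystem, and whether the data is already in the page cache, so it should be benchmarked rather than assumed.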
1 year 4 months ago #1256
by kipeters
Just an update to this thread: it seems the issues I've been seeing with OpenMP performance are due to the interface with Molpro. Invoking mrcc by itself shows the expected thread activity. No fix yet, but now I know where to look.