Calculation stuck at the end

Post by Flex » Sat Jun 10, 2017 5:37 pm

Hello,

I am running a medium-to-heavy calculation of GW corrections: about 75 qpts over 6 bands. 300 bands are used.

The input, log and output are attached. You can see that the calculation itself finished after 18 hours, but then got stuck at the writing phase. After idling, it was killed at the 24-hour limit of the machine I work on.

Is this some kind of netCDF limitation? Is the I/O really that slow?

Note that I use quite a lot of processors.

If I restart the calculation, will it redo the GW part or just retry the write?

Thanks in advance
Thierry Clette
Student at Université Libre de Bruxelles, Belgium
Flex
 
Posts: 37
Joined: Fri Mar 25, 2016 4:21 pm

Re: Calculation stuck at the end

Post by Daniele Varsano » Sun Jun 11, 2017 7:44 am

Dear Thierry,
My impression is that not all the CPUs finished their tasks, i.e. that the calculation is unbalanced.
You have 6 bands and 75 k-points, which means 450 QP corrections. I suggest changing the parallelization strategy so that the number of CPUs assigned to "qp" is a factor of 450. Keeping the number of CPUs you are using, a possible strategy could be:
SE_CPU= "1 6 76"
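
As a rough illustration of the arithmetic behind this suggestion, the sketch below lists the factors of 450 and compares how many corrections each "qp" group would receive for a given group count. It is not Yambo code: the helper names `divisors` and `qp_load` are hypothetical, and the "as even as possible" split is an assumption made for illustration, not a description of how Yambo actually distributes the work.

```python
import math


def divisors(n):
    """Return all positive divisors of n in ascending order."""
    small = [d for d in range(1, math.isqrt(n) + 1) if n % d == 0]
    large = [n // d for d in reversed(small) if d * d != n]
    return small + large


def qp_load(n_corrections, n_qp_groups):
    """Return (max, min) corrections per group for an as-even-as-possible split.

    Assumption: the corrections are dealt out evenly, so groups differ by at
    most one correction. The real distribution inside Yambo may differ.
    """
    base, extra = divmod(n_corrections, n_qp_groups)
    return (base + (1 if extra else 0), base)


n_bands, n_kpts = 6, 75
n_corrections = n_bands * n_kpts  # 450 QP corrections, as counted above

print("corrections:", n_corrections)
print("factors:", divisors(n_corrections))
print("6 qp groups:", qp_load(n_corrections, 6))  # equal shares
print("7 qp groups:", qp_load(n_corrections, 7))  # unequal shares
```

With a group count that divides 450 (such as 6), every group gets the same number of corrections, so no group is left waiting for the others at the end. With a non-factor (such as 7), some groups carry one more correction than the rest.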

When restarting, Yambo will read the ndb.pp database, but it will recalculate the QP corrections from scratch, as they have not been written to the databases.

Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/
Daniele Varsano
 
Posts: 2027
Joined: Tue Mar 17, 2009 2:23 pm

Re: Calculation stuck at the end

Post by Flex » Tue Jun 20, 2017 1:43 pm

Thanks a lot, it worked with these parameters.
Thierry Clette
Student at Université Libre de Bruxelles, Belgium
Flex
 
Posts: 37
Joined: Fri Mar 25, 2016 4:21 pm

