Nodes, cores and yambo parallelization

Post by Flex » Sun Jul 17, 2016 11:45 pm

Hello,

I finally updated to Yambo 4.0.2 and am trying to use the new parallelization scheme. I'm working on 16 nodes × 16 cores (so 256 CPUs?) and plan to use more later. The batch script is attached.

I read the tutorials carefully and set the parallelization variables (for a GW calculation) to:

X_all_q_CPU= "4 8 8 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
X_all_q_nCPU_invert=0 # [PARALLEL] CPUs for matrix inversion
SE_CPU= "2 32 4" # [PARALLEL] CPUs for each role
SE_ROLEs= "q qp b" # [PARALLEL] CPUs roles (q,qp,b)

(see also input attached)

These values are all powers of 2, and the product in each case equals 256.
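As a quick sanity check of that arithmetic, here is a minimal Python sketch. The helper name `check_layout` is made up for illustration and is not part of Yambo, and the powers-of-2 test only mirrors the rule of thumb stated above; it is not taken from the Yambo documentation.

```python
from math import prod

def check_layout(cpu_string, n_procs):
    """Return (product, all_powers_of_two, matches_n_procs) for a CPU string such as "4 8 8 1"."""
    counts = [int(tok) for tok in cpu_string.split()]
    total = prod(counts)
    # n & (n - 1) == 0 is true exactly for positive powers of two (1 counts as 2**0).
    powers_of_two = all(n > 0 and n & (n - 1) == 0 for n in counts)
    return total, powers_of_two, total == n_procs

# Values copied from the input above; 256 = 16 nodes x 16 cores.
layouts = {
    "X_all_q_CPU": "4 8 8 1",
    "SE_CPU": "2 32 4",
}
results = {name: check_layout(cpus, 256) for name, cpus in layouts.items()}
for name, (total, pow2, ok) in results.items():
    print(f"{name}: product={total}, powers of 2: {pow2}, matches 256: {ok}")
```

Note that this only checks the numbers written in the input file; it says nothing about how many MPI processes the job actually starts.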

I still get this error:

<06s> P0001: CPU structure provided for the Response_G_space ENVIRONMENT is incomplete. Switching to defaults
P0001: [ERROR] STOP signal received while in :[05] Dynamic Dielectric Matrix (PPA)
P0001: [ERROR]Impossible to define an appropriate parallel structure

(see also log attached, the other 16 are similar)

Did I miss something in my CPU count? Is Yambo configured the right way?

BTW, is my processor distribution of the different tasks good enough?

Thanks in advance

Thierry Clette
Student at Université Libre de Bruxelles, Belgium

Re: Nodes, cores and yambo parallelization

Post by amolina » Mon Jul 18, 2016 8:34 am

Dear Thierry,
I am not sure, but it seems the job is not being submitted correctly: Yambo is not seeing the 256 processes. Maybe you need to add some options/flags to the mpirun command, such as:
mpirun -np 256 yambo -F GW.in
Best,
Alejandro.
Alejandro Molina-Sánchez
Physics and Materials Research Unit
Campus Limpertsberg
Université du Luxembourg

Re: Nodes, cores and yambo parallelization

Post by Daniele Varsano » Mon Jul 18, 2016 2:30 pm

Dear Thierry,
besides that:

"BTW, is my processor distribution of the different tasks good enough?"

I would avoid parallelizing over q points, so I would put 1 for the q role in both X_all_q_CPU and SE_CPU, as parallelizing over q can strongly unbalance your calculation.

Moreover:
%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1|121|  1|40|
%


Are you sure you need to calculate corrections for 40 bands at all k-points? That is 121 × 40 = 4840 corrections, which is a huge number.
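The count can be reproduced with a short Python sketch. The function `count_qp_corrections` is a hypothetical helper written for this illustration (it is not a Yambo utility), and it assumes a single QPkrange row of the form first k-point | last k-point | first band | last band, as in the input above.

```python
def count_qp_corrections(qpkrange_row):
    """Count (k-point, band) pairs in one QPkrange row of the form 'k1|k2|b1|b2|'."""
    fields = [int(tok) for tok in qpkrange_row.split("|") if tok.strip()]
    k_first, k_last, b_first, b_last = fields
    return (k_last - k_first + 1) * (b_last - b_first + 1)

# Row copied from the input above: 121 k-points, bands 1 to 40.
full = count_qp_corrections("1|121|  1|40|")
print(full)  # 4840

# Hypothetical narrower band window, purely for comparison (not a recommendation
# for this system): bands 15 to 25 at the same 121 k-points.
narrow = count_qp_corrections("1|121| 15|25|")
print(narrow)  # 1331
```

The number of corrections scales linearly with the width of the band window, so trimming bands that are not of interest reduces the work in direct proportion.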

If Ale's suggestion does not work, please post again, including your report file.
Best,
Daniele
Dr. Daniele Varsano
S3-CNR Institute of Nanoscience and MaX Center, Italy
MaX - Materials design at the Exascale
http://www.nano.cnr.it
http://www.max-centre.eu/

Re: Nodes, cores and yambo parallelization

Post by Flex » Tue Jul 19, 2016 6:29 pm

Here is another calculation, this time a BSE run (the first one that started), set up along the same lines, now with 4*4 CPUs, and I added the option "-np 16".

I still get the same error.

The input (in), report (r), log (l) and submit files are attached (the report is in the next post).

Also, thanks for the tips about roles and numbers of CPUs.

Re: Nodes, cores and yambo parallelization

Post by Flex » Tue Jul 19, 2016 6:33 pm

The report file is attached in 2 parts.

Re: Nodes, cores and yambo parallelization

Post by amolina » Thu Jul 21, 2016 12:49 pm

Dear Thierry,

In your input file I don't see the variables for the parallelization of the dielectric function. You can see at the end of the report file that Yambo complains when it starts the dielectric function calculation.

If you are running 16 processes, I recommend trying this:

X_all_q_CPU= "1 16 1 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)

or this:

X_all_q_CPU= "1 8 2 1" # [PARALLEL] CPUs for each role
X_all_q_ROLEs= "q k c v" # [PARALLEL] CPUs roles (q,k,c,v)
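Both suggestions keep the q role at 1 and have a product of 16. A minimal Python sketch that lists other layouts of the same kind follows. The helper `layouts_without_q` is hypothetical (not a Yambo tool), and restricting the values to powers of 2 is only the rule of thumb mentioned earlier in the thread, treated here as an assumption.

```python
def layouts_without_q(n_procs):
    """List "q k c v" CPU strings with q = 1 and power-of-2 k, c, v whose product is n_procs.

    Returns an empty list when n_procs is not itself a power of 2.
    """
    # Powers of 2 that divide n_procs.
    powers = [p for p in (2 ** i for i in range(n_procs.bit_length())) if n_procs % p == 0]
    found = []
    for k in powers:
        for c in powers:
            if (n_procs // k) % c:
                continue  # c must divide what is left after assigning k
            v = n_procs // (k * c)
            if v in powers:
                found.append(f"1 {k} {c} {v}")
    return found

candidates = layouts_without_q(16)
for layout in candidates:
    print(f'X_all_q_CPU= "{layout}"')
```

The sketch only enumerates layouts that are arithmetically consistent with 16 processes; which of them runs fastest depends on the system (number of k-points and bands) and is not something this code estimates.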

Cheers,
Alejandro.

Re: Nodes, cores and yambo parallelization

Post by Flex » Sat Jul 23, 2016 11:55 am

Hello all,

So far, the suggestions from amolina seem to work; thanks a lot for that.

I still have a question about q points.

Daniele Varsano wrote:
Dear Thierry,
besides that:

"BTW, is my processor distribution of the different tasks good enough?"

I would avoid parallelizing over q points, so I would put 1 for the q role in both X_all_q_CPU and SE_CPU, as parallelizing over q can strongly unbalance your calculation.

Moreover:
%QPkrange                    # [GW] QP generalized Kpoint/Band indices
  1|121|  1|40|
%

Are you sure you need to calculate corrections for 40 bands at all k-points? That is 121 × 40 = 4840 corrections, which is a huge number.

If Ale's suggestion does not work, please post again, including your report file.
Best,
Daniele


Regarding this suggestion: I reduced the number of bands, but I don't know how to reduce the q points. Since they come from the QE grid, I can't remove q points without damaging the structure. I think I would have to reduce the grid in the QE calculation instead. Wouldn't that reduce the precision of the scf and nscf calculations?

Thanks again for the answers

Re: Nodes, cores and yambo parallelization

Post by Daniele Varsano » Sat Jul 23, 2016 3:06 pm

Dear Flex,

As you argued, the q points cannot be reduced, otherwise the BZ integration would be wrong; and, as you say, a coarser grid would also reduce the precision of the ground-state calculations. I did not mean at all that you should reduce the q points.

What I was suggesting was:
1) Avoiding parallelization over q points. This does not mean discarding any of them; it is just a matter of tuning the parallelization strategy.
2) Asking whether you are really interested in calculating the GW corrections for all those bands. Please note that QPkrange is not a convergence parameter: it indicates the bands and k-points for which you want to calculate the QP corrections. Usually one is interested in the bands around the Fermi energy, or in more bands if they are needed for the BSE calculation, but the deepest bands are usually not of much interest. Of course, this is not a rule, and you may be interested in them for some reason. Calculating corrections for that number of points could be quite time-consuming.

Best,

Daniele