This is a question about Quantum Espresso, but as it is specific to preliminary NSCF runs for Yambo (GW), I thought I would ask it here.

I work with a system that is a bit problematic: it is a 2D monolayer with a lot of empty space, and I need a large number of high-lying bands to converge GW accurately, i.e. about 3000-4000 bands, and twice that with spin-orbit. I also have to achieve good convergence up to the highest bands. This is what a typical input looks like:

----------------------------------------------------------
&control
calculation = 'nscf'
wf_collect= .true.
prefix = 'MoS2',
pseudo_dir='./'
/
&system
ibrav = 4, celldm(1) = 6.0203189746E+00, celldm(3) = 6.644166E+00, nat = 3, ntyp = 2,
!relax Abinit
ecutwfc = 76,
nbnd=3000,
occupations = 'smearing', smearing = 'gaussian', degauss = 0.001,
noncolin=.true.
lspinorb=.true.
starting_magnetization(1) = 0.1d0 ,
nosym = .false.
force_symmorphic= .true.
/
&electrons
diago_full_acc = .true.
diago_thr_init = 1.0e-10
!conv_thr = 1.0d-10,
/
ATOMIC_SPECIES
Mo 95.96 Mo-sp_r.upf
S 32.06 S_r.upf
ATOMIC_POSITIONS (bohr)
Mo 1.3322676296E-15 3.4758327806E+00 8.7435638920E-20
S 3.0101594873E+00 1.7379163904E+00 2.9540721569E+00
S 3.0101594873E+00 1.7379163904E+00 -2.9540721569E+00
K_POINTS AUTOMATIC
9 9 1 0 0 0
------------------------------------------------------

I use a "conv_thr" of about 1.0d-10 in the SCF to have good convergence, and I wanted to achieve a similar precision for the NSCF (because I also use a similar threshold in Abinit, i.e. "tolwfr 1.0d-18"). However, this means a "diago_thr_init" of about 1.0d-16. Up to now I have only managed to get down to "diago_thr_init = 1.0d-10"; when I tried smaller values, I got the standard crash "Error in routine cdiaghg (10503): problems computing cholesky", which as far as I understand means the diagonalization cannot converge to such a small threshold.
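For comparison, if I read the pw.x input description correctly, when "diago_thr_init" is not set in a non-SCF run the default threshold is (conv_thr / N_elec) / 10. Below is a minimal sketch of that arithmetic. The function name is hypothetical, and the electron count of 26 is my assumption for these pseudopotentials (14 valence electrons for Mo with semicore states, 6 for each S), not something taken from the output.

```python
# Sketch only: default NSCF diagonalization threshold implied by conv_thr,
# assuming the documented pw.x default ethr = (conv_thr / N_elec) / 10.

def default_nscf_ethr(conv_thr: float, n_elec: float) -> float:
    """Return the eigenvalue threshold used when diago_thr_init is not set."""
    if n_elec <= 0:
        raise ValueError("n_elec must be positive")
    return conv_thr / n_elec / 10.0


# ASSUMPTION: 26 valence electrons (Mo with semicore s,p: 14; S: 6 each).
N_ELEC_ASSUMED = 26.0

ethr = default_nscf_ethr(1.0e-10, N_ELEC_ASSUMED)
print(f"default NSCF ethr = {ethr:.2e}")  # prints: default NSCF ethr = 3.85e-13
```

Under these assumptions, the value I managed to reach (1.0d-10) is still looser than what pw.x would pick by default from conv_thr = 1.0d-10.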

So, I was wondering:

1) Am I simply going too far, and is a threshold of 1.0d-10 sufficiently converged? How should I check the "quality" of the computed bands?

2) Is there a way to optimize the calculation for so many bands, for example by adding unconverged buffer bands on top? Perhaps that would help to reach a tighter threshold (if it is needed)?
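To make question 1 concrete, the check I have in mind is to repeat the NSCF run with a different "diago_thr_init" and compare the eigenvalues band by band. Here is a minimal sketch, assuming the eigenvalues of both runs have already been parsed into nested lists (one list of band energies per k-point, same units). The function names and the toy numbers are hypothetical and not part of Quantum Espresso or Yambo.

```python
# Sketch only: compare eigenvalues from two NSCF runs (e.g. two thresholds)
# and report which bands move by more than a chosen tolerance.

def max_band_shift(eig_a, eig_b):
    """For each band, return the largest |E_a - E_b| over all k-points."""
    if len(eig_a) != len(eig_b) or not eig_a:
        raise ValueError("both runs need the same, non-zero number of k-points")
    n_bands = len(eig_a[0])
    shifts = [0.0] * n_bands
    for k_a, k_b in zip(eig_a, eig_b):
        if len(k_a) != n_bands or len(k_b) != n_bands:
            raise ValueError("every k-point needs the same number of bands")
        for i, (e_a, e_b) in enumerate(zip(k_a, k_b)):
            shifts[i] = max(shifts[i], abs(e_a - e_b))
    return shifts


def bands_above_tolerance(shifts, tol):
    """Return 1-based indices of bands whose shift exceeds tol."""
    return [i + 1 for i, s in enumerate(shifts) if s > tol]


# Toy data: 2 k-points, 3 bands; only the highest band differs between runs.
run_loose = [[-1.0, 0.5, 3.000], [-0.9, 0.6, 3.2]]
run_tight = [[-1.0, 0.5, 3.001], [-0.9, 0.6, 3.2]]

shifts = max_band_shift(run_loose, run_tight)
print(bands_above_tolerance(shifts, 1.0e-4))  # prints: [3]
```

If the highest bands are the only ones that move, that would also tell me how many bands at the top I should treat as a buffer and not trust in the GW sums.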

Thanks in advance