
Structure of the SCITAS filesystems

  •  The structure and purpose of each filesystem are described on the File systems page.
  •  $HOME and $WORK are shared across the site, while $SCRATCH is local to each machine.
  •  On $SCRATCH, automatic deletion of files older than two weeks may happen without notice.
  •  Production jobs should run on $SCRATCH.
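
  •  For example, to list your files on $SCRATCH that fall under the two-week policy (a minimal sketch; -mtime +14 matches files not modified for more than 14 days):

    # List files under $SCRATCH not modified for more than 14 days;
    # these are candidates for the automatic cleanup described above.
    find $SCRATCH -type f -mtime +14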


What to do when CPU time and wall time are significantly different?
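
Before changing the I/O strategy it is worth quantifying the gap. SLURM's sacct can report both quantities for a finished job (a minimal sketch; the job ID below is only a placeholder):

    # Compare elapsed wall time with the CPU time actually consumed.
    # A TotalCPU far below Elapsed * NCPUS suggests time spent waiting, e.g. on I/O.
    sacct -j 159671 --format=JobID,Elapsed,TotalCPU,NCPUS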

  •  $SCRATCH is a GPFS parallel filesystem, designed to perform well with parallel I/O.
  •  In certain cases a large number of files is produced at runtime. This I/O pattern puts stress on $SCRATCH and can occasionally cause hardware failures.
  •  A proper solution would require the use of external libraries such as HDF5 or ADIOS, which give flexibility in how data are saved and handled.
  •  A simpler workaround is to use the node-local filesystem via $TMPDIR.
  •  $TMPDIR is set only once resources are allocated. You can verify this by querying its value after login; it is empty:
    ssh fidis   
    [nvarini@fidis ~]$ echo $TMPDIR

    [nvarini@fidis ~]$



  •  However, inside an interactive allocation (here via Sinteract) it is set:


    [nvarini@fidis ~]$ Sinteract
    Cores:            1
    Tasks:            1
    Time:             00:30:00
    Memory:           4G
    Partition:        parallel
    Account:          scitas-ge
    Jobname:          interact
    Resource:         
    QOS:              normal
    Reservation:      
    salloc: Pending job allocation 159671
    salloc: job 159671 queued and waiting for resources
    salloc: job 159671 has been allocated resources
    salloc: Granted job allocation 159671
    srun: Job step created

    [nvarini@f061 ~]$ echo $TMPDIR
    /tmp/159671



  •  The variables $TMPDIR, $WORK and $SCRATCH are set by the SLURM prescheduler.
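
  •  A short test job can confirm that the variables are set inside an allocation (a minimal sketch; the time limit and echo lines are illustrative):

    #!/bin/bash
    #SBATCH --time=00:01:00
    # Print the variables provided inside the allocation.
    echo "TMPDIR=$TMPDIR"
    echo "WORK=$WORK"
    echo "SCRATCH=$SCRATCH"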


How to use $TMPDIR in your simulations

  • The following example shows how to use $TMPDIR with Quantum ESPRESSO (QE).
  • QE relies on Fortran namelists to read certain parameters used during the simulation.
  • The only change needed in a standard pw.x input concerns the outdir variable in the &CONTROL namelist. For example, in the input below outdir is set to the placeholder fakeoutdir:



    &CONTROL
      calculation = 'scf',
      restart_mode = 'from_scratch',
      prefix = 'lgps_diel'
      tstress = .false.
      tprnfor = .false.
      outdir = 'fakeoutdir'
      pseudo_dir = '/scratch/nvarini/pseudo'
      disk_io = 'low'
      max_seconds = 1800
    /



  • The submission script would look like this:


    #!/bin/bash
    #SBATCH --nodes 2
    #SBATCH --time=1:00:00
    #SBATCH -p debug

    module purge
    module load intel/16.0.3
    module load intelmpi/5.1.3
    module load fftw/3.3.4-mpi
    module load mkl/11.3.3

    # Replace the fakeoutdir placeholder with the job-specific $TMPDIR
    sed "s|fakeoutdir|${TMPDIR}|g" temp_pw > ${TMPDIR}/${SLURM_JOB_ID}_pw
    # Run QE, writing its scratch data to the node-local $TMPDIR
    srun pw.x < ${TMPDIR}/${SLURM_JOB_ID}_pw > ${TMPDIR}/${SLURM_JOB_ID}.tmp.out
    # Archive the results from $TMPDIR back to the submission directory
    tar cvf ${SLURM_JOB_ID}.archive.tar ${TMPDIR}/*
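
  • Assuming the script above is saved as, e.g., job.sh (the name is arbitrary), it can be submitted with sbatch and the archive unpacked once the job has finished. Note that $TMPDIR is local to each node, so the tar command above only captures files written on the node where the batch script runs:

    # Submit the job; SLURM prints the job ID.
    sbatch job.sh
    # After completion, unpack the archived results (use the printed job ID).
    tar xvf <jobid>.archive.tar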



  • After the sed command the &CONTROL namelist looks like:

    &CONTROL
      calculation = 'scf',
      restart_mode = 'from_scratch',
      prefix = 'lgps_diel'
      tstress = .false.
      tprnfor = .false.
      outdir = '/tmp/1325324'
      pseudo_dir = '/scratch/marcolon/test_LGPS/pseudo'
      disk_io = 'low'
      max_seconds = 1800
    /
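
  • The substitution can be checked by hand before submitting, using a placeholder value for $TMPDIR (purely illustrative; /tmp/159671 stands in for a real job-specific directory):

    # Dry-run of the substitution with a stand-in value for $TMPDIR.
    TMPDIR=/tmp/159671
    sed "s|fakeoutdir|${TMPDIR}|g" temp_pw | grep outdir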





  • Throughput for a single 100 GB file, measured as write into $TMPDIR and copy from $TMPDIR to /scratch (all results in MB/s):

    Machine           Write into $TMPDIR    Copy from $TMPDIR to /scratch
    deneb (E5v2)      76                    74
    eltanin (E5v3)    109                   103
    fidis             529                   498
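
  • Figures of this kind can be reproduced with a simple streaming test, for example with dd (a minimal sketch, not necessarily the benchmark used above; a 1 GB file is used here instead of 100 GB to keep the test short):

    # Write a 1 GB file into $TMPDIR and report the throughput.
    dd if=/dev/zero of=$TMPDIR/ddtest bs=1M count=1024 conv=fdatasync
    # Time the copy from $TMPDIR to /scratch.
    time cp $TMPDIR/ddtest $SCRATCH/ddtest
    # Clean up.
    rm $TMPDIR/ddtest $SCRATCH/ddtest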




