Adastra user documentation
Warning
TO READ BEFORE CONTINUING: Know that the documentation is always a work in progress and may be slightly behind, or slightly ahead of, the current state of the machine. For instance, as of 2023/01/23, some SLURM options such as implicit binding (when you ask SLURM to do the binding for you) may not work properly. For binding, refer to Proper binding, why and how. On a final note, do not forget to check the Known issues.
Note
The CINES High Performance Computing (HPC) support team remains open and available for support. Users should follow normal support procedures when reporting issues or requesting help. Consider the guidelines for Contacting CINES.
Ask for help by opening a ticket using one of the two following options:
- Emailing us at svp@cines.fr;
- Using this GUI.
The manual can be seen as the external specification of the product; it should describe every detail the user sees. As such, it is the chief product for CINES. The style must be precise, full and accurately detailed. The definitions must repeat the essentials, yet all definitions must agree. This tends to make manuals dull reading, but precision is more important than liveliness. The ideas of about twenty men are cast into the manual, but only one or two convert them into prose if the consistency of the prose and product is to be maintained.
Extract from The Mythical Man-Month by Fred Brooks.
This document is intended as a reference for CINES’ computing community: les Chercheurs CalCulant au Cines (C4). However, we are still far from the level of completion described above. We welcome comments on missing information, technical and typographical errors, etc.
To follow this documentation, you need a basic understanding of Unix-like shells, of the Bash implementation, and of the concept of environment variables; you may find learning material here.
A big problem with much documentation is that you often need to know the content beforehand to make sense of it when you read it. It is a very good aide-memoire, but if you don’t know what you’re doing, it is pretty much useless. Teaching materials are often what is lacking; that said, we do not want to reduce our users to reading "monkey see, monkey do" documentation. As such, we may be a bit verbose in our explanations (explaining why something exists, how it works and how to use it), hoping that, in the long run, true knowledge can be shared.
You may benefit from reading the documentation linearly, but assuming you know what to look for, you should make use of the search bar functionality or the left-hand side index.
This document was last updated: Thursday, 21 September 2023 11:00:18 +0200.
Table of contents
- Accessing Adastra
- Programming environment
- Module, why and how
- CrayPE basics
- Changing CrayPE version
- Cray compiler wrapper
- PrgEnv and compilers
- PrgEnv subtleties
- Mixing environments
- Targeting an architecture
- CINES Spack modules
- Using Cray’s MPICH
- Cray MPICH and ROCm compatibility
- OpenMP
- OpenMP GPU Offload
- HIP
- HIP + OpenMP CPU Threading
- Some other compilation flags
- Spack
- Running jobs
- The SLURM batch scheduler and job launcher
- Batch scripts
- Common SLURM submission options
- Resource consumption and charging
- Quality Of Service (QOS) queues
- srun
- Interactive jobs
- Chained job
- Other common SLURM commands
- Job state
- Job reason codes
- Monitoring and modifying batch jobs
- Process and thread mapping
- Core-dump files
- External documentation and training resources
- Software engineering
- Algorithm and generic programming
- Code quality
- The C and C++ languages
- Floating point computation
- Concurrency for parallel software
- Compiler infrastructure
- Generating (pseudo) random numbers
- CPU programming
- GPU programming
- Notions of debugging
- The Bash Unix shell
- On the issues encountered in software engineering
- Network programming, distributed and shared memory abstractions
- Choosing a license
- Other
- Software engineering