Frequently Asked Questions (FAQ)
General
Mention the use of EXPLOR in your communications:
* The computing resources were partially provided by the EXPLOR mesocentre hosted by the Université de Lorraine.
* High Performance Computing resources were partially provided by the EXPLOR centre hosted by the Université de Lorraine.
Please send the corresponding information to explor-contact@univ-lorraine.fr.
Connection
You can connect to your EXPLOR user account using SSH or X2Go (see the ssh and x2go instructions).
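For example, a basic SSH connection from a terminal looks like the following (the login and access node name are placeholders; use the values provided for your project):
# basic SSH connection to your access node
ssh your_login@your-access-node
# add -X to enable X11 forwarding for graphical applications
ssh -X your_login@your-access-node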
For connection issues, send an email describing the problem to explor-support@univ-lorraine.fr.
X2Go is software that provides graphical (desktop) remote access to your EXPLOR access node over SSH. Client versions are available for Windows, Linux and macOS. See http://wiki.x2go.org/doku.php/download:start for more information.
Two options are available for transferring data. They are detailed on the following page.
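As a generic sketch, standard SSH-based tools such as scp and rsync can be used from your local machine (these are illustrations, not necessarily the two options documented on that page; login and access node names are placeholders):
# copy a single file from your local machine to your EXPLOR home directory
scp results.dat your_login@your-access-node:~/
# synchronize a directory, preserving timestamps and allowing resumed transfers
rsync -av data/ your_login@your-access-node:~/data/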
Why is my access node named vm-XXX? To guarantee user, server, project and resource confidentiality, user accounts and projects are anonymized.
From your working environment (access node) you have internet access. Compute nodes, however, cannot communicate outside EXPLOR.
Each project has its own anonymized environment accessible via a dedicated virtual machine (access node). Connection settings therefore differ per project.
Procedures for transferring data between projects are explained on the following page.
An operating procedure explaining how to access your servers under SLURM will be provided when they are installed.
Resource usage
From your environment, use the SLURM job scheduler to submit jobs. See the examples at Example submission scripts.
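As an illustration, a minimal submission script could look like the sketch below (the partition, resources, module and program names are examples only; adapt them to your project):
#!/bin/bash
#SBATCH -J myjob                 # job name
#SBATCH -p std                   # partition (see Table of associations)
#SBATCH -N 1                     # number of nodes
#SBATCH -n 16                    # number of tasks
#SBATCH -t 1:00:00               # maximum walltime

# load the software environment needed by the job (example)
module load anaconda/3

srun ./my_program
Submit it with: sbatch myjob.sh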
Request a node reservation via salloc.
Example (request 1 node on partition std for 1 hour):
salloc -N1 -p std -t 1:00:00 srun --pty bash
Choose a partition according to job type and resource needs (CPU, memory, GPU, etc). See the list at: Table of associations.
In general:
- nodes from the old hf partition (cne[01-16]) are reserved for sequential or low-parallelism jobs (1 to 8 cores max).
- gpu partitions are suitable for jobs requiring GPUs.
If in doubt, contact support at explor-support@univ-lorraine.fr.
The maximum runtime depends on the partition and number of nodes chosen. See Available computing resources.
We recommend estimating runtime as accurately as possible to help SLURM optimize resource usage.
squeue shows job status (last column "REASON"). A queued job may show one of the following reasons:
- Resources: SLURM is reserving the resources and the job will start soon.
- Priority: other jobs are ahead of yours in the queue.
- QOSGrpJobsLimit: limits apply per partition, project or user depending on requested resources.
These limits are defined at Limitations on job duration and resources.
Your job will change status when those limits allow it.
The --start option of squeue can give an estimate of when SLURM expects the job to start.
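For example (the job ID below is a placeholder):
# list your own jobs with their state and REASON
squeue -u $USER
# show the estimated start time of a pending job
squeue --start -j 123456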
This usually means your SLURM submission parameters (partition, nodes, time, etc.) are incorrect. Check the technical documentation (Computing resources and Job submission) to verify that they are compatible.
The error "Invalid qos specification" means you requested a resource you don't have access to.
See Limitations on job duration and resources for details.
The module command loads software, compilers, etc., into the user environment. See usage at this page.
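Typical module usage, as a sketch:
module avail                     # list the software modules installed on EXPLOR
module load <name>/<version>     # add a module to your environment
module list                      # show the modules currently loaded
module purge                     # unload all loaded modules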
Several Python versions are available via module:
- python2:
module load anaconda/2 - python3:
module load anaconda/3 - Intel-optimized specific versions:
module load python/versionX/intel
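For instance, to work with Python 3 (a sketch using the module name listed above):
module load anaconda/3
python --version                 # check which interpreter is now active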
Choose a partition that provides GPUs, then request GPUs with the --gres sbatch option (e.g.: --gres=gpu:2 to request 2 GPUs).
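As a sketch, the relevant lines of a GPU submission script might be (the partition name below is illustrative; check the Table of associations for the GPU partitions available to your project):
#SBATCH -p gpu                   # a GPU partition
#SBATCH --gres=gpu:2             # request 2 GPUs on the node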