toil.lib.accelerators
Accelerator (i.e. GPU) utilities for Toil
Functions
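- have_working_nvidia_smi(): Return True if the nvidia-smi binary, from NVIDIA's CUDA userspace utilities, is installed and can be run successfully.
- get_host_accelerator_numbers(): Work out what accelerator is what.
- have_working_nvidia_docker_runtime(): Return True if Docker exists and can handle an "nvidia" runtime and the "--gpus" option.
- count_nvidia_gpus(): Return the number of NVIDIA GPUs seen by nvidia-smi, or 0 if it is not working.
- count_amd_gpus(): Return the number of AMD GPUs seen by rocm-smi, or 0 if it is not working.
- get_individual_local_accelerators(): Determine all the local accelerators available. Report each with count 1, in the order of the number that can be used to assign them.
- get_restrictive_environment_for_local_accelerators(accelerator_numbers): Get environment variables which can be applied to a process to restrict it to using only the given accelerator numbers.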
Module Contents
- toil.lib.accelerators.have_working_nvidia_smi()
Return True if the nvidia-smi binary, from NVIDIA’s CUDA userspace utilities, is installed and can be run successfully.
TODO: This isn’t quite the same as the check that cwltool uses to decide if it can fulfill a CUDARequirement.
- Return type:
bool
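A check of this kind can be approximated by launching the binary and inspecting its exit status. The sketch below is not Toil's implementation: the helper name `tool_runs_successfully` is hypothetical, and treating a zero exit code as "working" is an assumption. The current Python interpreter stands in for nvidia-smi in the usage lines so that the example runs on machines without a GPU.

```python
import subprocess
import sys
from typing import Sequence


def tool_runs_successfully(command: Sequence[str], timeout: float = 30.0) -> bool:
    """Return True if the command can be launched and exits with status 0.

    Hypothetical helper for illustration; not part of toil.lib.accelerators.
    """
    try:
        completed = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        # The binary is missing, is not executable, or did not finish in time.
        return False
    return completed.returncode == 0


# Usage: a command that exists and succeeds, and one that does not exist.
print(tool_runs_successfully([sys.executable, "-c", "pass"]))
print(tool_runs_successfully(["no-such-binary-for-this-example"]))
```

Passing `["nvidia-smi"]` as the command would give a rough equivalent of the check described above, under the same assumptions.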
- toil.lib.accelerators.get_host_accelerator_numbers()
Work out what accelerator is what.
For each accelerator visible to us, returns the host-side (for example, outside-of-Slurm-job) number for that accelerator. It is often the same as the apparent number.
Can be used with Docker’s --gpus='"device=#,#,#"' option to forward the right GPUs as seen from a Docker daemon.
- Return type:
List[int]
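To illustrate how a list of host-side numbers could be turned into the Docker option mentioned above, here is a minimal sketch. The function name `docker_gpus_argument` is hypothetical and not part of Toil. The sketch assumes the argument is passed directly in an argv list rather than through a shell, so the inner double quotes are part of the value itself.

```python
from typing import List


def docker_gpus_argument(host_numbers: List[int]) -> str:
    """Build a single argv element selecting the given host-side GPU numbers.

    Hypothetical helper for illustration; not part of toil.lib.accelerators.
    """
    if not host_numbers:
        raise ValueError("at least one accelerator number is required")
    device_list = ",".join(str(number) for number in host_numbers)
    # The double quotes keep the comma-separated list together as one field.
    return f'--gpus="device={device_list}"'


# Usage: the numbers would come from get_host_accelerator_numbers().
command = ["docker", "run", docker_gpus_argument([0, 2]), "ubuntu", "nvidia-smi"]
print(command)
```

On a shell command line the same value needs an extra layer of single quotes, which is the form shown in the option text above.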
- toil.lib.accelerators.have_working_nvidia_docker_runtime()
Return True if Docker exists and can handle an “nvidia” runtime and the “--gpus” option.
- Return type:
bool
- toil.lib.accelerators.count_nvidia_gpus()
Return the number of NVIDIA GPUs seen by nvidia-smi, or 0 if it is not working.
- Return type:
int
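The "count, or 0 if it is not working" behaviour can be sketched as a parser that never raises. This is an illustration, not Toil's code: the function name `count_gpus_from_smi_xml` is hypothetical, and the sketch assumes the XML report printed by `nvidia-smi -q -x` carries an `attached_gpus` element directly under its root. The sample document is a hand-written stand-in, not real tool output.

```python
import xml.etree.ElementTree as ET


def count_gpus_from_smi_xml(xml_text: str) -> int:
    """Return the GPU count from an nvidia-smi XML report, or 0 if unusable.

    Hypothetical helper for illustration; not part of toil.lib.accelerators.
    """
    try:
        root = ET.fromstring(xml_text)
        element = root.find("attached_gpus")
        if element is None or element.text is None:
            return 0
        return int(element.text.strip())
    except (ET.ParseError, ValueError):
        # Malformed XML or a non-numeric count is treated as "not working".
        return 0


# Usage with a hand-written sample report.
sample = "<nvidia_smi_log><attached_gpus>2</attached_gpus></nvidia_smi_log>"
print(count_gpus_from_smi_xml(sample))
print(count_gpus_from_smi_xml("this is not XML"))
```

Keeping the parsing separate from the subprocess call makes the fallback to 0 easy to test without a GPU present.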
- toil.lib.accelerators.count_amd_gpus()
Return the number of AMD GPUs seen by rocm-smi, or 0 if it is not working.
- Return type:
int
- toil.lib.accelerators.get_individual_local_accelerators()
Determine all the local accelerators available. Report each with count 1, in the order of the number that can be used to assign them.
TODO: How will numbers work with multiple types of accelerator? We need an accelerator assignment API.
- Return type:
- toil.lib.accelerators.get_restrictive_environment_for_local_accelerators(accelerator_numbers)
Get environment variables which can be applied to a process to restrict it to using only the given accelerator numbers.
The numbers are in the space of accelerators returned by get_individual_local_accelerators().
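As a rough illustration of what such a restrictive environment might look like, the sketch below builds a mapping for CUDA only. The function name `restrictive_gpu_environment` is hypothetical. Which variables Toil actually sets is not stated here, so the use of `CUDA_VISIBLE_DEVICES` alone is an assumption.

```python
import os
from typing import Dict, Iterable


def restrictive_gpu_environment(accelerator_numbers: Iterable[int]) -> Dict[str, str]:
    """Return environment variables limiting a process to the given GPUs.

    Hypothetical helper for illustration; covers CUDA only (an assumption).
    """
    # Deduplicate and sort so the result is stable for any input order.
    device_list = ",".join(str(n) for n in sorted(set(accelerator_numbers)))
    return {"CUDA_VISIBLE_DEVICES": device_list}


# Usage: overlay the restriction on a copy of the current environment, which
# could then be passed as env= to subprocess.run() for the child process.
child_env = {**os.environ, **restrictive_gpu_environment([2, 0])}
print(child_env["CUDA_VISIBLE_DEVICES"])
```

Returning a separate mapping, rather than modifying `os.environ` in place, leaves the parent process free to hand different accelerators to different child processes.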