toil.lib.accelerators

Accelerator (i.e. GPU) utilities for Toil

Functions

`have_working_nvidia_smi`()	Return True if the nvidia-smi binary, from NVIDIA's CUDA userspace utilities, is installed and can be run successfully.
`get_host_accelerator_numbers`()	Work out which accelerator is which.
`have_working_nvidia_docker_runtime`()	Return True if Docker exists and can handle an "nvidia" runtime and the "--gpus" option.
`count_nvidia_gpus`()	Return the number of NVIDIA GPUs seen by nvidia-smi, or 0 if it is not working.
`count_amd_gpus`()	Return the number of AMD GPUs seen by rocm-smi, or 0 if it is not working.
`get_individual_local_accelerators`()	Determine all the local accelerators available. Report each with count 1, in the order of the number that can be used to assign them.
`get_restrictive_environment_for_local_accelerators`(...)	Get environment variables which can be applied to a process to restrict it to using only the given accelerator numbers.

Module Contents

toil.lib.accelerators.have_working_nvidia_smi()

Return True if the nvidia-smi binary, from NVIDIA’s CUDA userspace utilities, is installed and can be run successfully.

TODO: This isn’t quite the same as the check that cwltool uses to decide if it can fulfill a CUDARequirement.

Return type: bool
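As an illustration of what "installed and can be run successfully" can mean in practice, here is a minimal sketch using only the standard library. It is not Toil's implementation; the function name `nvidia_smi_runs` and the 30-second timeout are assumptions made for this example.

```python
import shutil
import subprocess


def nvidia_smi_runs() -> bool:
    """Return True if an nvidia-smi binary is on PATH and exits with status 0.

    Hypothetical helper for illustration; not part of toil.lib.accelerators.
    """
    # shutil.which returns None when no such executable is on PATH.
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(
            ["nvidia-smi"],
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,  # assumed limit so a hung driver cannot block forever
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a binary that cannot be executed, a failure exit, or a timeout.
        return False
    return True
```

On a machine without NVIDIA tooling, `nvidia_smi_runs()` simply returns False instead of raising.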

toil.lib.accelerators.get_host_accelerator_numbers()

Work out which accelerator is which.

For each accelerator visible to us, return the host-side (for example, outside-of-Slurm-job) number for that accelerator. It is often the same as the apparent number.

Can be used with Docker’s --gpus='"device=#,#,#"' option to forward the right GPUs as seen from a Docker daemon.

Return type: list[int]
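The following sketch shows how a list of host-side numbers could be turned into that Docker option. The helper name `docker_gpus_argument` is hypothetical, and the sample numbers stand in for whatever `get_host_accelerator_numbers()` would return on a real host.

```python
def docker_gpus_argument(host_numbers: list[int]) -> str:
    """Format host-side accelerator numbers as a single Docker --gpus argument.

    Hypothetical helper for illustration; not part of toil.lib.accelerators.
    """
    devices = ",".join(str(number) for number in host_numbers)
    # The literal double quotes keep the comma-separated list together as one
    # device value. In a shell they would need protecting with single quotes.
    return f'--gpus="device={devices}"'


# Sample host-side numbers, assumed for this example.
sample_numbers = [0, 2]
argument = docker_gpus_argument(sample_numbers)
```

The resulting string is meant to be passed as one element of an argument list (for example to `subprocess.run`), where no shell quoting is applied.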

toil.lib.accelerators.have_working_nvidia_docker_runtime()

Return True if Docker exists and can handle an “nvidia” runtime and the “--gpus” option.

Return type: bool

toil.lib.accelerators.count_nvidia_gpus()

Return the number of NVIDIA GPUs seen by nvidia-smi, or 0 if it is not working.

Return type: int
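The "count, or 0 if it is not working" behaviour can be sketched as follows. This example counts the lines printed by `nvidia-smi -L`, which may differ from how Toil obtains the number; the function names and the sample listing format are assumptions made for illustration.

```python
import subprocess


def count_gpu_lines(listing: str) -> int:
    """Count top-level GPU entries in `nvidia-smi -L` style output.

    Assumes each GPU is reported on its own line beginning with "GPU ".
    """
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def count_gpus_or_zero() -> int:
    """Return the GPU count from `nvidia-smi -L`, or 0 if it cannot be run.

    Hypothetical helper for illustration; not part of toil.lib.accelerators.
    """
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"],
            check=True,
            capture_output=True,
            text=True,
            timeout=30,  # assumed limit for this sketch
        )
    except (OSError, subprocess.SubprocessError):
        # Missing binary, non-zero exit, or timeout all count as "not working".
        return 0
    return count_gpu_lines(result.stdout)
```

Keeping the parsing in its own function makes the counting logic checkable on machines with no GPU at all.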

toil.lib.accelerators.count_amd_gpus()

Return the number of AMD GPUs seen by rocm-smi, or 0 if it is not working.

Return type: int

toil.lib.accelerators.get_individual_local_accelerators()

Determine all the local accelerators available. Report each with count 1, in the order of the number that can be used to assign them.

TODO: How will numbers work with multiple types of accelerator? We need an accelerator assignment API.

Return type: list[toil.job.AcceleratorRequirement]
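To make the "count 1, in assignment order" convention concrete, here is a small sketch. Plain dictionaries with hypothetical `kind` and `count` fields stand in for `toil.job.AcceleratorRequirement`, whose actual fields are not described in this section, and `expand_to_individual` is an invented helper.

```python
def expand_to_individual(kind: str, total: int) -> list[dict[str, object]]:
    """Report `total` accelerators of one kind as separate count-1 entries.

    Hypothetical stand-in for illustration; the dictionary fields are assumed.
    """
    return [{"kind": kind, "count": 1} for _ in range(total)]


# Three accelerators become three entries; the list index of each entry is
# the number that would be used to assign that accelerator.
accelerators = expand_to_individual("gpu", 3)
numbered = list(enumerate(accelerators))
```

Under this convention the position in the list, not any field of the entry, carries the accelerator number.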

toil.lib.accelerators.get_restrictive_environment_for_local_accelerators(accelerator_numbers)

Get environment variables which can be applied to a process to restrict it to using only the given accelerator numbers.

The numbers are in the space of accelerators returned by get_individual_local_accelerators().

Parameters: accelerator_numbers (set[int] | list[int])
Return type: dict[str, str]
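A minimal sketch of the idea, assuming the restriction is expressed through `CUDA_VISIBLE_DEVICES`, the variable the CUDA runtime reads to limit visible devices. Which variables the Toil function actually sets is not stated here, and both helper names below are hypothetical.

```python
import os
from collections.abc import Iterable


def cuda_restriction(accelerator_numbers: Iterable[int]) -> dict[str, str]:
    """Build an environment fragment exposing only the given accelerators.

    Hypothetical helper for illustration; assumes CUDA_VISIBLE_DEVICES is the
    only variable that needs to be set.
    """
    # Sort so that a set and a list with the same members give the same value.
    visible = ",".join(str(number) for number in sorted(accelerator_numbers))
    return {"CUDA_VISIBLE_DEVICES": visible}


def child_environment(accelerator_numbers: Iterable[int]) -> dict[str, str]:
    """Copy the current environment and overlay the restriction on top."""
    environment = dict(os.environ)
    environment.update(cuda_restriction(accelerator_numbers))
    return environment
```

The returned mapping from `child_environment` can be passed as the `env` argument of `subprocess.run` so that only the child process is restricted, leaving the parent's environment unchanged.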
