GPU support

We run the most important parallelizable computations on the GPU. An Nvidia GPU, RTX 30 series or newer, is required. The minimum supported Nvidia driver version is 580. If such a GPU isn't available, the CPU is used for all computations. If CPU SIMD instructions are available, we take advantage of those. Here are the computations we run on the GPU, or using CPU SIMD:

Molecular dynamics short range interactions

These are the short-range Coulomb (up to a cutoff) and Lennard-Jones interactions. They are pairwise computations; we run them in a single GPU kernel.
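To make the pairwise structure concrete, here is a minimal CPU-side sketch of the per-pair arithmetic. It is not the project's GPU kernel: `PairParams`, `pair_energy`, `total_short_range_energy`, and the reduced-unit constant `K_COULOMB` are all hypothetical names, and the Coulomb term is a plain truncated `q_i q_j / r`, whereas an Ewald-based code would typically screen this term.

```rust
/// Per-pair parameters. All names in this sketch are hypothetical.
#[derive(Clone, Copy)]
pub struct PairParams {
    pub q_i: f64,     // charge of atom i
    pub q_j: f64,     // charge of atom j
    pub sigma: f64,   // Lennard-Jones distance parameter
    pub epsilon: f64, // Lennard-Jones well depth
}

/// Coulomb constant (assumption: reduced units, set to 1 for illustration).
pub const K_COULOMB: f64 = 1.0;

/// Energy of one pair at distance `r`; zero at or beyond the cutoff.
pub fn pair_energy(r: f64, p: PairParams, cutoff: f64) -> f64 {
    if r >= cutoff {
        return 0.0;
    }
    let sr6 = (p.sigma / r).powi(6);
    let lennard_jones = 4.0 * p.epsilon * (sr6 * sr6 - sr6);
    let coulomb = K_COULOMB * p.q_i * p.q_j / r;
    lennard_jones + coulomb
}

/// Sum over all unique pairs (i < j). A GPU kernel would typically
/// assign pairs to threads instead of looping like this.
pub fn total_short_range_energy(
    positions: &[[f64; 3]],
    charges: &[f64],
    sigma: f64,
    epsilon: f64,
    cutoff: f64,
) -> f64 {
    let mut total = 0.0;
    for i in 0..positions.len() {
        for j in (i + 1)..positions.len() {
            // Euclidean distance between atoms i and j.
            let r = (0..3)
                .map(|a| (positions[i][a] - positions[j][a]).powi(2))
                .sum::<f64>()
                .sqrt();
            let p = PairParams { q_i: charges[i], q_j: charges[j], sigma, epsilon };
            total += pair_energy(r, p, cutoff);
        }
    }
    total
}
```

Each pair is independent of every other pair, which is what makes this computation a good fit for a single GPU kernel.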

Molecular dynamics long range interactions

We use the cuFFT or VkFFT libraries to compute the long-range SPME (smooth particle mesh Ewald) interactions on the GPU. SPME approximates the Coulomb force at long range. We also run other parts of this computation in GPU kernels in preparation for the FFTs, mainly to reduce traffic over the PCIe bus.
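The idea behind Ewald methods is to split 1/r into a quickly decaying part, which can be truncated at a cutoff, and a smooth part, which is evaluated in reciprocal space with FFTs. The sketch below shows only that split. `beta` is a hypothetical name for the splitting parameter, and `erfc_approx` is a textbook approximation (Abramowitz and Stegun 7.1.26) added here because Rust's standard library has no `erfc`; charge spreading onto the grid and the FFTs themselves are not shown.

```rust
/// Complementary error function for x >= 0, using the Abramowitz and
/// Stegun 7.1.26 approximation (absolute error about 1.5e-7).
pub fn erfc_approx(x: f64) -> f64 {
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = t
        * (0.254829592
            + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    poly * (-x * x).exp()
}

/// Split 1/r into (short_range, long_range) parts:
///   short_range = erfc(beta * r) / r   (decays quickly, safe to cut off)
///   long_range  = erf(beta * r) / r    (smooth, handled in reciprocal space)
pub fn ewald_split(r: f64, beta: f64) -> (f64, f64) {
    let short_range = erfc_approx(beta * r) / r;
    // erf(x) = 1 - erfc(x), so the two parts always sum to 1/r.
    let long_range = 1.0 / r - short_range;
    (short_range, long_range)
}
```

A larger `beta` makes the short-range part die off faster, at the cost of needing a finer grid for the long-range part.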

Interpreting electron density data

We compute 3D FFTs when loading electron density data from 2fo-2fc files. This converts reflection data (indexed by Miller indices) to electron density as a function of position. This isn't required for map or MTZ files, as these are already prepared.
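For illustration, the conversion from reflections to density can be written as a direct Fourier sum at a single point. This is a sketch of the math only, not the FFT path described above: an FFT evaluates the same sum on a whole grid at once and far more cheaply. The `Reflection` struct and `density_at` function are hypothetical, positions are in fractional coordinates, and the reflection list is assumed to contain both members of each Friedel pair.

```rust
use std::f64::consts::PI;

/// One reflection: Miller indices plus structure-factor amplitude and phase.
/// Hypothetical type for this sketch.
pub struct Reflection {
    pub h: i32,
    pub k: i32,
    pub l: i32,
    pub amplitude: f64,
    pub phase: f64, // radians
}

/// Electron density at fractional coordinates `frac`:
///   rho(x) = (1/V) * sum over hkl of |F| * cos(phase - 2*pi*(h*x + k*y + l*z))
/// Cost is one term per reflection per point, so this is only practical
/// for a handful of points.
pub fn density_at(reflections: &[Reflection], frac: [f64; 3], cell_volume: f64) -> f64 {
    let mut sum = 0.0;
    for r in reflections {
        let dot = r.h as f64 * frac[0] + r.k as f64 * frac[1] + r.l as f64 * frac[2];
        sum += r.amplitude * (r.phase - 2.0 * PI * dot).cos();
    }
    sum / cell_volume
}
```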

CPU parallelism

When the GPU is unavailable, we perform the above computations using two forms of CPU parallelization: thread pools and SIMD. Thread pools split the computation into tasks that run on worker threads, which the operating system schedules across multiple (for example, all) CPU cores. The speedup varies; we have observed 5x-20x. We accomplish this using the Rayon library.
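The splitting idea can be sketched with nothing but the standard library. This is a stand-in for illustration, not how Rayon works internally: it spawns fresh scoped threads instead of reusing a pool, and `parallel_sum_of_squares` is a hypothetical function with a placeholder workload.

```rust
use std::thread;

/// Split `values` into roughly equal chunks, process each chunk on its
/// own thread, then combine the partial results.
pub fn parallel_sum_of_squares(values: &[f64], n_threads: usize) -> f64 {
    let n_threads = n_threads.max(1);
    // Ceiling division; at least 1 because `chunks(0)` would panic.
    let chunk_len = ((values.len() + n_threads - 1) / n_threads).max(1);

    thread::scope(|s| {
        // Spawn one scoped thread per chunk. Scoped threads may borrow
        // `values` because they are joined before `scope` returns.
        let handles: Vec<_> = values
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|v| v * v).sum::<f64>()))
            .collect();

        // Wait for every thread and add up the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<f64>()
    })
}
```

With Rayon, the same reduction is usually written as `values.par_iter().map(|v| v * v).sum::<f64>()` after importing `rayon::prelude::*`, and the library handles chunking and scheduling on its thread pool.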

We plan to add CPU SIMD use as well, for further speed improvements. SIMD uses special instructions that each operate on multiple floating point values at once. The amount of parallelism depends on your CPU architecture, which Molchanica detects automatically. AVX-512 is available on newer CPUs; it lets us perform 16 single-precision floating point operations at once. Older CPUs support 256-bit AVX (AVX2), which handles 8 at once.
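Runtime detection of this kind can be sketched with the standard library's `is_x86_feature_detected!` macro. This is only an assumption about how such a check might look, not Molchanica's actual detection code; `f32_lanes` is a hypothetical helper, and the scalar fallback of 1 is a simplification.

```rust
/// Number of 32-bit floats one SIMD instruction can process on this CPU,
/// based on features detected at runtime. Returns 1 when neither AVX-512
/// nor AVX is found, or when not running on x86.
pub fn f32_lanes() -> usize {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        // 512-bit registers: 512 / 32 = 16 lanes.
        if std::arch::is_x86_feature_detected!("avx512f") {
            return 16;
        }
        // 256-bit registers: 256 / 32 = 8 lanes.
        if std::arch::is_x86_feature_detected!("avx") {
            return 8;
        }
    }
    1
}
```

A caller could use the returned lane count to pick between separately compiled code paths, one per instruction set.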