By default, PyTorch uses only one GPU. In this tutorial by Soumith Chintala, one of the creators of PyTorch, you'll learn how to use multiple GPUs in PyTorch with the DataParallel class. DataParallel splits each mini-batch of samples into smaller mini-batches, one per GPU, and runs the computation for each of them in parallel.
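
As a rough preview of the idea, here is a minimal sketch of wrapping a model in `torch.nn.DataParallel`. The `TinyNet` model, its layer sizes, and the batch size of 30 are hypothetical choices for illustration and are not taken from the tutorial. The sketch falls back to a single device when fewer than two GPUs are available.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical toy model used only to illustrate DataParallel."""

    def __init__(self, in_features=5, out_features=2):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, x):
        # With DataParallel, each replica receives a slice of the batch here.
        return self.fc(x)


# Use the first GPU if one is available, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = TinyNet()
if torch.cuda.device_count() > 1:
    # Replicates the model on every visible GPU and splits each input
    # batch along dimension 0 across the replicas.
    model = nn.DataParallel(model)
model.to(device)

# Assumed batch: 30 samples with 5 features each.
batch = torch.randn(30, 5).to(device)
output = model(batch)

# The per-GPU results are gathered back into one tensor for the full batch.
print(output.shape)  # torch.Size([30, 2])
```

The calling code looks the same with or without the wrapper: you pass in the full mini-batch and get back the full output, while the splitting and gathering happen inside the `DataParallel` module.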