gpuArray slower on newer graphics card in double precision
I have been running the following speed test in R2015a on two computers with different graphics cards:
>> A=gpuArray(rand(5e3));
>> T=gputimeit(@()A*A)
The first computer is an older model (Dell Precision T7500) running an older graphics card (GTX 580). The second, newer computer (Dell Precision Tower 7910) is running a newer graphics card (Titan X).
Oddly, I find that the older configuration outperforms the newer one. The GTX 580 gives T=1.1178 seconds, whereas the Titan X gives T=1.3097 seconds, which is about 17% slower. When I redo the test in single precision,
>> A=gpuArray(rand(5e3,'single'));
>> T=gputimeit(@()A*A)
the results are more in line with my expectations. The GTX 580 gives T=0.2121 seconds, whereas the Titan X gives T=0.0491 seconds.
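To compare these timings against each card's quoted peak throughput, it can help to convert them to achieved GFLOPS. The following is a minimal sketch, assuming the usual estimate of 2*n^3 floating-point operations for a dense n-by-n matrix product. The variable names are illustrative, and the timings are the ones reported above.

```matlab
% Convert the reported gputimeit timings into achieved GFLOPS.
% Assumption: a dense n-by-n matrix product costs about 2*n^3 operations.
n = 5e3;
flopCount = 2*n^3;

% Timings in seconds, as reported above: [GTX 580, Titan X]
tDouble = [1.1178 1.3097];
tSingle = [0.2121 0.0491];

gflopsDouble = flopCount ./ tDouble / 1e9;
gflopsSingle = flopCount ./ tSingle / 1e9;

% How many times faster single precision ran than double on each card
singleOverDouble = tDouble ./ tSingle;

cards = {'GTX 580', 'Titan X'};
for k = 1:numel(cards)
    fprintf('%-8s double: %7.1f GFLOPS, single: %7.1f GFLOPS, ratio: %4.1fx\n', ...
        cards{k}, gflopsDouble(k), gflopsSingle(k), singleOverDouble(k));
end
```

With these numbers, single precision ran roughly 5 times faster than double on the GTX 580 and roughly 27 times faster on the Titan X, so the gap between the two precisions is far wider on the newer card.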
I'm wondering what could account for this difference. One thing that might be worth mentioning is that the Titan X is not using a fully updated driver. At the time of this writing, there is some bug in its newest driver release, making it unusable, and I am instead using driver version 353.62. Could this be the reason? If not, any other ideas?
7 Comments
Surprising. My older EVGA Titan Black gives T=1.0277 in double, and T=0.0737 in single.
No answer, but an anecdote: when I first installed the Titan Black, the power supply could not support the load and I got terrible performance, crashes, etc. I spent a good 4 hours cursing at Windows, MATLAB (not proud of that though ;-)), and EVGA, until I got lucky enough that a heavy computation triggered a power-off of the machine, which helped me find the cause.
Matt J
on 3 Aug 2015
Hi Matt, it was a good thing, because otherwise I would never have thought about the power supply. My PSU has two pairs of 6-pin and 8-pin PCI-E power outputs; one pair is white-black (6+8) and the other is blue-black (6+8). I used only the white-black pair at first and it crashed. Then I mixed the two pairs and it worked (I also tried dual 4-pin plus an adapter and it went well), which seems to indicate that the pairs are wired to separate circuits internally and that mixing them just splits the load.
PS:
- My machine is a Dell Precision T7500.
- The issue was silly enough that FurMark was running fine, but the GPU benchmarking tool distributed on the FEX was crashing the machine occasionally.
Brendan Hamm
on 3 Aug 2015
The Titan X is a terrible card to use for GPGPU, as it was designed as a cheaper alternative to other Titans with a focus on single precision (gaming). You will see that the double-precision GFLOPS is about 1/32 of the single-precision figure on the Maxwell chips. Compare that with the Fermi architecture used in the GTX 580, whose double-precision GFLOPS is 1/5 of its single-precision figure. If you intend to use the card for double precision, I would highly recommend the Titan Z (or Black), which uses the Kepler architecture. So if you have a Titan Black, switching to it would not be rolling back at all, but rather using a card that was designed with double precision in mind.
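One way to check this on a given machine is to time the same product in both precisions and compare. The following is a rough sketch, assuming Parallel Computing Toolbox and a supported GPU are available. The variable names are illustrative, and the 2*n^3 operation count is the usual estimate for a dense matrix product rather than a measured value.

```matlab
% Report the selected GPU, then compare double- and single-precision
% matrix multiplication throughput on it.
dev = gpuDevice;
fprintf('%s (compute capability %s)\n', dev.Name, dev.ComputeCapability);

n = 5e3;
Ad = gpuArray(rand(n));            % double precision
As = gpuArray(rand(n, 'single'));  % single precision

Td = gputimeit(@() Ad*Ad);
Ts = gputimeit(@() As*As);

% Assumed operation count for a dense n-by-n matrix product
flopCount = 2*n^3;
fprintf('double: %.4f s (%.1f GFLOPS)\n', Td, flopCount/Td/1e9);
fprintf('single: %.4f s (%.1f GFLOPS)\n', Ts, flopCount/Ts/1e9);
fprintf('single precision is %.1fx faster than double\n', Td/Ts);
```

The measured ratio will not match the theoretical peak ratio exactly, but a very large gap between the two timings suggests a card whose double-precision units are heavily reduced.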
Cedric
on 3 Aug 2015
This looks like an answer!
Brendan Hamm
on 3 Aug 2015
Added double precision to that terrible line :)