NVIDIA CUDA Toolkit: All versions

Version        Release date   File                            Platform  Size
12.2 (latest)
9.0            Dec 31, 2017
7.5            Jun 22, 2015
7.0            Mar 16, 2015
6.5            Jul 24, 2014
6.0            Mar 5, 2014
5.5            Jun 2, 2013
5.0            Sep 11, 2012
4.2            Apr 21, 2012   cudatoolkit_4.2.9_win_32.msi    x86       197MB
4.1.28         Dec 16, 2011   cudatoolkit_4.1.28_win_32.msi   x86       185MB
4.0.17         Mar 29, 2011   cudatoolkit_4.0.17_win_32.msi   x86       131MB
3.2            Nov 11, 2010   cudatoolkit_3.1_win_32.exe      x86       63.3MB
3.1            Jun 27, 2010

What's new

v4.0 [Mar 29, 2011]
Share GPUs across multiple threads
Use all GPUs in the system concurrently from a single host thread
No-copy pinning of system memory, a faster alternative to cudaMallocHost()
C++ new/delete and support for virtual functions
Support for inline PTX assembly
Thrust library of templated performance primitives such as sort, reduce, etc.
NVIDIA Performance Primitives (NPP) library for image/video processing
Layered Textures for working with same size/format textures at larger sizes and higher performance
Unified Virtual Addressing
GPUDirect v2.0 support for Peer-to-Peer Communication
Automated Performance Analysis in Visual Profiler
C++ debugging in CUDA-GDB for Linux and MacOS
GPU binary disassembler for Fermi architecture (cuobjdump)
Parallel Nsight 2.0 for Windows developers, with new debugging and profiling features
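
Two of the v4.0 items above, no-copy pinning and the Thrust library, can be illustrated together. The following is a minimal sketch rather than NVIDIA sample code: the helper name sort_and_sum is hypothetical, and it assumes a toolkit that accepts buffers that are not page-aligned in cudaHostRegister() (early releases were stricter). It pins an existing std::vector in place instead of allocating a new buffer with cudaMallocHost(), then sorts and reduces the data on the GPU with Thrust.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>

// Hypothetical helper (not part of the toolkit): pins an existing host
// buffer in place, sorts its contents on the GPU, writes the sorted values
// back into the same buffer, and stores their sum in *sum_out.
// Returns false if the buffer is empty or pinning fails.
bool sort_and_sum(std::vector<int>& host, long long* sum_out) {
    if (host.empty()) return false;
    const size_t bytes = host.size() * sizeof(int);

    // No-copy pinning: page-lock memory that was allocated elsewhere.
    // Assumption: the toolkit accepts pointers that are not page-aligned.
    cudaError_t err = cudaHostRegister(host.data(), bytes, cudaHostRegisterDefault);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaHostRegister: %s\n", cudaGetErrorString(err));
        return false;
    }

    // Thrust: copy to the device, sort, and reduce with templated primitives.
    thrust::device_vector<int> dev(host.begin(), host.end());
    thrust::sort(dev.begin(), dev.end());
    // The 0LL initial value makes the accumulation use long long.
    *sum_out = thrust::reduce(dev.begin(), dev.end(), 0LL);

    // Copy the sorted data back into the pinned host buffer.
    thrust::copy(dev.begin(), dev.end(), host.begin());

    // Release the page lock; the vector still owns the memory.
    cudaHostUnregister(host.data());
    return true;
}
```

Built with nvcc and called on a std::vector<int>, the helper leaves the vector sorted in ascending order. Pinning only pays off when the same buffer is transferred repeatedly or is large, so for a one-off small copy the registration cost may outweigh the benefit.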