I recently ported the CUDA program for 1D heat transfer to Mac OS X 10.6.8 with CUDA 4.0.31. I only had to modify the parts related to timing, and the program then ran without problems.
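As an illustration of the kind of timing change such a port involves, here is a minimal sketch of a wall-clock timer for the host side. It is not the code from the attached source. It assumes the POSIX call gettimeofday, which is available on Mac OS X but not on Windows, and the function names wall_seconds and speedup are hypothetical helpers introduced only for this example.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper (not from the attached source):
 * returns the current wall-clock time in seconds, using POSIX gettimeofday.
 * The resolution is one microsecond at best. */
double wall_seconds(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (double)tv.tv_sec + (double)tv.tv_usec * 1.0e-6;
}

/* Hypothetical helper: speedup defined as CPU time divided by GPU time. */
double speedup(double cpu_seconds, double gpu_seconds)
{
    return cpu_seconds / gpu_seconds;
}
```

To time a section, read wall_seconds() before and after it and subtract. Kernel launches return asynchronously, so when timing GPU code this way the host should wait for the device to finish (for example with cudaDeviceSynchronize() in CUDA 4.0) before taking the second reading.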
The hardware platform is a GeForce 320M GPU (950 MHz) and an Intel Core 2 Duo P8600 CPU (2.40 GHz). The computation takes 1.86 s on the GPU and 2.15 s on the CPU, a speedup of about 1.16. Previously, on Windows 7 with CUDA 3.1, a GeForce 9800 GTX+ and a Q6600 took 1.43 s and 5.04 s respectively, a speedup of about 3.52.
The source code is attached; comments are welcome.