
Daemon native code in Haskell with Accelerate #12

Open
janm399 opened this issue Feb 13, 2013 · 5 comments
janm399 commented Feb 13, 2013

Similar code to the OpenCV daemon/main.cpp, but in Haskell, with Accelerate and CUDA. Use the Haskell RabbitMQ libraries to talk to AMQP in the same way. The aim is to be able to run either the compiled C++ code or the compiled Haskell code and get the same result in Scala.
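A minimal sketch of what such a daemon could look like, assuming the `amqp`, `accelerate`, and `accelerate-cuda` packages. The queue names (`hs.requests`, `hs.replies`), the broker credentials, and the `sumSquares` computation are placeholders standing in for the real image-processing kernel, not the project's actual names:

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.Array.Accelerate      as A
import qualified Data.Array.Accelerate.CUDA as CUDA
import qualified Data.ByteString.Lazy.Char8 as BL
import           Network.AMQP
import           Control.Concurrent (threadDelay)
import           Control.Monad (forever)

-- Trivial Accelerate computation standing in for the real processing:
-- sum of squares over a vector, compiled and run on the GPU via CUDA.
sumSquares :: A.Vector Float -> A.Scalar Float
sumSquares xs = CUDA.run $ A.fold (+) 0 (A.map (\x -> x * x) (A.use xs))

main :: IO ()
main = do
  conn <- openConnection "127.0.0.1" "/" "guest" "guest"
  chan <- openChannel conn
  -- Consume requests and publish replies, mirroring the C++ daemon's loop.
  _ <- consumeMsgs chan "hs.requests" Ack $ \(msg, env) -> do
    let bytes  = BL.unpack (msgBody msg)
        input  = A.fromList (A.Z A.:. length bytes)
                            (map (fromIntegral . fromEnum) bytes)
        result = head (A.toList (sumSquares input))
    publishMsg chan "" "hs.replies" newMsg { msgBody = BL.pack (show result) }
    ackEnv env
  forever (threadDelay 1000000)  -- keep the daemon alive
```

Since both the C++ and Haskell daemons would speak the same AMQP protocol over the same queues, the Scala side would not need to know which implementation is running.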

@ghost ghost assigned adinapoli Feb 13, 2013
@adinapoli

If we succeed, this is going to be incredibly cool. My new laptop has a GeForce (same as the machine at home), so I can test CUDA-powered code. As far as I remember, Accelerate provides only CUDA out of the box, while the OpenCL backend was experimental. Maybe it's the right time to do a survey of the state of the art :)
Repa is probably also worth a look; I've never had the time to check it properly:

http://repa.ouroborus.net/

Even though I think it's just CPU parallelism, not GPU :)
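For comparison, a sketch of the same sum-of-squares kernel written against Repa (assuming the `repa` package); `computeP`-style parallel reduction runs on CPU cores rather than the GPU:

```haskell
import Data.Array.Repa as R

-- Parallel sum of squares on the CPU; sumAllP fuses the map and
-- the reduction and evaluates them across cores.
sumSquares :: Monad m => Array U DIM1 Float -> m Float
sumSquares xs = R.sumAllP (R.map (\x -> x * x) xs)

main :: IO ()
main = do
  let xs = R.fromListUnboxed (Z :. 4) [1, 2, 3, 4 :: Float]
  r <- sumSquares xs
  print r  -- 1 + 4 + 9 + 16 = 30.0
```

The appeal is that the array code looks almost identical to the Accelerate version, so the two backends could plausibly share the same high-level structure.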


janm399 commented Feb 13, 2013

Is your work computer the new MacBook Pro with the funky GeForce GT 650M? (Ask Guy or Pete to buy it for you if it isn't ;))

@adinapoli

Yes, it is :)

@adinapoli

```
./deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GT 650M"
  CUDA Driver Version / Runtime Version          5.0 / 5.0
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 1024 MBytes (1073414144 bytes)
  ( 2) Multiprocessors x (192) CUDA Cores/MP:    384 CUDA Cores
  GPU Clock rate:                                775 MHz (0.77 GHz)
  Memory Clock rate:                             2000 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 262144 bytes
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = GeForce GT 650M
```

🍰


janm399 commented Feb 13, 2013

Nice!
