Released Data.Array.Accelerate 0.9.0.0 — the Haskell array library for GPUs

I just released accelerate 0.9.0.0 on Hackage. This is the version that has been available from the GitHub repository for a while (supporting shape polymorphism, stencil computations, block I/O, and much more), but adapted such that it works with the forthcoming GHC 7.4.1 release. (I tested it with 7.4.1 RC2).

It doesn't yet include Trevor's recent work that improved the CUDA backend in many significant ways — you can get that code from Trevor's fork on GitHub.

For more details, see the main GitHub repository and the GitHub wiki pages.

Filed under  //  accelerate   edsl   gpgpu   parallelism  
Posted

Video and slides of "Data Parallelism in Haskell" @ Brisbane FP Group

The Brisbane FP Group (BFPG) kindly invited me to give a talk about our work on data parallel programming in Haskell. The talk motivates the use of functional programming for parallel, and in particular, data parallel programming and explains the difference between regular and irregular (or nested) data parallelism. It also discusses the Repa library (regular data parallelism for multicore CPUs), the embedded language Accelerate (regular data parallelism for GPUs), and Data Parallel Haskell (nested data parallelism for multicore CPUs).  The slides of the presentation are available in two formats: HTML5 slideshow and PDF. Thank you to everybody who attended. Special thanks go to OJ Reeves and Tony Morris for organising everything, to Rob Manthey for producing the video, to Mincom for the venue, and to Functional IO for the sponsorship.

Filed under  //  accelerate   dph   haskell   parallelism   repa  
Posted

Data.Array.Accelerate now on GitHub

Prompted by GHC's move to Git and the unreliability of the community.haskell.org infrastructure, I decided to move Data.Array.Accelerate to GitHub: mchakravarty/accelerate. The GitHub repository is now the main public repository for the project. I will also move the tickets of the old Trac bug tracker over to GitHub Issues.

Filed under  //  edsl   gpgpu   haskell   parallelism  
Posted

Final version of the Accelerate paper (GPGPU in Haskell)

We will present our paper Accelerating Haskell Array Codes with Multicore GPUs at ACM SIGLAN Declarative Aspects of Multicore Programming (DAMP 2011), which is co-located with POPL'11 in Austin, TX, in January. The final version of our paper is now available, and we plan to soon release a significantly improved version of the Accelerate library (matching the API used in the paper), which enables high-level GPGPU programming in Haskell based on NVIDIA's CUDA environment.

Filed under  //  edsl   gpgpu   haskell   parallelism  
Posted

Accelerating Haskell Array Codes with Multicore GPUs

In the paper Accelerating Haskell Array Codes with Multicore GPUs, we describe Accelerate, embedded language for array computations in Haskell, as well as a dynamic code generator targeting NVIDIA's CUDA GPGPU programming environment.  Our CUDA code generator is based on the idea of algorithmic skeletons for parallel programming, minimises re-compilation overheads, and overlaps host-to-device data transfers, code generation, and kernel execution.

The paper outlines our embedding in Haskell, details the design and implementation of the dynamic code generator, and reports on initial benchmark results.  These results indicate that we can compete with moderately optimised native CUDA code, while enabling much simpler source programs.

Filed under  //  edsl   gpgpu   haskell   parallelism  
Posted

Second beta release for the CUDA backend of Accelerate

The second beta release of Data.Array.Accelerate (version 0.8.0.0) includes support for all array operations in the CUDA backend — except for a new stencil operation that is new in this release and currently only supported by the interpreter.  The CUDA backend requires CUDA 3.x and has been tested on consumer-grade GeForce cards, TESLA high-performance GPUs, and the new Fermi cards.

If you are interested in general-purpose GPU programming in Haskell, give this release a spin — it is available on Hackage.  While it is far from perfect, it should already prove useful for a considerable range of applications.  Together with the other Accelerate developers, I would be very interested to hear what works for you and what doesn't.  We are also interested in suggestions for the future development of the library.  Accelerate has a mailing list and a bug tracker on which we also accept feature requests.

Filed under  //  edsl   gpgpu   haskell   parallelism  
Posted

First release of the CUDA backend for Accelerate

During the first Australian Haskell Hackathon, AusHac2010, we finally managed to complete the integration of the CUDA backend, written by Sean Lee and Trevor McDonell, into the array EDSL Accelerate.  It still doesn't cover the complete functionality of the current version of the Accelerate EDSL, but already allows for interesting GPU computations.  The package source (version 0.7.1.0) can be obtained from Hackage as usual.  There is now also a bug tracker.

Filed under  //  edsl   gpgpu   haskell   parallelism  
Posted

Regular, shape-polymorphic, parallel arrays in Haskell

The paper Regular, shape-polymorphic, parallel arrays in Haskell describes a new Haskell library, named Repa, for multi-dimensional, regular arrays based on our work on Data Parallel Haskell and a delayed array representation avoiding unnecessary data structures. As the following table shows, the sequential performance already gets close to the performance of C programs. More importantly, the Haskell code transparently makes use of multicore hardware — something that is much more difficult in C.

Pastedgraphic-2

The Haskell code is purely functional, using collective array operations.  Here is matrix-matrix multiplication:

mmMult :: (Num e, Elt e)
       => Array DIM2 e -> Array DIM2 e -> Array DIM2 e
mmMult arr brr = sum (zipWith (*) arrRepl brrRepl)
 where
  trr     = force (transpose2D brr)
  arrRepl = replicate (Z :.All :.colsB :.All)   arr 
  brrRepl = replicate (Z :.All :.All   :.rowsA) trr               
  (Z:. colsA:. rowsA) = extent arr
  (Z:. colsB:. rowsB) = extent brr

In addition to good performance, the library facilitates reuse through the use of shape polymorphism.

Filed under  //  haskell   multicore   parallelism  
Posted

Multicore Haskell Now!

Don Stewart summarised the state of play of parallel programming in Haskell at "ACM Reflections | Projections 2009". He covers strategies, Concurrent Haskell, STM, and Data Parallel Haskell: http://donsbot.wordpress.com/2009/10/17/multicore-haskell-now-acm-reflections...

Filed under  //  dph   haskell   multicore   parallelism   stm  
Posted

Don Stewart's talk on Domain Specific Languages and Haskell

Don argues in favour of domain-specific languages for high-performance computing. Not surprisingly, he suggests that Haskell is well suited as a host for realising such domain-specific languages as embedded languages: http://donsbot.wordpress.com/2009/10/16/lacss-2009-domain-specific-languages-...

Filed under  //  edsl   haskell   parallelism  
Posted