Coconut: COde CONstructing User Tool



Uploaded by: googletechtalks
Video Description:
Google Tech Talks
March 7, 2008
ABSTRACT
Coconut is a system under development for high-assurance, high-performance software. It was used to develop a library of special functions for the Cell BE processor, which is distributed in the Cell BE SDK 3.0 as MASS. Average performance is 4X better than the alternative hand-tuned C library, SimdMath. Coconut has been successful where patterns of efficient hardware-specific computation can be captured as higher-order functions and encoded in a Domain Specific Language embedded in Haskell. Patterns include efficient control structures not expressible in C, e.g., the MultiLoop, and efficient uses of SIMD instructions which require significant compile-time computation for pattern specialization. Some patterns interact with a novel instruction scheduler called Explicitly Staged Software Pipelining, based on a min-cut approach, which outperforms SWING modulo scheduling in our tests.

A less developed aspect of Coconut is the parallel production of proofs of correctness along with executables. Current work aims to prove only limited properties about programs, namely the ones most likely to be broken: creative use of SIMD instructions, and parallelization. Coconut intermediate code is represented as nested code (hyper)graphs. At the lowest level, we transform acyclic loop bodies to remove the effect of SIMDization, and produce machine- and/or human-readable specifications. This has been used to verify opaque patterns of optimizing linear algebra for SIMD processors. Such code graphs are embedded in higher levels containing control flow: first single-threaded control flow, optimized for ILP, and then parallel control flow, optimized to hide communication latency. At this level, control flow is restricted so as to allow peak utilization of multi-core hardware while still enabling efficient compile-time verification of soundness. Soundness, in this context, means that the parallelized program can be transformed into a code graph without synchronizing control flow, because every execution can be shown to produce the same result. Think of it as reducing the parallel debugging effort to the single-threaded debugging problem by eliminating the non-determinism inherent in parallel code. I will give a formal description of the language, and the O(n) algorithm which proves soundness and produces the equivalent "single-threaded" code graph.

Speaker: Christopher Anand
Christopher Anand is a professor of Computing and Software at McMaster University. His main research areas are software correctness, high performance computation, and automatic code generation. He has also founded the company Optimal Computational Algorithms to provide hardware-specific libraries for scientific applications on novel architectures.






