@device(postscript)
@libraryfile(Mathematics10)
@libraryfile(Accents)
@style(fontfamily=timesroman,fontscale=11)
@pagefooting(immediate, left "@c", center "@c", right "@c")
@heading(Tasks and Connection Sets: Choreographed Communication on a Reconfigurable Connection-Based Parallel Computer)
@heading(CMU-CS-96-155)
@center(@b(Thomas E. Warfel))
@center(April 1996 - Ph.D. Thesis@foot)
@center(FTP: CMU-CS-96-155.ps)
@blankspace(1)
@begin(text,spacing=1)
High-bandwidth, high-throughput applications with hard latency constraints are difficult to implement on a general-purpose parallel computer. Multiple developer-controlled "trial-and-error" cycles are usually needed before applications can reliably meet throughput and latency constraints, even on platforms having ample network bandwidth and computation power. Not only is reliable execution difficult to achieve for code developed in this manner, but the code itself is also difficult to modify or reuse without upsetting the delicate timing balance achieved. Local computation performance can usually be bounded, but communication performance is often more difficult to predict. While hardware-supported connections can offer minimal quality-of-service bandwidth and latency guarantees, limited connection resources make scheduling the full application difficult. This thesis introduces a new approach: use multiple sets of connections, and allow tasks to perform @b(local communication) context switches and dynamically swap, within tasks, between statically scheduled sets of connections. The mechanics of swapping connection sets, starting a task, and ending a task can be encapsulated into a small set of control primitives built upon fast, efficient @b(barrier synchronization). If the control primitives are constructed to give predictable performance, the tasks created using those primitives will have predictable performance as well.
Most important, complex tasks can be hierarchically constructed by assembling simpler tasks into larger structures while still maintaining predictable performance. To demonstrate this scalable predictability, the @b(TCS) (@b(T)asks and @b(C)onnection @b(S)ets) programming model is introduced and implemented on a real target machine, iWarp. The prototype is used to implement a variety of communication patterns and is then compared with fast message-passing implementations on the same machine. Finally, the scalable, hierarchical nature of TCS tasks is demonstrated by implementing a portion of a real-time computer vision application. TCS is shown to be well-suited not only for this application, but also for similar applications requiring continuous high-bandwidth input, low-latency output, and multiple computations per datum.
@blankspace(2line)
@begin(transparent,size=10)
@b(Keywords:@ )@c
@end(transparent)
@blankspace(1line)
@end(text)
@flushright(@b[(148 pages)])