Robert R. McCormick School of Engineering and Applied Science Electrical Engineering and Computer Science Department Center for Ultra-scale Computing and Information Security at Northwestern University


Project Team Members:

Northwestern University

Syracuse University

Air Force Research Labs

Return to Projects |  CUCIS Home
Project HomeParallel Pipeline Computation ModelInter-task Data RedistributionTask Scheduling and Processor AssignmentMulti-Threading Implementation on Intel Paragon SMP NodesPerformance Results of Parallel Pipeline STAP Implementation

Parallel Pipeline Computation Model

Figure 1. Model of the parallel pipeline system. The set of pipelines indicates that the same pipeline is repeated on subsequent input data sets. Task i for all input instances is executed on the same number of compute nodes.

The system model for the type of STAP applications considered in this work is shown in Figure 1. This model is suitable for the computational characteristics found in these applications. A pipeline is a collection of tasks which are executed sequentially. The input to the first task is obtained normally from sensors or other input devices and the inputs to the rest of the tasks in the pipeline are the outputs of their previous tasks. The set of pipelines shown in the figure indicates that the same pipeline is repeated on subsequent input data sets. Each block in a pipeline represents one parallel task, which itself is parallelized on multiple (different number of) compute nodes.

Figure 2. Three phases for each individual task: receive, compute, and send.

From a single task point of view, the execution flow consists of three phases: receive, compute, and send phases, shown in Figure 1 In the receive and send phases, communication involves data transfer between two different groups of compute nodes. It also involves message packing in the send phase and unpacking in the receive phase. Data redistribution strategy plays an important role in determining the communication performance. In the compute phase, work load is evenly partitioned among all compute nodes assigned in each task to achieve the maximum efficiency. For the parallel systems with multiple processors in each compute node, multi-threading technique can be employed to further improve the computation performance. We will discuss the implementation of multiple threads in our parallel pipeline system later in this paper.

Data dependency:

In such a parallel pipeline system, there exist both spatial and temporal parallelism that result in two types of data dependencies and flows, namely, spatial data dependency and temporal data dependency [1, 2]. Spatial data dependency can be classified into inter-task data dependency and intra-task data dependency. Intra-task data dependencies arise when a set of subtasks needs to exchange intermediate results during the execution of a parallel task in a pipeline. Inter-task data dependency is due to the transfer and reorganization of data passed onto the next parallel task in the pipeline. Temporal data dependency occurs when some form of output generated by the tasks executed on the previous data set are needed by tasks executing the current data set.


  1. A. Choudhary, "Parallel Architectures and Parallel Algorithms for Integrated Vision Systems", Kluwer Academic Publishers, Boston, MA, 1990.
  2. A. Choudhary and R. Ponnusamy, "Run-Time Data Decomposition for Parallel Implementation of IMage Processing and Computer Vision Tasks", Journal of Concurrency, Practice and Experience, 1992.

Click here to go back

Northwestern University EECS Home | McCormick Home | Northwestern Home | Calendar: Plan-It Purple
© 2011 Robert R. McCormick School of Engineering and Applied Science, Northwestern University
"Tech": 2145 Sheridan Rd, Tech L359, Evanston IL 60208-3118  |  Phone: (847) 491-5410  |  Fax: (847) 491-4455
"Ford": 2133 Sheridan Rd, Ford Building, Rm 3-320, Evanston, IL 60208  |  Fax: (847) 491-5258
Email Director

Last Updated: $LastChangedDate: 2015-02-19 15:02:26 -0600 (Thu, 19 Feb 2015) $