    Pipelining in Computer Architecture

    Parallel Processing

    A Parallel Processing system is able to perform concurrent data processing to achieve faster execution time.

    Example: While an instruction is being executed in the ALU, the next instruction can be read from memory.

    The system may have two or more ALUs and be able to execute two or more instructions at the same time. The purpose of
    parallel processing is to speed up the computer's processing capabilities.

    Pipelining Case: Laundry

    4 loads of laundry need to be washed, dried, and folded.

    – 30 minutes to wash, 40 min. to dry, and 20 min. to fold.
    – We have 1 washer, 1 dryer, and 1 folding station.

    What’s the most efficient way to get the 4 loads of laundry done?



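    Non-Pipelined Laundry • Takes a total of 6 hours (4 loads × 90 minutes each); nothing is done in parallel.

    Pipelined Laundry • Each load moves to the next station as soon as that station is free, so the stages overlap. The 4 loads take 3.5 hours (30 + 4 × 40 + 20 = 210 minutes); for example, laundry started at 6:00 would be done at 9:30.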
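    The two laundry schedules can be checked with a short simulation. This is a minimal sketch, not part of the original slides: the function names are illustrative, and it assumes a finished load can wait between stations until the next station is free.

```python
# Stage durations in minutes, taken from the laundry example.
STAGES = [30, 40, 20]  # wash, dry, fold
LOADS = 4

def sequential_minutes(stages, loads):
    """Each load passes through every stage before the next load starts."""
    return loads * sum(stages)

def pipelined_minutes(stages, loads):
    """Each load enters a stage as soon as both the load and the station are free.

    Assumption: loads may wait between stations (unlimited buffering).
    """
    station_free = [0] * len(stages)  # time at which each station becomes free
    finish = 0
    for _ in range(loads):
        t = 0  # time at which this load is ready for its next stage
        for s, duration in enumerate(stages):
            start = max(t, station_free[s])
            t = start + duration
            station_free[s] = t
        finish = t
    return finish

print(sequential_minutes(STAGES, LOADS))  # 360 minutes = 6 hours
print(pipelined_minutes(STAGES, LOADS))   # 210 minutes = 3.5 hours
```

    The dryer, at 40 minutes, is the slowest station, so it sets the pace of the whole pipeline.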



    Pipelining is a speed-up technique in which the execution of multiple instructions is overlapped on a processor.


    Computers, like laundry, typically perform the exact same steps for every instruction:

    • Fetch an instruction from memory
    • Decode the instruction
    • Execute the instruction
    • Read memory to get input
    • Write the result back to memory

    Instruction Pipeline

    The instruction execution process lends itself naturally to pipelining:
    the subtasks of instruction fetch, decode and execute can be overlapped.

    • Fetch instruction (FI)
    • Decode instruction (DI)
    • Calculate operands (CO)
    • Fetch operands (FO)
    • Execute instructions (EI)
    • Write operand (WO)

    These operations are overlapped across successive instructions.

    The instruction pipeline has six operations:

    Fetch Instruction • The FI stage is responsible for obtaining the requested instruction from memory. The instruction and
    the program counter are stored in a register as temporary storage.

    Decode Instruction • The DI stage is responsible for decoding the instruction and sending out the various control lines to
    the other parts of the processor.

    Calculate Operands • The CO stage is where any calculations are performed. The main component in this stage is the ALU,
    which provides arithmetic and logic capabilities.

    Fetch Operands and Execute Instruction • The FO and EI stages are responsible for storing and loading values to and from memory.
    They are also responsible for input and output from the processor, respectively.

    Write Operands • The WO stage is responsible for writing the result of a calculation, memory access or input into
    the register file.

    Six Stage Instruction Pipeline

    Timing Diagram for Instruction Pipeline Operation
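    A timing diagram of this kind can be generated as text. The following is a minimal sketch of an ideal six-stage pipeline, assuming one stage per clock cycle and no stalls or branches; the function names are illustrative and not from the original slides.

```python
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]

def timing_diagram(num_instructions, stages=STAGES):
    """Return one row per instruction; each row has one cell per clock cycle."""
    k = len(stages)
    total_cycles = k + num_instructions - 1
    rows = []
    for i in range(num_instructions):
        row = ["--"] * total_cycles      # "--" marks a cycle with no activity
        for s, name in enumerate(stages):
            row[i + s] = name            # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows

def render(rows):
    """Format the rows with a header line of cycle numbers."""
    header = "      " + " ".join(f"{c:>2}" for c in range(1, len(rows[0]) + 1))
    lines = [header]
    for i, row in enumerate(rows, start=1):
        lines.append(f"I{i:<4} " + " ".join(row))
    return "\n".join(lines)

print(render(timing_diagram(4)))
```

    Each instruction starts one cycle after the previous one, so four instructions finish in 6 + 3 = 9 cycles rather than 24.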




    Pipeline Performance: Clock & Timing

    Consecutive stages $S_i$ and $S_{i+1}$ are separated by a latch.

    • Delay of stage $S_m$: $\tau_m$
    • Latch delay: $d$
    • Clock cycle of the pipeline: $\tau = \max\{\tau_m\} + d$
    • Pipeline frequency: $f = 1/\tau$
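    As a worked example of the clock relation (the clock cycle is the largest stage delay plus the latch delay, and the frequency is its reciprocal), here is a minimal sketch. The delay values are hypothetical and chosen only for illustration.

```python
def pipeline_clock(stage_delays_ns, latch_delay_ns):
    """Return (clock cycle in ns, frequency in MHz) for the given delays."""
    tau = max(stage_delays_ns) + latch_delay_ns  # slowest stage sets the clock
    freq_mhz = 1e3 / tau                         # 1 / (1 ns) = 1000 MHz
    return tau, freq_mhz

# Hypothetical delays for a six-stage pipeline (not from the original slides).
tau, f = pipeline_clock([7, 8, 9, 8, 9, 6], latch_delay_ns=1)
print(tau, f)  # 10 ns clock cycle, 100.0 MHz
```

    Every stage is clocked at the pace of the slowest one, which is why balanced stage delays matter.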


    Pipeline Performance: Speedup & Efficiency

    A k-stage pipeline processes n tasks in k + (n-1) clock cycles: k cycles for the first task and n-1 cycles for the remaining n-1 tasks.

    Total time to process n tasks: $T_k = [k + (n-1)]\,\tau$

    For the non-pipelined processor: $T_1 = n k \tau$

    Speedup factor: $S_k = \frac{T_1}{T_k} = \frac{n k \tau}{[k + (n-1)]\,\tau} = \frac{n k}{k + (n-1)}$
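    The speedup formula can be evaluated for a few task counts to see how it behaves. This is a minimal sketch with illustrative function names; k = 6 matches the six-stage pipeline described earlier.

```python
def pipelined_cycles(k, n):
    """Clock cycles for a k-stage pipeline to process n tasks."""
    return k + (n - 1)

def speedup(k, n):
    """Speedup of a k-stage pipeline over a non-pipelined processor for n tasks."""
    return (n * k) / (k + (n - 1))

for n in (1, 10, 100, 10_000):
    print(n, pipelined_cycles(6, n), round(speedup(6, n), 3))
```

    For a single task there is no speedup (the factor is 1). As n grows, the factor approaches k, the number of stages, but never reaches it.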


    Pipelining makes efficient use of resources.
    A large number of instructions executes in less time.
    The parallelism is invisible to the programmer.



