INSTRUCTIONS AND INSTRUCTION SEQUENCING
• Data transfers between the memory and the processor registers
• Arithmetic and logic operations on data
• Program sequencing and control
• I/O transfers
Register Transfer Notation
• Identify a location by a symbolic name standing for its hardware binary address (LOC, R0, ...)
• Contents of a location are denoted by placing square brackets around the name of the location: R1 ← [LOC], R3 ← [R1] + [R2]
• This is Register Transfer Notation (RTN)
Assembly Language Notation
• Represent machine instructions and programs
• Move LOC, R1 means R1 ← [LOC]
• Add R1, R2, R3 means R3 ← [R1] + [R2]
Basic instruction types
• Three-address instructions: Add A, B, C
  A, B: source operands; C: destination operand
• Two-address instructions: Add A, B
  B ← [A] + [B]
• One-address instructions: Add A
  Add contents of A to the accumulator and store the sum back in the accumulator
• Zero-address instructions
  Instructions keep their operands in a structure called a pushdown stack
Instruction formats
Three-Address Instructions
  ADD R1, R2, R3     R1 ← R2 + R3
Two-Address Instructions
  ADD R1, R2         R1 ← R1 + R2
One-Address Instructions
  ADD M              AC ← AC + M[AR]
Zero-Address Instructions
  ADD                TOS ← TOS + (TOS - 1)
RISC Instructions
  Lots of registers. Memory access is restricted to Load and Store
Example: Evaluate (A+B)∗(C+D)
Three-Address
1. ADD R1, A, B    ; R1 ← M[A] + M[B]
2. ADD R2, C, D    ; R2 ← M[C] + M[D]
3. MUL X, R1, R2   ; M[X] ← R1 ∗ R2
Example: Evaluate (A+B)∗(C+D)
Two-Address
1. MOV R1, A     ; R1 ← M[A]
2. ADD R1, B     ; R1 ← R1 + M[B]
3. MOV R2, C     ; R2 ← M[C]
4. ADD R2, D     ; R2 ← R2 + M[D]
5. MUL R1, R2    ; R1 ← R1 ∗ R2
6. MOV X, R1     ; M[X] ← R1
Example: Evaluate (A+B)∗(C+D)
One-Address
1. LOAD A     ; AC ← M[A]
2. ADD B      ; AC ← AC + M[B]
3. STORE T    ; M[T] ← AC
4. LOAD C     ; AC ← M[C]
5. ADD D      ; AC ← AC + M[D]
6. MUL T      ; AC ← AC ∗ M[T]
7. STORE X    ; M[X] ← AC
Example: Evaluate (A+B)∗(C+D)
Zero-Address
1. PUSH A    ; TOS ← A
2. PUSH B    ; TOS ← B
3. ADD       ; TOS ← (A + B)
4. PUSH C    ; TOS ← C
5. PUSH D    ; TOS ← D
6. ADD       ; TOS ← (C + D)
7. MUL       ; TOS ← (C+D)∗(A+B)
8. POP X     ; M[X] ← TOS
Instruction Execution and Straight-Line Sequencing
• The processor control circuits use the information in the PC to fetch and execute instructions one at a time, in order of increasing address
• This is called straight-line sequencing
• Executing an instruction is a 2-phase procedure
• 1st phase ("instruction fetch"): the instruction is fetched from the memory location whose address is in the PC
• This instruction is placed in the instruction register (IR) in the processor
• 2nd phase ("instruction execute"): the instruction in the IR is examined to determine which operation is to be performed
Branching
• Branch: a type of instruction that loads a new value into the program counter
• So the processor fetches and executes the instruction at this new address, called the "branch target"
• Conditional branch: causes a branch if a specified condition is satisfied
• E.g. Branch>0 LOOP is a conditional branch instruction; the branch is taken only if the condition is satisfied
[Figure: A straight-line program for adding n numbers]
[Figure: Using a loop to add n numbers]
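A loop that adds n numbers with a conditional branch can be sketched roughly as follows. This is only an illustration in Python, not the program from the figure; the names run_loop_sum, r0 and r1 are hypothetical, and the sketch assumes n is at least 1.

```python
def run_loop_sum(numbers):
    """Add the entries of `numbers` using a counter and a Branch>0 style test."""
    if not numbers:
        raise ValueError("this sketch assumes n >= 1")
    r1 = len(numbers)  # R1 holds the loop counter n (assumed register name)
    r0 = 0             # R0 accumulates the sum (assumed register name)
    index = 0          # position of the next number to add
    while True:
        r0 += numbers[index]  # LOOP: add the next number to R0
        index += 1            # point to the next entry in the list
        r1 -= 1               # decrement the counter
        if r1 > 0:            # Branch>0 LOOP: taken only while R1 > 0
            continue
        break                 # condition not satisfied: fall through
    return r0
```

With a straight-line program the add step would be written out n times; the loop reuses one add step and lets the conditional branch decide whether to repeat it.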
Condition codes
• The required information is recorded in individual bits called "condition code flags"
• These flags are grouped together in a special processor register called the "condition code register" or "status register"
• Individual condition code flags are set to 1 or cleared to 0
• Different instructions affect different flags
• Four commonly used flags are:
  N (negative): Set to 1 if the result is negative; otherwise cleared to 0
  Z (zero): Set to 1 if the result is 0; otherwise cleared to 0
  V (overflow): Set to 1 if arithmetic overflow occurs; otherwise cleared to 0
  C (carry): Set to 1 if a carry-out results from the operation; otherwise cleared to 0
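As a rough illustration of how the four flags could be derived for an addition, the sketch below adds two unsigned bit patterns and reports N, Z, V and C. The function name add_with_flags and the 8-bit default width are assumptions made for this example only.

```python
def add_with_flags(a, b, bits=8):
    """Add two `bits`-wide values and return (result, flags)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    flags = {
        "N": 1 if result & sign else 0,  # sign bit of the result is 1
        "Z": 1 if result == 0 else 0,    # result is all zeros
        # overflow: both operands have the same sign, the result does not
        "V": 1 if (~(a ^ b) & (a ^ result) & sign) else 0,
        "C": 1 if total > mask else 0,   # carry out of the top bit
    }
    return result, flags
```

For example, adding 0x7F and 0x01 in 8 bits gives 0x80 with N and V set, while adding 0xFF and 0x01 gives 0x00 with Z and C set.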
INSTRUCTION SET ARCHITECTURE
• Superscalar processor: can execute more than one instruction per cycle
• Cycle: smallest unit of time in a processor
• Parallelism: the ability to do more than one thing at once
• Pipelining: overlapping parts of a large task to increase throughput without decreasing latency
Instruction Set Architecture (ISA)
The Instruction Set Architecture (ISA) is the part of the processor that is visible to the programmer or compiler writer. The ISA serves as the boundary between software and hardware. We will briefly describe the instruction sets found in many of the microprocessors used today. The ISA of a processor can be described using 5 categories. The 3 most common types of ISAs are:
1. Stack: The operands are implicitly on top of the stack.
2. Accumulator: One operand is implicitly the accumulator.
3. General Purpose Register (GPR): All operands are explicitly mentioned; they are either registers or memory locations.
Let's look at the assembly code of C = A + B; in all 3 architectures:
Stack        Accumulator    GPR
PUSH A       LOAD A         LOAD R1,A
PUSH B       ADD B          ADD R1,B
ADD          STOR C         STOR R1,C
POP C        -              -
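The three styles can be compared with a small sketch that computes C = A + B on a dictionary standing in for memory. The function names and the dictionary-based memory are assumptions made for illustration; each function mirrors one column of the comparison above.

```python
def stack_machine(mem):
    """PUSH A; PUSH B; ADD; POP C (operands are implicitly on top of the stack)."""
    stack = []
    stack.append(mem["A"])      # PUSH A
    stack.append(mem["B"])      # PUSH B
    right = stack.pop()         # ADD pops the two top entries ...
    left = stack.pop()
    stack.append(left + right)  # ... and pushes their sum
    mem["C"] = stack.pop()      # POP C
    return mem


def accumulator_machine(mem):
    """LOAD A; ADD B; STOR C (one operand is implicitly the accumulator)."""
    ac = mem["A"]   # LOAD A
    ac += mem["B"]  # ADD B
    mem["C"] = ac   # STOR C
    return mem


def gpr_machine(mem):
    """LOAD R1,A; ADD R1,B; STOR R1,C (every operand is named explicitly)."""
    regs = {"R1": 0}
    regs["R1"] = mem["A"]   # LOAD R1,A
    regs["R1"] += mem["B"]  # ADD R1,B
    mem["C"] = regs["R1"]   # STOR R1,C
    return mem
```

All three leave the same value in C; they differ only in where the operands are assumed to live while the addition is performed.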
Stack
Advantages: Simple model of expression evaluation (reverse Polish). Short instructions.
Disadvantages: A stack can't be randomly accessed. This makes it hard to generate efficient code. The stack itself is accessed on every operation and becomes a bottleneck.
Accumulator
Advantages: Short instructions.
Disadvantages: The accumulator is only temporary storage, so memory traffic is the highest for this approach.
GPR
Advantages: Makes code generation easy. Data can be stored for long periods in registers.
Disadvantages: All operands must be named, leading to longer instructions.
Earlier CPUs were of the first 2 types, but in the last 15 years all CPUs made are GPR processors. There are 2 major reasons. First, registers are faster than memory, so the more data that can be kept internally in the CPU, the faster the program will run. Second, registers are easier for a compiler to use.
ADDRESSING MODES
The different ways in which the location of an operand is specified in an instruction are referred to as addressing modes.
TYPES OF ADDRESSING MODES
Variable: represented by allocating a register or memory location to hold its value
1. REGISTER MODE
• The operand is the contents of a processor register; the name of the register is given in the instruction
• E.g. Move LOC, R2
• Processor registers are used as temporary storage locations, where data in a register are accessed using register mode
2. ABSOLUTE MODE (OR) DIRECT MODE
• The operand is in a memory location; the address of this location is given explicitly in the instruction
• E.g. Integer A, B
• Absolute mode is used to access these variables
3. IMMEDIATE MODE
• Address and data constants are represented in assembly language using immediate mode
• The operand is given explicitly in the instruction
• E.g. Move #200, R0
• (#): the value is used as an immediate operand
• Mainly used to specify the value of a source operand
4. INDIRECT MODE
• The memory address of an operand is determined through the instruction rather than given in it directly
• This address is called the Effective Address (EA) of the operand
• EA of an operand = contents of a register
• When absolute mode is not available, indirect addressing through registers is used to access global variables
5. INDEX MODE
• Deals with lists and arrays
• EA: generated by adding a constant value to the contents of a register
• Index register = one of the set of general purpose registers in a processor
• E.g. X(Ri)
• X: constant value in the instruction
• Ri: name of the register involved
• EA = X + [Ri]
• A second register can be used in index mode: (Ri, Rj)
• EA: sum of the contents of registers Ri and Rj
• The second register is the base register, e.g. X(Ri, Rj)
• EA = X + [Ri] + [Rj]
• Gives more flexibility
6. RELATIVE MODE
• EA is determined as in index mode, using the program counter in place of the general purpose register
• This mode is used to access data operands
• Common use: specify the target address in a branch instruction
• E.g. Branch>0 Loop
• Program execution goes to the branch target location identified by the name Loop if the branch condition is satisfied
7. AUTOINCREMENT MODE
• Useful for accessing data items in successive locations in memory
• EA of an operand = contents of the register specified in the instruction
• After accessing the operand, the contents of the register are automatically incremented to point to the next item in a list
• E.g. (Ri)+
• Increment amount: 1 for byte-sized operands, 2 for 16-bit operands, 4 for 32-bit operands
8. AUTODECREMENT MODE
• Contents of the register specified in the instruction are first automatically decremented and then used as the EA of the operand
• E.g. -(Ri)
• The minus sign indicates that the contents are decremented before being used as the EA
• Operands are accessed in descending address order
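The effective-address rules for several of these modes can be collected in one small sketch. The function effective_address, its mode strings and its keyword parameters are hypothetical names chosen for this illustration; registers are modeled as a dictionary.

```python
def effective_address(mode, regs, *, reg=None, reg2=None, x=0, size=1):
    """Return the EA for `mode`; autoincrement/autodecrement also update `regs`."""
    if mode == "absolute":
        return x                            # address given in the instruction
    if mode == "indirect":
        return regs[reg]                    # EA = [Ri]
    if mode == "index":
        return x + regs[reg]                # EA = X + [Ri]
    if mode == "base_index":
        return x + regs[reg] + regs[reg2]   # EA = X + [Ri] + [Rj]
    if mode == "relative":
        return x + regs["PC"]               # index mode with PC as the register
    if mode == "autoincrement":
        ea = regs[reg]                      # use the register contents first ...
        regs[reg] += size                   # ... then step to the next item
        return ea
    if mode == "autodecrement":
        regs[reg] -= size                   # decrement first ...
        return regs[reg]                    # ... then use the result as the EA
    raise ValueError(f"unknown mode: {mode}")
```

Here size stands for the increment amount (1, 2 or 4 depending on operand width). Calling the autoincrement mode repeatedly walks a list in ascending address order; the autodecrement mode walks it in descending order.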
ALU DESIGN
Instructions that involve an arithmetic or logic operation can be executed using similar steps. They differ from the Load instruction in two ways:
• There are either two source registers, or a source register and an immediate source operand.
• No access to memory operands is required.
A typical instruction of this type is Add R3, R4, R5. It requires the following steps:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read the contents of source registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. Load the result into the destination register, R3.
The Add instruction does not require access to an operand in the memory, and therefore could be completed in four steps instead of the five steps needed for the Load instruction. However, as we will see in the next chapter, it is advantageous to use the same multi-stage processing hardware for as many instructions as possible. This can be achieved if we arrange for all instructions to be executed in the same number of steps. To this end, the Add instruction should be extended to five steps, patterned along the steps of the Load instruction. Since no access to memory operands is required, we can insert a step in which no action takes place between steps 3 and 4 above. The Add instruction would then be performed as follows:
1. Fetch the instruction and increment the program counter.
2. Decode the instruction and read registers R4 and R5.
3. Compute the sum [R4] + [R5].
4. No action.
5. Load the result into the destination register, R3.
If the instruction uses an immediate operand, as in Add R3, R4, #1000, the immediate value is given in the instruction word. Once the instruction is loaded into the IR, the immediate value is available for use in the addition operation. The same five-step sequence can be used, with steps 2 and 3 modified as:
2. Decode the instruction and read register R4.
3. Compute the sum [R4] + 1000.
[Figure: Addition logic for a single stage]
n-bit adder
• Cascade n full adder (FA) blocks to form an n-bit adder
• Carries propagate or ripple through this cascade: n-bit ripple carry adder
• Carry-in c0 into the LSB position provides a convenient way to perform subtraction
n-bit subtractor
• Recall that X - Y is equivalent to adding the 2's complement of Y to X
• 2's complement is equivalent to 1's complement + 1
• X - Y = X + Y' + 1, where Y' denotes the 1's complement of Y
• 2's complement of positive and negative numbers is computed similarly
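A minimal sketch of a ripple carry adder, and of subtraction through the carry-in c0, is given below. The function names full_adder, ripple_add and subtract are illustrative assumptions, and operands are treated as n-bit patterns.

```python
def full_adder(x, y, c_in):
    """One FA stage: returns (sum bit, carry-out) for single-bit inputs."""
    s = x ^ y ^ c_in
    c_out = (x & y) | (x & c_in) | (y & c_in)
    return s, c_out


def ripple_add(x, y, n, c0=0):
    """Cascade n full adders; the carry ripples from the LSB to the MSB."""
    carry = c0
    result = 0
    for i in range(n):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry  # n-bit sum and the final carry-out


def subtract(x, y, n):
    """X - Y computed as X + (1's complement of Y) + 1, with the +1 fed in as c0."""
    mask = (1 << n) - 1
    return ripple_add(x, ~y & mask, n, c0=1)
```

For 4-bit operands, subtract(3, 5, 4) yields the pattern 1110, which is -2 in 2's complement.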
n-bit adder/subtractor
• Add/sub control = 0: addition
• Add/sub control = 1: subtraction
Sequential multiplication
• Recall the rule for generating partial products: if the ith bit of the multiplier is 1, add the appropriately shifted multiplicand to the current partial product
• The multiplicand has been shifted left when added to the partial product
• However, adding a left-shifted multiplicand to an unshifted partial product is equivalent to adding an unshifted multiplicand to a right-shifted partial product
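The right-shifting form of the rule can be sketched as below for unsigned n-bit operands. The register names a (partial product), q (multiplier) and the one-bit carry follow a common textbook arrangement and are assumptions here, as is the function name sequential_multiply.

```python
def sequential_multiply(multiplicand, multiplier, n):
    """Unsigned n-bit multiply: add the unshifted multiplicand, then shift right."""
    mask = (1 << n) - 1
    m = multiplicand & mask
    q = multiplier & mask  # multiplier; its LSB selects add or no-add
    a = 0                  # upper half of the partial product
    for _ in range(n):
        carry = 0
        if q & 1:                  # current multiplier bit is 1
            total = a + m          # add the unshifted multiplicand
            a = total & mask
            carry = total >> n     # keep the carry-out for the shift
        # shift (carry, a, q) right by one bit as a single long register
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (carry << (n - 1))
    return (a << n) | q            # 2n-bit product held in a:q
```

After n iterations the multiplier bits have been shifted out of q and the product occupies a and q together.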
[Figure: Circuit arrangement for binary division]
BASIC PROCESSING UNIT
INTRODUCTION
• Instruction Set Processor (ISP)
• Central Processing Unit (CPU)
• A typical computing task consists of a series of steps specified by a sequence of machine instructions that constitute a program.
• An instruction is executed by carrying out a sequence of more rudimentary operations.
FUNDAMENTAL CONCEPTS
• The processor fetches one instruction at a time and performs the operation specified.
• Instructions are fetched from successive memory locations until a branch or a jump instruction is encountered.
• The processor keeps track of the address of the memory location containing the next instruction to be fetched using the Program Counter (PC).
• Instruction Register (IR)
Execution of one instruction requires the following three steps to be performed by the CPU:
1. Fetch the contents of the memory location pointed at by the PC. The contents of this location are interpreted as an instruction to be executed. Hence, they are stored in the instruction register (IR). Symbolically, this can be written as: IR ← [[PC]]
2. Assuming that the memory is byte addressable, increment the contents of the PC by 4, that is: PC ← [PC] + 4
3. Carry out the actions specified by the instruction stored in the IR.
But in cases where an instruction occupies more than one word, steps 1 and 2 must be repeated as many times as necessary to fetch the complete instruction. The first two steps are usually referred to as the fetch phase. Step 3 constitutes the execution phase.
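The fetch phase can be sketched as below, with memory modeled as a dictionary keyed by byte address and one instruction per 4-byte word. The names fetch and run, the "HALT" marker and the string instructions are hypothetical, introduced only to make the sketch runnable.

```python
def fetch(memory, pc, word_bytes=4):
    """Fetch phase: IR <- [[PC]], then PC <- [PC] + 4."""
    ir = memory[pc]       # contents of the location pointed at by the PC
    pc = pc + word_bytes  # byte-addressable memory, 4-byte instructions
    return ir, pc


def run(memory, start=0):
    """Fetch instructions from successive locations until "HALT" (assumed marker)."""
    pc = start
    executed = []
    while True:
        ir, pc = fetch(memory, pc)
        if ir == "HALT":
            break
        executed.append(ir)  # stand-in for the execution phase
    return executed, pc
```

A branch instruction would differ only in the execution phase, which would load a new value into pc instead of leaving the incremented one in place.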
• Fetch the contents of a given memory location and load them into a CPU register.
• Store a word of data from a CPU register into a given memory location.
• Transfer a word of data from one CPU register to another or to the ALU.
• Perform an arithmetic or logic operation and store the result in a CPU register.
REGISTER TRANSFER
The input and output gates for register Ri are controlled by the signals Riin and Riout, respectively.
• Thus, when Riin is set to 1, the data available on the common bus is loaded into Ri.
• Similarly, when Riout is set to 1, the contents of register Ri are placed on the bus.
• While Riout is equal to 0, the bus can be used for transferring data from other registers.
Let us now consider data transfer between two registers. For example, to transfer the contents of register R1 to R4, the following actions are needed:
• Enable the output gate of register R1 by setting R1out to 1. This places the contents of R1 on the CPU bus.
• Enable the input gate of register R4 by setting R4in to 1. This loads data from the CPU bus into register R4.
Performing an Arithmetic Or Logic Operation
• The ALU is a combinational circuit that has no internal storage.
• The ALU gets its two operands from the MUX and the bus. The result is temporarily stored in register Z.
• A sequence of operations to add the contents of register R1 to those of register R2 and store the result in register R3 is:
1. R1out, Yin
2. R2out, SelectY, Add, Zin
3. Zout, R3in
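A sequence of control steps on a single shared bus can be simulated with the short sketch below. It assumes the usual signal naming in which "Xout" places register X on the bus and "Xin" loads register X, that the ALU's second input comes from the bus, and that the ALU result can only be loaded into Z; the signals Zin, Zout and R3in and the function run_control_steps are assumptions of this sketch.

```python
def run_control_steps(regs, steps):
    """Apply each step's control signals to `regs` (a dict of register values)."""
    for signals in steps:
        bus = None
        for s in signals:
            if s.endswith("out"):          # e.g. "R1out" drives the bus from R1
                bus = regs[s[:-3]]
        alu = None
        if "Add" in signals:
            # MUX selects Y when SelectY is active, else the constant 4
            a = regs["Y"] if "SelectY" in signals else 4
            alu = a + bus                  # B input of the ALU is the bus
        for s in signals:
            if s.endswith("in"):           # e.g. "Yin" loads Y
                name = s[:-2]
                regs[name] = alu if name == "Z" else bus
    return regs


ADD_R1_R2_INTO_R3 = [
    ["R1out", "Yin"],
    ["R2out", "SelectY", "Add", "Zin"],
    ["Zout", "R3in"],
]
```

Only one "out" signal is active per step in this example, which reflects the constraint that a single bus can carry one value at a time.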
FETCHING A WORD FROM MEMORY
• The CPU transfers the address of the required information word to the memory address register (MAR). The address of the required word is transferred to the main memory.
• Meanwhile, the CPU uses the control lines of the memory bus to indicate that a read operation is required.
• After issuing this request, the CPU waits until it receives an answer from the memory, informing it that the requested function has been completed. This is accomplished through the use of another control signal on the memory bus, which will be referred to as Memory Function Completed (MFC).
• The memory sets this signal to 1 to indicate that the contents of the specified location in the memory have been read and are available on the data lines of the memory bus.
• We will assume that as soon as the MFC signal is set to 1, the information on the data lines is loaded into MDR and is thus available for use inside the CPU. This completes the memory fetch operation.
The actions needed for the instruction Move (R1), R2 are:
• MAR ← [R1]
• Start a Read operation on the memory bus
• Wait for the MFC response from the memory
• Load MDR from the memory bus
• R2 ← [MDR]
Signals activated for this operation are:
1. R1out, MARin, Read
2. MDRinE, WMFC
3. MDRout, R2in
Storing a Word in Memory
• The procedure is similar to fetching a word from memory.
• The desired address is loaded into MAR.
• Then the data to be written are loaded into MDR and a Write command is issued.
• If we assume that the data word to be stored in the memory is in R2 and that the memory address is in R1, the Write operation requires the following sequence:
• MAR ← [R1]
• MDR ← [R2]
• Write
• Wait for the MFC
Move R2, (R1) requires the following sequence of signals:
1. R1out, MARin
2. R2out, MDRin, Write
3. MDRoutE, WMFC
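The read and write sequences through MAR and MDR can be sketched as follows. The class MemoryInterface and the two helper functions are hypothetical; the memory here answers immediately, so the MFC wait is modeled only as a returned flag.

```python
class MemoryInterface:
    """Toy memory reached only through MAR (address) and MDR (data)."""

    def __init__(self):
        self.cells = {}
        self.mar = 0
        self.mdr = 0

    def read(self):
        self.mdr = self.cells.get(self.mar, 0)  # load MDR from the memory bus
        return True                             # MFC: function completed

    def write(self):
        self.cells[self.mar] = self.mdr         # store MDR at address MAR
        return True                             # MFC: function completed


def move_from_memory(regs, mem, addr_reg, dest_reg):
    """Move (R1), R2 style transfer: MAR <- [R1]; Read; wait MFC; R2 <- [MDR]."""
    mem.mar = regs[addr_reg]
    if not mem.read():
        raise RuntimeError("no MFC response")
    regs[dest_reg] = mem.mdr


def move_to_memory(regs, mem, src_reg, addr_reg):
    """Move R2, (R1) style transfer: MAR <- [R1]; MDR <- [R2]; Write; wait MFC."""
    mem.mar = regs[addr_reg]
    mem.mdr = regs[src_reg]
    if not mem.write():
        raise RuntimeError("no MFC response")
```

In real hardware the processor would stall until MFC is asserted; the sketch collapses that wait into a single call.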
Consider the instruction: Add (R3), R1
• Executing this instruction requires the following actions:
• Fetch the instruction
• Fetch the first operand (the contents of the memory location pointed to by R3)
• Perform the addition
• Load the result into R1
Control Sequence for instruction Add (R3), R1
1. PCout, MARin, Read, Select4, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
4. R3out, MARin, Read
5. R1out, Yin, WMFC
6. MDRout, SelectY, Add, Zin
7. Zout, R1in, End
Branch Instructions
1. PCout, MARin, Read, Select4, Add, Zin
2. Zout, PCin, Yin, WMFC
3. MDRout, IRin
4. Offset-field-of-IRout, Add, Zin
5. Zout, PCin, End
[Figure: Processor datapath organized around an internal processor bus, showing the PC, MAR (address lines) and MDR (data lines) on the memory bus, IR, instruction decoder and control logic, Y, registers R0 to R(n-1), the MUX with constant 4 and Select input, the ALU with its control lines (Add, Sub, XOR, Carry-in), Z and TEMP]
MULTIPLE BUS ORGANIZATION
One solution to the bandwidth limitation of a single bus is to simply add additional buses. Consider the architecture shown in Figure 2.2 that contains N processors, P1, P2, ..., PN, each having its own private cache, and all connected to a shared memory by B buses, B1, B2, ..., BB. The shared memory consists of M interleaved banks, M1, M2, ..., MM, to allow simultaneous memory requests concurrent access to the shared memory. This avoids the loss in performance that occurs if those accesses must be serialized, which is the case when there is only one memory bank. Each processor is connected to every bus, and so is each memory bank. When a processor needs to access a particular bank, it has B buses from which to choose. Thus each processor-memory pair is connected by several redundant paths, which implies that the failure of one or more paths can, in principle, be tolerated at the cost of some degradation in system performance.
In a multiple bus system several processors may attempt to access the shared memory simultaneously. To deal with this, a policy must be implemented that allocates the available buses to the processors making requests to memory. In particular, the policy must deal with the case when the number of processors exceeds B. For performance reasons this allocation must be carried out by hardware arbiters which, as we shall see, add significantly to the complexity of the multiple bus interconnection network.
Control sequence for adding the contents of R4 and R5 into R6:
1. PCout, R=B, MARin, Read, IncPC
2. WMFC
3. MDRoutB, R=B, IRin
4. R4outA, R5outB, SelectA, Add, R6in, End