Memory
COE 308 Computer Architecture
Prof. Muhamed Mudawar
Computer Engineering Department
King Fahd University of Petroleum and Minerals

Presentation Outline
  Random Access Memory and its Structure
  Memory Hierarchy and the need for Cache Memory
  The Basics of Caches
  Cache Performance and Memory Stall Cycles
  Improving Cache Performance
  Multilevel Caches

Random Access Memory
  Large arrays of storage cells
  Volatile memory
    Holds the stored data as long as it is powered on
  Random Access
    Access time is practically the same to any data on a RAM chip
  Chip Select (CS) control signal
    Selects RAM chip to read/write
  Read/Write (R/W) control signal
    Specifies memory operation
  2^n × m RAM chip: n-bit address and m-bit data

Typical Memory Structure
  Row decoder
    Select row to read/write
  Column decoder
    Select column to read/write
  Cell Matrix
    2D array of tiny memory cells
  Sense/Write amplifiers
    Sense & amplify data on read
    Drive bit line with data in on write
  Same data lines are used for data in/out

Static RAM Storage Cell
  Static RAM (SRAM): fast but expensive RAM
  6-Transistor cell with no static current
  Typically used for caches
  Provides fast access time
  Cell Implementation:
    Cross-coupled inverters store bit
    Two pass transistors
    Row decoder selects the word line
    Pass transistors enable the cell to be read and written

Dynamic RAM Storage Cell
  Dynamic RAM (DRAM): slow, cheap, and dense memory
  Typical choice for main memory
  Cell Implementation:
    1-Transistor cell (pass transistor)
    Trench capacitor (stores bit)
  Bit is stored as a charge on capacitor
  Must be refreshed periodically
    Because of leakage of charge from tiny capacitor
  Refreshing for all memory rows
    Reading each row and writing it back to restore the charge

DRAM Refresh Cycles
  Refresh cycle is about tens of milliseconds
  Refreshing is done for the entire memory
  Each row is read and written back to restore the charge
  Some of the memory bandwidth is lost to refresh cycles

Loss of Bandwidth to Refresh Cycles
  Example:
    A 256 Mb DRAM chip
    Organized internally as a 16K × 16K cell matrix
    Rows must be refreshed at least once every 50 ms
    Refreshing a row takes 100 ns
    What fraction of the memory bandwidth is lost to refresh cycles?
  Solution:
    Refreshing all 16K rows takes: 16 × 1024 × 100 ns = 1.64 ms
    Loss of 1.64 ms every 50 ms
    Fraction of lost memory bandwidth = 1.64 / 50 = 3.3%

Typical DRAM Packaging
  24-pin dual in-line package for 16 Mbit = 2^22 × 4 memory
  22-bit address is divided into
    11-bit row address
    11-bit column address
    Interleaved on same address lines

Trends in DRAM
  DRAM capacity quadrupled every three years until 1996
  After 1996, DRAM capacity doubled every two years

Expanding the Data Bus Width
  Memory chips typically have a narrow data bus
  We can expand the data bus width by a factor of p
    Use p RAM chips and feed the same address to all chips
    Use the same Chip Select and Read/Write control signals

Increasing Memory Capacity by 2^k
  A k to 2^k decoder is used to select one of the 2^k chips
    Upper n bits of address are fed to all memory chips
    Lower k bits of address are decoded to select one of the 2^k chips

Processor-Memory Performance Gap
  1980: No cache in microprocessor
  1995: Two-level cache on microprocessor

The Need for Cache Memory
  Widening speed gap between CPU and main memory
    Processor operation takes less than 1 ns
    Main memory requires more than 50 ns to access
  Each instruction involves at least one memory access
    One memory access to fetch the instruction
    A second memory access for load and store instructions
  Memory bandwidth limits the instruction execution rate
  Cache memory can help bridge the CPU-memory gap
  Cache memory is small in size but fast

Typical Memory Hierarchy
  Registers are at the top of the hierarchy
    Typical size < 1 KB
    Access time < 0.5 ns
  Level 1 Cache (8 to 64 KB): access time 0.5 to 1 ns
  L2 Cache (512 KB to 8 MB): access time 2 to 10 ns
  Main Memory (1 to 2 GB): access time 50 to 70 ns
  Disk Storage (> 200 GB): access
  time: milliseconds

Principle of Locality of Reference
  Programs access a small portion of their address space
    At any time, only a small set of instructions & data is needed
  Temporal Locality (in time)
    If an item is accessed, it will probably be accessed again soon
    Same loop instructions are fetched each iteration
    Same procedure may be called and executed many times
  Spatial Locality (in space)
    Tendency to access contiguous instructions/data in memory
    Sequential execution of instructions
    Traversing arrays element by element

What is a Cache Memory?
  Small and fast (SRAM) memory technology
    Stores the subset of instructions & data currently being accessed
  Used to reduce average access time to memory
  Caches exploit temporal locality by
    Keeping recently accessed data closer to the processor
  Caches exploit spatial locality by
    Moving blocks consisting of multiple contiguous words
  Goal is to achieve
    Fast speed of cache memory access
    Balance the cost of the memory system

Cache Memories in the Datapath
  [Figure: cache memories in the datapath; drawing not recoverable]

Almost Everything is a Cache!
  In computer architecture, almost everything is a cache!
  Registers: a cache on variables, software managed
  First-level cache: a cache on second-level cache
  Second-level cache: a cache on memory
  Memory: a cache on hard disk
    Stores recent programs and their data
    Hard disk can be viewed as an extension to main memory
  Branch target and prediction buffer
    Cache on branch target and prediction information

Four Basic Questions on Caches
  Q1: Where can a block be placed in a cache?
    Block placement: Direct Mapped, Set Associative, Fully Associative
  Q2: How is a block found in a cache?
    Block identification: Block address, tag, index
  Q3: Which block should be replaced on a miss?
    Block replacement: FIFO, Random, LRU
  Q4: What happens on a write?
    Write strategy: Write Back or Write Through (with Write Buffer)

Block Placement: Direct Mapped
  Block: unit of data transfer between cache and memory
  Direct Mapped Cache:
    A block can be placed in exactly one location in the cache

Direct-Mapped Cache
  A memory address is divided into
    Block address: identifies block in memory
    Block offset: to access bytes within a block
  A block address is further divided into
    Index: used for direct cache access
    Tag: most-significant bits of block address
  Index = Block Address mod Cache Blocks
  Tag must be stored also inside cache
    For block identification
  A valid bit is also required to indicate
    Whether a cache block is valid or not

Direct Mapped Cache (cont'd)
  Cache hit: block is stored inside cache
    Index is used to access cache block
    Address tag is compared against stored tag
    If equal and cache block is valid then hit
    Otherwise: cache miss
  If number of cache blocks is 2^n
    n bits are used for the cache index
  If number of bytes in a block is 2^b
    b bits are used for the block offset
  If 32 bits are used for an address
    32 - n - b bits are used for the tag
  Cache data size = 2^(n+b) bytes

Mapping an Address to a Cache Block
  Example
    Consider a direct-mapped cache with 256 blocks
    Block size = 16 bytes
    Compute tag, index, and byte offset of address: 0x01FFF8AC
  Solution
    32-bit address is divided into:
      4-bit byte offset field, because block size = 2^4 = 16 bytes
      8-bit cache index, because there are 2^8 = 256 blocks in cache
      20-bit tag field
    Byte offset = 0xC = 12 (least significant 4 bits of address)
    Cache index = 0x8A = 138 (next lower 8 bits of address)
    Tag = 0x01FFF (upper 20 bits of address)

Example on Cache Placement & Misses
  Consider a small direct-mapped cache with 32 blocks
    Cache is initially empty, Block size = 16 bytes
    The following memory addresses (in
    decimal) are referenced: 1000, 1004, 1008, 2548, 2552, 2556.
    Map addresses to cache blocks and indicate whether hit or miss
  Solution:
    1000 = 0x3E8   cache index = 0x1E   Miss (first access)
    1004 = 0x3EC   cache index = 0x1E   Hit
    1008 = 0x3F0   cache index = 0x1F   Miss (first access)
    2548 = 0x9F4   cache index = 0x1F   Miss (different tag)
    2552 = 0x9F8   cache index = 0x1F   Hit
    2556 = 0x9FC   cache index = 0x1F   Hit

Fully Associative Cache
  A block can be placed anywhere in cache, so there is no indexing
  If m blocks exist then
    m comparators are needed to match tag
    Cache data size = m × 2^b bytes

Set-Associative Cache
  A set is a group of blocks that can be indexed
  A block is first mapped onto a set
    Set index = Block address mod Number of sets in cache
  If there are m blocks in a set (m-way set associative) then
    m tags are checked in parallel using m comparators
  If 2^n sets exist then set index consists of n bits
  Cache data size = m × 2^(n+b) bytes (with 2^b bytes per block)
    Without counting tags and valid bits
  A direct-mapped cache has one block per set (m = 1)
  A fully-associative cache has one set (2^n = 1 or n = 0)

Set-Associative Cache Diagram
  [Figure: set-associative cache diagram; drawing not recoverable]

Write Policy
  Write Through:
    Writes update cache and lower-level memory
    Cache control bit: only a Valid bit is needed
    Memory always has latest data, which simplifies data coherency
    Can always discard cached data when a block is replaced
  Write Back:
    Writes update cache only
    Cache control bits: Valid and Modified bits are required
    Modified cached data is written back to memory when replaced
    Multiple writes to a cache block require only one write to memory
    Uses less memory bandwidth than write-through and less power
    However, more complex to implement than write-through

Write Miss Policy
  What happens on a write miss?
  Write Allocate:
    Allocate new block in cache
    Write miss acts like a read miss, block is fetched and updated
  No Write Allocate:
    Send data to lower-level memory
    Cache is not modified
  Typically, write-back caches use write allocate
    Hoping subsequent writes will be captured in the cache
  Write-through caches often use no-write allocate
    Reasoning: writes must still go to lower-level memory

Write Buffer
  Decouples the CPU write from the memory bus writing
    Permits writes to occur without stall cycles until buffer is full
  Write-through: all stores are sent to lower-level memory
    Write buffer eliminates processor stalls on consecutive writes
  Write-back: modified blocks are written when replaced
    Write buffer is used for evicted blocks that must be written back
  The address and modified data are written in the buffer
    The write is finished from the CPU perspective
    CPU continues while the write buffer prepares to write memory
  If buffer is full, CPU stalls until buffer has an empty entry

What Happens on a Cache Miss?
  Cache sends a miss signal to stall the processor
  Decide which cache block to allocate/replace
    One choice only when the cache is directly mapped
    Multiple choices for set-associative or fully-associative cache
  Transfer the block from lower-level memory to this cache
    Set the valid bit and the tag field from the upper address bits
  If block to be replaced is modified then write it back
    Modified block is moved into a Write Buffer
    Otherwise, block to be replaced can be simply discarded
  Restart the instruction that caused the cache miss
  Miss Penalty: clock cycles to process a cache miss

Replacement Policy
  Which block is to be replaced on a cache miss?
  No selection alternatives for direct-mapped caches
  m blocks per set to choose from for associative caches
  Random replacement
    Candidate blocks are randomly selected
    One counter for all sets (0 to m - 1): incremented on every cycle
    On a cache miss replace block specified by counter
  First In First Out (FIFO) replacement
    Replace oldest block in set
    One counter per set (0 to m - 1): specifies oldest block to replace
    Counter is incremented on a cache miss

Replacement Policy (cont'd)
  Least Recently Used (LRU)
    Replace block that has been unused for the longest time
    Order blocks within a set from least to most recently used
    Update ordering of blocks on each cache hit
    With m blocks per set, there are m! possible permutations
  Pure LRU is too costly to implement when m > 2
    m = 2, there are 2 permutations only (a single bit is needed)
    m = 4, there are 4! = 24 possible permutations
    LRU approximations are used in practice
  For large m > 4, Random replacement can be as effective as LRU

Comparing Random, FIFO, and LRU
  Data cache misses per 1000 instructions
    10 SPEC2000 benchmarks on Alpha processor
    Block size of 64 bytes
  LRU and FIFO outperform Random for a small cache
  Little difference between LRU and Random for a large cache
  LRU is expensive for large associativity (number of blocks per set)
  Random is the simplest to implement in hardware

Hit Rate and Miss Rate
  Hit Rate = Hits / (Hits + Misses)
  Miss Rate = Misses / (Hits + Misses)
  I-Cache Miss Rate = Miss rate in the Instruction Cache
  D-Cache Miss Rate = Miss rate in the Data Cache
  Example:
    Out of 1000 instructions fetched, 150 missed in the I-Cache
    25% are load-store instructions, 50 missed in the D-Cache
    What are the I-cache and D-cache miss rates?
    I-Cache Miss Rate = 150 / 1000 = 15%
    D-Cache Miss Rate = 50 / (25% × 1000) = 50 / 250 = 20%

Memory Stall Cycles
  The processor stalls on a cache miss
    When fetching instructions from the Instruction Cache (I-cache)
    When loading or storing data into the Data Cache (D-cache)
  Memory stall cycles = Combined Misses × Miss Penalty
  Miss Penalty: clock cycles to process a cache miss
  Combined Misses = I-Cache Misses + D-Cache Misses
    I-Cache Misses = I-Count × I-Cache Miss Rate
    D-Cache Misses = LS-Count × D-Cache Miss Rate
    LS-Count (Load & Store) = I-Count × LS Frequency
  Cache misses are often reported per thousand instructions

Memory Stall Cycles Per Instruction
  Memory Stall Cycles Per Instruction =
    Combined Misses Per Instruction × Miss Penalty
  Miss Penalty is assumed equal for I-cache & D-cache
  Miss Penalty is assumed equal for Load and Store
  Combined Misses Per Instruction =
    I-Cache Miss Rate + LS Frequency × D-Cache Miss Rate
  Therefore, Memory Stall Cycles Per Instruction =
    I-Cache Miss Rate × Miss Penalty +
    LS Frequency × D-Cache Miss Rate × Miss Penalty

Example on Memory Stall Cycles
  Consider a program with the given characteristics
    Instruction count (I-Count) = 10^6 instructions
    30% of instructions are loads and stores
    D-cache miss rate is 5% and I-cache miss rate is 1%
    Miss penalty is 100 clock cycles for instruction and data caches
    Compute combined misses per instruction and memory stall cycles
  Combined misses per instruction in I-Cache and D-Cache
    1% + 30% × 5% = 0.025 combined misses per instruction
    Equal to 25 misses per 1000 instructions
  Memory stall cycles
    0.025 × 100 (miss penalty) = 2.5 stall cycles per instruction
    Total memory stall cycles = 10^6 × 2.5 = 2,500,000
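
The memory stall cycle formulas can be checked with a few lines of Python. This is a minimal sketch rather than part of the original slides: the function name and parameter names are made up for illustration, and the inputs are the figures from the memory stall cycles example (1% I-cache miss rate, 5% D-cache miss rate, 30% loads and stores, 100-cycle miss penalty, 10^6 instructions).

```python
def stall_cycles_per_instruction(icache_miss_rate, dcache_miss_rate,
                                 ls_frequency, miss_penalty):
    """Memory stall cycles per instruction (hypothetical helper).

    Combined misses per instruction =
        I-cache miss rate + LS frequency * D-cache miss rate
    Stall cycles per instruction = combined misses * miss penalty
    """
    combined_misses = icache_miss_rate + ls_frequency * dcache_miss_rate
    return combined_misses * miss_penalty


# Figures taken from the slide example
I_COUNT = 10**6
per_instruction = stall_cycles_per_instruction(
    icache_miss_rate=0.01, dcache_miss_rate=0.05,
    ls_frequency=0.30, miss_penalty=100)
total_stall_cycles = I_COUNT * per_instruction

print(f"stall cycles per instruction: {per_instruction:.2f}")
print(f"total memory stall cycles:    {total_stall_cycles:,.0f}")
```

Keeping the miss penalty as a single parameter mirrors the slide's assumption that the penalty is the same for the I-cache and the D-cache; separate penalties would need one term per cache.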

CPU Time with Memory Stall Cycles
  CPU Time = I-Count × CPI_MemoryStalls × Clock Cycle
  CPI_MemoryStalls = CPI_PerfectCache + Memory stall cycles per instruction
  CPI_PerfectCache = CPI for ideal cache (no cache misses)
  CPI_MemoryStalls = CPI in the presence of memory stalls
  Memory stall cycles increase the CPI

Example on CPI with Memory Stalls
  A processor has CPI of 1.5 without any memory stalls
    Cache miss rate is 2% for instruction and 5% for data
    20% of instructions are loads and stores
    Cache miss penalty is 100 clock cycles for I-cache and D-cache
  What is the impact on the CPI?
  Answer:
    Mem Stalls per Instruction = 0.02 × 100 + 0.2 × 0.05 × 100 = 3
    CPI_MemoryStalls = 1.5 + 3 = 4.5 cycles per instruction
    CPI_MemoryStalls / CPI_PerfectCache = 4.5 / 1.5 = 3
    Processor is 3 times slower due to memory stall cycles
    CPI_NoCache = 1.5 + (1 + 0.2) × 100 = 121.5 (every memory access pays the 100-cycle penalty)

Average Memory Access Time
  Average Memory Access Time (AMAT)
    AMAT = Hit time + Miss rate × Miss penalty
  Time to access a cache for both hits and misses
  Example: Find the AMAT for a cache with
    Cache access time (Hit time) of 1 cycle = 2 ns
    Miss penalty of 20 clock cycles
    Miss rate of 0.05 per access
  Solution:
    AMAT = 1 + 0.05 × 20 = 2 cycles = 4 ns
    Without the cache, AMAT will be equal to Miss penalty = 20 cycles

Designing Memory to Support Caches
  [Figure: memory organizations to support caches; drawing not recoverable]

Memory Interleaving
  Memory interleaving is more flexible than wide access
    A block address is sent only once to all memory banks
    Words of a block are distributed (interleaved) across all banks
    Banks are accessed in parallel
    Words are transferred one at a time on each bus cycle

Estimating the Miss Penalty
  Timing Model: Assume the following
    1 memory bus cycle to send address
    15 memory bus cycles for DRAM access time
    1 memory bus cycle to send data
    Cache Block is 4 words
  One-Word-Wide Memory Organization
    Miss Penalty = 1 + 4 × 15 + 4 × 1 = 65 memory bus cycles
  Wide Memory Organization (2-word wide)
    Miss Penalty = 1 + 2 × 15 + 2 × 1 = 33 memory bus cycles
  Interleaved Memory Organization (4 banks)
    Miss Penalty = 1 + 1 × 15 + 4 × 1 = 20 memory bus cycles
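
The three miss penalty estimates follow one pattern: one address cycle, some number of sequential DRAM accesses, and some number of bus transfers. The sketch below, in Python, makes that pattern explicit. The function name and its parameters are hypothetical; the default timings are the ones assumed in the timing model for a 4-word cache block.

```python
def miss_penalty(dram_accesses, bus_transfers,
                 address_cycles=1, dram_cycles=15, transfer_cycles=1):
    """Miss penalty in memory bus cycles (hypothetical helper).

    dram_accesses: DRAM accesses that must happen one after another
    bus_transfers: data transfers needed to move the block over the bus
    """
    return (address_cycles
            + dram_accesses * dram_cycles
            + bus_transfers * transfer_cycles)


# 4-word block under the timing model from the slide
one_word_wide = miss_penalty(dram_accesses=4, bus_transfers=4)  # 65
two_word_wide = miss_penalty(dram_accesses=2, bus_transfers=2)  # 33
interleaved_4 = miss_penalty(dram_accesses=1, bus_transfers=4)  # 20

print(one_word_wide, two_word_wide, interleaved_4)  # prints: 65 33 20
```

Interleaving gets its gain from overlapping the four DRAM accesses, which is why only one access time is counted even though four words still cross the bus one at a time.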

Improving Cache Performance
  Average Memory Access Time (AMAT)
    AMAT = Hit time + Miss rate × Miss penalty
  Used as a framework for optimizations
  Reduce the Hit time
    Small and simple caches
  Reduce the Miss Rate
    Larger cache size, higher associativity, and larger block size
  Reduce the Miss Penalty
    Multilevel caches

Small and Simple Caches
  Hit time is critical: affects the processor clock cycle
    Fast clock rate demands small and simple L1 cache designs
  Small cache reduces the indexing time and hit time
    Indexing a cache represents a time-consuming portion
    Tag comparison also adds to this hit time
  Direct-mapped overlaps tag check with data transfer
    Associative cache uses additional mux and increases hit time
  Size of L1 caches has not increased much
    L1 caches are the same size on Alpha 21264 and 21364
    Same also on UltraSparc II and III, AMD K6 and Athlon
    Reduced from 16 KB in Pentium III to 8 KB in Pentium 4

Classifying Misses: Three Cs
  Conditions under which misses occur
  Compulsory: program starts with no block in cache
    Also called cold start misses
    Misses that would occur even if a cache has infinite size
  Capacity: misses happen because cache size is finite
    Blocks are replaced and then later retrieved
    Misses that would occur in a fully associative cache of a finite size
  Conflict: misses happen because of limited associativity
    Limited number of blocks per set
    Non-optimal replacement algorithm

Classifying Misses (cont'd)
  [Figure: miss classification chart; drawing not recoverable]

Larger Size and Higher Associativity
  Increasing cache size reduces capacity misses
  It also reduces conflict misses
    Larger cache size spreads out references to more blocks
  Drawbacks: longer hit time and higher cost
  Larger caches are especially popular as 2nd level caches
  Higher
  associativity also improves miss rates
    Eight-way set associative is as effective as fully associative

Larger Block Size
  Simplest way to reduce miss rate is to increase block size
  However, it increases conflict misses if cache is small

Multilevel Caches
  Top level cache should be kept small to
    Keep pace with processor speed
  Adding another cache level
    Can reduce the memory gap
    Can reduce memory bus loading
  Local miss rate
    Number of misses in a cache / Memory accesses to this cache
    Miss Rate_L1 for L1 cache, and Miss Rate_L2 for L2 cache
  Global miss rate
    Number of misses in a cache / Memory accesses generated by CPU
    Miss Rate_L1 for L1 cache, and Miss Rate_L1 × Miss Rate_L2 for L2 cache

Multilevel Cache Policies
  Multilevel Inclusion
    L1 cache data is always present in L2 cache
    A miss in L1, but a hit in L2 copies block from L2 to L1
    A miss in L1 and L2 brings a block into L1 and L2
    A write in L1 causes data to be written in L1 and L2
    Typically, write-through policy is used from L1 to L2
    Typically, write-back policy is used from L2 to main memory
      To reduce traffic on the memory bus
    A replacement or invalidation in L2 must be propagated to L1

Multilevel Cache Policies (cont'd)
  Multilevel exclusion
    L1 data is never found in L2 cache, which prevents wasting space
    Cache miss in L1, but a hit in L2 results in a swap of blocks
    Cache miss in both L1 and L2 brings the block into L1 only
    Block replaced in L1 is moved into L2
    Example: AMD Athlon
  Same or different block size in L1 and L2 caches
    Choosing a larger block size in L2 can improve performance
    However, different block sizes complicate implementation
    Pentium 4 has 64-byte blocks in L1 and 128-byte blocks in L2

Two-Level Cache Performance (1/2)
  Average Memory Access
  Time:
    AMAT = Hit Time_L1 + Miss Rate_L1 × Miss Penalty_L1
  Miss Penalty for L1 cache in the presence of L2 cache
    Miss Penalty_L1 = Hit Time_L2 + Miss Rate_L2 × Miss Penalty_L2
  Average Memory Access Time with a 2nd level cache:
    AMAT = Hit Time_L1 + Miss Rate_L1 × (Hit Time_L2 + Miss Rate_L2 × Miss Penalty_L2)
  Memory Stall Cycles per Instruction =
    Memory Access per Instruction × (AMAT - Hit Time_L1)

Two-Level Cache Performance (2/2)
  Average memory stall cycles per instruction =
    Memory Access per Instruction × Miss Rate_L1 × (Hit Time_L2 + Miss Rate_L2 × Miss Penalty_L2)
  Average memory stall cycles per instruction =
    Misses per instruction_L1 × Hit Time_L2 + Misses per instruction_L2 × Miss Penalty_L2
  Misses per instruction_L1 = MEM access per instruction × Miss Rate_L1
  Misses per instruction_L2 = MEM access per instruction × Miss Rate_L1 × Miss Rate_L2

Example on Two-Level Caches
  Problem:
    Miss Rate_L1 = 4%, Miss Rate_L2 = 25%
    Hit time of L1 cache is 1 cycle and of L2 cache is 10 cycles
    Miss penalty from L2 cache to memory is 100 cycles
    Memory access per instruction = 1.25 (25% data accesses)
    Compute AMAT and memory stall cycles per instruction
  Solution:
    AMAT = 1 + 4% × (10 + 25% × 100) = 2.4 cycles
    Misses per instruction in L1 = 4% × 1.25 = 5%
    Misses per instruction in L2 = 4% × 25% × 1.25 = 1.25%
    Memory stall cycles per instruction = 5% × 10 + 1.25% × 100 = 1.75
    Can be also obtained as: (2.4 - 1) × 1.25 = 1.75 cycles
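
The two-level formulas can be cross-checked numerically. The Python sketch below computes the stall cycles both ways (through AMAT, and through misses per instruction at each level) and confirms they agree. It is illustrative only: the function names are invented here, and the inputs are the values from the two-level example (4% L1 miss rate, 25% local L2 miss rate, 1-cycle and 10-cycle hit times, 100-cycle L2 miss penalty, 1.25 memory accesses per instruction).

```python
def amat_two_level(hit_l1, miss_rate_l1, hit_l2, miss_rate_l2, miss_penalty_l2):
    """AMAT with an L2 cache; miss_rate_l2 is the local L2 miss rate."""
    miss_penalty_l1 = hit_l2 + miss_rate_l2 * miss_penalty_l2
    return hit_l1 + miss_rate_l1 * miss_penalty_l1


def stalls_from_amat(accesses_per_instr, amat, hit_l1):
    """Stall cycles per instruction = accesses * (AMAT - L1 hit time)."""
    return accesses_per_instr * (amat - hit_l1)


def stalls_from_misses(accesses_per_instr, miss_rate_l1, hit_l2,
                       miss_rate_l2, miss_penalty_l2):
    """Same quantity via misses per instruction at each level."""
    misses_l1 = accesses_per_instr * miss_rate_l1
    misses_l2 = accesses_per_instr * miss_rate_l1 * miss_rate_l2
    return misses_l1 * hit_l2 + misses_l2 * miss_penalty_l2


# Values from the slide example
amat = amat_two_level(hit_l1=1, miss_rate_l1=0.04, hit_l2=10,
                      miss_rate_l2=0.25, miss_penalty_l2=100)
via_amat = stalls_from_amat(1.25, amat, hit_l1=1)
via_misses = stalls_from_misses(1.25, 0.04, 10, 0.25, 100)

print(f"AMAT = {amat:.2f} cycles")
print(f"stall cycles per instruction = {via_amat:.2f} (via AMAT), "
      f"{via_misses:.2f} (via misses)")
```

The two routes are algebraically identical, so any disagreement beyond floating-point rounding would point to a local miss rate being used where a global one was intended, or the reverse.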
0   `b   W#Click to edit Master subtitle style$ $H  0޽h ? 3380___PPT10.88/k&d 0  t(    0$c  B  q  P*    0)c   wB c  R*  d  c $ ?qU  c   0,c   K c  RClick to edit Master text styles Second level Third level Fourth level Fifth level!     S  62c  .  c  P*  :  66c   w.` c H@___PPT9"@ l*"  H  0rllC ? 3380___PPT10.J F   h(  h h Nc tt B  c  \* p88pp h Nc tt  wB c  ^* p88pp h Tc tt .  c  \* p88ppt h Tc tt  w.` c H@___PPT9"@ *"  p88ppH h 0rllC ? 3380___PPT10.Jp: 0  06(  0~ 0 s *  `r   x 0 c $4  `   H 0 0޽h ? 33___PPT10i.И0+D='  = @B +  0 .P0(  Px P c $ M  `   x P c $M     H P 0޽h ? 33___PPT10i.Ȝ0f+D='  = @B +   0   p-0  (  r  S =  `x   r  S d)  `    8 vo 9  o v9 m @ vo 9  vo 9   T @jJ"`o 93  9RAM(2  Zܴ jJS"`?< VQ  9Address 2  Z jJS"`?< V5  6Data 2  Zx7 jJS"`?;   4CS 2  Z jJS"`?:   7R/W(2fB   6D>v  lB   <Dov  `B   0D3  `B   0D3  `B  B 0DjJ  Y4 `B B 0DjJ  Y   ZT jJS"`? v  7n(2  Z jJS"`? v  7m(2TB  c $D  RB  s *DjJ ' H  0޽h ? 33___PPT10i.Οb%+D='  = @B +-  0 ',,.ND, +(  Dx D c $t"  s   x D c $L#  `   K*8  TI DX TfB D 6DԔ x [   D  `h2 jJ S"`? V U ? Row address (2 `B %DB 0DjJ    &D Z jJ S"`? W  610 2  T  6   cD#  o B )D TZD6  B *D TZDu6 u B +D TZD<6 < B ,D TZD6  B -D TZD6  B .D TZD6  B /D TZDX6 X B 0D TZD6  B 1D TZD6  B 2D TZD6  B 3D TZDu6 u B 4D TZD<6 < B 5D TZD6  B 6D TZD 6 B 7D TZD 6  8D Z jJ"`;H   ;. . .(2B 9D TZD!6 ! yT x t3  ;D#  XTlB D <ZD 3 lB ?D <ZD 3 lB @D <ZD 3 lB AD <ZDW W3 lB BD <ZD 3 lB CD <ZD 3 lB DD <ZD 3 lB ED <ZD 3 lB FD <ZDZ Z 3 lB GD <ZD! ! 3 lB HD <ZD 3 lB ID <ZD 3 lB JD <ZDx x 3  KD T| jJ"`    ;. . .(2lB LD <ZD  3  D Z jJ "`T y1024 1024 Cell MatrixF(2B    D  ` jJ S"`?[  = Row Decoder (2  MD Z jJ "`o T  HSense/write amplifiers(2 `D Z jJ S"`? T  @Column Decoder(2r aD BGjJ  T 9 T  6   dD#    xB eD HZD6  xB fD HZDu6 u xB gD HZD<6 < xB hD HZD6  xB iD HZD6  xB jD HZD6  xB kD HZDX6 X xB lD HZD6  xB mD HZD6  xB nD HZD6  xB oD HZDu6 u xB pD HZD<6 < xB qD HZD6  xB rD HZD 6 xB sD HZD 6  tD T4 jJ"`;H   ;. . .(2xB uD HZD!6 ! x vD HZGjJ [  xD Z jJ S"`? 
CI BColumn address(2lB wDB <DԔ r s fB yD 6ZDjJ VZ v  zD Zx jJ S"`?3 V  610 2  D Z jJ S"`? u   8Data(2`B D 0D  ZB D s *D  @ > 6   DZ 6   D T jJS"`?> 6   aR / W@(2TB D c $D6 v6 H D 0޽h ? 33___PPT10i.Οb%+D='  = @B +2  0 IA..ID     (  Hr H S '  `   r H S x(  `N   a8    I   sH <$. "`   OTypical SRAM cell"D`B vH 0DjJ rr H # B9CDEF AAjJ9999 @  `B H 0DjJs s l2 H <jJ;t8 H # B9CDEF jJ9999 @`B H 0DjJttq`B H 0DjJ  H # ZB9CDEF jJ9999 @u7 fB H 6ZDjJY H # BCrDEFAAjJUr @U HB # B9CDEF jJ9999 @  `B HB 0DjJ< < l2 HB <jJ<u8 HB # B9CDEF jJ9999 @`B HB 0DjJ<<q`B HB 0DjJ  H # BCDEFAAjJU @tS  HB # BCDEFjJU @<S  H # BCrDEFjJUr @  B H TDjJWWUB H TDjJB H TDjJ  rB H BDjJW W `B H 0DjJ  `B H 0DjJ; s rB H BDjJu7 7  H T9 "`W9 QVcc(2 HB # ZB9CDEF jJ9999 @97 fB HB 6ZDjJUrB HB BDjJ7 97 rB HB BDjJ=7 7 rB H BDjJ7 q7 `B H 0DjJ== `B H 0DjJqq rB H BDjJrrB H BDjJr H T "`W K Word line& (2   H T "`   Ebit&(2  H TX "`   Ebit&(2 `B H 0D18  `B H 0DjJ  H H 0޽h ? 33___PPT10i.Og+D='  = @B +l   0L0 { s *T (  T~ T s *  `   ~ T s *P  `x    F   T UQ  T 6 "`   OTypical DRAM cell"DZB T s *DjJqq T # ZB9CDEF jJ9999 @ 6 `B T 0ZDjJ  fB  T 6DjJ  lB  TB <DjJu6 6 ZB  T s *DjJuu lB  T <DjJ<q<   T N؏ "`vV K Word line& (2   T N "`    Ebit&(2  T # BUCDE FjJUU@6  ZB T s *DjJ X ZB T s *DjJ X  T N "`sR r  = Capacitor n   T NX "`XW  QPass Transistor&n   ZB T s *DjJ : ZB T s *DjJ  ZB T s *DjJ W H T 0޽h ? X(=^y___PPT10Y+D='  = @B +Y  0 XX- .X(  r  S 8  `   V8 t  1B  3 1B  3 -B  3 ( Z  s *X99?t DT G WU   # G WU <B  # 7DG* +    lBkCKDEFKk K @` WU B   3 ^(5    <TC ]  E Time$F   <I 7]_  =  D  <L P I Threshold$ F  <R  =  D  <PV n ? Gvoltage$F  <[ <dR =  DBB  3 7DGBB  3 7DBB  3 7DB  3   <`   I 0 Stored$ F  <f Mu =  DB  3   <j a J 1 Written$ F  <p  =  DT K z # K z<B  # 7Dz  BCDE0F87D ++aaC 8[xh@K zT K  z !# K  z<B  # 7DK L z   BCDE0F87D ++laM 8exr@K  zT Wz $# Wz<B " # 7Dz # BCDE,F47D ++ka8Zxg@WzT  z '#  z<B % # 7D  z & BCDE4F<7D ++laN 8X[xr@ zB ( 3   ) <x ^   J Refreshed$ F * <}  A  =  DB + 3  b , <   J Refreshed$ F - <8  =  DB . 
3 X / < 5 J Refreshed$ F 0 <$ ~ =  D<T K zV  9# K zV  1 B C+DEF& ++ + @`K zV  2 B C6DEF& 66 6 @`K V  3 B C5DEF& 55 5 @`K V ; 4 B C6DEF& 66 6 @`K QV  5 B C6DEF& 66 6 @`K V  6 B C6DEF& 66 6 @`K V  7 B C5DEF& 55 5 @`K 3V h 8 B C6DEF& 66 6 @`K ~V -T GW x# GW : B+C DEF& + ++@`Gr ; B5C DEF& 5 55 @` < B6C DEF& 6 66 @`  = B6C DEF& 6 66 @`T > B5C DEF& 5 55 @`j ? B6C DEF& 6 66 @` @ B6C DEF& 6 66 @`6 A B6C DEF& 6 66 @`K B B5C DEF& 5 55 @` C B6C DEF& 6 66 @` D B6C DEF& 6 66 @`-c E B6C DEF& 6 66 @`x F B6C DEF& 6 66 @` G B6C DEF& 6 66 @`E H B6C DEF& 6 66 @`Z I B5C DEF& 5 55 @` J B6C DEF& 6 66 @`' K B6C DEF& 6 66 @`<r L B6C DEF& 6 66 @` M B5C DEF& 5 55 @`  N B6C DEF& 6 66 @` T  O B6C DEF& 6 66 @`i   P B6C DEF& 6 66 @`   Q B6C DEF& 6 66 @` 6  R B6C DEF& 6 66 @`K   S B6C DEF& 6 66 @`   T B5C DEF& 5 55 @`   U B6C DEF& 6 66 @`- c  V B6C DEF& 6 66 @`x   W B6C DEF& 6 66 @`   X B5C DEF& 5 55 @` D  Y B6C DEF& 6 66 @`Z   Z B6C DEF& 6 66 @`   [ B5C DEF& 5 55 @` &  \ B6C DEF& 6 66 @`< r  ] B6C DEF& 6 66 @`   ^ B6C DEF& 6 66 @`  _ B5C DEF& 5 55 @`S ` B6C DEF& 6 66 @`i a B6C DEF& 6 66 @` b B6C DEF& 6 66 @`5 c B5C DEF& 5 55 @`K d B6C DEF& 6 66 @` e B6C DEF& 6 66 @` f B5C DEF& 5 55 @`-b g B6C DEF& 6 66 @`x h B6C DEF& 6 66 @` i B6C DEF& 6 66 @`D j B5C DEF& 5 55 @`Z k B6C DEF& 6 66 @` l B6C DEF& 6 66 @`& m B6C DEF& 6 66 @`;q n B6C DEF& 6 66 @` o B6C DEF& 6 66 @` p B6C DEF& 6 66 @`S q B5C DEF& 5 55 @`i r B6C DEF& 6 66 @` s B6C DEF& 6 66 @`5 t B6C DEF& 6 66 @`J u B5C DEF& 5 55 @` v B6C DEF& 6 66 @` w B+C DEF& + ++ @`,WDT }*  {# }* <B y # 7D *  z lBKCvDEFKv vKv @`}@BB | 3 7D*  BB } 3 7DK * L B ~ 3 & 5   <J  \  ? "F  <$J @ d  ? "F  6(J  p K Refresh Cycle" F  <,J   #  ? "F  <\ZJ c GVoltage$F  <_J Go =  D  <,dJ ; Efor 1$F  <iJ N =  D  <mJ c] GVoltage$F  <fJ G]o  =  D  <dvJ   Efor 0$F  <J   =  D  <dJ   ` J  H  0޽h ? ̙33y___PPT10Y+D='  = @B +   0 .\P(  \r \ S 8J  `  J   \ S 8J  `x<$@ 0 J  H \ 0޽h ? 
Typical DRAM Packaging
[Figure: pin diagram of a 24-pin DRAM package. Signals: A0 to A10, D1 to D4, RAS, CAS, WE, OE, Vcc (two pins), Vss (two pins), NC.]
Legend
  Ai   Address bit i
  CAS  Column address strobe
  Dj   Data bit j
  NC   No connection
  OE   Output enable
  RAS  Row address strobe
  WE   Write enable

Trends in DRAM
  Year        Capacity    Cost per MB   Total access time   Column access to
  introduced                            to a new row        existing row
  1980        64 Kbit     $1500.00      250 ns              150 ns
  1983        256 Kbit    $500.00       185 ns              100 ns
  1985        1 Mbit      $200.00       135 ns              40 ns
  1989        4 Mbit      $50.00        110 ns              40 ns
  1992        16 Mbit     $15.00        90 ns               30 ns
  1996        64 Mbit     $10.00        60 ns               12 ns
  1998        128 Mbit    $4.00         60 ns               10 ns
  2000        256 Mbit    $1.00         55 ns               7 ns
  2002        512 Mbit    $0.25         50 ns               5 ns
  2004        1024 Mbit   $0.10         45 ns               3 ns
Expanding the Data Bus Width
[Figure: p RAM chips sharing the same CS, R/W, and Address inputs. Each chip supplies m data bits, so the data width is m × p bits.]

Increasing Memory Capacity by 2^k
[Figure: several RAM chips, each with CS, R/W, Address, and Data connections, sharing one m-bit data bus. Data width = m bits.]

Processor-Memory Performance Gap
[Figure: performance (log scale, 1 to 1000) versus year, 1980 to 2000. CPU: 55% per year (Moore's Law). DRAM: 7% per year. Processor-memory performance gap grows 50% per year.]

Cache Memories in the Datapath
[Figure: processor datapath with PC, Inc, Next PC (PCSrc), Instruction Cache (Address, Instruction), Register File (Rs, Rt, Rw, Rd), Ext (Imm16, Imm26), ALU (ALU result), and Data Cache (Address, Data_in, WriteData). Both caches connect to Main Memory through Address, Data, and Control lines. Caption: Interface between CPU and memory.]
Block Placement: Direct Mapped
[Figure: a cache of 8 blocks with indices 000 to 111, and main memory blocks labeled with binary addresses, with lines showing which memory blocks map to which cache block.]
In this example: Cache index = least significant 3 bits of Memory address

Direct-Mapped Cache
[Figure: the Block Address is divided into Tag and Index, followed by the offset. The Index selects one cache entry holding V, Tag, and Block Data. A comparator (=) checks the stored Tag against the address Tag to produce Hit, and the selected block provides Data.]

[Figure: address fields Tag = 20 bits, Index = 8 bits, offset = 4 bits.]
[Figure: address fields Tag = 23 bits, Index = 5 bits, offset = 4 bits.]

Fully Associative Cache
[Figure: m-way associative. The Address is divided into Tag and offset. Every block entry (V, Tag, Block Data) has its own comparator (=), and the comparator outputs produce Hit and select Data.]

Set-Associative Cache Diagram
[Figure: the Index selects a set. The Tag of each way is compared (=) in parallel, and a mux selects Data from the way that hits. The comparator outputs are combined to produce Hit.]
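The address fields in the figures above can be illustrated with a short sketch. It assumes a 32-bit byte address and uses the two splits shown (20/8/4 and 23/5/4). The function name split_address and the sample address are hypothetical and are not taken from the slides.

```python
def split_address(address, index_bits, offset_bits):
    """Split an address into (tag, index, offset) fields.

    offset: lowest `offset_bits` bits (byte within the block)
    index:  next `index_bits` bits (selects the cache block or set)
    tag:    all remaining upper bits (compared on every access)
    """
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


# Sample 32-bit address (hypothetical), split as Tag = 20, Index = 8, offset = 4.
tag, index, offset = split_address(0x12345678, index_bits=8, offset_bits=4)
print(hex(tag), hex(index), hex(offset))

# Direct-mapped example: cache index = least significant 3 bits of the
# block address (no offset field because whole blocks are addressed).
tag, index, _ = split_address(0b10110, index_bits=3, offset_bits=0)
print(bin(tag), bin(index))
```

With a fully associative cache there is no index field, which corresponds to calling the same function with index_bits = 0.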
Comparing Random, FIFO, and LRU
           2-way                   4-way                   8-way
  Size     LRU    Rand   FIFO     LRU    Rand   FIFO     LRU    Rand   FIFO
  16 KB    114.1  117.3  115.5    111.7  115.1  113.3    109.0  111.8  110.4
  64 KB    103.4  104.3  103.9    102.4  102.3  103.1     99.7  100.5  100.3
  256 KB    92.2   92.1   92.5     92.1   92.1   92.5     92.1   92.1   92.5
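To make the three policies in the table concrete, here is a minimal sketch that counts misses for a single cache set under LRU, FIFO, and Random replacement. It is an illustration only: the function name count_misses, the trace, and the seed are assumptions, and it does not reproduce the measurements in the table.

```python
import random
from collections import OrderedDict


def count_misses(trace, ways, policy, seed=0):
    """Count misses for one cache set that holds `ways` blocks.

    trace:  sequence of block tags referenced in order
    policy: "LRU", "FIFO", or "RANDOM"
    """
    rng = random.Random(seed)      # fixed seed keeps RANDOM repeatable
    resident = OrderedDict()       # tag -> None, oldest entry first
    misses = 0
    for tag in trace:
        if tag in resident:
            if policy == "LRU":
                resident.move_to_end(tag)   # a hit refreshes recency (LRU only)
            continue
        misses += 1
        if len(resident) == ways:           # set is full: choose a victim
            if policy == "RANDOM":
                del resident[rng.choice(list(resident))]
            else:
                resident.popitem(last=False)  # oldest entry (LRU or FIFO order)
        resident[tag] = None
    return misses


trace = [1, 2, 1, 3, 1, 4]
for policy in ("LRU", "FIFO", "RANDOM"):
    print(policy, count_misses(trace, ways=2, policy=policy))
```

On this trace LRU keeps block 1 resident because it is reused, while FIFO evicts it once it becomes the oldest entry. The only difference between the two is whether a hit updates the ordering.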
CPU Time with Memory Stall Cycles
CPU Time = I-Count × CPI_MemoryStalls × Clock Cycle
CPI_MemoryStalls = CPI_PerfectCache + Mem Stalls per Instruction
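The two formulas above can be evaluated directly. The sketch below is a plain transcription of them. The function names and all numeric inputs are hypothetical values chosen for illustration and do not come from the slides.

```python
def cpi_with_memory_stalls(cpi_perfect_cache, mem_stalls_per_instruction):
    """CPI_MemoryStalls = CPI_PerfectCache + Mem Stalls per Instruction."""
    return cpi_perfect_cache + mem_stalls_per_instruction


def cpu_time(instruction_count, cpi_memory_stalls, clock_cycle_seconds):
    """CPU Time = I-Count x CPI_MemoryStalls x Clock Cycle."""
    return instruction_count * cpi_memory_stalls * clock_cycle_seconds


# Hypothetical inputs: 1 billion instructions, CPI of 1.5 with a perfect
# cache, 0.5 memory stall cycles per instruction, 1 ns clock cycle.
cpi = cpi_with_memory_stalls(1.5, 0.5)
seconds = cpu_time(1_000_000_000, cpi, 1e-9)
print(f"CPI = {cpi}, CPU time = {seconds:.3f} s")
```

Because the stall term is added to the CPI, any reduction in memory stalls per instruction reduces CPU time in proportion to its share of the total CPI.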
Designing Memory to Support Caches
One Word Wide: CPU, Cache, Bus, and Memory have word width: 32 or 64 bits
Wide: CPU, Mux: 1 word. Cache, Bus, Memory: N words. Alpha: 256 bits. Ultra SPARC: 512 bits
Interleaved: CPU, Cache, Bus: 1 word. Memory: N independent banks
[Figure: One-word-wide Memory Organization. CPU, Cache, Bus, Memory.]
[Figure: Wide Memory Organization. CPU, Multiplexer, Cache, Bus, Memory.]
[Figure: Interleaved Memory Organization. CPU, Cache, Bus, Memory bank 0, Memory bank 1, Memory bank 2, Memory bank 3.]
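A minimal sketch of how consecutive words could be spread over the N independent banks of the interleaved organization. It assumes low-order interleaving (bank = word address mod N), which the slides do not state explicitly. The function name locate_word and the choice of N = 4 (matching banks 0 to 3 in the figure) are illustrative assumptions.

```python
def locate_word(word_address, num_banks=4):
    """Return (bank, address_within_bank) under low-order interleaving."""
    return word_address % num_banks, word_address // num_banks


# The four words of block 5 (word addresses 20..23) fall in four different
# banks, and every bank is accessed at the same internal address.
for word_address in range(20, 24):
    bank, inner = locate_word(word_address)
    print(f"word {word_address}: bank {bank}, address within bank {inner}")
```

Because the words of one block sit at the same internal address in every bank, a single block address can be sent to all banks at once, and the banks can perform their accesses in parallel.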
Memory Interleaving
[Figure: timing diagram with Time on the horizontal axis, marked in bus cycles. A block address is sent to the banks, all banks access the same block address, and word 0 (bank 0), word 1 (bank 1), word 2 (bank 2), and word 3 (bank 3) are then transferred.]

Classifying Misses (Three Cs)
Compulsory misses are independent of cache size
  Very small for long-running programs
Capacity misses decrease as capacity increases
Conflict misses decrease as associativity increases
  Data were collected using LRU replacement
[Figure: Miss Rate (0 to 14%) versus cache size (1, 2, 4, 8, 16, 32, 64, 128 KB), with regions for 1-way, 2-way, 4-way, and 8-way conflict misses, Capacity misses, and Compulsory misses.]
Larger Block Size
[Figure: Miss Rate (0% to 25%) versus Block Size (16, 32, 64, 128, 256 bytes), one curve per cache size: 1K, 4K, 16K, 64K, 256K. Annotations: Reduced Compulsory Misses, Increased Conflict Misses.]
64-byte blocks are common in L1 caches
128-byte blocks are common in L2 caches
Multilevel Caches
[Figure: I-Cache and D-Cache connected to a Unified L2 Cache, which is connected to Main Memory.]

Speaker notes
Let's summarize today's lecture. The first thing we covered is the principle of locality. There are two types of locality: temporal (locality in time) and spatial (locality in space).
We talked about memory system design. The key idea is to present the user with as much memory as possible in the cheapest technology while, by taking advantage of the principle of locality, creating the illusion that the average access time is close to that of the fastest technology.
As far as random access technology is concerned, we concentrate on two: DRAM and SRAM. DRAM is slow but cheap and dense, so it is a good choice for presenting the user with a big memory system. SRAM, on the other hand, is fast but expensive in terms of both cost and power, so it is a good choice for providing the user with a fast access time.
I have already shown you how DRAMs are used to construct the main memory for the SPARCstation 20. On Friday, we will talk about caches.

Slide Titles
Memory
Presentation Outline
Random Access Memory
Typical Memory Structure
Static RAM Storage Cell
Dynamic RAM Storage Cell
DRAM Refresh Cycles
Loss of Bandwidth to Refresh Cycles
Typical DRAM Packaging
Trends in DRAM
Expanding the Data Bus Width
Increasing Memory Capacity by 2^k
Next . . .
Processor-Memory Performance Gap
The Need for Cache Memory
Typical Memory Hierarchy
Principle of Locality of Reference
What is a Cache Memory?
Cache Memories in the Datapath
Almost Everything is a Cache!
Next . . .
Four Basic Questions on Caches
Block Placement: Direct Mapped
Direct-Mapped Cache
Direct Mapped Cache (cont'd)
Mapping an Address to a Cache Block
Example on Cache Placement & Misses
Fully Associative Cache
Set-Associative Cache
Set-Associative Cache Diagram
Write Policy
Write Miss Policy
Write Buffer
What Happens on a Cache Miss?
Replacement Policy
Replacement Policy (cont'd)
Comparing Random, FIFO, and LRU
Next . . .
Hit Rate and Miss Rate
Memory Stall Cycles
Memory Stall Cycles Per Instruction
Example on Memory Stall Cycles
CPU Time with Memory Stall Cycles
Example on CPI with Memory Stalls
Average Memory Access Time
Designing Memory to Support Caches
Memory Interleaving
Estimating the Miss Penalty
Next . . .
Improving Cache Performance
Small and Simple Caches
Classifying Misses: Three Cs
Classifying Misses (cont'd)
Larger Size and Higher Associativity
Larger Block Size
Next . . .
Multilevel Caches
Multilevel Cache Policies
Multilevel Cache Policies (cont'd)
Two-Level Cache Performance 1/2
Two-Level Cache Performance 2/2
Example on Two-Level Caches