Computer Architecture: A Quantitative Approach (Chapter 1) - EW帮帮网

Chapter 1 Fundamentals of Quantitative Design and Analysis

Defining Computer Architecture

ISA Design
Implementation
- Machine Organization (Microarchitecture)
- Hardware

Computer Architecture Design

Objective

Objective = f(functionality, performance, power, cost, dependability)

Better Performance = Better Hardware + Better Architecture

Improvements in hardware (semiconductor technology)

[Computation] Integrated circuit logic
- Moore’s Law
  - The number of transistors on a chip grows exponentially (doubling roughly every two years)
- Dennard Scaling Law
  - Power density stays constant as transistors shrink, so power consumption does not increase with transistor density
    - Higher density means smaller transistors
- Pollack’s Rule
  - Performance improvement is proportional to the square root of microprocessor complexity (area or transistor number)
[Data-serving] Semiconductor storage
- Capacity
- Bandwidth & Latency
  - Bandwidth: affects processor throughput
  - Latency: affects processor response time
  - Bandwidth improves faster than latency
    - Use large I/O batches
  - Demand grows faster than supply improves
    - Use a cache hierarchy
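
As a rough illustration of the two scaling rules above, the following sketch evaluates Moore's Law as a doubling every two years and Pollack's Rule as a square-root relationship. The function names and the sample inputs are assumptions made for illustration; real chips only follow these rules approximately.

```python
import math


def moore_transistor_growth(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor in transistor count after `years`, assuming a fixed doubling period."""
    return 2 ** (years / doubling_period_years)


def pollack_speedup(complexity_ratio: float) -> float:
    """Performance gain predicted by Pollack's Rule for a given increase in complexity."""
    return math.sqrt(complexity_ratio)


# Ten years of doubling every two years: 2**5 = 32x the transistors.
growth = moore_transistor_growth(10)

# Quadrupling a core's area or transistor count buys only about 2x performance.
speedup = pollack_speedup(4.0)

print(f"Transistor growth over 10 years: {growth:.0f}x")
print(f"Speedup from 4x complexity: {speedup:.1f}x")
```

The square root is why spending more transistors on a single complex core gives diminishing returns.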

Improvements in architectures

Cache management
- Cache Hierarchy Optimizations
- Create the illusion of a very large memory with very short latency
Instruction level parallelism
- Pipelining
- Dynamic Scheduling, Multiple Issue, Speculation
Other parallelism levels
- Data-level parallelism (DLP)
- Thread-level parallelism (TLP)
- Request-level parallelism (RLP)

Flynn’s Taxonomy for Computers

Single instruction stream, single data stream (SISD)
Single instruction stream, multiple data streams (SIMD)
- Vector architectures
- Multimedia extensions
- Graphics processing units (GPUs)
Multiple instruction streams, single data stream (MISD)
- No commercial implementation
Multiple instruction streams, multiple data streams (MIMD)
- Tightly-coupled MIMD (multiprocessor)
- Loosely-coupled MIMD (cluster)

The only path left to improve performance is specialization: designing domain-specific architectures (DSAs)

Power-related

Maximum Power: describes the electricity going IN
- Used for power supply design
TDP (Thermal Design Power): describes the heat coming OUT
- Used for cooling system design
- Peak power (often 1.5X higher) > Thermal Design Power > average power
Energy: power accumulated over a period of time
- 1 watt = 1 joule per second
- Energy is a better metric than power for comparing processors on a fixed task.
  - Processor A (1.2 W × 0.7 s = 0.84 J) is better than Processor B (1 W × 1 s = 1 J) for the same task
Tasks per joule and performance per watt
- More important than performance per mm² in today’s evaluations
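
The processor comparison above can be checked with a few lines of Python. This is a minimal sketch: the function name is an assumption, and the power and time figures are the ones quoted in the list.

```python
def energy_joules(power_watts: float, time_seconds: float) -> float:
    """Energy used by a task that draws constant power for the given time."""
    return power_watts * time_seconds


# Processor A draws more power but finishes the task sooner.
energy_a = energy_joules(1.2, 0.7)
energy_b = energy_joules(1.0, 1.0)

print(f"Processor A: {energy_a:.2f} J, Processor B: {energy_b:.2f} J")

# Tasks per joule is the reciprocal of the energy per task.
print(f"Tasks per joule: A = {1 / energy_a:.2f}, B = {1 / energy_b:.2f}")
```

Processor A has the higher power draw yet uses less energy for the task, which is why energy is the fairer basis for comparison.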

Calculation of Energy and Power

Dynamic energy (of a single transistor switching)

For one transition, 0 -> 1 or 1 -> 0:
= ½ × Capacitive load × Voltage²

Dynamic power

= ½ × Capacitive load × Voltage² × Frequency switched
Reducing the clock rate reduces power, but not the energy of a task
Reducing the voltage greatly reduces both dynamic power and energy, since both scale with Voltage²
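
The two formulas above can be sketched in Python to show how voltage and frequency scaling affect energy and power. The capacitance, voltage, and frequency values below are hypothetical placeholders, and the 15% reduction is just an example setting.

```python
def dynamic_energy(capacitive_load: float, voltage: float) -> float:
    """Energy of one 0 -> 1 or 1 -> 0 transition: 1/2 * C * V^2."""
    return 0.5 * capacitive_load * voltage ** 2


def dynamic_power(capacitive_load: float, voltage: float, frequency_switched: float) -> float:
    """Dynamic power: 1/2 * C * V^2 * f."""
    return dynamic_energy(capacitive_load, voltage) * frequency_switched


# Hypothetical baseline values, chosen only for illustration.
C, V, F = 1e-15, 1.0, 2e9

# Lowering only the clock rate: power drops, energy per transition does not.
power_ratio_half_clock = dynamic_power(C, V, F / 2) / dynamic_power(C, V, F)
energy_ratio_half_clock = dynamic_energy(C, V) / dynamic_energy(C, V)

# Lowering voltage and frequency together by 15%.
energy_ratio = dynamic_energy(C, 0.85 * V) / dynamic_energy(C, V)
power_ratio = dynamic_power(C, 0.85 * V, 0.85 * F) / dynamic_power(C, V, F)

print(f"Half clock: power x{power_ratio_half_clock:.2f}, energy x{energy_ratio_half_clock:.2f}")
print(f"15% lower V and f: energy x{energy_ratio:.2f}, power x{power_ratio:.2f}")
```

With a 15% reduction in both, energy falls to about 0.85² ≈ 0.72 of the original and power to about 0.85³ ≈ 0.61, which is why voltage scaling is so effective.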

Static Power

= Current_static × Voltage
Scales with the number of transistors
Can account for 25-50% of total power

Question:

Techniques To Reduce Power Consumption

Turn off the clock of inactive modules (e.g., floating-point unit)
Dynamic voltage-frequency scaling (DVFS)
Enable low-power states for DRAM and disks
Overclocking when power permits

Impact of Energy Constraint on Computer Architecture

Different operations incur different energy and area costs

A 32-bit floating-point addition consumes 30X as much energy as an 8-bit integer addition.
A DRAM read consumes 125X as much energy as an SRAM read.

Domain Specific Processors can then be designed to save energy by:
- reducing wide floating-point operations
- reducing DRAM access by deploying special-purpose memories

Cost

Cost of Chip (Die) Manufacturing

Die area is a key factor for processor cost
- One wafer is diced into multiple dies
- Die cost grows roughly as the square of die area
  - $\text{Cost of die} = \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}$
  - $\text{Dies per wafer} = \frac{\pi \times (\text{Wafer diameter}/2)^{2}}{\text{Die area}} - \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}}$
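
The two formulas above translate directly into Python. This is a sketch under stated assumptions: the function names, the wafer cost, and the die yield value are hypothetical, and die yield is treated as a given input because no yield formula is stated here.

```python
import math


def dies_per_wafer(wafer_diameter_cm: float, die_area_cm2: float) -> float:
    """Estimated die count: wafer area / die area, minus dies lost along the round edge."""
    wafer_area = math.pi * (wafer_diameter_cm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss


def cost_of_die(wafer_cost: float, wafer_diameter_cm: float,
                die_area_cm2: float, die_yield: float) -> float:
    """Cost per good die: wafer cost / (dies per wafer * die yield)."""
    return wafer_cost / (dies_per_wafer(wafer_diameter_cm, die_area_cm2) * die_yield)


# A 30 cm (300 mm) wafer with square dies 1.5 cm on a side (2.25 cm^2).
dies = dies_per_wafer(30.0, 2.25)
print(f"Estimated dies per wafer: {dies:.1f}")

# Hypothetical wafer cost and yield, for illustration only.
cost = cost_of_die(wafer_cost=5000.0, wafer_diameter_cm=30.0,
                   die_area_cm2=2.25, die_yield=0.5)
print(f"Estimated cost per good die: {cost:.2f}")
```

The second term in `dies_per_wafer` accounts for the partial dies around the circumference, which cannot be used.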
