Advanced Computer Architecture
- 2012 academic year edition (2013 edition in preparation)
Instructor
Goal and Theme
Most modern CPUs exploit three levels of parallelism: (1) they dispatch multiple instructions from an instruction stream in every clock cycle to exploit instruction-level parallelism (ILP); (2) they execute multiple threads simultaneously to exploit thread-level parallelism (TLP); and (3) a single chip contains multiple cores (a chip multiprocessor), so it can execute multiple programs in parallel to exploit job-level parallelism. To achieve high performance in scientific computation, Wallace trees and the Goldschmidt and Newton-Raphson algorithms are used to speed up multiplication, division, and square root. For computer and CPU design, the Verilog HDL (hardware description language) is widely used in both academia and industry.
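As a rough illustration of the two division algorithms named above, the following sketch shows their core iterations in software. It is not taken from the course materials: the function names, the iteration counts, and the linear initial estimate `48/17 - 32/17 * m` are assumptions chosen for this example, and a hardware implementation would use fixed-width arithmetic rather than Python floats.

```python
import math


def newton_raphson_divide(n, d, iterations=4):
    """Approximate n / d (d > 0) by refining an estimate of 1/d."""
    if d <= 0:
        raise ValueError("this sketch assumes a positive divisor")
    # Scale the divisor: d == m * 2**e with 0.5 <= m < 1.
    m, e = math.frexp(d)
    # Assumed linear initial estimate of 1/m on [0.5, 1).
    x = 48.0 / 17.0 - (32.0 / 17.0) * m
    for _ in range(iterations):
        # Newton-Raphson step for f(x) = 1/x - m; the error roughly squares.
        x = x * (2.0 - m * x)
    # n / d == (n * (1/m)) * 2**-e
    return math.ldexp(n * x, -e)


def goldschmidt_divide(n, d, iterations=6):
    """Approximate n / d (d > 0) by driving the denominator toward 1."""
    if d <= 0:
        raise ValueError("this sketch assumes a positive divisor")
    m, e = math.frexp(d)
    num, den = n, m
    for _ in range(iterations):
        # Multiply numerator and denominator by the same factor,
        # so num / den stays equal to n / m while den approaches 1.
        f = 2.0 - den
        num *= f
        den *= f
    return math.ldexp(num, -e)


if __name__ == "__main__":
    print(newton_raphson_divide(1.0, 3.0))
    print(goldschmidt_divide(22.0, 7.0))
```

Both routines use only multiplications and subtractions inside the loop. In the Goldschmidt form the two multiplications in each step are independent of each other, whereas each Newton-Raphson step depends on the previous estimate.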
Abstract
The contents of the lecture include technology and performance evaluation, instruction architectures, pipelining, floating point adder design, Wallace Tree, Goldschmidt algorithms, Newton-Raphson algorithms, FPU/CPU design, multithreading/multicore CPU design, instruction scheduling and branch prediction, Scoreboard and Tomasulo algorithms, cache and TLB design, PS/2 Keyboard and mouse, VGA controller, and PCI Bus. In the last two classes, students will present their work related to this course.
Schedule
| No. | Theme | Content |
|---|---|---|
| 1 | Technology and Performance Evaluation | Integrated circuit technology, cost and performance |
| 2 | Instruction Architectures | Instruction type, format, and addressing modes |
| 3 | Pipelining and Verilog HDL | Pipelined CPU design in Verilog HDL |
| 4 | Floating Point Adder Design | FPU addition and subtraction |
| 5 | Wallace Tree | Multiplication and Wallace Tree Circuit |
| 6 | Goldschmidt Algorithms | Goldschmidt division and square root algorithms |
| 7 | Newton-Raphson Algorithms | Newton-Raphson division and square root algorithms |
| 8 | FPU/CPU Design | Advanced CPU/FPU design |
| 9 | Instruction Scheduling and Branch Prediction | Loop unrolling, instruction re-organizing, and branch prediction |
| 10 | Scoreboard and Tomasulo Algorithms | Dynamic instruction scheduling, Scoreboard and Tomasulo algorithms |
| 11 | Cache and TLB Design | Memory hierarchy and cache/TLB design |
| 12 | Multithreading/Multicore CPU Design | Multithreading CPU and Multicore CPU Design |
| 13 | Input and Output Systems | PS/2 Keyboard and mouse, VGA controller, and PCI Bus |
| 14 | Presentations | Present your theme |
| 15 | Presentations | Present your theme |
Study Activities Outside Class
Write Verilog HDL code for CPU design or develop interconnection networks, and prepare presentation slides
Materials
Online materials
References
None
Evaluation Method
Based on attendance and presentations
Use of IT Equipment
Bring a notebook PC to the lecture
Insights from the Previous Year's Course Improvement Survey
None