CPU Basics


Meaning of 32-bit CPU

A CPU can be characterized by the following:
Register Width: With an n-bit address, a CPU can access 2^n addresses.
E.g. the 8086 had 16-bit registers but a 20-bit address bus. It forms a 20-bit address by shifting a 16-bit segment register left by 4 bits and adding a 16-bit offset, so it could access 2^20 memory locations (1MB).
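
The arithmetic above can be sketched in a few lines. This is only an illustration: the function names address_space and physical_address_8086 are made up for this post, and the 20-bit wrap-around models the original 8086 in real mode.

```python
def address_space(bits):
    """Number of distinct addresses reachable with an address of `bits` bits."""
    return 1 << bits


def physical_address_8086(segment, offset):
    """Real-mode 8086 address: 16-bit segment shifted left by 4, plus 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # The 8086 has only 20 address lines, so the sum wraps around at 1MB.
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    for bits in (16, 20, 32):
        print(f"{bits}-bit address -> {address_space(bits):,} locations")
    # 0x1234 << 4 = 0x12340, plus offset 0x0010 = 0x12350
    print(hex(physical_address_8086(0x1234, 0x0010)))
```

With 16 bits alone only 64KB would be reachable; the 4-bit shift of the segment is what stretches the range to 1MB.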

Instruction Set: CPUs have pre-defined instructions, each with a unique code, such as add=1000, decrement=1001 etc.
E.g. the 8086 had a 16-bit instruction set. If another processor also had a 16-bit instruction set, it does not mean the 8086 could understand it. Why? Because the same instruction may not have the same code across CPUs, so the two are like different languages.

The Intel 386 could understand the 16-bit 8086 instruction set. In addition, it introduced a brand new 32-bit instruction set.
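
To make the "different languages" point concrete, here is a toy sketch. All the opcode tables below are invented for illustration (they are not real 8086 or 386 encodings), and decode is a hypothetical helper.

```python
# Hypothetical 4-bit opcode tables; none of these encodings are real.
CPU_A = {0b1000: "add", 0b1001: "decrement"}
CPU_B = {0b1000: "jump", 0b1001: "add"}

# A backward-compatible successor keeps every old encoding and adds new ones,
# which is the idea behind the 386 still understanding 8086 code.
CPU_A_SUCCESSOR = {**CPU_A, 0b1100: "add32"}


def decode(opcode_table, code):
    """Return the operation a CPU would perform for `code`, or 'undefined'."""
    return opcode_table.get(code, "undefined")


if __name__ == "__main__":
    code = 0b1000
    for name, table in (("A", CPU_A), ("B", CPU_B), ("A successor", CPU_A_SUCCESSOR)):
        print(f"CPU {name}: {code:04b} means {decode(table, code)}")
```

The same bit pattern means "add" on one CPU and "jump" on the other, while the successor agrees with CPU A on every old code.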

CPU Execution Flow

Instructions (the program) residing in secondary storage are brought into RAM and transferred to the processor queue (aka pipeline). The CPU executes instructions by shifting part of the queue to the execution engines on each clock tick.

Uni-core



A thread is a unit of execution in a program (We all know, nothing new :) ).

Giving a sense of everything happening at the same time:
A software switch is placed between the CPU and the programs (threads/units of execution). This switch connects a different program to the CPU every fixed timeslice (say 1 millisecond), giving the user the feeling that they all execute simultaneously.
It can be optimized by giving threads different priorities, sort of hijacking this switch to run higher priority tasks more frequently.
Did you guess the name of this switch?

Yes, it is called the scheduler. It does pre-emptive multitasking, without waiting for an application to give up the CPU (as is the case in cooperative multitasking).
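
A minimal simulation of that switch, assuming a plain round-robin policy with no priorities. The function round_robin and the task names are made up for this sketch; a real OS scheduler is far more involved.

```python
from collections import deque


def round_robin(tasks, timeslice):
    """Simulate a pre-emptive round-robin scheduler.

    `tasks` maps a task name to the time units of work it still needs.
    Returns the list of (name, units_run) slices in the order they got the CPU.
    """
    if timeslice <= 0:
        raise ValueError("timeslice must be positive")
    ready = deque(tasks.items())
    timeline = []
    while ready:
        name, remaining = ready.popleft()
        ran = min(timeslice, remaining)
        timeline.append((name, ran))
        if remaining > ran:
            # Timeslice expired: the task is pre-empted and rejoins the back of the queue.
            ready.append((name, remaining - ran))
    return timeline


if __name__ == "__main__":
    for name, ran in round_robin({"A": 3, "B": 1, "C": 2}, timeslice=1):
        print(f"{name} runs for {ran}")
```

With a timeslice of 1 the CPU order comes out as A, B, C, A, C, A: nobody waits for A to finish before getting a turn.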

Hyperthreading: Part of the pipeline is duplicated so that some of the execution can be parallelized, at least the parts which don't require common resources such as the Math Engine.
Multicore: Pipeline and engines are duplicated to give true multi-threading.
SMP: Symmetric Multi-Processing. Actual CPU cores are duplicated to give power packed performance.

Ultimately the server industry was limited by 32-bit registers, as only 2^32 bytes = 4GB of memory could be accessed. Intel Itanium (a server technology) then introduced a totally new instruction set along with 64-bit registers.
Issue: It was not compatible with the 32-bit 386 instruction set, which meant everything had to be re-written. Intel added hardware that translated older 32-bit instructions into the new instruction codes, but the translation made such code run slower.

Quasi 32-bit: AMD designed a system with the 32-bit instruction set plus additional instructions that allow more than 4GB of memory to be accessed [Athlon64]. It was a hit, as older applications kept working alongside the ability to access more than 4GB of memory. Microsoft started OS development for it and it became the standard.

Today all Intel CPUs support the 32-bit instruction set plus the added instructions for additional memory access.

SoC, ASIC, MicroProcessor etc
At some point in history, SoC (System On Chip) meant an IC that contains everything: processor, memory, peripherals. Not any more.
Mobile phone processors are called SoCs even though they require external DDR and other ICs (sensors, power management IC, RF circuit, WiFi, BT etc.). So, MicroControllers are a subset of SoCs, while ASICs (Application Specific ICs) and ASSPs (Application Specific Standard Parts) overlap with the SoC definition but are not truly a subset.

ARM Architecture

ARM is known for processor architectures that are used across the industry. Many companies have moved from their proprietary processors to ARM and developed their SoCs around an ARM core.
It is good to know that ARM does not fabricate processors; it just provides the IP, design etc. Companies take this core and develop their system around it, which is then called an SoC (System On Chip), like Broadcom's STB SoCs and many more.

ARM is a 32-bit Load-Store Architecture with most instructions executing in a single clock.
Load and Store: Load and store are the only operations that can act directly on memory. Data in memory can not be manipulated directly; it always has to be loaded, manipulated and stored.
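
Here is a toy interpreter showing the load, manipulate, store sequence. LDR, STR and ADD are real ARM mnemonics, but the tuple format, the absolute addresses and the run function are simplifications invented for this sketch (real ARM loads and stores take their address from a register).

```python
def run(program, memory):
    """Execute a tiny load-store program; only LDR and STR touch `memory`."""
    regs = {}
    for op, *args in program:
        if op == "LDR":      # LDR rd, addr : register <- memory
            rd, addr = args
            regs[rd] = memory[addr]
        elif op == "STR":    # STR rs, addr : memory <- register
            rs, addr = args
            memory[addr] = regs[rs]
        elif op == "ADD":    # ADD rd, rn, rm : works on registers only
            rd, rn, rm = args
            regs[rd] = regs[rn] + regs[rm]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# memory[0x10] = memory[0x10] + memory[0x14] needs two loads, one add, one store.
PROGRAM = [
    ("LDR", "r0", 0x10),
    ("LDR", "r1", 0x14),
    ("ADD", "r0", "r0", "r1"),
    ("STR", "r0", 0x10),
]

if __name__ == "__main__":
    print(run(PROGRAM, {0x10: 5, 0x14: 7}))
```

There is deliberately no "add to memory" instruction in the interpreter; that restriction is the whole point of a load-store design.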

Supports: Word(32-bit), HalfWord(16-bit), DoubleWord(64-bit)

In general ARM is known to have three profiles: Ax (Application), Rx (RealTime), Mx (Microcontroller).
Ax series has an MMU, high performance at low power, multitasking, virtual memory, security, HW virtualization (A15, A7). Usage: Smartphone, Digital TV, Networking
Rx series has protected memory [MPU] and low latency, predictable real time behaviour (thus no virtual memory). Usage: HDD controller, Engine management etc.
Mx series is for embedded devices, and thus has the lowest gate count, lowest power consumption, different exception handling than Ax/Rx, and a fixed memory map. It only supports the Thumb Instruction Set.

Most ARM cores have 7 basic operating modes. Each has its own stack and its own private subset of registers to ensure protection (except System, which shares the User registers).
SVC (Supervisor): Entered on reset
FIQ: Entered when a high priority (fast) interrupt is raised
IRQ: Entered when a normal interrupt is raised
Abort: To handle memory access violations
Undef: To handle undefined instructions
System: Privileged mode using the same register set as User
User: Mode in which most application (user) tasks run

The protection between these modes is partly implemented by making a subset of the registers specific to each mode. When the core moves from one mode to another, those registers are simply swapped for the new mode's own copies (the contents are not copied).
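
A rough model of this register banking, assuming the classic ARM layout where each exception mode has its own SP and LR, and FIQ additionally has its own r8 to r12. The Core class, the short mode names and bank_for are made up for this sketch, and status registers are left out.

```python
MODES = ("usr", "sys", "svc", "fiq", "irq", "abt", "und")
FIQ_EXTRA = ("r8", "r9", "r10", "r11", "r12")


def bank_for(mode, reg):
    """Return which physical bank holds `reg` while the core is in `mode`."""
    if mode in ("usr", "sys"):
        return "usr"            # System shares the User register set
    if reg in ("sp", "lr"):
        return mode             # every exception mode has a private SP and LR
    if mode == "fiq" and reg in FIQ_EXTRA:
        return "fiq"            # FIQ also gets private r8-r12
    return "usr"                # everything else is common to all modes


class Core:
    def __init__(self):
        self.mode = "svc"       # SVC is the mode entered on reset
        self.physical = {}      # (bank, register name) -> value

    def switch(self, mode):
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode        # nothing is copied; only the visible bank changes

    def write(self, reg, value):
        self.physical[(bank_for(self.mode, reg), reg)] = value

    def read(self, reg):
        return self.physical.get((bank_for(self.mode, reg), reg), 0)


if __name__ == "__main__":
    core = Core()
    core.switch("usr")
    core.write("sp", 0x8000)
    core.switch("irq")
    core.write("sp", 0x4000)    # lands in the IRQ bank, User SP is untouched
    core.switch("usr")
    print(hex(core.read("sp")))
```

An interrupt handler can set up and use its own stack pointer without ever seeing or overwriting the one belonging to the interrupted user task.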

* Privileged modes: all except User. Exception modes: SVC, FIQ, IRQ, Abort, Undef.

RISC/CISC: 8051-CISC, ARM/PIC-RISC

CISC is optimized for code density and provides more high level instructions.
RISC is optimized for low power and has simpler, smaller CPU designs. Most instructions execute in a single cycle.
PIC and 8051 [both 8-bit] are a mix of the accumulator and register-memory models.
ARM is a register-register model.

Operands/Results: Model 
Stack Model: For each ALU operation, arguments are taken from the top of the stack and the result is pushed onto the stack. The JVM (the machine that executes Java programs) uses this model.
Accumulator: One operand is always the accumulator, while the other comes from memory. The result is stored in the accumulator.
Register-Memory: There is more than one register, and the result may be stored in a register other than the accumulator. Operands can also come from memory.
Register-Register: All operands are in registers only. A load operation is required to bring values into registers before the ALU operates on them.
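
To compare two of these models side by side, the sketch below computes (a + b) * c once on a stack machine and once on a register-register machine. Both instruction formats and both interpreters are invented for illustration only.

```python
def eval_stack(program, variables):
    """Stack model: operands are popped from the stack, the result is pushed back."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(variables[args[0]])
        elif op in ("add", "mul"):
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs if op == "add" else lhs * rhs)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


def eval_register(program, variables):
    """Register-register model: the ALU only sees registers, loads fill them first."""
    regs = {}
    for op, *args in program:
        if op == "load":
            rd, name = args
            regs[rd] = variables[name]
        elif op in ("add", "mul"):
            rd, rn, rm = args
            regs[rd] = regs[rn] + regs[rm] if op == "add" else regs[rn] * regs[rm]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs["r0"]           # by convention of this sketch, the result is in r0


STACK_PROGRAM = [("push", "a"), ("push", "b"), ("add",), ("push", "c"), ("mul",)]
REGISTER_PROGRAM = [
    ("load", "r0", "a"), ("load", "r1", "b"), ("load", "r2", "c"),
    ("add", "r0", "r0", "r1"), ("mul", "r0", "r0", "r2"),
]

if __name__ == "__main__":
    values = {"a": 2, "b": 3, "c": 4}
    print(eval_stack(STACK_PROGRAM, values), eval_register(REGISTER_PROGRAM, values))
```

The stack program never names where its operands live, while the register program has to name a register for every operand and result.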

ARM7 and earlier cores had a 3 stage pipeline, while the ARM Cortex A8 has 13 stages.
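8051: The next instruction is fetched only when the previous one has completed. Too primitive. (PIC only overlaps the fetch of the next instruction with the execution of the current one, a 2 stage pipeline.)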
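
A back-of-the-envelope sketch of why a pipeline helps. It assumes an ideal pipeline: every stage takes one cycle and there are no stalls, branches or cache misses, which real cores do not achieve. The function names are made up for this post.

```python
def cycles_unpipelined(instructions, stages):
    """Each instruction goes through all stages before the next one is fetched."""
    return instructions * stages


def cycles_pipelined(instructions, stages):
    """Ideal pipeline: fill it once, then one instruction completes per cycle."""
    if instructions == 0:
        return 0
    return stages + instructions - 1


if __name__ == "__main__":
    n = 100
    for stages in (3, 13):
        print(f"{stages} stages: {cycles_unpipelined(n, stages)} cycles unpipelined, "
              f"{cycles_pipelined(n, stages)} cycles pipelined")
```

For 100 instructions and 3 stages that is 300 cycles without overlap versus 102 with it. Deeper pipelines mainly pay off because each shorter stage allows a higher clock frequency, which this cycle count does not model.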


Reference
Meaning of 32-bit by Dave Crabbe
Process-Threads By Dave Crabbe
Joseph Yiu@ARM
