> early microprocessors such as the 6502 and Z-80 didn't use microcode
If you look at a die shot of the 6502, for example, you will see an area that looks confusingly similar to where microcode would be stored. Turns out, this is a PLA, not a microcode store. The best quick intro to microcoding that I have encountered is https://people.cs.clemson.edu/~mark/uprog.html.
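For anyone unfamiliar with PLAs, the key idea is two programmable planes: an AND plane that matches input bit patterns (with don't-cares) and an OR plane that asserts control lines for each matching product term. Here's a rough Python sketch of that idea, with invented product terms and signal names (nothing here corresponds to the real 6502 decode PLA):

```python
# Toy model of a PLA: an AND plane of product terms (with don't-cares)
# feeding an OR plane of output lines. Terms and signal names are invented
# for illustration, not taken from any real chip.

def pla_outputs(inputs, and_plane, or_plane):
    """inputs: dict of signal name -> 0/1.
    and_plane: list of product terms; each term maps signal -> required value
               (signals omitted from a term are don't-cares).
    or_plane: list of sets of output names, parallel to and_plane."""
    active = set()
    for term, outputs in zip(and_plane, or_plane):
        if all(inputs[name] == want for name, want in term.items()):
            active |= outputs
    return active

# Two made-up product terms: one fires on opcode bits 7..5 == 101,
# the other on timing state T2 regardless of opcode.
and_plane = [
    {"op7": 1, "op6": 0, "op5": 1},
    {"t2": 1},
]
or_plane = [
    {"load_a"},            # control lines asserted by term 0
    {"inc_pc", "fetch"},   # control lines asserted by term 1
]

print(pla_outputs({"op7": 1, "op6": 0, "op5": 1, "t2": 0}, and_plane, or_plane))
```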
So where's the line drawn between instruction decoding, and microcoding? It sounds like microcoding is when a single instruction in the program resolves to a _sequence_ of operations that the processor executes?
There is one important practical criterion, which is whether you can change what some instructions do, without redesigning the layout of the CPU.
Both ROMs and PLAs have a regular array structure, where you can change the stored bits with some minor changes in one of the masks, without having to make any changes in the layout of the CPU.
So the most useful definition of whether an instruction is micro-programmed or hard-wired is whether the method of changing what it does is by changing just the values of some bits in a table or by redesigning some CPU part.
This is the right definition from the point of view of the CPU designer.
Application programmers tend to partition the instructions of modern CPUs into hard-wired and micro-programmed based on whether they execute in a single clock cycle or in multiple clock cycles, but in old CPUs it was quite frequent to have hard-wired instructions that executed in multiple cycles. Modern CPUs have far more resources available, so it is now normal for any hard-wired instruction to also be a single-cycle instruction, but there is no logical necessity for this.
An extra complication in many modern CPUs is that even if most instructions are usually hard-wired, there are also means to intercept the decoding for any opcode and replace the hard-wired operation with a micro-program, in order to be able to correct any dangerous bugs.
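A toy illustration of both points, the "bits in a table vs. redesign" distinction and the patch mechanism just mentioned, with everything invented for the sake of the example:

```python
# Illustrative sketch (not any real CPU): the same two-instruction machine
# decoded two ways. In the "hard-wired" version, changing what an opcode does
# means rewriting logic; in the "table" version it means changing stored bits,
# and a patch mechanism can redirect one opcode to a new operation.

def execute_hardwired(opcode, a, b):
    # Changing behaviour here means editing the logic itself
    # (the analogue of redoing part of the CPU layout).
    if opcode == 0x01:
        return a + b
    elif opcode == 0x02:
        return a - b
    raise ValueError("undefined opcode")

# "Table" flavour: behaviour lives in entries that could, in hardware,
# be mask-programmed bits in a ROM or PLA.
decode_table = {
    0x01: lambda a, b: a + b,
    0x02: lambda a, b: a - b,
}

def execute_table(opcode, a, b):
    return decode_table[opcode](a, b)

# Patch mechanism in the spirit of the paragraph above: intercept one opcode
# and replace its operation without touching anything else.
decode_table[0x02] = lambda a, b: (a - b) & 0xFFFF   # e.g. fix a wrap-around bug

print(execute_table(0x02, 3, 5))   # -> 65534 after the "patch"
```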
Microcoding is when the internal control structure of the processor operates in much the same way as the outer machine language program is processed. The microcode is a sequence of instructions stored in a memory (usually ROM, but can be EPROM or RAM), there is a microcode-program-counter, and the "microcontroller" functions by fetching "microcode instructions" one after another from the memory addressed by the microcode-program-counter.
The results of those fetched microcode instructions then control the actual CPU internal hardware elements. Depending on the type of microcode (horizontal or vertical), those microcode instructions' bits either directly control the hardware elements or get further decoded before controlling them.
A good 10,000-foot description is "a mini-CPU controlling the hardware, with that mini-CPU itself controlled by the externally visible machine language instructions."
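In the spirit of that description, here is a deliberately tiny "vertical" microcode engine in Python. The micro-ops, map ROM, and register names are all made up; it only shows the fetch-from-micro-PC loop, nothing about any real machine's microcode:

```python
# A toy "vertical" microcode engine. The micro-ops, fields and register
# names are invented for illustration; no resemblance to the real 8086
# microcode is intended.

UCODE_ROM = {
    # macro-opcode 0x01 ("ADD"): micro-routine starting at micro-address 0
    0x01: 0,
}

# Each microinstruction is a small encoded word: (operation, dest, src)
MICROCODE = [
    ("MOVE", "tmp", "operand"),   # uaddr 0
    ("ALU_ADD", "acc", "tmp"),    # uaddr 1
    ("END", None, None),          # uaddr 2: return to macro-instruction fetch
]

def run_macro_instruction(opcode, regs):
    upc = UCODE_ROM[opcode]            # load micro-program counter from the map ROM
    while True:
        op, dst, src = MICROCODE[upc]  # fetch the microinstruction
        upc += 1                       # sequential micro-PC; no micro-branches in this toy
        if op == "MOVE":
            regs[dst] = regs[src]
        elif op == "ALU_ADD":
            regs[dst] = (regs[dst] + regs[src]) & 0xFFFF
        elif op == "END":
            return

regs = {"acc": 10, "operand": 32, "tmp": 0}
run_macro_instruction(0x01, regs)
print(regs["acc"])   # 42
```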
It's a bit fuzzy and I don't think there's a single universally agreed on line. See the discussion in the "Microcode in RISC?" section of Ken's analysis of the ARM1 at http://www.righto.com/2016/02/reverse-engineering-arm1-proce... -- Ken thinks that it makes sense to think of it as microcode or perhaps hybrid/partially microcoded, but the designers of this CPU didn't think of it like that and viewed it as a non-microcoded design...
As far as I'm concerned it's bad terminology. Microcode to me at least means the processor trapping and emulating some edge case or similar, so I nearly always say decode to micro-ops whenever I can instead.
This is part of a series [1] by the way. I really liked the one about the latches [2] and the die shrink photos are pretty cool. We always hear about it but here you get to see it. [3]
Not directly in the article, but a key change within Intel after the 4004 was removing the mandate that all chips be a 16-pin DIP. This severely limited the performance of the 4004: there weren't enough pins to transfer addresses and data in one clock, so they had to be serialized through shift registers instead.
The 8008, Intel's first 8-bit processor, direct ancestor to the 8080 and indirectly to the Z80 and 8086, also came in a far-too-small 18-pin package.
In hindsight it seems rather inscrutable. Why were they hobbling their product like that? Well it seems just the arbitrary preference of one executive. But the seeming absurdity of it is partially an artifact of later thinking, from the era of single-chip microcontrollers. The 4004 and 8008 both needed a large amount of dedicated support electronics. Not just bus multiplexers, but clock drivers, special memory interfaces, etc. You couldn't turn an 8008 on without at least half a dozen support chips which were as much an integral part of the "computer" to the designers as the 8008 itself was. And while multiplexing brutally hindered the chip's performance, oddly enough pure performance doesn't seem to have been a major design goal for either the 4004 or 8008. Basically anything would be fast enough -- even multiplexed at 0.5 MHz -- for the industrial control and calculator-like applications envisioned.
Looking at the 8008 wiki, if the quotes are to be believed, they were actually scared of performance, believing it would scare off customers who bought Intel memory (their main business at the time, apparently) for their own in-house processors. So it seems very deliberate. That said, the 8008 was primarily intended for terminals, where performance really wasn't a concern. But that didn't stop people from making proper minicomputers from it.
The 8008 used P-channel silicon gate MOS technology and was packaged in an 18-pin package: a very poor choice, imposed by Intel management’s aversion to high pin-count packages.
For details on how Intel was fixated on 16-pin chips, see the Oral History of Federico Faggin [1]. He describes how 16-pin packages were a completely silly requirement, but the "God-given 16 pins" was like a religion at Intel. He hated this requirement because it was throwing away performance. When Intel was forced to 18 pins by the 1103 memory chip, it "was like the sky had dropped from heaven" and he had "never seen so many long faces at Intel."
I think it would have been impossible otherwise. It's still annoying with the multiplexed address and data pins in the DIP-40: you have to demultiplex them with latches and transceivers. And don't forget the 8088 (used in the PC) couldn't do data in one clock due to its 8-bit data bus (the 386SX was similar at the next step up). Pretty cool for us that the 8086/8 are breadboardable, though.
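For the curious, a rough model of what that demultiplexing looks like: an external transparent latch (a '373-type part) grabs the low address byte while ALE is high, so the same pins can carry data later in the bus cycle. This is schematic, not cycle-accurate, and the pin values are invented:

```python
# Simplified model of demultiplexing a multiplexed AD0-AD7 bus: during T1 the
# CPU drives the low address byte and pulses ALE, and an external latch
# captures it; during later T-states the same pins carry data.

class AddressLatch:
    def __init__(self):
        self.q = 0
    def clock(self, ale, ad_bus):
        if ale:                # transparent latch: capture while ALE is high
            self.q = ad_bus
        return self.q

latch = AddressLatch()

# T1: multiplexed pins carry the address low byte, ALE pulses high
ad_pins = 0x34
addr_low = latch.clock(ale=1, ad_bus=ad_pins)

# T2/T3: the same pins now carry read data; the latch holds the address
ad_pins = 0xA7
addr_low = latch.clock(ale=0, ad_bus=ad_pins)
data = ad_pins

print(hex(addr_low), hex(data))   # 0x34 0xa7
```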
They could've definitely put more shift registers on die and squeezed an 8086 in a DIP-16, but then the support circuitry would become even more complex.
It's worth noting that the trend has been towards (high speed) serial buses in later CPUs, e.g. the Pentium 4 was the last generation to use a parallel FSB; that was replaced with the serial https://en.wikipedia.org/wiki/Direct_Media_Interface
Fair enough. I think the original Datapoint design for what became the 8008 was even entirely serial, including the memory. I guess I should have said it would have been exceptionally more annoying. It's hard enough to get a computer out of it. So it'd be impossible in the sense that they wouldn't have met the goal of getting that relatively easy kit out of it and there wouldn't have been any PC.
Just a tangentially related plug for a YouTube channel called Micrographia. It's from the same guy that does the Breaking Taps channel (also a great watch), but he did a tour of the 80486 with optical and SEM scopes.
ARM code is a fair bit bigger than 16 bit 8086 code, so there is that to consider. It makes total cost of ownership much higher when we're talking about systems with RAM measured in hundreds of kilobytes or single digit megabytes.
But basically what you noticed is exactly why RISC swept everything else away in the late 80s to early 90s. If you have a 50,000 or 100,000 transistor budget and RAM is relatively cheap and fast, then complex microcoded designs really are a bad idea. You can get so, so much more performance out of a design like MIPS or ARM rather than an 80186, etc.
Hindsight is 20/20. If you sent me back to 1976 - 77 in a time machine, I would propose not something like the 68000 or 8086, but something very much like MIPS, or Berkeley RISC (minus the register windows). It could have been done in the late 70s. It would have been easily 3 - 5x as fast per MHz, and could probably be clocked faster. Doing so just wasn't obvious until later. Everyone was trying to pack as much sophistication into the instruction set and architecture, to ease assembly language programming, as possible.
How critical would a load–store architecture be to making that late 70s/early 80s "pre-RISC" better than the CPUs Intel and Motorola were cranking out?
E.g., imagine a simpler 16-bit extension of the 8080 than the 8086 was: basically just more registers, with a focus on making instructions execute in fewer cycles, while maintaining a register–memory architecture (potentially also removing or simplifying some instructions, though I think the 8080 ISA was already pretty minimal)?
> 29000 transistors? But it's the same as an ARM2 which apparently was full 32bit and had an integer multiplier.
It's a very good point. I think it's worth asking a few questions in return.
How many years separated the two designs?
Does ARM2 support an equivalent ISA, in terms of features (not encoding)?
For instance, the 8086 has support for BCD integers, specialized instructions for loops, and the ability to use its registers as 16 bits or 8 bits (doubling the register count in the latter case).
Conversely, ARM2 had, for example, a barrel shifter, immediate constants, every instruction being conditional, etc., all of which are absent from 8086.
My point exactly. They have different ISAs, and as such cannot be compared on transistor count alone.
They each correspond to a different era, with different needs, as the BCD support clearly shows.
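To make the BCD point concrete, this is roughly the decimal-adjust-after-add step that an instruction like the 8086's DAA saves you from writing by hand, simplified here to one packed-BCD byte and ignoring the real flag semantics:

```python
# Adding two packed-BCD bytes with an ordinary binary add, then applying a
# decimal-adjust step. Simplified illustration, not the exact DAA flag rules.

def bcd_add_byte(x, y):
    s = x + y                             # plain binary add of 0x00..0x99 operands
    if (x & 0x0F) + (y & 0x0F) > 9:       # low digit overflowed past 9
        s += 0x06
    if s > 0x99:                          # high digit overflowed past 9
        s += 0x60
    return s & 0xFF, s > 0xFF             # result byte and decimal carry out

print(bcd_add_byte(0x38, 0x45))   # (0x83, False): 38 + 45 = 83
print(bcd_add_byte(0x75, 0x48))   # (0x23, True):  75 + 48 = 123
```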
If today's HW engineers had a chance to implement a small CPU core with the same transistor count, would they come up with the same ISA as the ARM1 or 8086? Would they choose to implement integer division (DIV and IDIV in x86), or leave it to software (as ARM did)? Would they pick CISC, RISC, VLIW, or something else?
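For the division case, "leave it to software" means something like a restoring shift-and-subtract loop supplied by the compiler or runtime library. A purely illustrative sketch, not any particular library's routine:

```python
# Unsigned 16-bit restoring division: the kind of routine you lean on when
# the ISA has no divide instruction (as on early ARMs). Illustrative only.

def udiv16(dividend, divisor):
    if divisor == 0:
        raise ZeroDivisionError
    quotient = 0
    remainder = 0
    for i in range(15, -1, -1):             # bring down one dividend bit per step
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        quotient <<= 1
        if remainder >= divisor:            # trial subtract succeeds: set quotient bit
            remainder -= divisor
            quotient |= 1
    return quotient, remainder

print(udiv16(1000, 7))   # (142, 6)
```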
Yes, for the most part you can reconstruct the circuits from the photos. (Which I'm doing.) A key thing that doesn't show up is which transistors are enhancement and which are depletion; this depends on the doping. It's straightforward to figure this out from the layout, though. Except in the Z80 processor: the designers famously used some depletion transistors in unexpected places as "traps", so anyone who tried to clone the chip visually would encounter subtle failures. I'm pretty sure that 8086 didn't do that, though.
The other aspect of your question is that there's a lot of implicit manufacturing knowledge you'd need if you tried to reconstruct the chip: things like the resistance of the lines and the characteristics of the transistors that you'd need to get right for the chip to work reliably.
> Does the microcode contain any "junk code" that doesn't do anything?
> It seems to! While most of the unused parts of the ROM (64 instructions) are filled with zeroes, there are a few parts which aren't. The following instructions appear right at the end of the ROM
Could it be the signature of the microcode implementor?
Many of the early software innovators' names are known. Who was the team behind the 8086? Superficial googling gives all credit to "Intel", never listing any individuals who designed this thing.
The Computer History Museum has some oral histories from folks who worked in this era. They don't have the 8086 design team, but they have different panels featuring people from the 8080, Z80, and 680x0 design teams. (Some from an 8086 marketing team too)
I think Ken should take a look at the NEC V30 and see how that stacks up against the venerable 8086. I'm already running a V30 as a coprocessor on my Raspberry Pi.