|* INMOS (GB), since bought by SGS-Thomson (French); INMOS was the
|inventor and sole manufacturer of transputers
|* Parsytec (still alive, but does not use Transputers any more, Germany)
|* Meiko (GB) produced the "computing surface"
|* IBM had an internal project (codenamed VICTOR)
|and there are many more. Transputers had a built-in communication agent and
|it was very easy to connect them together to form large message passing
|machines in regular, fixed topologies. INMOS' idea was that they should be
|the "transistors" (building blocks) of the new parallel computers.
Cray Computer Corp. (CCC)
One of the nifty Cray-3's was the Cray-3/SSS. It used some of the stuff done
along the "beltway" if you know what I mean. I have the docs on the 3 and
the 4. The following comes from some docs I got from the CCC team.
Cray-3
------
Logic Circuits       GaAs SDFL
                     500 Gate-Equivalent
                     3.935 x 3.835 mm
Memory Circuits      Silicon CMOS SRAM, 4Meg x 1, 25ns Cycle Time
                     8 MWords per Module
Modules              4.1 x 4.1 x 0.25 Inches, 4 Modules per CPU
                     69 Electrical Layers, 22000 Z-Axis Connections
                     Twisted Pair Interconnect
Cooling              Chilled Water/Fluorinert
Cabinets             System Tank and C-Prod
Typical Footprint    252 Square Feet

Cray-4
------
Logic Circuits       GaAs DCFL, 5000 Gate-Equivalent, 5.4 x 5.4 mm
Memory Circuits      21ns Cycle Time, 16 MWords per Module (SAA)
Modules              5.2 x 5.2 x 0.33 Inches, 1 Module per CPU
                     90 Electrical Layers, 36000 Z-Axis Connections
                     Micro-Coaxial Interconnect
Cooling              SAA (Also Air Cooled)
Cabinets             One Cabinet
Typical Footprint    215 Square Feet
The Cray-3/SSS had PIM's. Nifty.
The thing is made of 5.2"x5.2"x.33" modules. One for CPU and one per
16 MW of memory. (This is 21 ns cycle time Toshiba 4Mx1 SRAM, the same
as in the Cray-3; they promise to go to 4Mx4 SRAM this year, doubling
memory size and bandwidth.) The ratio is 4 memory modules per CPU.
72-bit words with SECDED.
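
The 72-bit figure is what one would expect from a standard Hamming-style
SECDED code over a 64-bit data word. The sketch below only checks that
arithmetic; it assumes 64 data bits per word (the register width quoted
further down) and does not claim to reproduce the actual code used in the
machine. The function name is made up for illustration.

```python
def secded_check_bits(data_bits: int) -> int:
    """Check bits needed by a Hamming SECDED code over `data_bits` data bits."""
    r = 0
    # Single-error correction needs 2**r >= data_bits + r + 1.
    while 2 ** r < data_bits + r + 1:
        r += 1
    # One extra overall-parity bit adds double-error detection.
    return r + 1


data_bits = 64  # assumption: one 64-bit data word per memory word
check_bits = secded_check_bits(data_bits)
word_bits = data_bits + check_bits
print(check_bits, word_bits)  # 8 72
```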
The CPU is solid GaAs. The chips are shaved thin, placed in 4x4 arrays on
some sort of MCM-like board, and a 3x3 array of boards makes one logic
layer. (The memory is different, as the memory chips aren't square.)
The module has 4 logic layers (stacked on the outside) and apparently
90 interconnect layers with 36,000 Z-axis vias.
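
Taking the numbers above at face value, the chip count per CPU module can
be worked out as follows. This is only back-of-envelope arithmetic: it
assumes every site in every array is populated, and it uses the
"5000 gate-equivalent" figure from the Cray-4 spec list.

```python
CHIPS_PER_BOARD = 4 * 4    # chips sit in 4x4 arrays on each board
BOARDS_PER_LAYER = 3 * 3   # a 3x3 array of boards makes one logic layer
LOGIC_LAYERS = 4           # logic layers per module
GATES_PER_CHIP = 5000      # "5000 gate-equivalent" per GaAs chip

# Upper bound, assuming no site is left empty (an assumption).
chips_per_module = CHIPS_PER_BOARD * BOARDS_PER_LAYER * LOGIC_LAYERS
gates_per_module = chips_per_module * GATES_PER_CHIP

print(chips_per_module)  # 576
print(gates_per_module)  # 2880000
```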
The CPU runs at 1 GHz. The GaAs chips are billed as "5000 gate-equivalent",
whatever that means. In "DCFL" logic, a term I know not.
The modules are stacked against each other and fluorinert is
chilled and pumped up to drain through them by gravity.
The CPU and memory modules are bolted at one side to bus bars delivering
3 supply voltages + gnd. On the other is a familiar-looking mat of grey
wires. The wires are apparently actually miniature coax cables! The
connectors are single pins on a fine (0.7 mm or so) pitch, with the male
ends being hollow gold-plated cylinders and the females being pins recessed
in plastic. (I suppose this circularly symmetric arrangement is good for
controlled impedance or something.)
The end of the CPU module contains 4 rows of 4 sockets, each socket
containing 2 rows of 39 pins. 2/3 of the pins were signal; the others
were grounds in a S-G-S S-G-S configuration. This provides for a total
of 832 signals from the CPU. The memory modules leave one row of
sockets empty, leaving only 624 signals.
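
The signal counts above follow directly from the socket layout. A quick
check (the constant and function names are mine, not from the docs):

```python
SOCKETS_PER_ROW = 4
PINS_PER_SOCKET = 2 * 39  # each socket has 2 rows of 39 pins


def signal_pins(socket_rows: int) -> int:
    """Signal pins for a module end with `socket_rows` populated socket rows."""
    total_pins = socket_rows * SOCKETS_PER_ROW * PINS_PER_SOCKET
    # S-G-S grouping: two of every three pins carry signals.
    return total_pins * 2 // 3


print(signal_pins(4))  # CPU module, all 4 rows: 832
print(signal_pins(3))  # memory module, one row empty: 624
```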
The sockets attached to plugs that tended to have 3 or 6 pins and 2
or 4 wires coming out the back ends. It was all some sort of white plastic.
The plugs were pretty easy to insert and remove when I tried them.
Some cables also had plugs in the middle.
The basic unit of processing is a 4-processor "quartet" with 256 MW of
memory. Each processor has a full-duplex 32-bit HiPPI channel.
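
The 256 MW figure is consistent with the module numbers given earlier
(16 MW per memory module, 4 memory modules per CPU). A small check,
assuming "MW" means 2**20 words and counting only the 8 data bytes of
each 72-bit word:

```python
CPUS_PER_QUARTET = 4
MEMORY_MODULES_PER_CPU = 4
MWORDS_PER_MODULE = 16

quartet_mwords = CPUS_PER_QUARTET * MEMORY_MODULES_PER_CPU * MWORDS_PER_MODULE

# Assumptions: 1 MW = 2**20 words, 8 data bytes per word (check bits excluded).
quartet_bytes = quartet_mwords * 2**20 * 8

print(quartet_mwords)         # 256
print(quartet_bytes / 2**30)  # 2.0 (GiB of data)
```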
The processor is billed as using the IEEE floating-point format but not, I
note, as doing IEEE math.
Now, on to the architecture. All registers are 64 bits wide. There are
three main sets of three buses, each with one result bus and two operand buses.
These are:
Vi, Vj, Vk: The vector result and operand buses (respectively: Vi is result)
Si, Sj, Sk: The scalar result and operand buses
Ai, Aj, Ak: The address result and operand buses
The memory part of the diagram confuses me. There are clearly 2 bidirectional
memory ports. Each has a "Retry Buffer". Each is connected to 8 rows
of 4 "Memory Ranks for Each Port"; the D unit (they are labelled A, B, C and D)
of each row is shown connected to one of the 8 "Octant"s. Each "Octant"
contains 18 (eighteen) memory banks.
Instruction fetch (8 instruction buffers with 32 entries each) is
connected to port 0. (The line is marked "Fetch Control".) The HiPPI
and console interfaces are connected to port 1. They are also shown
connected to memory (directly, not via one of the ports, which confuses
me), the vector registers, the scalar registers, the address registers,
the instruction buffers, the program counter, and some utility
registers:
- Exchange Address
- Limit Address
- Base Address
- Error register
- Status Register
Address bus Ai, scalar bus Si and vector bus Vi all connect to a line
between the two memory ports that seems to mean "either".
There are three integer vector functional units: integer, logical, and shift.
These get inputs from Vj, Vk, Sj, Sk, Ak or the vector mask register
and deliver results to Vi or the vector mask register.
There are two floating point functional units, shared between vector and
scalar. These take inputs from Vjk or Sjk, and deliver results to Vi, Si
or (apparently) the vector mask register.
There are three integer scalar functional units: integer, logical and shift.
These take input from Sj, Sk or Ak and deliver results to Si.
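
To make the "two operand buses in, one result bus out" pattern concrete,
here is a toy model of the three scalar units. Everything in it is
illustrative: the unit operations (add, AND, left shift), the function
name and the register numbering are my assumptions, not the real
instruction set.

```python
MASK64 = (1 << 64) - 1  # all registers are 64 bits wide

# Hypothetical operations, one per scalar functional unit.
SCALAR_UNITS = {
    "integer": lambda a, b: (a + b) & MASK64,
    "logical": lambda a, b: a & b,
    "shift": lambda a, b: (a << (b % 64)) & MASK64,
}


def execute(unit: str, s: list, i: int, j: int, k: int) -> None:
    """Read operands S[j] and S[k] (Sj/Sk buses), write the result to S[i] (Si bus)."""
    s[i] = SCALAR_UNITS[unit](s[j], s[k])


s = [0] * 8  # 8 scalar registers
s[1], s[2] = 0xF0, 4
execute("integer", s, 0, 1, 2)  # S0 = S1 + S2
execute("shift", s, 3, 1, 2)    # S3 = S1 << S2
print(hex(s[0]), hex(s[3]))     # 0xf4 0xf00
```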
A 64-bit real time clock is readable on Si and writeable from Sk.
The vector length register is readable on Ai and writeable on Ak.
There is an address adder and a 35-bit address multiplier. Inputs on
Aj and Ak and output on Ai.
The functional units are marked as supporting "chaining" and "tailgating",
which I don't understand and neglected to ask about the meaning of.
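
For what it's worth, "chaining" on Cray-style vector machines generally
means that a dependent vector instruction starts consuming elements as
soon as the first result leaves the producing unit, instead of waiting
for the whole vector. A rough timing sketch, with made-up latencies (the
real pipeline depths are not given in the docs), and not attempting to
model tailgating:

```python
def unchained_cycles(n: int, lat_a: int, lat_b: int) -> int:
    """Unit B waits until unit A has delivered all n elements."""
    return (lat_a + n) + (lat_b + n)


def chained_cycles(n: int, lat_a: int, lat_b: int) -> int:
    """Unit B starts on each element as soon as unit A produces it."""
    return lat_a + lat_b + n


n = 64               # vector register length
lat_a, lat_b = 6, 7  # hypothetical pipeline latencies in clocks
print(unchained_cycles(n, lat_a, lat_b))  # 141
print(chained_cycles(n, lat_a, lat_b))    # 77
```

In this simplified model each unit has a startup latency and then
delivers one element per clock, so chaining hides almost all of the
second instruction's vector time.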
The program register is readable on Ai and writeable from Aj, as well as
being fed from the instruction buffers, the console/HiPPI interface and
a built-in incrementor. (+1/3/5)
There are 8 vector registers of length 64, triple-ported on Vi (write), Vj
and Vk (read), 8 scalar registers (Si, Sj and Sk) and 8 address registers
(Ai, Aj and Ak), as well as 64 "temporary" registers (apparently an
addition since the Cray-3) that are read/write accessible on Vi, Si and Ai.
"Local memory has been eliminated and replaced by a new set of registers--
the temporary (T) registers"
There are also "up to" 64 semaphore flags. The memory transfer rate
is billed as 2 GW/sec/processor.
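
The 2 GW/sec number lines up with the 1 GHz clock and the two memory
ports mentioned above: it works out to two words per clock. A quick
check, assuming G means 10**9 and counting 8 data bytes per word:

```python
WORDS_PER_SEC = 2_000_000_000  # 2 GW/sec per processor
CLOCK_HZ = 1_000_000_000       # 1 GHz CPU clock
DATA_BYTES_PER_WORD = 8        # assumption: 64 data bits, check bits excluded

words_per_clock = WORDS_PER_SEC // CLOCK_HZ
bytes_per_sec = WORDS_PER_SEC * DATA_BYTES_PER_WORD

print(words_per_clock)       # 2
print(bytes_per_sec / 1e9)   # 16.0 (GB/sec per processor)
```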
Up to 32 processors may go into a single "node" in one cabinet. A
"cluster" bus of 2 GB/sec (full-duplex, per node, <= 4 nodes) is
promised "mid-1995" for systems up to 128 processors. There's some
mention on a features list of 64-bit HiPPI. ("Support for 200 Megabyte