Construction of a PowerPC distributed multicomputer with DC-Hypermesh Interconnect

Chief Investigator: Professor John Barker (EEE)
Co-investigators: Dr Lewis MacKenzie (CS), Dr Asen Asenov (EEE), Mr P Bailey (CS), Dr David White (C); Research Assistant: Scott Roy (EEE)

  1. Overview

    The project aims to demonstrate in hardware a new paradigm for parallel processing which overcomes the inherent communications bottleneck in conventional parallel computers such as MIMD machines. In the latter, any algorithm that requires global broadcasting at some stage will necessarily suffer steadily diminishing performance as the number of processing elements is increased [7]. Worse still, conventional parallel machines do not allow efficient emulation of general programming architectures; they are difficult to program. The DC-Hypermesh concept [1-6] under investigation here is predicted to overcome most of these difficulties by treating the array of processors as nodes in a communication network. The project will develop general software applications which will demonstrate the gain in efficiency over comparable MIMD machines.
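
    To make the scaling argument concrete, the toy model below charges each node a 1/P share of the compute work plus one global broadcast whose cost grows with the diameter of a 2D mesh. It is a sketch only: the constants and the model itself are illustrative assumptions, not figures taken from [7] or from the project. Speedup first saturates and then falls as processing elements are added.

      import math

      # Toy fixed-size speedup model: all constants are assumed, for illustration only.
      def mesh_speedup(P, W=1_000_000.0, t_flop=1e-8, t_hop=1e-5):
          """Speedup of a W-operation job on P nodes of a 2D mesh, with one
          global broadcast per run charged at (mesh diameter) * t_hop."""
          t_serial = W * t_flop
          diameter = 2 * (math.isqrt(P) - 1)      # assumes P is a perfect square
          t_parallel = (W / P) * t_flop + diameter * t_hop
          return t_serial / t_parallel

      for P in (4, 16, 64, 256, 1024):
          print(f"P = {P:4d}  speedup = {mesh_speedup(P):6.1f}")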

  2. Aims

    Long Term

    Immediate Goals

  3. Benefits and Description of a DC-Hypermesh Architecture

    Many current multi-processing systems rely on point-to-point communications links between the processing nodes. The connection topologies may be sparsely populated (pipelines, rings, meshes) or densely populated (binary n-cubes). Each topology has fast nearest-neighbour connections, but a trade-off is made between the number of nearest neighbours and the wiring density of the system. Although meshes have few nearest neighbours and high message-passing latency between remote nodes, they hold the theoretical performance advantage in systems limited solely by wiring density. This fact, coupled with conceptual and design simplicity, makes them popular at present.
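
    The trade-off can be quantified with the standard textbook figures for node degree (a proxy for pin-out), network diameter (worst-case hop count) and bisection width (a rough proxy for wiring density). The short sketch below is illustrative rather than part of the project's own analysis; it compares a k x k 2D mesh against a binary n-cube at equal node counts P.

      import math

      # Standard topology figures for a k x k 2D mesh and a binary n-cube with P nodes.
      def mesh_2d(P):
          k = math.isqrt(P)                  # assumes P is a perfect square
          return {"degree": 4, "diameter": 2 * (k - 1), "bisection": k}

      def binary_ncube(P):
          n = P.bit_length() - 1             # assumes P is a power of two
          return {"degree": n, "diameter": n, "bisection": P // 2}

      for P in (16, 64, 256, 1024):
          print(P, "mesh:", mesh_2d(P), " n-cube:", binary_ncube(P))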

    However, there are more practical system constraints (for example, multi-chip pin-out, message routing delay times and message routing decision times) for which meshes are non-optimal, and so improved system topologies must be found. The need for better parallel architectures becomes even more urgent when we consider the difficulty of porting general inhomogeneous algorithms to mesh architectures and the diminishing performance of such algorithms as the number of nodes increases.
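
    One way to see the effect of routing delay and decision times is a simple store-and-forward latency model, T = D * (t_route + L / B), for a message of L bytes crossing D hops on links of bandwidth B bytes per second. The sketch below uses assumed figures purely for illustration: even with narrower links, a low-diameter network pays the per-hop routing overhead far fewer times than a high-diameter mesh.

      # Illustrative store-and-forward latency model; all constants are assumptions.
      def latency(hops, msg_bytes, bandwidth, t_route):
          """End-to-end latency: hops * (routing decision time + transfer time)."""
          return hops * (t_route + msg_bytes / bandwidth)

      MSG = 1024           # message size in bytes
      T_ROUTE = 0.5e-6     # assumed per-hop routing decision time (s)

      # 256 nodes: a 16x16 mesh has a worst case of 30 hops; a low-diameter
      # alternative reaching any node in 2 hops is modelled with narrower links.
      print("16x16 mesh   :", latency(30, MSG, 40e6, T_ROUTE))
      print("2-hop network:", latency(2,  MSG, 10e6, T_ROUTE))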

    Figure 1. 4×4 Processor DC-Hypermesh Architecture (P = processor, C = Comms node).


    Hypermesh architectures have communications links that span a number of processing nodes. Groups of nodes are connected by bus, by crossbar switch, or by distributed crossbars. Figure 1 shows a Distributed Crossbar (DC) architecture, in which somewhat increased wiring density is traded for increased communications bandwidth and shorter message-passing latency. We have shown the following benefits for the DC-Hypermesh:
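
    As a concrete illustration of the routing implied by Figure 1, the sketch below assumes that every row and every column behaves as a distributed crossbar, so any node in the same row or column is one hop away; this is an assumed reading of the figure, not a specification of the project's router. Row-first dimension-ordered routing then reaches any node in at most two hops, whereas a conventional 2D mesh needs the full Manhattan distance.

      # Routing sketch under the assumption of one full crossbar per row and per column.
      def dc_hypermesh_route(src, dst):
          """Node sequence for row-first routing on a DC-Hypermesh."""
          (r1, c1), (r2, c2) = src, dst
          path = [src]
          if c1 != c2:
              path.append((r1, c2))   # one row-crossbar hop to the destination column
          if r1 != r2:
              path.append((r2, c2))   # one column-crossbar hop to the destination row
          return path

      def mesh_hops(src, dst):
          return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

      src, dst = (0, 0), (3, 3)
      print("DC-Hypermesh path:", dc_hypermesh_route(src, dst))   # two hops
      print("2D mesh hops     :", mesh_hops(src, dst))            # six hops on a 4x4 mesh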

  4. Timetable

    Months 0-10: Design of a four-node hypermesh cluster based on PowerPC 604 processors connected via PCI interfaces. Design of the component communications logic ICs and PCB design for the new interface cards.
    Months 11-15: Construction and logic/communications testing of the four-node hypermesh cluster.
    Months 13-16: Porting of a fully distributed operating system.
    Months 16-18: Performance testing.
    Months 16-24: Evaluation of parallelised software, harness schemes and the operating system.
    Months 6-24: Development of an industrial partnership.


Appendices

  1. Guidelines for Work

  2. Detailed Comms Node Description

    Figure B1. Data routing in a Comms Node of the DC-Hypermesh.


    Figure B2. Serial implementation of a DC-Hypermesh Comms Node.


  3. References and Annotation

    1. MacKenzie, L.M., M. Ould-Khaoua, and R.J. Sutherland, COBRA: A High-Performance Interconnection for Large Multicomputers. 1991, Department of Computer Science, University of Glasgow.
      The original departmental research report that spawned the two 1992 MacKenzie papers. A good overview of the argument for a distributed crossbar hypermesh, the physical construction of a node, the mathematical model of the system and some performance graphs.

    2. MacKenzie, L.M., et al. The COBRA project: alleviating bandwidth constraints in large multicomputer networks. in 'Supercomputing Symposium 92'. 1992. Canada.
      A longer version of the 1992 MacKenzie 'A Hypergraph-based…' paper. Same argument, but in far more detail. Also describes how the physical construction of the COBRA system was to be carried out with Tom Kelly of Motorola.

    3. MacKenzie, L.M., et al. A Hypergraph-based interconnection network for large multicomputers. in 'CONPAR 92 - VAPP V'. 1992. Lyon, France: Springer-Verlag.
      A very short paper, arguing that the constraints usually applied when evaluating topologies are not necessarily those that hold in practice, and that the distributed crossbar hypermesh should work well under these more realistic conditions.

    4. Ould-Khaoua, M., L.M. MacKenzie, and R.J. Sutherland, Performance of Switching Methods in COBRA Networks. 1993, Department of Computer Science, University of Glasgow.
      Looks at various switching methods (message switching, wormhole routing and virtual cut-through) using the model derived for the distributed crossbar hypermesh. Rates them in terms of performance (under different types of load) and implementation cost.

    5. Ould-Khaoua, M., L.M. MacKenzie, and R.J. Sutherland. Performance Comparison of Hypergraph and Graph Networks. in 'Massively Parallel Processing Applications and Development 1994'. Netherlands: Elsevier Science B.V.
      Models distributed crossbar hypermesh, torus and cube topologies under the constraints of wiring-density cost or pin-out cost. The original contribution is to evaluate these with finite routing decision times: instead of torus topologies winning under wiring density and cubes winning under pin-out cost, the hypermesh bests them both under these more realistic conditions.

    6. Ould-Khaoua, M., Hypergraph-based interconnection networks for large multicomputers. 1994, University of Glasgow.
      Main thesis covering the work. Skims over the physical construction of the COBRA network, and only outlines the ideas behind the comms node design, but goes into great detail about the theoretical performance of systems. Analyses most common meshes and hypermeshes under uniform and non-uniform traffic, random and restricted routing, and the three main forms of message passing across nodes (virtual cut-through, wormhole and message switching). Comprehensive conclusions.

    7. Asenov, A., D. Reid and J.R. Barker, Speed-up of scalable iterative linear solvers implemented on an array of transputers. Parallel Computing, Vol. 20, pp. 375-387, 1994.
      A demonstration of the loss of efficiency of parallel algorithms for device modelling implemented on MIMD architectures as the number of nodes is increased.