Papers by Klaus Waldschmidt
Lösung rechenaufwendiger Probleme (Solving Computationally Intensive Problems)

For parallel and distributed systems to gain wider acceptance than they have to date, they must become significantly easier to program. Fundamentally, parallel programming is more difficult than sequential programming as long as data and computation must be distributed by the programmer. Cache Only Memory Architectures (COMAs) provide a Distributed Shared Memory (DSM) where data distribution is performed automatically and transparently. This paper generalizes this idea to achieve the same distribution for computation, thus arriving at an automatic and transparent form of scheduling. Where COMA literature normally makes no assumptions concerning the parallel programs which use the DSM, we use special compiler techniques originally developed for multithreaded and dataflow architectures. Having done so, we can specify ways of significantly simplifying the basic COMA coherency protocols, while at the same time enabling automatic, transparent, adaptive run-time scheduling.
Pipelining and parallel training of neural networks on distributed-memory multiprocessors
This paper presents a parallel neural network simulator, implemented on a Parsytec Multicluster2 transputer system. In practical use, neural networks often employ the backpropagation learning rule, as this supervised learning method can be applied to a wide field of recognition problems. The authors focus on the acceleration of backpropagation learning by combining pipelining and parallel training methods. The pipelining model

This paper presents a genetic solving approach to the Traveling Salesman Problem (TSP), which can be significantly accelerated by using an associative processor architecture, called the AM3. To compile the genetic TSP algorithm, a C++ programming environment containing an associative object library is needed as well as an AM3 code interpreter to count machine instructions [10]. Further, a recombination operator, known in the literature as "Partially Mapped Crossover" (PMX), is employed [?]. The associative character of this operator makes it possible to reduce its time complexity from quadratic to linear (from O(n^2) to O(n)). This reduction is noticeable in practice, since genetic recombination demands an increasing portion of the total run-time with growing problem size. As the TSP can be seen as a typical representative of permutation problems, it is assumed that the combination of genetic and associative processing is suitable for similar applications.
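The PMX operator named in this abstract can be illustrated with a minimal Python sketch. It is not the AM3 implementation: the function name `pmx`, its parameters, and the use of a Python dictionary as a stand-in for associative (content-addressed) lookup are all assumptions made for illustration. The dictionary answers "where does this city sit in the copied segment?" in expected constant time, which is what keeps the whole operator linear rather than quadratic.

```python
def pmx(parent1, parent2, start, end):
    """Partially Mapped Crossover (illustrative sketch).

    The child keeps parent1[start:end] in place; every other position
    is filled from parent2, following the segment mapping on conflicts.
    """
    n = len(parent1)
    child = [None] * n
    child[start:end] = parent1[start:end]

    # Value -> position index for the copied segment. This dictionary is
    # an assumed software substitute for an associative memory lookup.
    pos_in_segment = {parent1[i]: i for i in range(start, end)}

    for i in list(range(0, start)) + list(range(end, n)):
        candidate = parent2[i]
        # If the value is already used in the segment, follow the mapping
        # parent1[j] -> parent2[j] until an unused value is reached.
        while candidate in pos_in_segment:
            candidate = parent2[pos_in_segment[candidate]]
        child[i] = candidate
    return child


tour_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
tour_b = [9, 3, 7, 8, 2, 6, 5, 1, 4]
child = pmx(tour_a, tour_b, 3, 7)
print(child)  # [9, 3, 2, 4, 5, 6, 7, 1, 8]
```

Replacing the dictionary with a linear search such as `parent1.index(candidate)` would still give the same child, but each conflict would cost O(n), which is the quadratic behaviour the abstract contrasts against.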
This paper considers whether the seemingly disparate fields of Computational Intelligence (CI) and computer architecture can profit from each other's principles, results and experience. In the process, we identify important common issues, such as parallelism, distribution of data and control, granularity and regularity. We present two novel computer architectures which have profited from principles found in CI, and identify two constraints on CI to eliminate the hidden influence of the von Neumann model of computation.
ADARC: a fine grain dataflow architecture with associative communication network
... Current VLSI technologies support the design of VLSI building-blocks for a modular construction of the associative network. ... In deviation from the pure architecture, the Associative Crossbar Module provides only one CAM-word for each input line. ...
Forum on Specification and Design Languages, 2004
The article describes semi-symbolic methods for the analysis of control and signal processing systems, including static and dynamic uncertainties. This semi-symbolic description of uncertainties is based on affine arithmetic, to which a short introduction is given. As affine arithmetic can describe only static uncertainties, an extension for the effects of dynamic uncertainties is described, and its feasibility is demonstrated by an example that delivers the frequency-dependent noise of an output stage of a delta-sigma converter.
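For readers unfamiliar with affine arithmetic, the following Python sketch shows the basic idea under stated assumptions. An uncertain quantity is written as a center value plus a sum of coefficients times noise symbols, each ranging over [-1, 1]. The class name `Affine` and its methods are hypothetical, and only static uncertainties are covered; the dynamic extension described in the article is not reproduced here.

```python
import itertools

_fresh = itertools.count()  # source of new noise-symbol indices


class Affine:
    """Affine form: center + sum(coeff_i * eps_i), with eps_i in [-1, 1]."""

    def __init__(self, center, terms=None):
        self.center = float(center)
        self.terms = dict(terms or {})  # noise-symbol index -> coefficient

    @classmethod
    def from_interval(cls, lo, hi):
        # One new noise symbol represents the whole interval [lo, hi].
        return cls((lo + hi) / 2, {next(_fresh): (hi - lo) / 2})

    def radius(self):
        return sum(abs(c) for c in self.terms.values())

    def interval(self):
        r = self.radius()
        return (self.center - r, self.center + r)

    def __add__(self, other):
        terms = dict(self.terms)
        for k, c in other.terms.items():
            terms[k] = terms.get(k, 0.0) + c
        return Affine(self.center + other.center, terms)

    def __neg__(self):
        return Affine(-self.center, {k: -c for k, c in self.terms.items()})

    def __sub__(self, other):
        return self + (-other)

    def __mul__(self, other):
        terms = {}
        for k, c in self.terms.items():
            terms[k] = terms.get(k, 0.0) + other.center * c
        for k, c in other.terms.items():
            terms[k] = terms.get(k, 0.0) + self.center * c
        # The quadratic remainder is bounded conservatively and attached
        # to a new noise symbol.
        remainder = self.radius() * other.radius()
        if remainder:
            terms[next(_fresh)] = remainder
        return Affine(self.center * other.center, terms)


x = Affine.from_interval(1, 3)
y = Affine.from_interval(0, 2)
print((x - x).interval())  # (0.0, 0.0)
print((x + y).interval())  # (1.0, 5.0)
print((x * y).interval())  # (-2.0, 6.0)
```

Because both occurrences of `x` share the same noise symbol, `x - x` cancels exactly, whereas plain interval arithmetic would return [-2, 2]. The product bound is conservative: the true range of `x * y` is [0, 6].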

While traditional parallel computing systems are still struggling to gain wider acceptance, the largest parallel computer that has ever been available is currently growing in the form of the Internet as a communication resource. Unfortunately, it too is rarely used for parallel computation. The main reason for the rejection of parallel computers is the difficulty of parallel programming. In this paper we propose the Self Distributing Associative ARChitecture (SDAARC). It has been derived from the Cache Only Memory Architecture (COMA). COMAs provide a distributed shared memory (DSM) with automatic distribution of data. We show how this paradigm of data distribution can be extended to the automatic distribution of instruction sequences (microthreads). We show how microthreads can be extracted from legacy C code to produce code that can automatically be parallelized by SDAARC at run time. We also discuss how SDAARC can be implemented on a tightly coupled multiprocessor system, on heterogeneous LAN-based computer networks (Intranet) and on WANs of computing resources.
Analog/Digital- und Digital/Analog-Umsetzung (Analog-to-Digital and Digital-to-Analog Conversion)
In the preceding chapters we became acquainted with concepts and building blocks for implementing digital control units, operation units, and processors. In many cases these sequential circuits serve to process measured values. However, the physical quantities occurring in nature (e.g., pressure, voltage, temperature) are generally analog signals, which cannot be taken over directly by digitally operating systems. The analog signals must therefore either be digitized for processing or, in the reverse direction, be converted back into analog signals. The digital representation of analog measured values and their digital processing offer a number of significant advantages in many applications.
Special section on associative processors and memories
IEE proceedings, 1989
Modellierung des Implementierungsraumes im Analog/Digital Co-Design (Modeling the Implementation Space in Analog/Digital Co-Design)
MBMV, 2000
MBMV, 2007
Because of the large share of software, embedded hardware/software systems are increasingly developed on the basis of C/C++. Hardware modeling is supported by language extensions such as SystemC. In addition to digital components, embedded systems increasingly include analog components. This contribution gives an overview of the extensions for modeling analog/digital systems developed within the OSCI SystemC-AMS Working Group. It also shows how polymorphic signals support top-down design and mixed-level simulation in a heterogeneous top-down design flow. This work was supported within the BMBF/edacentrum project SAMS under grant number 01M3070D and the EU project ANDRES (IST-5-033511).
Aktivierung und Zuordnung von Kooperierenden Prozessen im Assko-Mehrprozessorsystem (Activation and Assignment of Cooperating Processes in the ASSKO Multiprocessor System)
Informatik-Fachberichte, 1980
Processes that run truly in parallel in a modular, symmetric multiprocessor environment require conflict-free mutual coordination. By separating all the global synchronization variables needed for this and storing them in a modular associative memory coordinator system (ASSKO), the processes and resources of the multiprocessor system can be made to work together effectively. This contribution describes the use of these synchronization mechanisms in programming concurrent, cooperating processes that start asynchronously with respect to one another.
Dagstuhl Seminar Proceedings, 2008
Die Organisation der Selbstverwaltung der Prozessormodule (The Organization of the Processor Modules' Self-Management)
The processor modules of the ASSKO multiprocessor system are to be regarded as autonomous, active resources capable of cooperation, which can participate on an equal footing in executing a decomposable computing problem. For a computing problem to run on a multiprocessor system, it must not only be translated into a form suited to the machine but also be broken down into its subtasks that can be executed sequentially and independently of one another.
Technische Realisierung des ASSKO-Systems (Technical Implementation of the ASSKO System)
This section presents the circuit-level construction of the ASSKO system by means of several photographs. Figure 18 shows the front view of the complete system; the rear view is shown in Figure 19.