Academia.edu

Processing Element

A processing element is a fundamental unit within a computational architecture that performs data processing tasks. It typically consists of a processing unit, memory, and interconnection capabilities, enabling it to execute instructions and manipulate data in parallel or sequentially, depending on the system's design.
In this paper we present a method for mapping streaming applications, with real-time requirements, onto a reconfigurable MPSoC. In this method, the performance of the hardware architecture (the reconfigurable Processing Element, the... more
Reed-Solomon (RS) codes have been widely used in a variety of communication systems such as space communication links, digital subscriber loops, and wireless systems, as well as in networking communications and magnetic and data... more
photometric performances, estimated based on the stellar populations crucial for understanding the formation and evolution of the Galaxy, are discussed. Performance of the GAIA photometric systems (PSs) is evaluated taking into account... more
One of the goals of current VLSI design is to represent very complex systems on a chip. However, certain physical limitations sometimes exist, such as the number of wires and pins that can be used on IC packages. Yet it is known that fewer... more
In this paper, we present a Mirroring Neural Network architecture to perform non-linear dimensionality reduction and Object Recognition using a reduced low-dimensional characteristic vector. In addition to dimensionality reduction, the... more
The Instruction-Set extension problem, the addition of a set of new complex instructions to a given Instruction-Set, has been one of the major topics in recent years. This problem in its general formulation requires an... more
The Discrete Wavelet Transform (DWT) is an important operation in applications of digital signal processing. In this paper, we review several traditional DWT implementation approaches, e.g., application-specific integrated circuits,... more
In this paper, we present a runtime memory allocation algorithm, that aims to substantially reduce the overhead caused by shared-memory accesses by allocating memory directly in the local scratch pad memories. We target a heterogeneous... more
Reliability of a distributed processing system is an important design parameter that can be described in terms of the reliability of processing elements and communication links and also of the redundancy of programs and data files. The... more
Clouds have a critical role in many studies, e.g., weather- and climate-related studies. However, they represent a source of errors in many applications, and the presence of cloud contamination can hinder the use of satellite data.... more
In recent years, a number of database machines consisting of large numbers of parallel processing elements have been proposed. Unfortunately, there are two main limitations in database processing that prevent a high degree of parallelism;... more
Due to the increasing complexity of scientific models, large-scale simulation tools often require a critical amount of computational power to produce results in a reasonable amount of time. For example, multi-system wireless network... more
The invention of ever faster and more complex computers makes more and more applications of digital signal processing possible. And in turn, new application areas require increasingly more computational power. One of the most... more
The study of C-enhanced metal-poor stars with s-process element overabundances offers a chance to test AGB models at low metallicity and to constrain nucleosynthesis codes. We analyzed a total of 13 C-enhanced metal-poor stars,... more
We provide a high-dispersion line-by-line abundance analysis of five red horizontal-branch (HB) stars in the extremely metal-rich Galactic globular cluster NGC 6553. These red HB stars are significantly hotter than the very cool stars near... more
This paper presents the chemical abundance analysis of a sample of 27 red giant stars located in 4 populous intermediate-age globular clusters in the Large Magellanic Cloud, namely NGC 1651, 1783, 1978 and 2173. This analysis is based on... more
Parallel computing is currently the dominant architecture in embedded systems. Concurrency improves the performance of the system without increasing the clock speed, which affects the power consumption of the system. However,... more
Recent progress in processing speeds, network bandwidths, and middleware technologies has contributed towards novel computing platforms, ranging from large-scale computing clusters to globally distributed systems. Consequently, most... more
This thesis investigates a new protocol, iFlame, designed to provide highly scalable, distributed, real-time communication systems for the Internet. Scalability is achieved by using a client-oriented model rather than a more traditional... more
This paper describes a multi-layer maze routing accelerator which uses a two-dimensional array of processing elements (PEs) implemented in an FPGA. Routing for an L-layer N X N grid is performed by an array of N X N PEs that... more
More and more parallel and distributed systems (clusters, grid and global computing) are available all over the world, opening new perspectives for developers of a large range of applications including data mining, multi-media, and... more
Discourse markers are verbal and non-verbal devices that mark transition points in communication. They presumably facilitate the construction of a mental representation of the events described by the discourse. A taxonomy of these... more
The failure of some national projects AXES to expected results. According to experts one of the reasons is the lack of adequate theoretical apparatus for generating high-branching processes, unjustified detraction of opportunities... more
Cyclops is a new architecture for high performance parallel computers being developed at the IBM T. J. Watson Research Center. The basic cell of this architecture is a single-chip SMP system with multiple threads of execution, embedded... more
As the main scope of mobile embedded systems shifts from control to data processing tasks, high performance demand and limited energy budgets are often seen as conflicting design goals. Heterogeneous, adaptive multicore systems are one... more
Context. Substellar objects have extremely long life spans. The cosmological consequence for older objects is low abundances of heavy elements, which in turn results in a wide distribution of objects over metallicity, hence over age.... more
Error diffusion dithering is a technique that is used to represent a grayscale image on a printer, a computer monitor or other bi-level displays. For a number of years it was believed that error diffusion algorithms cannot be... more
Elementary features are detected by calculating the number of objects inside partly overlapping windows fixed in an image plane. Each window's contents is processed by a separate processing element (PE) on a SIMD grid or pyramid... more
In this paper, a regular bidirectional linear systolic array (RBLSA) for computing all-pairs shortest paths of a given directed graph is designed. The obtained array is optimal with respect to the number of processing elements (PE) for a... more
This paper presents a scale and rotation invariant face detection system. The system employs a hierarchical neural network, called SICoNNet, whose processing elements are governed by the nonlinear mechanism of shunting inhibition. The... more
Increasing on-chip wire delay, along with the distributed nature of processing elements, makes instruction scheduling for tiled dataflow architectures crucial. Our analysis reveals that careful placement of frequently executed... more
The MULTIPLUS project aims at the development of a modular parallel architecture suitable for the study of several aspects of parallelism in both true shared memory and virtual shared memory environments. The MULTIPLUS architecture is able... more
In this paper, we describe the challenges of prototyping a reference application on System S, a distributed stream processing middleware under development at IBM Research. With a large number of stream PEs (Processing Elements)... more
We approach the construction of design methodologies for on-chip multiprocessor platforms, with the focus on the SegBus, a segmented bus platform. We study how applications can be mapped on such distributed architecture and show how to... more
We present chemical abundances obtained by fitting synthetic spectra to the FEROS data of 12 C-J stars, normal and silicate carbon ones. The Li and 13C abundances, as well as the s-process element abundances, indicate no evidence of a... more
Most parallel computations require the exchange of data between processing elements. One of the important basic communication operations is all-reduce, a variation of the reduction operation. This paper presents an all-reduce communication... more
Constraint propagation algorithms present inherent parallelism. Each constraint behaves as a concurrent process triggered by changes in the store of variables, updating the store in its turn. There is an inherent sequentiality, as well,... more
We report on the progress of an experimental research program to demonstrate the feasibility of multiplexing several holographic optical elements (HOEs) in a single film. Named the Shared Aperture Diffractive Optical Element (ShADOE), it... more
Due to the significant growth of link speeds, the amount of data that should be stored on router line cards is rapidly increasing. Therefore, a large number of memory modules are required for packet storage. In addition, a high performance... more
Coarse-grain reconfigurable architectures consist of a large number of processing elements (PEs) connected together in a network. For mapping applications to such coarse-grain architectures, we present an algorithm that takes into account... more