2001, SIAM Journal on Scientific Computing
33 pages
In this article, we discuss a parallel implementation of efficient algorithms for computation of Legendre polynomial transforms and other orthogonal polynomial transforms. We develop an approach to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the accuracy, efficiency, and scalability of our implementation. The algorithms were implemented in ANSI C using the BSPlib communications library. We also present a new algorithm for computing the cosine transform of two vectors at the same time.
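The abstract does not say how the two-vector cosine transform works, so the following is only a minimal sketch of one standard observation, not the authors' algorithm: because the DCT kernel is real and the transform is linear, two real vectors can be packed into one complex vector and transformed in a single pass. The function names `dct2` and `dct2_pair` are hypothetical, and the transform here is a naive O(N^2) DCT-II rather than a fast one.

```python
import math

def dct2(v):
    """Naive O(N^2) unnormalized DCT-II; entries may be real or complex."""
    n = len(v)
    return [sum(v[j] * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n))
            for k in range(n)]

def dct2_pair(x, y):
    """DCT-II of two real vectors of equal length from one complex transform."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Pack x into the real part and y into the imaginary part.
    z = [complex(a, b) for a, b in zip(x, y)]
    t = dct2(z)
    # The kernel is real, so the parts separate exactly.
    return [c.real for c in t], [c.imag for c in t]
```

With a fast complex-capable transform in place of `dct2`, the same packing would amortize one transform over both inputs.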
An efficient procedure for the computation of the coefficients of Legendre expansions is here presented. We prove that the Legendre coefficients associated with a function f(x) can be represented as the Fourier coefficients of an Abel-type transform of f(x). The computation of N Legendre coefficients can then be performed in O(N log N) operations with a single Fast Fourier Transform of the Abel-type transform of f(x).
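For orientation, here is a minimal sketch of the direct baseline that such a fast procedure would replace. It is not the Abel-transform method of the abstract: it evaluates the definition a_n = (2n+1)/2 ∫ f(x) P_n(x) dx on [-1, 1] with a midpoint rule, at a cost proportional to N times the number of samples. The function names and the `samples` parameter are assumptions made for illustration.

```python
def legendre_values(n_max, x):
    """Return [P_0(x), ..., P_{n_max}(x)] via the three-term recurrence."""
    p = [1.0, x]
    for n in range(1, n_max):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p.append(((2 * n + 1) * x * p[n] - n * p[n - 1]) / (n + 1))
    return p[: n_max + 1]

def legendre_coefficients(f, n_max, samples=2000):
    """Approximate a_0..a_{n_max} of f on [-1, 1] with the midpoint rule."""
    h = 2.0 / samples
    acc = [0.0] * (n_max + 1)
    for i in range(samples):
        x = -1.0 + (i + 0.5) * h
        fx = f(x)
        for n, pn in enumerate(legendre_values(n_max, x)):
            acc[n] += fx * pn * h
    # Scale each integral by (2n + 1) / 2 to obtain the coefficient.
    return [(2 * n + 1) / 2 * s for n, s in enumerate(acc)]
```

For example, f(x) = x^2 equals (1/3) P_0(x) + (2/3) P_2(x), so the sketch should return coefficients close to 1/3, 0, 2/3, 0 for n = 0..3.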
This letter presents a new recursive method for computing discrete polynomial transforms. The method is shown for forward and inverse transforms of the Hermite, binomial, and Laguerre transforms. The recursive flow diagrams require only 2N additions, 2(N + 1) memory units, and N + 1 multipliers for the (N + 1)-point Hermite and binomial transforms. The recursive flow diagram for the (N + 1)-point Laguerre transform requires 2N additions, 2(N + 1) memory units, and 2(N + 1) multipliers. The transform computation time for all of these transforms is O(N).
SIAM Journal on Computing, 1983
It is shown that any multivariate polynomial of degree d that can be computed sequentially in C steps can be computed in parallel in O((log d)(log C + log d)) steps using only (Cd)^{O(1)} processors.
Journal of Computer and System Sciences, 1974
It is shown that if division and multiplication in a Euclidean domain can be performed in O(N log^a N) steps, then the residues of an N precision element in the domain can be computed in O(N log^{a+1} N) steps. A special case of this result is that the residues of an N precision integer can be computed in O(N log^2 N log log N) total operations. Using a polynomial division algorithm due to Strassen [24], it is shown that a polynomial of degree N-1 can be evaluated at N points in O(N log^2 N) total operations or O(N log N) multiplications. Using the methods of Horowitz [10] and Heindel [9], it is shown that if division and multiplication in a Euclidean domain can be performed in O(N log^a N) steps, then the Chinese Remainder Algorithm (CRA) can be performed in O(N log^{a+1} N) steps. Special cases are: (a) the integer CRA can be performed in O(N log^2 N log log N) total operations, and (b) a polynomial of degree N-1 can be interpolated in O(N log^2 N) total operations or O(N log N) multiplications. Using these results, it is shown that a polynomial of degree N and all its derivatives can be evaluated at a point in O(N log^2 N) total operations.
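To make the two operations concrete, here is a minimal sketch of integer residue computation and Chinese remaindering. It uses the simple incremental scheme, whose cost is quadratic in the number of moduli, and not the fast divide-and-conquer algorithms the abstract analyzes. The function names are hypothetical, and `pow(m, -1, q)` assumes Python 3.8 or later.

```python
from math import gcd

def residues(x, moduli):
    """Residues of the integer x modulo each modulus."""
    return [x % q for q in moduli]

def crt(rs, moduli):
    """Recover x modulo the product of pairwise coprime moduli."""
    x, m = 0, 1
    for r, q in zip(rs, moduli):
        if gcd(m, q) != 1:
            raise ValueError("moduli must be pairwise coprime")
        # Choose t so that x + m * t is congruent to r modulo q.
        t = ((r - x) * pow(m, -1, q)) % q
        x += m * t
        m *= q
    return x, m
```

For example, `crt([2, 3, 2], [3, 5, 7])` returns `(23, 105)`, and `residues(23, [3, 5, 7])` gives back `[2, 3, 2]`.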
Applied and Computational Harmonic Analysis, 2010
We introduce a new class of fast algorithms for the application to arbitrary vectors of certain special function transforms. The scheme is applicable to a number of transforms, including the Fourier-Bessel transform, the non-equispaced Fourier transform, transforms associated with all classical orthogonal polynomials, etc.; it requires order O(n log(n)) operations to apply an n × n matrix to an arbitrary vector. The performance of the algorithm is illustrated by several numerical examples.
Acta Geodaetica et Geophysica, 2018
Today high-speed computers have simplified many computational problems, but fast techniques and algorithms are still relevant. In this study, the Hermitian polynomial approximation is used for fast evaluation of the associated Legendre functions (ALFs), which have many applications in geodesy and geophysics. This method approximates the ALFs instead of computing them by recursive formulae and generates them several times faster. The ALFs approximated by the Newtonian polynomials are compared with the Hermitian ones and their differences are discussed. Here, this approach is applied to computing a global geoid model point-wise from EGM08 to degree and order 2160 and to propagating the orbit of a low Earth orbiting satellite. Our numerical results show that the CPU time decreases by a factor of at least two for orbit propagation, and by a factor of five for geoid computation, compared to the case where recursive formulae are used to generate the ALFs. The approximation error in the orbit computation is at a sub-millimeter level over two weeks, and that in the computed geoid is 0.01 mm, with a maximum of 1 mm.
2009
In recent years there has been a renewed interest in finding fast algorithms to compute accurately the linear canonical transform (LCT) of a given function. This is driven by the large number of applications of the LCT in optics and signal processing. The well-known Fourier, fractional Fourier, bilateral Laplace, and Fresnel integral transforms are special cases of the LCT. In this paper we obtain an O(N log N) algorithm to compute the LCT by using a chirp-FFT-chirp transformation yielded by a convergent quadrature formula for the fractional Fourier transform. This formula gives a unitary discrete LCT in closed form. In the case of the fractional Fourier transform the algorithm computes this transform for arbitrary complex values inside the unit circle and not only at the boundary. In the case of the ordinary Fourier transform the algorithm improves the output of the FFT.
Multidimensional Systems and Signal Processing, 1990
This paper presents vector and parallel algorithms and implementations of one- and two-dimensional orthogonal transforms. The speed performances are evaluated on a Cray X-MP/48 vector computer. The sinusoidal orthogonal transforms are computed using a fast real Fourier transform (FFT) kernel. The non-sinusoidal orthogonal transform algorithms are derived by using direct factorizations of the transform matrices. Concurrent processing is achieved by using the multitasking capability of the Cray X-MP/48 to transform long data vectors and two-dimensional data vectors. The discrete orthogonal transforms discussed in this paper include: Fourier transform (DFT), cosine transform (DCT), sine transform (DST), Hartley transform (DHT), Walsh transform (DWHT) and Hadamard transform (DHDT). The factors affecting the speedup of vector and parallel processing of these transforms are considered. The vectorization techniques are illustrated by an FFT example.
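As an illustration of a non-sinusoidal transform obtained from a matrix factorization, here is a minimal sketch of the textbook in-place butterfly for the Hadamard transform. Each pass of the outer loop applies one sparse factor of the Hadamard matrix. This is a sketch under the assumption of natural (Hadamard) ordering and no normalization, not the vectorized Cray implementation described in the abstract; the name `fwht` is hypothetical.

```python
def fwht(v):
    """Unnormalized Walsh-Hadamard transform in natural (Hadamard) order."""
    a = list(v)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # One pass corresponds to one sparse factor of the Hadamard matrix.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return a
```

The transform uses n log2(n) additions and subtractions, and applying it twice returns the input scaled by n, which gives a simple self-check.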
Journal of Real-time Image Processing, 2019
Legendre moments and their invariants for 2D and 3D image/objects are widely used in image processing, computer vision, and pattern recognition applications. Reconstruction of digital images by nature requires higher-order moments to obtain high-quality reconstructed images. Different applications, such as classification of bacterial contamination images, utilize high-order moments in the feature extraction phase. For large images and 3D objects, Legendre moments computation is very time-consuming and compute-intensive. This problem limits the use of Legendre moments and makes them impractical for real-time applications. Multi-core CPUs and GPUs are powerful parallel processing architectures. In this paper, new parallel algorithms are proposed to speed up the process of exact Legendre moments computation for 2D and 3D image/objects. These algorithms utilize multi-core CPU and GPU parallel architectures where each pixel/voxel of the input digital image/object can be handled independently. A detailed profile analysis is presented where the weight of each part of the entire computational process is evaluated. In addition, we contribute to the parallel 2D/3D Legendre moments with: (1) a modification of the traditional exact Legendre moment algorithm to better fit the parallel architectures, (2) the first parallel CPU implementation of Legendre moments, and (3) the first parallel CPU and GPU acceleration of the reconstruction phase of the Legendre moments. A set of numerical experiments with different gray-level images is performed. The obtained results show a parallel gain very close to optimal. The extreme reduction in execution times, especially for 8-core CPUs and GPUs, makes the parallel exact 2D/3D Legendre moments suitable for real-time applications.
2013
A fast algorithm is developed to compute orthogonal polynomial expansions on sparse grids for a function of d variables in a weighted L^2 space. The proposed algorithm combines the fast cosine transform, a fast transform from the Chebyshev orthogonal polynomial basis to the orthogonal polynomial basis for the weighted L^2 space and a fast algorithm for computing hierarchically structured basis functions. The overall computational complexity of the algorithm is O(n log^{d+1} n) where n is the highest polynomial degree in one dimension. Exponential convergence under an analyticity assumption is proved. Numerical experiments confirm the theoretical results and demonstrate the efficiency of the proposed algorithm.
Journal of Computational and Applied Mathematics, 1981
arXiv: Numerical Analysis, 2017
Journal of the Optical Society of America A, 2005
Parallel computing, 1997
Numerical Algorithms, 1991
Parallel Computing, 1997
Lecture Notes in Computer Science, 2002
Electronics Letters, 2006
IEEE Signal Processing Magazine, 2002
Journal of Geodetic Science, 2018
IEEE Transactions on Computers, 1999