Fig. 1. The APEX model. The solid lines denote the weights w_i, c_i, which are trained at the mth stage. The dashed lines correspond to the weights of the already trained neurons. Note that the lateral weights asymptotically converge to zero, so they do not appear between the already trained units.

The weights in the network will span an orthonormal basis of the m-dimensional principal component subspace, while the lateral weights tend to zero as orthogonalization is achieved. Thus, we associate the term "anti-Hebbian connections" or "orthogonalization connections" with c_i, in contrast to the term "Hebbian connections" associated with the feed-forward weights w_i. The mathematical details of the anti-Hebbian rule (also called the Lateral Orthogonalization Rule) used to train c_i follow.

which is the same as the APEX equation (3). The only difference is that now we have a specific choice for the value of the step size β_k, which is optimal in the sense of the criterion J_m(w, N). Moreover, the optimal choice of the step-size parameter has a profound impact on the convergence speed of the algorithm, as will be discussed in Section IV. Clearly, β_k can also be calculated iteratively. J_m(w, N) can be minimized iteratively using the RLS algorithm [30], which yields the following updating equations:

Fig. 2. Comparison of the theoretical versus the actual decay rates via simulation.

Fig. 3. The parallel APEX network model.

Fig. 4. (a) Plot of the squared distance ||e_m - w_{m,k}||^2 between the actual components and the ones estimated using sequential APEX. The data are repeated in sweeps (1 sweep = 200 iterations). (b) Average y_{m,k}^2 over each sweep for each neuron m.

y_k is basically the projection of x_k onto the subspace spanned by the columns of W, which is orthogonal to L. In [21] it was shown that the optimal solution to the CPC problem is

Fig. 6. (a) The single-output model. The connections denote the weights w_i, c_i, which are trained. (b) The multiple-output model.
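As an illustration of the Hebbian/anti-Hebbian training just described, the sequential APEX recursions can be sketched as below. This is a minimal sketch, not the paper's exact algorithm: the step size beta is held constant (rather than the RLS-optimal choice discussed above), and the synthetic data, the covariance spectrum, and all variable names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean data with a well-separated (illustrative) spectrum.
d = 5
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # true eigenvectors
lam = np.array([5.0, 3.0, 1.0, 0.5, 0.2])          # true eigenvalues
C = Q @ np.diag(lam) @ Q.T
X = rng.multivariate_normal(np.zeros(d), C, size=4000)

def apex_train(X, n_components, beta=0.01, sweeps=3):
    """Sequential APEX: neurons are trained one at a time.

    Neuron m has Hebbian feed-forward weights w and anti-Hebbian
    lateral weights c fed by the already-trained outputs; c decays
    toward zero as the outputs decorrelate.
    """
    d = X.shape[1]
    W = np.zeros((n_components, d))
    init = np.random.default_rng(42)
    for m in range(n_components):
        w = init.standard_normal(d) * 0.1
        c = np.zeros(m)
        for _ in range(sweeps):
            for x in X:
                y_prev = W[:m] @ x              # outputs of trained neurons
                y = w @ x - c @ y_prev          # lateral inhibition
                w += beta * (y * x - y * y * w)        # Hebbian (Oja-type)
                c += beta * (y * y_prev - y * y * c)   # anti-Hebbian
        W[m] = w
    return W

W = apex_train(X, 2)
print(abs(W[0] @ Q[:, 0]), abs(W[1] @ Q[:, 1]))  # both close to 1
```

After training, the rows of W align (up to sign) with the leading eigenvectors, and the lateral weights have decayed toward zero, as stated above.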
A. APEX Solves CPC

The following facts can be easily shown regarding the eigenvalues/eigenvectors of R_x:

Fig. 7. APEX model performance for CPC. We use 200 64-dimensional data vectors repeated in sweeps. (a) The convergence of the error ||q_{m,k} - e_m||^2 for each neuron m. (b) The output variance: average {y_m^2} over each sweep for each neuron. The final variance is approximately equal to the corresponding eigenvalue λ_m.

TABLE I
Simulation results for harmonic retrieval; '*' following a number denotes that the frequency cannot be resolved.

B. Signal Detection: Drifting Phase Sinusoids

In the non-noisy case we have:

Let K be the total number of samples. Let

We will also assume that the signal is known only in a finite interval; therefore, only a finite-dimensional estimate of the correlation matrix R_x can be computed.

In this section we propose using principal component analysis for the detection problem discussed above. The idea stems from the fact that in the coherent case all eigenvalues of R_x except two are equal to zero, while the principal component subspace corresponding to the nonzero ones is spanned by the quadrature signals cos(ωk) and sin(ωk). The optimal detector then projects the observation signal onto this subspace, followed by a projection-energy threshold test. In the drifting-phase case the signal spectrum has a nonzero bandwidth proportional to ε, and thus the eigenvalues of R_x are in general all nonzero. However, if ε is not too large, the first two eigenvalues clearly dominate; in the extreme case ε = 0 (the coherent case), only the first two eigenvalues are nonzero, as discussed above. When ε is nonzero, however, the space spanned by cos and sin is only a rough approximation to the 2-D principal component subspace.
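The projection-energy threshold test described above can be sketched as follows. This is a minimal illustration assuming the coherent-case basis {cos(ωk), sin(ωk)}; in the drifting-phase case the first two estimated eigenvectors e_1, e_2 would replace this basis. The frequency, window length, amplitudes, threshold, and function names are all arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

K = 64                          # samples per observation window (illustrative)
omega = 2 * np.pi * 0.1
k = np.arange(K)

# 2-D signal subspace of the coherent case: span{cos(wk), sin(wk)}.
S = np.stack([np.cos(omega * k), np.sin(omega * k)], axis=1)
Qs, _ = np.linalg.qr(S)         # orthonormal basis of the subspace

def subspace_energy(x, Q):
    """Energy of the projection of x onto the subspace spanned by Q's columns."""
    p = Q.T @ x
    return float(p @ p)

def detect(x, Q, threshold):
    """Project onto the signal subspace, then threshold the projection energy."""
    return subspace_energy(x, Q) > threshold

noise = rng.standard_normal(K)                              # H0: noise only
signal = 2 * np.cos(omega * k + 0.7) + rng.standard_normal(K)  # H1

print(subspace_energy(noise, Qs), subspace_energy(signal, Qs))
```

Under H1 the projection captures nearly all of the sinusoid's energy (about A^2 K / 2), while under H0 only the noise energy falling in the 2-D subspace survives, so a simple threshold separates the two hypotheses.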
This motivates us to compare the performance of the mth-order noncoherent detector with the mth-order detector produced by substituting cos and sin with the first two eigenvectors e_1 and e_2. In our comparison experiments we used the same value of m as proposed in [46] for both A_m and A'_m. The eigenvectors are estimated using a two-neuron parallel APEX network trained on the incoming data. We window the observation signal using a window of size N/m in order to produce the vector data x_j = [x_j, ..., x_{j-N/m+1}], which are subsequently used.

The difficulty of calculating the likelihood ratio in (46) makes the optimal detector in (47) impractical to use. Even the approximation (48) is computationally very costly and has the additional disadvantage that it is difficult to implement in an analog circuit, which could otherwise alleviate the computational-cost problem. Various authors have investigated alternatives to the optimal solution. In [14] the approach of estimating θ_k via an extended Kalman filter is considered; once θ_k is approximated, a coherent detector is used for testing the two hypotheses. In [46] the authors compare the performance of the mth-order noncoherent detector (proposed in [13]), which uses the test statistic

with the standard noncoherent detector (i.e., the quadrature detector) and the optimal quadratic detector (48). The authors derived the optimal value of m for the mth-order detector and found that, for this value, it has performance comparable to the optimal quadratic detector while being considerably simpler to implement, and is significantly better than the standard noncoherent detector.

Fig. 8. Probability-of-miss comparison between (a) the mth-order noncoherent detector and (b) the eigenvalue-based detector.

Fig. 9. Deflection-ratio comparison between (a) the mth-order noncoherent detector and (b) the eigenvalue-based detector.
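The windowing step described above, which turns the scalar observation signal into vectors fed to the APEX network, can be sketched as below. The function name and the illustrative window size are assumptions for the example; w plays the role of N/m.

```python
import numpy as np

def window_vectors(x, w):
    """Slide a length-w window over signal x, producing vectors
    x_j = [x_j, x_{j-1}, ..., x_{j-w+1}] for j = w-1, ..., N-1."""
    N = len(x)
    return np.stack([x[j - w + 1:j + 1][::-1] for j in range(w - 1, N)])

x = np.arange(10.0)
Xw = window_vectors(x, 3)
print(Xw.shape)   # (8, 3)
print(Xw[0])      # [2. 1. 0.]
```

Each row is one input vector for the network; successive rows overlap by w - 1 samples, so the sample correlation of these vectors estimates the (finite-dimensional) correlation matrix discussed earlier.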
Fig. 10. The CPC technique may be applied to compute hybrid DCT/KL transform codes for image compression, using k DCT components and m constrained principal components. Shown here are (a) the original image; (b) the compressed/decoded image using only k = 8 DCT components (SNR = 11.6); (c) using only k = 16 DCT components (SNR = 12.8); (d) combined DCT/CPC with k = 4, m = 4 (SNR = 17.8); (e) the conventional PC approach, using 8 PC's (SNR = 18.8); and (f) combined DCT/CPC with k = 8, m = 8 (SNR = 22.5).

Fig. 11. The CPC analysis is applied to remove interference ("rain") from a picture. The constraint matrix V is set equal to the "rain" component. Shown here are (a) the picture with rain (i.e., α = 0); (b) using a "hard" factor α = 1; note that the "white rain" is converted to "dark rain"; (c) using a softer α = 0.5, the "white rain" becomes lighter but is still on the "white" side, hinting that a harder factor is needed; (d) using a "harder" factor α = 0.6, the rain almost disappears.
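The caption of Fig. 11 does not spell out how the factor α enters. One plausible reading, consistent with α = 0 leaving the rain untouched and α = 1 removing it "hard" (with over-subtraction turning white rain dark), is to attenuate by α the component of the image lying in the constraint space span(V). The formula and all names below are assumptions for illustration, not the paper's stated method.

```python
import numpy as np

def soft_deproject(X, V, alpha):
    """Attenuate, by factor alpha, the component of each column of X
    that lies in the constraint space span(V).
    alpha = 0 leaves X unchanged; alpha = 1 projects X fully onto the
    orthogonal complement of span(V)."""
    Q, _ = np.linalg.qr(V)            # orthonormal basis for span(V)
    return X - alpha * (Q @ (Q.T @ X))

rng = np.random.default_rng(3)
V = rng.standard_normal((16, 1))      # stand-in for the "rain" component
X = rng.standard_normal((16, 4))      # stand-in for image columns
X_clean = soft_deproject(X, V, 1.0)   # hard removal; 0 < alpha < 1 is softer
```

Intermediate values such as α = 0.5 or 0.6 then interpolate between keeping and fully removing the constrained component, matching the progression shown in the figure.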