2013
In this communication, entropy optimization principles, namely the maximum entropy principle and the minimum cross-entropy principle, are defined, and a critical review of parameter estimation methods based on entropy optimization is given in brief. The maximum entropy principle and its applications in deriving other known parameter estimation methods are discussed. The relation between maximum likelihood estimation and the maximum entropy principle is derived. The relation between the minimum divergence information principle and the classical minimum chi-square method is studied. A comparative study of Fisher's measure of information and the minimum divergence measure is made. The equivalence of classical parameter estimation methods and information-theoretic methods is examined. An application to parameter estimation when interval proportions are given is discussed with a numerical example. Key words: Parameter estimation, maximum entropy principle, minimum divergence...
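As a hedged illustration of the last application (our own sketch, not the paper's numerical example), the snippet below estimates the rate of an exponential model from interval proportions by minimizing the cross-entropy (Kullback-Leibler divergence) between observed bin proportions and model-implied bin probabilities; the bin edges and proportions are invented for illustration, and for multinomial data this criterion coincides with grouped-data maximum likelihood.

```python
# Minimal sketch (not the paper's example): estimate the rate of an exponential
# model from interval (grouped) proportions by minimizing the cross-entropy
# (KL divergence) between observed and model-implied bin probabilities.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import expon

edges = np.array([0.0, 1.0, 2.0, 4.0, np.inf])   # hypothetical bin edges
obs_p = np.array([0.55, 0.25, 0.15, 0.05])       # hypothetical interval proportions

def kl_to_model(rate):
    """KL(observed proportions || exponential(rate) bin probabilities)."""
    model_p = np.diff(expon.cdf(edges, scale=1.0 / rate))
    return np.sum(obs_p * np.log(obs_p / model_p))

res = minimize_scalar(kl_to_model, bounds=(1e-3, 10.0), method="bounded")
print("minimum cross-entropy (grouped MLE) estimate of the rate:", res.x)
```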
Highlights: • Two estimation methods (discretization and a kernel-based approach) are applied to FIM and SE. • FIM (SE) estimated by the discrete approach is nearly constant with σ. • FIM (SE) estimated by the discrete approach decreases (increases) with the bin number. • FIM (SE) estimated by the kernel-based approach is close to the theoretic value for any σ.
Abstract: The performance of two estimators of the Fisher Information Measure (FIM) and Shannon entropy (SE) is investigated: one based on the discretization of the FIM and SE formulae (discrete-based approach) and the other based on kernel-based estimation of the probability density function (pdf) (kernel-based approach). The two approaches are employed to estimate the FIM and SE of Gaussian processes (with different values of σ and size N), whose theoretic FIM and SE depend on the standard deviation σ. The FIM (SE) estimated with the discrete-based approach is approximately constant with σ, but decreases (increases) with the bin number L; in particular, the discrete-based approach furnishes a fairly accurate estimate of FIM (SE) for L ∝ σ. Furthermore, for small values of σ, the larger the size N of the series, the smaller the mean relative error, while for large values of σ, the larger the size N, the larger the mean relative error. The FIM (SE) estimated with the kernel-based approach is very close to the theoretic value for any σ, and its mean relative error decreases as the length of the series increases. Comparing the two approaches, the kernel-based estimates of FIM and SE are much closer to the theoretic values for any σ and any N and are to be preferred to the discrete-based estimates.
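The sketch below is our own illustration of the two approaches for a single Gaussian sample (not the authors' code); the values of sigma, N and L, the grid, and the particular discretized formulae are assumptions of this sketch. The theoretic references are SE = 0.5*ln(2*pi*e*sigma^2) and FIM = 1/sigma^2.

```python
# Sketch (ours, not the authors' code): discrete vs. kernel-based estimates of the
# Shannon entropy (SE) and Fisher Information Measure (FIM) of a Gaussian sample.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sigma, N, L = 2.0, 10_000, 64                    # illustrative values
x = rng.normal(0.0, sigma, size=N)

# Discrete-based approach: one common discretized form using bin probabilities.
p, _ = np.histogram(x, bins=L)
p = p / p.sum()
se_disc = -np.sum(p[p > 0] * np.log(p[p > 0]))
q, dq = p[:-1], np.diff(p)
fim_disc = np.sum(dq[q > 0] ** 2 / q[q > 0])

# Kernel-based approach: Gaussian KDE on a grid, then numerical integration.
kde = gaussian_kde(x)
grid = np.linspace(x.min() - sigma, x.max() + sigma, 4001)
dx = grid[1] - grid[0]
f = kde(grid)
se_kern = -np.sum(f * np.log(f + 1e-300)) * dx
fim_kern = np.sum(np.gradient(f, dx) ** 2 / (f + 1e-300)) * dx

print("theory  :", 0.5 * np.log(2 * np.pi * np.e * sigma**2), 1.0 / sigma**2)
print("kernel  :", se_kern, fim_kern)
print("discrete:", se_disc, fim_disc)
```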
Physica A: Statistical Mechanics and its Applications, 2017
Highlights 1. Two estimation methods of the Fisher Information Measure (FIM) and Shannon entropy (SE) are analysed. 2. One is based on discretizing the FIM and SE formulae; the other on kernel-based estimation of the probability density function. 3. FIM (SE) estimated by using the discrete-based approach is approximately constant with σ, but decreases (increases) with the bin number L. 4. FIM (SE) estimated by using the kernel-based approach is very close to the theoretic value for any σ.
Entropy, 2014
The minimum error entropy (MEE) criterion has been successfully used in fields such as parameter estimation, system identification, and supervised machine learning. In general there is no explicit expression for the optimal MEE estimate unless some constraints on the conditional distribution are imposed. A recent paper has proved that if the conditional density is conditionally symmetric and unimodal (CSUM), then the optimal MEE estimate (with Shannon entropy) equals the conditional median. In this study, we extend this result to generalized MEE estimation, where the optimality criterion is the Rényi entropy or, equivalently, the α-order information potential (IP).
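For α = 2 the information potential has a simple Parzen estimator, V_2 = (1/N^2) Σ_ij G_{√2 h}(e_i − e_j), with H_2 = −log V_2. The sketch below (ours, with an arbitrary bandwidth and a toy regression setup, not taken from the paper) compares the empirical quadratic Rényi entropy of the errors of a well-specified and a mis-specified predictor; the more concentrated error distribution yields the smaller H_2.

```python
# Sketch (assumed setup, arbitrary bandwidth): quadratic (alpha = 2) information
# potential V_2 and Renyi entropy H_2 = -log V_2 of an error sample via a Parzen
# estimator, V_2 = (1/N^2) * sum_ij G_{sqrt(2)h}(e_i - e_j).
import numpy as np

def quadratic_ip(e, bandwidth=0.5):
    e = np.asarray(e, dtype=float)
    d = e[:, None] - e[None, :]
    s = np.sqrt(2.0) * bandwidth                  # convolution of two kernels of width h
    return np.mean(np.exp(-d**2 / (2.0 * s**2)) / (np.sqrt(2.0 * np.pi) * s))

def renyi2_entropy(e, bandwidth=0.5):
    return -np.log(quadratic_ip(e, bandwidth))

# Toy comparison: the error of the well-specified predictor is more concentrated,
# so it has a larger information potential and a smaller quadratic Renyi entropy.
rng = np.random.default_rng(1)
x = rng.normal(size=2_000)
y = x + rng.laplace(scale=0.3, size=2_000)
print(renyi2_entropy(y - x), renyi2_entropy(y - 0.5 * x))
```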
arXiv (Cornell University), 2024
In this research work, a total of 45 different estimators of the Shannon differential entropy were reviewed. The estimators fall mainly into three classes, namely: window-size spacings (m), kernel density estimation (KDE), and k-nearest-neighbour (kNN) estimation. A total of 16, 5, and 6 estimators were selected from these classes, respectively, for comparison. The performances of the 27 selected estimators, in terms of their bias values and root mean squared errors (RMSEs) as well as their asymptotic behaviours, were compared through extensive Monte Carlo simulations. The empirical comparisons were carried out at sample sizes of n = 10, 50, and 100 and variable dimensions of d = 1, 2, 3, and 5, for three groups of continuous distributions classified according to their symmetry and support. The results showed that the spacings-based estimators generally performed better than the estimators from the other two classes at d = 1, but suffered from non-existence at d ≥ 2. The kNN-based estimators were generally inferior to the estimators from the other two classes considered, but showed the advantage of existence for all d ≥ 1. Also, a new class of optimal window sizes was obtained, and sets of estimators were recommended for different groups of distributions at different variable dimensions. Finally, the asymptotic biases, variances and distributions of the 'best estimators' were considered.
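As one concrete representative of the spacings class (our own sketch, using a common heuristic window size rather than the optimal window size derived in the paper), Vasicek's m-spacings estimator for d = 1 can be written as follows.

```python
# Sketch (ours): Vasicek's m-spacings estimator of the differential entropy (d = 1),
# a representative of the window-size-spacings class; the heuristic m = sqrt(n) is
# an assumption here, not the optimal window size derived in the paper.
import numpy as np

def vasicek_entropy(x, m=None):
    """(1/n) * sum_i log( n/(2m) * (X_(i+m) - X_(i-m)) ), with the order statistics
    clamped at the sample minimum and maximum near the boundaries."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if m is None:
        m = max(1, int(round(np.sqrt(n))))
    idx = np.arange(n)
    upper = x[np.minimum(idx + m, n - 1)]
    lower = x[np.maximum(idx - m, 0)]
    return np.mean(np.log(n / (2.0 * m) * (upper - lower)))

rng = np.random.default_rng(2)
sample = rng.normal(size=100)
print(vasicek_entropy(sample), 0.5 * np.log(2 * np.pi * np.e))  # estimate vs. theory
```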
Information Sciences, 1994
The principle of minimum error entropy estimation as found in the work of Weidemann and Stear is reformulated as a problem of finding optimum locations of probability densities in a given mixture such that the resulting (differential) entropy is minimized. New results concerning the entropy lower bound are derived. Continuity of the entropy and attainment of the minimum entropy are proved in the case where the mixture is finite. Some other examples and situations, in particular that of symmetric unimodal densities, are studied in more detail.
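The following sketch (our own numerical illustration, not from the paper) evaluates the differential entropy of an equal-weight mixture of two unit-variance Gaussian densities as a function of the offset between their locations; the entropy is smallest when the locations coincide, in line with the reformulation above.

```python
# Numerical illustration (ours): differential entropy of an equal-weight mixture of
# two unit-variance Gaussians as a function of the offset between their locations.
import numpy as np

def mixture_entropy(offset, half_width=12.0, n_grid=4001):
    x = np.linspace(-half_width, half_width + offset, n_grid)
    phi = lambda t: np.exp(-0.5 * t**2) / np.sqrt(2.0 * np.pi)
    f = 0.5 * phi(x) + 0.5 * phi(x - offset)
    dx = x[1] - x[0]
    return -np.sum(f * np.log(f + 1e-300)) * dx

for d in [0.0, 1.0, 2.0, 4.0]:
    print(d, mixture_entropy(d))   # the minimum is attained when the locations coincide
```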
Bulletin of the Malaysian Mathematical Sciences Society
The evaluation of the information entropy content of data plays an effective role in the assessment of fatigue damage. Due to the connection between the generalized half-normal distribution and fatigue extension, objective inference for the differential entropy of the generalized half-normal distribution is considered in this paper. Bayesian estimates and associated credible intervals are discussed based on different non-informative priors, including the Jeffreys, reference, probability matching, and maximal data information priors, for the differential entropy measure. Metropolis-Hastings samples are used to approximate the posterior densities and then to compute the Bayesian estimates. For comparison purposes, the maximum likelihood estimators and asymptotic confidence intervals of the differential entropy are derived. An intensive simulation study is conducted to evaluate the performance of the proposed statistical inference methods. Two real data sets are analyzed by the proposed methodology for illustrative purposes as well. Finally, non-informative priors for the original parameters of the generalized half-normal distribution, obtained both directly and through a transformation of the entropy measure, are also proposed and compared.
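As a small hedged companion to the frequentist side of this abstract, the sketch below fits the generalized half-normal (GHN) distribution by maximum likelihood and computes a plug-in estimate of its differential entropy; the Cooray-Ananda form of the density, the simulated data, and the optimizer settings are assumptions of this sketch, not the paper's derivations.

```python
# Hedged sketch: maximum likelihood fit of the generalized half-normal (GHN)
# distribution and a plug-in estimate of its differential entropy. The density
# f(x) = sqrt(2/pi) * (a/x) * (x/t)**a * exp(-0.5 * (x/t)**(2a)), x > 0 (shape a,
# scale t), follows the Cooray-Ananda parameterization, assumed here.
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def ghn_pdf(x, a, t):
    return np.sqrt(2.0 / np.pi) * (a / x) * (x / t) ** a * np.exp(-0.5 * (x / t) ** (2 * a))

def neg_loglik(log_params, data):
    a, t = np.exp(log_params)                       # keep a, t positive
    return -np.sum(np.log(ghn_pdf(data, a, t)))

# Simulated GHN(a = 1.2, t = 1.5) data via X = t * |Z|**(1/a), Z standard normal.
rng = np.random.default_rng(3)
data = 1.5 * np.abs(rng.normal(size=200)) ** (1.0 / 1.2)

fit = minimize(neg_loglik, x0=np.log([1.0, 1.0]), args=(data,), method="Nelder-Mead")
a_hat, t_hat = np.exp(fit.x)

def neg_flogf(x):
    f = ghn_pdf(x, a_hat, t_hat)
    return -f * np.log(f) if f > 0 else 0.0

h_hat, _ = quad(neg_flogf, 0.0, np.inf)             # plug-in MLE of the entropy
print(a_hat, t_hat, h_hat)
```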
Journal of the Iranian Statistical Society (JIRSS), 2010
In this paper we propose an estimator of the entropy of a continuous random variable. The estimator is obtained by modifying the estimator proposed by . Consistency of the estimator is proved, and comparisons are made with Vasicek's estimator (1976), van Es's estimator (1992), Ebrahimi et al.'s estimator (1994) and Correa's estimator (1995). The results indicate that the proposed estimator has smaller mean squared error than the above estimators.
Information-theoretic learning often requires the use of the probability density function (pdf), entropy, or mutual information. In this appendix, we provide a brief overview of some efficient methods for estimating the pdf as well as the entropy function. The pdf and entropy estimators discussed here are practically useful because of their simplicity and their basis in sample statistics. For simplicity of discussion, we restrict our attention to continuous, real-valued univariate random variables, for which estimators of the pdf and its associated entropy are sought. Definition D.1 A real-valued Lebesgue-integrable function p(x) (x ∈ R) is called a pdf if it satisfies $F(x) = \int_{-\infty}^{x} p(t)\,dt$, where F(x) is a cumulative probability distribution function. A pdf is everywhere nonnegative and its integral from −∞ to +∞ is equal to 1; namely, $p(x) \ge 0$ and $\int_{-\infty}^{\infty} p(x)\,dx = 1$. Definition D.2 Given the pdf of a continuous random variable x, its differential Shannon entropy is defined as $H(x) = E[-\log p(x)] = -\int_{-\infty}^{\infty} p(x)\log p(x)\,dx$.
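To make the appendix concrete, here is a minimal sketch (ours, with Silverman's rule-of-thumb bandwidth as an assumed choice) of a Parzen-window pdf estimator and the corresponding resubstitution entropy estimate for a univariate sample.

```python
# Minimal sketch (ours): Parzen-window (Gaussian kernel) pdf estimator with
# Silverman's rule-of-thumb bandwidth, and the resubstitution entropy estimate
# H_hat = -(1/N) * sum_i log p_hat(x_i) for a univariate sample.
import numpy as np

def parzen_pdf(points, sample, bandwidth=None):
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    if bandwidth is None:
        bandwidth = 1.06 * sample.std(ddof=1) * n ** (-0.2)   # Silverman's rule
    z = (np.asarray(points, dtype=float)[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * bandwidth * np.sqrt(2.0 * np.pi))

def resubstitution_entropy(sample):
    # Evaluating the estimate at the sample points themselves gives a simple,
    # slightly biased plug-in estimator; leave-one-out variants reduce the bias.
    return -np.mean(np.log(parzen_pdf(sample, sample)))

rng = np.random.default_rng(4)
x = rng.normal(0.0, 2.0, size=1_000)
print(resubstitution_entropy(x), 0.5 * np.log(2 * np.pi * np.e * 4.0))  # vs. theory
```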
Journal of Statistical Planning and Inference, 1986
The problem considered is simultaneous estimation of scale parameters and their reciprocals from p independent gamma distributions under a scale-invariant loss function first introduced in James and Stein (1961). Under mild restrictions on the shape parameters, the best scale-invariant estimators are shown to be admissible for p = 2. For p ≥ 3, a general technique is developed for improving upon the best scale-invariant estimators. Improvement on the generalized Bayes estimators of a vector involving certain powers of the scale parameter is also obtained.
Entropy, 2012
The minimum error entropy (MEE) criterion has been receiving increasing attention due to its promising perspectives for applications in signal processing and machine learning. In the context of Bayesian estimation, the MEE criterion is concerned with the estimation of a certain random variable based on another random variable, so that the error's entropy is minimized. Several theoretical results on this topic have been reported. In this work, we present some further results on the MEE estimation. The contributions are twofold: (1) we extend a recent result on the minimum entropy of a mixture of unimodal and symmetric distributions to a more general case, and prove that if the conditional distributions are generalized uniformly dominated (GUD), the dominant alignment will be the MEE estimator; (2) we show by examples that the MEE estimator (not limited to singular cases) may be non-unique even if the error distribution is restricted to zero-mean (unbiased).
Cryptography, Information Theory, and Error‐Correction, 2021
In this note, we look at several definitions of entropy and some of their consequences in information theory. We also obtain the entropy and relative entropy for the general error distribution, which is used to model errors that are not normal. The differential entropy used is $H(Y) = H(f) = -\int f(y)\log f(y)\,dy$.
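A quick hedged check (our own sketch, not the note's derivation): the differential entropy of the general error (generalized normal) distribution can be evaluated numerically and compared with the closed form implemented in scipy.stats.gennorm; the shape values below are illustrative, with beta = 2 recovering the normal case.

```python
# Hedged numerical check: differential entropy of the general error (generalized
# normal) distribution, H(f) = -int f(y) log f(y) dy, evaluated by quadrature and
# compared with scipy.stats.gennorm's closed form.
import numpy as np
from scipy.integrate import quad
from scipy.stats import gennorm

def neg_flogf(y, dist):
    f = dist.pdf(y)
    return -f * np.log(f) if f > 0 else 0.0

for beta in [1.0, 1.5, 2.0, 4.0]:
    dist = gennorm(beta)
    h_num, _ = quad(neg_flogf, -40.0, 40.0, args=(dist,))   # range wide enough here
    print(beta, h_num, dist.entropy())
```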
Statistical Science, 2010
The authors dedicate this article to Professor Dennis V. Lindley in appreciation of his insightful pioneering work on the relationships between Shannon's information theory and Bayesian inference.
Measurement, 2009
Measuring systems, such as those used in coordinate measuring machines (CMMs), laser interferometers, linear or rotary encoders, etc., produce a huge amount of information indicating the position of the object under control. This information is subject to verification or metrological calibration at certain periods in service. On the other hand, there are no means for verifying every digit of the output information, and the great quantity of information, consisting of millions of values, is left with its errors undetermined. The paper proposes expressing the result of measurement (including calibration) of a measuring system supplemented by an information entropy parameter. The uncertainty of the measurement result in the plane and in the volume is presented here with the information entropy parameter, which shows the portion of the data assessed.
Many information measures are suggested in the literature. Among these measures are the Shannon H(θ) and the Awad A(θ) entropies. In this work we suggest a new entropy measure, B(θ), which is based on the maximum likelihood function. These three entropies were calculated for the gamma distribution and its normal approximation, the binomial and its Poisson approximation, and the Poisson and its normal approximation. The relative losses in these three entropies are used as a criterion for the appropriateness of the approximation.
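As a rough hedged illustration of the relative-loss idea (using ordinary Shannon entropy only, not the paper's A(θ) or B(θ) measures), the sketch below compares the entropy of a binomial distribution with that of its Poisson approximation for a few illustrative (n, p) pairs.

```python
# Rough hedged illustration using plain Shannon entropy (not the paper's A or B
# measures): relative change in entropy when a binomial(n, p) law is replaced by
# its Poisson(n*p) approximation, for a few illustrative (n, p) pairs.
import numpy as np
from scipy.stats import binom, poisson

def shannon_entropy(pmf):
    pmf = pmf[pmf > 0]
    return -np.sum(pmf * np.log(pmf))

for n, p in [(20, 0.5), (50, 0.1), (500, 0.01)]:
    k = np.arange(0, n + 1)                       # truncated support, adequate here
    h_binom = shannon_entropy(binom.pmf(k, n, p))
    h_pois = shannon_entropy(poisson.pmf(k, n * p))
    print(n, p, h_binom, h_pois, (h_pois - h_binom) / h_binom)
```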
Entropy
This study attempts to extend the prevailing definition of informational entropy, where entropy relates to the amount of reduction of uncertainty or, indirectly, to the amount of information gained through measurements of a random variable. The approach adopted herein describes informational entropy not as an absolute measure of information, but as a measure of the variation of information. This makes it possible to obtain a single value for informational entropy, instead of several values that vary with the selection of the discretizing interval, when discrete probabilities of hydrological events are estimated through relative class frequencies and discretizing intervals. Furthermore, the present work introduces confidence limits for the informational entropy function, which facilitates a comparison between the uncertainties of various hydrological processes with different scales of magnitude and different probability structures. The work addresses hydrologists and environmental engineers more than it does mathematicians and statisticians. In particular, it is intended to help solve information-related problems in hydrological monitoring design and assessment. This paper first considers the selection of probability distributions of best fit to hydrological data, using generated synthetic time series. Next, it attempts to assess hydrometric monitoring duration in a network, this time using observed runoff data series. In both applications, it focuses, basically, on the theoretical background for the extended definition of informational entropy. The methodology is shown to give valid results in each case.
Entropy, 2015
This review article first surveys the main inference tools using Bayes rule, the maximum entropy principle (MEP), information theory, relative entropy and the Kullback-Leibler (KL) divergence, and Fisher information and its corresponding geometries. For each of these tools, the precise context of their use is described. The second part of the paper focuses on the ways these tools have been used in data, signal and image processing and in the inverse problems which arise in different physical sciences and engineering applications. A few examples of the applications are described: entropy in independent components analysis (ICA) and in blind source separation, Fisher information in data model selection, different maximum entropy-based methods in time series spectral estimation and in linear inverse problems and, finally, Bayesian inference for general inverse problems. Some original materials concerning approximate Bayesian computation (ABC) and, in particular, the variational Bayesian approximation (VBA) methods are also presented. VBA is used to propose an alternative Bayesian computational tool to the classical Markov chain Monte Carlo (MCMC) methods. We will also see that VBA encompasses joint maximum a posteriori (MAP) estimation, as well as the different expectation-maximization (EM) algorithms, as particular cases.
Econometric Reviews, 2008
Kullback-Leibler information is widely used for developing indices of distributional fit. The most celebrated of such indices is Akaike's AIC, which is derived as an estimate of the minimum Kullback-Leibler information between the unknown data-generating distribution and a parametric model. In the derivation of AIC, the entropy of the data-generating distribution is bypassed because it is free from the parameters. Consequently, the AIC type measures provide criteria for model comparison purposes only, and do not provide information diagnostic about the model fit. A nonparametric estimate of entropy of the data-generating distribution is needed for assessing the model fit. Several entropy estimates are available and have been used for frequentist inference about information fit indices. A few entropy-based fit indices have been suggested for Bayesian inference. This paper develops a class of entropy estimates and provides a procedure for Bayesian inference on the entropy and a fit index. For the continuous case, we define a quantized entropy that approximates and converges to the entropy integral. The quantized entropy includes some well known measures of sample entropy and the existing Bayes entropy estimates as its special cases. For inference about the fit, we use the candidate model as the expected distribution in the Dirichlet process prior and derive the posterior mean of the quantized entropy as the Bayes estimate. The maximum entropy characterization of the candidate model is then used to derive the prior and posterior distributions for the Kullback-Leibler information index of fit. The consistency of the proposed Bayes estimates for the entropy and for the information index are shown. As by-products, the procedure also produces priors and posteriors for the model parameters and the moments.
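To make the Dirichlet construction tangible, here is a hedged sketch of our own simplified version (not the paper's full procedure): the bin probabilities receive a Dirichlet prior whose base measure is a candidate model, and the posterior of the quantized entropy is approximated by Monte Carlo; the bins, concentration, and N(0, 1) candidate are illustrative assumptions.

```python
# Hedged sketch (our own simplified construction, not the paper's full procedure):
# Monte Carlo posterior of a quantized entropy when the bin probabilities get a
# Dirichlet prior whose base measure is a candidate model.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
data = rng.normal(0.3, 1.2, size=300)

edges = np.linspace(-4.0, 4.0, 21)                # 20 quantization bins
counts, _ = np.histogram(data, bins=edges)

p0 = np.diff(norm.cdf(edges))                     # candidate model N(0, 1) over the bins
p0 = p0 / p0.sum()
concentration = 5.0                               # illustrative prior strength

# Dirichlet-multinomial conjugacy: posterior is Dirichlet(concentration * p0 + counts).
draws = rng.dirichlet(concentration * p0 + counts, size=5_000)
h_draws = -np.sum(np.where(draws > 0, draws * np.log(draws), 0.0), axis=1)
print(h_draws.mean(), np.quantile(h_draws, [0.025, 0.975]))   # Bayes estimate and 95% interval
```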
Methodology and Computing in Applied Probability, 2007
In this paper we discuss four information theoretic ideas and present their implications to statistical inference: (1) Fisher information and divergence generating functions, (2) information optimum unbiased estimators, (3) information content of various statistics, (4) characterizations based on Fisher information.