Cycles, also known as self-avoiding polygons, elementary circuits or simple cycles,
are closed walks which are not allowed to visit any vertex more than once.
We present an exact formula for enumerating such cycles of any length on any
directed graph involving a sum over its induced subgraphs. This result stems
from a Hopf algebra, which we construct explicitly, and which provides further
means of counting cycles. Finally, we obtain a more general theorem asserting
that any Lie idempotent can be used to enumerate cycles.
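To make the counting problem concrete, here is a minimal brute-force sketch in Python. It is not the induced-subgraph formula or the Hopf-algebra construction described in the abstract; it only enumerates simple cycles by exhaustive depth-first search, which is feasible for small graphs. The function name `count_simple_cycles` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict

def count_simple_cycles(edges):
    """Count simple cycles of each length in a directed graph (brute force).

    `edges` is an iterable of (u, v) pairs with comparable vertex labels.
    Each cycle is counted once, rooted at its smallest vertex.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
    counts = defaultdict(int)

    def extend(start, current, visited):
        for nxt in adj[current]:
            if nxt == start:
                # Closing the walk: its length equals the number of visited vertices.
                counts[len(visited)] += 1
            elif nxt > start and nxt not in visited:
                # Only visit larger vertices so each cycle has a unique root.
                visited.add(nxt)
                extend(start, nxt, visited)
                visited.remove(nxt)

    for start in sorted(adj):
        extend(start, start, {start})
    return dict(counts)

# Complete directed graph on three vertices.
k3 = [(0, 1), (1, 0), (0, 2), (2, 0), (1, 2), (2, 1)]
cycle_counts = count_simple_cycles(k3)
```

On this example the search finds three cycles of length 2 and two of length 3. The running time grows exponentially with graph size, which is why exact formulas such as the one in the abstract are of interest.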
Signed networks have long been used to represent social relations of amity (+) and enmity (-) between individuals. Groups of individuals who are cyclically connected are said to be balanced if the number of negative edges in the cycle is even and unbalanced otherwise. In its most natural formulation, the balance of a social network is thus defined from its simple cycles, that is, cycles which do not visit any vertex more than once. Because of the inherent difficulty of finding such cycles on very large networks, social balance has always been studied via other, less direct means. In this article we present the balance as measured from the simple cycles and primitive orbits of social networks. We use a Monte Carlo implementation of a novel exact formula for counting the simple cycles on any weighted directed graph. We show that social networks exhibit strong inter-edge correlations favoring balanced situations and we determine the corresponding correlation length ξ. For longer simple cycles, the percentage of unbalanced simple cycles undergoes a rapid transition to values expected from an uncorrelated model. Our method is more generally applicable to evaluate arbitrary functions over the simple cycles and simple paths of any weighted directed graph and can also answer vertex-specific questions.
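The balance criterion above can be illustrated with a small Python sketch. It assumes a directed signed edge list of `(u, v, sign)` triples with `sign` equal to +1 or -1, and it enumerates simple cycles exhaustively rather than using the Monte Carlo method of the article; the function name `unbalanced_ratio_by_length` is hypothetical.

```python
from collections import defaultdict

def unbalanced_ratio_by_length(signed_edges):
    """Fraction of unbalanced simple cycles, per cycle length (brute force).

    A cycle is unbalanced when the product of its edge signs is negative,
    i.e. when it contains an odd number of negative edges.
    """
    adj = defaultdict(list)
    for u, v, sign in signed_edges:
        adj[u].append((v, sign))
    total = defaultdict(int)
    unbalanced = defaultdict(int)

    def extend(start, current, visited, sign_product):
        for nxt, sign in adj[current]:
            product = sign_product * sign
            if nxt == start:
                length = len(visited)
                total[length] += 1
                if product < 0:
                    unbalanced[length] += 1
            elif nxt > start and nxt not in visited:
                # Root each cycle at its smallest vertex to avoid double counting.
                visited.add(nxt)
                extend(start, nxt, visited, product)
                visited.remove(nxt)

    for start in sorted(adj):
        extend(start, start, {start}, 1)
    return {length: unbalanced[length] / total[length] for length in total}

# One balanced 2-cycle (0 -> 1 -> 0) and one unbalanced 3-cycle (0 -> 1 -> 2 -> 0).
toy_network = [(0, 1, +1), (1, 0, +1), (1, 2, +1), (2, 0, -1)]
ratios = unbalanced_ratio_by_length(toy_network)
```

Exhaustive enumeration like this is only practical on small graphs, which is the difficulty that motivates the sampling approach described in the abstract.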
Trace monoids provide a powerful tool to study graphs, viewing walks as words whose letters, the edges of the graph, obey a specific commutation rule. A particular class of traces emerges from this framework, the hikes, whose alphabet is the set of simple cycles on the graph. We show that hikes characterize undirected graphs uniquely, up to isomorphism, and satisfy remarkable algebraic properties such as the existence and uniqueness of a prime factorization. Because of this, the set of hikes partially ordered by divisibility hosts a plethora of relations in direct correspondence with those found in number theory. Some applications of these results are presented, including an immanantal extension of MacMahon's master theorem and a derivation of the Ihara zeta function from an abelianization procedure.
Generalized empirical likelihood and the generalized method of moments are widely used methods for solving inverse problems in econometrics. Each method defines a specific semiparametric model for which it is possible to calculate efficiency bounds. Using this approach, we provide a new proof of Chamberlain's result on optimal GMM. We also discuss conditions under which GMM estimators remain efficient with approximate moment constraints.
A number of regularization methods for discrete inverse problems consist in considering weighted versions of the usual least squares solution. However, these so-called filter methods are generally restricted to monotonic transformations, e.g. the Tikhonov regularization or the spectral cut-off. In this paper, we point out that in several cases, non-monotonic sequences of filters are more efficient. We study a regularization method that naturally extends the spectral cut-off procedure to non-monotonic sequences and provide several oracle inequalities, showing the method to be nearly optimal under mild assumptions. Then, we extend the method to inverse problems with a noisy operator and provide efficiency results in a newly introduced conditional framework.
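As a rough illustration of what a filter method looks like, the Python sketch below works in a diagonal model where each observed coefficient is `y[i] = b[i] * x[i] + noise`, and a filtered estimate is `filters[i] * y[i] / b[i]`. The diagonal setting, the function names, and the example 0/1 mask are assumptions made for illustration; the data-driven choice of a non-monotonic filter studied in the paper is not reproduced here.

```python
def filtered_estimate(y, b, filters):
    """Weighted version of the least squares solution y[i] / b[i]."""
    return [lam * yi / bi for lam, yi, bi in zip(filters, y, b)]

def tikhonov_filter(b, alpha):
    """Tikhonov weights b_i^2 / (b_i^2 + alpha): monotonic in b_i."""
    return [bi ** 2 / (bi ** 2 + alpha) for bi in b]

def spectral_cutoff_filter(b, k):
    """Keep the first k coefficients, discard the rest (monotonic 0/1 filter)."""
    return [1.0 if i < k else 0.0 for i in range(len(b))]

# Singular values sorted in decreasing order, with toy observations.
b = [1.0, 0.5, 0.1, 0.05]
y = [2.0, 1.0, 0.3, 0.01]

tikhonov = filtered_estimate(y, b, tikhonov_filter(b, alpha=0.25))
cutoff = filtered_estimate(y, b, spectral_cutoff_filter(b, k=2))

# A non-monotonic 0/1 filter: keeps the third coefficient but drops the second.
# This particular mask is arbitrary and chosen only to show the shape of such a filter.
non_monotonic = filtered_estimate(y, b, [1.0, 0.0, 1.0, 0.0])
```

The first two filters decrease as `b[i]` shrinks, while the last mask does not, which is the distinction the abstract draws between monotonic and non-monotonic sequences.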
An unbiased estimator of the best approximate solution is given by the image of g under the pseudo-inverse of A. However, when the perceptible signal is strongly attenuated by the operator, a small perturbation of the observation can cause a large change in the estimate, which makes it unstable. For inverse problems of this kind, classical regularization methods therefore use a "smoothed" version of the pseudo-inverse, so as to control the variance of the estimator at the cost of an increased bias.
We study a parametric estimation problem related to moment condition models. As an alternative to the generalized empirical likelihood (GEL) and the generalized method of moments (GMM), a Bayesian approach to the problem can be adopted, extending the MEM procedure to parametric moment conditions. We show in particular that a large number of GEL estimators can be interpreted as a maximum entropy solution. Moreover, we provide a more general field of applications by proving the method to be robust to approximate moment conditions.
In the framework of inverse problems, we consider the question of aggregating estimators taken from a given collection. Extending usual results for the direct case, we propose a new penalty to achieve the best aggregation. An oracle inequality provides the asymptotic behavior of this estimator. We investigate here the price for considering indirect observations.
A general method to combine several estimators of the same quantity is investigated. In the spirit of model and forecast averaging, the final estimator is computed as a weighted average of the initial ones, where the weights are constrained to sum to one. In this framework, the optimal weights, minimizing the quadratic loss, are entirely determined by the mean square error matrix of the vector of initial estimators. The averaging estimator is built using an estimation of this matrix, which can be computed from the same dataset. A non-asymptotic error bound on the averaging estimator is derived, leading to asymptotic optimality under mild conditions on the estimated mean square error matrix. This method is illustrated on standard statistical problems in parametric and semi-parametric models where the averaging estimator outperforms the initial estimators in most cases.
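The statement that the optimal weights depend only on the mean square error matrix can be sketched as follows. Minimizing the quadratic form w'Σw subject to the weights summing to one gives, by a standard Lagrange multiplier argument, w = Σ⁻¹1 / (1'Σ⁻¹1). The Python below assumes Σ is known and invertible, whereas the abstract works with an estimate of it; the function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * p for a, p in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def averaging_weights(mse_matrix):
    """Weights summing to one that minimize w' Sigma w."""
    ones = [1.0] * len(mse_matrix)
    unnormalized = solve_linear_system(mse_matrix, ones)  # Sigma^{-1} 1
    total = sum(unnormalized)                             # 1' Sigma^{-1} 1
    return [u / total for u in unnormalized]

# Two uncorrelated estimators with mean square errors 1 and 4.
weights = averaging_weights([[1.0, 0.0], [0.0, 4.0]])
combined = sum(w * est for w, est in zip(weights, [10.2, 9.0]))
```

In this toy case the more accurate estimator receives weight 0.8 and the other 0.2. The weights are not constrained to be non-negative, so correlated estimators can produce negative weights.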
We compute a variance lower bound for unbiased estimators in specified statistical models. The construction of the bound is related to the original Cramér-Rao bound, although it does not require the differentiability of the model. Moreover, we show our efficiency bound to be always greater than the Cramér-Rao bound in smooth models, thus providing a sharper result.
The characterization of a graph via the variable adjacency matrix enables one to define a partial order on the walks. Studying the incidence algebra on this poset reveals unsuspected relations between connected and self-avoiding walks on the graph. These relations are derived by considering truncated versions of the characteristic polynomial of the variable adjacency matrix, resulting in a collection of matrices whose entries enumerate the self-avoiding walks of length ℓ from one vertex to another.
The aim of this paper is to propose a methodology for testing general hypotheses in a Markovian setting with random sampling. A discrete Markov chain X is observed at random time intervals $\tau_k$, assumed to be iid with unknown distribution $\mu$. Two test procedures are investigated. The first one is devoted to testing whether the transition matrix P of the Markov chain X satisfies specific affine constraints, covering a wide range of situations such as symmetry or sparsity. The second procedure is a goodness-of-fit test on the distribution $\mu$, which proves to be consistent under mild assumptions even though the time gaps are not observed. The theoretical results are supported by a Monte Carlo simulation study showing the performance and robustness of the proposed methodologies on specific numerical examples.
We tackle the inverse problem of reconstructing an unknown finite measure µ from a noisy observation of a generalized moment of µ defined as the integral of a continuous and bounded operator Φ with respect to µ. When only a quadratic approximation Φ_m of the operator is known, we introduce the L² approximate maximum entropy solution as a minimizer of a convex functional subject to a sequence of convex constraints. Under several assumptions on the convex functional, the convergence of the approximate solution is established and rates of convergence are provided.
The whole theory of efficiency aims to provide a lower bound for the variance of unbiased estimators of finite-dimensional parameters in semiparametric models. We propose to adapt this theory to the generalized method of moments (GMM), which can be described as a specific semiparametric model. The objective is then to determine the semiparametric efficiency bound in this particular model. Suppose we observe n independent realizations X_1, ..., X_n of a random variable with distribution µ, satisfying
We propose a general method to combine several estimators of the same quantity in order to produce a better estimate. In the spirit of model and forecast averaging, the final estimator is computed as a weighted average of the initial ones, where the weights are constrained to sum to one. In this framework, the optimal weights, minimizing the quadratic loss, are entirely determined by the mean square error matrix of the vector of initial estimators. The averaging estimator is derived using an estimation of this matrix, which can be computed from the same dataset. We prove a non-asymptotic error bound on the averaging estimator and we show that it is asymptotically optimal under mild conditions on the estimated mean square error matrix. This method is illustrated on standard statistical problems in parametric and semi-parametric models where the averaging estimator outperforms the initial estimators in most cases.
Papers by Paul Rochet