Papers by Michael Trosset
Pattern search methods are a class of direct search methods for nonlinear optimization. Since the introduction of the original pattern search methods in the late 1950s and early 1960s, they have remained popular with users due to their simplicity and the fact that they work well in practice on a variety of problems. More recently, the fact that they are provably convergent has generated renewed interest in the nonlinear programming community. The purpose of this article is to describe what pattern search methods are and why they work.
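The basic mechanics can be sketched as a minimal coordinate-search variant (a hypothetical illustration of the idea, not any specific method surveyed in the article): poll the objective at pattern points around the incumbent, accept any simple decrease, and contract the step when no poll point improves.

```python
import numpy as np

def pattern_search(f, x0, step=1.0, tol=1e-6, max_iter=10_000):
    """Minimal coordinate pattern search: poll +/- step along each axis,
    accept any simple decrease, and halve the step when no poll point
    improves on the incumbent."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    it = 0
    while step > tol and it < max_iter:
        improved = False
        for i in range(len(x)):
            for sign in (+1.0, -1.0):
                trial = x.copy()
                trial[i] += sign * step
                ft = f(trial)
                if ft < fx:              # simple decrease is accepted
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5                  # contract the mesh
        it += 1
    return x, fx

# Demo on a smooth quadratic with minimum at (1, -2)
xmin, fmin = pattern_search(lambda x: (x[0] - 1.0)**2 + (x[1] + 2.0)**2,
                            [0.0, 0.0])
```

Note that no derivatives of `f` are evaluated anywhere; this is the defining feature of the class.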

Journal of Computational and Applied Mathematics, Dec 1, 2000
We discuss direct search methods for unconstrained optimization. We give a modern perspective on this classical family of derivative-free algorithms, focusing on the development of direct search methods during their golden age from 1960 to 1971. We discuss how direct search methods are characterized by the absence of the construction of a model of the objective. We then consider a number of the classical direct search methods and discuss what research in the intervening years has uncovered about these algorithms. In particular, while the original direct search methods were consciously based on straightforward heuristics, more recent analysis has shown that in most, but not all, cases these heuristics actually suffice to ensure global convergence of at least one subsequence of the sequence of iterates to a first-order stationary point of the objective function.
Structural Optimization, 1999
Journal of Computational and Graphical Statistics, Oct 2, 2017
The Joint Optimization of Fidelity and Commensurability (JOFC) manifold matching methodology embeds an omnibus dissimilarity matrix consisting of multiple dissimilarities on the same set of objects. One approach to this embedding optimizes the preservation of fidelity to each individual dissimilarity matrix together with commensurability of each given observation across modalities via iterative majorization of a raw stress error criterion by successive Guttman transforms. In this paper, we exploit the special structure inherent to JOFC to exactly and efficiently compute the successive Guttman transforms, and as a result we are able to greatly speed up the JOFC procedure for both in-sample and out-of-sample embedding. We demonstrate the scalability of our implementation on both real and simulated data examples.
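For readers unfamiliar with the Guttman transform, here is a minimal sketch of one raw-stress majorization step for a single dissimilarity matrix with unit weights (the omnibus structure and JOFC-specific speedups from the paper are not shown):

```python
import numpy as np

def raw_stress(X, Delta):
    """Raw stress: squared error between configuration distances and Delta."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return 0.5 * np.sum((Delta - D) ** 2)

def guttman_transform(X, Delta):
    """One SMACOF majorization step (unit weights): X_new = B(X) @ X / n.
    Iterating this transform never increases the raw stress."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    safe_D = np.where(D > 0.0, D, 1.0)              # avoid division by zero
    B = np.where(D > 0.0, -Delta / safe_D, 0.0)
    np.fill_diagonal(B, 0.0)
    np.fill_diagonal(B, -B.sum(axis=1))             # rows of B sum to zero
    return B @ X / n

# Demo: iterating the transform from a perturbed start reduces raw stress
rng = np.random.default_rng(1)
X_true = rng.normal(size=(10, 2))
Delta = np.linalg.norm(X_true[:, None, :] - X_true[None, :, :], axis=2)
X = X_true + 0.5 * rng.normal(size=(10, 2))
s0 = raw_stress(X, Delta)
for _ in range(20):
    X = guttman_transform(X, Delta)
s1 = raw_stress(X, Delta)
```

The monotone decrease of raw stress under this iteration is what "iterative majorization" guarantees.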
Computational Statistics & Data Analysis, Jun 1, 2008
Out-of-sample embedding techniques insert additional points into previously constructed configurations. An out-of-sample extension of classical multidimensional scaling is presented. The out-of-sample extension is formulated as an unconstrained nonlinear least-squares problem. The objective function is a fourth-order polynomial, easily minimized by standard gradient-based methods for numerical optimization. Two examples are presented.
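The least-squares formulation can be sketched as follows: match the new point's squared distances to the observed squared dissimilarities, which yields a fourth-order polynomial objective that a standard gradient-based routine can minimize. This is a simplified sketch; the paper's exact formulation may differ in details such as weighting.

```python
import numpy as np
from scipy.optimize import minimize

def oos_embed(X, delta_sq):
    """Insert one new point w into an existing configuration X by matching
    its squared distances to the observed squared dissimilarities delta_sq.
    The objective is a quartic polynomial in w; BFGS minimizes it."""
    def objective(w):
        d_sq = np.sum((X - w) ** 2, axis=1)
        return np.sum((d_sq - delta_sq) ** 2)
    return minimize(objective, X.mean(axis=0), method="BFGS").x

# Demo: recover a held-out point from exact squared dissimilarities
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
w_true = np.array([0.3, 0.7])
delta_sq = np.sum((X - w_true) ** 2, axis=1)
w_hat = oos_embed(X, delta_sq)
```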

Computational Statistics & Data Analysis, Nov 1, 2002
Multidimensional scaling (MDS) is a collection of data analytic techniques for constructing configurations of points from dissimilarity information about interpoint distances. Two popular measures of how well the constructed distances fit the observed dissimilarities are the raw stress and sstress criteria, each of which must be minimized by numerical optimization. Because iterative procedures for numerical optimization typically find local minimizers that may not be global minimizers, the choice of an initial configuration from which to begin searching for an optimal configuration is crucial. A popular choice of initial configuration is the classical solution of Torgerson (Psychometrika 17 (1952) 401). Results from the theory of distance matrices are exploited to derive two alternatives, each guaranteed to be at least as good as the classical solution, and empirical evidence is presented that they are usually substantially better.
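The classical (Torgerson) solution used as the baseline initialization can be sketched in a few lines: double-center the squared dissimilarities and embed with the leading eigenpairs. (The two improved initializations derived in the paper are not reproduced here.)

```python
import numpy as np

def classical_mds(Delta, p):
    """Torgerson's classical solution: double-center the squared
    dissimilarities, then embed with the top-p eigenpairs."""
    n = Delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (Delta ** 2) @ J              # double-centered "Gram" matrix
    vals, vecs = np.linalg.eigh(B)               # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:p]             # take the p largest
    lam = np.clip(vals[idx], 0.0, None)          # guard against tiny negatives
    return vecs[:, idx] * np.sqrt(lam)

# Demo: an exactly Euclidean Delta in 2D is reproduced exactly (up to rotation)
rng = np.random.default_rng(0)
X_true = rng.normal(size=(8, 2))
Delta = np.linalg.norm(X_true[:, None, :] - X_true[None, :, :], axis=2)
X_hat = classical_mds(Delta, 2)
D_hat = np.linalg.norm(X_hat[:, None, :] - X_hat[None, :, :], axis=2)
```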

arXiv (Cornell University), Apr 15, 2020
A random dot product graph (RDPG) is a generative model for networks in which vertices correspond to positions in a latent Euclidean space and edge probabilities are determined by the dot products of the latent positions. We consider RDPGs for which the latent positions are randomly sampled from an unknown 1-dimensional submanifold of the latent space. In principle, restricted inference, i.e., procedures that exploit the structure of the submanifold, should be more effective than unrestricted inference; however, it is not clear how to conduct restricted inference when the submanifold is unknown. We submit that techniques for manifold learning can be used to learn the unknown submanifold well enough to realize benefit from restricted inference. To illustrate, we test 1- and 2-sample hypotheses about the Fréchet means of small communities of vertices, using the complete set of vertices to infer latent structure. We propose test statistics that deploy the Isomap procedure for manifold learning, using shortest path distances on neighborhood graphs constructed from estimated latent positions to estimate arc lengths on the unknown 1-dimensional submanifold. Unlike conventional applications of Isomap, the estimated latent positions do not lie on the submanifold of interest. We extend existing convergence results for Isomap to this setting and use them to demonstrate that, as the number of auxiliary vertices increases, the power of our test converges to the power of the corresponding test when the submanifold is known. Finally, we apply our methods to an inference problem that arises in studying the connectome of the Drosophila larval mushroom body. The univariate learnt manifold test rejects (p < 0.05), while the multivariate ambient space test does not (p > 0.05), illustrating the value of identifying and exploiting low-dimensional structure for subsequent inference.
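The arc-length estimation step can be sketched with a neighborhood graph and graph shortest paths (a generic Isomap-style computation, not the paper's full testing procedure; the radius `eps` is an assumed tuning parameter):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path

def isomap_distances(Z, eps):
    """Estimate arc lengths on an underlying curve from points Z: connect
    pairs within radius eps, weight edges by Euclidean distance, and take
    graph shortest-path distances as the arc-length estimates."""
    D = cdist(Z, Z)
    W = np.where(D <= eps, D, np.inf)   # inf marks absent edges
    return shortest_path(W, method="D", directed=False)

# Demo: points on a quarter of the unit circle; the end-to-end graph
# distance should approximate the true arc length pi/2
t = np.linspace(0.0, np.pi / 2, 50)
Z = np.column_stack([np.cos(t), np.sin(t)])
G = isomap_distances(Z, eps=0.05)
```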

arXiv (Cornell University), Mar 20, 2019
Parametric inference posits a statistical model that is a specified family of probability distributions. Restricted inference, e.g., restricted likelihood ratio testing, attempts to exploit the structure of a statistical submodel that is a subset of the specified family. We consider the problem of testing a simple hypothesis against alternatives from such a submodel. In the case of an unknown submodel, it is not clear how to realize the benefits of restricted inference. To do so, we first construct information tests that are locally asymptotically equivalent to likelihood ratio tests. Information tests are conceptually appealing but (in general) computationally intractable. However, unlike restricted likelihood ratio tests, restricted information tests can be approximated even when the statistical submodel is unknown. We construct approximate information tests using manifold learning procedures to extract information from samples of an unknown (or intractable) submodel, thereby providing a roadmap for computational solutions to a class of previously impenetrable problems in statistical inference. Examples illustrate the efficacy of the proposed methodology.
Iterative Denoising
Computational Statistics, Oct 12, 2007

arXiv (Cornell University), Jul 29, 2016
Likelihood ratio tests are widely used to test statistical hypotheses about parametric families of probability distributions. If interest is restricted to a subfamily of distributions, then it is natural to inquire if the restricted LRT is superior to the unrestricted LRT. Marden's general LRT conjecture posits that any restriction placed on the alternative hypothesis will increase power. The only published counterexample to this conjecture is rather technical and involves a restriction that maintains the dimension of the alternative. We formulate the dimension-restricted LRT conjecture, which posits that any restriction that replaces a parametric family with a subfamily of lower dimension will increase power. Under standard regularity conditions, we then demonstrate that the restricted LRT is asymptotically more powerful than the unrestricted LRT for local alternatives. Remarkably, however, even the dimension-restricted LRT conjecture fails in the case of finite samples. Our counterexamples involve subfamilies of multinomial distributions. In particular, our study of the Hardy-Weinberg subfamily of trinomial distributions provides a simple and elegant demonstration that restrictions may not increase power.
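To make the trinomial setting concrete, here is a minimal sketch computing restricted and unrestricted LRT statistics for a simple null hypothesis, with the Hardy-Weinberg curve (theta^2, 2 theta (1 - theta), (1 - theta)^2) as the restricted alternative. This is an illustrative computation only, not the paper's power analysis; the counts and null below are made up.

```python
import numpy as np

def lrt_statistics(counts, p0):
    """Unrestricted and restricted (Hardy-Weinberg) likelihood ratio
    statistics for testing the simple null p0 in the trinomial family."""
    n1, n2, n3 = counts
    n = n1 + n2 + n3

    def loglik(p):
        # 0 * log(q) = 0 convention for zero counts
        return sum(c * np.log(q) for c, q in zip(counts, p) if c > 0)

    p_unres = (n1 / n, n2 / n, n3 / n)             # unrestricted MLE
    theta = (2 * n1 + n2) / (2 * n)                # MLE along the HW curve
    p_hw = (theta**2, 2 * theta * (1 - theta), (1 - theta)**2)
    return (2 * (loglik(p_unres) - loglik(p0)),
            2 * (loglik(p_hw) - loglik(p0)))

# Demo: null on the HW curve (theta = 0.5), counts drawn off the null
unres, restr = lrt_statistics((35, 40, 25), (0.25, 0.5, 0.25))
```

Because the supremum over the full trinomial family dominates the supremum over its one-dimensional HW subfamily, the unrestricted statistic is always at least as large as the restricted one; the conjecture concerns power, which is a subtler matter.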

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2019
Parameter estimation in discrete or continuous deterministic cell cycle models is challenging for several reasons, including the nature of what can be observed, and the accuracy and quantity of those observations. The challenge is even greater for stochastic models, where the number of simulations and amount of empirical data must be even larger to obtain statistically valid parameter estimates. This work describes a new quasi-Newton algorithm class QNSTOP for stochastic optimization problems. QNSTOP directly uses the random objective function value samples rather than creating ensemble statistics. QNSTOP is used here to directly match empirical and simulated joint probability distributions rather than matching summary statistics. Results are given for a current state-of-the-art stochastic cell cycle model of budding yeast, whose predictions match well some summary statistics and one-dimensional distributions from empirical data, but do not match well the empirical joint distributions. The nature of the mismatch provides insight into the weakness in the stochastic model.

Journal of Computational and Graphical Statistics, Mar 1, 2005
Clustering is often useful for analyzing and summarizing information within large datasets. Model-based clustering methods have been found to be effective for determining the number of clusters, dealing with outliers, and selecting the best clustering method in datasets that are small to moderate in size. For large datasets, current model-based clustering methods tend to be limited by memory and time requirements and the increasing difficulty of maximum likelihood estimation. They may fit too many clusters in some portions of the data and/or miss clusters containing relatively few observations. We propose an incremental approach for data that can be processed as a whole in memory, which is relatively efficient computationally and has the ability to find small clusters in large datasets. The method starts by drawing a random sample of the data, selecting and fitting a clustering model to the sample, and extending the model to the full dataset by additional EM iterations. New clusters are then added incrementally, initialized with the observations that are poorly fit by the current model. We demonstrate the effectiveness of this method by applying it to simulated data, and to image data where its performance can be assessed visually.
Extensions of Classical Multidimensional Scaling via Variable Reduction
Computational Statistics, Jul 1, 2002
Classical multidimensional scaling constructs a configuration of points... This paper describes the computational theory that provides a common foundation for these formulations.
Alzheimer's disease: Effects on language
Developmental Neuropsychology, Apr 1, 1993
The longitudinal effect of Alzheimer's disease on language functions has rarely been investigated. Through support from the National Institute of Mental Health, language functions were comprehensively assessed in 94 Alzheimer's patients and 53 normal control participants in a 3-year study. Rate of decline of language abilities was calculated and related to overall dementia severity, family history of the disease, and

Clinical Cancer Research, Feb 1, 2005
Purpose: We recently showed that protein expression profiling of serum using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF MS) has potential as a diagnostic approach for detection of prostate cancer. As a parallel effort, we have been pursuing the identification of the protein(s) comprising the individual discriminatory "peaks" and evaluating their utility as potential biomarkers for prostate disease. Experimental Design: We employed liquid chromatography, gel electrophoresis and tandem mass spectrometry to isolate and identify a protein that correlates with observed SELDI-TOF MS mass/charge (m/z) values. Immunodepletion, immunoassay, and Western analysis were used to verify that the identified protein generated the observed SELDI peak. Subsequent immunohistochemistry was used to examine the expression of the proteins in prostate tumors. Results: An 8,946 m/z SELDI-TOF MS peak was found to retain discriminatory value in each of two separate data sets with an increased expression in the diseased state. Sequence identification by liquid chromatography-MS/MS and subsequent immunoassays verified that an isoform of apolipoprotein A-II (ApoA-II) is the observed 8,946 m/z SELDI peak. Immunohistochemistry revealed that ApoA-II is overexpressed in prostate tumors. SELDI-based immunoassay revealed that an 8.9-kDa isoform of ApoA-II is specifically overexpressed in serum from individuals with prostate cancer. ApoA-II was also overexpressed in the serum of individuals with prostate cancer who have normal prostate-specific antigen (0-4.0 ng/mL). Conclusions: We have identified an isoform of ApoA-II giving rise to an 8.9K m/z SELDI "peak" that is specifically overexpressed in prostate disease. The ability of ApoA-II to detect disease in patients with normal prostate-specific antigen suggests potential utility of the marker in identifying indolent disease.

arXiv (Cornell University), May 3, 2023
Random graphs are increasingly becoming objects of interest for modeling networks in a wide range of applications. Latent position random graph models posit that each node is associated with a latent position vector, and that these vectors follow some geometric structure in the latent space. In this paper, we consider random dot product graphs, in which an edge is formed between two nodes with probability given by the inner product of their respective latent positions. We assume that the latent position vectors lie on an unknown one-dimensional curve and are coupled with a response covariate via a regression model. Using the geometry of the underlying latent position vectors, we propose a manifold learning and graph embedding technique to predict the response variable on out-of-sample nodes, and we establish convergence guarantees for these responses. Our theoretical results are supported by simulations and an application to Drosophila brain data.
arXiv (Cornell University), Feb 11, 2015
The Joint Optimization of Fidelity and Commensurability (JOFC) manifold matching methodology embeds an omnibus dissimilarity matrix consisting of multiple dissimilarities on the same set of objects. One approach to this embedding optimizes the preservation of fidelity to each individual dissimilarity matrix together with commensurability of each given observation across modalities via iterative majorization of a raw stress error criterion by successive Guttman transforms. In this paper, we exploit the special structure inherent to JOFC to exactly and efficiently compute the successive Guttman transforms, and as a result we are able to greatly speed up the JOFC procedure for both in-sample and out-of-sample embedding. We demonstrate the scalability of our implementation on both real and simulated data examples.
The Bayes principle from statistical decision theory provides a conceptual framework for quantifying uncertainties that arise in robust design optimization. The difficulty with exploiting this framework is computational, as it leads to objective and constraint functions that must be evaluated by numerical integration. Using a prototypical robust design optimization problem, this study explores the computational cost of multidimensional integration (computing expectation) and its interplay with optimization algorithms. It concludes that straightforward application of standard off-the-shelf optimization software to robust design is prohibitively expensive, necessitating adaptive strategies and the use of surrogates.
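The objective-by-integration issue can be illustrated with a toy example: approximating the expectation with a fixed Monte Carlo sample (common random numbers) makes the objective deterministic, so off-the-shelf optimization software applies, but every objective evaluation now costs a full pass over the sample. The loss function and sample sizes below are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
u = rng.normal(0.0, 0.1, size=2000)   # fixed noise sample (common random numbers)

def expected_loss(x):
    """Monte Carlo approximation of E[(x + U - 1)^2] for U ~ N(0, 0.1^2).
    Each evaluation integrates over the entire fixed sample."""
    return float(np.mean((x + u - 1.0) ** 2))

# A standard bounded scalar minimizer now applies to the smoothed objective
res = minimize_scalar(expected_loss, bounds=(-5.0, 5.0), method="bounded")
```

With many design variables the integral becomes multidimensional and the per-evaluation cost grows accordingly, which is the expense the study quantifies.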

This report is intended to facilitate dialogue between engineers and optimizers about the efficiency of Taguchi methods for robust design, especially in the context of design by computer simulation. Three approaches to robust design are described:
1. Robust optimization, i.e., specifying an objective function f and then minimizing a smoothed (robust) version of f by the methods of numerical optimization.
2. Taguchi's method of specifying the objective function as a certain signal-to-noise ratio, to be optimized by designing, performing and analyzing a single massive experiment.
3. Specifying an expected loss function f and then minimizing a cheap-to-compute surrogate objective function f̂, to be obtained by designing and performing a single massive experiment.
Some relations between these approaches are noted and it is emphasized that only the first approach is capable of iteratively progressing toward a solution.
The design and analysis of computer experiments (DACE) usually envisions performing a single experiment, then replacing the expensive simulation with an approximation. When the simulation is a nonlinear function to be optimized, DACE may be inefficient, and sequential strategies that synthesize ideas from DACE and numerical optimization may be warranted. We consider several such strategies within a unified framework in which sequential approximations constructed by kriging are used to accelerate a conventional direct search method. Computational experiments reveal that hybrid strategies outperform both DACE and traditional pattern search.