2001
The (unit-cost) comparison tree model has long been the basis of evaluating the performance of algorithms for fundamental problems like sorting and searching. In this model, the assumption is that elements of some total order are not given to us directly, but only through a black-box, which performs comparisons between the elements and outputs the result of the comparison.
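As a concrete illustration of this black-box view, here is a minimal sketch (not taken from any of the papers; the class and function names are ours): the hidden keys are visible only to the oracle, and the algorithm's cost is simply the number of oracle calls it makes.

```python
# A minimal sketch of the unit-cost comparison black box: the algorithm never
# sees the hidden keys, only comparison outcomes, and each call costs one unit.

class ComparisonOracle:
    """Black box over a hidden total order; each compare() costs one unit."""

    def __init__(self, hidden_keys):
        self._keys = list(hidden_keys)   # never exposed to the algorithm
        self.cost = 0                    # total comparisons charged so far

    def compare(self, i, j):
        """Return -1, 0, or +1 for the order of elements i and j (by index)."""
        self.cost += 1
        a, b = self._keys[i], self._keys[j]
        return (a > b) - (a < b)


def insertion_sort_indices(oracle, n):
    """Sort element indices 0..n-1 using only oracle comparisons."""
    order = list(range(n))
    for i in range(1, n):
        j = i
        while j > 0 and oracle.compare(order[j - 1], order[j]) > 0:
            order[j - 1], order[j] = order[j], order[j - 1]
            j -= 1
    return order


oracle = ComparisonOracle([17, 3, 42, 8])
print(insertion_sort_indices(oracle, 4), oracle.cost)   # [1, 3, 0, 2] and the unit cost paid
```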
Lecture Notes in Computer Science, 2005
Traditionally, a fundamental assumption in evaluating the performance of algorithms for sorting and selection has been that comparing any two elements costs one unit (of time, work, etc.); the goal of an algorithm is to minimize the total cost incurred. However, a body of recent work has attempted to weaken this assumption: in particular, new algorithms have been given for the basic problems of searching, sorting, and selection when comparisons between different pairs of elements have different associated costs. In this paper, we further these investigations and address the questions of max-finding and sorting when the comparison costs form a metric, i.e., the comparison costs c_uv respect the triangle inequality c_uv + c_vw ≥ c_uw for all input elements u, v, and w. We give the first results for these problems. Specifically, we present (i) an O(log n)-competitive algorithm for max-finding on general metrics, which we improve to an O(1)-competitive algorithm for max-finding in constant-dimensional spaces, and (ii) an O(log^2 n)-competitive algorithm for sorting in general metric spaces. Our main technique for max-finding is to run two copies of a simple, natural online algorithm (which costs too much when run by itself) in parallel; by judiciously exchanging information between the two copies, we can bound the cost incurred by the algorithm. We believe that this technique may have other applications to online algorithms. How can we minimize the total cost of all the comparisons performed? Note that these costs c_uv are known to the algorithm, which can use them to decide on the sequence of comparisons. The case where all comparison costs are identical is just the unit-cost model. To measure the performance of algorithms in this model, Charikar et al. [2] used the framework of competitive analysis: they compared the cost incurred by the algorithm to the cost incurred by an optimal set of comparisons that proves the output correct. The paper of Charikar et al. [2], and subsequent work by the authors [3] and by Kannan and Khanna [4], considered sorting, searching, and selection for special cost functions (described in the discussion of related work). However, there seems to be no work on the case where the comparison costs form a metric space, i.e., where the costs respect the triangle inequality c_uv + c_vw ≥ c_uw for all u, v, w ∈ V. Such situations may arise if the elements reside at different places in a communication network and the communication cost of comparing two elements is proportional to the distance between them. An equivalent, and perhaps more natural, way of enforcing the metric constraint on the costs is to say that the vertices in V lie in an ambient metric space (X, d) (where V ⊆ X), and the cost c_ij of comparing two vertices i and j is the distance d(i, j) between them.
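To make the metric-cost setting concrete, the following sketch places the elements at points of a metric space (here the Euclidean plane) and charges each comparison its distance, so the triangle inequality holds by construction. The max-finding routine shown is only the naive scan-the-champion baseline, not the paper's competitive algorithm; all names are illustrative.

```python
# Illustrative sketch: comparison cost = distance in an ambient metric space.

import math

def euclidean(p, q):
    return math.dist(p, q)

class MetricComparisonOracle:
    def __init__(self, keys, points, dist=euclidean):
        self._keys = list(keys)      # hidden values, revealed only via comparisons
        self.points = list(points)   # locations are public and define the costs
        self.dist = dist
        self.cost = 0.0

    def is_greater(self, u, v):
        """Charge d(u, v) and report whether key[u] > key[v]."""
        self.cost += self.dist(self.points[u], self.points[v])
        return self._keys[u] > self._keys[v]

def naive_max(oracle, n):
    """Scan-and-keep-the-champion max-finding: a baseline whose cost can be far
    from the cheapest certificate, which is why smarter strategies are needed."""
    best = 0
    for v in range(1, n):
        if oracle.is_greater(v, best):
            best = v
    return best
```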
We describe a general framework for realistic analysis of sorting and searching algorithms, and we apply it to the average-case analysis of five basic algorithms: three sorting algorithms (QuickSort, InsertionSort, BubbleSort) and two selection algorithms (QuickMin and SelectionMin). Usually, the analysis deals with the mean number of key comparisons, but, here, we view keys as words produced by the same source, which are compared via their symbols in the lexicographic order. The realistic cost of the algorithm is now the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons used by the algorithm. For sorting algorithms, and with respect to key comparisons, the average-case complexity of QuickSort is asymptotic to 2n log n, InsertionSort to n^2/4 and BubbleSort to n^2/2. With respect to symbol comparisons, we prove that their average-case complexity becomes Θ(n...
Journal of Algorithms, 2002
We present an O(n^4)-time algorithm for the following problem: Given a set of items with known access frequencies, find the optimal binary search tree under the realistic assumption that each comparison can only result in a two-way decision: either an equality comparison or a less-than comparison. This improves the best known result of O(n^5) time, which is based on split tree algorithms. Our algorithm relies on establishing thresholds on the frequency of an item that can occur as an equality comparison at the root of an optimal tree.
Lecture Notes in Computer Science, 2009
In experimental psychology, the method of paired comparisons was proposed as a means for ranking preferences amongst n elements of a human subject. The method requires performing all n(n-1)/2 comparisons and then sorting elements according to the number of wins. The large number of comparisons is performed to counter the potentially faulty decision-making of the human subject, who acts as an imprecise comparator.
Lecture Notes in Computer Science, 2015
In 1971, Knuth gave an O(n^2)-time algorithm for the classic problem of finding an optimal binary search tree. Knuth's algorithm works only for search trees based on 3-way comparisons, but most modern computers support only 2-way comparisons (<, ≤, =, ≥, and >). Until this paper, the problem of finding an optimal search tree using 2-way comparisons remained open; polynomial-time algorithms were known only for restricted variants. We solve the general case, giving (i) an O(n^4)-time algorithm and (ii) an O(n log n)-time additive-3 approximation algorithm. For finding optimal binary split trees, we (iii) obtain a linear speedup and (iv) prove some previous work incorrect. "... machines that cannot make three-way comparisons at once ... will have to make two comparisons ... it may well be best to have a binary tree whose internal nodes specify either an equality test or a less-than test but not both."
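The object being optimized here is a search tree whose internal nodes each perform a single 2-way test. A minimal sketch of that structure and the corresponding search procedure (illustrative only; it does not construct an optimal tree, and all names are ours):

```python
# Sketch of a 2-way-comparison search tree: each internal node asks either
# "key == pivot?" or "key < pivot?", and the search cost is the number of tests.

class Node:
    def __init__(self, kind, pivot, yes=None, no=None):
        assert kind in ("eq", "lt")   # equality test or less-than test
        self.kind, self.pivot, self.yes, self.no = kind, pivot, yes, no

def search(node, key, comparisons=0):
    """Follow 2-way tests until a leaf (a string label here) is reached."""
    while isinstance(node, Node):
        comparisons += 1
        if node.kind == "eq":
            node = node.yes if key == node.pivot else node.no
        else:
            node = node.yes if key < node.pivot else node.no
    return node, comparisons

# A tiny hand-built tree over keys {1, 2, 3}: test "== 2" first, then "< 2".
tree = Node("eq", 2, yes="found 2", no=Node("lt", 2, yes="found 1", no="found 3"))
print(search(tree, 3))   # ('found 3', 2)
```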
2019
We study the problem of sorting under incomplete information, when queries are used to resolve uncertainties. Each of n data items has an unknown value, which is known to lie in a given interval. We can pay a query cost to learn the actual value, and we may allow an error threshold in the sorting. The goal is to find a nearly-sorted permutation by performing a minimum-cost set of queries. We show that an offline optimum query set can be found in polynomial time, and that both oblivious and adaptive problems have simple query-competitive algorithms. The query-competitiveness for the oblivious problem is n for uniform query costs, and unbounded for arbitrary costs; for the adaptive problem, the ratio is 2. We then present a unified adaptive strategy for uniform query costs that yields: (i) a 3/2-query-competitive randomized algorithm; (ii) a 5/3-query-competitive deterministic algorithm if the dependency graph has no 2-components after some preprocessing, which has query-competitive r...
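A small sketch of the uncertainty model may help: each item is known only up to an interval, pairs with disjoint intervals are ordered for free, and properly overlapping pairs are the candidates for paid queries; these overlaps form the dependency graph mentioned above. The code below (illustrative names, closed intervals assumed) only identifies those pairs; it is not one of the competitive strategies.

```python
# Sketch of the uncertainty model: which pairs cannot be ordered without queries?

from itertools import combinations

def dependency_edges(intervals):
    """Return pairs (i, j) whose relative order cannot be deduced for free."""
    edges = []
    for i, j in combinations(range(len(intervals)), 2):
        lo_i, hi_i = intervals[i]
        lo_j, hi_j = intervals[j]
        if not (hi_i < lo_j or hi_j < lo_i):   # closed intervals overlap
            edges.append((i, j))
    return edges

intervals = [(0, 3), (2, 5), (6, 9)]
print(dependency_edges(intervals))   # [(0, 1)]: only the first two items may need queries
```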
2009
Since the dawn of computing, the sorting problem has attracted a great deal of research. In the past, many researchers have attempted to optimize it using empirical analysis. We have investigated the complexity values researchers have obtained and observed that there is scope for fine-tuning in the present context. Strong evidence to that effect is also presented. We aim to provide a useful and comprehensive note to researchers about how the complexity aspects of sorting algorithms can best be analyzed. It is also intended to prompt current researchers to consider whether their own work might be improved by such fine-tuning. Our work is based on a literature review of experimental studies and survey papers on performance improvements of sorting algorithms. Although written from the perspective of a theoretical computer scientist, it is intended to be of use to researchers from all fields who want to study sorting algorithms rigorously.
Proceedings of the thirty-second annual ACM symposium on Theory of computing - STOC '00, 2000
We consider a class of problems in which an algorithm seeks to compute a function f over a set of n inputs, where each input has an associated price. The algorithm queries inputs sequentially, trying to learn the value of the function for the minimum cost. We apply the competitive analysis of algorithms to this framework, designing algorithms that incur large cost only when the cost of the cheapest "proof" for the value of f is also large. We provide algorithms that achieve the optimal competitive ratio for functions that include arbitrary Boolean AND/OR trees, and for the problem of searching in a sorted array. We also investigate a model for pricing in this framework, constructing a set of prices for any AND/OR tree that satisfies a very strong type of equilibrium property.
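The following sketch illustrates the priced-information framework for Boolean AND/OR trees: leaves carry hidden bits and query prices, and an evaluator's cost is judged against the cheapest set of leaf readings that certifies the tree's value. The code computes that benchmark (the cheapest proof); it is not the optimal-competitive strategy of the paper, and all names are ours.

```python
# Sketch of the priced-query framework: value of an AND/OR tree and the
# minimum-price "proof" (set of leaf readings) that certifies that value.

class Leaf:
    def __init__(self, value, price):
        self.value, self.price = value, price

class Gate:
    def __init__(self, op, children):   # op is "AND" or "OR"
        self.op, self.children = op, children

def value(node):
    if isinstance(node, Leaf):
        return node.value
    vals = [value(c) for c in node.children]
    return all(vals) if node.op == "AND" else any(vals)

def cheapest_proof(node):
    """Minimum total price of leaf readings that certify the node's value."""
    if isinstance(node, Leaf):
        return node.price
    costs = [cheapest_proof(c) for c in node.children]
    vals = [value(c) for c in node.children]
    v = all(vals) if node.op == "AND" else any(vals)
    if (node.op == "AND" and v) or (node.op == "OR" and not v):
        return sum(costs)                  # every child must be certified
    # a single child already decides the gate: certify the cheapest such child
    return min(c for c, cv in zip(costs, vals) if bool(cv) != (node.op == "AND"))

tree = Gate("OR", [Leaf(0, 5), Gate("AND", [Leaf(1, 2), Leaf(1, 3)])])
print(value(tree), cheapest_proof(tree))   # 1 5: certifying the AND subtree suffices
```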
IEEE Symposium on Foundations of Computer Science (FOCS), 2011
We study the generalized sorting problem where we are given a set of n elements to be sorted but only a subset of all possible pairwise element comparisons is allowed. The goal is to determine the sorted order using the smallest possible number of allowed comparisons. The generalized sorting problem may be equivalently viewed as follows. Given an undirected graph G(V, E), where V is the set of elements to be sorted and E defines the set of allowed comparisons, adaptively find the smallest subset E' ⊆ E of edges to probe such that the directed graph induced by E' contains a Hamiltonian path. When G is a complete graph, we get the standard sorting problem, and it is well known that Θ(n log n) comparisons are necessary and sufficient. An extensively studied special case of the generalized sorting problem is the nuts and bolts problem, where the allowed comparison graph is a complete bipartite graph between two equal-size sets. It is known that for this special case also, there is a deterministic algorithm that sorts using Θ(n log n) comparisons. However, when the allowed comparison graph is arbitrary, to our knowledge, no bound better than the trivial O(n^2) bound is known. Our main result is a randomized algorithm that sorts any allowed comparison graph using O(n^(3/2)) comparisons with high probability (provided the input is sortable). We also study the sorting problem in randomly generated allowed comparison graphs, and show that when the edge probability is p, O(min{n/p^2, n^(3/2) √p}) comparisons suffice on average to sort.
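For concreteness, here is a sketch of the problem setup together with the trivial baseline mentioned above: probe every allowed edge (O(n^2) probes) and topologically sort the resulting orientation. The randomized O(n^(3/2)) algorithm is far more selective about which edges it probes; the names below are illustrative.

```python
# Sketch of generalized sorting: probes reveal edge directions of an allowed
# comparison graph; if the graph contains the sorted order's Hamiltonian path,
# the topological order of the fully probed orientation is that sorted order.

from collections import deque

def probe_all_and_sort(n, allowed_edges, hidden_rank):
    """Probe every allowed edge (the trivial strategy), then topologically sort
    the resulting orientation; returns (order, number of probes)."""
    succ = [[] for _ in range(n)]
    indeg = [0] * n
    probes = 0
    for u, v in allowed_edges:
        probes += 1
        a, b = (u, v) if hidden_rank[u] < hidden_rank[v] else (v, u)
        succ[a].append(b)
        indeg[b] += 1
    queue = deque(i for i in range(n) if indeg[i] == 0)   # Kahn's algorithm
    order = []
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in succ[x]:
            indeg[y] -= 1
            if indeg[y] == 0:
                queue.append(y)
    return order, probes

# Sortable instance: the allowed graph contains the path 1 -> 2 -> 0 -> 3.
edges = [(1, 2), (0, 2), (0, 3), (1, 3)]
print(probe_all_and_sort(4, edges, hidden_rank=[2, 0, 1, 3]))   # ([1, 2, 0, 3], 4)
```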
ArXiv, 2021
We present a simple O(n^4)-time algorithm for computing optimal search trees with two-way comparisons. The only previous solution to this problem, by Anderson et al., has the same running time, but is significantly more complicated and is restricted to the variant where only successful queries are allowed. Our algorithm extends directly to solve the standard full variant of the problem, which also allows unsuccessful queries and for which no polynomial-time algorithm was previously known. The correctness proof of our algorithm relies on a new structural theorem for two-way-comparison search trees.
Information Processing Letters, 1981
Nordic Journal of Computing, 2006
We present a simplified derivation of the fact that the complexity-theoretic lower bound of comparison-based sorting algorithms, both for the worst-case and for the average-case time measure, is Ω(n log n).
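For reference, the standard counting argument behind this bound can be stated in a few lines (a sketch of the usual reasoning, not necessarily the paper's simplified derivation):

```latex
% Decision-tree counting argument. A comparison sort for n distinct keys is a
% binary decision tree whose leaves must distinguish all n! permutations, so
% the tree has at least n! leaves and worst-case depth
\[
  h \;\ge\; \log_2(n!) \;=\; \sum_{i=1}^{n} \log_2 i \;\ge\; \frac{n}{2}\,\log_2\frac{n}{2} \;=\; \Omega(n \log n).
\]
% The average case follows similarly: a binary tree with N leaves has external
% path length at least N \log_2 N, so the mean leaf depth is also at least
% \log_2(n!) = \Omega(n \log n).
```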
Applications of Mathematics, 1981
For any k-tuple v in V, let c_i(v) denote the i-th component of v. A lexicographic ordering ≺ is defined on V in the usual way: for v, u ∈ V, v ≺ u if and only if either c_1(v) < c_1(u) or there exists 1 ≤ j < k such that c_i(v) = c_i(u) for i = 1, 2, ..., j and c_{j+1}(v) < c_{j+1}(u), where < is the total ordering on each U_j. We consider the problem of lexicographically sorting the k-tuples of V, as well as that of searching for a k-tuple in V. The computational complexity of both problems is measured by the number of three-branch component comparisons needed to solve them (i.e., two components c_i(v) and c_i(u) are compared, yielding c_i(v) < c_i(u), c_i(v) = c_i(u), or c_i(v) > c_i(u) as the answer). We are interested in obtaining the worst-case upper and lower bounds on this complexity, as a function of both n and k. Note that lexicographic sorting can be solved straightforwardly by applying any "one-dimensional" sorting algorithm directly to the k-tuples of V, viewed in this case as "unstructured" elements under the lexicographic ordering ≺. However, this approach requires about O(n log n) "lexicographic" comparisons, which can need as many as Ω(kn log n) component comparisons, because in the worst case the lexicographic order of two k-tuples cannot be determined until all k component comparisons have been performed. Similarly, lexicographic search can be done in O(k log n) steps. In contrast to these "trivial" upper bounds, we show that by exploiting the particular structure of the lexicographic ordering, lexicographic sorting and searching can be accomplished using O(n(log n + k)) and ⌈log(n + 1)⌉ + k − 1 component comparisons, respectively, and that these bounds are asymptotically optimal in the case of sorting and optimal in the case of searching.
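A small sketch of this cost measure (illustrative, with our own function names): comparing two k-tuples lexicographically spends one three-branch component comparison per position until the first differing position, so a single lexicographic comparison may cost anywhere from 1 to k component comparisons.

```python
# Counting three-branch component comparisons in one lexicographic comparison.

def lex_compare(v, u):
    """Return (sign, component_comparisons) for k-tuples v and u."""
    comparisons = 0
    for a, b in zip(v, u):
        comparisons += 1          # one three-branch comparison: <, =, or >
        if a < b:
            return -1, comparisons
        if a > b:
            return +1, comparisons
    return 0, comparisons         # equal in every component

print(lex_compare((1, 2, 9), (1, 2, 3)))   # (1, 3): all k = 3 components inspected
print(lex_compare((0, 5, 5), (4, 0, 0)))   # (-1, 1): decided at the first component
```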
Combinatorics, Probability and Computing, 2014
We describe a general framework for realistic analysis of sorting algorithms, and we apply it to the average-case analysis of three basic sorting algorithms (QuickSort, InsertionSort, BubbleSort). Usually the analysis deals with the mean number of key comparisons, but here we view keys as words produced by the same source, which are compared via their symbols in lexicographic order. The 'realistic' cost of the algorithm is now the total number of symbol comparisons performed by the algorithm, and, in this context, the average-case analysis aims to provide estimates for the mean number of symbol comparisons used by the algorithm. For sorting algorithms, and with respect to key comparisons, the average-case complexity of QuickSort is asymptotic to 2n log n, InsertionSort to n^2/4 and BubbleSort to n^2/2. With respect to symbol comparisons, we prove that their average-case complexity becomes Θ(n log^2 n), Θ(n^2), and Θ(n^2 log n). In these three cases, we describe the dominant constants which ...
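The contrast between the two cost measures can be made concrete with a small instrumented quicksort over string keys (an illustrative sketch with our own names, not the analysis framework itself): one key comparison of two words costs roughly the length of their common prefix plus one symbol comparison.

```python
# Counting key comparisons vs symbol comparisons for quicksort on string keys.

def symbol_cost(x, y):
    """Symbol comparisons used to compare words x and y lexicographically
    (modeling choice: the first differing or terminating position also counts)."""
    prefix = 0
    for a, b in zip(x, y):
        if a != b:
            break
        prefix += 1
    return prefix + 1

def quicksort(keys, counters):
    if len(keys) <= 1:
        return keys
    pivot, rest = keys[0], keys[1:]
    left, right = [], []
    for w in rest:
        counters["key"] += 1
        counters["symbol"] += symbol_cost(w, pivot)
        (left if w < pivot else right).append(w)
    return quicksort(left, counters) + [pivot] + quicksort(right, counters)

counters = {"key": 0, "symbol": 0}
words = ["abab", "abba", "aab", "b", "abac"]
print(quicksort(words, counters), counters)   # sorted words plus both comparison counts
```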
ijera.com
Dynamic programming is an effective algorithm design method. Sorting is believed to be an unusual area for dynamic programming; our finding is contrary to this conventional belief. Although classical sorting algorithms appear to have been designed using a bottom-up design approach, we have found evidence suggesting that some classical sorting algorithms can also be designed using the dynamic programming design method. Even the development of the classical merge algorithm shows elements of dynamic programming. This paper finds that the development of sorting algorithms can be viewed from an entirely different point of view. This work reveals some new facts about the design and development of some key sorting algorithms, and we found that this new interpretation gives deep insight into the design of some fundamental sorting algorithms.
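Whether one labels this dynamic programming is the paper's thesis; the bottom-up structure it points to, however, is easy to exhibit: bottom-up mergesort tabulates solutions to small subproblems (sorted runs) and combines them into solutions of ever larger ones. A short illustrative sketch, not taken from the paper:

```python
# Bottom-up mergesort: base subproblems are runs of length 1, and solutions to
# larger subproblems are built by combining adjacent already-solved runs.

def merge(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def bottom_up_mergesort(items):
    runs = [[x] for x in items]                          # smallest subproblems
    while len(runs) > 1:
        merged = []
        for k in range(0, len(runs) - 1, 2):
            merged.append(merge(runs[k], runs[k + 1]))   # combine adjacent solutions
        if len(runs) % 2:
            merged.append(runs[-1])
        runs = merged
    return runs[0] if runs else []

print(bottom_up_mergesort([5, 1, 4, 2, 3]))   # [1, 2, 3, 4, 5]
```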
2020
In the unit-cost comparison model, a black box takes two items as input and outputs the result of the comparison. Problems like sorting and searching have been studied in this model, and it has been generalized to include the concept of priced information, where different pairs of items (say database records) have different comparison costs. These comparison costs can be arbitrary (in which case no algorithm can be close to optimal (Charikar et al. STOC 2000)), structured (for example, the comparison cost may depend on the length of the databases (Gupta et al. FOCS 2001)), or stochastic (Angelov et al. LATIN 2008). Motivated by the database setting where the cost depends on the sizes of the items, we consider the problems of sorting and batched predecessor where two non-uniform sets of items $A$ and $B$ are given as input. (1) In the RAM setting, we consider the scenario where both sets have $n$ keys each. The cost to compare two items in $A$ is $a$, to compare an item of $A$ to an ...
1986
The time complexity of sorting n elements using p ≥ n processors on Valiant's parallel comparison tree model is considered. The following results are obtained. 1. We show that this time complexity is Θ(log n / log(1 + p/n)). This complements the AKS sorting network in settling the wider problem of comparison sorting of n elements by p processors, where the problem for p ≤ n was resolved. To prove the lower bound, we show that to achieve time k ≤ log n, we need Ω(kn^(1+1/k)) comparisons. Haggkvist and Hell proved a similar result only for fixed k. 2. For every fixed time k, we show that: (a) Ω(n^(1+1/k) log^(1/k) n) comparisons are required (O(n^(1+1/k) log n) are known to be sufficient in this case), and (b) there exists a randomized algorithm for comparison sort in time k with an expected number of O(n^(1+1/k)) comparisons. This implies that for every fixed k, any deterministic comparison sort algorithm must be asymptotically worse than this randomized algorithm. The lower bound improves on Haggkvist and Hell's lower bound. 3. We show that "approximate sorting" in time 1 requires asymptotically more than n log n processors. This settles a problem raised by M. Rabin.
2021
In this paper, we consider decision trees that use both queries based on one attribute each and queries based on hypotheses about the values of all attributes. Such decision trees are similar to those studied in exact learning, where not only membership but also equivalence queries are allowed. For n = 3, ..., 6, we compare decision trees based on various combinations of attributes and hypotheses for sorting n pairwise different elements from a linearly ordered set.
Journal of Discrete Algorithms, 2015
We revisit the classical algorithms for searching over sorted sets to introduce an algorithm refinement, called Adaptive Search, that combines the good features of Interpolation search with those of Binary search. Compared to Interpolation search, only a constant number of extra comparisons is introduced. Yet, under diverse input data distributions our algorithm shows costs comparable to those of Interpolation search, i.e., O(log log n), while the worst-case cost is always O(log n), as with Binary search. On benchmarks drawn from large datasets, both synthetic and real-life, Adaptive Search achieves better times and fewer memory accesses than even Santoro and Sidney's Interpolation-Binary search.
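A hedged sketch of the general idea of combining the two searches (not necessarily the exact Adaptive Search of the paper, and assuming numeric keys): probe the interpolated position, but also probe the binary-search midpoint so the search bracket at least halves every round, keeping the worst case O(log n) while uniform-like data still benefits from interpolation.

```python
# Interpolation probe plus binary-search midpoint as a safeguard: each round the
# bracket shrinks by at least half, so the worst case stays O(log n).

def hybrid_search(arr, target):
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= target <= arr[hi]:
        if arr[hi] == arr[lo]:
            probe = lo
        else:
            # interpolation guess; stays inside [lo, hi] because target is bracketed
            probe = lo + (target - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        mid = (lo + hi) // 2
        for idx in (probe, mid):          # interpolation probe, then the safeguard
            if arr[idx] == target:
                return idx
            if arr[idx] < target:
                lo = max(lo, idx + 1)
            else:
                hi = min(hi, idx - 1)
    return -1

data = [1, 3, 3, 7, 12, 19, 24, 31, 42]
print(hybrid_search(data, 19))   # 5
print(hybrid_search(data, 20))   # -1
```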
Bell System Technical Journal, 1983
In this paper, we introduce two new kinds of biased search trees: biased a, b trees and pseudo-weight-balanced trees. A biased search tree is a data structure for storing a sorted set in which the access time for an item depends on its estimated access frequency in such a way that the average access time is small. Bent, Sleator, and Tarjan were the first to describe classes of biased search trees that are easy to update; such trees have applications not only in efficient table storage but also in various network optimization algorithms. Our biased a, b trees generalize the biased 2, b trees of Bent, Sleator, and Tarjan. They provide a biased generalization of B-trees and are suitable for use in paged external memory, whereas previous kinds of biased trees are suitable for internal memory. Our pseudo-weight-balanced trees are a biased version of weight-balanced trees much simpler than Bent's version. Weight balance is the natural kind of balance to use in designing biased trees; pseudo-weight-balanced trees are especially easy to implement and analyze. The following problem, which we shall call the dictionary problem, occurs frequently in computer science. Given a totally ordered universe U, we wish to maintain one or more subsets of U under the following operations, where R and S denote any subsets of U and i denotes any item in U: access(i, S): if item i is in S, return a pointer to its location; otherwise, return a special null pointer.