2005
In this paper we consider the use of variable-length non-prefix-free codes for coding constrained sequences of symbols. We assume a Markov source in which some state transitions are impossible, i.e., the stochastic matrix associated with the Markov chain has some null entries. We show that the classic Kraft inequality is not, in general, a necessary condition for unique decodability under this hypothesis, and we propose a relaxed necessary inequality condition. This allows, in some cases, the use of non-prefix-free codes that can give very good performance in terms of both compression and computational efficiency. Some considerations are made on the relation between the proposed approach and other existing coding paradigms.
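The Kraft inequality that the abstract shows to be non-necessary for constrained sources is easy to state concretely. A minimal Python sketch of the classic unconstrained check (a generic illustration, not code from the paper):

```python
from fractions import Fraction

def kraft_sum(lengths, r=2):
    """Kraft sum  sum_i r^(-l_i)  for codeword lengths l_i over an r-ary code alphabet.
    A uniquely decodable unconstrained code requires this sum to be at most 1."""
    return sum(Fraction(1, r ** l) for l in lengths)

# The prefix-free binary code {0, 10, 110, 111} meets the bound with equality.
print(kraft_sum([1, 2, 3, 3]))   # -> 1
# Lengths (1, 1, 2) violate the inequality: no uniquely decodable
# unconstrained binary code with these lengths exists.
print(kraft_sum([1, 1, 2]) > 1)  # -> True
```

The paper's point is that when some source transitions are forbidden, codes whose lengths violate this bound can still be uniquely decodable.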
Lecture Notes in Computer Science, 2016
For many kinds of prefix-free codes there are efficient and compact alternatives to the traditional tree-based representation. Since these put the codes into canonical form, however, they can only be used when we can choose the order in which codewords are assigned to characters. In this paper we first show how, given a probability distribution over an alphabet of σ characters, we can store a nearly optimal alphabetic prefix-free code in o(σ) bits such that we can encode and decode any character in constant time. We then consider a kind of code introduced recently to reduce the space usage of wavelet matrices (Claude, Navarro, and Ordóñez, Information Systems, 2015). They showed how to build an optimal prefix-free code such that the codewords' lengths are non-decreasing when they are arranged such that their reverses are in lexicographic order. We show how to store such a code in O(σ log L + 2^(εL)) bits, where L is the maximum codeword length and ε is any positive constant, such that we can encode and decode any character in constant time under reasonable assumptions. Otherwise, we can always encode and decode a codeword of ℓ bits in time O(ℓ) using O(σ log L) bits of space.
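The canonical form mentioned in the first sentence can be made concrete: given only the multiset of codeword lengths, codewords are assigned in numeric order, so the whole code is recoverable from the lengths alone. A small Python sketch of the generic construction (an illustration of canonical coding in general, not the paper's o(σ)-bit data structure):

```python
def canonical_codes(lengths):
    """Assign canonical prefix-free codewords (as bit strings) to characters,
    given the codeword length of each character: shorter codewords come
    numerically first, and equal-length codewords are consecutive integers."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    codes = [None] * len(lengths)
    code, prev_len = 0, 0
    for i in order:
        code <<= (lengths[i] - prev_len)   # extend to the new length
        codes[i] = format(code, '0{}b'.format(lengths[i]))
        code += 1
        prev_len = lengths[i]
    return codes

print(canonical_codes([2, 1, 3, 3]))   # -> ['10', '0', '110', '111']
```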
IEEE Transactions on Information Theory, 1996
A general class of sequential codes for lossless compression of individual sequences on a finite alphabet is defined, including many types of codes that one would want to implement. The principal requirement for membership in the class is that the encoding and decoding operations be performable on a computer. The OPTA function for the class of codes is then considered, which is the function that assigns to each individual sequence the infimum of the rates at which the sequence can be compressed over this class of sequential codes. Two results about the OPTA function are obtained: 1) it is shown that any sequential code in the class compresses some individual sequence at a rate strictly greater than the rate for that sequence given by the OPTA function; and 2) it is shown that the OPTA function takes a value strictly greater than that of the Kolmogorov complexity rate function for some individual sequences.
IEEE Access
We consider the construction of capacity-approaching variable-length constrained sequence codes based on multi-state encoders that permit state-independent decoding. Based on the finite state machine description of the constraint, we first select the principal states and establish the minimal sets. By performing partial extensions and normalized geometric Huffman coding, efficient codebooks that enable state-independent decoding are obtained. We then extend this multi-state approach to a construction technique based on n-step FSMs. We demonstrate the usefulness of this approach by constructing capacity-approaching variable-length constrained sequence codes with improved efficiency and/or reduced implementation complexity to satisfy a variety of constraints, including the runlength-limited (RLL) constraint, the DC-free constraint, and the DC-free RLL constraint, with an emphasis on their application in visible light communications.
INDEX TERMS: Constrained sequence codes, variable-length codes, capacity-approaching codes, multi-state codes, state-independent decoding, visible light communication.
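As a concrete anchor for the RLL constraint discussed above, here is a small Python checker for the (d, k) runlength-limited property on a finite binary word (a standard textbook definition, not code from the paper; the convention that boundary runs need only obey the upper bound is an assumption):

```python
def satisfies_rll(bits, d, k):
    """Check the (d, k) runlength-limited constraint on a binary string:
    every run of 0s between consecutive 1s has length >= d and <= k.
    Leading/trailing runs of 0s are only held to the upper bound k."""
    runs = [len(r) for r in bits.split('1')]
    if any(n > k for n in runs):          # no run of 0s longer than k anywhere
        return False
    return all(n >= d for n in runs[1:-1])  # interior runs must also reach d

print(satisfies_rll('10010', 1, 2))    # -> True  (runs of 0s: 2 and 1)
print(satisfies_rll('100010', 1, 2))   # -> False (a run of three 0s exceeds k=2)
```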
2005
In this paper we discuss the problem of constructing minimum-cost, prefix-free codes for equiprobable words under the assumption that all codewords are restricted to belong to an arbitrary language L, and extend the classes of languages to which L can belong. Note: this extended abstract is essentially the version which appears in the proceedings of WADS '05, but with extra diagrams added.
2008
In this paper we propose a revisitation of the topic of unique decodability and of some of the related fundamental theorems. It is widely believed that, for any discrete source X, every "uniquely decodable" block code satisfies E[l(X1, X2, …, Xn)] ≥ H(X1, X2, …, Xn), where X1, X2, …, Xn are the first n symbols of the source, E[l(X1, X2, …, Xn)] is the expected length of the code for those symbols, and H(X1, X2, …, Xn) is their joint entropy. We show that, for certain sources with memory, the above inequality only holds if a limiting definition of "uniquely decodable code" is considered. In particular, the above inequality is usually assumed to hold for any "practical code" due to a debatable application of McMillan's theorem to sources with memory. We thus propose a clarification of the topic, also providing extended versions of McMillan's theorem and of the Sardinas–Patterson test to be used for Markovian sources. We conclude with the following interesting remark: both McMillan's original theorem and ours are equivalent to Shannon's theorem on the capacity of noiseless channels.
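The Sardinas–Patterson test referred to above, in its classical (non-Markovian) form, can be sketched as follows; this is the textbook algorithm, not the extended version the paper proposes:

```python
def is_uniquely_decodable(code):
    """Classical Sardinas-Patterson test: a finite set of codewords is
    uniquely decodable iff no set of 'dangling suffixes' ever contains
    a codeword. Iterates until a repeat or an empty set is reached."""
    C = set(code)
    if len(C) != len(code) or '' in C:
        return False

    def dangling(A, B):
        # Suffixes left over when a word of A is a proper prefix of a word of B.
        return {b[len(a):] for a in A for b in B
                if b.startswith(a) and len(b) > len(a)}

    S = dangling(C, C)
    seen = set()
    while S and frozenset(S) not in seen:
        if S & C:                    # a dangling suffix is itself a codeword
            return False
        seen.add(frozenset(S))
        S = dangling(C, S) | dangling(S, C)
    return True

print(is_uniquely_decodable(['0', '01', '11']))   # -> True  (a suffix code)
print(is_uniquely_decodable(['0', '01', '10']))   # -> False ('010' parses two ways)
```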
arXiv: Information Theory, 2015
We propose almost instantaneous fixed-to-variable-length (AIFV) codes such that two (resp. K − 1) code trees are used if code symbols are binary (resp. K-ary for K ≥ 3), and source symbols are assigned to incomplete internal nodes in addition to leaves. Although the AIFV codes are not instantaneous codes, they are devised such that the decoding delay is at most two bits (resp. one code symbol) in the case of a binary (resp. K-ary) code alphabet. The AIFV code can attain a better average compression rate than the Huffman code at the expense of a little decoding delay and a slightly larger memory to store multiple code trees. We also show for the binary and ternary AIFV codes that the optimal AIFV code can be obtained by solving 0-1 integer programming problems.
Index terms: AIFV code, Huffman code, FV code, code tree, Kraft inequality, integer programming.
I. INTRODUCTION. Lossless source codes are classified into fixed-to-variable-length (FV) codes and variable-to-fixed-length (VF) codes, which can be represented by code trees and parse trees, respectively. It is well known that Huffman coding [1] and Tunstall coding [2] attain the best compression rate among FV codes and VF codes, respectively, for stationary memoryless sources if a single code tree or a single parse tree is used. However, Yamamoto and Yokoo [3] showed that AIVF (almost instantaneous variable-to-fixed-length) coding can attain a better compression rate than Tunstall coding. An AIVF code uses |X| − 1 parse trees for a source alphabet X, and codewords are assigned to incomplete internal nodes in addition to leaves in each parse tree. Although instantaneous encoding is not possible since incomplete internal nodes are used for encoding, the AIVF code is devised such that the encoding delay is at most one source symbol, and hence the code is called almost instantaneous. Furthermore, Yoshida and Kida [4][5] showed that any AIVF code can be encoded and decoded with a single virtual multiple parse tree, and that the total number of nodes can be considerably reduced by the integration. In the case of FV codes, it is well known by the Kraft and McMillan theorems [6][7][8] that any uniquely decodable FV code must satisfy Kraft's inequality, and such a code can be realized by an instantaneous FV
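For reference, the single-code-tree Huffman baseline that AIFV codes are compared against can be sketched as follows: a generic Python implementation computing the optimal codeword lengths (not code from the paper):

```python
import heapq
from itertools import count

def huffman_lengths(probs):
    """Codeword lengths of an optimal binary Huffman code for the given
    probabilities (lengths only, not a specific codeword assignment)."""
    tie = count()                       # tie-breaker so tuples never compare lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)  # two least probable subtrees
        p2, _, b = heapq.heappop(heap)
        for i in a + b:
            lengths[i] += 1             # merged symbols go one level deeper
        heapq.heappush(heap, (p1 + p2, next(tie), a + b))
    return lengths

print(huffman_lengths([0.5, 0.25, 0.125, 0.125]))   # -> [1, 2, 3, 3]
```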
IEEE Transactions on Information Theory, 1995
We present a new technique to construct sliding-block modulation codes with a small decoding window. Our method, which involves both state splitting and look-ahead encoding, crucially depends on a new, "local" construction method for bounded-delay codes. We apply our method to construct several new codes, all with a smaller decoding window than previously known codes for the same constraints at the same rate.
IEEE Transactions on Information Theory, 1989
IEEE Transactions on Communications, 2008
We investigate the complexity of joint source-channel maximum a posteriori (MAP) decoding of a Markov sequence which is first encoded by a source code, then encoded by a convolutional code, and sent through a noisy memoryless channel. As established previously, the MAP decoding can be performed by a Viterbi-like algorithm on a trellis whose states are triples of the states of the Markov source, source coder and convolutional coder. The large size of the product space (on the order of K·2^N, where K is the number of source symbols and N is the number of states of the convolutional coder) appears to prohibit such a scheme. We show that for finite impulse response convolutional codes, the state space size of joint source-channel decoding can be reduced to O(K^2 + N log N); hence the decoding time becomes O(TK^2 + TN log N), where T is the length in bits of the decoded bitstream. We further prove that an additional complexity reduction can be achieved when K > N if the logarithms of the source transition probabilities satisfy the so-called Monge property. This decrease becomes more significant as the tree structure of the source code is more unbalanced. The reduction factor ranges between O(K/N) (for a fixed-length source code) and O(K/log N) (for a Golomb–Rice code).
IEEE Communications Letters, 2017
We present the construction of interleaving arrays for correcting clusters as well as diffuse bursts of insertion or deletion errors in constrained data. In this construction, a constrained information sequence is systematically encoded by computing a small number of parity checks and inserting markers such that the resulting code word is also constrained. Insertions and deletions lead to a shift between successive markers which can thus be detected and recovered using the parity checks. In this paper, as an example, the scheme is developed for Manchester-encoded information sequences.
IEEE Transactions on Information Theory, 2006
This paper presents prefix codes which minimize various criteria constructed as a convex combination of maximum codeword length and average codeword length, or of maximum redundancy and average redundancy, including a convex combination of the average of an exponential function of the codeword length and the average redundancy. This framework encompasses as special cases several criteria previously investigated in the literature, and relations to universal coding are discussed. The derived coding algorithm is parametric, re-adjusting the initial source probabilities via a weighted probability vector according to a merging rule. The level of desirable merging has implications for applications where the maximum codeword length is bounded.
Variable-length codes can provide compression for data communication. Such codes may be used not only when the source statistics are known but also when we do not know the source probability distribution, and a source with equal symbol probabilities (equiprobable symbols) can or has to be assumed. This paper presents variable-length codes with codewords that differ in length by at most one code symbol. Such codes suit the efficient encoding of sources with equiprobable symbols. We accommodate non-binary codes and present an iterative algorithm for the construction of such codes. We also calculate the average codeword length for such codes, which extends Krichevski's result for binary codes [5]. Finally, we propose a scheme that allows the code to be communicated efficiently from transmitter to receiver.
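The binary case of such codes, where codeword lengths differ by at most one bit, is often called truncated binary or phased-in coding. A minimal Python sketch of that standard construction (the paper's non-binary iterative algorithm is not reproduced here):

```python
def phased_in_code(n):
    """Binary codewords for n equiprobable symbols whose lengths differ by
    at most one bit (truncated binary / phased-in coding). With
    k = floor(log2 n), the first 2^(k+1) - n symbols get k bits and the
    rest get k + 1 bits; the result is prefix-free."""
    k = n.bit_length() - 1               # floor(log2 n)
    if n == 1 << k:                      # power of two: plain k-bit code
        return [format(i, '0{}b'.format(k)) for i in range(n)]
    u = (1 << (k + 1)) - n               # count of short (k-bit) codewords
    codes = [format(i, '0{}b'.format(k)) for i in range(u)]
    codes += [format(i + u, '0{}b'.format(k + 1)) for i in range(u, n)]
    return codes

print(phased_in_code(5))   # -> ['00', '01', '10', '110', '111']
```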
Designs, Codes and Cryptography, 2014
In this paper we give a partial shift version of user-irrepressible sequence sets and conflict-avoiding codes. By means of disjoint difference sets, we obtain an infinite number of such user-irrepressible sequence sets whose lengths are shorter than known results in general. Subsequently, the newly defined partially conflict-avoiding codes are discussed.
Information and Control, 1976
In this note, starting from the notion of "unambiguity" of the product of two subsets of a free monoid, two preliminary characterizations of "variable-length codes" are given. By making use of some auxiliary lemmas, from the first we derive in a very simple manner the Sardinas–Patterson theorem on the unique decipherability of coded messages. From the second we obtain a new proof of the Schützenberger theorem characterizing the free submonoids of a free monoid.
2007
We offer novel algorithms for efficient encoding/decoding of variable-to-fixed-length codes, requiring at most a quadratic amount of space, O(L^2), where L is the depth of a coding tree. This is a major improvement compared to the exponential O(2^L) space usage of conventional techniques that keep complete representations of coding trees in the computer's memory. These savings are achieved by utilizing algebraic properties of VF coding trees constructed by the Tunstall or Khodak algorithms, and by employing combinatorial enumeration techniques for encoding/decoding of codewords. The encoding/decoding complexity of our algorithms is linear in the number of symbols they process. As a side product, we also derive an exact formula for the average redundancy of such codes under memoryless sources, and show its usefulness for the analysis and design of codes with a small number of codewords.
1 Definitions. Consider a memoryless source S producing symbols from an input alphabet A = {a1, . . . , am} (2 ≤ m <...
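The Tunstall construction that these coding trees come from can be sketched compactly: starting from the single-symbol parse words, repeatedly extend the most probable parse word by every source symbol until the dictionary budget is exhausted. A generic Python illustration (an assumption of the standard algorithm, not the paper's space-efficient representation):

```python
import heapq
from itertools import count

def tunstall_parse_set(probs, symbols, max_words):
    """Tunstall parse dictionary: greedily extend the most probable parse
    word by every source symbol while the word budget allows it. Each
    extension replaces one leaf by m leaves, adding m - 1 words."""
    tie = count()                                   # tie-breaker for equal probs
    heap = [(-p, next(tie), s) for s, p in zip(symbols, probs)]
    heapq.heapify(heap)                             # max-heap via negation
    m = len(symbols)
    n_words = m
    while n_words + m - 1 <= max_words:
        p, _, w = heapq.heappop(heap)               # most probable parse word
        for s, ps in zip(symbols, probs):
            heapq.heappush(heap, (p * ps, next(tie), w + s))
        n_words += m - 1
    return sorted(w for _, _, w in heap)

print(tunstall_parse_set([0.7, 0.3], 'ab', 4))   # -> ['aaa', 'aab', 'ab', 'b']
```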
Information Processing Letters, 2006
Static codes for the non-negative integers have a range of applications, including in the storage of inverted indexes, and in dictionary-based compression systems. We present two simple codes based on binary representations that are suited for message sequences in which the values are locally homogeneous, but not globally consistent. Experimental results based on typical sequences are given to show the efficacy of the new methods.
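A classic example of such a static integer code, likewise built from binary representations, is the Elias gamma code (a generic illustration for context, not one of the paper's two new codes):

```python
def elias_gamma(n):
    """Elias gamma code for a positive integer: the binary magnitude
    preceded by a unary count (zeros) of its length minus one."""
    assert n >= 1, "gamma codes positive integers only"
    b = bin(n)[2:]                     # binary representation, leading 1 first
    return '0' * (len(b) - 1) + b

print([elias_gamma(n) for n in range(1, 6)])
# -> ['1', '010', '011', '00100', '00101']
```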
IEEE Transactions on Information Theory, 2004
The compression performance of grammar-based codes is revisited from a new perspective. Previously, the compression performance of grammar-based codes was evaluated against that of the best arithmetic coding algorithm with finite contexts. In this correspondence, we first define semifinite-state sources and finite-order semi-Markov sources. Based on these definitions and the idea of run-length encoding (RLE), we then extend traditional RLE algorithms to context-based RLE algorithms: RLE algorithms with k contexts and RLE algorithms of order k, where k is a nonnegative integer. For each individual sequence, let r_k and r̂_k be the best compression rate given by RLE algorithms with k contexts and by RLE algorithms of order k, respectively. It is proved that for any k, r_k is no greater than the best compression rate among all arithmetic coding algorithms with k contexts. Furthermore, it is shown that there exist stationary, ergodic semi-Markov sources for which the best RLE algorithms without any context outperform the best arithmetic coding algorithms with any finite number of contexts. Finally, we show that the worst-case redundancies of grammar-based codes against r_k and r̂_k among all length-n individual sequences from a finite alphabet are upper-bounded by d1 log log n / log n and d2 log log n / log n, respectively, where d1 and d2 are constants. This redundancy result is stronger than all previous corresponding results.
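Plain run-length encoding, the building block that the correspondence generalizes to context-based variants, in a minimal Python sketch (a generic illustration, not the paper's algorithms):

```python
def rle(seq):
    """Plain run-length encoding: collapse each maximal run of a
    repeated symbol into a (symbol, run-length) pair."""
    out = []
    for s in seq:
        if out and out[-1][0] == s:
            out[-1][1] += 1            # extend the current run
        else:
            out.append([s, 1])         # start a new run
    return [(s, c) for s, c in out]

print(rle('aaabbaaaa'))   # -> [('a', 3), ('b', 2), ('a', 4)]
```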
IEEE Access
We study the ability of recently developed variable-length constrained sequence codes to determine codeword boundaries in the received sequence upon initial receipt of the sequence and if errors in the received sequence cause synchronization to be lost. We first investigate construction of these codes based on the finite state machine description of a given constraint, and develop new construction criteria to achieve high synchronization probabilities. Given these criteria, we propose a guided partial extension algorithm to construct variable-length constrained sequence codes with high synchronization probabilities. With this algorithm we construct new codes and determine the number of codewords and coded bits that are needed to recover synchronization once synchronization is lost. We consider a large variety of constraints including the runlength limited (RLL) constraint, the DC-free constraint, the Pearson constraint and constraints for inter-cell interference mitigation in flash memories. Simulation results show that the codes we construct exhibit excellent synchronization properties, often resynchronizing within a few bits.
IEEE Transactions on Information Theory, 2002
We offer two noiseless codes for blocks of integers X^n = (X_1, …, X_n). We provide explicit bounds on the relative redundancy that are valid for any distribution F in the class of memoryless sources with a possibly infinite alphabet whose marginal distribution is monotone. Specifically, we show that the expected code length L(X^n) of our first universal code is dominated by a linear function of the entropy of X^n. Further, we present a second universal code that is efficient in that its length is bounded by