Papers by Richard Fredlund

ArXiv, 2014
Huffman encoding is often improved by using block codes; for example, a 3-block code uses an alphabet consisting of every possible combination of three characters. We take the approach of starting with a base alphabet and expanding it to include frequently occurring aggregates of symbols. We prove that the change in compressed message length caused by the introduction of a new aggregate symbol can be expressed as the difference of two entropies, dependent only on the probabilities and length of the introduced symbol. The expression is independent of the probabilities of all other symbols in the alphabet. This measure of information gain for a new symbol can be applied in data compression methods. We also demonstrate, with a simple experiment, that aggregate symbol alphabets, as opposed to mutually exclusive alphabets, have the potential to provide good levels of compression. Finally, the compression gain defined in this paper may also be useful for feature selection.
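As an illustration of the kind of simple experiment the abstract mentions (a sketch of my own, not the paper's code: the greedy longest-match parse, the toy message, and the name `shannon_length` are all assumptions), the following Python snippet compares the ideal Shannon code length of a message under a single-character base alphabet with the same alphabet extended by one frequently occurring aggregate:

```python
from collections import Counter
from math import log2

def shannon_length(message, alphabet):
    """Parse `message` greedily (longest match first) into symbols drawn
    from `alphabet`, then return the ideal Shannon code length in bits:
    the sum over parsed symbols of -log2(empirical symbol probability)."""
    longest = max(len(s) for s in alphabet)
    symbols, i = [], 0
    while i < len(message):
        for k in range(min(longest, len(message) - i), 0, -1):
            if message[i:i + k] in alphabet:
                symbols.append(message[i:i + k])
                i += k
                break
        else:
            raise ValueError("message contains a symbol not in the alphabet")
    counts = Counter(symbols)
    total = sum(counts.values())
    return sum(c * -log2(c / total) for c in counts.values())

message = "the cat sat on the mat near the hat " * 50
base = set(message)            # mutually exclusive single-character alphabet
extended = base | {"the "}     # base alphabet plus one frequent aggregate

print(f"base alphabet:  {shannon_length(message, base):.0f} bits")
print(f"with aggregate: {shannon_length(message, extended):.0f} bits")
```

Because every occurrence of the aggregate replaces several base symbols with one, the parsed message shortens, and the total code length drops whenever the aggregate occurs often enough to offset the cost of its extra codeword.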
A Bayesian expected error reduction approach to Active Learning

We describe a Bayesian framework for active learning for non-separable data, which incorporates a query density to explicitly model how new data is to be sampled. The model makes no assumption of independence between queried data points; rather, it updates model parameters on the basis of both the observations and how those observations were sampled. A 'hypothetical' look-ahead is employed to evaluate the expected cost in the next time step. We show the efficacy of this algorithm on the probabilistic high-low game, a non-separable generalisation of the separable high-low game introduced by Seung et al. Our results indicate that the active Bayes algorithm performs significantly better than passive learning even when the overlap region is wide, covering over 30% of the feature space.
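The minimal sketch below shows the one-step 'hypothetical' look-ahead idea on a toy version of the probabilistic high-low game. It is a simplification under my own assumptions, not the paper's algorithm: it uses a plain grid posterior over the hidden threshold, a linear-ramp noise model for the overlap region, and a greedy expected-cost query rule, and it omits the paper's query density entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Probabilistic high-low game (my reading of the setup, not the paper's
# exact parameterisation): a hidden threshold theta on [0, 1], with noisy
# labels inside an overlap region of width OVERLAP centred on theta.
OVERLAP = 0.3
thetas = np.linspace(0, 1, 201)   # grid posterior over the threshold
grid = np.linspace(0, 1, 101)     # candidate queries / error-estimation points
theta_true = 0.62                 # hidden parameter the learner must find

def p_pos(x, theta):
    """P(y = 1 | x, theta): a linear ramp across the overlap region."""
    return np.clip((np.asarray(x) - theta) / OVERLAP + 0.5, 0.0, 1.0)

def update(post, x, y):
    """Bayes update of the grid posterior with one labelled point."""
    like = p_pos(x, thetas) if y == 1 else 1.0 - p_pos(x, thetas)
    post = post * like
    return post / post.sum()

def expected_error(post):
    """Expected misclassification rate of the posterior-predictive
    classifier, averaged over `grid` (the 'cost' to be reduced)."""
    P = p_pos(grid[None, :], thetas[:, None])  # P[t, g] = P(y=1 | g, theta_t)
    predictive = post @ P                      # posterior predictive at grid
    return np.minimum(predictive, 1.0 - predictive).mean()

post = np.full(len(thetas), 1.0 / len(thetas))
for step in range(10):
    # One-step hypothetical look-ahead: score each candidate query by the
    # expected cost after observing its (still unknown) label.
    scores = []
    for x in grid:
        p1 = float(post @ p_pos(x, thetas))
        scores.append(p1 * expected_error(update(post, x, 1))
                      + (1 - p1) * expected_error(update(post, x, 0)))
    x_q = grid[int(np.argmin(scores))]
    y_q = int(rng.random() < p_pos(x_q, theta_true))  # oracle answers
    post = update(post, x_q, y_q)
    print(f"step {step}: query x={x_q:.2f}, "
          f"expected error={expected_error(post):.4f}")
```

The look-ahead weighs each candidate query by its posterior-predictive label probabilities and picks the one whose expected post-update error is smallest, which is the "expected cost in the next time step" criterion described above.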
In coding theory it is widely known that the optimal encoding length for a given alphabet of symbol codes is the Shannon entropy times the number of symbols to be encoded. However, depending on the structure of the message, it is possible to beat this optimum by extending the alphabet with frequently occurring aggregates of symbols from the base alphabet. We prove that the change in compressed message length caused by the introduction of a new aggregate symbol can be expressed as the difference of two entropies, dependent only on the probabilities of the characters within the aggregate, plus a correction term involving only the probability and length of the introduced symbol. The expression is independent of the probabilities of all other symbols in the alphabet. This measure of information gain for a new symbol can be applied in data compression methods.
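For orientation, the baseline and the bookkeeping behind the claimed gain can be written out as follows. The notation N, \ell, n_s, and q is mine, not the paper's; the closed-form gain itself is the paper's result and is not reproduced here.

```latex
% Baseline (the Shannon bound cited above): a message of N symbols drawn
% from alphabet A with probabilities p_i costs at least
L_{\min} = N\,H(X) = -N \sum_{i \in A} p_i \log_2 p_i \ \text{bits}.

% Introducing an aggregate symbol s of length \ell that occurs n_s times
% replaces \ell base symbols with one at each occurrence, so the message
% re-parses into
N' = N - (\ell - 1)\,n_s \ \text{symbols},

% and the quantity being analysed is the change
\Delta L = N\,H(X) - N'\,H(X'),

% where X' is the symbol distribution over the extended alphabet. Per the
% abstract, \Delta L collapses to a difference of two entropies plus a
% correction term depending only on q = n_s / N' and \ell.
```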