1.
Encoding (Compression)
Step 1: Initialization
Start with an empty dictionary (or sliding window) to store
substrings.
Initialize the input data to be compressed.
Step 2: Search for Matches
Begin reading the input string character by character.
Check if the current substring exists in the dictionary:
o If a match is found:
Continue extending the substring until no match exists.
Record the match's position in the dictionary and its
length.
Add the next unmatched character (if available).
o If no match is found:
Record a match with a length of 0 and the current
character.
Add the new substring to the dictionary.
Step 3: Output Encoded Data
For each match, output a triplet (or tuple):
(position, length, character where:
o Position: The starting position of the match in the
dictionary.
o Length: The length of the matching substring.
o Character: The next character in the input string after the
match.
Step 4: Update the Dictionary
Add the current substring to the dictionary for future matches.
If using a sliding window (e.g., LZ77), remove older entries to limit
the dictionary size.
Example
Compress the string ABABABA:
At step 1: A → no match, output (0,0,A), add A to the dictionary.
At step 2: B → no match, output (0,0,B), add B to the dictionary.
At step 3: AB → match found, output (0,1,B)(0, 1, B)(0,1,B), add
AB to the dictionary.
At step 4: ABA → match found, output (1,2,A)(
Encoded output: (0,0,A),(0,0,B),(0,1,B),(1,2,A)
2. Decoding (Decompression)
Step 1: Initialization
Start with an empty dictionary to rebuild substrings.
Use the encoded triplets (position, length, character)
Step 2: Rebuild Substrings
For each triplet:
o Use the position and length to locate the matching substring
in the dictionary.
o Append the character to the matching substring.
o Add the resulting substring to the dictionary.
Step 3: Output the Reconstructed Data
Concatenate all reconstructed substrings to obtain the original
string.
Example
Decode the encoded sequence (0,0,A),(0,0,B),(0,1,B),(1,2,A) : No match,
output A, add A to the dictionary.
1. (0,0,B): No match, output B, add B to the dictionary.
2. (0,1,B): Match B, append B, output AB, add AB to the dictionary.
3. (1,2,A): Match AB, append A, output ABA, add ABA to the
dictionary.
Reconstructed output: ABABABA.