Title: Data Compression for Real Programmers
Author: Wayner P.
Abstract:
The science of compressing data is the art of creating shorthand representations for the data—that is, automatically finding abbreviations; i.e. yadda yadda yadda, etc.
All of the algorithms can be described with a simple phrase: Look for repetition, and replace the repetition with a shorter representation. This repetition is usually fairly easy to find. The letters "rep" are repeated eight times in this paragraph alone. If they were replaced with, say, the asterisk character (*), then two characters would be saved eight times. It's not much, but it's a start.
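As a rough illustration of the substitution described above, the following Python sketch swaps each "rep" for an asterisk and counts the savings. The function names `abbreviate` and `expand` and the sample sentence are assumptions made for this example, not code from the book. The sketch also assumes the marker does not already occur in the text, since decoding would otherwise be ambiguous.

```python
def abbreviate(text: str, pattern: str = "rep", marker: str = "*") -> str:
    """Replace every occurrence of `pattern` with the shorter `marker`."""
    if marker in text:
        # Assumption of this sketch: the marker must be unused in the input.
        raise ValueError("marker already occurs in the text")
    return text.replace(pattern, marker)


def expand(encoded: str, pattern: str = "rep", marker: str = "*") -> str:
    """Undo abbreviate() by restoring the original pattern."""
    return encoded.replace(marker, pattern)


sample = "Look for repetition, and replace the repetition with a shorter representation."
encoded = abbreviate(sample)
saved = len(sample) - len(encoded)

print(encoded)
print(f"occurrences: {sample.count('rep')}, characters saved: {saved}")

# The abbreviation is lossless: expanding gives back the original text.
assert expand(encoded) == sample
```

Each replacement saves two characters, so the total saving is twice the number of occurrences. On a sentence that never contains the pattern, the same model saves nothing.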
The algorithms succeed when they have a good model for the underlying data. They can even fail when the model does a bad job of matching the data. The model of looking for three letters like rep works well in some sentences, but it fails in others. The art of designing the algorithm is really the art of finding a good model of the data that can also be fit to the data efficiently.
The algorithms in this book are different attempts to find a good, automatic way of identifying repetitive patterns and removing them from a file. Some work well on text data, while others are tuned to images or audio files. All of them, however, are far from perfect. If an algorithm has a strength, then it will also have a weakness. The best algorithm for some data is often the worst for other types of data. To paraphrase Abraham Lincoln: You can compress all of the types of files some of the time and some of the types of files all of the time, but you can't compress all of the types of files all of the time.
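This trade-off can be observed with a general-purpose compressor. The sketch below uses Python's standard `zlib` module, which is not one of the book's own implementations, on two hypothetical inputs chosen for illustration: a byte string made of one repeated pattern, and pseudo-random bytes with no pattern to find. The variable names and sizes are assumptions of this example.

```python
import random
import zlib

# Hypothetical inputs chosen for illustration only.
repetitive = b"rep" * 2000            # 6000 bytes of one repeated pattern
rng = random.Random(0)                # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(6000))  # 6000 pseudo-random bytes

for label, data in (("repetitive", repetitive), ("noise", noise)):
    packed = zlib.compress(data)
    # Lossless compression must restore the input exactly.
    assert zlib.decompress(packed) == data
    print(f"{label}: {len(data)} -> {len(packed)} bytes")
```

The repetitive input shrinks to a small fraction of its size, while the pseudo-random input should come out slightly larger than it went in, because the compressed stream still has to carry its own header and checksum.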