Kuphatikizika kwa data pogwiritsa ntchito algorithm ya Huffman

kulowa

M'nkhaniyi ndilankhula za algorithm yodziwika bwino ya Huffman, komanso momwe imagwiritsidwira ntchito pakukanika kwa data.

Zotsatira zake, tidzalemba zolemba zosavuta. Izi zakambidwa kale Nkhani ya Habre, koma popanda kuchitapo kanthu. Nkhani zongopeka za positiyi zatengedwa kuchokera kumaphunziro a sayansi yamakompyuta akusukulu komanso buku la Robert Laforet "Data Structures and Algorithms in Java". Kotero, chirichonse chadulidwa!

Malingaliro ochepa

Mufayilo yanthawi zonse, chilembo chimodzi chimasungidwa ndi 8 bits (ASCII encoding) kapena 16 (Unicode encoding). Kenako tikambirana ma encoding a ASCII. Mwachitsanzo, tengani mzere s1 = “SUSIE AMATI NDI ZOsavuta”. Pali zilembo za 22 pamzerewu, mwachilengedwe, kuphatikiza mipata ndi mzere watsopano - 'n'. Fayilo yomwe ili ndi mzerewu idzalemera 22 * ​​8 = 176 bits. Funso limadzuka nthawi yomweyo: kodi ndizomveka kugwiritsa ntchito ma bits onse 8 kuti mulembe zilembo 1? Sitigwiritsa ntchito zilembo zonse za ASCII. Ngakhale atatero, zikanakhala zomveka kuti chilembo chofala kwambiri - S - apatsidwe kachidindo kakang'ono kwambiri, ndipo chilembo chosowa kwambiri - T (kapena U, kapena 'n') - kupatsidwa code yayitali. Izi ndi zomwe aligorivimu ya Huffman imakhala: ndikofunikira kupeza njira yabwino yolumikizira yomwe fayiloyo idzakhala ndi kulemera kochepa. Ndizomveka kuti kutalika kwa ma code kumasiyana pazizindikiro zosiyanasiyana - izi ndizomwe algorithm idakhazikitsidwa.

Coding

Bwanji osapereka chizindikiro cha 'S', mwachitsanzo, 1 pang'ono kutalika: 0 kapena 1. Chikhale 1. Kenako chilembo chachiwiri chodziwika bwino - ' ' (danga) - perekani 0. Tangoganizani kuti mwayamba kumasulira uthenga wanu - chingwe chosungidwa s1 - ndipo mukuwona kuti code imayamba ndi 1. Ndiye, mumachita chiyani: kodi ndi khalidwe S, kapena ndi khalidwe lina, mwachitsanzo A? Chifukwa chake, pali lamulo lofunikira:

Khodi iliyonse iyenera kukhala choyambirira cha ina

Lamulo ili ndilofunika kwambiri mu algorithm. Chifukwa chake, kupanga kachidindo kumayamba ndi tebulo lafupipafupi, lomwe limasonyeza mafupipafupi (chiwerengero cha zochitika) za chizindikiro chilichonse:

Kuphatikizika kwa data pogwiritsa ntchito algorithm ya Huffman Zilembo zomwe zimakonda kuchitika siziyenera kusungidwa zotheka chiwerengero cha bits. Ndipereka chitsanzo cha imodzi mwama tebulo omwe angatheke:

Kuphatikizika kwa data pogwiritsa ntchito algorithm ya Huffman Chifukwa chake uthenga wosungidwa udzawoneka motere:

10 01111 10 110 1111 00 10 010 1110 10 00 110 0110 00 110 10 00 1111 010 10 1110 01110

Ndinalekanitsa code ya munthu aliyense ndi danga. Izi sizichitika mu fayilo yothinikizidwa!
Funso likubuka: kodi mnyamata uyu adabwera bwanji ndi code kuti apange tebulo la zizindikiro? Izi zidzakambidwa pansipa.

Kupanga mtengo wa Huffman

Apa ndi pamene mitengo yosakira bayinare imabwera kudzapulumutsa. Osadandaula, simudzafunika kufufuza, kuyika, kapena kufufuta njira pano. Nayi kapangidwe ka mtengo mu java:

public class Node {
    private int frequence;
    private char letter;
    private Node leftChild;
    private Node rightChild;
    ...
}

class BinaryTree {
    private Node root;

    public BinaryTree() {
        root = new Node();
    }
    public BinaryTree(Node root) {
        this.root = root;
    }
    ...
}

Iyi si code yonse, code yonse idzakhala pansipa.

Nayi algorithm yopangira mtengo:

  1. Pangani chinthu cha Node kwa munthu aliyense kuchokera mu uthenga (mzere s1). Kwa ife padzakhala mfundo 9 (Node zinthu). Node iliyonse imakhala ndi magawo awiri a data: chizindikiro ndi ma frequency
  2. Pangani chinthu cha Mtengo (BinaryTree) pa Node iliyonse. Node imakhala muzu wa mtengo.
  3. Ikani mitengoyi pamzere wofunika kwambiri. Kutsika kwafupipafupi, kumakwera patsogolo. Chifukwa chake, pochotsa, dervo yokhala ndi ma frequency otsika kwambiri imasankhidwa nthawi zonse.

Kenako muyenera kuchita izi cyclically:

  1. Chotsani mitengo iwiri pamzere wofunikira ndikuwapanga kukhala ana a node yatsopano (mfundo yongopangidwa kumene popanda chilembo). Mafupipafupi a node yatsopano ndi ofanana ndi kuchuluka kwa ma frequency a mitengo iwiri yobadwa.
  2. Pa mfundo iyi, pangani mtengo wokhala ndi muzu pa mfundo iyi. Lowetsani mtengo uwu mumzere wofunikira. (Popeza mtengowo uli ndi ma frequency atsopano, umawonekera pamalo atsopano pamzere)
  3. Pitirizani masitepe 1 ndi 2 mpaka mutatsala mtengo umodzi wokha pamzere - mtengo wa Huffman

Ganizirani za algorithm iyi pamzere s1:

Kuphatikizika kwa data pogwiritsa ntchito algorithm ya Huffman

Apa chizindikiro "lf" (linefeed) chimatanthauza mzere watsopano, "sp" (danga) ndi danga.

Chotsatira ndi chiyani?

Ife tiri nawo mtengo wa Huffman. CHABWINO. Ndipo chochita nacho chiyani? Sangatenge ngakhale kwaulere. Kenako, muyenera kutsatira njira zonse zomwe zingatheke kuchokera ku muzu mpaka masamba a mtengowo. Tiyeni tigwirizane kutanthauza m'mphepete 0 ngati imatsogolera kwa mwana wakumanzere ndi 1 ngati ikupita kumanja. Kunena zowona, m'mawu awa, chizindikiro cha chizindikiro ndi njira yochokera ku muzu wa mtengo kupita kutsamba lomwe lili ndi chizindikiro chomwechi.

Kuphatikizika kwa data pogwiritsa ntchito algorithm ya Huffman

Umu ndi momwe tebulo la ma code zidakhalira. Dziwani kuti ngati tilingalira tebulo ili, tikhoza kunena za "kulemera" kwa chizindikiro chilichonse - ichi ndi kutalika kwa code yake. Kenako, mu mawonekedwe oponderezedwa, fayilo yoyambirira idzalemera: 2 * 3 + 2*4 + 3 * 3 + 6 * 2 + 1 * 4 + 1 * 5 + 2 * 4 + 4 * 2 + 1 * 5 = 65 bits. . Poyamba inkalemera ma bits 176. Chifukwa chake, tidachepetsa mpaka 176/65 = 2.7 nthawi! Koma izi ndi utopia. Coefficient yotereyi sizingatheke kupezedwa. Chifukwa chiyani? Izi zidzakambidwa pambuyo pake.

Decoding

Chabwino, mwina chinthu chophweka chomwe chatsala ndikujambula. Ndikuganiza kuti ambiri a inu mwaganiza kuti sitingangopanga fayilo yothinikizidwa popanda lingaliro lililonse la momwe idasungidwira - sitingathe kuyiyika! Inde, zinali zovuta kwa ine kuzindikira izi, koma ndiyenera kupanga fayilo yalemba table.txt yokhala ndi tebulo lopondereza:

01110
 00
A010
E1111
I110
S10
T0110
U01111
Y1110

Cholembera patebulo mu 'symbol' 'character code'. Chifukwa chiyani 01110 ilibe chizindikiro? M'malo mwake, ili ndi chizindikiro, kungoti zida za java zomwe ndimagwiritsa ntchito potulutsa fayilo, mawonekedwe atsopano - 'n' - amasinthidwa kukhala mzere watsopano (zingakhale zopusa bwanji). Choncho, mzere wopanda kanthu pamwamba ndi khalidwe la code 01110. Kwa code 00, khalidwe ndi danga kumayambiriro kwa mzere. Ndidzanena nthawi yomweyo kuti kwa chigawo chathu cha Khan, njira iyi yosungira tebulo inganene kuti ndiyopanda nzeru kwambiri. Koma n’zosavuta kuzimvetsa komanso kuzitsatira. Ndidzakhala wokondwa kumva malingaliro anu mu ndemanga zokhudzana ndi kukhathamiritsa.

Kukhala ndi tebulo ili kumapangitsa kuti zikhale zosavuta kuzizindikira. Tiyeni tikumbukire lamulo lomwe tidatsatira popanga encoding:

Khodi iliyonse iyenera kukhala choyambirira cha ina

Apa ndi pamene imagwira ntchito yotsogolera. Timawerenga motsatizana pang'ono ndi pang'ono ndipo, mwamsanga pamene chingwe chotsatira d, chokhala ndi ma bits owerengeka, chikufanana ndi encoding yomwe ikugwirizana ndi khalidwe, timadziwa nthawi yomweyo kuti khalidwe (ndipo lokha!) Kenako, timalemba zilembo pamzere wotsitsa (mzere womwe uli ndi uthenga wosinthidwa), sinthaninso mzere d, kenako ndikuwerenga fayilo yosungidwa.

Реализация

Yakwana nthawi yochititsa manyazi code yanga ndikulemba zolemba zakale. Tiyeni tizitcha Compressor.

Yambaninso. Choyamba, timalemba kalasi ya Node:

public class Node {
    private int frequence;//частота
    private char letter;//буква
    private Node leftChild;//левый потомок
    private Node rightChild;//правый потомок

   

    public Node(char letter, int frequence) { //собственно, конструктор
        this.letter = letter;
        this.frequence = frequence;
    }

    public Node() {}//перегрузка конструтора для безымянных узлов(см. выше в разделе о построении дерева Хаффмана)
    public void addChild(Node newNode) {//добавить потомка
        if (leftChild == null)//если левый пустой=> правый тоже=> добавляем в левый
            leftChild = newNode;
        else {
            if (leftChild.getFrequence() <= newNode.getFrequence()) //в общем, левым потомком
                rightChild = newNode;//станет тот, у кого меньше частота
            else {
                rightChild = leftChild;
                leftChild = newNode;
            }
        }

        frequence += newNode.getFrequence();//итоговая частота
    }

    public Node getLeftChild() {
        return leftChild;
    }

    public Node getRightChild() {
        return rightChild;
    }

    public int getFrequence() {
        return frequence;
    }

    public char getLetter() {
        return letter;
    }

    public boolean isLeaf() {//проверка на лист
        return leftChild == null && rightChild == null;
    }
}

Tsopano mtengo:

class BinaryTree {
    private Node root;

    public BinaryTree() {
        root = new Node();
    }

    public BinaryTree(Node root) {
        this.root = root;
    }

    public int getFrequence() {
        return root.getFrequence();
    }

    public Node getRoot() {
        return root;
    }
}

Mzere Wofunika Kwambiri:

import java.util.ArrayList;//да-да, очередь будет на базе списка

class PriorityQueue {
    private ArrayList<BinaryTree> data;//список очереди
    private int nElems;//кол-во элементов в очереди

    public PriorityQueue() {
        data = new ArrayList<BinaryTree>();
        nElems = 0;
    }

    public void insert(BinaryTree newTree) {//вставка
        if (nElems == 0)
            data.add(newTree);
        else {
            for (int i = 0; i < nElems; i++) {
                if (data.get(i).getFrequence() > newTree.getFrequence()) {//если частота вставляемого дерева меньше 
                    data.add(i, newTree);//чем част. текущего, то cдвигаем все деревья на позициях справа на 1 ячейку                   
                    break;//затем ставим новое дерево на позицию текущего
                }
                if (i == nElems - 1) 
                    data.add(newTree);
            }
        }
        nElems++;//увеличиваем кол-во элементов на 1
    }

    public BinaryTree remove() {//удаление из очереди
        BinaryTree tmp = data.get(0);//копируем удаляемый элемент
        data.remove(0);//собственно, удаляем
        nElems--;//уменьшаем кол-во элементов на 1
        return tmp;//возвращаем удаленный элемент(элемент с наименьшей частотой)
    }
}

Kalasi yomwe imapanga mtengo wa Huffman:

public class HuffmanTree {
    private final byte ENCODING_TABLE_SIZE = 127;//длина кодировочной таблицы
    private String myString;//сообщение
    private BinaryTree huffmanTree;//дерево Хаффмана
    private int[] freqArray;//частотная таблица
    private String[] encodingArray;//кодировочная таблица


    //----------------constructor----------------------
    public HuffmanTree(String newString) {
        myString = newString;

        freqArray = new int[ENCODING_TABLE_SIZE];
        fillFrequenceArray();

        huffmanTree = getHuffmanTree();

        encodingArray = new String[ENCODING_TABLE_SIZE];
        fillEncodingArray(huffmanTree.getRoot(), "", "");
    }

    //--------------------frequence array------------------------
    private void fillFrequenceArray() {
        for (int i = 0; i < myString.length(); i++) {
            freqArray[(int)myString.charAt(i)]++;
        }
    }

    public int[] getFrequenceArray() {
        return freqArray;
    }

    //------------------------huffman tree creation------------------
    private BinaryTree getHuffmanTree() {
        PriorityQueue pq = new PriorityQueue();
        //алгоритм описан выше
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {//если символ существует в строке
                Node newNode = new Node((char) i, freqArray[i]);//то создать для него Node
                BinaryTree newTree = new BinaryTree(newNode);//а для Node создать BinaryTree
                pq.insert(newTree);//вставить в очередь
            }
        }

        while (true) {
            BinaryTree tree1 = pq.remove();//извлечь из очереди первое дерево.

            try {
                BinaryTree tree2 = pq.remove();//извлечь из очереди второе дерево

                Node newNode = new Node();//создать новый Node
                newNode.addChild(tree1.getRoot());//сделать его потомками два извлеченных дерева
                newNode.addChild(tree2.getRoot());

                pq.insert(new BinaryTree(newNode);
            } catch (IndexOutOfBoundsException e) {//осталось одно дерево в очереди
                return tree1;
            }
        }
    }

    public BinaryTree getTree() {
        return huffmanTree;
    }

    //-------------------encoding array------------------
    void fillEncodingArray(Node node, String codeBefore, String direction) {//заполнить кодировочную таблицу
        if (node.isLeaf()) {
            encodingArray[(int)node.getLetter()] = codeBefore + direction;
        } else {
            fillEncodingArray(node.getLeftChild(), codeBefore + direction, "0");
            fillEncodingArray(node.getRightChild(), codeBefore + direction, "1");
        }
    }

    String[] getEncodingArray() {
        return encodingArray;
    }

    public void displayEncodingArray() {//для отладки
        fillEncodingArray(huffmanTree.getRoot(), "", "");

        System.out.println("======================Encoding table====================");
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {
                System.out.print((char)i + " ");
                System.out.println(encodingArray[i]);
            }
        }
        System.out.println("========================================================");
    }
    //-----------------------------------------------------
    String getOriginalString() {
        return myString;
    }
}

Kalasi yomwe ili ndi ma encodes/decode:

public class HuffmanOperator {
    private final byte ENCODING_TABLE_SIZE = 127;//длина таблицы
    private HuffmanTree mainHuffmanTree;//дерево Хаффмана (используется только для сжатия)
    private String myString;//исходное сообщение
    private int[] freqArray;//частотаная таблица
    private String[] encodingArray;//кодировочная таблица
    private double ratio;//коэффициент сжатия 


    public HuffmanOperator(HuffmanTree MainHuffmanTree) {//for compress
        this.mainHuffmanTree = MainHuffmanTree;

        myString = mainHuffmanTree.getOriginalString();

        encodingArray = mainHuffmanTree.getEncodingArray();

        freqArray = mainHuffmanTree.getFrequenceArray();
    }

    public HuffmanOperator() {}//for extract;

    //---------------------------------------compression-----------------------------------------------------------
    private String getCompressedString() {
        String compressed = "";
        String intermidiate = "";//промежуточная строка(без добавочных нулей)
        //System.out.println("=============================Compression=======================");
        //displayEncodingArray();
        for (int i = 0; i < myString.length(); i++) {
            intermidiate += encodingArray[myString.charAt(i)];
        }
        //Мы не можем писать бит в файл. Поэтому нужно сделать длину сообщения кратной 8=>
        //нужно добавить нули в конец(можно 1, нет разницы)
        byte counter = 0;//количество добавленных в конец нулей (байта в полне хватит: 0<=counter<8<127)
        for (int length = intermidiate.length(), delta = 8 - length % 8; 
        		counter < delta ; counter++) {//delta - количество добавленных нулей
            intermidiate += "0";
        }
        
        //склеить кол-во добавочных нулей в бинарном предаствлении и промежуточную строку 
        compressed = String.format("%8s", Integer.toBinaryString(counter & 0xff)).replace(" ", "0") + intermidiate;
        		
        //идеализированный коэффициент
        setCompressionRatio();
        //System.out.println("===============================================================");
        return compressed;
    }
    
    private void setCompressionRatio() {//посчитать идеализированный коэффициент 
        double sumA = 0, sumB = 0;//A-the original sum
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {
                sumA += 8 * freqArray[i];
                sumB += encodingArray[i].length() * freqArray[i];
            }
        }
        ratio = sumA / sumB;
    }

    public byte[] getBytedMsg() {//final compression
        StringBuilder compressedString = new StringBuilder(getCompressedString());
        byte[] compressedBytes = new byte[compressedString.length() / 8];
        for (int i = 0; i < compressedBytes.length; i++) {
                compressedBytes[i] = (byte) Integer.parseInt(compressedString.substring(i * 8, (i + 1) * 8), 2);
        }
        return compressedBytes;
    }
    //---------------------------------------end of compression----------------------------------------------------------------
    //------------------------------------------------------------extract-----------------------------------------------------
    public String extract(String compressed, String[] newEncodingArray) {
        String decompressed = "";
        String current = "";
        String delta = "";
        encodingArray = newEncodingArray;
        
        //displayEncodingArray();
        //получить кол-во вставленных нулей
        for (int i = 0; i < 8; i++) 
        	delta += compressed.charAt(i);
        int ADDED_ZEROES = Integer.parseInt(delta, 2);
       
        for (int i = 8, l = compressed.length() - ADDED_ZEROES; i < l; i++) {
            //i = 8, т.к. первым байтом у нас идет кол-во вставленных нулей
            current += compressed.charAt(i);
            for (int j = 0; j < ENCODING_TABLE_SIZE; j++) {
                if (current.equals(encodingArray[j])) {//если совпало
                    decompressed += (char)j;//то добавляем элемент
                    current = "";//и обнуляем текущую строку
                }
            }
        }

        return decompressed;
    }

    public String getEncodingTable() {
        String enc = "";
    	for (int i = 0; i < encodingArray.length; i++) {
        	if (freqArray[i] != 0) 
        		enc += (char)i + encodingArray[i] + 'n';
        }
    	return enc;
    }

    public double getCompressionRatio() {
        return ratio;
    }


    public void displayEncodingArray() {//для отладки
        System.out.println("======================Encoding table====================");
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            //if (freqArray[i] != 0) {
                System.out.print((char)i + " ");
                System.out.println(encodingArray[i]);
            //}
        }
        System.out.println("========================================================");
    }
    }

Kalasi yomwe imapangitsa kukhala kosavuta kulembera fayilo:

import java.io.File;
import java.io.PrintWriter;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.Closeable;

public class FileOutputHelper implements Closeable {
    private File outputFile;
    private FileOutputStream fileOutputStream;

    public FileOutputHelper(File file) throws FileNotFoundException {
        outputFile = file;
        fileOutputStream = new FileOutputStream(outputFile);
    }

    public void writeByte(byte msg) throws IOException {
        fileOutputStream.write(msg);
    }

    public void writeBytes(byte[] msg) throws IOException {
        fileOutputStream.write(msg);
    }

    public void writeString(String msg) {
    	try (PrintWriter pw = new PrintWriter(outputFile)) {
    		pw.write(msg);
    	} catch (FileNotFoundException e) {
    		System.out.println("Неверный путь, или такого файла не существует!");
    	}
    }

    @Override
    public void close() throws IOException {
        fileOutputStream.close();
    }

    public void finalize() throws IOException {
        close();
    }
}

Kalasi yomwe imapangitsa kuti zikhale zosavuta kuwerenga kuchokera pafayilo:

import java.io.FileInputStream;
import java.io.EOFException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;

public class FileInputHelper implements Closeable {
	private FileInputStream fileInputStream;
	private BufferedReader fileBufferedReader;
	
	public FileInputHelper(File file) throws IOException {
		fileInputStream = new FileInputStream(file);
		fileBufferedReader = new BufferedReader(new InputStreamReader(fileInputStream));
	}
	
	
    public byte readByte() throws IOException {
    	int cur = fileInputStream.read();
    	if (cur == -1)//если закончился файл
    		throw new EOFException();
    	return (byte)cur;
    }
    
    public String readLine() throws IOException {
    	return fileBufferedReader.readLine();
    }
    
    @Override
    public void close() throws IOException{
    	fileInputStream.close();
    }
}

Chabwino, ndi kalasi yaikulu:

import java.io.File;
import java.nio.charset.MalformedInputException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;
import java.util.List;
import java.io.EOFException;
public class Main {
	private static final byte ENCODING_TABLE_SIZE = 127;
	
    public static void main(String[] args) throws IOException {
        try {//указываем инструкцию с помощью аргументов командной строки
            if (args[0].equals("--compress") || args[0].equals("-c"))
                compress(args[1]);
            else if ((args[0].equals("--extract") || args[0].equals("-x"))
            		&& (args[2].equals("--table") || args[2].equals("-t"))) {
            	extract(args[1], args[3]);
            }
            else
                throw new IllegalArgumentException();
        } catch (ArrayIndexOutOfBoundsException | IllegalArgumentException e) {
            System.out.println("Неверный формат ввода аргументов ");
            System.out.println("Читайте Readme.txt");
            e.printStackTrace();
        }
    }

	public static void compress(String stringPath) throws IOException {
        List<String> stringList;
        File inputFile = new File(stringPath);
        String s = "";
        File compressedFile, table;
        
        try {
            stringList = Files.readAllLines(Paths.get(inputFile.getAbsolutePath()));
        } catch (NoSuchFileException e) {
            System.out.println("Неверный путь, или такого файла не существует!");
            return;
        } catch (MalformedInputException e) {
        	System.out.println("Текущая кодировка файла не поддерживается");
        	return;
        }

        for (String item : stringList) {
            s += item;
            s += 'n';
        }

        HuffmanOperator operator = new HuffmanOperator(new HuffmanTree(s));

        compressedFile = new File(inputFile.getAbsolutePath() + ".cpr");
        compressedFile.createNewFile();
        try (FileOutputHelper fo = new FileOutputHelper(compressedFile)) {
        	fo.writeBytes(operator.getBytedMsg());
        }
        //create file with encoding table:
        
        table = new File(inputFile.getAbsolutePath() + ".table.txt");
        table.createNewFile();
        try (FileOutputHelper fo = new FileOutputHelper(table)) {
        	fo.writeString(operator.getEncodingTable());
        }
        
        System.out.println("Путь к сжатому файлу: " + compressedFile.getAbsolutePath());
        System.out.println("Путь к кодировочной таблице " + table.getAbsolutePath());
        System.out.println("Без таблицы файл будет невозможно извлечь!");
        
        double idealRatio = Math.round(operator.getCompressionRatio() * 100) / (double) 100;//идеализированный коэффициент
        double realRatio = Math.round((double) inputFile.length() 
        		/ ((double) compressedFile.length() + (double) table.length()) * 100) / (double)100;//настоящий коэффициент
        
        System.out.println("Идеализированный коэффициент сжатия равен " + idealRatio);
        System.out.println("Коэффициент сжатия с учетом кодировочной таблицы " + realRatio);
    }

    public static void extract(String filePath, String tablePath) throws FileNotFoundException, IOException {
        HuffmanOperator operator = new HuffmanOperator();
        File compressedFile = new File(filePath),
        	 tableFile = new File(tablePath),
        	 extractedFile = new File(filePath + ".xtr");
        String compressed = "";
        String[] encodingArray = new String[ENCODING_TABLE_SIZE];
        //read compressed file
        //!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!check here:
        try (FileInputHelper fi = new FileInputHelper(compressedFile)) {
        	byte b;
        	while (true) {
        		b = fi.readByte();//method returns EOFException
        		compressed += String.format("%8s", Integer.toBinaryString(b & 0xff)).replace(" ", "0");
        	}
        } catch (EOFException e) {
        	
        }
        
        //--------------------
        
        //read encoding table:
        try (FileInputHelper fi = new FileInputHelper(tableFile)) {
        	fi.readLine();//skip first empty string
        	encodingArray[(byte)'n'] = fi.readLine();//read code for 'n'
        	while (true) {
        		String s = fi.readLine();
        		if (s == null)
        			throw new EOFException();
        		encodingArray[(byte)s.charAt(0)] = s.substring(1, s.length());        		
        	}
        } catch (EOFException ignore) {}
        
        extractedFile.createNewFile();
        //extract:
		try (FileOutputHelper fo = new FileOutputHelper(extractedFile)) {
			fo.writeString(operator.extract(compressed, encodingArray));
		}
		
		System.out.println("Путь к распакованному файлу " + extractedFile.getAbsolutePath());
    }
}

Muyenera kulemba fayilo ya readme.txt nokha :)

Pomaliza

Ndikuganiza kuti ndizo zonse zomwe ndimafuna kunena. Ngati muli ndi chilichonse choti munene pakulephera kwanga kukonza ma code, algorithm, kapena kukhathamiritsa kulikonse, khalani omasuka kulemba. Ngati sindinafotokoze kalikonse, chonde lembaninso. Ndikufuna kumva kuchokera kwa inu mu ndemanga!

PS

Inde, inde, ndidakali pano, chifukwa sindinaiwale za coefficient. Kwa chingwe s1, tebulo la encoding likulemera ma byte 48 - lalikulu kwambiri kuposa fayilo yochokera, ndipo sitinaiwale za ziro zowonjezera (chiwerengero cha ziro zowonjezera ndi 7) => chiwerengero cha kuponderezana chidzakhala chocheperapo chimodzi: 176 / (65 + 48*8 + 7) = 0.38. Ngati munazindikiranso izi, ndiye kuti si nkhope yanu yokha yomwe ili yabwino. Inde, kukhazikitsa uku sikukhala kothandiza kwambiri pamafayilo ang'onoang'ono. Koma chimachitika ndi chiyani pamafayilo akulu? Makulidwe a mafayilo ndi akulu kuposa kukula kwa tebulo la encoding. Apa ndipamene algorithm imagwira ntchito momwe iyenera! Mwachitsanzo, kwa Faust's monologue Wosungiramo zinthu zakale amatulutsa coefficient yeniyeni (yosavomerezeka) ya 1.46 - pafupifupi nthawi imodzi ndi theka! Ndipo inde, fayiloyo idayenera kukhala mu Chingerezi.

Source: www.habr.com

Kuwonjezera ndemanga