Uxinzelelo lwedatha kunye ne-algorithm ye-Huffman

ukungena

Kweli nqaku ndiza kuthetha nge-algorithm edumileyo ye-Huffman, kunye nokusetyenziswa kwayo kuxinzelelo lwedatha.

Ngenxa yoko, siya kubhala i-archiver elula. Oku sele kuxoxiwe inqaku ngoHabré, kodwa ngaphandle kokuphunyezwa okusebenzayo. Izinto zethiyori zeposti yangoku zithathwa kwizifundo zesayensi yekhomputha yesikolo kunye nencwadi kaRobert Laforet ethi "IiNkcukacha zeDatha kunye ne-Algorithms kwiJava". Ngoko, yonke into isikiwe!

Iingcinga ezimbalwa

Kwifayile yombhalo oqhelekileyo, unobumba omnye ufakwe ngekhowudi nge-8 bits (ASCII encoding) okanye 16 (Unicode encoding). Okulandelayo siza kuthathela ingqalelo i ASCII encoding. Umzekelo, thatha umgca s1 = “USUSIE UTHI KULULA”. Kukho inani lamagama angama-22 emgceni, ngokwendalo, kuquka izithuba kunye nomlinganiswa omtsha womgca - 'n'. Ifayile equkethe lo mgca iya kuba nobunzima be-22 * 8 = 176 bits. Umbuzo uvela ngokukhawuleza: ngaba kunengqiqo ukusebenzisa zonke iibhithi ezi-8 ukubethelela umlinganiswa o-1? Asisebenzisi bonke abalinganiswa be-ASCII. Nokuba bekunjalo, bekuya kuba sengqiqweni ngakumbi ukuba oyena unobumba uqhelekileyo - S - anikwe eyona khowudi imfutshane, kwaye unobumba onqabileyo - T (okanye u-U, okanye 'n') - anikwe ikhowudi ende. Yile nto i-algorithm ye-Huffman iqulethwe: kuyafuneka ukufumana eyona ndlela ilungileyo yokufaka ikhowudi apho ifayile iya kuba nobunzima obuncinci. Yinto eqhelekileyo ukuba ubude bekhowudi buya kwahluka kwiisimboli ezahlukeneyo - yile nto i-algorithm isekwe kuyo.

Ukwenza iikhowudi

Kutheni unganiki umlinganiswa 'S' ikhowudi, umzekelo, i-1 bit ubude: 0 okanye 1. Mayibe ngu-1. Emva koko owesibini owona uxhaphakileyo-' ' (isithuba) - nika u-0. Yiba nomfanekiso uqale ukudekhowuda umyalezo wakho - Umtya okhowudiweyo s1 - kwaye uyabona ukuba ikhowudi iqala ngo 1. Ke, wenza ntoni: ingaba ngu S lo, okanye ngomnye umsebenzi, umzekelo A? Ngoko ke, kuvela umgaqo obalulekileyo:

Nayiphi na ikhowudi kufuneka ibe sisiqalo yenye

Lo mgaqo ungundoqo kwi-algorithm. Ke ngoko, ukwenza ikhowudi iqala ngetafile rhoqo, ebonisa ukuphindaphinda (inani lezehlo) zesimboli nganye:

Uxinzelelo lwedatha kunye ne-algorithm ye-Huffman Oonobumba abaneziganeko ezininzi kufuneka bafakwe kwikhowudi encinci kunokwenzeka inani lamasuntswana. Ndiza kunika umzekelo yenye yeetheyibhile zekhowudi ezinokwenzeka:

Uxinzelelo lwedatha kunye ne-algorithm ye-Huffman Ke umyalezo ofakwe ngekhowudi uya kujongeka ngolu hlobo:

10 01111 10 110 1111 00 10 010 1110 10 00 110 0110 00 110 10 00 1111 010 10 1110 01110

Ndahlula ikhowudi yomlinganiswa ngamnye ngesithuba. Oku akuyi kwenzeka kwifayile ecinezelwe ngokwenene!
Umbuzo uvela: njani lo mfana osemncinci weza nekhowudi yokudala itafile yeekhowudi? Oku kuya kuxoxwa ngezantsi.

Ukwakha umthi weHuffman

Apha kulapho imithi yokukhangela yokubini isiza khona. Sukuba nexhala, awuzukufuna ukhangelo, ufake, okanye ucime iindlela apha. Nasi isakhiwo somthi kwi-java:

public class Node {
    private int frequence;
    private char letter;
    private Node leftChild;
    private Node rightChild;
    ...
}

class BinaryTree {
    private Node root;

    public BinaryTree() {
        root = new Node();
    }
    public BinaryTree(Node root) {
        this.root = root;
    }
    ...
}

Oku akusiyo ikhowudi epheleleyo, ikhowudi epheleleyo iya kuba ngezantsi.

Nantsi i-algorithm yokwenza umthi:

  1. Yenza into yeNode yomlinganiswa ngamnye osuka kumyalezo (umgca s1). Kwimeko yethu kuya kuba ne-9 nodes (izinto zeNode). I-node nganye ineenkalo ezimbini zedatha: isimboli kunye nobuninzi
  2. Yenza into yoMthi (BinaryTree) kwiNode nganye. I-node iba yingcambu yomthi.
  3. Faka le mithi kumgca obalulekileyo. Okukhona kusezantsi i-frequency, kokukhona kuphezulu okuphambili. Ngaloo ndlela, xa ukhupha, i-dervo ene-frequency ephantsi ihlala ikhethiwe.

Okulandelayo kufuneka wenze oku kulandelayo ngomjikelo:

  1. Susa imithi emibini kumgca ophambili kwaye ubenze abantwana be-node entsha (i-node entsha eyenziwe ngaphandle kweleta). I-frequency ye-node entsha ilingana ne-sum of frequencies yemithi emibini yenzala.
  2. Kule nodi, yenza umthi onengcambu kule nodi. Faka lo mthi umva kumgca obalulekileyo. (Ekubeni umthi une-frequency entsha, iya kubonakala kwindawo entsha emgceni)
  3. Qhubeka namanyathelo oku-1 kunye ne-2 kude kubekho umthi omnye kuphela oseleyo emgceni - umthi we-Huffman

Qwalasela le algorithm kumgca s1:

Uxinzelelo lwedatha kunye ne-algorithm ye-Huffman

Apha isimboli “lf” (linefeed) sithetha umgca omtsha, “sp” (isithuba) sisithuba.

Yintoni elandelayo?

Sifumene umthi weHuffman. KULUNGILE. Kwaye ukwenza ntoni ngayo? Abayi kuyithatha simahla kwaye ke, kufuneka ulandele zonke iindlela ezinokwenzeka ukusuka kwingcambu ukuya emagqabini omthi. Masivumelane ukubonisa u-0 ukuba ukhokelela kumntwana osekhohlo kunye no-1 ukuba ukhokelela kwasekunene. Ukuthetha ngokuthe ngqo, kolu phawu, ikhowudi yesimboli yindlela esuka kwingcambu yomthi ukuya egqabini eliqulathe olu simboli.

Uxinzelelo lwedatha kunye ne-algorithm ye-Huffman

Yile ndlela itheyibhile yeekhowudi yavela ngayo. Qaphela ukuba ukuba siqwalasela le theyibhile, sinokugqiba malunga "nobunzima" besimboli ngasinye - lo ubude bekhowudi yayo. Emva koko, kwifom ecinezelweyo, ifayile yokuqala iya kuba nobunzima: 2 * 3 + 2*4 + 3 * 3 + 6 * 2 + 1 * 4 + 1 * 5 + 2 * 4 + 4 * 2 + 1 * 5 = 65 bits. . Ekuqaleni yayinobunzima beebhithi ezili-176. Ngenxa yoko, siye sayinciphisa ngo-176/65 = 2.7 amaxesha! Kodwa le yi-utopia. I-coefficient enjalo ayinakwenzeka ukuba ifumaneke. Ngoba? Oku kuya kuxutyushwa kamva.

Ukuguqula iikhowudi

Kulungile, mhlawumbi eyona nto ilula eseleyo kukwenza iikhowudi. Ndicinga ukuba uninzi lwenu luqikelele ukuba asinakwenza ngokulula ifayile ecinezelweyo ngaphandle koluthfifi lwendlela efakwe ngayo iikhowudi - asizukwazi ukuyicacisa! Ewe, ewe, bekunzima kum ukuqonda oku, kodwa kuya kufuneka ndenze ifayile yokubhaliweyo yetafile.txt ngetafile yocinezelo:

01110
 00
A010
E1111
I110
S10
T0110
U01111
Y1110

Ungeniso kwitheyibhile ngendlela 'yesimboli' 'ikhowudi yomlinganiswa'. Kutheni u-01110 engenasimboli? Ngapha koko, inesimboli, yinto nje yokuba izixhobo ze-java endizisebenzisayo xa ndikhupha ifayile, unobumba omtsha womgca- 'n' - uguqulwa ube ngumgca omtsha (nokuba ingaba bubudenge kangakanani na). Ngoko ke, umgca ongenanto phezulu ngumlingisi wekhowudi 01110. Kwikhowudi 00, umlingiswa uyindawo ekuqaleni komgca. Ndizakuthi ngoko nangoko kwi-Coefficient yethu ye-Khan, le ndlela yokugcina itafile inokubanga ukuba yeyona nto ingenangqiqo. Kodwa kulula ukuyiqonda nokuphumeza. Ndiya konwaba ukuva iingcebiso zakho kwizimvo malunga nokwenza ngcono.

Ukuba nale theyibhile kwenza kube lula kakhulu ukuyicacisa. Masikhumbule ukuba ngowuphi umgaqo esiwulandelayo xa sisenza ikhowudi:

Nayiphi na ikhowudi kufuneka ibe sisiqalo yenye

Kulapho idlala indima yokuququzelela. Sifunda ngokulandelelana kancinci kancinci kwaye, ngokukhawuleza ukuba umtya ophumayo u-d, oquka amasuntswana afundwayo, uhambelana ne-encoding ehambelana nomlinganiswa womlingiswa, ngokukhawuleza sazi ukuba umlinganiswa womlinganiswa (kwaye kuphela!) wayekhowudiwe. Emva koko, sibhala umlingiswa kumgca we-decoding (umgca oqulethe umyalezo ogqityiweyo), setha kwakhona umgca d, uze ufunde ifayile ekhowudiweyo.

Ukuphunyezwa

Lixesha lokuhlazisa ikhowudi yam kwaye ndibhale i-archiver. Masiyibize ngokuba yiCompressor.

Qalela ekuqaleni. Okokuqala, sibhala iklasi yeNode:

public class Node {
    private int frequence;//частота
    private char letter;//буква
    private Node leftChild;//левый потомок
    private Node rightChild;//правый потомок

   

    public Node(char letter, int frequence) { //собственно, конструктор
        this.letter = letter;
        this.frequence = frequence;
    }

    public Node() {}//перегрузка конструтора для безымянных узлов(см. выше в разделе о построении дерева Хаффмана)
    public void addChild(Node newNode) {//добавить потомка
        if (leftChild == null)//если левый пустой=> правый тоже=> добавляем в левый
            leftChild = newNode;
        else {
            if (leftChild.getFrequence() <= newNode.getFrequence()) //в общем, левым потомком
                rightChild = newNode;//станет тот, у кого меньше частота
            else {
                rightChild = leftChild;
                leftChild = newNode;
            }
        }

        frequence += newNode.getFrequence();//итоговая частота
    }

    public Node getLeftChild() {
        return leftChild;
    }

    public Node getRightChild() {
        return rightChild;
    }

    public int getFrequence() {
        return frequence;
    }

    public char getLetter() {
        return letter;
    }

    public boolean isLeaf() {//проверка на лист
        return leftChild == null && rightChild == null;
    }
}

Ngoku umthi:

class BinaryTree {
    private Node root;

    public BinaryTree() {
        root = new Node();
    }

    public BinaryTree(Node root) {
        this.root = root;
    }

    public int getFrequence() {
        return root.getFrequence();
    }

    public Node getRoot() {
        return root;
    }
}

Uluhlu oluphambili:

import java.util.ArrayList;//да-да, очередь будет на базе списка

class PriorityQueue {
    private ArrayList<BinaryTree> data;//список очереди
    private int nElems;//кол-во элементов в очереди

    public PriorityQueue() {
        data = new ArrayList<BinaryTree>();
        nElems = 0;
    }

    public void insert(BinaryTree newTree) {//вставка
        if (nElems == 0)
            data.add(newTree);
        else {
            for (int i = 0; i < nElems; i++) {
                if (data.get(i).getFrequence() > newTree.getFrequence()) {//если частота вставляемого дерева меньше 
                    data.add(i, newTree);//чем част. текущего, то cдвигаем все деревья на позициях справа на 1 ячейку                   
                    break;//затем ставим новое дерево на позицию текущего
                }
                if (i == nElems - 1) 
                    data.add(newTree);
            }
        }
        nElems++;//увеличиваем кол-во элементов на 1
    }

    public BinaryTree remove() {//удаление из очереди
        BinaryTree tmp = data.get(0);//копируем удаляемый элемент
        data.remove(0);//собственно, удаляем
        nElems--;//уменьшаем кол-во элементов на 1
        return tmp;//возвращаем удаленный элемент(элемент с наименьшей частотой)
    }
}

Iklasi eyenza umthi we-Huffman:

public class HuffmanTree {
    private final byte ENCODING_TABLE_SIZE = 127;//длина кодировочной таблицы
    private String myString;//сообщение
    private BinaryTree huffmanTree;//дерево Хаффмана
    private int[] freqArray;//частотная таблица
    private String[] encodingArray;//кодировочная таблица


    //----------------constructor----------------------
    public HuffmanTree(String newString) {
        myString = newString;

        freqArray = new int[ENCODING_TABLE_SIZE];
        fillFrequenceArray();

        huffmanTree = getHuffmanTree();

        encodingArray = new String[ENCODING_TABLE_SIZE];
        fillEncodingArray(huffmanTree.getRoot(), "", "");
    }

    //--------------------frequence array------------------------
    private void fillFrequenceArray() {
        for (int i = 0; i < myString.length(); i++) {
            freqArray[(int)myString.charAt(i)]++;
        }
    }

    public int[] getFrequenceArray() {
        return freqArray;
    }

    //------------------------huffman tree creation------------------
    private BinaryTree getHuffmanTree() {
        PriorityQueue pq = new PriorityQueue();
        //алгоритм описан выше
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {//если символ существует в строке
                Node newNode = new Node((char) i, freqArray[i]);//то создать для него Node
                BinaryTree newTree = new BinaryTree(newNode);//а для Node создать BinaryTree
                pq.insert(newTree);//вставить в очередь
            }
        }

        while (true) {
            BinaryTree tree1 = pq.remove();//извлечь из очереди первое дерево.

            try {
                BinaryTree tree2 = pq.remove();//извлечь из очереди второе дерево

                Node newNode = new Node();//создать новый Node
                newNode.addChild(tree1.getRoot());//сделать его потомками два извлеченных дерева
                newNode.addChild(tree2.getRoot());

                pq.insert(new BinaryTree(newNode);
            } catch (IndexOutOfBoundsException e) {//осталось одно дерево в очереди
                return tree1;
            }
        }
    }

    public BinaryTree getTree() {
        return huffmanTree;
    }

    //-------------------encoding array------------------
    void fillEncodingArray(Node node, String codeBefore, String direction) {//заполнить кодировочную таблицу
        if (node.isLeaf()) {
            encodingArray[(int)node.getLetter()] = codeBefore + direction;
        } else {
            fillEncodingArray(node.getLeftChild(), codeBefore + direction, "0");
            fillEncodingArray(node.getRightChild(), codeBefore + direction, "1");
        }
    }

    String[] getEncodingArray() {
        return encodingArray;
    }

    public void displayEncodingArray() {//для отладки
        fillEncodingArray(huffmanTree.getRoot(), "", "");

        System.out.println("======================Encoding table====================");
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {
                System.out.print((char)i + " ");
                System.out.println(encodingArray[i]);
            }
        }
        System.out.println("========================================================");
    }
    //-----------------------------------------------------
    String getOriginalString() {
        return myString;
    }
}

Iklasi equlathe ukuba yeyiphi ikhowudi/ikhowudi:

public class HuffmanOperator {
    private final byte ENCODING_TABLE_SIZE = 127;//длина таблицы
    private HuffmanTree mainHuffmanTree;//дерево Хаффмана (используется только для сжатия)
    private String myString;//исходное сообщение
    private int[] freqArray;//частотаная таблица
    private String[] encodingArray;//кодировочная таблица
    private double ratio;//коэффициент сжатия 


    public HuffmanOperator(HuffmanTree MainHuffmanTree) {//for compress
        this.mainHuffmanTree = MainHuffmanTree;

        myString = mainHuffmanTree.getOriginalString();

        encodingArray = mainHuffmanTree.getEncodingArray();

        freqArray = mainHuffmanTree.getFrequenceArray();
    }

    public HuffmanOperator() {}//for extract;

    //---------------------------------------compression-----------------------------------------------------------
    private String getCompressedString() {
        String compressed = "";
        String intermidiate = "";//промежуточная строка(без добавочных нулей)
        //System.out.println("=============================Compression=======================");
        //displayEncodingArray();
        for (int i = 0; i < myString.length(); i++) {
            intermidiate += encodingArray[myString.charAt(i)];
        }
        //Мы не можем писать бит в файл. Поэтому нужно сделать длину сообщения кратной 8=>
        //нужно добавить нули в конец(можно 1, нет разницы)
        byte counter = 0;//количество добавленных в конец нулей (байта в полне хватит: 0<=counter<8<127)
        for (int length = intermidiate.length(), delta = 8 - length % 8; 
        		counter < delta ; counter++) {//delta - количество добавленных нулей
            intermidiate += "0";
        }
        
        //склеить кол-во добавочных нулей в бинарном предаствлении и промежуточную строку 
        compressed = String.format("%8s", Integer.toBinaryString(counter & 0xff)).replace(" ", "0") + intermidiate;
        		
        //идеализированный коэффициент
        setCompressionRatio();
        //System.out.println("===============================================================");
        return compressed;
    }
    
    private void setCompressionRatio() {//посчитать идеализированный коэффициент 
        double sumA = 0, sumB = 0;//A-the original sum
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {
                sumA += 8 * freqArray[i];
                sumB += encodingArray[i].length() * freqArray[i];
            }
        }
        ratio = sumA / sumB;
    }

    public byte[] getBytedMsg() {//final compression
        StringBuilder compressedString = new StringBuilder(getCompressedString());
        byte[] compressedBytes = new byte[compressedString.length() / 8];
        for (int i = 0; i < compressedBytes.length; i++) {
                compressedBytes[i] = (byte) Integer.parseInt(compressedString.substring(i * 8, (i + 1) * 8), 2);
        }
        return compressedBytes;
    }
    //---------------------------------------end of compression----------------------------------------------------------------
    //------------------------------------------------------------extract-----------------------------------------------------
    public String extract(String compressed, String[] newEncodingArray) {
        String decompressed = "";
        String current = "";
        String delta = "";
        encodingArray = newEncodingArray;
        
        //displayEncodingArray();
        //получить кол-во вставленных нулей
        for (int i = 0; i < 8; i++) 
        	delta += compressed.charAt(i);
        int ADDED_ZEROES = Integer.parseInt(delta, 2);
       
        for (int i = 8, l = compressed.length() - ADDED_ZEROES; i < l; i++) {
            //i = 8, т.к. первым байтом у нас идет кол-во вставленных нулей
            current += compressed.charAt(i);
            for (int j = 0; j < ENCODING_TABLE_SIZE; j++) {
                if (current.equals(encodingArray[j])) {//если совпало
                    decompressed += (char)j;//то добавляем элемент
                    current = "";//и обнуляем текущую строку
                }
            }
        }

        return decompressed;
    }

    public String getEncodingTable() {
        String enc = "";
    	for (int i = 0; i < encodingArray.length; i++) {
        	if (freqArray[i] != 0) 
        		enc += (char)i + encodingArray[i] + 'n';
        }
    	return enc;
    }

    public double getCompressionRatio() {
        return ratio;
    }


    public void displayEncodingArray() {//для отладки
        System.out.println("======================Encoding table====================");
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            //if (freqArray[i] != 0) {
                System.out.print((char)i + " ");
                System.out.println(encodingArray[i]);
            //}
        }
        System.out.println("========================================================");
    }
    }

Iklasi eyenza kube lula ukubhala kwifayile:

import java.io.File;
import java.io.PrintWriter;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.Closeable;

public class FileOutputHelper implements Closeable {
    private File outputFile;
    private FileOutputStream fileOutputStream;

    public FileOutputHelper(File file) throws FileNotFoundException {
        outputFile = file;
        fileOutputStream = new FileOutputStream(outputFile);
    }

    public void writeByte(byte msg) throws IOException {
        fileOutputStream.write(msg);
    }

    public void writeBytes(byte[] msg) throws IOException {
        fileOutputStream.write(msg);
    }

    public void writeString(String msg) {
    	try (PrintWriter pw = new PrintWriter(outputFile)) {
    		pw.write(msg);
    	} catch (FileNotFoundException e) {
    		System.out.println("Неверный путь, или такого файла не существует!");
    	}
    }

    @Override
    public void close() throws IOException {
        fileOutputStream.close();
    }

    public void finalize() throws IOException {
        close();
    }
}

Iklasi eyenza kube lula ukufunda kwifayile:

import java.io.FileInputStream;
import java.io.EOFException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;

public class FileInputHelper implements Closeable {
	private FileInputStream fileInputStream;
	private BufferedReader fileBufferedReader;
	
	public FileInputHelper(File file) throws IOException {
		fileInputStream = new FileInputStream(file);
		fileBufferedReader = new BufferedReader(new InputStreamReader(fileInputStream));
	}
	
	
    public byte readByte() throws IOException {
    	int cur = fileInputStream.read();
    	if (cur == -1)//если закончился файл
    		throw new EOFException();
    	return (byte)cur;
    }
    
    public String readLine() throws IOException {
    	return fileBufferedReader.readLine();
    }
    
    @Override
    public void close() throws IOException{
    	fileInputStream.close();
    }
}

Ewe, kunye neklasi ephambili:

import java.io.File;
import java.nio.charset.MalformedInputException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;
import java.util.List;
import java.io.EOFException;
public class Main {
	private static final byte ENCODING_TABLE_SIZE = 127;
	
    public static void main(String[] args) throws IOException {
        try {//указываем инструкцию с помощью аргументов командной строки
            if (args[0].equals("--compress") || args[0].equals("-c"))
                compress(args[1]);
            else if ((args[0].equals("--extract") || args[0].equals("-x"))
            		&& (args[2].equals("--table") || args[2].equals("-t"))) {
            	extract(args[1], args[3]);
            }
            else
                throw new IllegalArgumentException();
        } catch (ArrayIndexOutOfBoundsException | IllegalArgumentException e) {
            System.out.println("Неверный формат ввода аргументов ");
            System.out.println("Читайте Readme.txt");
            e.printStackTrace();
        }
    }

	public static void compress(String stringPath) throws IOException {
        List<String> stringList;
        File inputFile = new File(stringPath);
        String s = "";
        File compressedFile, table;
        
        try {
            stringList = Files.readAllLines(Paths.get(inputFile.getAbsolutePath()));
        } catch (NoSuchFileException e) {
            System.out.println("Неверный путь, или такого файла не существует!");
            return;
        } catch (MalformedInputException e) {
        	System.out.println("Текущая кодировка файла не поддерживается");
        	return;
        }

        for (String item : stringList) {
            s += item;
            s += 'n';
        }

        HuffmanOperator operator = new HuffmanOperator(new HuffmanTree(s));

        compressedFile = new File(inputFile.getAbsolutePath() + ".cpr");
        compressedFile.createNewFile();
        try (FileOutputHelper fo = new FileOutputHelper(compressedFile)) {
        	fo.writeBytes(operator.getBytedMsg());
        }
        //create file with encoding table:
        
        table = new File(inputFile.getAbsolutePath() + ".table.txt");
        table.createNewFile();
        try (FileOutputHelper fo = new FileOutputHelper(table)) {
        	fo.writeString(operator.getEncodingTable());
        }
        
        System.out.println("Путь к сжатому файлу: " + compressedFile.getAbsolutePath());
        System.out.println("Путь к кодировочной таблице " + table.getAbsolutePath());
        System.out.println("Без таблицы файл будет невозможно извлечь!");
        
        double idealRatio = Math.round(operator.getCompressionRatio() * 100) / (double) 100;//идеализированный коэффициент
        double realRatio = Math.round((double) inputFile.length() 
        		/ ((double) compressedFile.length() + (double) table.length()) * 100) / (double)100;//настоящий коэффициент
        
        System.out.println("Идеализированный коэффициент сжатия равен " + idealRatio);
        System.out.println("Коэффициент сжатия с учетом кодировочной таблицы " + realRatio);
    }

    public static void extract(String filePath, String tablePath) throws FileNotFoundException, IOException {
        HuffmanOperator operator = new HuffmanOperator();
        File compressedFile = new File(filePath),
        	 tableFile = new File(tablePath),
        	 extractedFile = new File(filePath + ".xtr");
        String compressed = "";
        String[] encodingArray = new String[ENCODING_TABLE_SIZE];
        //read compressed file
        //!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!check here:
        try (FileInputHelper fi = new FileInputHelper(compressedFile)) {
        	byte b;
        	while (true) {
        		b = fi.readByte();//method returns EOFException
        		compressed += String.format("%8s", Integer.toBinaryString(b & 0xff)).replace(" ", "0");
        	}
        } catch (EOFException e) {
        	
        }
        
        //--------------------
        
        //read encoding table:
        try (FileInputHelper fi = new FileInputHelper(tableFile)) {
        	fi.readLine();//skip first empty string
        	encodingArray[(byte)'n'] = fi.readLine();//read code for 'n'
        	while (true) {
        		String s = fi.readLine();
        		if (s == null)
        			throw new EOFException();
        		encodingArray[(byte)s.charAt(0)] = s.substring(1, s.length());        		
        	}
        } catch (EOFException ignore) {}
        
        extractedFile.createNewFile();
        //extract:
		try (FileOutputHelper fo = new FileOutputHelper(extractedFile)) {
			fo.writeString(operator.extract(compressed, encodingArray));
		}
		
		System.out.println("Путь к распакованному файлу " + extractedFile.getAbsolutePath());
    }
}

Kuya kufuneka ubhale ifayile ye readme.txt ngokwakho :)

isiphelo

Ndicinga ukuba yiyo yonke into ebendifuna ukuyithetha. Ukuba unento yokuthetha malunga nokungakwazi kwam ekuphuculeni ikhowudi, i-algorithm, okanye nayiphi na i-optimization ngokubanzi, ngoko uzive ukhululekile ukubhala. Ukuba andichazanga nto, ndicela ubhale nam. Ndingathanda ukuva kuwe kwizimvo!

PS

Ewe, ewe, ndiselapha, kuba andizange ndilibale malunga ne-coefficient. Ngomtya we-s1, itafile ye-encoding inobunzima be-48 bytes - inkulu kunefayile yomthombo, kwaye asizange silibale malunga neerosi ezongezelelweyo (inani le-zero elongezelelweyo li-7) => umlinganiselo woxinzelelo uya kuba ngaphantsi kweyodwa: 176/ (65 + 48*8 + 7) = 0.38. Ukuba nawe uyiqaphele le nto, ayibobuso bakho nje obumnandi. Ewe, oku kuphunyezwa kuya kungasebenzi ngokugqithisileyo kwiifayile ezincinci. Kodwa kwenzeka ntoni kwiifayile ezinkulu? Ubungakanani befayile bukhulu kakhulu kunobukhulu betafile yokufaka iikhowudi. Apha kulapho i-algorithm isebenza njengoko kufanele! Umzekelo, kuba I-monologue kaFaust I-archiver ivelisa i-coefficient yokwenyani (engafanelanga) ye-1.46 - phantse ixesha elinesiqingatha! Kwaye ewe, ifayile bekufanele ukuba ibe ngesiNgesi.

umthombo: www.habr.com

Yongeza izimvo