Komprési data ngagunakeun algoritma Huffman

asup

Dina artikel ieu kuring baris ngobrol ngeunaan algoritma Huffman kawentar, kitu ogé aplikasi dina komprési data.

Hasilna, urang bakal nulis archiver basajan. Ieu parantos dibahas artikel ngeunaan Habré, Tapi tanpa palaksanaan praktis. Materi téoritis tina pos ayeuna dicokot tina palajaran elmu komputer sakola jeung buku Robert Laforet "Data Structures and Algorithm in Java". Janten, sadayana dipotong!

Sababaraha pikiran

Dina file téks biasa, hiji karakter disandikeun ku 8 bit (ASCII encoding) atanapi 16 (Unicode encoding). Salajengna urang bakal mertimbangkeun encoding ASCII. Salaku conto, cokot garis s1 = "SUSIE NYEBUT ITU MUDAH". Jumlahna aya 22 karakter dina garis, sacara alami, kalebet rohangan sareng karakter garis énggal - 'n'. Berkas anu ngandung garis ieu bakal beuratna 22*8 = 176 bit. Patarosan langsung timbul: naha éta rasional ngagunakeun sadaya 8 bit pikeun encode 1 karakter? Kami henteu nganggo sadaya karakter ASCII. Malah lamun maranehna ngalakukeun, eta bakal leuwih rasional pikeun hurup paling umum - S - mun dibere kode shortest mungkin, sarta pikeun hurup rarest - T (atawa U, atawa 'n') - mun dibere kode panjang. Ieu mangrupikeun algoritma Huffman: anjeun kedah milarian pilihan panyandian anu optimal dimana filena bakal gaduh beurat minimum. Biasana yén panjang kode bakal béda pikeun simbol anu béda - ieu anu dumasar kana algoritma.

Coding

Naha henteu masihan karakter 'S' kode, contona, 1 bit panjang: 0 atawa 1. Hayu deui jadi 1. Lajeng karakter kadua paling umum - ' ' (spasi) - masihan 0. Bayangkeun anjeun mimiti decoding pesen anjeun - string s1 disandikeun - jeung anjeun nempo yén kode dimimitian ku 1. Ku kituna, naon anu anjeun ngalakukeun: ieu karakter S, atawa éta sababaraha karakter sejenna, contona A? Ku alatan éta, hiji aturan penting timbul:

Henteu aya kode anu kedah janten awalan anu sanés

Aturan ieu konci dina algoritma. Ku alatan éta, nyieun kode dimimitian ku tabel frékuénsi, nu nunjukkeun frékuénsi (jumlah kajadian) unggal simbol:

Komprési data ngagunakeun algoritma Huffman Karakter anu paling sering kajadian kedah dikodekeun sahenteuna mungkin jumlah bit. Kuring bakal masihan conto salah sahiji tabel kode anu mungkin:

Komprési data ngagunakeun algoritma Huffman Janten pesen anu disandikeun bakal katingali sapertos kieu:

10 01111 10 110 1111 00 10 010 1110 10 00 110 0110 00 110 10 00 1111 010 10 1110 01110

Kuring misahkeun kodeu unggal karakter ku spasi. Ieu moal lumangsung dina file sabenerna dikomprés!
Patarosan timbul: kumaha lalaki ngora ieu datang nepi ka kode pikeun nyieun tabel kode? Ieu bakal dibahas di handap.

Ngawangun tangkal Huffman

Ieu tempat tangkal pilarian binér datang ka nyalametkeun teh. Tong hariwang, anjeun henteu kedah milarian, nyelapkeun, atanapi ngahapus metode di dieu. Ieu struktur tangkal di java:

public class Node {
    private int frequence;
    private char letter;
    private Node leftChild;
    private Node rightChild;
    ...
}

class BinaryTree {
    private Node root;

    public BinaryTree() {
        root = new Node();
    }
    public BinaryTree(Node root) {
        this.root = root;
    }
    ...
}

Ieu sanes kode lengkep, kodeu lengkep bakal di handap.

Ieu algoritma pikeun ngawangun tangkal:

  1. Jieun objek Node pikeun tiap karakter tina pesen (garis s1). Dina hal urang bakal aya 9 titik (objék Node). Unggal titik diwangun ku dua widang data: simbol jeung frékuénsi
  2. Jieun objék Tangkal (BinaryTree) pikeun tiap Node. Node jadi akar tangkal.
  3. Selapkeun tangkal ieu kana antrian prioritas. Nu handap frékuénsi, nu leuwih luhur prioritas. Ku kituna, nalika extracting, dervo kalawan frékuénsi panghandapna salawasna dipilih.

Satuluyna anjeun perlu ngalakukeun di handap cyclically:

  1. Cabut dua tangkal tina antrian prioritas sarta jadikeun aranjeunna barudak tina titik anyar (titik nu anyar dijieun tanpa hurup). Frékuénsi titik anyar sarua jeung jumlah frékuénsi dua tangkal turunan.
  2. Pikeun titik ieu, jieun tangkal kalayan akar dina titik ieu. Selapkeun tangkal ieu deui kana antrian prioritas. (Kusabab tangkal boga frékuénsi anyar, éta paling dipikaresep bakal muncul dina tempat anyar dina antrian)
  3. Nuluykeun lengkah 1 jeung 2 nepi ka aya ngan hiji tangkal ditinggalkeun dina antrian - tangkal Huffman

Pertimbangkeun algoritma ieu dina garis s1:

Komprési data ngagunakeun algoritma Huffman

Di dieu simbol "lf" (linefeed) hartina garis anyar, "sp" (spasi) mangrupa spasi.

Naon salajengna?

Kami ngagaduhan tangkal Huffman. OKÉ. Sarta naon anu kudu dipigawé kalayan eta? Aranjeunna malah moal nyandak éta gratis, teras anjeun kedah ngalacak sadaya jalur anu mungkin tina akar dugi ka daun tangkal. Hayu urang satuju kana denote an tepi 0 lamun ngarah ka anak kénca jeung 1 lamun ngarah ka katuhu. Tegesna, dina notasi ieu, kode simbol nyaéta jalur tina akar tangkal ka daun anu ngandung simbol ieu.

Komprési data ngagunakeun algoritma Huffman

Ieu kumaha tabel kode tétéla. Catet yén upami urang nganggap tabel ieu, urang tiasa nyimpulkeun ngeunaan "beurat" unggal simbol - ieu mangrupikeun panjang kode na. Lajeng, dina bentuk dikomprés, file aslina bakal beuratna: 2 * 3 + 2 * 4 + 3 * 3 + 6 * 2 + 1 * 4 + 1 * 5 + 2 * 4 + 4 * 2 + 1 * 5 = 65 bit . Mimitina beuratna 176 bit. Akibatna, urang ngurangan eta ku saloba 176/65 = 2.7 kali! Tapi ieu utopia. Koefisien sapertos kitu henteu mungkin dimeunangkeun. Naha? Ieu bakal dibahas engké.

Decoding

Nya, panginten anu paling saderhana nyaéta decoding. Jigana loba anjeun geus ditebak yen urang teu bisa saukur nyieun file dikomprés tanpa hint kumaha eta disandi - urang moal bisa decode eta! Sumuhun, enya, éta hésé pikeun kuring sadar ieu, tapi kuring kudu nyieun hiji file téks table.txt kalawan tabel komprési:

01110
 00
A010
E1111
I110
S10
T0110
U01111
Y1110

Éntri tabel dina bentuk 'simbol' 'kode karakter'. Naha 01110 tanpa simbol? Nyatana, éta kalayan simbol, ngan ukur alat-alat java anu kuring anggo nalika kaluaran kana file, karakter baris anyar - 'n' - dirobih janten garis énggal (teu paduli kumaha bodona disada). Ku alatan éta, garis kosong di luhur nyaéta karakter pikeun kode 01110. Pikeun kode 00, karakter mangrupa spasi dina awal garis. Kuring bakal langsung nyarios yén pikeun koéfisién Khan urang, metode nyimpen méja ieu tiasa ngaku paling irasional. Tapi éta gampang kahartos sareng dilaksanakeun. Kuring bakal senang ngadangu saran anjeun dina komentar ngeunaan optimasi.

Ngabogaan tabel ieu ngajadikeun eta pisan gampang decode. Hayu urang émut aturan naon anu urang dituturkeun nalika nyiptakeun encoding:

Henteu aya kode anu kedah janten awalan anu sanés

Ieu dimana eta muterkeun hiji peran facilitating. Urang maca sequentially bit ku bit jeung, pas hasilna string d, diwangun ku bit dibaca, cocog encoding pakait jeung karakter karakter, urang langsung nyaho yén karakter karakter (jeung ngan eta!) Ieu disandi. Salajengna, urang nulis karakter kana garis decoding (garis ngandung pesen decoded), reset garis d, lajeng maca file disandikeun.

Реализация

Waktosna pikeun ngahinakeun kode kuring sareng nyerat arsip. Hayu urang sebut wae Compressor.

Mimitian deui. Anu mimiti, urang nyerat kelas Node:

public class Node {
    private int frequence;//частота
    private char letter;//буква
    private Node leftChild;//левый потомок
    private Node rightChild;//правый потомок

   

    public Node(char letter, int frequence) { //собственно, конструктор
        this.letter = letter;
        this.frequence = frequence;
    }

    public Node() {}//перегрузка конструтора для безымянных узлов(см. выше в разделе о построении дерева Хаффмана)
    public void addChild(Node newNode) {//добавить потомка
        if (leftChild == null)//если левый пустой=> правый тоже=> добавляем в левый
            leftChild = newNode;
        else {
            if (leftChild.getFrequence() <= newNode.getFrequence()) //в общем, левым потомком
                rightChild = newNode;//станет тот, у кого меньше частота
            else {
                rightChild = leftChild;
                leftChild = newNode;
            }
        }

        frequence += newNode.getFrequence();//итоговая частота
    }

    public Node getLeftChild() {
        return leftChild;
    }

    public Node getRightChild() {
        return rightChild;
    }

    public int getFrequence() {
        return frequence;
    }

    public char getLetter() {
        return letter;
    }

    public boolean isLeaf() {//проверка на лист
        return leftChild == null && rightChild == null;
    }
}

Ayeuna tangkal:

class BinaryTree {
    private Node root;

    public BinaryTree() {
        root = new Node();
    }

    public BinaryTree(Node root) {
        this.root = root;
    }

    public int getFrequence() {
        return root.getFrequence();
    }

    public Node getRoot() {
        return root;
    }
}

Antrian Prioritas:

import java.util.ArrayList;//да-да, очередь будет на базе списка

class PriorityQueue {
    private ArrayList<BinaryTree> data;//список очереди
    private int nElems;//кол-во элементов в очереди

    public PriorityQueue() {
        data = new ArrayList<BinaryTree>();
        nElems = 0;
    }

    public void insert(BinaryTree newTree) {//вставка
        if (nElems == 0)
            data.add(newTree);
        else {
            for (int i = 0; i < nElems; i++) {
                if (data.get(i).getFrequence() > newTree.getFrequence()) {//если частота вставляемого дерева меньше 
                    data.add(i, newTree);//чем част. текущего, то cдвигаем все деревья на позициях справа на 1 ячейку                   
                    break;//затем ставим новое дерево на позицию текущего
                }
                if (i == nElems - 1) 
                    data.add(newTree);
            }
        }
        nElems++;//увеличиваем кол-во элементов на 1
    }

    public BinaryTree remove() {//удаление из очереди
        BinaryTree tmp = data.get(0);//копируем удаляемый элемент
        data.remove(0);//собственно, удаляем
        nElems--;//уменьшаем кол-во элементов на 1
        return tmp;//возвращаем удаленный элемент(элемент с наименьшей частотой)
    }
}

Kelas anu nyiptakeun tangkal Huffman:

public class HuffmanTree {
    private final byte ENCODING_TABLE_SIZE = 127;//длина кодировочной таблицы
    private String myString;//сообщение
    private BinaryTree huffmanTree;//дерево Хаффмана
    private int[] freqArray;//частотная таблица
    private String[] encodingArray;//кодировочная таблица


    //----------------constructor----------------------
    public HuffmanTree(String newString) {
        myString = newString;

        freqArray = new int[ENCODING_TABLE_SIZE];
        fillFrequenceArray();

        huffmanTree = getHuffmanTree();

        encodingArray = new String[ENCODING_TABLE_SIZE];
        fillEncodingArray(huffmanTree.getRoot(), "", "");
    }

    //--------------------frequence array------------------------
    private void fillFrequenceArray() {
        for (int i = 0; i < myString.length(); i++) {
            freqArray[(int)myString.charAt(i)]++;
        }
    }

    public int[] getFrequenceArray() {
        return freqArray;
    }

    //------------------------huffman tree creation------------------
    private BinaryTree getHuffmanTree() {
        PriorityQueue pq = new PriorityQueue();
        //алгоритм описан выше
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {//если символ существует в строке
                Node newNode = new Node((char) i, freqArray[i]);//то создать для него Node
                BinaryTree newTree = new BinaryTree(newNode);//а для Node создать BinaryTree
                pq.insert(newTree);//вставить в очередь
            }
        }

        while (true) {
            BinaryTree tree1 = pq.remove();//извлечь из очереди первое дерево.

            try {
                BinaryTree tree2 = pq.remove();//извлечь из очереди второе дерево

                Node newNode = new Node();//создать новый Node
                newNode.addChild(tree1.getRoot());//сделать его потомками два извлеченных дерева
                newNode.addChild(tree2.getRoot());

                pq.insert(new BinaryTree(newNode);
            } catch (IndexOutOfBoundsException e) {//осталось одно дерево в очереди
                return tree1;
            }
        }
    }

    public BinaryTree getTree() {
        return huffmanTree;
    }

    //-------------------encoding array------------------
    void fillEncodingArray(Node node, String codeBefore, String direction) {//заполнить кодировочную таблицу
        if (node.isLeaf()) {
            encodingArray[(int)node.getLetter()] = codeBefore + direction;
        } else {
            fillEncodingArray(node.getLeftChild(), codeBefore + direction, "0");
            fillEncodingArray(node.getRightChild(), codeBefore + direction, "1");
        }
    }

    String[] getEncodingArray() {
        return encodingArray;
    }

    public void displayEncodingArray() {//для отладки
        fillEncodingArray(huffmanTree.getRoot(), "", "");

        System.out.println("======================Encoding table====================");
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {
                System.out.print((char)i + " ");
                System.out.println(encodingArray[i]);
            }
        }
        System.out.println("========================================================");
    }
    //-----------------------------------------------------
    String getOriginalString() {
        return myString;
    }
}

Kelas nu ngandung nu encodes / decodes:

public class HuffmanOperator {
    private final byte ENCODING_TABLE_SIZE = 127;//длина таблицы
    private HuffmanTree mainHuffmanTree;//дерево Хаффмана (используется только для сжатия)
    private String myString;//исходное сообщение
    private int[] freqArray;//частотаная таблица
    private String[] encodingArray;//кодировочная таблица
    private double ratio;//коэффициент сжатия 


    public HuffmanOperator(HuffmanTree MainHuffmanTree) {//for compress
        this.mainHuffmanTree = MainHuffmanTree;

        myString = mainHuffmanTree.getOriginalString();

        encodingArray = mainHuffmanTree.getEncodingArray();

        freqArray = mainHuffmanTree.getFrequenceArray();
    }

    public HuffmanOperator() {}//for extract;

    //---------------------------------------compression-----------------------------------------------------------
    private String getCompressedString() {
        String compressed = "";
        String intermidiate = "";//промежуточная строка(без добавочных нулей)
        //System.out.println("=============================Compression=======================");
        //displayEncodingArray();
        for (int i = 0; i < myString.length(); i++) {
            intermidiate += encodingArray[myString.charAt(i)];
        }
        //Мы не можем писать бит в файл. Поэтому нужно сделать длину сообщения кратной 8=>
        //нужно добавить нули в конец(можно 1, нет разницы)
        byte counter = 0;//количество добавленных в конец нулей (байта в полне хватит: 0<=counter<8<127)
        for (int length = intermidiate.length(), delta = 8 - length % 8; 
        		counter < delta ; counter++) {//delta - количество добавленных нулей
            intermidiate += "0";
        }
        
        //склеить кол-во добавочных нулей в бинарном предаствлении и промежуточную строку 
        compressed = String.format("%8s", Integer.toBinaryString(counter & 0xff)).replace(" ", "0") + intermidiate;
        		
        //идеализированный коэффициент
        setCompressionRatio();
        //System.out.println("===============================================================");
        return compressed;
    }
    
    private void setCompressionRatio() {//посчитать идеализированный коэффициент 
        double sumA = 0, sumB = 0;//A-the original sum
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            if (freqArray[i] != 0) {
                sumA += 8 * freqArray[i];
                sumB += encodingArray[i].length() * freqArray[i];
            }
        }
        ratio = sumA / sumB;
    }

    public byte[] getBytedMsg() {//final compression
        StringBuilder compressedString = new StringBuilder(getCompressedString());
        byte[] compressedBytes = new byte[compressedString.length() / 8];
        for (int i = 0; i < compressedBytes.length; i++) {
                compressedBytes[i] = (byte) Integer.parseInt(compressedString.substring(i * 8, (i + 1) * 8), 2);
        }
        return compressedBytes;
    }
    //---------------------------------------end of compression----------------------------------------------------------------
    //------------------------------------------------------------extract-----------------------------------------------------
    public String extract(String compressed, String[] newEncodingArray) {
        String decompressed = "";
        String current = "";
        String delta = "";
        encodingArray = newEncodingArray;
        
        //displayEncodingArray();
        //получить кол-во вставленных нулей
        for (int i = 0; i < 8; i++) 
        	delta += compressed.charAt(i);
        int ADDED_ZEROES = Integer.parseInt(delta, 2);
       
        for (int i = 8, l = compressed.length() - ADDED_ZEROES; i < l; i++) {
            //i = 8, т.к. первым байтом у нас идет кол-во вставленных нулей
            current += compressed.charAt(i);
            for (int j = 0; j < ENCODING_TABLE_SIZE; j++) {
                if (current.equals(encodingArray[j])) {//если совпало
                    decompressed += (char)j;//то добавляем элемент
                    current = "";//и обнуляем текущую строку
                }
            }
        }

        return decompressed;
    }

    public String getEncodingTable() {
        String enc = "";
    	for (int i = 0; i < encodingArray.length; i++) {
        	if (freqArray[i] != 0) 
        		enc += (char)i + encodingArray[i] + 'n';
        }
    	return enc;
    }

    public double getCompressionRatio() {
        return ratio;
    }


    public void displayEncodingArray() {//для отладки
        System.out.println("======================Encoding table====================");
        for (int i = 0; i < ENCODING_TABLE_SIZE; i++) {
            //if (freqArray[i] != 0) {
                System.out.print((char)i + " ");
                System.out.println(encodingArray[i]);
            //}
        }
        System.out.println("========================================================");
    }
    }

Kelas anu ngagampangkeun nyerat kana file:

import java.io.File;
import java.io.PrintWriter;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.Closeable;

public class FileOutputHelper implements Closeable {
    private File outputFile;
    private FileOutputStream fileOutputStream;

    public FileOutputHelper(File file) throws FileNotFoundException {
        outputFile = file;
        fileOutputStream = new FileOutputStream(outputFile);
    }

    public void writeByte(byte msg) throws IOException {
        fileOutputStream.write(msg);
    }

    public void writeBytes(byte[] msg) throws IOException {
        fileOutputStream.write(msg);
    }

    public void writeString(String msg) {
    	try (PrintWriter pw = new PrintWriter(outputFile)) {
    		pw.write(msg);
    	} catch (FileNotFoundException e) {
    		System.out.println("Неверный путь, или такого файла не существует!");
    	}
    }

    @Override
    public void close() throws IOException {
        fileOutputStream.close();
    }

    public void finalize() throws IOException {
        close();
    }
}

Kelas anu ngagampangkeun maca tina file:

import java.io.FileInputStream;
import java.io.EOFException;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.Closeable;
import java.io.File;
import java.io.IOException;

public class FileInputHelper implements Closeable {
	private FileInputStream fileInputStream;
	private BufferedReader fileBufferedReader;
	
	public FileInputHelper(File file) throws IOException {
		fileInputStream = new FileInputStream(file);
		fileBufferedReader = new BufferedReader(new InputStreamReader(fileInputStream));
	}
	
	
    public byte readByte() throws IOException {
    	int cur = fileInputStream.read();
    	if (cur == -1)//если закончился файл
    		throw new EOFException();
    	return (byte)cur;
    }
    
    public String readLine() throws IOException {
    	return fileBufferedReader.readLine();
    }
    
    @Override
    public void close() throws IOException{
    	fileInputStream.close();
    }
}

Nya, sareng kelas utama:

import java.io.File;
import java.nio.charset.MalformedInputException;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;
import java.util.List;
import java.io.EOFException;
public class Main {
	private static final byte ENCODING_TABLE_SIZE = 127;
	
    public static void main(String[] args) throws IOException {
        try {//указываем инструкцию с помощью аргументов командной строки
            if (args[0].equals("--compress") || args[0].equals("-c"))
                compress(args[1]);
            else if ((args[0].equals("--extract") || args[0].equals("-x"))
            		&& (args[2].equals("--table") || args[2].equals("-t"))) {
            	extract(args[1], args[3]);
            }
            else
                throw new IllegalArgumentException();
        } catch (ArrayIndexOutOfBoundsException | IllegalArgumentException e) {
            System.out.println("Неверный формат ввода аргументов ");
            System.out.println("Читайте Readme.txt");
            e.printStackTrace();
        }
    }

	public static void compress(String stringPath) throws IOException {
        List<String> stringList;
        File inputFile = new File(stringPath);
        String s = "";
        File compressedFile, table;
        
        try {
            stringList = Files.readAllLines(Paths.get(inputFile.getAbsolutePath()));
        } catch (NoSuchFileException e) {
            System.out.println("Неверный путь, или такого файла не существует!");
            return;
        } catch (MalformedInputException e) {
        	System.out.println("Текущая кодировка файла не поддерживается");
        	return;
        }

        for (String item : stringList) {
            s += item;
            s += 'n';
        }

        HuffmanOperator operator = new HuffmanOperator(new HuffmanTree(s));

        compressedFile = new File(inputFile.getAbsolutePath() + ".cpr");
        compressedFile.createNewFile();
        try (FileOutputHelper fo = new FileOutputHelper(compressedFile)) {
        	fo.writeBytes(operator.getBytedMsg());
        }
        //create file with encoding table:
        
        table = new File(inputFile.getAbsolutePath() + ".table.txt");
        table.createNewFile();
        try (FileOutputHelper fo = new FileOutputHelper(table)) {
        	fo.writeString(operator.getEncodingTable());
        }
        
        System.out.println("Путь к сжатому файлу: " + compressedFile.getAbsolutePath());
        System.out.println("Путь к кодировочной таблице " + table.getAbsolutePath());
        System.out.println("Без таблицы файл будет невозможно извлечь!");
        
        double idealRatio = Math.round(operator.getCompressionRatio() * 100) / (double) 100;//идеализированный коэффициент
        double realRatio = Math.round((double) inputFile.length() 
        		/ ((double) compressedFile.length() + (double) table.length()) * 100) / (double)100;//настоящий коэффициент
        
        System.out.println("Идеализированный коэффициент сжатия равен " + idealRatio);
        System.out.println("Коэффициент сжатия с учетом кодировочной таблицы " + realRatio);
    }

    public static void extract(String filePath, String tablePath) throws FileNotFoundException, IOException {
        HuffmanOperator operator = new HuffmanOperator();
        File compressedFile = new File(filePath),
        	 tableFile = new File(tablePath),
        	 extractedFile = new File(filePath + ".xtr");
        String compressed = "";
        String[] encodingArray = new String[ENCODING_TABLE_SIZE];
        //read compressed file
        //!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!check here:
        try (FileInputHelper fi = new FileInputHelper(compressedFile)) {
        	byte b;
        	while (true) {
        		b = fi.readByte();//method returns EOFException
        		compressed += String.format("%8s", Integer.toBinaryString(b & 0xff)).replace(" ", "0");
        	}
        } catch (EOFException e) {
        	
        }
        
        //--------------------
        
        //read encoding table:
        try (FileInputHelper fi = new FileInputHelper(tableFile)) {
        	fi.readLine();//skip first empty string
        	encodingArray[(byte)'n'] = fi.readLine();//read code for 'n'
        	while (true) {
        		String s = fi.readLine();
        		if (s == null)
        			throw new EOFException();
        		encodingArray[(byte)s.charAt(0)] = s.substring(1, s.length());        		
        	}
        } catch (EOFException ignore) {}
        
        extractedFile.createNewFile();
        //extract:
		try (FileOutputHelper fo = new FileOutputHelper(extractedFile)) {
			fo.writeString(operator.extract(compressed, encodingArray));
		}
		
		System.out.println("Путь к распакованному файлу " + extractedFile.getAbsolutePath());
    }
}

Anjeun kedah nyerat file readme.txt nyalira :)

kacindekan

Kuring nebak éta sakabéh kuring hayang ngomong. Upami Anjeun gaduh nanaon ngomong ngeunaan incompetence kuring dina ngaronjatkeun kode, algoritma, atawa optimasi sagala sacara umum, teras ngarasa Luncat ka nulis. Upami kuring henteu acan ngajelaskeun nanaon, punten nyerat ogé. Abdi hoyong ngadangu ti anjeun dina komentar!

PS

Sumuhun, abdi masih di dieu, sabab kuring teu poho ngeunaan koefisien. Pikeun string s1, tabel encoding beuratna 48 bait - leuwih badag batan file sumber, sarta kami teu poho ngeunaan enol tambahan (jumlah enol ditambahkeun nyaéta 7) => rasio komprési bakal kirang ti hiji: 176 / (65 + 48*8 + 7) = 0.38. Upami anjeun ogé perhatikeun ieu, maka éta sanés ngan ukur wajah anjeun anu saé. Sumuhun, palaksanaan ieu bakal pisan teu episien pikeun file leutik. Tapi naon anu lumangsung kana file ageung? Ukuran file jauh leuwih badag batan ukuran tabel encoding. Ieu dimana algoritma jalan sakumaha sakuduna! Contona, pikeun Monolog Faust Arsip ngahasilkeun koefisien nyata (teu idealized) 1.46 - ampir hiji satengah kali! Sareng leres, file éta kedahna dina basa Inggris.

sumber: www.habr.com

Tambahkeun komentar