ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman

ΠŸΡ€Π΅Π΄ΠΈ Π½Π°Ρ‡Π°Π»ΠΎΡ‚ΠΎ Π½Π° курса "Алгоритми Π·Π° Ρ€Π°Π·Ρ€Π°Π±ΠΎΡ‚Ρ‡ΠΈΡ†ΠΈ" ΠΏΠΎΠ΄Π³ΠΎΡ‚Π²ΠΈ Π·Π° вас ΠΏΡ€Π΅Π²ΠΎΠ΄ Π½Π° Π΄Ρ€ΡƒΠ³ ΠΏΠΎΠ»Π΅Π·Π΅Π½ ΠΌΠ°Ρ‚Π΅Ρ€ΠΈΠ°Π».

ΠšΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ Π½Π° Huffman Π΅ Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Π΄Π°Π½Π½ΠΈ, ΠΊΠΎΠΉΡ‚ΠΎ Ρ„ΠΎΡ€ΠΌΡƒΠ»ΠΈΡ€Π° основната идСя Π·Π° компрСсиранС Π½Π° Ρ„Π°ΠΉΠ»ΠΎΠ²Π΅. Π’ Ρ‚Π°Π·ΠΈ статия Ρ‰Π΅ Π³ΠΎΠ²ΠΎΡ€ΠΈΠΌ Π·Π° ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ с фиксирана ΠΈ ΠΏΡ€ΠΎΠΌΠ΅Π½Π»ΠΈΠ²Π° дълТина, ΡƒΠ½ΠΈΠΊΠ°Π»Π½ΠΎ Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€ΡƒΠ΅ΠΌΠΈ ΠΊΠΎΠ΄ΠΎΠ²Π΅, прСфиксни ΠΏΡ€Π°Π²ΠΈΠ»Π° ΠΈ ΠΈΠ·Π³Ρ€Π°ΠΆΠ΄Π°Π½Π΅ Π½Π° Π΄ΡŠΡ€Π²ΠΎ Π½Π° Π₯ΡŠΡ„ΠΌΠ°Π½.

Π—Π½Π°Π΅ΠΌ, Ρ‡Π΅ всСки Π·Π½Π°ΠΊ сС ΡΡŠΡ…Ρ€Π°Π½ΡΠ²Π° ΠΊΠ°Ρ‚ΠΎ послСдоватСлност ΠΎΡ‚ 0 ΠΈ 1 ΠΈ Π·Π°Π΅ΠΌΠ° 8 Π±ΠΈΡ‚Π°. Π’ΠΎΠ²Π° сС Π½Π°Ρ€ΠΈΡ‡Π° ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ с фиксирана дълТина, Π·Π°Ρ‰ΠΎΡ‚ΠΎ всСки Π·Π½Π°ΠΊ ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π° Π΅Π΄ΠΈΠ½ ΠΈ ΡΡŠΡ‰ фиксиран Π±Ρ€ΠΎΠΉ Π±ΠΈΡ‚ΠΎΠ²Π΅ Π·Π° ΡΡŠΡ…Ρ€Π°Π½Π΅Π½ΠΈΠ΅.

Π”Π° ΠΊΠ°ΠΆΠ΅ΠΌ, Ρ‡Π΅ Π½ΠΈ Π΅ Π΄Π°Π΄Π΅Π½ тСкст. Как ΠΌΠΎΠΆΠ΅ΠΌ Π΄Π° Π½Π°ΠΌΠ°Π»ΠΈΠΌ количСството пространство, Π½Π΅ΠΎΠ±Ρ…ΠΎΠ΄ΠΈΠΌΠΎ Π·Π° ΡΡŠΡ…Ρ€Π°Π½ΡΠ²Π°Π½Π΅ Π½Π° Π΅Π΄ΠΈΠ½ Π·Π½Π°ΠΊ?

ΠžΡΠ½ΠΎΠ²Π½Π°Ρ‚Π° идСя Π΅ ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ с ΠΏΡ€ΠΎΠΌΠ΅Π½Π»ΠΈΠ²Π° дълТина. МоТСм Π΄Π° ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°ΠΌΠ΅ Ρ„Π°ΠΊΡ‚Π°, Ρ‡Π΅ някои Π·Π½Π°Ρ†ΠΈ Π² тСкста сС срСщат ΠΏΠΎ-чСсто ΠΎΡ‚ Π΄Ρ€ΡƒΠ³ΠΈ (см. Ρ‚ΡƒΠΊ), Π·Π° Π΄Π° сС Ρ€Π°Π·Ρ€Π°Π±ΠΎΡ‚ΠΈ Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ, ΠΊΠΎΠΉΡ‚ΠΎ Ρ‰Π΅ прСдстави ΡΡŠΡ‰Π°Ρ‚Π° послСдоватСлност ΠΎΡ‚ Π·Π½Π°Ρ†ΠΈ Π² ΠΏΠΎ-ΠΌΠ°Π»ΠΊΠΎ Π±ΠΈΡ‚ΠΎΠ²Π΅. ΠŸΡ€ΠΈ ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ с ΠΏΡ€ΠΎΠΌΠ΅Π½Π»ΠΈΠ²Π° дълТина Π½ΠΈΠ΅ присвоявамС Π½Π° Π·Π½Π°Ρ†ΠΈΡ‚Π΅ ΠΏΡ€ΠΎΠΌΠ΅Π½Π»ΠΈΠ² Π±Ρ€ΠΎΠΉ Π±ΠΈΡ‚ΠΎΠ²Π΅ Π² зависимост ΠΎΡ‚ Ρ‚ΠΎΠ²Π° ΠΊΠΎΠ»ΠΊΠΎ чСсто сС появяват Π² Π΄Π°Π΄Π΅Π½ тСкст. Π’ ΠΊΡ€Π°ΠΉΠ½Π° смСтка някои Π·Π½Π°Ρ†ΠΈ ΠΌΠΎΠ³Π°Ρ‚ Π΄Π° Π·Π°Π΅ΠΌΠ°Ρ‚ само 1 Π±ΠΈΡ‚, Π΄ΠΎΠΊΠ°Ρ‚ΠΎ Π΄Ρ€ΡƒΠ³ΠΈ ΠΌΠΎΠ³Π°Ρ‚ Π΄Π° Π·Π°Π΅ΠΌΠ°Ρ‚ 2 Π±ΠΈΡ‚Π°, 3 ΠΈΠ»ΠΈ ΠΏΠΎΠ²Π΅Ρ‡Π΅. ΠŸΡ€ΠΎΠ±Π»Π΅ΠΌΡŠΡ‚ с ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ с ΠΏΡ€ΠΎΠΌΠ΅Π½Π»ΠΈΠ²Π° дълТина Π΅ само послСдващото Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ Π½Π° послСдоватСлността.

Как, Π·Π½Π°Π΅ΠΉΠΊΠΈ послСдоватСлността ΠΎΡ‚ Π±ΠΈΡ‚ΠΎΠ²Π΅, Π΄Π° Π³ΠΎ Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°ΠΌ нСдвусмислСно?

ΠŸΠΎΠΌΠΈΡΠ»Π΅Ρ‚Π΅ Π·Π° линията "Π°Π±Π°ΠΊΠ΄Π°Π±". Π’ΠΎΠΉ ΠΈΠΌΠ° 8 Π·Π½Π°ΠΊΠ° ΠΈ ΠΏΡ€ΠΈ ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ с фиксирана дълТина Ρ‰Π΅ са ΠΌΡƒ Π½Π΅ΠΎΠ±Ρ…ΠΎΠ΄ΠΈΠΌΠΈ 64 Π±ΠΈΡ‚Π°, Π·Π° Π΄Π° Π³ΠΎ ΡΡŠΡ…Ρ€Π°Π½ΠΈ. Π˜ΠΌΠ°ΠΉΡ‚Π΅ ΠΏΡ€Π΅Π΄Π²ΠΈΠ΄, Ρ‡Π΅ чСстотата Π½Π° символа "Π°", "Π±", "Π²" ΠΈ "Π”" Π΅ Ρ€Π°Π²Π½ΠΎ Π½Π° 4, 2, 1, 1 ΡΡŠΠΎΡ‚Π²Π΅Ρ‚Π½ΠΎ. НСка сС ΠΎΠΏΠΈΡ‚Π°ΠΌΠ΅ Π΄Π° си прСдставим "Π°Π±Π°ΠΊΠ΄Π°Π±" ΠΏΠΎ-ΠΌΠ°Π»ΠΊΠΎ Π±ΠΈΡ‚ΠΎΠ²Π΅, ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°ΠΉΠΊΠΈ Ρ„Π°ΠΊΡ‚Π°, Ρ‡Π΅ "Π΄Π° сС" сС срСща ΠΏΠΎ-чСсто ΠΎΡ‚ "Π’"И "Π’" сС срСща ΠΏΠΎ-чСсто ΠΎΡ‚ "Β° Π‘" ΠΈ "Π”". Π”Π° Π·Π°ΠΏΠΎΡ‡Π½Π΅ΠΌ с ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ "Π΄Π° сС" с Π΅Π΄ΠΈΠ½ Π±ΠΈΡ‚ Ρ€Π°Π²Π΅Π½ Π½Π° 0, "Π’" Ρ‰Π΅ Π·Π°Π΄Π°Π΄Π΅ΠΌ Π΄Π²ΡƒΠ±ΠΈΡ‚ΠΎΠ² ΠΊΠΎΠ΄ 11 ΠΈ ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°ΠΉΠΊΠΈ Ρ‚Ρ€ΠΈ Π±ΠΈΡ‚Π° 100 ΠΈ 011 Ρ‰Π΅ ΠΊΠΎΠ΄ΠΈΡ€Π°ΠΌΠ΅ "Β° Π‘" ΠΈ "Π”".

Π’ Ρ€Π΅Π·ΡƒΠ»Ρ‚Π°Ρ‚ Π½Π° Ρ‚ΠΎΠ²Π° Ρ‰Π΅ ΠΏΠΎΠ»ΡƒΡ‡ΠΈΠΌ:

a
0

b
11

c
100

d
011

Π’Π°ΠΊΠ° Ρ‡Π΅ линията "Π°Π±Π°ΠΊΠ΄Π°Π±" Ρ‰Π΅ ΠΊΠΎΠ΄ΠΈΡ€Π°ΠΌΠ΅ ΠΊΠ°Ρ‚ΠΎ 00110100011011 (0|0|11|0|100|011|0|11)ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°ΠΉΠΊΠΈ ΠΊΠΎΠ΄ΠΎΠ²Π΅Ρ‚Π΅ ΠΏΠΎ-Π³ΠΎΡ€Π΅. ΠžΡΠ½ΠΎΠ²Π½ΠΈΡΡ‚ ΠΏΡ€ΠΎΠ±Π»Π΅ΠΌ ΠΎΠ±Π°Ρ‡Π΅ Ρ‰Π΅ бъдС Π² Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ. ΠšΠΎΠ³Π°Ρ‚ΠΎ сС ΠΎΠΏΠΈΡ‚Π°ΠΌΠ΅ Π΄Π° Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°ΠΌΠ΅ Π½ΠΈΠ·Π° 00110100011011, Ρ‰Π΅ ΠΏΠΎΠ»ΡƒΡ‡ΠΈΠΌ двусмислСн Ρ€Π΅Π·ΡƒΠ»Ρ‚Π°Ρ‚, Ρ‚ΡŠΠΉ ΠΊΠ°Ρ‚ΠΎ ΠΌΠΎΠΆΠ΅ Π΄Π° бъдС прСдставСн ΠΊΠ°Ρ‚ΠΎ:

0|011|0|100|011|0|11    adacdab
0|0|11|0|100|0|11|011   aabacabd
0|011|0|100|0|11|0|11   adacabab 

...
ΠΈ Ρ‚.Π½.

Π—Π° Π΄Π° ΠΈΠ·Π±Π΅Π³Π½Π΅ΠΌ Ρ‚Π°Π·ΠΈ нСяснота, трябва Π΄Π° Π³Π°Ρ€Π°Π½Ρ‚ΠΈΡ€Π°ΠΌΠ΅, Ρ‡Π΅ Π½Π°ΡˆΠ΅Ρ‚ΠΎ ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ отговаря Π½Π° Ρ‚Π°ΠΊΠ°Π²Π° концСпция ΠΊΠ°Ρ‚ΠΎ ΠΏΡ€Π°Π²ΠΈΠ»ΠΎ Π·Π° прСфикс, ΠΊΠΎΠ΅Ρ‚ΠΎ Π½Π° свой Ρ€Π΅Π΄ ΠΏΡ€Π΅Π΄ΠΏΠΎΠ»Π°Π³Π°, Ρ‡Π΅ ΠΊΠΎΠ΄ΠΎΠ²Π΅Ρ‚Π΅ ΠΌΠΎΠ³Π°Ρ‚ Π΄Π° Π±ΡŠΠ΄Π°Ρ‚ Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°Π½ΠΈ само ΠΏΠΎ Π΅Π΄ΠΈΠ½ ΡƒΠ½ΠΈΠΊΠ°Π»Π΅Π½ Π½Π°Ρ‡ΠΈΠ½. ΠŸΡ€Π°Π²ΠΈΠ»ΠΎΡ‚ΠΎ Π·Π° прСфикс Π³Π°Ρ€Π°Π½Ρ‚ΠΈΡ€Π°, Ρ‡Π΅ Π½ΠΈΠΊΠΎΠΉ ΠΊΠΎΠ΄ Π½Π΅ Π΅ прСфикс Π½Π° Π΄Ρ€ΡƒΠ³. Под ΠΊΠΎΠ΄ ΠΈΠΌΠ°ΠΌΠ΅ ΠΏΡ€Π΅Π΄Π²ΠΈΠ΄ Π±ΠΈΡ‚ΠΎΠ²Π΅Ρ‚Π΅, ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°Π½ΠΈ Π·Π° прСдставянС Π½Π° ΠΎΠΏΡ€Π΅Π΄Π΅Π»Π΅Π½ символ. Π’ горния ΠΏΡ€ΠΈΠΌΠ΅Ρ€ 0 Π΅ прСфикс 011, ΠΊΠΎΠ΅Ρ‚ΠΎ Π½Π°Ρ€ΡƒΡˆΠ°Π²Π° ΠΏΡ€Π°Π²ΠΈΠ»ΠΎΡ‚ΠΎ Π·Π° прСфикса. Π’Π°ΠΊΠ° Ρ‡Π΅, Π°ΠΊΠΎ Π½Π°ΡˆΠΈΡ‚Π΅ ΠΊΠΎΠ΄ΠΎΠ²Π΅ удовлСтворяват ΠΏΡ€Π°Π²ΠΈΠ»ΠΎΡ‚ΠΎ Π·Π° прСфикса, Ρ‚ΠΎΠ³Π°Π²Π° ΠΌΠΎΠΆΠ΅ΠΌ ΡƒΠ½ΠΈΠΊΠ°Π»Π½ΠΎ Π΄Π° Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°ΠΌΠ΅ (ΠΈ ΠΎΠ±Ρ€Π°Ρ‚Π½ΠΎ).

НСка ΠΏΡ€Π΅Ρ€Π°Π·Π³Π»Π΅Π΄Π°ΠΌΠ΅ ΠΏΡ€ΠΈΠΌΠ΅Ρ€Π° ΠΏΠΎ-Π³ΠΎΡ€Π΅. Π’ΠΎΠ·ΠΈ ΠΏΡŠΡ‚ Ρ‰Π΅ Π·Π°Π΄Π°Π΄Π΅ΠΌ Π·Π° символи "Π°", "Π±", "Π²" ΠΈ "Π”" ΠΊΠΎΠ΄ΠΎΠ²Π΅, ΠΊΠΎΠΈΡ‚ΠΎ отговарят Π½Π° ΠΏΡ€Π°Π²ΠΈΠ»ΠΎΡ‚ΠΎ Π·Π° прСфикса.

a
0

b
10

c
110

d
111

Π‘ Ρ‚ΠΎΠ²Π° ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅ Π½ΠΈΠ·ΡŠΡ‚ "Π°Π±Π°ΠΊΠ΄Π°Π±" Ρ‰Π΅ бъдС ΠΊΠΎΠ΄ΠΈΡ€Π°Π½ ΠΊΠ°Ρ‚ΠΎ 00100100011010 (0|0|10|0|100|011|0|10). Но 00100100011010 Π²Π΅Ρ‡Π΅ Ρ‰Π΅ ΠΌΠΎΠΆΠ΅ΠΌ нСдвусмислСно Π΄Π° Π΄Π΅ΠΊΠΎΠ΄ΠΈΡ€Π°ΠΌΠ΅ ΠΈ Π΄Π° сС Π²ΡŠΡ€Π½Π΅ΠΌ към нашия ΠΎΡ€ΠΈΠ³ΠΈΠ½Π°Π»Π΅Π½ Π½ΠΈΠ· "Π°Π±Π°ΠΊΠ΄Π°Π±".

ΠšΠΎΠ΄ΠΈΡ€Π°Π½Π΅ Π½Π° Π₯ΡŠΡ„ΠΌΠ°Π½

Π‘Π΅Π³Π°, слСд ΠΊΠ°Ρ‚ΠΎ сС справихмС с ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ с ΠΏΡ€ΠΎΠΌΠ΅Π½Π»ΠΈΠ²Π° дълТина ΠΈ ΠΏΡ€Π°Π²ΠΈΠ»ΠΎΡ‚ΠΎ Π·Π° прСфикса, Π½Π΅ΠΊΠ° ΠΏΠΎΠ³ΠΎΠ²ΠΎΡ€ΠΈΠΌ Π·Π° ΠΊΠΎΠ΄ΠΈΡ€Π°Π½Π΅Ρ‚ΠΎ Π½Π° Huffman.

ΠœΠ΅Ρ‚ΠΎΠ΄ΡŠΡ‚ сС основава Π½Π° ΡΡŠΠ·Π΄Π°Π²Π°Π½Π΅Ρ‚ΠΎ Π½Π° Π΄Π²ΠΎΠΈΡ‡Π½ΠΈ Π΄ΡŠΡ€Π²Π΅Ρ‚Π°. Π’ Π½Π΅Π³ΠΎ Π²ΡŠΠ·Π΅Π»ΡŠΡ‚ ΠΌΠΎΠΆΠ΅ Π΄Π° бъдС ΠΊΡ€Π°Π΅Π½ ΠΈΠ»ΠΈ Π²ΡŠΡ‚Ρ€Π΅ΡˆΠ΅Π½. ΠŸΡŠΡ€Π²ΠΎΠ½Π°Ρ‡Π°Π»Π½ΠΎ всички възли сС считат Π·Π° листа (Ρ‚Π΅Ρ€ΠΌΠΈΠ½Π°Π»ΠΈ), ΠΊΠΎΠΈΡ‚ΠΎ прСдставляват самия символ ΠΈ Π½Π΅Π³ΠΎΠ²ΠΎΡ‚ΠΎ Ρ‚Π΅Π³Π»ΠΎ (Ρ‚.Π΅. чСстотата Π½Π° появяванС). Π’ΡŠΡ‚Ρ€Π΅ΡˆΠ½ΠΈΡ‚Π΅ възли ΡΡŠΠ΄ΡŠΡ€ΠΆΠ°Ρ‚ Ρ‚Π΅Π³Π»ΠΎΡ‚ΠΎ Π½Π° Π·Π½Π°ΠΊΠ° ΠΈ сС отнасят Π·Π° Π΄Π²Π° наслСдствСни възСла. По ΠΎΠ±Ρ‰ΠΎ съгласиС Π±ΠΈΡ‚ Β«0Β» прСдставлява слСдванС Π½Π° лСвия ΠΊΠ»ΠΎΠ½ ΠΈ Β«1Β» - вдясно. Π² пълно Π΄ΡŠΡ€Π²ΠΎ N листа ΠΈ N-1 Π²ΡŠΡ‚Ρ€Π΅ΡˆΠ½ΠΈ възли. ΠŸΡ€Π΅ΠΏΠΎΡ€ΡŠΡ‡Π²Π° сС, ΠΊΠΎΠ³Π°Ρ‚ΠΎ сС конструира Π΄ΡŠΡ€Π²ΠΎ Π½Π° Huffman, Π½Π΅ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°Π½ΠΈΡ‚Π΅ символи Π΄Π° сС ΠΈΠ·Ρ…Π²ΡŠΡ€Π»ΡΡ‚, Π·Π° Π΄Π° сС ΠΏΠΎΠ»ΡƒΡ‡Π°Ρ‚ ΠΊΠΎΠ΄ΠΎΠ²Π΅ с ΠΎΠΏΡ‚ΠΈΠΌΠ°Π»Π½Π° дълТина.

Π©Π΅ ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°ΠΌΠ΅ опашка с ΠΏΡ€ΠΈΠΎΡ€ΠΈΡ‚Π΅Ρ‚, Π·Π° Π΄Π° ΠΈΠ·Π³Ρ€Π°Π΄ΠΈΠΌ Π΄ΡŠΡ€Π²ΠΎ Π½Π° Π₯ΡŠΡ„ΠΌΠ°Π½, ΠΊΡŠΠ΄Π΅Ρ‚ΠΎ Π²ΡŠΠ·Π΅Π»ΡŠΡ‚ с Π½Π°ΠΉ-ниска чСстота Ρ‰Π΅ ΠΏΠΎΠ»ΡƒΡ‡ΠΈ Π½Π°ΠΉ-висок ΠΏΡ€ΠΈΠΎΡ€ΠΈΡ‚Π΅Ρ‚. Π‘Ρ‚ΡŠΠΏΠΊΠΈΡ‚Π΅ Π½Π° ΠΈΠ·Π³Ρ€Π°ΠΆΠ΄Π°Π½Π΅ са описани ΠΏΠΎ-Π΄ΠΎΠ»Ρƒ:

  1. Π‘ΡŠΠ·Π΄Π°ΠΉΡ‚Π΅ листов възСл Π·Π° всСки Π·Π½Π°ΠΊ ΠΈ Π³ΠΈ Π΄ΠΎΠ±Π°Π²Π΅Ρ‚Π΅ към ΠΎΠΏΠ°ΡˆΠΊΠ°Ρ‚Π° с ΠΏΡ€ΠΈΠΎΡ€ΠΈΡ‚Π΅Ρ‚ΠΈ.
  2. Π”ΠΎΠΊΠ°Ρ‚ΠΎ ΠΈΠΌΠ° ΠΏΠΎΠ²Π΅Ρ‡Π΅ ΠΎΡ‚ Π΅Π΄ΠΈΠ½ лист Π² ΠΎΠΏΠ°ΡˆΠΊΠ°Ρ‚Π°, Π½Π°ΠΏΡ€Π°Π²Π΅Ρ‚Π΅ слСдното:
    • ΠŸΡ€Π΅ΠΌΠ°Ρ…Π½Π΅Ρ‚Π΅ Π΄Π²Π°Ρ‚Π° възСла с Π½Π°ΠΉ-висок ΠΏΡ€ΠΈΠΎΡ€ΠΈΡ‚Π΅Ρ‚ (Π½Π°ΠΉ-ниска чСстота) ΠΎΡ‚ ΠΎΠΏΠ°ΡˆΠΊΠ°Ρ‚Π°;
    • Π‘ΡŠΠ·Π΄Π°ΠΉΡ‚Π΅ Π½ΠΎΠ² Π²ΡŠΡ‚Ρ€Π΅ΡˆΠ΅Π½ възСл, ΠΊΡŠΠ΄Π΅Ρ‚ΠΎ Ρ‚Π΅Π·ΠΈ Π΄Π²Π° възСла Ρ‰Π΅ Π±ΡŠΠ΄Π°Ρ‚ Π΄ΡŠΡ‰Π΅Ρ€Π½ΠΈ ΠΈ чСстотата Π½Π° поява Ρ‰Π΅ бъдС Ρ€Π°Π²Π½Π° Π½Π° сумата ΠΎΡ‚ чСстотитС Π½Π° Ρ‚Π΅Π·ΠΈ Π΄Π²Π° възСла.
    • Π”ΠΎΠ±Π°Π²Π΅Ρ‚Π΅ Π½ΠΎΠ² възСл към ΠΎΠΏΠ°ΡˆΠΊΠ°Ρ‚Π° с ΠΏΡ€ΠΈΠΎΡ€ΠΈΡ‚Π΅Ρ‚ΠΈ.
  3. ЕдинствСният останал възСл Ρ‰Π΅ бъдС ΠΊΠΎΡ€Π΅Π½ΡŠΡ‚ ΠΈ Ρ‚ΠΎΠ²Π° Ρ‰Π΅ Π·Π°Π²ΡŠΡ€ΡˆΠΈ ΠΈΠ·Π³Ρ€Π°ΠΆΠ΄Π°Π½Π΅Ρ‚ΠΎ Π½Π° Π΄ΡŠΡ€Π²ΠΎΡ‚ΠΎ.

ΠŸΡ€Π΅Π΄ΡΡ‚Π°Π²Π΅Ρ‚Π΅ си, Ρ‡Π΅ ΠΈΠΌΠ°ΠΌΠ΅ някакъв тСкст, ΠΊΠΎΠΉΡ‚ΠΎ сС ΡΡŠΡΡ‚ΠΎΠΈ само ΠΎΡ‚ Π·Π½Π°Ρ†ΠΈ "a", "b", "c", "d" ΠΈ "ΠΈ"ΠΈ Ρ‚Π΅Ρ…Π½ΠΈΡ‚Π΅ чСстоти Π½Π° поява са ΡΡŠΠΎΡ‚Π²Π΅Ρ‚Π½ΠΎ 15, 7, 6, 6 ΠΈ 5. По-Π΄ΠΎΠ»Ρƒ ΠΈΠΌΠ° ΠΈΠ»ΡŽΡΡ‚Ρ€Π°Ρ†ΠΈΠΈ, ΠΊΠΎΠΈΡ‚ΠΎ отразяват ΡΡ‚ΡŠΠΏΠΊΠΈΡ‚Π΅ Π½Π° Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌΠ°.

ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman

ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman

ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman

ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman

ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman

ΠŸΡŠΡ‚ΡΡ‚ ΠΎΡ‚ ΠΊΠΎΡ€Π΅Π½Π° Π΄ΠΎ ΠΊΠΎΠΉΡ‚ΠΎ ΠΈ Π΄Π° Π΅ ΠΊΡ€Π°Π΅Π½ възСл Ρ‰Π΅ ΡΡŠΡ…Ρ€Π°Π½ΡΠ²Π° оптималния ΠΊΠΎΠ΄ Π½Π° прСфикса (извСстСн ΡΡŠΡ‰ΠΎ ΠΊΠ°Ρ‚ΠΎ ΠΊΠΎΠ΄ Π½Π° Π₯ΡŠΡ„ΠΌΠ°Π½), ΡΡŠΠΎΡ‚Π²Π΅Ρ‚ΡΡ‚Π²Π°Ρ‰ Π½Π° Π·Π½Π°ΠΊΠ°, ΡΠ²ΡŠΡ€Π·Π°Π½ с Ρ‚ΠΎΠ·ΠΈ ΠΊΡ€Π°Π΅Π½ възСл.

ΠΠ»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌ Π·Π° компрСсиранС Π½Π° Huffman
Π”ΡŠΡ€Π²ΠΎ Π½Π° Π₯ΡŠΡ„ΠΌΠ°Π½

По-Π΄ΠΎΠ»Ρƒ Ρ‰Π΅ Π½Π°ΠΌΠ΅Ρ€ΠΈΡ‚Π΅ ΠΈΠ·ΠΏΡŠΠ»Π½Π΅Π½ΠΈΠ΅Ρ‚ΠΎ Π½Π° Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌΠ° Π·Π° компрСсиранС Π½Π° Huffman Π² C++ ΠΈ Java:

#include <iostream>
#include <string>
#include <queue>
#include <unordered_map>
using namespace std;

// A Tree node
struct Node
{
	char ch;
	int freq;
	Node *left, *right;
};

// Function to allocate a new tree node
Node* getNode(char ch, int freq, Node* left, Node* right)
{
	Node* node = new Node();

	node->ch = ch;
	node->freq = freq;
	node->left = left;
	node->right = right;

	return node;
}

// Comparison object to be used to order the heap
struct comp
{
	bool operator()(Node* l, Node* r)
	{
		// highest priority item has lowest frequency
		return l->freq > r->freq;
	}
};

// traverse the Huffman Tree and store Huffman Codes
// in a map.
void encode(Node* root, string str,
			unordered_map<char, string> &huffmanCode)
{
	if (root == nullptr)
		return;

	// found a leaf node
	if (!root->left && !root->right) {
		huffmanCode[root->ch] = str;
	}

	encode(root->left, str + "0", huffmanCode);
	encode(root->right, str + "1", huffmanCode);
}

// traverse the Huffman Tree and decode the encoded string
void decode(Node* root, int &index, string str)
{
	if (root == nullptr) {
		return;
	}

	// found a leaf node
	if (!root->left && !root->right)
	{
		cout << root->ch;
		return;
	}

	index++;

	if (str[index] =='0')
		decode(root->left, index, str);
	else
		decode(root->right, index, str);
}

// Builds Huffman Tree and decode given input text
void buildHuffmanTree(string text)
{
	// count frequency of appearance of each character
	// and store it in a map
	unordered_map<char, int> freq;
	for (char ch: text) {
		freq[ch]++;
	}

	// Create a priority queue to store live nodes of
	// Huffman tree;
	priority_queue<Node*, vector<Node*>, comp> pq;

	// Create a leaf node for each character and add it
	// to the priority queue.
	for (auto pair: freq) {
		pq.push(getNode(pair.first, pair.second, nullptr, nullptr));
	}

	// do till there is more than one node in the queue
	while (pq.size() != 1)
	{
		// Remove the two nodes of highest priority
		// (lowest frequency) from the queue
		Node *left = pq.top(); pq.pop();
		Node *right = pq.top();	pq.pop();

		// Create a new internal node with these two nodes
		// as children and with frequency equal to the sum
		// of the two nodes' frequencies. Add the new node
		// to the priority queue.
		int sum = left->freq + right->freq;
		pq.push(getNode('', sum, left, right));
	}

	// root stores pointer to root of Huffman Tree
	Node* root = pq.top();

	// traverse the Huffman Tree and store Huffman Codes
	// in a map. Also prints them
	unordered_map<char, string> huffmanCode;
	encode(root, "", huffmanCode);

	cout << "Huffman Codes are :n" << 'n';
	for (auto pair: huffmanCode) {
		cout << pair.first << " " << pair.second << 'n';
	}

	cout << "nOriginal string was :n" << text << 'n';

	// print encoded string
	string str = "";
	for (char ch: text) {
		str += huffmanCode[ch];
	}

	cout << "nEncoded string is :n" << str << 'n';

	// traverse the Huffman Tree again and this time
	// decode the encoded string
	int index = -1;
	cout << "nDecoded string is: n";
	while (index < (int)str.size() - 2) {
		decode(root, index, str);
	}
}

// Huffman coding algorithm
int main()
{
	string text = "Huffman coding is a data compression algorithm.";

	buildHuffmanTree(text);

	return 0;
}

import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// A Tree node
class Node
{
	char ch;
	int freq;
	Node left = null, right = null;

	Node(char ch, int freq)
	{
		this.ch = ch;
		this.freq = freq;
	}

	public Node(char ch, int freq, Node left, Node right) {
		this.ch = ch;
		this.freq = freq;
		this.left = left;
		this.right = right;
	}
};

class Huffman
{
	// traverse the Huffman Tree and store Huffman Codes
	// in a map.
	public static void encode(Node root, String str,
							  Map<Character, String> huffmanCode)
	{
		if (root == null)
			return;

		// found a leaf node
		if (root.left == null && root.right == null) {
			huffmanCode.put(root.ch, str);
		}


		encode(root.left, str + "0", huffmanCode);
		encode(root.right, str + "1", huffmanCode);
	}

	// traverse the Huffman Tree and decode the encoded string
	public static int decode(Node root, int index, StringBuilder sb)
	{
		if (root == null)
			return index;

		// found a leaf node
		if (root.left == null && root.right == null)
		{
			System.out.print(root.ch);
			return index;
		}

		index++;

		if (sb.charAt(index) == '0')
			index = decode(root.left, index, sb);
		else
			index = decode(root.right, index, sb);

		return index;
	}

	// Builds Huffman Tree and huffmanCode and decode given input text
	public static void buildHuffmanTree(String text)
	{
		// count frequency of appearance of each character
		// and store it in a map
		Map<Character, Integer> freq = new HashMap<>();
		for (int i = 0 ; i < text.length(); i++) {
			if (!freq.containsKey(text.charAt(i))) {
				freq.put(text.charAt(i), 0);
			}
			freq.put(text.charAt(i), freq.get(text.charAt(i)) + 1);
		}

		// Create a priority queue to store live nodes of Huffman tree
		// Notice that highest priority item has lowest frequency
		PriorityQueue<Node> pq = new PriorityQueue<>(
										(l, r) -> l.freq - r.freq);

		// Create a leaf node for each character and add it
		// to the priority queue.
		for (Map.Entry<Character, Integer> entry : freq.entrySet()) {
			pq.add(new Node(entry.getKey(), entry.getValue()));
		}

		// do till there is more than one node in the queue
		while (pq.size() != 1)
		{
			// Remove the two nodes of highest priority
			// (lowest frequency) from the queue
			Node left = pq.poll();
			Node right = pq.poll();

			// Create a new internal node with these two nodes as children 
			// and with frequency equal to the sum of the two nodes
			// frequencies. Add the new node to the priority queue.
			int sum = left.freq + right.freq;
			pq.add(new Node('', sum, left, right));
		}

		// root stores pointer to root of Huffman Tree
		Node root = pq.peek();

		// traverse the Huffman tree and store the Huffman codes in a map
		Map<Character, String> huffmanCode = new HashMap<>();
		encode(root, "", huffmanCode);

		// print the Huffman codes
		System.out.println("Huffman Codes are :n");
		for (Map.Entry<Character, String> entry : huffmanCode.entrySet()) {
			System.out.println(entry.getKey() + " " + entry.getValue());
		}

		System.out.println("nOriginal string was :n" + text);

		// print encoded string
		StringBuilder sb = new StringBuilder();
		for (int i = 0 ; i < text.length(); i++) {
			sb.append(huffmanCode.get(text.charAt(i)));
		}

		System.out.println("nEncoded string is :n" + sb);

		// traverse the Huffman Tree again and this time
		// decode the encoded string
		int index = -1;
		System.out.println("nDecoded string is: n");
		while (index < sb.length() - 2) {
			index = decode(root, index, sb);
		}
	}

	public static void main(String[] args)
	{
		String text = "Huffman coding is a data compression algorithm.";

		buildHuffmanTree(text);
	}
}

Π—Π°Π±Π΅Π»Π΅ΠΆΠΊΠ°: ΠΏΠ°ΠΌΠ΅Ρ‚Ρ‚Π°, ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°Π½Π° ΠΎΡ‚ входния Π½ΠΈΠ·, Π΅ 47 * 8 = 376 Π±ΠΈΡ‚Π°, Π° кодираният Π½ΠΈΠ· Π΅ само 194 Π±ΠΈΡ‚Π°, Ρ‚.Π΅. Π΄Π°Π½Π½ΠΈΡ‚Π΅ са компрСсирани с ΠΎΠΊΠΎΠ»ΠΎ 48%. Π’ C++ ΠΏΡ€ΠΎΠ³Ρ€Π°ΠΌΠ°Ρ‚Π° ΠΏΠΎ-Π³ΠΎΡ€Π΅ Π½ΠΈΠ΅ ΠΈΠ·ΠΏΠΎΠ»Π·Π²Π°ΠΌΠ΅ класа string, Π·Π° Π΄Π° ΡΡŠΡ…Ρ€Π°Π½ΠΈΠΌ кодирания Π½ΠΈΠ·, Π·Π° ​​да Π½Π°ΠΏΡ€Π°Π²ΠΈΠΌ ΠΏΡ€ΠΎΠ³Ρ€Π°ΠΌΠ°Ρ‚Π° Ρ‡Π΅Ρ‚ΠΈΠΌΠ°.

Въй ΠΊΠ°Ρ‚ΠΎ Π΅Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½ΠΈΡ‚Π΅ ΠΏΡ€ΠΈΠΎΡ€ΠΈΡ‚Π΅Ρ‚Π½ΠΈ структури ΠΎΡ‚ Π΄Π°Π½Π½ΠΈ Π½Π° опашка изискват всяко вмъкванС O(log(N)) Π²Ρ€Π΅ΠΌΠ΅, Π½ΠΎ Π² пълно Π΄Π²ΠΎΠΈΡ‡Π½ΠΎ Π΄ΡŠΡ€Π²ΠΎ с N ΠΏΡ€ΠΈΡΡŠΡΡ‚Π²Π°Ρ‚ листа 2N-1 възли ΠΈ Π΄ΡŠΡ€Π²ΠΎΡ‚ΠΎ Π½Π° Π₯ΡŠΡ„ΠΌΠ°Π½ Π΅ пълно Π΄Π²ΠΎΠΈΡ‡Π½ΠΎ Π΄ΡŠΡ€Π²ΠΎ, Ρ‚ΠΎΠ³Π°Π²Π° Π°Π»Π³ΠΎΡ€ΠΈΡ‚ΡŠΠΌΡŠΡ‚ сС изпълнява O(Nlog(N)) Π²Ρ€Π΅ΠΌΠ΅, къдС N - Π“Π΅Ρ€ΠΎΠΈ.

Π˜Π·Ρ‚ΠΎΡ‡Π½ΠΈΡ†ΠΈ:

en.wikipedia.org/wiki/Huffman_coding
en.wikipedia.org/wiki/Variable-length_code
www.youtube.com/watch?v=5wRPin4oxCo

НаучСтС ΠΏΠΎΠ²Π΅Ρ‡Π΅ Π·Π° курса.

Π˜Π·Ρ‚ΠΎΡ‡Π½ΠΈΠΊ: www.habr.com

ДобавянС Π½Π° Π½ΠΎΠ² ΠΊΠΎΠΌΠ΅Π½Ρ‚Π°Ρ€