Methods for compressing/storing media data in the WAVE and JPEG formats, part 1

Hello! My first series of articles will focus on image/audio compression and storage methods such as JPEG (images) and WAVE (audio), and will also include example programs that use these formats (.jpg, .wav) in practice. In this part we will look at WAVE.

History

WAVE (Waveform Audio File Format) is a container file format for storing a recording of an audio stream. This container is typically used to store uncompressed pulse-code-modulated (PCM) audio. (Taken from Wikipedia)

It was created and published in 1991 together with RIFF by Microsoft and IBM (the leading IT companies of that time).

File structure

The file consists of a header and the data itself; there is no footer. The header is 44 bytes in total.
The header holds the number of bits per sample, the sample rate, the sound depth and other information the sound card needs. (All numeric values in the table must be written in Little-Endian byte order.)
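
To illustrate the byte-order note, here is a minimal sketch using Python's standard struct module. The value 48000 is only an example. Chunk identifiers such as "RIFF" are plain ASCII bytes stored in reading order, which is why their hex form is usually quoted as Big-Endian.

```python
from struct import pack, unpack

# A 4-byte unsigned integer (an example sample rate) in both byte orders.
little = pack('<L', 48000)   # least significant byte first, as WAVE requires
big = pack('>L', 48000)      # most significant byte first

print(little.hex())  # 80bb0000
print(big.hex())     # 0000bb80

# Chunk IDs are ASCII text stored in reading order.
riff_id = b'RIFF'
riff_as_int = unpack('>L', riff_id)[0]
print(hex(riff_as_int))  # 0x52494646
```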

| Field name | Size (bytes) | Description / purpose | Value (fixed for some fields) |
|---|---|---|---|
| chunkId | 4 | Identifies the file as a media container | 0x52494646 in Big-Endian ("RIFF") |
| chunkSize | 4 | Size of the whole file excluding chunkId and chunkSize | FILE_SIZE - 8 |
| format | 4 | Type identifier within RIFF | 0x57415645 in Big-Endian ("WAVE") |
| subchunk1Id | 4 | Marks the start of the format subchunk that continues the header | 0x666d7420 in Big-Endian ("fmt ", with a trailing space) |
| subchunk1Size | 4 | Size of the rest of the format subchunk (in bytes) | 16 by default (for an audio stream without compression) |
| audioFormat | 2 | Audio format (depends on the compression method and the layout of the audio data) | 1 (for PCM, which is what we consider here) |
| numChannels | 2 | Number of channels | 1 (mono) or 2 (stereo); the script below uses 2 (3/4/5/6/7... are for specific audio layouts, for example 4 for quad audio, etc.) |
| sampleRate | 4 | Audio sample rate (in Hertz) | The higher it is, the better the sound, but the more memory is needed for an audio track of the same length; the recommended value is 48000 (the most acceptable sound quality) |
| byteRate | 4 | Number of bytes per second | sampleRate * numChannels * bitsPerSample / 8 |
| blockAlign | 2 | Number of bytes per sample (across all channels) | numChannels * bitsPerSample / 8 |
| bitsPerSample | 2 | Number of bits per sample (depth) | Any multiple of 8. The higher the number, the better and heavier the audio; beyond 32 bits a person hears no difference |
| subchunk2Id | 4 | Marks the start of the data (other header elements may be present, depending on audioFormat) | 0x64617461 in Big-Endian ("data") |
| subchunk2Size | 4 | Size of the data area | size of data, as an integer |
| data | byteRate * audio duration | Audio data | ? |
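
As a rough illustration of how these fields fit together, the header can be packed in a single struct call. This is only a sketch for uncompressed PCM: the helper name build_wav_header and its default parameter values are assumptions made for the example, not part of the format.

```python
from struct import pack

def build_wav_header(num_frames, sample_rate=48000, num_channels=1,
                     bits_per_sample=16):
    """Return a 44-byte PCM WAVE header (hypothetical helper)."""
    block_align = num_channels * bits_per_sample // 8  # bytes per sample frame
    byte_rate = sample_rate * block_align              # bytes per second
    data_size = num_frames * block_align               # subchunk2Size
    return pack(
        '<4sL4s4sLHHLLHH4sL',            # '<' = Little-Endian, no padding
        b'RIFF', 36 + data_size, b'WAVE',
        b'fmt ', 16, 1, num_channels, sample_rate,
        byte_rate, block_align, bits_per_sample,
        b'data', data_size,
    )

header = build_wav_header(num_frames=48000)  # one second of 16-bit mono
print(len(header))  # 44
```

With 16-bit mono at 48000 Hz, one second of audio is 96000 bytes of data, so chunkSize comes out to 96036.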

WAVE example

The table above translates easily into a struct in C, but our language for today is Python. The simplest thing you can build with a "wave" is a noise generator. For this task we need neither a high byteRate nor compression.
First, let's import the necessary modules:

# WAV.py

from struct import pack  # converts Python objects to basic C types
from os import urandom  # random bytes from the OS; available on Linux, macOS and Windows
from sys import argv, exit  # command-line arguments and exit

if len(argv) != 3:  # script name + 2 arguments
    print('Usage: python3 WAV.py [num of samples] [output]')
    exit(1)

Next, we need to create all the variables from the table according to their sizes. The values here depend only on numSamples (the number of samples). The more samples there are, the longer our noise lasts.

numSamples = int(argv[1])
output_path = argv[2]

chunkId = b'RIFF'
Format = b'WAVE'
subchunk1ID = b'fmt '
subchunk1Size = b'\x10\x00\x00\x00'  # 16
audioFormat = b'\x01\x00'  # 1 = PCM
numChannels = b'\x02\x00'  # 2 channels (stereo) will be enough
sampleRate = pack('<L', 1000)  # 1000 is enough, but a higher rate makes the noise easier to hear. At 1000 it sounds like wind
bitsPerSample = b'\x20\x00'  # 32
byteRate = pack('<L', 1000 * 2 * 4)  # sampleRate * numChannels * bitsPerSample / 8  (32-bit sound)
blockAlign = b'\x08\x00'  # numChannels * BPS / 8
subchunk2ID = b'data'
subchunk2Size = pack('<L', numSamples * 2 * 4)  # numSamples * numChannels * BPS / 8
chunkSize = pack('<L', 36 + numSamples * 2 * 4)  # 36 + subchunk2Size
data = urandom(numSamples * 2 * 4)  # the noise itself; its length must match subchunk2Size
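
As a quick check of how the sizes declared in the header follow from numSamples, the arithmetic can be sketched as follows. The value 5000 is an arbitrary example; the channel count, bit depth and sample rate are the ones used in this script.

```python
num_samples = 5000                 # example value (assumption)
num_channels, bits_per_sample, sample_rate = 2, 32, 1000

block_align = num_channels * bits_per_sample // 8   # bytes per sample frame
data_size = num_samples * block_align               # subchunk2Size
file_size = 44 + data_size                          # header + data
duration = num_samples / sample_rate                # seconds of noise

print(block_align, data_size, file_size, duration)  # 8 40000 40044 5.0
```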

All that remains is to write them out in the required order (as in the table):

with open(output_path, 'wb') as fh:
    fh.write(chunkId + chunkSize + Format + subchunk1ID +
            subchunk1Size + audioFormat + numChannels + 
            sampleRate + byteRate + blockAlign + bitsPerSample +
            subchunk2ID + subchunk2Size + data)  # write to the file

And that's it, done. To use the script, pass the required command-line arguments:
python3 WAV.py [num of samples] [output]
num of samples - the number of samples
output - path to the output file
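
One way to check a generated file is to read it back with Python's standard wave module. The sketch below writes a small file itself so that it runs standalone; the temporary directory and the file name noise.wav are assumptions made for the example.

```python
import os
import tempfile
import wave
from struct import pack

num_samples = 1000                      # sample frames, as passed to WAV.py
num_channels, sample_rate, bits = 2, 1000, 32
block_align = num_channels * bits // 8  # bytes per sample frame
data = os.urandom(num_samples * block_align)

path = os.path.join(tempfile.mkdtemp(), 'noise.wav')
with open(path, 'wb') as fh:
    fh.write(b'RIFF' + pack('<L', 36 + len(data)) + b'WAVE' +
             b'fmt ' + pack('<LHHLLHH', 16, 1, num_channels, sample_rate,
                            sample_rate * block_align, block_align, bits) +
             b'data' + pack('<L', len(data)) + data)

# Read the header back and compare it with what was written.
with wave.open(path, 'rb') as wf:
    channels = wf.getnchannels()
    width = wf.getsampwidth()   # bytes per sample for one channel
    rate = wf.getframerate()
    frames = wf.getnframes()

print(channels, width, rate, frames)  # 2 4 1000 1000
print(frames / rate, 'seconds')       # 1.0 seconds
```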

Here is a link to a test audio file with noise. To save memory I lowered the BPS to 1b/s and reduced the number of channels to 1 (with a 32-bit uncompressed stereo audio stream at 64kbs the result was 80M of pure .wav file, while only 10 were needed): https://instaud.io/3Dcy

The complete code (WAV.py) (it contains many duplicated variable values; this is only a sketch):

from struct import pack  # converts Python objects to basic C types
from os import urandom  # random bytes from the OS; available on Linux, macOS and Windows
from sys import argv, exit  # command-line arguments and exit

if len(argv) != 3:  # script name + 2 arguments
    print('Usage: python3 WAV.py [num of samples] [output]')
    exit(1)

numSamples = int(argv[1])
output_path = argv[2]

chunkId = b'RIFF'
Format = b'WAVE'
subchunk1ID = b'fmt '
subchunk1Size = b'\x10\x00\x00\x00'  # 16
audioFormat = b'\x01\x00'  # 1 = PCM
numChannels = b'\x02\x00'  # 2 channels (stereo) will be enough
sampleRate = pack('<L', 1000)  # 1000 is enough, but it can be higher
bitsPerSample = b'\x20\x00'  # 32
byteRate = pack('<L', 1000 * 2 * 4)  # sampleRate * numChannels * bitsPerSample / 8  (32-bit sound)
blockAlign = b'\x08\x00'  # numChannels * BPS / 8
subchunk2ID = b'data'
subchunk2Size = pack('<L', numSamples * 2 * 4)  # numSamples * numChannels * BPS / 8
chunkSize = pack('<L', 36 + numSamples * 2 * 4)  # 36 + subchunk2Size

data = urandom(numSamples * 2 * 4)  # the noise itself; its length must match subchunk2Size

with open(output_path, 'wb') as fh:
    fh.write(chunkId + chunkSize + Format + subchunk1ID +
            subchunk1Size + audioFormat + numChannels + 
            sampleRate + byteRate + blockAlign + bitsPerSample +
            subchunk2ID + subchunk2Size + data)  # write the result to the file

Summary

So you have learned a bit more about digital audio and how it is stored. In this post we did not use compression (audioFormat); covering each of the popular methods would take about 10 articles. I hope you learned something new and that it helps you in future projects.
Thank you!

Sources

WAV file structure
WAV - Wikipedia

Source: www.habr.com
