AAC Audio Format Analysis

For background on the AAC audio format, see Wikipedia: http://en.wikipedia.org/wiki/Advanced_Audio_Coding

AAC audio comes in two formats, ADIF and ADTS:

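ADIF: Audio Data Interchange Format. Its defining feature is that the start of the audio data can be located unambiguously. Decoding cannot begin partway through the stream; it must start at the clearly defined beginning. For this reason ADIF is commonly used for files on disk.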

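ADTS: Audio Data Transport Stream. Its defining feature is that it is a bitstream with sync words, so decoding can begin anywhere in the stream by locking onto the next sync word. In this respect it resembles the MP3 stream format.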

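In short, ADTS can be decoded starting from any frame, because every frame carries its own header. ADIF has a single header for the whole stream, so the decoder needs the data from its very beginning. The two header formats also differ. Today, audio streams produced by encoders or extracted from containers are generally in ADTS format.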
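
The "start decoding at any frame" property can be sketched as a search for the sync word. This is a minimal illustration, not a full resynchronizer: the function name `find_adts_sync` is hypothetical, and it only checks the 12-bit sync word plus the always-zero layer field described in the header table in this article. A payload byte pair can imitate a header by chance, so a real parser would also validate the frame length and the following frame.

```python
def find_adts_sync(data: bytes, start: int = 0) -> int:
    """Return the offset of the first plausible ADTS header at or after `start`, or -1."""
    for i in range(start, len(data) - 1):
        # Sync word: 12 bits set to 1, i.e. 0xFF followed by a byte whose
        # top nibble is 0xF. The mask 0xF6 additionally requires the two
        # layer bits to be 0 while ignoring the version and CRC flag bits.
        if data[i] == 0xFF and (data[i + 1] & 0xF6) == 0xF0:
            return i
    return -1
```

A receiver that joins a stream mid-way could call this on its buffer and hand everything from the returned offset onward to the decoder.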

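Voice systems have strict real-time requirements. The typical pipeline is: capture audio, encode locally, upload the data, process it on the server, send the data back down, and decode locally.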

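ADTS is a sequence of frames and is inherently stream-oriented, which makes it better suited to transmitting and processing audio streams.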

ADTS frame structure:

header

body
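
Because each frame is a header followed by a body, and the header records the total frame length, a stream can be split into frames by hopping from header to header. The sketch below assumes the bit layout given in the header table in this article (the 13-bit Frame Length field sits in bytes 3 to 5 of the header). The name `iter_adts_frames` is hypothetical, and the error handling (stop at the first truncated or implausible frame) is a simplification.

```python
def iter_adts_frames(data: bytes):
    """Yield (offset, frame_bytes) for each complete ADTS frame found in `data`."""
    pos = 0
    while pos + 7 <= len(data):
        # Require the 12-bit sync word and layer == 0 at the current position.
        if data[pos] != 0xFF or (data[pos + 1] & 0xF6) != 0xF0:
            pos += 1  # not a header: slide forward one byte and try again
            continue
        # Frame Length (13 bits): low 2 bits of byte 3, all of byte 4,
        # and the top 3 bits of byte 5. It counts header plus body.
        frame_length = (
            ((data[pos + 3] & 0x03) << 11)
            | (data[pos + 4] << 3)
            | (data[pos + 5] >> 5)
        )
        if frame_length < 7 or pos + frame_length > len(data):
            break  # implausible or truncated frame: stop here
        yield pos, data[pos:pos + frame_length]
        pos += frame_length
```

Each yielded frame still contains its header; the body starts 7 bytes in when there is no CRC, or 9 bytes in when there is.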

ADTS frame header structure:

No.	Field	Length (bits)	Description
1	Syncword	12	all bits must be 1
2	MPEG version	1	0 for MPEG-4, 1 for MPEG-2
3	Layer	2	always 0
4	Protection Absent	1	set to 1 if there is no CRC and 0 if there is CRC
5	Profile	2	the MPEG-4 Audio Object Type minus 1
6	MPEG-4 Sampling Frequency Index	4	MPEG-4 Sampling Frequency Index (15 is forbidden)
7	Private Stream	1	set to 0 when encoding, ignore when decoding
8	MPEG-4 Channel Configuration	3	MPEG-4 Channel Configuration (in the case of 0, the channel configuration is sent via an inband PCE)
9	Originality	1	set to 0 when encoding, ignore when decoding
10	Home	1	set to 0 when encoding, ignore when decoding
11	Copyrighted Stream	1	set to 0 when encoding, ignore when decoding
12	Copyrighted Start	1	set to 0 when encoding, ignore when decoding
13	Frame Length	13	this value must include 7 or 9 bytes of header length: FrameLength = (ProtectionAbsent == 1 ? 7 : 9) + size(AACFrame)
14	Buffer Fullness	11	buffer fullness
15	Number of AAC Frames	2	number of AAC frames (RDBs) in ADTS frame minus 1; for maximum compatibility always use 1 AAC frame per ADTS frame
16	CRC	16	CRC if protection absent is 0
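
The table above can be turned into a small parser by reading the first 56 bits as one integer and slicing out each field by its bit offset. This is a sketch under the layout in the table; the names `AdtsHeader` and `parse_adts_header` are hypothetical, the CRC (when present) is not verified, and the four "set to 0 / ignore" flags are skipped.

```python
from dataclasses import dataclass


@dataclass
class AdtsHeader:
    mpeg_version: int              # 0 = MPEG-4, 1 = MPEG-2
    protection_absent: int         # 1 = no CRC, 0 = CRC present
    profile: int                   # MPEG-4 Audio Object Type minus 1
    sampling_frequency_index: int
    channel_configuration: int
    frame_length: int              # header + AAC data, in bytes
    buffer_fullness: int
    num_aac_frames: int            # stored value plus 1
    header_size: int               # 7 bytes without CRC, 9 with CRC


def parse_adts_header(buf: bytes) -> AdtsHeader:
    if len(buf) < 7:
        raise ValueError("need at least 7 bytes")
    bits = int.from_bytes(buf[:7], "big")  # first 56 header bits as one integer

    def field(offset: int, width: int) -> int:
        """Extract `width` bits that start `offset` bits into the header."""
        return (bits >> (56 - offset - width)) & ((1 << width) - 1)

    if field(0, 12) != 0xFFF:
        raise ValueError("bad syncword")
    if field(13, 2) != 0:
        raise ValueError("layer must be 0")
    protection_absent = field(15, 1)
    return AdtsHeader(
        mpeg_version=field(12, 1),
        protection_absent=protection_absent,
        profile=field(16, 2),
        sampling_frequency_index=field(18, 4),
        channel_configuration=field(23, 3),
        frame_length=field(30, 13),
        buffer_fullness=field(43, 11),
        num_aac_frames=field(54, 2) + 1,
        header_size=7 if protection_absent else 9,
    )
```

For example, the bytes `FF F1 50 80 01 5F FC` describe an MPEG-4 frame without CRC, profile 1 (AAC LC), sampling frequency index 4, channel configuration 2, and a total frame length of 10 bytes.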

MPEG-4 Audio

Company: ISO
Samples: http://samples.mplayerhq.hu/MPEG-4/
Samples: http://samples.mplayerhq.hu/A-codecs/AAC/
Samples: sample repo at standards.iso.org

Specification links:

MPEG-4 Audio: ISO/IEC 14496-3:2009
Conformance: ISO/IEC 14496-26:2010

MPEG-4 Audio

MPEG-4 includes a system for handling a diverse group of audio formats in a uniform manner. Each format is assigned a unique Audio Object Type (AOT) to represent it. The global header shared by all AOTs is called the Audio Specific Config.

Subparts

Subpart 0: Overview
Subpart 1: Main (Systems Interaction)
Subpart 2: Speech coding - HVXC
Subpart 3: Speech coding - CELP
Subpart 4: General Audio coding (GA) - AAC, TwinVQ, BSAC
Subpart 5: Structured Audio (SA)
Subpart 6: Text To Speech Interface (TTSI)
Subpart 7: Parametric Audio Coding - HILN
Subpart 8: Parametric coding for high quality audio - SSC (and Parametric Stereo)
Subpart 9: MPEG-1/2 Audio in MPEG-4
Subpart 10: Lossless coding of oversampled audio - DST
Subpart 11: Audio lossless coding - ALS
Subpart 12: Scalable lossless coding - SLS

Audio Specific Config

The Audio Specific Config is the global header for MPEG-4 Audio:

5 bits: object type
if (object type == 31) 6 bits + 32: object type
4 bits: frequency index
if (frequency index == 15) 24 bits: frequency
4 bits: channel configuration
var bits: AOT Specific Config
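
The field list above maps directly onto a small bit reader. The following is a sketch only: `parse_audio_specific_config` is a hypothetical name, the function stops after the channel configuration, and it does not interpret the AOT Specific Config that follows (nor any extension fields that particular object types may add).

```python
def parse_audio_specific_config(buf: bytes) -> dict:
    """Read the common leading fields of an Audio Specific Config."""
    bits = int.from_bytes(buf, "big")
    total = len(buf) * 8
    pos = 0

    def read(width: int) -> int:
        """Consume `width` bits, most significant bit first."""
        nonlocal pos
        if pos + width > total:
            raise ValueError("config is truncated")
        value = (bits >> (total - pos - width)) & ((1 << width) - 1)
        pos += width
        return value

    object_type = read(5)
    if object_type == 31:            # escape value: 6 more bits, offset by 32
        object_type = 32 + read(6)
    frequency_index = read(4)
    # Index 15 means the sample rate itself follows as a 24-bit number.
    explicit_frequency = read(24) if frequency_index == 15 else None
    channel_configuration = read(4)
    return {
        "object_type": object_type,
        "frequency_index": frequency_index,
        "explicit_frequency": explicit_frequency,
        "channel_configuration": channel_configuration,
        "bits_consumed": pos,        # where the AOT Specific Config begins
    }
```

As an example, the two bytes `12 10` decode to object type 2 (AAC LC), frequency index 4, and channel configuration 2.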

Audio Object Types

MPEG-4 Audio Object Types:

0: Null
1: AAC Main
2: AAC LC (Low Complexity)
3: AAC SSR (Scalable Sample Rate)
4: AAC LTP (Long Term Prediction)
5: SBR (Spectral Band Replication)
6: AAC Scalable
7: TwinVQ
8: CELP (Code Excited Linear Prediction)
9: HVXC (Harmonic Vector eXcitation Coding)
10: Reserved
11: Reserved
12: TTSI (Text-To-Speech Interface)
13: Main Synthesis
14: Wavetable Synthesis
15: General MIDI
16: Algorithmic Synthesis and Audio Effects
17: ER (Error Resilient) AAC LC
18: Reserved
19: ER AAC LTP
20: ER AAC Scalable
21: ER TwinVQ
22: ER BSAC (Bit-Sliced Arithmetic Coding)
23: ER AAC LD (Low Delay)
24: ER CELP
25: ER HVXC
26: ER HILN (Harmonic and Individual Lines plus Noise)
27: ER Parametric
28: SSC (SinuSoidal Coding)
29: PS (Parametric Stereo)
30: MPEG Surround
31: (Escape value)
32: Layer-1
33: Layer-2
34: Layer-3
35: DST (Direct Stream Transfer)
36: ALS (Audio Lossless)
37: SLS (Scalable LosslesS)
38: SLS non-core
39: ER AAC ELD (Enhanced Low Delay)
40: SMR (Symbolic Music Representation) Simple
41: SMR Main
42: USAC (Unified Speech and Audio Coding) (no SBR)
43: SAOC (Spatial Audio Object Coding)
44: LD MPEG Surround
45: USAC
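
The ADTS header table earlier in this article stores the Profile as the Audio Object Type minus 1 in a 2-bit field, which means an ADTS header can only signal object types 1 to 4 from the list above. A minimal sketch of that conversion follows; the function and table names are hypothetical.

```python
# Object types that fit into the 2-bit ADTS Profile field (AOT minus 1).
ADTS_SIGNALLABLE_AOTS = {
    1: "AAC Main",
    2: "AAC LC",
    3: "AAC SSR",
    4: "AAC LTP",
}


def aot_to_adts_profile(aot: int) -> int:
    """Convert an Audio Object Type to the value stored in the ADTS Profile field."""
    if aot not in ADTS_SIGNALLABLE_AOTS:
        raise ValueError(f"AOT {aot} does not fit the 2-bit ADTS profile field")
    return aot - 1


def adts_profile_to_aot(profile: int) -> int:
    """Convert an ADTS Profile field value (0..3) back to an Audio Object Type."""
    if not 0 <= profile <= 3:
        raise ValueError("ADTS profile is a 2-bit field (0..3)")
    return profile + 1
```

For instance, AAC LC (object type 2) is written as profile value 1.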

Sampling Frequencies

There are 13 supported frequencies:

0: 96000 Hz
1: 88200 Hz
2: 64000 Hz
3: 48000 Hz
4: 44100 Hz
5: 32000 Hz
6: 24000 Hz
7: 22050 Hz
8: 16000 Hz
9: 12000 Hz
10: 11025 Hz
11: 8000 Hz
12: 7350 Hz
13: Reserved
14: Reserved
15: frequency is written explicitly
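
The index table above is a plain lookup. The sketch below also derives the duration of one frame, assuming 1024 samples per frame; that figure is an assumption that holds for common AAC LC streams, and other object types can use different frame sizes, so it is exposed as a parameter. All names are hypothetical.

```python
SAMPLING_FREQUENCIES = (
    96000, 88200, 64000, 48000, 44100, 32000, 24000,
    22050, 16000, 12000, 11025, 8000, 7350,
)


def sampling_frequency(index: int) -> int:
    """Map a 4-bit sampling frequency index to a rate in Hz."""
    if 0 <= index < len(SAMPLING_FREQUENCIES):
        return SAMPLING_FREQUENCIES[index]
    if index in (13, 14):
        raise ValueError(f"sampling frequency index {index} is reserved")
    if index == 15:
        raise ValueError("index 15 means the frequency is written explicitly")
    raise ValueError("index must be in the range 0..15")


def frame_duration_ms(index: int, samples_per_frame: int = 1024) -> float:
    """Duration of one frame in milliseconds (1024 samples per frame is an assumption)."""
    return 1000.0 * samples_per_frame / sampling_frequency(index)
```

At 48000 Hz (index 3) a 1024-sample frame lasts about 21.3 ms, which gives a rough lower bound on the buffering delay a real-time pipeline adds per frame.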

Channel Configurations

These are the channel configurations:

0: Defined in AOT Specific Config
1: 1 channel: front-center
2: 2 channels: front-left, front-right
3: 3 channels: front-center, front-left, front-right
4: 4 channels: front-center, front-left, front-right, back-center
5: 5 channels: front-center, front-left, front-right, back-left, back-right
6: 6 channels: front-center, front-left, front-right, back-left, back-right, LFE-channel
7: 8 channels: front-center, front-left, front-right, side-left, side-right, back-left, back-right, LFE-channel
8-15: Reserved
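
The channel configuration list can likewise be encoded as a lookup table. This sketch copies the speaker positions from the list above; the names `CHANNEL_LAYOUTS`, `channel_layout`, and `channel_count` are hypothetical. Note that configuration 7 carries 8 channels, so the index is not the channel count.

```python
CHANNEL_LAYOUTS = {
    1: ("front-center",),
    2: ("front-left", "front-right"),
    3: ("front-center", "front-left", "front-right"),
    4: ("front-center", "front-left", "front-right", "back-center"),
    5: ("front-center", "front-left", "front-right", "back-left", "back-right"),
    6: ("front-center", "front-left", "front-right",
        "back-left", "back-right", "LFE-channel"),
    7: ("front-center", "front-left", "front-right", "side-left", "side-right",
        "back-left", "back-right", "LFE-channel"),
}


def channel_layout(config: int):
    """Return the speaker tuple, or None for 0 (layout is signalled elsewhere)."""
    if config == 0:
        return None
    if config not in CHANNEL_LAYOUTS:
        raise ValueError(f"channel configuration {config} is reserved")
    return CHANNEL_LAYOUTS[config]


def channel_count(config: int):
    """Return the number of channels, or None when the configuration is 0."""
    layout = channel_layout(config)
    return None if layout is None else len(layout)
```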
