MPEG-4 Part 3 or MPEG-4 Audio (formally ISO/IEC 14496-3) is the third part of the ISO/IEC MPEG-4 international standard developed by the Moving Picture Experts Group. It specifies audio coding methods. The first version of ISO/IEC 14496-3 was published in 1999.
MPEG-4 Part 3 covers a variety of audio coding technologies: lossy speech coding (HVXC, CELP), general audio coding (AAC, TwinVQ, BSAC), lossless audio compression (MPEG-4 SLS, Audio Lossless Coding, MPEG-4 DST), a Text-To-Speech Interface (TTSI), Structured Audio (using SAOL, SASL, MIDI) and many additional audio synthesis and coding techniques.
MPEG-4 Audio does not target a single application such as real-time telephony or high-quality audio compression. It applies to any application that requires advanced sound compression, synthesis, manipulation, or playback.
MPEG-4 Audio is a new type of audio standard that integrates numerous different types of audio coding: natural sound and synthetic sound, low bitrate delivery and high-quality delivery, speech and music, complex soundtracks and simple ones, traditional content and interactive content.
{| class="wikitable"
|-
! Edition
! Release date
! Latest amendment
! Standard
! Description
|-
| First edition
| 1999
| 2001
| ISO/IEC 14496-3:1999
| An amendment to the first edition is also known as "MPEG-4 Audio Version 2"
|-
| Third edition
| 2005
| 2008
| ISO/IEC 14496-3:2005
|
|-
| Fourth edition
| 2009
| 2015 and under development
| ISO/IEC 14496-3:2009
|
|-
| Fifth edition
| 2019
|
| ISO/IEC 14496-3:2019
| Current version
|}
Subparts
MPEG-4 Part 3 is divided into subparts; for example, Subpart 4 covers General Audio Coding.
MPEG-4 Audio Object Types
The Audio Object Type is used to distinguish between different coding methods. It directly determines the MPEG-4 tool subset required to decode a specific object. The MPEG-4 profiles are based on the object types, and each profile supports a different list of object types.
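The relationship between profiles and object types can be sketched as a simple lookup. The example below is a minimal illustration, not the normative profile definitions: it covers only the four profiles named in this article's object type descriptions, the dictionary `PROFILE_OBJECT_TYPES` and the function `profiles_supporting` are hypothetical names, and the SBR object type is assumed to have ID 5.

```python
# Illustrative subset only: the profile contents follow the descriptions in
# this article's object type table, not the normative profile tables.
# Assumption: SBR is object type ID 5.
PROFILE_OBJECT_TYPES = {
    "AAC Profile": {2},                           # AAC LC
    "HE-AAC Profile": {2, 5},                     # AAC LC + SBR
    "HE-AAC v2 Profile": {2, 5, 29},              # AAC LC + SBR + PS
    "Low Delay Profile": {8, 9, 12, 23, 24, 25},  # CELP, HVXC, TTSI, ER AAC LD, ER CELP, ER HVXC
}


def profiles_supporting(object_type_ids):
    """Return, sorted by name, the profiles that support every requested object type ID."""
    needed = set(object_type_ids)
    return sorted(
        name
        for name, supported in PROFILE_OBJECT_TYPES.items()
        if needed <= supported  # subset test: all requested IDs are supported
    )
```

For instance, a stream that needs both AAC LC (2) and SBR (5) would be decodable by the two HE-AAC profiles in this sketch, but not by the plain AAC Profile.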
{| class="wikitable sortable"
|-
! Object Type ID
! Audio Object Type
! First public release date
! Description
|-
| 1
| AAC Main
| 1999
| contains AAC LC
|-
| 2
| AAC LC (Low Complexity)
| 1999
| Used in the "AAC Profile". MPEG-4 AAC LC Audio Object Type is based on the MPEG-2 Part 7 Low Complexity profile (LC) combined with Perceptual Noise Substitution (PNS) (defined in MPEG-4 Part 3 Subpart 4).
|-
| 3
| AAC SSR (Scalable Sample Rate)
| 1999
| MPEG-4 AAC SSR Audio Object Type is based on the MPEG-2 Part 7 Scalable Sampling Rate profile (SSR) combined with Perceptual Noise Substitution (PNS) (defined in MPEG-4 Part 3 Subpart 4).
|-
| 5
| SBR (Spectral Band Replication)
| 2003
| used with AAC LC in the "High Efficiency AAC Profile" (HE-AAC v1)
|-
| 6
| AAC Scalable
| 1999
|
|-
| 7
| TwinVQ
| 1999
| audio coding at very low bitrates
|-
| 8
| CELP (Code Excited Linear Prediction)
| 1999
| speech coding
|-
| 9
| HVXC (Harmonic Vector eXcitation Coding)
| 1999
| speech coding
|-
| 10
| (Reserved)
|
|
|-
| 11
| (Reserved)
|
|
|-
| 12
| TTSI (Text-To-Speech Interface)
| 1999
|
|-
| 13
| Main synthesis
| 1999
| contains 'wavetable' sample-based synthesis, which in turn contains General MIDI
|-
| 15
| General MIDI
| 1999
|
|-
| 16
| Algorithmic Synthesis and Audio Effects
| 1999
|
|-
| 17
| ER AAC LC
| 2000
| Error Resilient
|-
| 18
| (Reserved)
|
|
|-
| 19
| ER AAC LTP
| 2000
| Error Resilient
|-
| 20
| ER AAC Scalable
| 2000
| Error Resilient
|-
| 21
| ER TwinVQ
| 2000
| Error Resilient
|-
| 22
| ER BSAC (Bit-Sliced Arithmetic Coding)
| 2000
| Error Resilient. Also known as "Fine Granule Audio" or the fine grain scalability tool. It is used in combination with the AAC coding tools and replaces the noiseless coding and the bitstream formatting of the MPEG-4 Version 1 GA coder.
|-
| 23
| ER AAC LD (Low Delay)
| 2000
| Error Resilient, used with CELP, ER CELP, HVXC, ER HVXC and TTSI in the "Low Delay Profile" (commonly used for real-time conversation applications)
|-
| 24
| ER CELP
| 2000
| Error Resilient
|-
| 25
| ER HVXC
| 2000
| Error Resilient
|-
| 26
| ER HILN (Harmonic and Individual Lines plus Noise)
| 2000
| Error Resilient
|-
| 27
| ER Parametric
| 2000
| Error Resilient
|-
| 28
| SSC (SinuSoidal Coding)
| 2004
|
|-
| 29
| PS (Parametric Stereo)
| 2004 and 2006
| used with AAC LC and SBR in the "HE-AAC v2 Profile". PS coding tool was defined in 2004 and Object Type defined in 2006.
|-
| 30
| MPEG Surround
| 2007
| also known as MPEG Spatial Audio Coding (SAC), it is a type of spatial audio coding (MPEG Surround was also defined in ISO/IEC 23003-1 in 2007)
|-
| 31
| (ESCAPE)
|
|
|-
| 32
| MPEG-1/2 Layer-1
| 2005
|
|-
| 33
| MPEG-1/2 Layer-2
| 2005
|
|-
| 35
| DST (Direct Stream Transfer)
| 2005
| lossless audio coding, used on Super Audio CD
|-
| 36
| ALS (Audio Lossless Coding)
| 2006
|
|-
| 37
| SLS (Scalable Lossless)
| 2006
| two-layer audio coding with lossless layer and lossy General Audio core/layer (e.g. AAC)
|-
| 38
| SLS non-core
| 2006
| lossless audio coding without lossy General Audio core/layer (e.g. AAC)
|-
| 39
| ER AAC ELD (Enhanced Low Delay)
| 2008
| Error Resilient
|-
| 40
| SMR (Symbolic Music Representation) Simple
| 2008
| note: Symbolic Music Representation is also the MPEG-4 Part 23 standard (ISO/IEC 14496-23:2008)
|-
| 41
| SMR Main
| 2008
|
|-
| 42
| USAC (Unified Speech and Audio Coding)
| 2012
| Unified Speech and Audio Coding is defined in MPEG-D Part 3 (ISO/IEC 23003-3:2012)
|-
| 43
| SAOC (Spatial Audio Object Coding)
| 2010
| note: Spatial Audio Object Coding is also the MPEG-D Part 2 standard (ISO/IEC 23003-2:2010)
|-
| 44
| LD MPEG Surround
| 2010
| This object type conveys Low Delay MPEG Surround Coding side information (that was defined in MPEG-D Part 2 – ISO/IEC 23003-2)
|}
Among the audio profiles, the ALS Simple Profile (2010) uses the ALS object type.
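Object type IDs above 31 rely on the ESCAPE value listed at ID 31. The following is a minimal sketch of how a decoder might read the ID, assuming the layout commonly described for the start of the AudioSpecificConfig structure: a 5-bit object type field, followed by a 6-bit extension (added to 32) when the field equals 31. The function name `read_audio_object_type` is hypothetical, and the bit layout should be checked against ISO/IEC 14496-3 before relying on it.

```python
def read_audio_object_type(data: bytes) -> int:
    """Read the Audio Object Type ID from the first bits of a byte string.

    Assumed layout (MSB first): 5 bits of object type; if that value is 31
    (ESCAPE), a further 6 bits follow and the ID is 32 plus those 6 bits.
    """
    if len(data) < 2:
        raise ValueError("need at least 2 bytes")
    bits = int.from_bytes(data[:2], "big")  # first 16 bits, MSB first
    object_type = bits >> 11                # top 5 bits
    if object_type == 31:                   # ESCAPE: read the 6-bit extension
        extension = (bits >> 5) & 0x3F
        object_type = 32 + extension
    return object_type
```

With this layout, the bytes `0x12 0x10` yield object type 2 (AAC LC), while `0xF8 0x80` start with the escape value and yield 32 + 4 = 36 (ALS).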
Audio storage and transport
{| class="wikitable sortable" width="100%"
|+Multiplex, storage and transmission formats for MPEG-4 Audio
|-
| Multiplex
| ISO/IEC 14496-3
| Low Overhead Audio Transport Multiplex (LATM)
|-
| Storage
| ISO/IEC 14496-3 (informative)
| Audio Data Interchange Format (ADIF) – only for AAC
|-
| Storage
| ISO/IEC 14496-12
| MPEG-4 file format (MP4) / ISO base media file format
|-
| Transmission
| ISO/IEC 14496-3 (informative)
| Audio Data Transport Stream (ADTS) – only for AAC
|-
| Transmission
| ISO/IEC 14496-3
| Low Overhead Audio Stream (LOAS), based on LATM
|}
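As an illustration of one of the formats listed above, the sketch below reads a few fields from an ADTS frame header. It assumes the commonly documented 7-byte header layout (12-bit sync word, 2-bit profile equal to the object type ID minus one, 4-bit sampling frequency index, 3-bit channel configuration, 13-bit frame length); the function name `parse_adts_header` and the returned dictionary keys are hypothetical, and the field positions should be verified against the standard.

```python
# Sampling frequencies indexed by the 4-bit sampling frequency index
# (assumed table; remaining index values are reserved or escape values).
SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000,
                24000, 22050, 16000, 12000, 11025, 8000, 7350]


def parse_adts_header(header: bytes) -> dict:
    """Extract selected fields from the first 7 bytes of an ADTS frame."""
    if len(header) < 7:
        raise ValueError("an ADTS header is at least 7 bytes long")
    # The sync word is twelve 1-bits: all of byte 0 and the top nibble of byte 1.
    if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
        raise ValueError("ADTS sync word not found")
    sf_index = (header[2] >> 2) & 0x0F
    return {
        # The 2-bit profile field stores the object type ID minus one.
        "object_type": ((header[2] >> 6) & 0x03) + 1,
        "sample_rate": SAMPLE_RATES[sf_index] if sf_index < len(SAMPLE_RATES) else None,
        # 3 bits: the lowest bit of byte 2 and the top two bits of byte 3.
        "channel_configuration": ((header[2] & 0x01) << 2) | (header[3] >> 6),
        # 13 bits spread over bytes 3 to 5; counts the header plus payload.
        "frame_length": ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5),
    }
```

Because the profile field is only 2 bits wide, a header of this form can signal only object types 1 to 4, which is consistent with ADTS being listed as "only for AAC".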
There is no standard for transport of elementary streams over a channel, because the broad range of MPEG-4 applications has delivery requirements that are too wide to easily characterize with a single solution.
The capabilities of a transport layer and the communication between transport, multiplex, and demultiplex functions are described in the Delivery Multimedia Integration Framework (DMIF) in ISO/IEC 14496-6.
MPEG-2 Part 7 defines three profiles: the Low Complexity profile (LC), the Main profile and the Scalable Sampling Rate profile (SSR). MPEG-4 Part 3 Subpart 4 (General Audio Coding) combined these profiles with Perceptual Noise Substitution (PNS) and defined them as Audio Object Types (AAC LC, AAC Main, AAC SSR).
See also
- TwinVQ – one of the object types defined in MPEG-4 Audio version 1
- MPEG-4 Part 2
- MPEG-4 Part 14 container format (MP4)
- Digital rights management
- Advanced Audio Coding (AAC)
- ISO/IEC JTC 1/SC 29
External links
- Apple: MPEG-4: AAC
- "AAC" (VideoLAN WIKI)
- EBU subjective listening tests on low-bitrate audio codecs
- AAC radio stations – Online radio stations in AAC format
- Tuner2 – Directory of radio stations in AAC+ format at various bitrates
- RadioFeeds UK & Ireland – Page containing plenty of terrestrial stations webcasting in AAC+ format.
- Results of 64 kbit/s Listening Test – a page comparing codecs, including HE-AAC at 64 kbit/s, by listening tests (page is offline)
- Official MPEG web site
- RTP Payload Format for MPEG-4 Audio/Visual Streams
- RTP Payload Format for Transport of MPEG-4 Elementary Streams
- The Codecs Parameter for "Bucket" Media Types
- MIME Type Registration for MPEG-4
