Sunday, September 13, 2009

Vertical Bit Map Coding

[2009-09-20]

Bad news. I have found a bug in my Matlab code. The bitrate counter was implemented incorrectly, so the bitrate has been overcounted. The 32 kbps bitstream is actually the 64 kbps bitstream. Therefore, the coding efficiency has been underestimated even further.

=====================================================

A. Introduction
To keep a record of my research on audio bit-plane coding, I present here a series of decoded results produced by a fine-grain-scalable, lossy-to-lossless audio coder. The key idea is simple but effective. During my doctoral study, I learned that the significant map (the most significant bits of all coefficients to be encoded) plays a critical role in both coding efficiency and audio quality. If the significant map can be encoded with as few bits as possible, the decoded quality will be as good as possible even at a very low bitrate. Unlike Harmonic Quad Tree coding (HQT), the scalable audio coder in my doctoral thesis, this coder exploits the relationship between the most significant bits of two adjacent frequency coefficients rather than the harmonic relationships among frequency coefficients. In addition, the bit-plane coding technique Set Partitioning In Hierarchical Trees (SPIHT) is not used in the new coder. For now, I would like to name this coder Vertical Bit Map coding (VBM).
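To make the idea of a significant map concrete, here is a minimal Python sketch. It is not the VBM algorithm, which this post does not describe. It only computes the most-significant-bit position of each coefficient and shows that neighbouring positions tend to differ by small amounts. The function names and the sample frame are hypothetical, chosen purely for illustration.

```python
def msb_positions(coeffs):
    """Return the position of the most significant bit of each |coefficient|.

    Zero has no significant bit and is reported as 0; a magnitude of 1
    gives 1, magnitudes 2..3 give 2, magnitudes 4..7 give 3, and so on.
    """
    return [abs(int(c)).bit_length() for c in coeffs]


def adjacent_differences(positions):
    """Keep the first position, then the change between each pair of neighbours."""
    if not positions:
        return []
    return [positions[0]] + [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical quantized spectrum of one frame (illustrative values only).
frame = [37, -41, 29, 12, -9, 6, 3, -2, 1, 0]
msb = msb_positions(frame)          # the "significant map" of this frame
diff = adjacent_differences(msb)    # small values: neighbours are correlated
```

When the differences cluster around zero, as they do for a smoothly decaying spectrum, they can in principle be described with fewer bits than the raw positions, which is the kind of redundancy between adjacent coefficients mentioned above.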

B. Experiment
The chosen test audio piece is "castant", a well-known test item for audio compression. Because of its rich and fast claps, the audio artifact known as "pre-echo" is easily perceived after lossy compression.

The following two examples show an interesting function that I call variable bitrate playback. To help you understand this function quickly, I decoded the bitstreams frame by frame according to a series of specified, varying bitrates and plotted their spectra. Each figure shows the original spectrum at the top, the decoded spectrum in the middle, and the specified decoding bitrate of each frame at the bottom.

The first experiment varies the decoding bitrate from 16 kbps to 64 kbps in steps of 2 kbps.


The second experiment varies the decoding bitrate from 32 kbps to 64 kbps in steps of 1 kbps.

The audio quality in both cases is poor, because these two experiments are extreme cases meant only for demonstration. In a typical scenario, the decoding bitrate is expected to vary smoothly. To keep the audio quality acceptable, the decoding bitrate should not change so frequently or over so wide a range in a short time.
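The per-frame bitrate control used in these experiments can be sketched in Python as follows. This is only an illustration under stated assumptions: the frame size of 1024 samples and the up-and-down ramp shape are my guesses, not values taken from the coder, and all function names are hypothetical. The sample rate matches the 44.1 kHz test files.

```python
SAMPLE_RATE = 44100  # Hz, as in the test files
FRAME_SIZE = 1024    # samples per frame: an assumed value, not from the coder


def bitrate_schedule(start_kbps, stop_kbps, step_kbps, num_frames):
    """Ramp from start to stop and back down again, repeated over num_frames."""
    up = list(range(start_kbps, stop_kbps + 1, step_kbps))
    cycle = up + up[-2:0:-1]  # up, then down, without repeating the end points
    return [cycle[i % len(cycle)] for i in range(num_frames)]


def frame_bit_budget(kbps):
    """Number of bits the decoder may read for one frame at the given bitrate."""
    return kbps * 1000 * FRAME_SIZE // SAMPLE_RATE


def truncate_frame(frame_bits, kbps):
    """Keep only the leading bits of an embedded frame that fit the budget."""
    return frame_bits[:frame_bit_budget(kbps)]


# First experiment: 16 kbps to 64 kbps in steps of 2 kbps.
schedule = bitrate_schedule(16, 64, 2, 50)
budgets = [frame_bit_budget(kbps) for kbps in schedule]
```

Because the bitstream is embedded, the decoder needs nothing more than this per-frame budget: it simply stops reading each frame once the budget is used up.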

There are also some comparisons with other scalable audio coders. They can be downloaded from Result Comparison; all files are 44.1 kHz, 16-bit, mono.

The files are briefly described here:
cast_ori.wav – castant original wave file
cast_rec_(16_64_2kbps).wav – the first experiment
cast_rec_(32_64_1kbps).wav – the second experiment
cast_rec_(16kbps).wav – 16kbps fixed rate VBM version
cast_rec_(32kbps).wav – 32kbps fixed rate VBM version
cast_rec_(64kbps).wav – 64kbps fixed rate VBM version
cast32_bsac.wav – 32kbps fixed rate MPEG-4 BSAC version
cast32_sac.wav – 32kbps fixed rate Scala SAC version
cast32_hqt.wav – 32kbps fixed rate HQT version
cast32_eac.wav – 32kbps fixed rate Microsoft EAC version

C. Summary
As a conclusion and a reminder to myself, this VBM coder is an experiment to test my speculation about the importance of the significant map in audio coding. The VBM audio coder can be characterized as:
1. Fine scalability: the scalable grain goes down to 1 bit,
2. Lossy-to-lossless: the embedded bitstream contains all possible playback bitrates, up to the lossless archive,
3. Low complexity: very few prediction calculations and very little history storage are needed to analyze and reconstruct the significant map. Therefore, the complexity can be claimed to be equally low in both encoding and decoding,

and has the following features:
1. Encode once, decode many times: the basic goal VBM tries to achieve,
2. Variable bitrate playback: by specifying the decoding bitrate of every frame, the decoder can achieve variable bitrate playback at frame granularity,
3. Real-time encoding and decoding.
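The lossy-to-lossless and "encode once, decode many times" properties can be illustrated with a plain bit-plane coder in Python. This sketch is deliberately simplistic and is not VBM: it sends the sign bits first and then the raw magnitude bits plane by plane, with no significant-map modeling or entropy coding. All names and sample values are hypothetical.

```python
def encode_bitplanes(coeffs, num_planes):
    """Emit sign bits, then magnitude bits from the top plane down to plane 0."""
    bits = [1 if c < 0 else 0 for c in coeffs]
    for plane in range(num_planes - 1, -1, -1):
        bits.extend((abs(c) >> plane) & 1 for c in coeffs)
    return bits


def decode_bitplanes(bits, num_coeffs, num_planes):
    """Rebuild coefficients from any prefix of the stream; missing bits read as 0."""
    padded = bits + [0] * (num_coeffs * (num_planes + 1) - len(bits))
    signs = padded[:num_coeffs]
    mags = [0] * num_coeffs
    pos = num_coeffs
    for plane in range(num_planes - 1, -1, -1):
        for i in range(num_coeffs):
            mags[i] |= padded[pos] << plane
            pos += 1
    return [-m if s else m for m, s in zip(mags, signs)]


# Hypothetical quantized coefficients; 6 planes cover magnitudes up to 63.
coeffs = [37, -41, 29, 12, -9, 6, 3, -2, 1, 0]
stream = encode_bitplanes(coeffs, 6)

lossless = decode_bitplanes(stream, len(coeffs), 6)    # full stream
coarse = decode_bitplanes(stream[:30], len(coeffs), 6)  # signs + top 2 planes
```

The stream is encoded once. Decoding all of it returns the original coefficients exactly, while decoding any shorter prefix returns a coarser approximation, and the cut can be made at any single bit.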

Note that the VBM coder has NO:
1. Psychoacoustic model: there is no perceptual control in the current implementation,
2. Window switching: there is no attack detection to control long/short window switching. Yes, there is only one frame size.

3 comments:

好想永遠這樣 said...

Sorry, I found that the file link is broken. Could someone please help me fix it?
I think my FTP account has a problem: it can only see two directories, Backup and root...

D. N. A. said...

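That should be fine. It seems everyone only sees Backup & root.

As I recall, you just put the file under root, and the URL is http://ludwig.csie.ncku.edu.tw/members/%username%/%filename%

%username% is the FTP account name
%filename% is simply the file name~~

This looks really interesting. I can't wait to see the results XD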

SCREAMLab said...

I would love to discuss VBM in more detail, as I very much want to talk with you.

I would like to invite DNA to join, if you agree.