3D Motion estimation for 3D video coding

Research output: Book chapter/Published conference paper > Conference paper


Abstract

H.264/MVC multi-view video coding achieves better compression than simulcast coding by using a hierarchical B-picture prediction structure that exploits inter- and intra-view redundancy. However, this technique imposes a random access frame delay and requires a large amount of computational time. This paper proposes a novel technique based on 3D motion estimation (3D-ME) to overcome these problems. In the 3D-ME technique, a 3D frame is formed from the frames of all views at the same time instant, and ME is carried out for the current 3D frame using the immediately preceding 3D frame as the reference frame. Because the correlation among intra-view images is higher than the correlation among inter-view images, the proposed 3D-ME technique reduces the overall computational time and eliminates the frame delay while achieving rate-distortion (RD) performance comparable to H.264/MVC. The paper also proposes a second technique in which an extra reference 3D frame, comprising the dynamic background frame (the most common frame of a scene, i.e., McFIS) of each view, is used for 3D-ME. Experimental results show that the proposed 3D-ME-McFIS technique outperforms H.264/MVC by improving RD performance, reducing computational time, and eliminating the random access frame delay.
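
The 3D-ME idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The vertical stacking of views, the SAD cost, the full-search block matching, the tiny block and search sizes, and every function name are assumptions made for illustration. The `background_frame` helper uses a per-pixel temporal median as a simple stand-in for McFIS. The abstract does not describe how McFIS is generated, so this is a swapped-in technique.

```python
from statistics import median


def form_3d_frame(views):
    """Stack the same-time frames of all views vertically into one array.

    Assumption: vertical stacking is just one possible layout; the abstract
    does not say how the views are arranged inside a 3D frame.
    """
    return [row[:] for frame in views for row in frame]


def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between a current and a reference block."""
    return sum(
        abs(cur[cy + dy][cx + dx] - ref[ry + dy][rx + dx])
        for dy in range(size)
        for dx in range(size)
    )


def estimate_motion(cur, refs, size=2, search=1):
    """Full-search block matching of `cur` against every reference frame.

    Returns {(block_y, block_x): (ref_index, mv_y, mv_x, cost)}.
    Note: in this sketch the search window may cross the boundary between
    stacked views; a real encoder would need a policy for that.
    """
    height, width = len(cur), len(cur[0])
    result = {}
    for by in range(0, height - size + 1, size):
        for bx in range(0, width - size + 1, size):
            best = None
            for ri, ref in enumerate(refs):
                for my in range(-search, search + 1):
                    for mx in range(-search, search + 1):
                        ry, rx = by + my, bx + mx
                        inside = (0 <= ry <= height - size
                                  and 0 <= rx <= width - size)
                        if not inside:
                            continue
                        cost = sad(cur, ref, by, bx, ry, rx, size)
                        if best is None or cost < best[3]:
                            best = (ri, my, mx, cost)
            result[(by, bx)] = best
    return result


def background_frame(frames):
    """Per-pixel temporal median: a stand-in for McFIS, not the paper's method."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [median(f[y][x] for f in frames) for x in range(cols)]
        for y in range(rows)
    ]


# Toy data: view 0 contains a small object that moves one pixel to the right
# between time t-1 and time t; view 1 is static.
prev_view0 = [[0, 5, 7, 0], [0, 6, 8, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
cur_view0 = [[0, 0, 5, 7], [0, 0, 6, 8], [0, 0, 0, 0], [0, 0, 0, 0]]
static_view1 = [[1, 1, 1, 1] for _ in range(4)]

prev_3d = form_3d_frame([prev_view0, static_view1])
cur_3d = form_3d_frame([cur_view0, static_view1])

# Second, optional reference built from previously seen 3D frames.
extra_ref = background_frame([prev_3d, cur_3d, cur_3d])

motion = estimate_motion(cur_3d, [prev_3d, extra_ref])
```

In this toy run the block at `(0, 2)` is matched in the previous 3D frame with motion vector `(0, -1)` at zero cost, and the blocks in the static view get a zero motion vector. A single search over the stacked frame covers all views at once, and the extra reference is simply another entry in `refs`.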
Original language: English
Title of host publication: 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings
Editors: Hideaki Sakai
Place of Publication: USA
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 1189-1192
Number of pages: 4
DOIs: https://doi.org/10.1109/ICASSP.2012.6288100
Publication status: Published - 2012
Event: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) - Kyoto International Conference Center (KICC), Kyoto, Japan
Duration: 25 Mar 2012 to 30 Mar 2012

Conference

Conference: 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Country: Japan
City: Kyoto
Period: 25/03/12 to 30/03/12
Other: The latest research results on both the theory and applications of signal processing will be presented and discussed among participants from all over the world. Video and speech signal processing used in the human interface between robots and personal users will be highlighted.

Fingerprint

Motion estimation
Image coding
Redundancy

Cite this

Paul, M., Gao, J., & Antolovich, M. (2012). 3D Motion estimation for 3D video coding. In H. Sakai (Ed.), 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings (pp. 1189-1192). USA: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICASSP.2012.6288100
@inproceedings{036593740db74568861fbd6eab52a143,
title = "3D Motion estimation for 3D video coding",
abstract = "H.264/MVC multi-view video coding provides a better compression rate compared to the simulcast coding using hierarchical B-picture prediction structure exploiting inter- and intra-view redundancy. However, this technique imposes random access frame delay as well as requiring huge computational time. In this paper a novel technique is proposed using 3D motion estimation (3D-ME) to overcome the problems. In the 3D-ME technique, a 3D frame is formed using the same temporal frames of all views and ME is carried out for the current 3D frame using the immediate previous 3D frame as a reference frame. As the correlation among the intra-view images is higher compared to the correlation among the inter-view images, the proposed 3D-ME technique reduces the overall computational time and eliminates the frame delay with comparable rate-distortion (RD) performance compared to H.264/MVC. Another technique is also proposed in the paper where an extra reference 3D frame comprising dynamic background frames (the most common frame of a scene i.e., McFIS) of each view is used for 3D-ME. Experimental results reveal that the proposed 3D-ME-McFIS technique outperforms the H.264/MVC in terms of improved RD performance by reducing computational time and by eliminating the random access frame delay.",
keywords = "3D Motion Estimation, 3D Video Coding, Hierarchical B-picture, McFIS, Reference frames, Uncovered background",
author = "Manoranjan Paul and Junbin Gao and Michael Antolovich",
note = "Imported on 03 May 2017 - DigiTool details were: publisher = USA: IEEE, 2012. editor/s (773b) = Hideaki Sakai; Event dates (773o) = March 25 - 30, 2012; Parent title (773t) = IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).",
year = "2012",
doi = "10.1109/ICASSP.2012.6288100",
language = "English",
pages = "1189--1192",
editor = "Hideaki Sakai",
booktitle = "2012 IEEE International Conference on Acoustics, Speech, and Signal Processing: Proceedings",
publisher = "IEEE, Institute of Electrical and Electronics Engineers",
address = "United States",

}

