Abstract
To exploit the high temporal correlation between video frames of the same scene, the current frame is predicted from already-encoded reference frames using block-based motion estimation and compensation techniques. While this approach can efficiently exploit the translational motion of moving objects, it performs poorly under other types of motion, such as affine transformations, and under object occlusion/deocclusion. Recently, deep learning has been used to model the high-level structure of human pose in specific actions from short videos and then to generate virtual future frames by predicting the pose with a generative adversarial network (GAN). Modelling the high-level structure of human pose can therefore exploit semantic correlation by predicting human actions and their trajectories. Video surveillance applications will benefit, as stored 'big' surveillance data can be compressed by estimating human pose trajectories and generating future frames through this semantic correlation. This paper explores a new way of video coding: modelling human pose from the already-encoded frames and using the frame generated for the current time instant as an additional forward-referencing frame. The proposed approach is expected to overcome the limitations of traditional backward-referencing frames by predicting blocks containing moving objects with lower residuals. Our experimental results show that the proposed approach achieves, on average, up to 2.83 dB PSNR gain and 25.93% bitrate savings for high-motion video sequences compared to standard video coding.
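The block-based prediction and the per-block choice between a backward reference and a generated forward reference described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's actual codec: the function names, the SAD matching cost, and the tiny exhaustive search range are all assumptions.

```python
import numpy as np

def block_motion_search(ref, cur_block, top, left, search=4):
    """Exhaustive translational search: find the displacement in `ref` that
    minimises the sum of absolute differences (SAD) with `cur_block`."""
    bh, bw = cur_block.shape
    best_sad, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + bh > ref.shape[0] or x + bw > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            sad = np.abs(ref[y:y + bh, x:x + bw].astype(int)
                         - cur_block.astype(int)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad

def choose_reference(backward_ref, forward_ref, cur_block, top, left):
    """Pick whichever reference (a backward already-encoded frame or a
    GAN-generated forward frame) yields the lower residual for this block."""
    mv_b, sad_b = block_motion_search(backward_ref, cur_block, top, left)
    mv_f, sad_f = block_motion_search(forward_ref, cur_block, top, left)
    if sad_f < sad_b:
        return "forward", mv_f, sad_f
    return "backward", mv_b, sad_b
```

When a generated forward frame anticipates an object's position (e.g. after deocclusion) better than the previous frame does, its residual is lower and the block is predicted from the forward reference instead.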
Original language | English |
---|---|
Title of host publication | 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) |
Place of Publication | United States |
Publisher | IEEE |
Number of pages | 5 |
ISBN (Electronic) | 9781665475921 |
ISBN (Print) | 9781665475938 |
DOIs | |
Publication status | E-pub ahead of print - 16 Jan 2023 |
Event | 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) - Virtual, Suzhou, China. Duration: 13 Dec 2022 → 16 Dec 2022. https://ieeexplore.ieee.org/xpl/conhome/10008391/proceeding (Proceedings); https://web.archive.org/web/20221114191144/http://vcip2022.org/ (Conference website) |
Publication series
Name | 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022 |
---|
Conference
Conference | 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) |
---|---|
Country/Territory | China |
City | Suzhou |
Period | 13/12/22 → 16/12/22 |
Other | The IEEE Visual Communications and Image Processing (VCIP) Conference, sponsored by the IEEE Circuits and Systems Society, will be held in Suzhou, China, during December 13–16, 2022. VCIP is the oldest conference in the field and one of the flagship conferences of the IEEE CAS Visual Signal Processing and Communications Technical Committee. Since 1986, VCIP has served as a premier forum for the exchange of fundamental and applied research in the field of visual communications and image processing. VCIP has a long tradition of showcasing pioneering technologies in visual communication and processing, and many landmark papers first appeared at VCIP. VCIP 2022 will carry on this tradition of disseminating the state of the art in visual communication technology, and of brainstorming and envisioning the future of visual communication technology and applications. The main theme will be new media, including VR, point cloud capture and playback, and new visual processing tools, including deep learning for intelligence distilling in visual information pre- and post-processing such as de-blurring, super-resolution, 3D understanding, and content-based image enhancement. High-quality papers will be recommended to TCSVT for journal extension!