Efficient video and point cloud compression through common information exploitation

  • Fariha Afsana

Research output: Thesis › Doctoral Thesis


Abstract

The proliferation of multimedia services has driven an ever-increasing demand for high-quality visual data, encompassing both video and point clouds, and necessitating efficient compression for storage, transmission, and playback. Existing video compression methods, which rely on non-overlapping block partitioning, minimize spatial and temporal redundancies but do not exploit global redundancy effectively. This limitation is also apparent in point cloud compression, highlighting the challenge of optimizing compression by efficiently exploiting information from different perspectives. Since the quality of visual data compression depends on how effectively common and redundant information is exploited, this thesis takes a comprehensive approach to understanding the impact of common information exploitation on the compression of diverse visual data, including UHD/360 video and point clouds, from both local and global perspectives. Our study spans multiple levels (object, frame, and scene) and utilizes several approaches, including frequency-domain manipulation, background analysis, and 3D spatio-temporal analysis, to exploit redundant information efficiently. The first contribution improves the intra-frame compression efficiency of 360 videos by leveraging low-frequency domain information as common information to reduce global redundancy at the frame level. Expanding on this initial contribution, our subsequent investigation refines the common information exploitation approach by emphasizing spatial as well as temporal correlation: we leverage background information across frames to exploit temporal redundancy more effectively, and to further minimize finer-level redundancy at the spatial scale, we employ cuboid-based, object-aligned partitioning. The impact of cuboid coding and background frames is thus analyzed for both inter- and intra-frame compression, specifically in the context of UHD/360 video.
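As a minimal illustrative sketch (not the thesis's actual algorithm), the idea of treating low-frequency content as frame-level common information can be shown with a 2D DCT: keep only the top-left block of coefficients and invert, leaving the residual as the finer detail to be coded separately. The function names and the `keep` parameter here are hypothetical.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    basis[0, :] /= np.sqrt(2)
    return basis * np.sqrt(2.0 / n)

def low_frequency_layer(frame, keep):
    """Retain only the top-left `keep` x `keep` DCT coefficients of a
    grayscale frame and transform back: a crude stand-in for extracting
    low-frequency content as the frame's common information."""
    h, w = frame.shape
    Dh, Dw = dct2_matrix(h), dct2_matrix(w)
    coeffs = Dh @ frame @ Dw.T          # forward 2D DCT
    coeffs[keep:, :] = 0.0              # drop high vertical frequencies
    coeffs[:, keep:] = 0.0              # drop high horizontal frequencies
    return Dh.T @ coeffs @ Dw           # inverse 2D DCT
```

The residual `frame - low_frequency_layer(frame, keep)` then carries only the high-frequency detail, so the smooth, globally shared structure of the frame is represented compactly by a few coefficients.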
Departing from the use of background information across frames, the second work refines our exploration by investigating temporal correlation among adjacent frames. Inspired by the earlier success of low-frequency, background, and cuboid exploitation, this second investigation examines common information exploitation in both the frequency and pixel domains for intra- and inter-frame video compression. The approach applies cuboid partitioning to both low-frequency information and adjacent background details for UHD/360 videos. Building upon this foundation, we extend our exploration to 3D cuboid-based video compression, exploiting both local and global redundancy by treating a group of frames as a 3D cuboid. Scene-level decomposition, accounting for the static and dynamic nature of the scene, exploits local commonality over successive frames and thereby enhances compression efficiency.
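The static/dynamic decomposition above can be sketched with a simple pixel-wise temporal median as the background model; this is a common illustrative choice, and the thesis's actual background generation and thresholding may differ. All names and the `threshold` parameter here are hypothetical.

```python
import numpy as np

def background_frame(frames):
    """Pixel-wise temporal median over a group of frames: a simple
    hypothetical model of the static (common) scene content."""
    return np.median(np.stack(frames), axis=0)

def split_static_dynamic(frames, bg, threshold=1.0):
    """Residuals against the background isolate moving (dynamic) content;
    the shared static content then needs to be coded only once."""
    return [np.where(np.abs(f - bg) > threshold, f - bg, 0.0) for f in frames]
```

With the background coded once per group and only sparse dynamic residuals coded per frame, the temporal redundancy of a mostly static scene collapses into a single shared frame.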
In the final phase of the research, our attention turns to point cloud compression, where points share spatial correlations similar to those among pixels in video frames. However, the unordered and permutation-invariant nature of point clouds challenges traditional common information exploitation approaches and necessitates distinct compression techniques. To this end, we employ a cluster-based approach to optimize point cloud geometry compression: by leveraging global relations among points during clustering and grouping them for local processing, we preserve spatial density through an end-to-end learnable deep architecture. Extensive experiments on benchmark datasets validate our claims against state-of-the-art approaches. The experimental analysis, both quantitative and qualitative, confirms that our methods outperform representative approaches. We therefore believe that our proposed visual data compression techniques represent an advancement in optimizing resource utilization.
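The cluster-then-process idea can be sketched as follows, assuming farthest point sampling to pick globally spread seeds and nearest-seed assignment to form local groups; this is an illustrative stand-in, not the learned clustering of the thesis, and all function names are hypothetical.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Pick k spread-out seed indices: a simple hypothetical way to
    choose group centres that respect the cloud's global geometry."""
    chosen = [start]
    dist = np.full(len(points), np.inf)
    for _ in range(k - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(np.argmax(dist)))
    return np.asarray(chosen)

def cluster_points(points, k):
    """Assign every point to its nearest seed, yielding local patches
    that can then be encoded independently."""
    centres = points[farthest_point_sampling(points, k)]
    d = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return np.argmin(d, axis=1), centres
```

Because the seeds are chosen from the whole cloud before any local processing, the grouping reflects global point relations while each resulting patch remains small enough for dense local encoding.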
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Charles Sturt University
Supervisors/Advisors
  • Paul, Manoranjan, Principal Supervisor
  • Debnath, Tanmoy, Co-Supervisor
Award date: 20 Feb 2025
Place of Publication: Australia
Publisher
Publication status: Published - 2025

