Introduction to AI Codecs

This post is synced from Mix Space to xLog.
For the best reading experience, visit the original link:
https://www.do1e.cn/posts/codec/AICodecIntro

Digital Image Processing

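E-book link: Digital Image Processing, 3rd edition, Chinese translation (數字圖像處理（中）第三版 (1).pdf)
Study Chapters 1, 2, 4, 6.1-6.2, and 8. Chapter 8 can be read alongside the CSDN blog post "JPEG 編碼細節介紹" (a detailed introduction to JPEG encoding). The goal is to understand the overall flow of the encoding pipeline.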
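
To make the "overall flow" concrete, here is a minimal sketch of the transform and quantization stages on a single row of pixels. It is a simplification, not real JPEG: JPEG applies a 2-D DCT to 8x8 blocks with a per-frequency quantization table and then entropy-codes the symbols, whereas this sketch uses a 1-D DCT, one uniform step size, and skips entropy coding. The pixel values and the step size are made up for illustration.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_1d(c):
    """Inverse of dct_1d."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def quantize(coeffs, step):
    """Uniform scalar quantization: the only lossy step in this sketch."""
    return [round(c / step) for c in coeffs]

def dequantize(symbols, step):
    return [q * step for q in symbols]

# Illustrative pixel row (assumed values, not taken from any real image).
row = [52, 55, 61, 66, 70, 61, 64, 73]
shifted = [v - 128 for v in row]           # level shift, as JPEG does
coeffs = dct_1d(shifted)                   # 1) transform
symbols = quantize(coeffs, step=10)        # 2) quantization
# 3) entropy coding of `symbols` would happen here (omitted)
recon = [v + 128 for v in idct_1d(dequantize(symbols, step=10))]
```

A larger step gives smaller symbols (cheaper to entropy-code) but a worse reconstruction, which is the same trade-off the learned codecs below optimize end to end.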

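First, you need a solid understanding of Python. An optional e-book: Python Crash Course, Chinese edition (Python 編程：從入門到實踐.pdf)
Then learn PyTorch. A relevant course on Bilibili: 跟李沐學 AI (the channel's homepage on bilibili.com). Focus on lessons 00~29.2, 31, 33-37, 47, and 47.2.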

Working from the papers and the code (CompressAI), try training a set of models yourself and plot the RD (rate-distortion) curve.
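
Each point on an RD curve is one trained model, with rate on the x-axis (usually bits per pixel) and quality on the y-axis (often PSNR). A minimal sketch of the two quantities, assuming 8-bit images flattened into lists of pixel values; the function names are my own and the sample pixels are toy values, not measurements.

```python
import math

def bits_per_pixel(num_bytes, height, width):
    """Rate: size of the compressed bitstream divided by the pixel count."""
    return num_bytes * 8 / (height * width)

def psnr(original, reconstructed, max_value=255.0):
    """Distortion expressed as PSNR in dB, computed from the MSE."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return float("inf")
    return 10 * math.log10(max_value ** 2 / mse)

def rd_points(results):
    """Sort (bpp, psnr) pairs by rate so they can be drawn as one curve."""
    return sorted(results)

# Toy pixels, only to show the call shape.
toy_original = [0, 255, 128, 64]
toy_reconstructed = [1, 254, 128, 66]
toy_quality = psnr(toy_original, toy_reconstructed)
```

In practice you would average bpp and PSNR over the whole test set for each model, then pass the sorted pairs to a plotting library.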

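Common training and validation sets: ImageNet / COCO
Common test set: the 24 Kodak images. Because the edges of the original images contain artifacts, versions cropped to a square are sometimes used instead.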
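
The post does not say which crop is used, so the following is only one possible convention: a center crop to the shorter side. The helper returns a (left, upper, right, lower) box, which is the tuple layout that Pillow's `Image.crop` accepts; the function name is hypothetical.

```python
def center_square_box(width, height):
    """Return (left, upper, right, lower) for a centered square crop."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return left, upper, left + side, upper + side

# Kodak images are 768x512 (landscape) or 512x768 (portrait).
landscape_box = center_square_box(768, 512)
portrait_box = center_square_box(512, 768)
```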

Ballé, J., et al. (2015). "Density modeling of images using a generalized normalization transformation." arXiv preprint arXiv:1511.06281.

Introduces GDN, an activation layer commonly used in AI codecs. Related code: CompressAI/compressai/layers/gdn.py at master · InterDigitalInc/CompressAI (github.com)
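
As a rough picture of what GDN computes, here is a sketch of the simplified form commonly used in compression models, y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j^2), applied to the channel values at a single pixel. The paper defines a more general version with learnable exponents, and real implementations run this over whole feature maps and keep beta and gamma positive through reparameterization; none of that is shown here.

```python
import math

def gdn(x, beta, gamma, inverse=False):
    """Simplified GDN over the channels at one spatial position.

    x:     list of C channel values
    beta:  list of C offsets
    gamma: C x C list of cross-channel weights
    inverse=True multiplies by the norm instead of dividing (the IGDN
    form used in decoders; it is not an exact inverse of the forward pass).
    """
    out = []
    for i, xi in enumerate(x):
        norm = math.sqrt(
            beta[i] + sum(gamma[i][j] * xj * xj for j, xj in enumerate(x))
        )
        out.append(xi * norm if inverse else xi / norm)
    return out

# With gamma = 0 and beta = 1 the layer is the identity.
identity_out = gdn([3.0, 4.0], [1.0, 1.0], [[0.0, 0.0], [0.0, 0.0]])
```

The key idea is that each channel is divided by a measure of the activity across all channels, which tends to make the outputs look more Gaussian and less correlated.
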
Ballé, J., et al. (2016). "End-to-end optimized image compression." arXiv preprint arXiv:1611.01704.

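Introduces the basic architecture of an AI codec. Read it alongside JPEG encoding and compare the stages the two pipelines share (transform, quantization, and entropy coding) in order to understand the RD loss function. Related code: CompressAI/compressai/models/google.py at a4ae2eeef7bdb1b84ba076ac0d650b523f3fa882 · InterDigitalInc/CompressAI · GitHub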
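
The RD loss has the form L = R + lambda * D: an estimated rate plus a weighted distortion. Below is a minimal sketch, assuming the rate is estimated from the entropy model's per-symbol likelihoods and the distortion is MSE. The function names are my own, and implementations differ in how they scale the distortion term, so treat the numbers as illustrative only.

```python
import math

def rate_bpp(likelihoods, num_pixels):
    """Estimated rate: total -log2(p) over all latent symbols, per pixel."""
    return sum(-math.log2(p) for p in likelihoods) / num_pixels

def mse(original, reconstructed):
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)

def rd_loss(likelihoods, original, reconstructed, num_pixels, lmbda):
    """L = R + lambda * D. A larger lambda favors quality over bitrate."""
    return rate_bpp(likelihoods, num_pixels) + lmbda * mse(original, reconstructed)

# Toy values: two latent symbols describing a 4-pixel image.
toy_loss = rd_loss(
    likelihoods=[0.5, 0.25],
    original=[0.0, 1.0],
    reconstructed=[0.0, 0.5],
    num_pixels=4,
    lmbda=0.01,
)
```

Training one model per lambda value is what produces the separate points of an RD curve.
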
Ballé, J., et al. (2018). "Variational image compression with a scale hyperprior." arXiv preprint arXiv:1802.01436.

Adds a hyperprior on top of the basic architecture. Related code: CompressAI/compressai/models/google.py at a4ae2eeef7bdb1b84ba076ac0d650b523f3fa882 · InterDigitalInc/CompressAI · GitHub
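
One way to see why the hyperprior helps: each quantized latent is modeled as a zero-mean Gaussian whose scale is predicted from side information, and its probability is the Gaussian mass on the unit-width bin around the integer value. The sketch below computes that probability and the resulting bit cost for a given scale. The likelihood floor is an assumption added to avoid log(0), and the names are hypothetical.

```python
import math

def std_normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def gaussian_likelihood(y_hat, scale, mean=0.0):
    """Probability mass of N(mean, scale^2) on [y_hat - 0.5, y_hat + 0.5]."""
    upper = std_normal_cdf((y_hat - mean + 0.5) / scale)
    lower = std_normal_cdf((y_hat - mean - 0.5) / scale)
    return upper - lower

def bits(y_hat, scale, min_likelihood=1e-9):
    """Bit cost of one symbol; the floor (an assumption) avoids log2(0)."""
    return -math.log2(max(gaussian_likelihood(y_hat, scale), min_likelihood))

# A symbol near zero is cheap under a small scale, expensive under a large one.
cheap = bits(0, scale=0.1)
wasteful = bits(0, scale=10.0)
```

When the predicted scale matches the actual magnitude of a latent, the symbol costs few bits; a mismatch in either direction wastes bits, which is what the hyper-encoder and hyper-decoder are trained to avoid.
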
Minnen, D., et al. (2018). "Joint autoregressive and hierarchical priors for learned image compression." Advances in neural information processing systems.

Combines an autoregressive context model with the hyperprior. Related code: CompressAI/compressai/models/google.py at a4ae2eeef7bdb1b84ba076ac0d650b523f3fa882 · InterDigitalInc/CompressAI · GitHub
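
The autoregressive part is typically realized with a masked convolution: when predicting the distribution of one latent, the kernel may only look at positions already decoded in raster order. Here is a sketch of how such a mask can be built, in the spirit of PixelCNN-style masking; the function name and the `include_center` flag are my own.

```python
def causal_mask(kernel_size, include_center=False):
    """Build a k x k mask that keeps only positions before the center
    in raster-scan order (rows above, plus the left part of the center row)."""
    k = kernel_size
    c = k // 2
    mask = [[0] * k for _ in range(k)]
    for r in range(k):
        for col in range(k):
            before_center = r < c or (r == c and col < c)
            is_center = r == c and col == c
            if before_center or (include_center and is_center):
                mask[r][col] = 1
    return mask

mask_3x3 = causal_mask(3)
mask_5x5 = causal_mask(5)
```

Multiplying the convolution weights by this mask guarantees the decoder never needs a value it has not decoded yet. The price is that decoding becomes sequential, one latent position at a time.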

Note: on Linux, CompressAI can be installed directly with pip, but no Windows package is provided. To install it on Windows, follow the steps below: