Knowledge Vault 2/27 - ICLR 2014-2023
Johannes Ballé, Valero Laparra, Eero Simoncelli ICLR 2016 - Density Modeling of Images using a Generalized Normalization Transformation

Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:

```mermaid
graph LR
classDef unsupervised fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef density fill:#d4f9d4, font-weight:bold, font-size:14px;
classDef gaussianization fill:#d4d4f9, font-weight:bold, font-size:14px;
classDef representation fill:#f9f9d4, font-weight:bold, font-size:14px;
classDef perception fill:#f9d4f9, font-weight:bold, font-size:14px;
Main[Johannes Ballé et al. ICLR 2016] --> A[Unsupervised learning: structure in unlabeled data 1]
Main --> B[Density estimation: fitting positive-valued function parameters 2]
B --> C[Normalizing requires intractable integration 3]
Main --> D[Gaussianization: transforming data to standard normal 4]
D --> E[Input density modeled by inverse transformation 5]
D --> F[Parameters fit by maximizing likelihood 6]
Main --> G[Heavy-tailed symmetric marginal response distributions 7]
G --> H[Logistic Gaussianization: poor fit, discontinuities 8]
G --> I[Alternative Gaussianization: affine, exponentiation, division 9]
D --> J[Marginal Gaussianization doesn't guarantee joint Gaussianity 10]
J --> K[Older approaches: repeatedly Gaussianize new directions 11]
Main --> L[Divisive normalization Gaussianizes joint density 12]
L --> M[Cross-filter terms create shared joint nonlinearity 13]
L --> N[Model captures joint density shapes 14]
L --> O[Extension to multiple dimensions: generalized divisive normalization 15]
D --> P[Previous models are special cases 16]
Main --> Q[Log-likelihood determinant breaks into additive terms 17]
Main --> R[Deep network with joint normalization nonlinearities 18]
R --> S[One layer joint normalization outperforms multiple marginal 19]
Main --> T[Gaussianization learns representations relating to biology 20]
T --> U[Pixel space distances don't correlate with perception 21]
T --> V[Gaussianized representation aligns with perceptual expectations 22]
V --> W[Gaussianized distances correlate strongly with human judgments 23]
T --> X[Unsupervised Gaussianized representation outperforms industry standard 24]
T --> Y[Correlation merits further research 25]
Main --> Z[Gaussianization: density modeling and representation learning 26]
Z --> AA[Generalized divisive normalization applies joint nonlinearities 27]
Z --> AB[One layer outperforms multiple marginal layers 28]
Z --> AC[Unsupervised representation accounts for human judgments 29]
Z --> AD[Understanding Gaussianization's perceptual relevance needs more work 30]
class A unsupervised;
class B,C density;
class D,E,F,H,I,J,K,P gaussianization;
class L,M,N,O,Q,R,S,T,U,V,W,X,Y,Z,AA,AB,AC,AD representation;
class U,V,W,X,Y perception;
```


1.-Unsupervised learning aims to find structure in unlabeled data and may help understand how sensory representations are learned in the brain.

2.-Density estimation is a classic unsupervised learning approach, typically fitting parameters of a positive-valued function to data.

3.-Normalizing a parametric density function requires an intractable integral over the data space. An alternative is to find a parametric transformation that Gaussianizes the data.

4.-Gaussianization transforms data into a standard normal density. The input density can be modeled by pushing the Gaussian through the inverse transformation.

5.-Computing the input density from the Gaussianized data only requires taking derivatives, which is more efficient than integration, especially with modern hardware.
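The change-of-variables identity behind points 4 and 5 can be written out explicitly. With an invertible Gaussianizing map z = g(x; θ) and z drawn from a standard normal:

```latex
p_x(x) = \mathcal{N}\!\big(g(x;\theta);\, 0, I\big)
         \left|\det \frac{\partial g(x;\theta)}{\partial x}\right|,
\qquad
\log p_x(x) = -\tfrac{1}{2}\lVert g(x;\theta)\rVert^2
              - \tfrac{d}{2}\log 2\pi
              + \log\left|\det \frac{\partial g(x;\theta)}{\partial x}\right|.
```

Only derivatives of g appear, which is why no integration over the data space is needed.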

6.-Parameters of the Gaussianizing transformation are fit by maximizing the likelihood, taking derivatives, and using stochastic gradient descent.
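The fitting recipe in point 6 can be sketched on a toy 1-D case. Everything here is illustrative: an affine transform z = (x - mu)/s stands in for the full Gaussianizing model, and plain full-batch gradient ascent stands in for stochastic gradient descent.

```python
import numpy as np

# Toy maximum-likelihood fit of an affine Gaussianizing transform
# z = (x - mu) / s. The objective is the mean log-likelihood under the
# change of variables, including the log-Jacobian term log|dz/dx| = -log s.

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=5000)  # toy data

mu, log_s = 0.0, 0.0  # parameterize the scale as exp(log_s) to keep it positive
lr = 0.1
for _ in range(500):
    s = np.exp(log_s)
    z = (x - mu) / s
    # gradients of mean[ log N(z; 0, 1) - log s ] w.r.t. mu and log_s
    grad_mu = np.mean(z) / s
    grad_log_s = np.mean(z**2) - 1.0
    mu += lr * grad_mu
    log_s += lr * grad_log_s

print(round(mu, 1), round(np.exp(log_s), 1))  # converges to the data mean and std
```

At the optimum, mean(z) = 0 and mean(z²) = 1, i.e. the transformed data matches the standard normal's first two moments.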

7.-Images filtered with linear filters yield heavy-tailed symmetric marginal response distributions. Gaussianization aims to expand the center and contract the tails.

8.-Gaussianizing with a logistic function fits the data poorly at the center and, because the logistic saturates, produces discontinuities in the tails.

9.-An alternative Gaussianization using an affine function, exponentiation, and division fits the data better without discontinuities.

10.-Marginal Gaussianization of individual filters does not guarantee joint Gaussianity. Rotated marginals reveal non-Gaussian structure.
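A minimal numeric illustration of point 10 (the construction is a textbook example, not taken from the paper): let x be standard normal and y = s·x with s a random sign. Both marginals are exactly Gaussian, yet the 45-degree rotated direction is strongly heavy-tailed.

```python
import numpy as np

# Marginally Gaussian, jointly non-Gaussian: a rotation exposes the structure.
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)
s = rng.choice([-1.0, 1.0], size=n)
y = s * x                      # y is also N(0, 1) marginally

r = (x + y) / np.sqrt(2.0)     # responses along the rotated direction

def excess_kurtosis(v):
    v = (v - v.mean()) / v.std()
    return np.mean(v**4) - 3.0

print(round(excess_kurtosis(x), 1))  # near 0: Gaussian marginal
print(round(excess_kurtosis(r), 1))  # near 3: heavy-tailed rotated marginal
```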

11.-Older approaches repeatedly find the least-Gaussian directions and Gaussianize them. Iterating this process resembles a deep neural network with many layers.

12.-Divisive normalization, inspired by biological neurons, Gaussianizes the joint density of multiple filters in one step.

13.-Divisive normalization introduces cross-filter terms in the denominator, creating a shared joint nonlinearity across feature maps.

14.-The model captures the continuum of shapes observed in joint densities of pairs of linear filters, from elliptical to marginally independent.

15.-The model is extended to multiple dimensions by learning both the filters and normalization parameters jointly. This is called generalized divisive normalization.
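Points 13 and 15 can be sketched as a function. The parameterization below follows the GDN form y_i = x_i / (β_i + Σ_j γ_ij |x_j|^α)^ε; for simplicity it uses a single shared exponent pair rather than per-pair exponents, and all parameter values are made-up illustration choices.

```python
import numpy as np

# Sketch of a generalized-divisive-normalization (GDN) nonlinearity.
# The cross terms gamma[i, j] (j != i) couple the filter responses, so each
# output depends jointly on all inputs, not just its own input.

def gdn(x, beta, gamma, alpha=2.0, eps=0.5):
    """x: (n,) filter responses; beta: (n,) offsets; gamma: (n, n) weights."""
    denom = (beta + gamma @ np.abs(x) ** alpha) ** eps
    return x / denom

n = 4
x = np.array([2.0, -1.0, 0.5, 3.0])
beta = np.full(n, 0.1)
gamma = np.full((n, n), 0.05) + 0.2 * np.eye(n)  # cross terms plus a stronger self term
y = gdn(x, beta, gamma)
print(y)
```

With alpha = 2 and eps = 0.5 this reduces to classic divisive normalization, where each response is divided by a pooled L2-like measure of all responses.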

16.-Several previous image models can be seen as special cases of Gaussianization and generalized divisive normalization.

17.-Under certain conditions, the log-likelihood determinant term breaks down into additive terms, enabling fitting of convolutional and stacked versions.
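One way to read point 17 (a sketch, assuming the layered composition described in point 18): for a composition g = g_L ∘ … ∘ g_1 the log-determinant is a sum over layers, and when a layer's Jacobian is diagonal or triangular it further splits into per-coordinate terms:

```latex
\log\left|\det J_g(x)\right|
  = \sum_{l=1}^{L} \log\left|\det J_{g_l}(x_{l-1})\right|,
\qquad
\log\left|\det J_{g_l}\right|
  = \sum_{i} \log\left|\frac{\partial (g_l)_i}{\partial (x_{l-1})_i}\right|
\quad \text{(diagonal or triangular case)}.
```

These additive terms are what make convolutional and stacked versions tractable to fit.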

18.-A deep neural network can be built using the joint divisive normalization nonlinearities instead of the typical pointwise nonlinearities.

19.-One layer of joint normalization Gaussianizes the data much more effectively than multiple layers of marginal pointwise nonlinearities.

20.-Beyond density modeling, Gaussianization learns representations that may relate to biology. Distances in the representation may predict human perceptual judgments.

21.-Euclidean distances in pixel space do not correlate well with human perception of image distortion and visual quality.

22.-Reordering distorted images by Euclidean distance in the Gaussianized representation aligns better with perceptual expectations compared to pixel distance.
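The comparison in points 21 and 22 can be sketched as two distance functions. The transform g below is a made-up stand-in with a divisive-normalization flavor, not the learned model from the paper.

```python
import numpy as np

# Compare two images by Euclidean distance in pixel space versus in a
# (stand-in) Gaussianized representation.

def g(img):
    # hypothetical pointwise stand-in for the learned Gaussianizing transform
    return img / np.sqrt(0.1 + img ** 2)

def pixel_distance(a, b):
    return np.linalg.norm(a - b)

def representation_distance(a, b):
    return np.linalg.norm(g(a) - g(b))

rng = np.random.default_rng(2)
ref = rng.uniform(0.0, 1.0, size=(8, 8))       # toy reference "image"
noisy = ref + rng.normal(scale=0.1, size=ref.shape)  # toy distortion
print(pixel_distance(ref, noisy), representation_distance(ref, noisy))
```

The paper's claim is that ordering distortions by the second distance, using the actual learned transform, matches human judgments far better than the first.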

23.-Euclidean distances in the Gaussianized representation correlate much more strongly (0.84) with human distortion judgments than pixel distances (0.40).

24.-The unsupervised Gaussianized representation outperforms the industry standard (0.74 correlation) for measuring perceptual image quality, without supervised fitting to human responses.

25.-The strong correlation between Gaussianized representation distances and human perceptual judgments is surprising and merits further research.

26.-Gaussianization serves as a vehicle for both density modeling and unsupervised representation learning.

27.-Generalized divisive normalization applies joint nonlinearities across feature maps, inspired by and generalizing biological neural nonlinearities.

28.-One layer of generalized divisive normalization Gaussianizes image data much better than multiple layers of marginal pointwise nonlinearities.

29.-The learned representation accounts for human image quality judgments better than the industry standard, despite being unsupervised.

30.-More work is needed to understand why Gaussianization yields perceptually relevant representations.

Knowledge Vault built by David Vivancos 2024