Johannes Ballé, Valero Laparra, Eero Simoncelli, ICLR 2016 - Density Modeling of Images using a Generalized Normalization Transformation

**Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:**

```mermaid
graph LR
classDef unsupervised fill:#f9d4d4, font-weight:bold, font-size:14px;
classDef density fill:#d4f9d4, font-weight:bold, font-size:14px;
classDef gaussianization fill:#d4d4f9, font-weight:bold, font-size:14px;
classDef representation fill:#f9f9d4, font-weight:bold, font-size:14px;
classDef perception fill:#f9d4f9, font-weight:bold, font-size:14px;
Main[Johannes Ballé et al. ICLR 2016] --> A[Unsupervised learning: structure in unlabeled data 1]
Main --> B[Density estimation: fitting positive-valued function parameters 2]
B --> C[Normalizing requires intractable integration 3]
Main --> D[Gaussianization: transforming data to standard normal 4]
D --> E[Input density modeled by inverse transformation 5]
D --> F[Parameters fit by maximizing likelihood 6]
Main --> G[Heavy-tailed symmetric marginal response distributions 7]
G --> H[Logistic Gaussianization: poor fit, discontinuities 8]
G --> I[Alternative Gaussianization: affine, exponentiation, division 9]
D --> J[Marginal Gaussianization doesn't guarantee joint Gaussianity 10]
J --> K[Older approaches: repeatedly Gaussianize new directions 11]
Main --> L[Divisive normalization Gaussianizes joint density 12]
L --> M[Cross-filter terms create shared joint nonlinearity 13]
L --> N[Model captures joint density shapes 14]
L --> O[Extension to multiple dimensions: generalized divisive normalization 15]
D --> P[Previous models are special cases 16]
Main --> Q[Log-likelihood determinant breaks into additive terms 17]
Main --> R[Deep network with joint normalization nonlinearities 18]
R --> S[One layer joint normalization outperforms multiple marginal 19]
Main --> T[Gaussianization learns representations relating to biology 20]
T --> U[Pixel space distances don't correlate with perception 21]
T --> V[Gaussianized representation aligns with perceptual expectations 22]
V --> W[Gaussianized distances correlate strongly with human judgments 23]
T --> X[Unsupervised Gaussianized representation outperforms industry standard 24]
T --> Y[Correlation merits further research 25]
Main --> Z[Gaussianization: density modeling and representation learning 26]
Z --> AA[Generalized divisive normalization applies joint nonlinearities 27]
Z --> AB[One layer outperforms multiple marginal layers 28]
Z --> AC[Unsupervised representation accounts for human judgments 29]
Z --> AD[Understanding Gaussianization's perceptual relevance needs more work 30]
class A unsupervised;
class B,C density;
class D,E,F,H,I,J,K,P gaussianization;
class L,M,N,O,Q,R,S,T,U,V,W,X,Y,Z,AA,AB,AC,AD representation;
class U,V,W,X,Y perception;
```


**Resume:**

**1.-**Unsupervised learning aims to find structure in unlabeled data and may help understand how sensory representations are learned in the brain.

**2.-**Density estimation is a classic unsupervised learning approach, typically fitting parameters of a positive-valued function to data.

**3.-**Normalizing a parametric density function requires an integral over the data space that is generally intractable. An alternative is to learn a parametric transformation that Gaussianizes the data.

**4.-**Gaussianization transforms data into a standard normal density. The input density can be modeled by pushing the Gaussian through the inverse transformation.

**5.-**Computing the input density from the Gaussianized data requires only derivatives (the Jacobian of the transformation), which is far cheaper than numerical integration, especially on modern hardware.
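
In symbols, if g is the learned transformation and z = g(x) is modeled as standard normal, the change-of-variables identity gives the input density directly from the Jacobian of g (stated here for reference):

```latex
p_x(x) \;=\; \mathcal{N}\!\left(g(x);\, 0, I\right)\,\left|\det \frac{\partial g(x)}{\partial x}\right|
```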

**6.-**Parameters of the Gaussianizing transformation are fit by maximizing the likelihood, taking derivatives, and using stochastic gradient descent.
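
A minimal sketch of this fitting loop, using PyTorch on a toy 1-D affine transform g(x) = a·x + b (the data, the transform, and all names are illustrative placeholders, not the paper's parameterization; an affine map cannot fully Gaussianize heavy-tailed data, so this only shows the optimization machinery):

```python
import math
import torch

# Toy heavy-tailed 1-D data, standing in for linear filter responses.
x = torch.distributions.Laplace(0.0, 1.0).sample((10000,))

# Parameters of a toy Gaussianizing transform g(x) = a * x + b.
a = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(0.0, requires_grad=True)
opt = torch.optim.SGD([a, b], lr=1e-2)

for step in range(2000):
    z = a * x + b
    # Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|
    log_normal = -0.5 * z ** 2 - 0.5 * math.log(2 * math.pi)
    log_det = torch.log(torch.abs(a))
    loss = -(log_normal + log_det).mean()   # negative log-likelihood
    opt.zero_grad()
    loss.backward()
    opt.step()
```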

**7.-**Images filtered with linear filters yield heavy-tailed symmetric marginal response distributions. Gaussianization aims to expand the center and contract the tails.

**8.-**A logistic function Gaussianization has poor fit at the center and discontinuities at the tails due to saturation.

**9.-**An alternative Gaussianization using an affine function, exponentiation, and division fits the data better without discontinuities.

**10.-**Marginal Gaussianization of individual filters does not guarantee joint Gaussianity. Rotated marginals reveal non-Gaussian structure.

**11.-**Older approaches repeatedly find new Gaussian directions and Gaussianize. The process is similar to a deep neural network with many layers.

**12.-**Divisive normalization, inspired by biological neurons, Gaussianizes the joint density of multiple filters in one step.

**13.-**Divisive normalization introduces cross-filter terms in the denominator, creating a shared joint nonlinearity across feature maps.

**14.-**The model captures the continuum of shapes observed in joint densities of pairs of linear filters, from elliptical to marginally independent.

**15.-**The model is extended to multiple dimensions by learning both the filters and normalization parameters jointly. This is called generalized divisive normalization.
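
A widely used instance of this transform, with linear filter responses z = Hx and squared responses in the denominator, is written below; in the paper the exponents themselves are additional learned parameters, so this is a special case rather than the full parameterization:

```latex
z = H x, \qquad
y_i = \frac{z_i}{\left(\beta_i + \sum_j \gamma_{ij}\, z_j^{\,2}\right)^{1/2}}
```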

**16.-**Several previous image models can be seen as special cases of Gaussianization and generalized divisive normalization.

**17.-**Under certain conditions, the log-likelihood determinant term breaks down into additive terms, enabling fitting of convolutional and stacked versions.
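
The reason is the chain rule for Jacobians: for a composition g = g_K ∘ ⋯ ∘ g_1, the log-determinant of the overall Jacobian is the sum of the per-layer log-determinants, so the log-likelihood decomposes as:

```latex
\log p_x(x) \;=\; \log \mathcal{N}\!\left(g(x);\, 0, I\right)
\;+\; \sum_{k=1}^{K} \log \left|\det \frac{\partial g_k(u_{k-1})}{\partial u_{k-1}}\right|,
\qquad u_0 = x,\;\; u_k = g_k(u_{k-1}).
```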

**18.-**A deep neural network can be built using the joint divisive normalization nonlinearities instead of the typical pointwise nonlinearities.
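
A minimal sketch of such a joint nonlinearity as a network layer, again using the simplified squared-response form of divisive normalization (SimpleGDN and all parameter names are hypothetical; the softplus reparameterization is only there to keep the weights positive):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleGDN(nn.Module):
    """Joint divisive normalization across channels:
    y_i = x_i / sqrt(beta_i + sum_j gamma_ij * x_j**2)."""
    def __init__(self, channels):
        super().__init__()
        # Unconstrained parameters; softplus keeps beta and gamma positive.
        self.beta_raw = nn.Parameter(torch.zeros(channels))
        self.gamma_raw = nn.Parameter(torch.zeros(channels, channels))

    def forward(self, x):                      # x: (batch, channels)
        beta = F.softplus(self.beta_raw) + 1e-6
        gamma = F.softplus(self.gamma_raw)
        norm = beta + x.pow(2) @ gamma.t()     # cross-channel pooling
        return x / norm.sqrt()

# Usage: drop it in where a pointwise nonlinearity (e.g. ReLU) would go.
layer = nn.Sequential(nn.Linear(16, 16), SimpleGDN(16))
y = layer(torch.randn(8, 16))
```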

**19.-**One layer of joint normalization Gaussianizes the data much more effectively than multiple layers of marginal pointwise nonlinearities.

**20.-**Beyond density modeling, Gaussianization learns representations that may relate to biology. Distances in the representation may predict human perceptual judgments.

**21.-**Euclidean distances in pixel space do not correlate well with human perception of image distortion and visual quality.

**22.-**Reordering distorted images by Euclidean distance in the Gaussianized representation aligns better with perceptual expectations than ordering by pixel distance.
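
The comparison here is simply Euclidean distance measured after the transform instead of on raw pixels, with g the learned Gaussianizing transformation:

```latex
d(x, \tilde{x}) \;=\; \left\| g(x) - g(\tilde{x}) \right\|_2
```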

**23.-**Euclidean distances in the Gaussianized representation correlate much more strongly (0.84) with human distortion judgments than pixel distances (0.40).

**24.-**The unsupervised Gaussianized representation outperforms the industry standard (0.74 correlation) for measuring perceptual image quality, without supervised fitting to human responses.

**25.-**The strong correlation between Gaussianized representation distances and human perceptual judgments is surprising and merits further research.

**26.-**Gaussianization serves as a vehicle for both density modeling and unsupervised representation learning.

**27.-**Generalized divisive normalization applies joint nonlinearities across feature maps, inspired by and generalizing biological neural nonlinearities.

**28.-**One layer of generalized divisive normalization Gaussianizes image data much better than multiple layers of marginal pointwise nonlinearities.

**29.-**The learned representation accounts for human image quality judgments better than the industry standard, despite being unsupervised.

**30.-**More work is needed to understand why Gaussianization yields perceptually relevant representations.

Knowledge Vault built by David Vivancos 2024