Knowledge Vault 5/24 - CVPR 2017
Densely Connected Convolutional Networks
Gao Huang, Zhuang Liu, Laurens van der Maaten, & Kilian Q. Weinberger
< Resume Image >

Concept Graph & Resume using Claude 3 Opus | ChatGPT-4o | Llama 3:

graph LR
classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
classDef connectivity fill:#d4f9d4, font-weight:bold, font-size:14px
classDef efficiency fill:#d4d4f9, font-weight:bold, font-size:14px
classDef structure fill:#f9f9d4, font-weight:bold, font-size:14px
classDef performance fill:#f9d4f9, font-weight:bold, font-size:14px
classDef extensions fill:#d4f9f9, font-weight:bold, font-size:14px
A[Densely Connected Convolutional Networks] --> B[Dense connectivity: Layer-to-layer connection, feature reuse. 1]
A --> C[Low bottleneck: Direct connections, less information loss. 2]
A --> D[Compact models: Thinner layers, compact architectures. 3]
A --> E[Efficiency: Fewer parameters, less computation. 4]
A --> F[Growth rate k: Feature maps per layer, small. 5]
A --> G[Feature concatenation: Preceding features input to subsequent layers. 6]
G --> H[1x1 convolutions: Reduce input maps, improve efficiency. 7]
A --> I[Dense blocks: Multiple blocks, pooling/convolution between. 8]
B --> J[Implicit supervision: Early layers receive direct loss supervision. 9]
B --> K[Diversified features: Aggregate information from all preceding layers. 10]
B --> L[Low complexity features: Maintain simple and complex features. 11]
L --> M[Smooth decision boundaries: Improved generalization, varying complexity. 12]
A --> N[CIFAR-10: Lower error, fewer parameters than ResNets. 13]
A --> O[CIFAR-100: Similar trends, dense nets outperform ResNets. 14]
A --> P[ImageNet: Similar accuracy, half parameters/computation of ResNets. 15]
A --> Q[Reduced overfitting: Less prone, especially with limited data. 16]
A --> R[State-of-the-art: Top performance on CIFAR at publication. 17]
A --> S[Multi-scale DenseNet: Extension, learns multi-scale features. 18]
S --> T[Dense connections at each scale. 19]
S --> U[Multiple classifiers: Attached to intermediate layers, early exiting. 20]
U --> V[Confidence thresholding: Easy examples exit early, confidence-based. 21]
S --> W[Faster inference: 2.6x ResNets, 1.3x regular dense nets. 22]
A --> X[Open-source: Code and models released on GitHub. 23]
A --> Y[Third-party implementations: Many available after publication. 24]
A --> Z[Memory-efficient implementation: Technical report on improved memory usage. 25]
class A main
class B,C connectivity
class D,E,F,H efficiency
class G,I,J,K,L,M structure
class N,O,P,Q,R performance
class S,T,U,V,W,X,Y,Z extensions

Resume:

1.- Dense connectivity: Connects each layer to every subsequent layer within a dense block in a feed-forward fashion, enabling feature reuse and improving information and gradient flow.
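A minimal PyTorch sketch of this connectivity pattern (the class and parameter names are illustrative, not taken from the official implementation): each layer receives the concatenation of all earlier feature maps in its block and contributes k new ones.

import torch
import torch.nn as nn

class ToyDenseBlock(nn.Module):
    """Every layer sees the concatenation of the block input and all earlier layer outputs."""
    def __init__(self, in_channels, growth_rate, num_layers):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth_rate),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth_rate, growth_rate,
                          kernel_size=3, padding=1, bias=False),
            )
            for i in range(num_layers)
        ])

    def forward(self, x):
        features = [x]                                       # x_0
        for layer in self.layers:
            features.append(layer(torch.cat(features, 1)))   # x_l = H_l([x_0, ..., x_{l-1}])
        return torch.cat(features, 1)                        # block output: all feature maps

# e.g. ToyDenseBlock(16, growth_rate=12, num_layers=4) maps (N, 16, H, W) -> (N, 64, H, W)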

2.- Low information bottleneck: Direct connections between layers reduce information loss as data passes through the network.

3.- Compact models: Dense connectivity allows for thinner layers and more compact models compared to traditional architectures.

4.- Computational and parameter efficiency: Because each layer adds only a small number of feature maps, dense networks require fewer parameters and less computation than comparably accurate architectures.

5.- Growth rate (k): The number of new feature maps each layer in a dense block produces; typically kept small (e.g., k = 12 or 32).
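A quick back-of-the-envelope view of how k controls layer width (the 64-channel block input and k = 32 below are example numbers only):

k0, k = 64, 32                  # example block input channels and growth rate
for l in range(1, 7):           # a 6-layer dense block
    print(f"layer {l} sees {k0 + (l - 1) * k} input channels and adds {k} new ones")
# block output has k0 + 6 * k = 256 feature maps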

6.- Feature concatenation: Features from preceding layers are concatenated together as input to subsequent layers in a dense block.

7.- 1x1 convolutions: Bottleneck layers (DenseNet-B) apply 1x1 convolutions to reduce the number of input feature maps before the 3x3 convolution, improving parameter efficiency in deeper layers.
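A hedged sketch of such a bottleneck layer (the 4*k intermediate width follows the DenseNet-B design described in the paper; the function name and arguments are mine):

import torch.nn as nn

def bottleneck_layer(in_channels, growth_rate):
    """BN-ReLU-Conv(1x1) shrinks the concatenated input to 4*k maps before the 3x3 convolution."""
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, 4 * growth_rate, kernel_size=1, bias=False),
        nn.BatchNorm2d(4 * growth_rate),
        nn.ReLU(inplace=True),
        nn.Conv2d(4 * growth_rate, growth_rate, kernel_size=3, padding=1, bias=False),
    )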

8.- Dense blocks: The network is divided into multiple dense blocks, with transition layers (a 1x1 convolution followed by pooling) between them to downsample the feature maps.
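A minimal sketch of such a transition layer (the compression factor below corresponds to the DenseNet-C variant; the function name is illustrative):

import torch.nn as nn

def transition_layer(in_channels, compression=0.5):
    """Downsamples between dense blocks; compression < 1 also thins the feature maps (DenseNet-C)."""
    out_channels = int(in_channels * compression)
    return nn.Sequential(
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
        nn.AvgPool2d(kernel_size=2, stride=2),
    )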

9.- Implicit deep supervision: Earlier layers receive more direct supervision from the loss function due to dense connectivity.

10.- Diversified features: Feature maps in dense nets tend to be more diverse as they aggregate information from all preceding layers.

11.- Low complexity features: Dense nets maintain features of varying complexity, allowing classifiers to use both simple and complex features.

12.- Smooth decision boundaries: Using features of varying complexity tends to result in smoother decision boundaries and improved generalization.

13.- Performance on CIFAR-10: Dense nets achieve lower error rates with fewer parameters compared to ResNets.

14.- Performance on CIFAR-100: Similar trends as CIFAR-10, with dense nets outperforming ResNets.

15.- Performance on ImageNet: Dense nets achieve similar accuracy to ResNets with roughly half the parameters and computation.

16.- Reduced overfitting: Dense nets are less prone to overfitting, especially when training data is limited.

17.- State-of-the-art results: At the time of publication, dense nets achieved state-of-the-art performance on CIFAR datasets.

18.- Multi-scale DenseNet: An extension of dense nets that learns features at multiple scales for faster inference.

19.- Dense connections at each scale: Multi-scale dense nets introduce dense connectivity within each scale of features.

20.- Multiple classifiers: Multi-scale dense nets attach classifiers to intermediate layers to enable early exiting.

21.- Confidence thresholding: During inference, easier examples can exit early based on the confidence of intermediate classifiers.
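A rough sketch of that early-exit loop (the classifier list, softmax-confidence criterion, and threshold value are illustrative assumptions, not the exact Multi-scale DenseNet procedure):

import torch

def early_exit_predict(intermediate_features, classifiers, threshold=0.9):
    """Return the first intermediate prediction whose max softmax score clears the threshold."""
    pred, conf = None, None
    for feats, clf in zip(intermediate_features, classifiers):
        probs = torch.softmax(clf(feats), dim=1)   # single-example batch assumed
        conf, pred = probs.max(dim=1)
        if conf.item() >= threshold:               # easy example: exit early
            break
    return pred, conf                              # hard examples fall through to the last classifier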

22.- Faster inference: Multi-scale dense nets achieve 2.6x faster inference than ResNets and 1.3x faster than regular dense nets.

23.- Open-source code and models: The authors released their code and pre-trained models on GitHub.

24.- Third-party implementations: Many independent implementations of dense nets became available after publication.

25.- Memory-efficient implementation: The authors published a technical report detailing how to implement dense nets in a more memory-efficient manner.
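One common way to get a similar effect in PyTorch is to recompute the cheap concatenation/BN/ReLU operations during the backward pass instead of caching them; the sketch below uses gradient checkpointing, which mirrors the spirit of the report rather than its exact shared-storage scheme:

import torch
from torch.utils.checkpoint import checkpoint

def memory_efficient_dense_layer(layer, features):
    """Recompute layer(cat(features)) on the backward pass instead of storing the concatenated tensor."""
    def run(*feats):
        return layer(torch.cat(feats, dim=1))
    # use_reentrant=False is the recommended mode in recent PyTorch releases
    return checkpoint(run, *features, use_reentrant=False)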

Knowledge Vault built by David Vivancos 2024