Knowledge Vault 2/40 - ICLR 2014-2023
Jonathon Cai, Richard Shin, Dawn Song, ICLR 2017 - Making Neural Programming Architectures Generalize via Recursion

Concept Graph & Resume using Claude 3 Opus | Chat GPT4 | Gemini Adv | Llama 3:

graph LR
  classDef neural fill:#f9d4d4, font-weight:bold, font-size:14px;
  classDef recursion fill:#d4f9d4, font-weight:bold, font-size:14px;
  classDef npi fill:#d4d4f9, font-weight:bold, font-size:14px;
  classDef verification fill:#f9f9d4, font-weight:bold, font-size:14px;
  classDef results fill:#f9d4f9, font-weight:bold, font-size:14px;
  A[Jonathon Cai et al. ICLR 2017] --> B[Neural program synthesis generates programs from examples 1]
  A --> C[Existing architectures lack generalization proof 2]
  A --> D[Recursion introduced to address generalization challenges 3]
  A --> E[NPI executes programs by selecting actions 4]
  E --> F[NPI trained on execution traces 5]
  D --> G[Recursive NPI programs have variable-length traces 6]
  G --> H[Example task: grade-school addition 7]
  G --> I[Non-recursive addition requires fixed-length trace 8]
  D --> J[Recursion enables generalization via base cases 9]
  D --> K[Recursion enables tractable verification set 10]
  D --> L[Recursive NPI trained on recursive traces 11]
  K --> M[Learned program verified by matching oracle 12]
  K --> N[Verification set covers base cases and calls 13]
  K --> O[Non-recursive verification set intractable for all lengths 14]
  A --> P[Experiments on addition, sorting, topological sort, quicksort 15]
  P --> Q[Recursive NPIs achieve 100% test accuracy 16]
  P --> R[Recursive programs verified for perfect generalization 17]
  A --> S[First paper introducing recursion in neural programs 18]
  S --> T[Recursive programs generalize better to complex inputs 19]
  S --> U[Recursive programs enable proof of generalization 20]
  S --> V[Recursion important for handling complex problems 21]
  S --> W[NPI architecture used as first step 22]
  W --> X[Explicit recursive traces used for training 23]
  S --> Y[Future work: extend to other architectures 24]
  Y --> Z[Less supervised training methods to be explored 25]
  Y --> AA[Input-output examples to induce recursive programs 26]
  Y --> AB[Domains beyond program synthesis could benefit 27]
  S --> AC[Strong empirical generalization results demonstrated 28]
  S --> AD[Novel verification procedure proves perfect generalization 29]
  S --> AE[Recursion is important step for synthesis 30]
  class A,B,C,D,S,T,U,V,W,X,Y,Z,AA,AB,AC,AD,AE neural;
  class D,G,H,I,J,K,L recursion;
  class E,F,W,X npi;
  class K,M,N,O,AD verification;
  class P,Q,R,AC results;

Resume:

1.-Neural program synthesis aims to generate programs from input-output examples using neural networks.

2.-Existing neural program architectures such as NPI struggle to generalize to inputs more complex than those seen during training, and they come with no proof of generalization.

3.-The paper introduces recursion into neural program architectures to address the generalization challenges.

4.-NPI (Neural Programmer-Interpreter) consists of an LSTM controller that executes programs by selecting actions that either modify the environment or call subprograms.
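
A minimal sketch of what one controller step might look like in PyTorch (names, dimensions, and output heads are illustrative assumptions, not the authors' code): the LSTM core consumes an encoding of the environment together with the embedding of the currently running program, and predicts whether to stop, which subprogram to call next, and its arguments.

```python
# Sketch of an NPI-style controller step (assumed shapes and head names).
import torch
import torch.nn as nn

class NPIController(nn.Module):
    def __init__(self, num_programs, num_args, state_dim=128, prog_dim=64, env_dim=32):
        super().__init__()
        self.prog_embed = nn.Embedding(num_programs, prog_dim)  # program key embeddings
        self.core = nn.LSTMCell(env_dim + prog_dim, state_dim)  # recurrent controller core
        self.stop_head = nn.Linear(state_dim, 1)                 # P(end current program)
        self.prog_head = nn.Linear(state_dim, num_programs)      # which subprogram to call
        self.arg_head = nn.Linear(state_dim, num_args)           # arguments for that call

    def forward(self, env_obs, prog_id, hc=None):
        # env_obs: [B, env_dim] encoding of the environment; prog_id: [B] program indices
        x = torch.cat([env_obs, self.prog_embed(prog_id)], dim=-1)
        h, c = self.core(x, hc)
        return (torch.sigmoid(self.stop_head(h)),  # probability of returning from this call
                self.prog_head(h),                 # logits over subprograms / primitive actions
                self.arg_head(h),                  # logits over argument values
                (h, c))                            # carried LSTM state
```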

5.-NPI is trained on execution traces showing the sequence of actions, not just input-output examples.
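
For intuition, a supervision trace can be pictured as a sequence of (caller, callee, arguments) steps; the snippet below is a hypothetical illustration for adding 25 + 19, using the addition subprograms mentioned later, not the paper's exact trace format.

```python
# Hypothetical shape of one supervision trace for ADD on 25 + 19 = 44.
trace = [
    ("ADD",  "ADD1",   []),          # process the current (ones) column
    ("ADD1", "WRITE",  ["OUT", 4]),  # 5 + 9 = 14: write digit 4
    ("ADD1", "CARRY",  []),          # propagate the carry of 1
    ("ADD",  "LSHIFT", []),          # shift all pointers one column left
    ("ADD",  "ADD1",   []),          # process the tens column
    ("ADD1", "WRITE",  ["OUT", 4]),  # 2 + 1 + carry = 4: write digit 4
    ("ADD",  "LSHIFT", []),          # done: no more columns
]
# Training maximizes the likelihood of each step given the steps before it.
```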

6.-A recursive NPI program contains functions that call themselves, so computations of arbitrary size are expressed as nested calls whose total trace length varies with input complexity.

7.-Grade-school addition is used as an example task, with the ADD1, LSHIFT and CARRY functions.

8.-Non-recursive addition produces a single flat trace whose length grows with the input, whereas recursive addition breaks the computation into recursive calls, each contributing only a short, bounded piece of the trace.
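
The structural difference can be illustrated with plain Python (ordinary hand-written programs, not the learned NPI programs): the iterative version's loop body runs once per digit inside a single call, while the recursive version handles one column and then re-invokes itself on the remaining columns.

```python
# Plain-Python analogue of the iterative vs. recursive contrast.
def add_iterative(a, b):
    """a, b are digit lists, least-significant first: 25 -> [5, 2]."""
    out, carry = [], 0
    for i in range(max(len(a), len(b))):          # loop body runs once per digit
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(s % 10)
        carry = s // 10
    return out + ([carry] if carry else [])

def add_recursive(a, b, carry=0):
    if not a and not b:                           # base case: no columns remain
        return [carry] if carry else []
    s = (a[0] if a else 0) + (b[0] if b else 0) + carry
    # one column of work, then a recursive call on the rest of the columns
    return [s % 10] + add_recursive(a[1:], b[1:], s // 10)

assert add_iterative([5, 2], [9, 1]) == add_recursive([5, 2], [9, 1]) == [4, 4]
```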

9.-Recursion enables generalization because the learned network only needs to handle the base cases and recursive calls.

10.-Recursion also enables a tractable verification set to prove generalization, since only base cases and recursive calls are needed.

11.-To learn recursive NPI programs, the same architecture is used but trained on recursive traces instead of non-recursive traces.
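
A rough sketch of such step-wise supervised training, reusing the assumed NPIController above: each trace step provides the target subprogram to call, and the loss is a per-step cross-entropy over the controller's predictions (the argument and stop-probability terms of the full loss are omitted here).

```python
# Sketch of supervised training on a single (recursive) execution trace.
import torch.nn.functional as F

def trace_loss(controller, env_encodings, prog_ids, target_next_progs):
    """env_encodings: float [T, env_dim]; prog_ids, target_next_progs: long [T]."""
    hc, loss = None, 0.0
    for t in range(len(prog_ids)):
        _, prog_logits, _, hc = controller(
            env_encodings[t].unsqueeze(0), prog_ids[t].unsqueeze(0), hc)
        # supervise the choice of next subprogram at every step of the trace
        loss = loss + F.cross_entropy(prog_logits, target_next_progs[t].unsqueeze(0))
    return loss / len(prog_ids)
```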

12.-Perfect generalization of the learned program is verified by matching its execution to an oracle on a verification set.

13.-The verification set covers all base cases and recursive function calls to achieve full coverage with a finite set.

14.-Without recursion, the verification set would need to cover all input lengths, which is intractable.
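
The verification idea fits in a few lines (helper names below are hypothetical): because the recursive program reduces every input to a finite set of base cases and one-step recursive reductions, it suffices to check that the learned program and an oracle agree on every configuration in that finite set.

```python
# Sketch of oracle-matching verification over a finite verification set.
def verify(learned_step, oracle_step, verification_set):
    """Both step functions map an (environment, program, arguments) configuration
    to the next call that is emitted; verification_set enumerates every base-case
    and recursive-call configuration the recursive program can reach."""
    for config in verification_set:
        if learned_step(config) != oracle_step(config):
            return False   # any mismatch breaks the generalization proof
    return True            # agreement on the whole finite set => provably
                           # perfect generalization, in the sense of the paper
```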

15.-Experiments are conducted on four tasks: grade-school addition, bubble sort, topological sort, and quicksort.

16.-The recursive NPI programs achieve 100% accuracy on all test problems, while non-recursive versions degrade on longer inputs.
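
An illustrative length-generalization harness (interfaces assumed, addition used as the example task): accuracy is measured as input length grows, the regime in which the paper reports recursive NPI staying at 100% while non-recursive variants degrade.

```python
# Illustrative length-generalization evaluation (assumed run_add interface).
import random

def accuracy_by_length(run_add, lengths, trials=100, seed=0):
    """run_add(a, b) is the learned addition program; returns per-length accuracy."""
    rng = random.Random(seed)
    results = {}
    for n in lengths:
        correct = 0
        for _ in range(trials):
            a = rng.randrange(10 ** (n - 1), 10 ** n)   # random n-digit operands
            b = rng.randrange(10 ** (n - 1), 10 ** n)
            correct += int(run_add(a, b) == a + b)      # compare against ground truth
        results[n] = correct / trials
    return results

# e.g. accuracy_by_length(learned_add, lengths=[5, 10, 100, 1000])
```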

17.-The learned recursive programs are successfully verified to have provably perfect generalization using the oracle matching procedure.

18.-This paper is the first to introduce and demonstrate the benefits of recursion in neural program architectures.

19.-Recursive neural programs can generalize better to more complex inputs compared to non-recursive programs.

20.-Recursive neural programs also enable a proof of generalization, which was not possible with non-recursive programs.

21.-The paper's results show recursion is an important concept for neural program architectures to handle more complex problems.

22.-As a first step, the paper incorporated recursion into the NPI architecture specifically.

23.-Explicit recursive execution traces were used to train the recursive NPI programs in a supervised way.

24.-Future work could extend the recursive approach to other neural architectures beyond just NPI.

25.-Less heavily supervised training methods that do not require explicit recursive traces could be explored for learning recursive programs.

26.-Input-output examples could potentially be used to induce recursive programs instead of full execution traces.

27.-Domains beyond program synthesis, such as perception and control problems, could benefit from the recursive neural program approach.

28.-The paper demonstrated strong empirical generalization results from using recursive neural programs compared to non-recursive baselines.

29.-The paper also provided a novel verification procedure to prove the perfect generalization of recursive neural programs.

30.-Incorporating recursion is an important step forward for neural program synthesis to solve more complex and realistic problems.

Knowledge Vault built by David Vivancos 2024