Knowledge Vault 4/97 - AI For Good 2024
Distorting or enhancing realities using Generative AI
Hao Li

Concept Graph & Summary using Claude 3 Opus | ChatGPT 4o | Llama 3:

graph LR
classDef ai fill:#f9d4d4,font-weight:bold,font-size:14px
classDef cgi fill:#d4f9d4,font-weight:bold,font-size:14px
classDef faces fill:#d4d4f9,font-weight:bold,font-size:14px
classDef deepfake fill:#f9f9d4,font-weight:bold,font-size:14px
classDef tech fill:#f9d4f9,font-weight:bold,font-size:14px
A[Distorting or enhancing realities using Generative AI] --> B[AI creates realistic images. 1]
A --> C[CGI revolutionized movie storytelling. 2]
C --> D[CGI faces struggled with uncanny valley. 3]
D --> E[Actor scans solved uncanny valley issue. 4]
A --> F[Automated real-time facial tracking developed. 5]
A --> G[Recreating faces manually costly. 6]
A --> H[Neural networks enhanced facial analysis. 7]
H --> I[GANs generate realistic images. 8]
H --> J[Real-time face swapping enables deep fakes. 9]
J --> K[Deep fakes used maliciously. 10]
J --> L[Deep fake tech now accessible. 11]
L --> M[Detection of fake content advancing. 12]
L --> N[Raising awareness of deep fakes. 13]
B --> O[Company uses AI for visual effects. 14]
O --> P[AI enhances puppeteering and de-aging. 15]
O --> Q[AI lip-syncs foreign films accurately. 16]
Q --> R[Pinscreen AI translates and lip-syncs videos. 17]
B --> S[AI generates any content. 18]
S --> T[Diffusion models guide image generation. 19]
T --> U[Notable text-to-image AI examples. 20]
S --> V[T-1000 inspired speaker's interest. 21]
V --> W[Films use VFX shots. 22]
S --> X[Early face capture costly. 23]
X --> Y[Face tracking tech acquired by Apple. 24]
Y --> Z[AI simulates expressions from photos. 25]
J --> AA[AI tech enables malicious deepfake use. 26]
AA --> AB[AI media needs detection awareness. 27]
O --> AC[AI enhances avatars and dubbing. 28]
AC --> AD[Pinscreen.ai for face-driven translation. 29]
S --> AE[Diffusion models generate varied content. 30]
class A,B,S ai
class C,D,E,F,G,H,V,X,Y,Z,AC,AE faces
class J,K,L,M,N,AA,AB deepfake
class O,P,Q,R,AC,AD tech

Summary:

1.- Generative AI can create and manipulate realistic images, including human faces, enabling things not possible in physical reality.

2.- CGI and visual effects have transformed storytelling in movies, with heavy usage in films like Avatar.

3.- Realistic CGI faces were difficult because of the uncanny valley effect: faces that are close to, but not fully, photorealistic appear creepy.

4.- Data capture and computer vision, like high-res actor scans, helped solve the uncanny valley problem but required complex systems.

5.- The speaker developed more deployable, automated real-time facial tracking systems, later used in technologies like iPhone Animoji.

6.- Facial reenactment, like recreating Paul Walker's face in Furious 7, required intensive manual labor and high costs.

7.- Deep neural networks, especially convolutional neural networks, enabled more robust computer vision for facial analysis.

8.- Generative adversarial networks (GANs) made it possible to generate realistic images, such as fake faces, by pitting a generator network against a discriminator network.

9.- The speaker's company developed real-time face swapping and facial reenactment using a single photo, enabling "deep fakes".

10.- Deep fakes led to nonconsensual celebrity pornography, disinformation campaigns, scams using fake identities, and other malicious uses.

11.- The accessibility of deep fake tech has made extremely convincing generated content easy for anyone to produce and spread.

12.- Deep fake detection based on biometrics or deep learning can help identify fake content. Commercial and research solutions are advancing.

13.- Raising awareness of deep fakes' potential is important. The speaker demoed real-time deep faked conversations to showcase the tech.

14.- The speaker's company uses generative AI to enhance digital avatars and for positive use cases in visual effects.

15.- Examples include puppeteering a speaking baby in a movie and de-aging or face replacement for actors in productions like Slumberland and Fallout.

16.- AI lip-syncing of foreign-language films is a major use case: it translates actor performances without the mismatched lip movements typical of dubbing.

17.- Pinscreen AI is a new product that lets users upload videos and have AI translate and lip-sync them into any language.

18.- Beyond faces, generative AI aims to generate any desired content using techniques like diffusion models.

19.- Diffusion models like Stable Diffusion break image generation into a sequence of denoising steps, with a text prompt guiding each step.

20.- Examples of impressive text-driven generative AI include DALL-E 2 and Midjourney for images, and Runway ML's text-to-video model.

21.- 30 years ago, the T-1000 in Terminator 2 inspired the speaker by showing CGI could create anything imaginable.

22.- Breakthroughs in realistic CGI led to films with thousands of VFX shots and only a couple of fully real shots.

23.- Early face capture required expensive multi-camera rigs. The speaker worked on more practical real-time face tracking systems.

24.- One of the speaker's face tracking technologies was acquired by Apple and became the basis for Animoji.

25.- Generative AI can convincingly simulate new facial expressions from a single still photo, with many applications in entertainment and visual effects.

26.- However, the same AI technologies have enabled nonconsensual deepfake pornography, fraud, political disinformation and other malicious uses.

27.- With AI-generated media now accessible to anyone, not just VFX studios, detection and public awareness are critical.

28.- The speaker's company uses AI face rendering to enhance digital avatars, de-age actors, and automatically dub foreign films.

29.- Pinscreen.ai will be a productized version of their AI dubbing tech for automated face-driven translation of any video.

30.- Beyond faces, advanced generative AI techniques like diffusion models can now generate almost any content imaginable just from text.
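
The summary describes diffusion models as starting from noise and removing it step by step, guided by a text prompt. The toy below sketches that loop on a single number instead of an image. It is not how Stable Diffusion is implemented: the prompt table `PROMPT_TARGETS`, the schedule `alpha_bar`, and the oracle `predict_noise` are all hypothetical stand-ins (a real system uses a text encoder and a trained neural network), and the update rule is a deterministic DDIM-style step chosen for simplicity.

```python
import math
import random

# Hypothetical "text encoder": maps a prompt to the value the sample should reach.
PROMPT_TARGETS = {"bright": 1.0, "dark": -1.0}


def alpha_bar(t, steps):
    """Assumed cosine-style schedule: share of clean signal left at step t (1.0 at t=0)."""
    return math.cos(0.5 * math.pi * t / (steps + 1)) ** 2


def predict_noise(x_t, t, steps, prompt):
    """Stand-in for a trained network: an oracle that already knows the prompt's target."""
    a = alpha_bar(t, steps)
    target = PROMPT_TARGETS[prompt]
    return (x_t - math.sqrt(a) * target) / math.sqrt(1.0 - a)


def generate(prompt, steps=50, seed=0):
    """Run the reverse (denoising) loop and return every intermediate value."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)  # start from pure noise
    trajectory = [x]
    for t in range(steps, 0, -1):
        a_t, a_prev = alpha_bar(t, steps), alpha_bar(t - 1, steps)
        eps = predict_noise(x, t, steps, prompt)  # prompt-guided noise estimate
        x0_est = (x - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)  # implied clean sample
        # DDIM-style deterministic step to the slightly less noisy level t-1.
        x = math.sqrt(a_prev) * x0_est + math.sqrt(1.0 - a_prev) * eps
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    for prompt in PROMPT_TARGETS:
        path = generate(prompt)
        print(prompt, "start:", round(path[0], 3), "end:", round(path[-1], 3))
```

Because the oracle's noise estimate is exact, every run ends on the prompt's target value regardless of the random starting point. With a learned network the estimate is only approximate, which is why real samplers use many steps over high-dimensional image data.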

Knowledge Vault built by David Vivancos 2024