Concept Graph & Summary using Claude 3 Opus | Chat GPT4o | Llama 3:
Summary:
1.- Panelists introduced: Devi Parikh, Jason Salavon, Aaron Hertzmann, Michal Irani - experts in AI, computer vision, computational creativity, and art.
2.- Devi Parikh's research: Attributes for human-machine communication, vision-language models, systems for enhancing human creativity, multimodal foundation models, generative AI.
3.- Jason Salavon's art: Software-based fine arts at intersection of art, culture, technology. Generative artworks from culturally loaded material.
4.- Aaron Hertzmann's work: Simulating artistic creativity computationally, style transfer, motion style learning, bringing historical perspective to AI art.
5.- Michal Irani's view: Machines memorize better than humans but can't generalize outside training distribution. Humans generalize from few examples.
6.- Deep internal learning: Abundant patch recurrence within a single image/video provides enough information to learn from. Adapts to image-specific data/degradation.
7.- Deep external learning: Training extensively on large datasets. True intelligence/creativity lies between the two extremes.
8.- Humor and AI: Current vision-language models lack a sense of humor and an understanding of what is funny in images.
9.- Vision-language model limitations: Models perform bag-of-words captioning without understanding object relations/compositionality. An open challenge.
10.- Training data concerns: Many artists feel their work is "stolen" when ingested for training. Complex issue touching ownership, copyright, compensation.
11.- Transformative use: Historically, artists learn by copying. Once art is public, it is hard to control its use. Cultural adjustment needed.
12.- Threatened artists: Those already in precarious positions are most concerned about AI displacing their creative work and income.
13.- Commodification of pixels: Concern that creative visual work may become a commodity, with pricing based on output resolution. Complex to navigate.
14.- Evaluation and metrics: Over-reliance on quantitative benchmarks. Reviewers should allow for qualitative arguments. Numbers can encourage lazy reviewing.
15.- Limitations of metrics: Difficult to quantitatively measure aspects like creativity and humor. Human evaluation still most useful.
16.- Future vision systems: Will likely be multimodal, leveraging knowledge across vision, language, speech, audio to expand capabilities.
17.- Data limitations: Expanding multimodal systems requires new high-quality datasets. A key challenge for the vision community.
18.- Engaging artists: CV community should collaborate with artists/designers. Open-source the models and make them easy for non-programmers to use.
19.- Rapid iteration tools: Artists want the ability to rapidly experiment with new techniques. Waiting for model output can be part of the creative process.
20.- Copyright questions: As in other domains like music, complex issues around data ownership and usage rights must be navigated.
21.- Intersection of vision/language: Many unsolved problems remain, e.g. visual compositionality, relations between objects, moving beyond bag-of-words.
22.- Human vs machine intelligence: Differences not fully understood. Adversarial examples fool AI but not humans. An area for study.
23.- Mind-reading research: Decoding visual experiences from brain activity (fMRI). Requires learning from limited data, an interesting challenge.
24.- Modeling curiosity/creativity: Can aspects of the human creative process, like open-ended exploration and serendipity, be captured computationally?
25.- Practical model utility: Making generative tools controllable and useful for end-users' actual creative needs is important, e.g. ControlNet.
26.- Democratizing creation: AI tools have the potential to make high-quality creative production accessible to the masses. New art forms may result.
27.- Next generation's creativity: Children who grow up as natives of these tools will likely produce innovative, hard-to-predict new works and genres.
28.- Control and editability: Giving users more control over generative model outputs is an underexplored but important research direction.
29.- Advice for students: Embrace large models as enabling infrastructure. Pursue research you're personally excited about and rally others around it.
30.- Interdisciplinary collaboration: Combining expertise across computer science, art, cognitive science, etc. can lead to novel insights and impactful work.
Knowledge Vault built by David Vivancos 2024