Knowledge Vault 5/23 - CVPR 2017
Commercializing computer vision: Success stories and lessons learned
Harry Shum

Concept Graph & Resume using Claude 3 Opus | Chat GPT4o | Llama 3:

graph LR
classDef microsoft fill:#f9d4d4, font-weight:bold, font-size:14px
classDef research fill:#d4f9d4, font-weight:bold, font-size:14px
classDef products fill:#d4d4f9, font-weight:bold, font-size:14px
classDef hololens fill:#f9f9d4, font-weight:bold, font-size:14px
classDef future fill:#f9d4f9, font-weight:bold, font-size:14px
A[Commercializing computer vision: Success stories and lessons learned] --> B[Microsoft sponsors CVPR, 25-year commitment. 1]
A --> C[Microsoft Research created, computer vision challenge posed. 2]
C --> D[Research key to commercializing computer vision. 3]
C --> E[Researchers freedom to pursue various styles. 4]
C --> F[Curiosity-driven research: panoramas, mosaics, video. 5]
C --> G[Microsoft wins ImageNet, COCO with ResNets, FCNs. 6]
A --> H[Deployment-driven research: customer needs, fast iterations. 7]
H --> I[Microsoft Pix: AI-powered iOS camera app. 8]
I --> J[Pix updates: style transfer from CVPR paper. 9]
A --> K[Products for various customer segments. 10]
K --> L[Cognitive Services: cloud APIs for intelligent apps. 11]
L --> M[Image captioning from COCO challenge winner. 12]
M --> N[Captionbot.ai: user feedback, Office deployment. 13]
L --> O[Custom Vision Service: easy image classifiers. 14]
O --> P[Custom Vision APIs for programmatic model improvement. 15]
A --> Q[Mission: empower every person and organization. 16]
Q --> R[HoloLens: untethered holographic computer. 17]
R --> S[HoloLens used for training, customer satisfaction. 18]
R --> T[HoloLens incorporates years of MS computer vision research. 19]
R --> U[HoloLens: on-board CV for tracking, mapping, gestures. 20]
U --> V[Latest HoloLens research: deep learning for tracking, gestures. 21]
R --> W[Custom silicon HPU for deep learning on HoloLens. 22]
W --> X[HPU 2.0: AI coprocessor, programmable by Microsoft. 23]
X --> Y[Demo: HPU 2.0 real-time hand segmentation, tracking. 24]
A --> Z[Opportunities span cloud services, edge devices. 25]
A --> AA[Lessons: human elements, research-product iteration. 26]
A --> AB[More companies recognize computer vision importance. 27]
AB --> AC[Fundamental research investment makes world worth defending. 28]
A --> AD[Microsoft Research: successful investments, field advancement. 29]
A --> AE[Pride in contributions, gratitude for CVPR community. 30]
class A,B,K,Q,Z,AA,AB,AC,AD,AE microsoft
class C,D,E,F,G research
class H,I,J,L,M,N,O,P products
class R,S,T,U,V,W,X,Y hololens

Resume:

1.- Microsoft has sponsored CVPR for the past 25 years, showing its long-term commitment to computer vision research.

2.- In 1991, Bill Gates created Microsoft Research and posed the challenge of creating computers that could see, hear, talk and understand humans.

3.- Microsoft believes research is key to commercializing computer vision, seeing it as a cycle from research to product to business.

4.- Microsoft gives researchers freedom to pursue curiosity-driven research, deployment-driven research, or other styles based on the impact they want to have.

5.- Examples of curiosity-driven research at Microsoft include work on panoramas, concentric mosaics, and panoramic video in the early days of CVPR.

6.- In 2015-2016, Microsoft won major categories at ImageNet and COCO challenges with very deep 152-layer residual networks (ResNets) and region-based fully convolutional networks.

7.- Deployment-driven research involves understanding customer segments, needs and pain points, then designing systems and products to address them through fast iterations.

8.- Microsoft Pix is an AI-powered iOS camera app incorporating technology from 20+ CVPR/ICCV/ECCV papers to deliver features like best shot selection.

9.- The latest Pix update added artistic style transfer based on a CVPR 2017 paper, showing fast deployment of new research into products.

10.- Microsoft develops products for many customer segments including consumers, developers, information workers and business users.

11.- Microsoft Cognitive Services provides a set of cloud-based APIs for vision, speech, language, knowledge and search to allow any developer to build intelligent apps.

12.- Image captioning in Cognitive Services started from Microsoft's first-place algorithm in the 2015 COCO captioning challenge, which passed the Turing test 32% of the time.

13.- Launching captionbot.ai allowed collecting user feedback to improve the captioning models, increasing user satisfaction and enabling deployment into Office products.

14.- Custom Vision Service allows developers to easily build their own robust image classifiers with a small number of training images.

15.- Custom Vision Service exposes all its APIs so developers can programmatically improve models, such as using 3rd party data labeling services.

16.- Microsoft's mission is to empower every person and organization to achieve more, including frontline workers.

17.- Microsoft HoloLens is an untethered holographic computer allowing interaction with digital content and the real world.

18.- HoloLens has been used by companies like Japan Airlines to innovate training and boost customer satisfaction.

19.- HoloLens incorporates many years of Microsoft computer vision research, from Kinect to KinectFusion to Holoportation.

20.- HoloLens uses on-board computer vision for robust head tracking, 3D environment mapping, and gesture recognition.

21.- The latest HoloLens research improves tracking and gestures using state-of-the-art deep neural networks running locally on the device.

22.- Microsoft developed custom silicon, the Holographic Processing Unit, to run deep neural networks with high speed and low power on HoloLens.

23.- The second version of the HPU incorporates an AI coprocessor, fully programmable by Microsoft, that implements deep neural networks natively and flexibly.

24.- A live demo showed the HPU 2.0 AI coprocessor performing real-time hand segmentation and tracking using deep learning models like ResNet-18.

25.- Opportunities to commercialize computer vision span intelligent cloud services and intelligent edge devices which are increasingly powerful.

26.- Lessons learned at Microsoft about accelerating commercialization include the importance of human elements and iteration between research and product teams.

27.- More companies are recognizing the importance of computer vision but need to invest further in research to build great products.

28.- An anecdote illustrated how investment in fundamental research, even if not directly applicable, makes the world more worth defending.

29.- Microsoft Research has made incredibly successful investments that have benefited Microsoft while advancing the field of computer vision.

30.- The speaker expressed pride in Microsoft's contributions and gratitude for the CVPR community and its founders in enabling this progress over decades.
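
Items 6 and 24 both mention residual networks (ResNets). As a rough illustration of what makes a network "residual", the sketch below shows a single block with a skip connection, where the output is the input plus a learned correction. It is a toy example in plain Python, not Microsoft's implementation: the function names, the two-layer form of the correction, and the weights are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Vector) -> Vector:
    """Element-wise rectified linear unit."""
    return [max(0.0, x) for x in v]


def linear(weights: Matrix, v: Vector) -> Vector:
    """Matrix-vector product (a linear layer without bias)."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]


def residual_block(x: Vector, w1: Matrix, w2: Matrix) -> Vector:
    """Toy residual block: relu(x + F(x)), with F(x) = W2 * relu(W1 * x).

    The `x + fx` term is the skip connection. If the weights are zero,
    F(x) is zero and the block simply passes relu(x) through.
    """
    fx = linear(w2, relu(linear(w1, x)))
    return relu([a + b for a, b in zip(x, fx)])


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(residual_block([1.0, -2.0], identity, identity))
```

Because the block only has to learn a correction on top of the identity mapping, stacking many such blocks is easier to train than stacking plain layers, which is commonly given as the reason very deep networks like the 152-layer model in item 6 became practical.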

Knowledge Vault built by David Vivancos 2024