#206 – Ishan Misra: Self-Supervised Deep Learning in Computer Vision

By Lex Fridman

Lex Fridman Podcast

31/07/21

Ishan Misra is a research scientist at FAIR working on self-supervised visual learning.

Please support this podcast by checking out our sponsors:
- Onnit: https://lexfridman.com/onnit to get up to 10% off
- The Information: https://theinformation.com/lex to get 75% off first month
- Grammarly: https://grammarly.com/lex to get 20% off premium
- Athletic Greens: https://athleticgreens.com/lex and use code LEX to get 1 month of fish oil

EPISODE LINKS:
Ishan's twitter: https://twitter.com/imisra_
Ishan's website: https://imisra.github.io
Ishan's FAIR page: https://ai.facebook.com/people/ishan-misra/

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here are the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(07:49) - Self-supervised learning
(16:24) - Self-supervised learning is the dark matter of intelligence
(20:17) - Categorization
(28:50) - Is computer vision still really hard?
(32:35) - Understanding Language
(42:14) - Harder to solve: vision or language
(48:59) - Contrastive learning & energy-based models
(52:59) - Data augmentation
(57:19) - Fixed audio spike by lowering sound with pen tool
(1:05:33) - Real data vs. augmented data
(1:09:16) - Non-contrastive learning energy-based self-supervised learning methods
(1:12:54) - Unsupervised learning (SwAV)
(1:15:37) - Self-supervised Pretraining (SEER)
(1:20:44) - Self-supervised learning (SSL) architectures
(1:26:43) - VISSL pytorch-based SSL library
(1:29:38) - Multi-modal
(1:37:06) - Active learning
(1:42:45) - Autonomous driving
(1:54:12) - Limits of deep learning
(1:58:19) - Difference between learning and reasoning
(2:03:26) - Building super-human AI
(2:11:14) - Most beautiful idea in self-supervised learning
(2:15:02) - Simulation for training AI
(2:18:27) - Video games replacing reality
(2:19:40) - How to write a good research paper
(2:24:08) - Best programming language for beginners
(2:25:01) - PyTorch vs TensorFlow
(2:28:26) - Advice for getting into machine learning
(2:30:31) - Advice for young people
(2:32:58) - Meaning of life