Unpacking Scattering Vision Transformer: A New Dawn in Computer Vision

Transformers are reshaping computer vision. But their strengths come with inherent challenges, particularly the complexity of attention and the difficulty of capturing fine-grained image detail. Dr. Agneeswaran introduces the Scattering Vision Transformer (SVT), an approach designed to address these challenges head-on.

In this session, attendees will:

  • Dive deep into the architecture and principles behind SVT, understanding how it differs from and improves on traditional transformers.
  • Explore SVT's spectrally scattering network and its novel spectral mixing technique.
  • Discover real-world applications, from setting records on the ImageNet dataset to its potential in medical imaging and collaborations with industry partners.
  • Understand the practical implications of SVT for developers, architects, and businesses.

By the end of the talk, attendees will not only grasp the research behind SVT but will also appreciate its tangible impact on computer vision, gaining insights they can apply in their own projects.

About the speaker

Vijay Srinivas Agneeswaran

ML/AI Research Leader, Microsoft

Dr. Vijay Srinivas Agneeswaran has a Bachelor’s degree in Computer Science & Engineering from SVCE, Madras University (1998), an MS (By Research) from IIT Madras (2001), and a PhD from IIT Madras (2008), and held a post-doctoral research fellowship in the LSIR Labs, Swiss Federal Institute of Technology, Lausanne (EPFL). He has spent more than twenty years creating intellectual property and building data-based products in industry and academia. He currently heads the Cloud + AI Research team at Microsoft Inc., Bangalore, where he is building a world-class R&D team to create IP for Microsoft in AI/ML. He also works closely with Microsoft businesses and cloud customers to help unearth the full value of AI for Azure and other key products. In addition, he is the Responsible AI champion for the Cloud + AI team. In his previous role, he headed the ML platform and data sciences foundations teams at Walmart. He has five granted US patents, as well as numerous other disclosures filed with the Indian and US patent offices.

Badri Narayana Patro

Senior Research Scientist, Microsoft

Badri Narayana Patro holds a Ph.D. in electrical engineering from the Indian Institute of Technology, Kanpur, and an M.Tech. in electrical engineering from the Indian Institute of Technology, Bombay. He currently serves as a Senior Research Scientist at Microsoft. Previously, he was a Postdoctoral Research Fellow at KU Leuven, Belgium, and before that, a Postdoctoral Researcher at Google Research, India. His industry experience includes roles as a Lead Engineer at Samsung R&D Institute, India, an Associate Software Engineer at Harman International, India, and an Assistant Software Engineer at Larsen & Toubro Limited. Dr. Patro's research focuses on computer vision and natural language processing, leveraging deep learning techniques such as Transformers and LLMs. He is a member of CVF and ACL. His research has appeared at conferences including CVPR, ICCV, NeurIPS, AAAI, EMNLP, BMVC, WACV, MM, and ICASSP, and in journals including TIP, PR, Neurocomputing, and Image and Vision Computing. He has served as a reviewer for CVPR, ICCV, ECCV, AAAI, NeurIPS, ICLR, ACL, EMNLP, NAACL, BMVC, WACV, ICVGIP, T-PAMI, TIP, TMM, and PR.