I build multimodal AI systems and lead model development.
At Ai2, I develop foundation models for vision, robotics, and multimodal reasoning (Molmo, Molmo2, and MolmoBot), with a focus on large-scale modeling, training, and data curation. Previously, at Amazon Research, I led projects spanning image and video generation, video understanding, and visual perception, taking models from research to production.
I hold a Master’s in Computer Science from Cornell Tech and a B.Tech. in Computer Science from IIT Kanpur.
I enjoy mentoring researchers and engineers and collaborating on ambitious AI projects. I am also passionate about exploring the intersection of AI and plant-based food.