Adithya Narayan

MSc in Computer Vision @ Carnegie Mellon University · Human Sensing Lab · Pittsburgh, PA


I’m Adithya Narayan, a Graduate Research Assistant at the Human Sensing Lab, advised by Prof. Fernando De la Torre, and an MSCV student at the Robotics Institute at Carnegie Mellon University.

My research interests center around 3D vision and geometry-aware learning—especially multi-view reasoning, robust reconstruction, and understanding how vision(-language) models build 3D scene representations.

Currently, I’m exploring:

  • Multi-view reasoning / view selection for 2D VLMs, and how 3D understanding emerges (Gaussian Splatting + depth representations; mechanistic interpretability).
  • Adversarial scene exploration on SE(3) with ordinal objectives to expose geometry/depth failure modes (CVPR 2026, under review).

Previously, I’ve worked across research engineering and applied ML:

  • HeyGen (Research Engineering Intern): camera-motion conditioned video diffusion (ControlNet), large-scale SfM + pose extraction, and data filtering with flow-based signals.
  • Arintra (ML Engineer): RAG for medical coding, semantic retrieval (SapBERT + Qdrant), and ML deployment/versioning (MLflow + FastAPI + GCP).
  • Klothed (ML Engineer) (Advisor: Prof. James O’Brien): 3D human reconstruction and fast FEM-based warping pipelines for AR.
  • Origin Health (Research Engineer) (Advisor: Dr. Sripad Devalla): fetal imaging systems (segmentation / measurement), including a co-authored ISBI 2022 paper.

I completed my undergrad at Manipal Institute of Technology (B.Tech ECE) in 2021.


selected publications

  1. Breaking Depth Estimation Models with Semantic Adversarial Attacks
    Adithya Narayan, Ujjwal Ojha, Jonas Theiss, and 2 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026, 2026
    Under Review
  2. Towards a device-independent deep learning approach for the automated segmentation of sonographic fetal brain structures: a multi-center and multi-device validation
    Abhi Lad, Adithya Narayan, Hari Shankar, and 8 more authors
    In Medical Imaging 2022: Computer-Aided Diagnosis, 2022
  3. Leveraging clinically relevant biometric constraints to supervise a deep learning model for the accurate caliper placement to obtain sonographic measurements of the fetal brain
    Hari Shankar, Adithya Narayan, Shefali Jain, and 8 more authors
    In 2022 IEEE 19th International Symposium on Biomedical Imaging (ISBI), 2022
  4. OC11.02: A multicentre, multi-device validation of a deep learning system for the automated segmentation of fetal brain structures from two-dimensional ultrasound images.
    A. Narayan, S. Kaushik, H. Shankar, and 8 more authors
    Ultrasound in Obstetrics & Gynecology, 2021

selected projects

AvatarsFTW: 3D Human Avatars From The Wild

Authors: Adithya Narayan, Kaustav Mukherjee, Shaurye Aggarwal

We propose a two-part pipeline, inpainting and body fitting, that alleviates 3D human reconstruction issues caused by human-object interactions, occlusions, and dynamic poses. The inpainting stage combines keypoint detection with a novel keypoint estimation technique, and uses LaMa to remove occluding objects, Stable Diffusion with ControlNets to generate missing areas, and a GAN inversion step to produce a seamless, plausible human reconstruction. The body fitting stage uses an improved regressor and adds more losses to the iterative fitting stage to achieve a better human mesh fit in dynamic poses. The pipeline can inpaint human images, generate improved meshes for incomplete images, and fit better human meshes to a variety of highly dynamic poses.

Robust Point Tracking with Epipolar Constraints

Authors: Adithya Narayan, Tanisha Gupta, Lamia Alsalloom

Geometry-driven point tracking that reduces long-horizon drift by enforcing epipolar consistency via post-processing refinement and weakly-supervised finetuning.
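
As a rough illustration of what epipolar consistency means for a tracked point, the sketch below measures how far a point in one frame lies from the epipolar line induced by its match in another frame. The function name, the example fundamental matrix, and the choice of point-to-line distance as the residual are assumptions made for illustration; the project may use a different formulation.

```python
import math

def epipolar_distance(F, p1, p2):
    """Pixel distance from p2 to the epipolar line F @ p1 (hypothetical helper).

    F is a 3x3 fundamental matrix given as nested lists; p1 and p2 are
    (x, y) pixel coordinates in the first and second image.
    """
    x1 = (p1[0], p1[1], 1.0)  # homogeneous coordinates of p1
    # Epipolar line in the second image: a*x + b*y + c = 0.
    a, b, c = (sum(F[r][k] * x1[k] for k in range(3)) for r in range(3))
    return abs(a * p2[0] + b * p2[1] + c) / math.hypot(a, b)

# Assumed example: pure horizontal camera translation with identity intrinsics,
# so every epipolar line is the image row of the original point.
F_horizontal = [[0.0, 0.0, 0.0],
                [0.0, 0.0, -1.0],
                [0.0, 1.0, 0.0]]

on_line = epipolar_distance(F_horizontal, (10.0, 20.0), (35.0, 20.0))
drifted = epipolar_distance(F_horizontal, (10.0, 20.0), (35.0, 23.0))
```

A track that stays consistent with the scene geometry keeps this distance near zero, so a growing value is one possible signal of drift.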

Warm Start and Knot Point Interpolation for Whole Body MPPI

Authors: Anoushka Alavilli, Adithya Narayan, Alyn Kirsch Tornell, Colleen Que, Lamia Alsalloom

We build upon the results from “Real-Time Whole-Body Control of Legged Robots with Model-Predictive Path Integral Control” by Alvarez-Padilla et al., which introduces whole-body sampling-based control for locomotion and manipulation on quadrupedal robots. Our contributions include an implementation of warm-starting to speed up control trajectory optimization, an ablation of spline interpolation order and knot point density, and tests of these improvements on newly designed terrains. We see that warm-starting and changes to spline order can help in some cases, such as stair-climbing and uneven terrain traversal, whereas the nominal implementation of MPPI is best for more basic tasks such as trotting in place.
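
The two ideas named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: `warm_start` uses the common shift-by-one scheme for receding-horizon control, `interpolate_knots` shows only the first-order (linear) case for scalar controls, and both function names are hypothetical.

```python
def warm_start(prev_controls):
    """Seed the next solve with the previous plan shifted one step forward.

    The final control is repeated so the horizon length stays the same.
    """
    if not prev_controls:
        return []
    return prev_controls[1:] + [prev_controls[-1]]

def interpolate_knots(knots, steps_per_segment):
    """Expand sparse knot points into a dense control sequence (linear case)."""
    controls = []
    for a, b in zip(knots, knots[1:]):
        for i in range(steps_per_segment):
            t = i / steps_per_segment
            controls.append((1 - t) * a + t * b)
    controls.append(knots[-1])  # include the final knot itself
    return controls

shifted = warm_start([1.0, 2.0, 3.0])
dense = interpolate_knots([0.0, 1.0, 0.0], steps_per_segment=2)
```

Fewer knot points mean fewer sampled parameters per rollout, while a higher interpolation order gives smoother controls between them; the ablation described above explores that trade-off.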

Simple Face Tracker with Smoothing

Authors: Adithya Narayan

Developed a face tracking system that uses Multi-task Cascaded Convolutional Networks (MTCNN) for face detection and applies trajectory smoothing to keep the tracking stable across video frames. This Python-based tool processes videos to identify and track faces, outputting annotated videos that demonstrate its effectiveness. It's user-friendly, with simple setup instructions and options to customize with your own videos and reference face images, making it versatile for various applications.
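
To show what trajectory smoothing can look like, here is a minimal sketch that applies an exponential moving average to per-frame bounding boxes. The smoothing method, the function name `smooth_boxes`, and the `alpha` parameter are assumptions for illustration; the tool may use a different filter.

```python
def smooth_boxes(boxes, alpha=0.5):
    """Exponentially smooth a sequence of (x1, y1, x2, y2) boxes.

    alpha close to 1 follows the detector closely; alpha close to 0
    favors the previous smoothed box and suppresses jitter.
    """
    smoothed = []
    prev = None
    for box in boxes:
        if prev is None:
            # The first frame has no history, so it passes through unchanged.
            prev = tuple(float(v) for v in box)
        else:
            prev = tuple(alpha * v + (1 - alpha) * p for v, p in zip(box, prev))
        smoothed.append(prev)
    return smoothed

# Assumed detections for two consecutive frames.
track = smooth_boxes([(0, 0, 10, 10), (4, 0, 14, 10)], alpha=0.5)
```

In this example the second box moves only halfway toward the new detection, which damps frame-to-frame jitter at the cost of a small lag.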

Optimizing Fabrics in 2D

Authors: Adithya Narayan

Leveraging vectorized operations in NumPy to find a quick solution to optical flow systems in 2D.

Simple Face Tracker with Smoothing

Authors: Adithya Narayan

Developed a point-cloud-based memory that carries information between chunks of video generated by a video diffusion model. Using a fast rendering approach, we update the latent chunks to allow long-range video generation without error accumulation.

news

Dec 29, 2024 Will be serving as a reviewer for IJCV 2025!
Sep 15, 2024 Joined the Human Sensing Lab under Prof. Fernando De la Torre.
Aug 24, 2024 Started the Master of Science in Computer Vision (MSCV) program at Carnegie Mellon University!
Oct 01, 2022 Presenting two papers at SPIE and ISBI!
Oct 01, 2021 Presenting an oral and poster paper at ISUOG 2021!
