Sanjoy Chowdhury

I am a first-year CS PhD student at the University of Maryland, College Park, advised by Prof. Dinesh Manocha. Prior to this, I was a Machine Learning Scientist with the Camera and Video AI team at ShareChat, India. I was also a visiting researcher at the Computer Vision and Pattern Recognition Unit at the Indian Statistical Institute, Kolkata, under Prof. Ujjwal Bhattacharya.

Previously, I was a Senior Research Engineer with the Vision Intelligence Group at Samsung R&D Institute Bangalore, where I primarily worked on developing novel AI-powered solutions for Samsung's smart devices.

I received my MTech in Computer Science & Engineering from IIIT Hyderabad, where I was fortunate to be advised by Prof. C V Jawahar. During my undergrad, I worked as a research intern under Prof. Pabitra Mitra at IIT Kharagpur and at the CVPR Unit at ISI Kolkata under Prof. Ujjwal Bhattacharya.

Email  /  GitHub  /  Google Scholar  /  LinkedIn  /  Twitter


Iribe #5116, 8125 Paint Branch Dr
College Park, MD 20742


[May 2023] Joined Adobe Research as a research intern.
[April 2023] Serving as a reviewer for ICCV 2023.
[Dec 2022] Serving as a reviewer for CVPR 2023.
[Aug 2022] Joined the University of Maryland, College Park as a CS PhD student.
[July 2022] Serving as a reviewer for WACV 2023.
[Oct 2021] Paper on audio-visual summarization accepted at BMVC 2021.
[Sep 2021] Blog on Video Quality Enhancement released at Tech @ ShareChat.
[July 2021] Paper on reflection removal accepted at ICCV 2021.
[June 2021] Joined ShareChat Data Science team.
[May 2021] Paper on audio-visual joint segmentation accepted at ICIP 2021.
[Dec 2018] Accepted an offer from Samsung Research. Will be joining in June 2019.
[Sep 2018] Received the Dean's Merit List award for academic excellence at IIIT Hyderabad.
[Oct 2017] Our work on a multi-scale, low-latency face detection framework received the Best Paper Award at NGCT-2017.


My research is at the intersection of Computer Vision and Deep Learning, with a focus on multi-modal learning, video understanding, and their applications. I'm broadly interested in problems involving holistic scene understanding with minimal supervision and generative modeling.


AudViSum: Self-Supervised Deep Reinforcement Learning for Diverse Audio-Visual Summary Generation

Sanjoy Chowdhury*, Aditya P. Patra*, Subhrajyoti Dasgupta, Ujjwal Bhattacharya
British Machine Vision Conference (BMVC), 2021
Paper / Code / Presentation

Introduced a novel self-supervised audio-visual summarization model, based on deep reinforcement learning, that leverages both audio and visual information to generate diverse yet semantically meaningful summaries.


V-DESIRR: Very Fast Deep Embedded Single Image Reflection Removal

B H Pawan Prasad, Green Rosh K S, Lokesh R B, Kaushik Mitra, Sanjoy Chowdhury
International Conference on Computer Vision (ICCV), 2021
Paper / Code

We proposed a multi-scale, end-to-end architecture for detecting and removing weak, medium, and strong reflections from naturally occurring images.


Listen to the Pixels

Sanjoy Chowdhury, Subhrajyoti Dasgupta, Sudip Das, Ujjwal Bhattacharya
International Conference on Image Processing (ICIP), 2021
Paper / Code / Presentation

In this study, we exploited the concurrency between the audio and visual modalities to address the joint audio-visual segmentation problem in a self-supervised manner.


A Survey on Fuzzy Set Theoretic Approaches for Image Segmentation

Ajoy Mondal*, Sanjoy Chowdhury*
Soft Computing, 2022 (Under review)

This survey presents an in-depth comparison and analysis of image segmentation techniques based on fuzzy set theory.


Not Too Deep CNN for Face Detection in Real Life Scenario

Sanjoy Chowdhury, Parthasarathi Mukherjee, Ujjwal Bhattacharya
International Conference on Next Generation Computing Technologies, Springer, 2017 (Best paper award, Oral)
Paper / Code

Proposed a multi-scale face detection framework capable of detecting faces of multiple sizes and orientations in low-resolution images, while achieving sufficiently low latency and modest detection rates in the wild.


Classification of Citation in Scientific Articles

Sanjoy Chowdhury, Harsh Vardhan, Pabitra Mitra, Dinabandhu Bhandari
National Conference on Recent Advances in Science and Technology, 2016 (Oral)
Abstract / Code

Designed a multi-class classification system to determine the type of a citation, i.e., which facet it belongs to. We did this by extracting and analysing citation information from the text.


I have also tried my hand at writing technical blogs.


The devil is in the details: Video Quality Enhancement Approaches


The blog puts the problem of video enhancement in its present-day context and discusses a couple of interesting approaches to this challenging task.


IIT Kharagpur
Apr - Sep 2016

ISI Kolkata
Feb - July 2017

IIIT Hyderabad
Aug 2017 - May 2019

Mentor Graphics Hyderabad
May - July 2018

Samsung Research Bangalore
June 2019 - June 2021

ShareChat Bangalore
June 2021 - May 2022

UMD College Park
Aug 2022 - Present

Adobe Research
May 2023 - Present

Template credits: Jon Barron