Publications
Record Once, Post Everywhere: Automatic Shortening of Audio Stories for Social Media
Bryan Wang, Zeyu Jin, Gautham J. Mysore
ACM Symposium on User Interface Software and Technology (UIST), 2022
​
​
Emotion Embedding Spaces for Matching Music to Stories
Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra
International Society for Music Information Retrieval Conference (ISMIR), 2021 paper
Best Student Paper Award
​
Audio-Based Music Structure Analysis: Current Trends, Open Challenges, and Applications
Oriol Nieto, Gautham J. Mysore, Cheng-i Wang, Jordan B. L. Smith, Jan Schlüter, Thomas Grill, Brian McFee
Transactions of the International Society for Music Information Retrieval (TISMIR), 2020 paper
​
Controllable Neural Prosody Synthesis
Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore
Interspeech, 2020 paper
​
A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences
Pranay Manocha, Adam Finkelstein, Richard Zhang, Nicholas J. Bryan, Gautham J. Mysore, Zeyu Jin
Interspeech, 2020 paper | webpage
Nominated for a Best Student Paper Award
​
Recent Advances in Music Signal Processing
Meinard Müller, Bryan Pardo, Gautham J. Mysore, Vesa Välimäki
IEEE Signal Processing Magazine, 2019 paper
​
B-Script: Transcript-based B-roll Video Editing with Recommendations
Bernd Huber, Hijung Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore
ACM Conference on Human Factors in Computing Systems (CHI), 2019
​
VoiceAssist: Guiding Users to High-Quality Voice Recordings
Prem Seetharaman, Gautham J. Mysore, Bryan Pardo, Paris Smaragdis, Celso Gomes
ACM Conference on Human Factors in Computing Systems (CHI), 2019 paper
​
Blind Estimation of the Speech Transmission Index for Speech Quality Prediction
Prem Seetharaman, Gautham J. Mysore, Paris Smaragdis, Bryan Pardo
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018 paper
​
FFTNet: A Real-time Speaker-dependent Neural Vocoder
Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018 paper
​
Crowdsourced Pairwise-comparison for Source Separation Evaluation
Mark Cartwright, Bryan Pardo, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018 paper
​
LoopMaker: Automatic Creation of Music Loops from Pre-recorded Music
Zhengshan Shi, Gautham J. Mysore
ACM Conference on Human Factors in Computing Systems (CHI), 2018 paper
​
MedleyAssistant – A System for Personalized Music Medley Creation
Zhengshan Shi, Gautham J. Mysore
ACM IUI Workshop on Intelligent Music Interfaces for Listening and Creation (MILC), 2018 paper
​
Re-visiting the Music Segmentation Problem with Crowdsourcing
Cheng-i Wang, Gautham J. Mysore, Shlomo Dubnov
International Society for Music Information Retrieval Conference (ISMIR), 2017
​
Dynamic Non-negative Models for Audio Source Separation
Paris Smaragdis, Gautham J. Mysore, Nasser Mohammadiha
Book Chapter in Audio Source Separation, Springer, 2018
​
Temporal extensions of Nonnegative Matrix Factorization
Cédric Févotte, Paris Smaragdis, Nasser Mohammadiha, Gautham J. Mysore
Book Chapter in Audio Source Separation and Speech Enhancement, Wiley, 2018
​
AutoDub: Automatic Redubbing for Voiceover Editing
Shrikant Venkataramani, Paris Smaragdis, Gautham J. Mysore
ACM Symposium on User Interface Software and Technology (UIST), 2017 paper
​
VoCo: Text-based Insertion and Replacement in Audio Narration
Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
Proceedings of SIGGRAPH, 2017 paper | webpage | demo video | Adobe MAX video | Extensive Press Coverage
​
Eulerian Video Magnification and Analysis
Neal Wadhwa, Hao-Yu Wu, Abe Davis, Michael Rubinstein, Eugene Shih, Gautham J. Mysore, Justin G. Chen, Oral Buyukozturk, John V. Guttag, William T. Freeman, and Frédo Durand
Communications of the ACM, January 2017 paper
​
Analysis of Prosody Increment Induced by Pitch Accents for Automatic Emphasis Correction
Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz, Mark Hasegawa-Johnson
Speech Prosody, 2016 paper
​
Equalization Matching of Speech Recordings in Real-World Environments
François G. Germain, Gautham J. Mysore, Takako Fujioka
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
​
Structural Segmentation with the Variable Markov Oracle and Boundary Adjustment
Cheng-i Wang, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
​
Fast and Easy Crowdsourced Perceptual Audio Evaluation
Mark Cartwright, Bryan Pardo, Gautham J. Mysore, Matt Hoffman
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
​
CUTE: A Concatenative Method for Voice Conversion using Exemplar based Unit Selection
Zeyu Jin, Adam Finkelstein, Stephen DiVerdi, Jingwan Lu, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
​
Capture-Time Feedback for Recording Scripted Narration
Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Maneesh Agrawala
ACM Symposium on User Interface Software and Technology (UIST), 2015 paper | webpage | video | audio results
​
Speaker and Noise Independent Online Single Channel Speech Enhancement
François G. Germain, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 paper
​
Speech Dereverberation using a Learned Speech Model
Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 paper
​
Efficient Manifold Preserving Audio Source Separation using Locality Sensitive Hashing
Minje Kim, Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 paper
​
Lamello: Passive Acoustic Sensing for Tangible Input Components
Valkyrie Savage, Andrew Head, Björn Hartmann, Dan Goldman, Gautham J. Mysore, Wilmot Li
ACM Conference on Human Factors in Computing Systems (CHI), 2015
​
Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges
Gautham J. Mysore
IEEE Signal Processing Letters, August 2015 paper | webpage | dataset
​
The Visual Microphone: Passive Recovery of Sound from Video
Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham J. Mysore, Frédo Durand, William T. Freeman
SIGGRAPH, 2014 paper | webpage | video
​
Stopping Criteria for Non-negative Matrix Factorization Based Supervised and Semi-Supervised Source Separation
François G. Germain, Gautham J. Mysore
IEEE Signal Processing Letters, October 2014 paper
​
Exploiting Long-Term Temporal Dependencies in NMF using Recurrent Neural Networks with Application to Source Separation
Nicolas Boulanger-Lewandowski, Gautham J. Mysore, Matthew Hoffman
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014 paper
​
Speech Decoloration based on the Product-of-Filters Model
Dawen Liang, Daniel P. W. Ellis, Matthew Hoffman, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014 paper
​
Static and Dynamic Source Separation Using Nonnegative Factorizations: A unified view
Paris Smaragdis, Cédric Févotte, Gautham J. Mysore, Nasser Mohammadiha, Matthew Hoffman
IEEE Signal Processing Magazine Special Issue on Source Separation and Applications, May 2014 paper
​
A Generative Product-of-Filters Model of Audio
Dawen Liang, Matthew Hoffman, Gautham J. Mysore
International Conference on Learning Representations (ICLR), 2014 paper | code
​
ISSE: An Interactive Source Separation Editor
Nicholas J. Bryan, Gautham J. Mysore, Ge Wang
ACM Conference on Human Factors in Computing Systems (CHI), 2014
paper | webpage | demo video | demos | code
​
Source Separation of Polyphonic Music with Interactive User-Feedback on a Piano Roll Display
Nicholas J. Bryan, Gautham J. Mysore, Ge Wang
International Society for Music Information Retrieval Conference (ISMIR), 2013 paper
​
Combining Modeling of Singing Voice and Background Music for Automatic Separation of Musical Mixtures
Zafar Rafii, François G. Germain, Dennis L. Sun, Gautham J. Mysore
International Society for Music Information Retrieval Conference (ISMIR), 2013 paper
​
Content-Based Tools for Editing Audio Stories
Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Wilmot Li, Maneesh Agrawala
ACM Symposium on User Interface Software and Technology (UIST), 2013
paper | webpage | video | code | web app | audio results
​
Speaker and Noise Independent Voice Activity Detection
François G. Germain, Dennis L. Sun, Gautham J. Mysore
Interspeech, 2013 paper
Best Student Paper Award
​
An Efficient Posterior Regularized Latent Variable Model for Interactive Source Separation
Nicholas J. Bryan, Gautham J. Mysore
International Conference on Machine Learning (ICML), 2013
AES Student Design Competition Gold Award
paper | webpage | demo video | demos | code
​
Universal Speech Models for Speaker Independent Single Channel Source Separation
Dennis L. Sun, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 paper
​
Interactive Refinement of Supervised and Semi-supervised Sound Source Separation Estimates
Nicholas J. Bryan, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 paper | webpage | sound examples | code
​
UnderScore: Musical Underlays for Audio Stories
Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Wilmot Li, Maneesh Agrawala
ACM Symposium on User Interface Software and Technology (UIST), 2012
paper | webpage | video | code
​
Language Informed Bandwidth Expansion
Jinyu Han, Gautham J. Mysore, Bryan Pardo
IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2012 paper
​
Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments
Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
Interspeech, 2012 paper
​
Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation
Gautham J. Mysore, Maneesh Sahani
International Conference on Machine Learning (ICML), 2012 paper
​
Following Musical Sources by Example
Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012 paper
​
Clustering and Synchronizing Multi-camera Video via Landmark Cross-correlation
Nicholas J. Bryan, Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012 paper
​
Noise-Robust Dynamic Time Warping Using PLCA Features
Brian King, Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012 paper
​
A Non-negative Approach to Language Informed Speech Separation
Gautham J. Mysore, Paris Smaragdis
International Conference on Latent Variable Analysis and Signal Separation
(LVA / ICA), 2012 paper | webpage | sound examples
​
Audio Imputation Using the Non-negative Hidden Markov Model
Jinyu Han, Gautham J. Mysore, Bryan Pardo
International Conference on Latent Variable Analysis and Signal Separation
(LVA / ICA), 2012 paper | webpage
​
Sound Recognition in Mixtures
Juhan Nam, Gautham J. Mysore, Paris Smaragdis
International Conference on Latent Variable Analysis and Signal Separation
(LVA / ICA), 2012 paper
​
Online PLCA for Real-Time Semi-supervised Source Separation
Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
International Conference on Latent Variable Analysis and Signal Separation
(LVA / ICA), 2012 paper
​
A Convolutive Spectral Decomposition Approach to the Separation of Feedback from Target Speech
Gautham J. Mysore, Paris Smaragdis
IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2011 paper
​
A Non-negative Approach to Semi-supervised Separation of Speech from Noise with the use of Temporal Dynamics
Gautham J. Mysore, Paris Smaragdis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011 paper
​
Non-negative Hidden Markov Modeling of Audio with Application to Source Separation
Gautham J. Mysore, Paris Smaragdis, Bhiksha Raj
International Conference on Latent Variable Analysis and Signal Separation
(LVA / ICA), 2010 paper | webpage | sound examples
Best Student Paper Award
​
A Super-Resolution Spectrogram Using Coupled PLCA
Juhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel
Interspeech, 2010 paper | webpage
​
Evaluation of a Score-Informed Source Separation System
Joachim Ganseman, Paul Scheunders, Gautham J. Mysore, Jonathan S. Abel
International Society for Music Information Retrieval Conference (ISMIR), 2010
​
Source Separation by Score Synthesis
Joachim Ganseman, Paul Scheunders, Gautham J. Mysore, Jonathan S. Abel
International Computer Music Conference (ICMC), 2010 paper | webpage
​
“Separation by Humming”: User Guided Sound Extraction from Monophonic Mixtures
Paris Smaragdis, Gautham J. Mysore
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2009 paper
​
Relative Pitch Estimation of Multiple Instruments
Gautham J. Mysore, Paris Smaragdis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 paper | webpage
​
Probabilistic Factorization of Non-Negative Data with Entropic Co-occurrence Constraints
Paris Smaragdis, Madhusudana Shashanka, Bhiksha Raj, Gautham J. Mysore
International Conference on Independent Component Analysis and Signal Separation (ICA), 2009 paper
​
Singer-Dependent Falsetto Detection for Live Vocal Processing Based on Support Vector Classification
Gautham J. Mysore, Ryan J. Cassidy, Julius O. Smith III
IEEE Asilomar Conference on Signals, Systems, and Computers, 2006 paper
​
SCUBA: The Self-Contained Unified Bass Augmenter
Juan-Pablo Cáceres, Gautham J. Mysore, Jeffrey Treviño
International Conference on New Interfaces for Musical Expression (NIME), 2005 paper
​
​
Ph.D. Thesis
​
A Non-negative Framework for Joint Modeling of Spectral Structure and Temporal Dynamics in Sound Mixtures
Advisor: Julius O. Smith III
Reading Committee: Paris Smaragdis, Malcolm Slaney, Robert Tibshirani
Stanford University, June 2010
​
​
Patents
​
Sound Quality Prediction and Interface to Facilitate High Quality Voice Recordings
Prem Seetharaman, Gautham J. Mysore, Bryan Pardo
U.S. Patent #11138989 issued in October 2021
​
Transcript-based Insertion of Secondary Video Content into Primary Video Content
Bernd Huber, Hijung Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore
U.S. Patent #11049525 issued in June 2021
​
Real-time Speaker Dependent Neural Vocoder
Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu
U.S. Patent #10770063 issued in September 2020
​
Time Interval Sound Alignment
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #10638221 issued in April 2020
​
Generating Audio Loops from an Audio Track
Zhengshan Shi, Gautham J. Mysore
U.S. Patent #10460763 issued in October 2019
​
Automatic Voiceover Correction System
Shrikant Venkataramani, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #10453475 issued in October 2019
​
Text-based Insertion and Replacement in Audio Narration
Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
U.S. Patent #10347238 issued in July 2019
​
Variable Sound Decomposition Masks
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #10262680 issued in April 2019
​
Sound Rate Modification
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #10249321 issued in April 2019
​
Sound Processing using a Product-of-Filters Model
Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore
U.S. Patent #10176818 issued in January 2019
​
Reverberation Matching of Speech
Ramin Anushiravani, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #10069028 issued in September 2018
​
Intuitive Music Visualization using Efficient Structural Segmentation
Cheng-i Wang, Gautham J. Mysore
U.S. Patent #10074350 issued in September 2018
​
Irregular Pattern Identification using Landmark based Convolution
Minje Kim, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #10002622 issued in June 2018
​
Performance Metric Based Stopping Criteria for Iterative Algorithms
François G. Germain, Gautham J. Mysore
U.S. Patent #9866954 issued in January 2018
​
Automatic Emphasis of Spoken Words
Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
U.S. Patent #9852743 issued in December 2017
​
Irregularity Detection in Music
Minje Kim, Gautham J. Mysore, Paris Smaragdis, Peter Merrill
U.S. Patent #9734844 issued in August 2017
​
Non-negative Matrix Factorization Regularized by Recurrent Neural Networks for Audio Processing
Nicolas Boulanger-Lewandowski, Gautham J. Mysore, Matthew Hoffman
U.S. Patent #9721202 issued in August 2017
​
Dereverberation Using a Learned Speech Model
Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore
U.S. Patent #9607627 issued in March 2017
​
Acoustic Matching and Splicing of Sound Tracks
François G. Germain, Gautham J. Mysore
U.S. Patent #9601124 issued in March 2017
​
Automatic Detection of Dense Ornamentation in Music
Minje Kim, Gautham J. Mysore, Paris Smaragdis, Peter Merrill
U.S. Patent #9514722 issued in December 2016
​
Sound Feature Priority Alignment
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9451304 issued in September 2016
​
Pattern Matching of Sound Data using Hashing
Minje Kim, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #9449085 issued in September 2016
​
General Sound Decomposition Models
Dennis L. Sun, Gautham J. Mysore
U.S. Patent #9437208 issued in September 2016
​
Sound Alignment Using Timing Information
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9355649 issued in May 2016
​
Multichannel Sound Source Identification and Localization
Minje Kim, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9351093 issued in May 2016
​
Joint Sound Model Generation Techniques
Dennis L. Sun, Gautham J. Mysore
U.S. Patent #9318106 issued in April 2016
​
Sound Alignment User Interface
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9201580 issued in December 2015
​
Sound Mixture Recognition
Gautham J. Mysore, Paris Smaragdis, Juhan Nam
U.S. Patent #9165565 issued in October 2015
​
Feature Estimation in Sound Sources
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8965832 issued in February 2015
​
User-Guided Audio Selection from Complex Sound Mixtures
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8954175 issued in February 2015
​
Clustering and Synchronizing Content
Nicholas J. Bryan, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8924345 issued in December 2014
​
Language Informed Source Separation
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #8843364 issued in September 2014
​
Semi-supervised Source Separation using Non-negative Techniques
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #8812322 issued in August 2014
​
Noise Robust Template Matching
Gautham J. Mysore, Paris Smaragdis, Brian King
U.S. Patent #8775167 issued in July 2014
​
System and Method for Acoustic Echo Cancellation using Spectral Decomposition
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8724798 issued in May 2014
​
Non-negative Hidden Markov Modeling of Signals
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #8554553 issued in October 2013
​
Method and Apparatus for Relative Pitch Tracking of Multiple Arbitrary Sounds
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8380331 issued in February 2013
​
​
​