Publications
Record Once, Post Everywhere: Automatic Shortening of Audio Stories for Social Media
Bryan Wang, Zeyu Jin, Gautham J. Mysore
ACM Symposium on User Interface Software and Technology (UIST), 2022
Emotion Embedding Spaces for Matching Music to Stories
Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra
International Society for Music Information Retrieval Conference (ISMIR), 2021 paper
Best Student Paper Award
Audio-Based Music Structure Analysis: Current Trends, Open Challenges, and Applications
Oriol Nieto, Gautham J. Mysore, Cheng-i Wang, Jordan B. L. Smith, Jan Schlüter, Thomas Grill, Brian McFee
Transactions of the International Society for Music Information Retrieval (TISMIR), 2020 paper
Controllable Neural Prosody Synthesis
Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore
Interspeech, 2020 paper
A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences
Pranay Manocha, Adam Finkelstein, Richard Zhang, Nicholas J. Bryan, Gautham J. Mysore, Zeyu Jin
Interspeech, 2020 paper | webpage
Nominated for a Best Student Paper Award
Recent Advances in Music Signal Processing
Meinard Müller, Bryan Pardo, Gautham J. Mysore, Vesa Välimäki
IEEE Signal Processing Magazine, 2019 paper
B-Script: Transcript-based B-roll Video Editing with Recommendations
Bernd Huber, Hijung Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore
ACM Conference on Human Factors in Computing Systems (CHI), 2019
VoiceAssist: Guiding Users to High-Quality Voice Recordings
Prem Seetharaman, Gautham J. Mysore, Bryan Pardo, Paris Smaragdis, Celso Gomes
ACM Conference on Human Factors in Computing Systems (CHI), 2019 paper
Blind Estimation of the Speech Transmission Index for Speech Quality Prediction
Prem Seetharaman, Gautham J. Mysore, Paris Smaragdis, Bryan Pardo
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018 paper
FFTNet: A Real-time Speaker-dependent Neural Vocoder
Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018 paper
Crowdsourced Pairwise-comparison for Source Separation Evaluation
Mark Cartwright, Bryan Pardo, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018 paper
LoopMaker: Automatic Creation of Music Loops from Pre-recorded Music
Zhengshan Shi, Gautham J. Mysore
ACM Conference on Human Factors in Computing Systems (CHI), 2018 paper
MedleyAssistant – A System for Personalized Music Medley Creation
Zhengshan Shi, Gautham J. Mysore
ACM IUI Workshop on Intelligent Music Interfaces for Listening and Creation (MILC), 2018 paper
Re-visiting the Music Segmentation Problem with Crowdsourcing
Cheng-i Wang, Gautham J. Mysore, Shlomo Dubnov
International Society for Music Information Retrieval Conference (ISMIR), 2017
Dynamic Non-negative Models for Audio Source Separation
Paris Smaragdis, Gautham J. Mysore, Nasser Mohammadiha
Book Chapter in Audio Source Separation, Springer, 2018
Temporal extensions of Nonnegative Matrix Factorization
Cédric Févotte, Paris Smaragdis, Nasser Mohammadiha, Gautham J. Mysore
Book Chapter in Audio Source Separation and Speech Enhancement, Wiley, 2018
AutoDub: Automatic Redubbing for Voiceover Editing
Shrikant Venkataramani, Paris Smaragdis, Gautham J. Mysore
ACM Symposium on User Interface Software and Technology (UIST), 2017 paper
VoCo: Text-based Insertion and Replacement in Audio Narration
Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
Proceedings of SIGGRAPH, 2017 paper | webpage | demo video | Adobe MAX video
Extensive Press Coverage
Eulerian Video Magnification and Analysis
Neal Wadhwa, Hao-Yu Wu, Abe Davis, Michael Rubinstein, Eugene Shih, Gautham J. Mysore, Justin G. Chen, Oral Buyukozturk, John V. Guttag, William T. Freeman, and Frédo Durand
Communications of the ACM, January 2017 paper
Analysis of Prosody Increment Induced by Pitch Accents for Automatic Emphasis Correction
Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz, Mark Hasegawa-Johnson
Speech Prosody, 2016 paper
Equalization Matching of Speech Recordings in Real-World Environments
François G. Germain, Gautham J. Mysore, Takako Fujioka
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
Structural Segmentation with the Variable Markov Oracle and Boundary Adjustment
Cheng-i Wang, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
Fast and Easy Crowdsourced Perceptual Audio Evaluation
Mark Cartwright, Bryan Pardo, Gautham J. Mysore, Matt Hoffman
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
CUTE: A Concatenative Method for Voice Conversion using Exemplar based Unit Selection
Zeyu Jin, Adam Finkelstein, Stephen DiVerdi, Jingwan Lu, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016 paper
Capture-Time Feedback for Recording Scripted Narration
Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Maneesh Agrawala
ACM Symposium on User Interface Software and Technology (UIST), 2015 paper | webpage | video | audio results
Speaker and Noise Independent Online Single Channel Speech Enhancement
François G. Germain, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 paper
Speech Dereverberation using a Learned Speech Model
Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 paper
Efficient Manifold Preserving Audio Source Separation using Locality Sensitive Hashing
Minje Kim, Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015 paper
Lamello: Passive Acoustic Sensing for Tangible Input Components
Valkyrie Savage, Andrew Head, Björn Hartmann, Dan Goldman, Gautham J. Mysore, Wilmot Li
ACM Conference on Human Factors in Computing Systems (CHI), 2015
Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges
Gautham J. Mysore
IEEE Signal Processing Letters, August 2015 paper | webpage | dataset
The Visual Microphone: Passive Recovery of Sound from Video
Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham J. Mysore, Frédo Durand, William T. Freeman
SIGGRAPH, 2014 paper | webpage | video
Stopping Criteria for Non-negative Matrix Factorization Based Supervised and Semi-Supervised Source Separation
François G. Germain, Gautham J. Mysore
IEEE Signal Processing Letters, October 2014 paper
Exploiting Long-Term Temporal Dependencies in NMF using Recurrent Neural Networks with Application to Source Separation
Nicolas Boulanger-Lewandowski, Gautham J. Mysore, Matthew Hoffman
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014 paper
Speech Decoloration based on the Product-of-Filters Model
Dawen Liang, Daniel P. W. Ellis, Matthew Hoffman, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014 paper
Static and Dynamic Source Separation Using Nonnegative Factorizations: A unified view
Paris Smaragdis, Cédric Févotte, Gautham J. Mysore, Nasser Mohammadiha, Matthew Hoffman
IEEE Signal Processing Magazine Special Issue on Source Separation and Applications, May 2014 paper
A Generative Product of Filter Model of Audio
Dawen Liang, Matthew Hoffman, Gautham J. Mysore
International Conference on Learning Representations (ICLR), 2014 paper | code
ISSE: An Interactive Source Separation Editor
Nicholas J. Bryan, Gautham J. Mysore, Ge Wang
ACM Conference on Human Factors in Computing Systems (CHI), 2014
paper | webpage | demo video | demos | code
Source Separation of Polyphonic Music with Interactive User-Feedback on a Piano Roll Display
Nicholas J. Bryan, Gautham J. Mysore, Ge Wang
International Society for Music Information Retrieval Conference (ISMIR), 2013 paper
Combining Modeling of Singing Voice and Background Music for Automatic Separation of Musical Mixtures
Zafar Rafii, François G. Germain, Dennis L. Sun, Gautham J. Mysore
International Society for Music Information Retrieval Conference (ISMIR), 2013 paper
Content-Based Tools for Editing Audio Stories
Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Wilmot Li, Maneesh Agrawala
ACM Symposium on User Interface Software and Technology (UIST), 2013
paper | webpage | video | code | web app | audio results
Speaker and Noise Independent Voice Activity Detection
François G. Germain, Dennis L. Sun, Gautham J. Mysore
Interspeech, 2013 paper
Best Student Paper Award
An Efficient Posterior Regularized Latent Variable Model for Interactive Source Separation
Nicholas J. Bryan, Gautham J. Mysore
International Conference on Machine Learning (ICML), 2013
AES Student Design Competition Gold Award
paper | webpage | demo video | demos | code
Universal Speech Models for Speaker Independent Single Channel Source Separation
Dennis L. Sun, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 paper
Interactive Refinement of Supervised and Semi-supervised Sound Source Separation Estimates
Nicholas J. Bryan, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013 paper | webpage | sound examples | code
UnderScore: Musical Underlays for Audio Stories
Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Wilmot Li, Maneesh Agrawala
ACM Symposium on User Interface Software and Technology (UIST), 2012
paper | webpage | video | code
Language Informed Bandwidth Expansion
Jinyu Han, Gautham J. Mysore, Bryan Pardo
IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2012 paper
Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments
Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
Interspeech, 2012 paper
Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation
Gautham J. Mysore, Maneesh Sahani
International Conference on Machine Learning (ICML), 2012 paper
Following Musical Sources by Example
Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012 paper
Clustering and Synchronizing Multi-camera Video via Landmark Cross-correlation
Nicholas J. Bryan, Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012 paper
Noise-Robust Dynamic Time Warping Using PLCA Features
Brian King, Paris Smaragdis, Gautham J. Mysore
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012 paper
A Non-negative Approach to Language Informed Speech Separation
Gautham J. Mysore, Paris Smaragdis
International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012 paper | webpage | sound examples
Audio Imputation Using the Non-negative Hidden Markov Model
Jinyu Han, Gautham J. Mysore, Bryan Pardo
International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012 paper | webpage
Sound Recognition in Mixtures
Juhan Nam, Gautham J. Mysore, Paris Smaragdis
International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012 paper
Online PLCA for Real-Time Semi-supervised Source Separation
Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis
International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012 paper
A Convolutive Spectral Decomposition Approach to the Separation of Feedback from Target Speech
Gautham J. Mysore, Paris Smaragdis
IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2011 paper
A Non-negative Approach to Semi-supervised Separation of Speech from Noise with the use of Temporal Dynamics
Gautham J. Mysore, Paris Smaragdis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011 paper
Non-negative Hidden Markov Modeling of Audio with Application to Source Separation
Gautham J. Mysore, Paris Smaragdis, Bhiksha Raj
International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2010 paper | webpage | sound examples
Best Student Paper Award
A Super-Resolution Spectrogram Using Coupled PLCA
Juhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel
Interspeech, 2010 paper | webpage
Evaluation of a Score-Informed Source Separation System
Joachim Ganseman, Paul Scheunders, Gautham J. Mysore, Jonathan S. Abel
International Society for Music Information Retrieval Conference (ISMIR), 2010
Source Separation by Score Synthesis
Joachim Ganseman, Paul Scheunders, Gautham J. Mysore, Jonathan S. Abel
International Computer Music Conference (ICMC), 2010 paper | webpage
“Separation by Humming”: User Guided Sound Extraction from Monophonic Mixtures
Paris Smaragdis, Gautham J. Mysore
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2009 paper
Relative Pitch Estimation of Multiple Instruments
Gautham J. Mysore, Paris Smaragdis
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009 paper | webpage
Probabilistic Factorization of Non-Negative Data with Entropic Co-occurrence Constraints
Paris Smaragdis, Madhusudana Shashanka, Bhiksha Raj, Gautham J. Mysore
International Conference on Independent Component Analysis and Signal Separation (ICA), 2009 paper
Singer-Dependent Falsetto Detection for Live Vocal Processing Based on Support Vector Classification
Gautham J. Mysore, Ryan J. Cassidy, Julius O. Smith III
IEEE Asilomar Conference on Signals, Systems, and Computers, 2006 paper
SCUBA: The Self-Contained Unified Bass Augmenter
Juan-Pablo Cáceres, Gautham J. Mysore, Jeffrey Treviño
International Conference on New Interfaces for Musical Expression (NIME), 2005 paper
Ph.D. Thesis
A Non-negative Framework for Joint Modeling of Spectral Structure and Temporal Dynamics in Sound Mixtures
Advisor: Julius O. Smith III
Reading Committee: Paris Smaragdis, Malcolm Slaney, Robert Tibshirani
Stanford University, June 2010
Patents
Sound Quality Prediction and Interface to Facilitate High Quality Voice Recordings
Prem Seetharaman, Gautham J. Mysore, Bryan Pardo
U.S. Patent #11138989 issued in October 2021
Transcript-based Insertion of Secondary Video Content into Primary Video Content
Bernd Huber, Hijung Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore
U.S. Patent #11049525 issued in June 2021
Real-time Speaker Dependent Neural Vocoder
Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu
U.S. Patent #10770063 issued in September 2020
Time Interval Sound Alignment
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #10638221 issued in April 2020
Generating Audio Loops from an Audio Track
Zhengshan Shi, Gautham J. Mysore
U.S. Patent #10460763 issued in October 2019
Automatic Voiceover Correction System
Shrikant Venkataramani, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #10453475 issued in October 2019
Text-based Insertion and Replacement in Audio Narration
Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
U.S. Patent #10347238 issued in July 2019
Variable Sound Decomposition Masks
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #10262680 issued in April 2019
Sound Rate Modification
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #10249321 issued in April 2019
Sound Processing using a Product-of-Filters Model
Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore
U.S. Patent #10176818 issued in January 2019
Reverberation Matching of Speech
Ramin Anushiravani, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #10069028 issued in September 2018
Intuitive Music Visualization using Efficient Structural Segmentation
Cheng-i Wang, Gautham J. Mysore
U.S. Patent #10074350 issued in September 2018
Irregular Pattern Identification using Landmark based Convolution
Minje Kim, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #10002622 issued in June 2018
Performance Metric Based Stopping Criteria for Iterative Algorithms
François G. Germain, Gautham J. Mysore
U.S. Patent #9866954 issued in January 2018
Automatic Emphasis of Spoken Words
Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
U.S. Patent #9852743 issued in December 2017
Irregularity Detection in Music
Minje Kim, Gautham J. Mysore, Paris Smaragdis, Peter Merrill
U.S. Patent #9734844 issued in August 2017
Non-negative Matrix Factorization Regularized by Recurrent Neural Networks for Audio Processing
Nicolas Boulanger-Lewandowski, Gautham J. Mysore, Matthew Hoffman
U.S. Patent #9721202 issued in August 2017
Dereverberation Using a Learned Speech Model
Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore
U.S. Patent #9607627 issued in March 2017
Acoustic Matching and Splicing of Sound Tracks
François G. Germain, Gautham J. Mysore
U.S. Patent #9601124 issued in March 2017
Automatic Detection of Dense Ornamentation in Music
Minje Kim, Gautham J. Mysore, Paris Smaragdis, Peter Merrill
U.S. Patent #9514722 issued in December 2016
Sound Feature Priority Alignment
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9451304 issued in September 2016
Pattern Matching of Sound Data using Hashing
Minje Kim, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #9449085 issued in September 2016
General Sound Decomposition Models
Dennis L. Sun, Gautham J. Mysore
U.S. Patent #9437208 issued in September 2016
Sound Alignment Using Timing Information
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9355649 issued in May 2016
Multichannel Sound Source Identification and Localization
Minje Kim, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9351093 issued in May 2016
Joint Sound Model Generation Techniques
Dennis L. Sun, Gautham J. Mysore
U.S. Patent #9318106 issued in April 2016
Sound Alignment User Interface
Brian King, Gautham J. Mysore, Paris Smaragdis
U.S. Patent #9201580 issued in December 2015
Sound Mixture Recognition
Gautham J. Mysore, Paris Smaragdis, Juhan Nam
U.S. Patent #9165565 issued in October 2015
Feature Estimation in Sound Sources
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8965832 issued in February 2015
User-Guided Audio Selection from Complex Sound Mixtures
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8954175 issued in February 2015
Clustering and Synchronizing Content
Nicholas J. Bryan, Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8924345 issued in December 2014
Language Informed Source Separation
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #8843364 issued in September 2014
Semi-supervised Source Separation using Non-negative Techniques
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #8812322 issued in August 2014
Noise Robust Template Matching
Gautham J. Mysore, Paris Smaragdis, Brian King
U.S. Patent #8775167 issued in July 2014
System and Method for Acoustic Echo Cancellation using Spectral Decomposition
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8724798 issued in May 2014
Non-negative Hidden Markov Modeling of Signals
Gautham J. Mysore, Paris Smaragdis
U.S. Patent #8554553 issued in October 2013
Method and Apparatus for Relative Pitch Tracking of Multiple Arbitrary Sounds
Paris Smaragdis, Gautham J. Mysore
U.S. Patent #8380331 issued in February 2013