
Publications

Record Once, Post Everywhere: Automatic Shortening of Audio Stories for Social Media

Bryan Wang, Zeyu Jin, Gautham J. Mysore

ACM Symposium on User Interface Software and Technology (UIST), 2022

paper | webpage

​

​

Emotion Embedding Spaces for Matching Music to Stories

Minz Won, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore, Xavier Serra

International Society for Music Information Retrieval Conference (ISMIR), 2021   paper

Best Student Paper Award

​

Audio-Based Music Structure Analysis: Current Trends, Open Challenges, and Applications

Oriol Nieto, Gautham J. Mysore, Cheng-i Wang, Jordan B. L. Smith, Jan Schlüter, Thomas Grill, Brian McFee

Transactions of the International Society for Music Information Retrieval (TISMIR), 2020   paper

​

Controllable Neural Prosody Synthesis

Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore

Interspeech, 2020   paper

​

A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences

Pranay Manocha, Adam Finkelstein, Richard Zhang, Nicholas J. Bryan, Gautham J. Mysore, Zeyu Jin

Interspeech, 2020   paper | webpage

Nominated for a Best Student Paper Award

​

Recent Advances in Music Signal Processing

Meinard Müller, Bryan Pardo, Gautham J. Mysore, Vesa Välimäki

IEEE Signal Processing Magazine, 2019   paper

​

B-Script: Transcript-based B-roll Video Editing with Recommendations

Bernd Huber, Hijung Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore

ACM Conference on Human Factors in Computing Systems (CHI), 2019   

paper | webpage | blog post

​

VoiceAssist: Guiding Users to High-Quality Voice Recordings

Prem Seetharaman, Gautham J. Mysore, Bryan Pardo, Paris Smaragdis, Celso Gomes

ACM Conference on Human Factors in Computing Systems (CHI), 2019   paper

​

Blind Estimation of the Speech Transmission Index for Speech Quality Prediction

Prem Seetharaman, Gautham J. Mysore, Paris Smaragdis, Bryan Pardo

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018   paper

​

FFTNet: A Real-time Speaker-dependent Neural Vocoder

Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018   paper

​

Crowdsourced Pairwise-comparison for Source Separation Evaluation

Mark Cartwright, Bryan Pardo, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018   paper

​

LoopMaker: Automatic Creation of Music Loops from Pre-recorded Music

Zhengshan Shi, Gautham J. Mysore

ACM Conference on Human Factors in Computing Systems (CHI), 2018   paper

​

MedleyAssistant – A System for Personalized Music Medley Creation

Zhengshan Shi, Gautham J. Mysore

ACM IUI Workshop on Intelligent Music Interfaces for Listening and Creation (MILC), 2018   paper

​

Re-visiting the Music Segmentation Problem with Crowdsourcing

Cheng-i Wang, Gautham J. Mysore, Shlomo Dubnov

International Society for Music Information Retrieval Conference (ISMIR), 2017

paper

​

Dynamic Non-negative Models for Audio Source Separation

Paris Smaragdis, Gautham J. Mysore, Nasser Mohammadiha

Book Chapter in Audio Source Separation, Springer, 2018

​

Temporal extensions of Nonnegative Matrix Factorization

Cédric Févotte, Paris Smaragdis, Nasser Mohammadiha, Gautham J. Mysore

Book Chapter in Audio Source Separation and Speech Enhancement, Wiley, 2018

​

AutoDub: Automatic Redubbing for Voiceover Editing

Shrikant Venkataramani, Paris Smaragdis, Gautham J. Mysore

ACM Symposium on User Interface Software and Technology (UIST), 2017  paper

​

VoCo: Text-based Insertion and Replacement in Audio Narration

Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein

Proceedings of SIGGRAPH, 2017   paper | webpage | demo video | Adobe MAX video

Extensive Press Coverage

​

Eulerian Video Magnification and Analysis

Neal Wadhwa, Hao-Yu Wu, Abe Davis, Michael Rubinstein, Eugene Shih, Gautham J. Mysore, Justin G. Chen, Oral Buyukozturk, John V. Guttag, William T. Freeman, and Frédo Durand

Communications of the ACM, January 2017   paper

​

Analysis of Prosody Increment Induced by Pitch Accents for Automatic Emphasis Correction

Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz, Mark Hasegawa-Johnson

Speech Prosody, 2016   paper

​

Equalization Matching of Speech Recordings in Real-World Environments

François G. Germain, Gautham J. Mysore, Takako Fujioka

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016   paper

​

Structural Segmentation with the Variable Markov Oracle and Boundary Adjustment

Cheng-i Wang, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016   paper

​

Fast and Easy Crowdsourced Perceptual Audio Evaluation

Mark Cartwright, Bryan Pardo, Gautham J. Mysore, Matt Hoffman

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016   paper

​

CUTE: A Concatenative Method for Voice Conversion using Exemplar based Unit Selection

Zeyu Jin, Adam Finkelstein, Stephen DiVerdi, Jingwan Lu, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2016   paper

​

Capture-Time Feedback for Recording Scripted Narration

Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Maneesh Agrawala

ACM Symposium on User Interface Software and Technology (UIST), 2015   paper | webpage | video | audio results

​

Speaker and Noise Independent Online Single Channel Speech Enhancement

François G. Germain, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015   paper

​

Speech Dereverberation using a Learned Speech Model

Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015   paper

​

Efficient Manifold Preserving Audio Source Separation using Locality Sensitive Hashing

Minje Kim, Paris Smaragdis, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2015   paper

​

Lamello: Passive Acoustic Sensing for Tangible Input Components

Valkyrie Savage, Andrew Head, Björn Hartmann, Dan Goldman, Gautham J. Mysore, Wilmot Li

ACM Conference on Human Factors in Computing Systems (CHI), 2015   

paper | video

​

Can We Automatically Transform Speech Recorded on Common Consumer Devices in Real-World Environments into Professional Production Quality Speech? - A Dataset, Insights, and Challenges

Gautham J. Mysore

IEEE Signal Processing Letters, August 2015   paper | webpage | dataset

​

The Visual Microphone: Passive Recovery of Sound from Video

Abe Davis, Michael Rubinstein, Neal Wadhwa, Gautham J. Mysore, Frédo Durand, William T. Freeman

SIGGRAPH, 2014   paper | webpage | video

Extensive Press Coverage

​

Stopping Criteria for Non-negative Matrix Factorization Based Supervised and Semi-Supervised Source Separation

François G. Germain, Gautham J. Mysore

IEEE Signal Processing Letters, October 2014   paper

​

Exploiting Long-Term Temporal Dependencies in NMF using Recurrent Neural Networks with Application to Source Separation

Nicolas Boulanger-Lewandowski, Gautham J. Mysore, Matthew Hoffman

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014   paper

​

Speech Decoloration based on the Product-of-Filters Model

Dawen Liang, Daniel P. W. Ellis, Matthew Hoffman, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2014   paper

​

Static and Dynamic Source Separation Using Nonnegative Factorizations: A unified view

Paris Smaragdis, Cédric Févotte, Gautham J. Mysore, Nasser Mohammadiha, Matthew Hoffman

IEEE Signal Processing Magazine Special Issue on Source Separation and Applications, May 2014   paper

​

A Generative Product of Filter Model of Audio

Dawen Liang, Matthew Hoffman, Gautham J. Mysore

International Conference on Learning Representations (ICLR), 2014    paper | code

​

ISSE: An Interactive Source Separation Editor

Nicholas J. Bryan, Gautham J. Mysore, Ge Wang

ACM Conference on Human Factors in Computing Systems (CHI), 2014

paper | webpage | demo video | demos | code

​

Source Separation of Polyphonic Music with Interactive User-Feedback on a Piano Roll Display

Nicholas J. Bryan, Gautham J. Mysore, Ge Wang

International Society for Music Information Retrieval Conference (ISMIR), 2013   paper

​

Combining Modeling of Singing Voice and Background Music for Automatic Separation of Musical Mixtures

Zafar Rafii, François G. Germain, Dennis L. Sun, Gautham J. Mysore

International Society for Music Information Retrieval Conference (ISMIR), 2013   paper

​

Content-Based Tools for Editing Audio Stories

Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Wilmot Li, Maneesh Agrawala

ACM Symposium on User Interface Software and Technology (UIST), 2013

paper | webpage | video | code | web app | audio results

​

Speaker and Noise Independent Voice Activity Detection

François G. Germain, Dennis L. Sun, Gautham J. Mysore

Interspeech, 2013   paper

Best Student Paper Award

​

An Efficient Posterior Regularized Latent Variable Model for Interactive Source Separation

Nicholas J. Bryan, Gautham J. Mysore

International Conference on Machine Learning (ICML), 2013

AES Student Design Competition Gold Award

paper | webpage | demo video | demos | code

​

Universal Speech Models for Speaker Independent Single Channel Source Separation

Dennis L. Sun, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013   paper

​

Interactive Refinement of Supervised and Semi-supervised Sound Source Separation Estimates

Nicholas J. Bryan, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2013   paper | webpage | sound examples | code

​

UnderScore: Musical Underlays for Audio Stories

Steve Rubin, Floraine Berthouzoz, Gautham J. Mysore, Wilmot Li, Maneesh Agrawala

ACM Symposium on User Interface Software and Technology (UIST), 2012

paper | webpage | video | code

​

Language Informed Bandwidth Expansion

Jinyu Han, Gautham J. Mysore, Bryan Pardo

IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2012   paper

​

Speech Enhancement by Online Non-negative Spectrogram Decomposition in Non-stationary Noise Environments

Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis

Interspeech, 2012   paper

​

Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

Gautham J. Mysore, Maneesh Sahani

International Conference on Machine Learning (ICML), 2012   paper

​

Following Musical Sources by Example

Paris Smaragdis, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012   paper

​

Clustering and Synchronizing Multi-camera Video via Landmark Cross-correlation

Nicholas J. Bryan, Paris Smaragdis, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012   paper

​

Noise-Robust Dynamic Time Warping Using PLCA Features

Brian King, Paris Smaragdis, Gautham J. Mysore

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012   paper

​

A Non-negative Approach to Language Informed Speech Separation

Gautham J. Mysore, Paris Smaragdis

International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012   paper | webpage | sound examples

​

Audio Imputation Using the Non-negative Hidden Markov Model

Jinyu Han, Gautham J. Mysore, Bryan Pardo

International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012   paper | webpage

​

Sound Recognition in Mixtures

Juhan Nam, Gautham J. Mysore, Paris Smaragdis

International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012   paper

​

Online PLCA for Real-Time Semi-supervised Source Separation

Zhiyao Duan, Gautham J. Mysore, Paris Smaragdis

International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2012   paper

​

A Convolutive Spectral Decomposition Approach to the Separation of Feedback from Target Speech

Gautham J. Mysore, Paris Smaragdis

IEEE International Workshop on Machine Learning for Signal Processing (MLSP), 2011  paper

​

A Non-negative Approach to Semi-supervised Separation of Speech from Noise with the use of Temporal Dynamics

Gautham J. Mysore, Paris Smaragdis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2011   paper

​

Non-negative Hidden Markov Modeling of Audio with Application to Source Separation

Gautham J. Mysore, Paris Smaragdis, Bhiksha Raj

International Conference on Latent Variable Analysis and Signal Separation (LVA / ICA), 2010   paper | webpage | sound examples

Best Student Paper Award

​

A Super-Resolution Spectrogram Using Coupled PLCA

Juhan Nam, Gautham J. Mysore, Joachim Ganseman, Kyogu Lee, Jonathan S. Abel

Interspeech, 2010   paper | webpage

​

Evaluation of a Score-Informed Source Separation System

Joachim Ganseman, Paul Scheunders, Gautham J. Mysore, Jonathan S. Abel

International Society for Music Information Retrieval Conference (ISMIR), 2010

paper | webpage

​

Source Separation by Score Synthesis

Joachim Ganseman, Paul Scheunders, Gautham J. Mysore, Jonathan S. Abel

International Computer Music Conference (ICMC), 2010   paper | webpage

​

“Separation by Humming”: User Guided Sound Extraction from Monophonic Mixtures

Paris Smaragdis, Gautham J. Mysore

IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2009   paper

​

Relative Pitch Estimation of Multiple Instruments

Gautham J. Mysore, Paris Smaragdis

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2009   paper | webpage

​

Probabilistic Factorization of Non-Negative Data with Entropic Co-occurrence Constraints

Paris Smaragdis, Madhusudana Shashanka, Bhiksha Raj, Gautham J. Mysore

International Conference on Independent Component Analysis and Signal Separation (ICA), 2009   paper

​

Singer-Dependent Falsetto Detection for Live Vocal Processing Based on Support Vector Classification

Gautham J. Mysore, Ryan J. Cassidy, Julius O. Smith III

IEEE Asilomar Conference on Signals, Systems, and Computers, 2006   paper

​

SCUBA: The Self-Contained Unified Bass Augmenter

Juan-Pablo Cáceres, Gautham J. Mysore, Jeffrey Treviño

International Conference on New Interfaces for Musical Expression (NIME), 2005  paper

​

​

Ph.D. Thesis

​

A Non-negative Framework for Joint Modeling of Spectral Structure and Temporal Dynamics in Sound Mixtures

Advisor: Julius O. Smith III

Reading Committee: Paris Smaragdis, Malcolm Slaney, Robert Tibshirani  

Stanford University, June 2010

thesis | webpage

​

​

Patents

​

Sound Quality Prediction and Interface to Facilitate High Quality Voice Recordings

Prem Seetharaman, Gautham J. Mysore, Bryan Pardo

U.S. Patent #11138989 issued in October 2021

​

Transcript-based Insertion of Secondary Video Content into Primary Video Content

Bernd Huber, Hijung Shin, Bryan Russell, Oliver Wang, Gautham J. Mysore

U.S. Patent #11049525 issued in June 2021

​

Real-time Speaker Dependent Neural Vocoder

Zeyu Jin, Adam Finkelstein, Gautham J. Mysore, Jingwan Lu

U.S. Patent #10770063 issued in September 2020

​

Time Interval Sound Alignment

Brian King, Gautham J. Mysore, Paris Smaragdis

U.S. Patent #10638221 issued in April 2020 

​

Generating Audio Loops from an Audio Track

Zhengshan Shi, Gautham J. Mysore

U.S. Patent #10460763 issued in October 2019 

​

Automatic Voiceover Correction System

Shrikant Venkataramani, Paris Smaragdis, Gautham J. Mysore

U.S. Patent #10453475 issued in October 2019 

​

Text-based Insertion and Replacement in Audio Narration

Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein

U.S. Patent #10347238 issued in July 2019 

​

Variable Sound Decomposition Masks

Gautham J. Mysore, Paris Smaragdis

U.S. Patent #10262680 issued in April 2019

​

Sound Rate Modification

Brian King, Gautham J. Mysore, Paris Smaragdis

U.S. Patent #10249321 issued in April 2019

 

Sound Processing using a Product-of-Filters Model

Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore

U.S. Patent #10176818 issued in January 2019

​

Reverberation Matching of Speech

Ramin Anushiravani, Paris Smaragdis, Gautham J. Mysore

U.S. Patent #10069028 issued in September 2018

​

Intuitive Music Visualization using Efficient Structural Segmentation

Cheng-i Wang, Gautham J. Mysore

U.S. Patent #10074350 issued in September 2018

​

Irregular Pattern Identification using Landmark based Convolution

Minje Kim, Paris Smaragdis, Gautham J. Mysore

U.S. Patent #10002622 issued in June 2018

​

Performance Metric Based Stopping Criteria for Iterative Algorithms

François G. Germain, Gautham J. Mysore

U.S. Patent #9866954 issued in January 2018 

​

Automatic Emphasis of Spoken Words

Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz

U.S. Patent #9852743 issued in December 2017

​

Irregularity Detection in Music

Minje Kim, Gautham J. Mysore, Paris Smaragdis, Peter Merrill

U.S. Patent #9734844 issued in August 2017

​

Non-negative Matrix Factorization Regularized by Recurrent Neural Networks for Audio Processing

Nicolas Boulanger-Lewandowski, Gautham J. Mysore, Matthew Hoffman

U.S. Patent #9721202 issued in August 2017

​

Dereverberation Using a Learned Speech Model

Dawen Liang, Matthew D. Hoffman, Gautham J. Mysore

U.S. Patent #9607627 issued in March 2017

​

Acoustic Matching and Splicing of Sound Tracks

François G. Germain, Gautham J. Mysore

U.S. Patent #9601124 issued in March 2017

​

Automatic Detection of Dense Ornamentation in Music

Minje Kim, Gautham J. Mysore, Paris Smaragdis, Peter Merrill

U.S. Patent #9514722 issued in December 2016

​

Sound Feature Priority Alignment

Brian King, Gautham J. Mysore, Paris Smaragdis

U.S. Patent #9451304 issued in September 2016

​

Pattern Matching of Sound Data using Hashing

Minje Kim, Paris Smaragdis, Gautham J. Mysore

U.S. Patent #9449085 issued in September 2016

​

General Sound Decomposition Models

Dennis L. Sun, Gautham J. Mysore

U.S. Patent #9437208 issued in September 2016

​

Sound Alignment Using Timing Information

Brian King, Gautham J. Mysore, Paris Smaragdis

U.S. Patent #9355649 issued in May 2016

​

Multichannel Sound Source Identification and Localization

Minje Kim, Gautham J. Mysore, Paris Smaragdis

U.S. Patent #9351093 issued in May 2016

​

Joint Sound Model Generation Techniques

Dennis L. Sun, Gautham J. Mysore

U.S. Patent #9318106 issued in April 2016

​

Sound Alignment User Interface

Brian King, Gautham J. Mysore, Paris Smaragdis

U.S. Patent #9201580 issued in December 2015

​

Sound Mixture Recognition

Gautham J. Mysore, Paris Smaragdis, Juhan Nam

U.S. Patent #9165565 issued in October 2015

​

Feature Estimation in Sound Sources

Paris Smaragdis, Gautham J. Mysore

U.S. Patent #8965832 issued in February 2015

​

User-Guided Audio Selection from Complex Sound Mixtures

Paris Smaragdis, Gautham J. Mysore

U.S. Patent #8954175 issued in February 2015

​

Clustering and Synchronizing Content

Nicholas J. Bryan, Paris Smaragdis, Gautham J. Mysore

U.S. Patent #8924345 issued in December 2014

​

Language Informed Source Separation

Gautham J. Mysore, Paris Smaragdis

U.S. Patent #8843364 issued in September 2014

​

Semi-supervised Source Separation using Non-negative Techniques

Gautham J. Mysore, Paris Smaragdis

U.S. Patent #8812322 issued in August 2014

​

Noise Robust Template Matching

Gautham J. Mysore, Paris Smaragdis, Brian King

U.S. Patent #8775167 issued in July 2014

​

System and Method for Acoustic Echo Cancellation using Spectral Decomposition

Paris Smaragdis, Gautham J. Mysore

U.S. Patent #8724798 issued in May 2014

​

Non-negative Hidden Markov Modeling of Signals

Gautham J. Mysore, Paris Smaragdis

U.S. Patent #8554553 issued in October 2013

​

Method and Apparatus for Relative Pitch Tracking of Multiple Arbitrary Sounds

Paris Smaragdis, Gautham J. Mysore

U.S. Patent #8380331 issued in February 2013

​

​

​
