Introduction to Deep Learning
This is a collection of homework and course projects for CMU 11785: Introduction to Deep Learning (Term: Fall 2018).
- Homework 1: Frame Level Classification of Speech (spec)
- Build from scratch an MLP class supporting backprop, batchnorm, softmax, and momentum, using only NumPy.
- Identify the phoneme state label for WSJ utterance frames using the MLP.
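
As a rough illustration of two of the pieces listed for Homework 1, here is a minimal NumPy sketch of a numerically stable softmax and a single SGD-with-momentum update. The function names, the learning rate, and the momentum coefficient are assumptions chosen for the example and are not taken from the homework code.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max avoids overflow in exp."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def momentum_step(param, grad, velocity, lr=0.1, beta=0.9):
    """One SGD-with-momentum update; returns the new parameter and velocity.

    lr and beta are illustrative defaults, not values from the course.
    """
    velocity = beta * velocity - lr * grad
    return param + velocity, velocity

# Two rows of logits; the second would overflow without the max shift.
probs = softmax(np.array([[1.0, 2.0, 3.0], [1000.0, 1000.0, 1000.0]]))

# One update step starting from zero velocity.
w, v = momentum_step(np.array([1.0, -1.0]), np.array([0.5, 0.5]), np.zeros(2))
```

The velocity is returned rather than mutated in place so that the update stays easy to test in isolation.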
- Homework 2: Speaker Verification via Convolutional Neural Networks (spec)
- Determining whether two speech segments were uttered by the same speaker.
- Extract speaker embeddings from utterances using a CNN followed by dense layers, trained with N-way classification.
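
Once embeddings are available, one common way to decide whether two segments share a speaker is to threshold their cosine similarity. The sketch below assumes that scoring rule; the README does not say which rule the homework used, and the function names and the threshold value are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_speaker(emb_a, emb_b, threshold=0.5):
    """Accept the pair if similarity reaches the (illustrative) threshold."""
    return cosine_similarity(emb_a, emb_b) >= threshold

# Toy 2-dimensional embeddings; real embeddings would come from the CNN.
e1 = np.array([1.0, 0.0])
e2 = np.array([2.0, 0.0])   # same direction as e1, different length
e3 = np.array([0.0, 1.0])   # orthogonal to e1
```

In practice the threshold would be tuned on held-out verification pairs.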
- Homework 3: Seq2Seq Phonemes Prediction (spec)
- Build a Seq2Seq model for phoneme prediction on unaligned utterance data.
- Incorporate the CTC loss and a beam search decoder into the model.
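
To show how a CTC output path maps to a label sequence, here is a small pure-Python sketch of the CTC collapse rule (merge repeated symbols, then drop blanks) paired with greedy per-frame decoding. Greedy decoding is used here only as a simpler stand-in for the beam search decoder mentioned above; the function names and the choice of index 0 for the blank symbol are assumptions.

```python
def ctc_collapse(path, blank=0):
    """Merge consecutive repeats, then remove blank symbols."""
    out = []
    prev = None
    for symbol in path:
        if symbol != prev and symbol != blank:
            out.append(symbol)
        prev = symbol
    return out

def greedy_decode(frame_probs, blank=0):
    """Pick the most likely symbol in each frame, then collapse the path."""
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    return ctc_collapse(path, blank)

# Four frames over a vocabulary of {blank, 1, 2}.
frames = [
    [0.1, 0.9, 0.0],
    [0.1, 0.9, 0.0],
    [0.8, 0.1, 0.1],
    [0.2, 0.1, 0.7],
]
decoded = greedy_decode(frames)
```

A blank between two identical symbols keeps them separate, which is how CTC represents genuinely repeated labels.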
- Homework 4: Attention-based End-to-End Speech-to-Text Deep Neural Network (spec)
- Implement the character-based Listen, Attend and Spell (LAS) model to translate utterances to corresponding text transcripts.
- The listener consists of a pyramidal bi-LSTM network that produces attention keys and values. The decoder is an LSTM that yields sequential outputs.
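
The role of the attention keys and values can be sketched with a single attention step: a decoder query is scored against each key, the scores are normalized with a softmax, and the weights mix the values into a context vector. The dot-product scoring, the function name, and the shapes below are assumptions for illustration and may differ from the homework implementation.

```python
import numpy as np

def attend(query, keys, values):
    """One attention step.

    query: (d,), keys: (T, d), values: (T, d_v).
    Returns the context vector (d_v,) and the attention weights (T,).
    """
    scores = keys @ query                 # dot-product score per time step
    scores = scores - scores.max()        # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ values            # weighted sum of values
    return context, weights

keys = np.array([[1.0, 0.0], [0.0, 1.0]])
values = np.array([[2.0, 0.0], [0.0, 4.0]])

# A zero query scores every key equally, so the weights are uniform.
ctx_uniform, w_uniform = attend(np.zeros(2), keys, values)

# A query aligned with the first key concentrates weight on the first value.
ctx_peaked, w_peaked = attend(np.array([10.0, 0.0]), keys, values)
```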
- Course Project: Dynamic fleet management with deep reinforcement learning (report)
- Apply online deep reinforcement learning models for dynamic taxi dispatch in NYC.
- Implement a CNN-based deep Q network and a refined diffusion-CNN model.
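
For readers unfamiliar with deep Q networks, the training signal is a one-step temporal-difference target built from the reward and the best predicted next-state value. The sketch below shows that target in plain Python; the function name and the discount factor are assumptions, and the project's actual network and reward design are described in the report rather than here.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """One-step Q-learning target: r + gamma * max_a' Q(s', a').

    For a terminal transition the target is just the reward.
    gamma is an illustrative default, not a value from the project.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)

# Non-terminal transition: 1.0 + 0.5 * max(0.5, 2.0)
target = td_target(1.0, [0.5, 2.0], done=False, gamma=0.5)

# Terminal transition ignores the next-state values.
terminal_target = td_target(1.0, [0.5, 2.0], done=True)
```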
Technical tools: Python, PyTorch, TensorFlow, AWS