
Deep Learning Internals
Training Description
You have probably heard that deep learning is making news across the world as one of the most promising techniques in machine learning, especially for analyzing image data. With every industry dedicating resources to unlocking its potential, you will want to use these models in tasks such as image tagging, object recognition, speech recognition, and text analysis to stay competitive.
- In this training session you will build deep learning models using neural networks and explore what they are, what they do, and how they work.
- To lower the barrier of designing, training, and tuning networks, and to achieve high performance with less labeled data, you will also build deep learning classifiers tailored to your specific task using pre-trained models, which we call deep features.
- You will also develop a clear understanding of the motivation for deep learning, and design intelligent systems that learn from complex and/or large-scale datasets.
Case studies
- MNIST
- Object Recognition
- Sequence Modelling
- Picture Generation
- Simple Atari
Techniques
- Artificial Neural Networks
- Convolutional Neural Networks
- Recurrent Neural Networks
- Generative Adversarial Networks
- Reinforcement Learning
- Support Vector Machines (SVM)
Key skills
- Combine different types of layers and activation functions to obtain better performance
- Describe how these models can be applied in computer vision, text analytics, and speech recognition
- Describe how a neural network model is represented and how it encodes non-linear features
- Use pre-trained models, such as deep features, for new classification tasks
- Prototype ideas and then productionize them
- Explore a dataset of products, reviews, and images
- Use TensorFlow and Keras for building detailed CNN-based models
- Use TensorFlow and Keras for building detailed RNN-based models
- Understand the value of GPUs in deep learning computation and the rules for provisioning them
Pre-requisites
This is an advanced-level session; it assumes good familiarity with:
- Machine learning
- Machine Learning Internals
- Working knowledge of Python
Instructional Method
This is an instructor-led course that provides lecture topics and the practical application of Deep Learning and the underlying technologies. Most concepts are presented pictorially, and a detailed case study strings together the technologies, patterns, and design.
Topics
Introduction to Deep Learning
- Parameter Hyperspace 
- Minimizing Cross Entropy
- Normalized Inputs And Initial Weights 
- Measuring Performance 
- Transition Into Practical Aspects Of Learning 
- Stochastic Gradient Descent 
- Training your Logistic Classifier 
- Transition: Overfitting -> Dataset Size 
- Momentum And Learning Rate Decay 
- Supervised Classification 
- Solving Problems 
- Lather Rinse Repeat 
- Optimizing A Logistic Classifier 
- Cross Entropy 
- What is Deep Learning 
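The ideas above — the logistic classifier, softmax, cross entropy, and stochastic gradient descent — fit into a few dozen lines of code. The following is a minimal NumPy sketch, not the course's reference implementation; the synthetic two-blob dataset, learning rate, and batch size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-class data: two Gaussian blobs (illustrative only).
X = np.vstack([rng.normal(-1.0, 1.0, size=(100, 2)),
               rng.normal(+1.0, 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

num_classes = 2
W = np.zeros((2, num_classes))   # weight matrix
b = np.zeros(num_classes)        # bias vector
lr, batch_size = 0.5, 20         # learning rate and mini-batch size (assumed values)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)      # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

for epoch in range(50):
    # Mini-batch stochastic gradient descent on the cross-entropy loss.
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        probs = softmax(X[idx] @ W + b)
        onehot = np.eye(num_classes)[y[idx]]
        grad_logits = (probs - onehot) / len(idx)   # d(cross-entropy)/d(logits)
        W -= lr * (X[idx].T @ grad_logits)
        b -= lr * grad_logits.sum(axis=0)

accuracy = (softmax(X @ W + b).argmax(axis=1) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```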
Deep Neural Network
- "2-layer" neural network 
- Network Of ReLUs 
- Dropout 
- Intro to Deep Neural Network 
- No Neurons 
- Backprop 
- Regularization Intro 
- Linear Models Are Limited 
- The Chain Rule 
- Dropout Pt-2 
- Regularization 
- Training A Deep Learning Network 
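As a rough illustration of how these pieces (a hidden layer of ReLUs, dropout, and a weight penalty for regularization) combine when training a deep network, here is a minimal Keras sketch. The toy data, layer sizes, and hyperparameters are assumptions made for the example, not values prescribed by the course.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data standing in for a real dataset (illustrative only).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(1000, 20)).astype("float32")
y_train = (x_train.sum(axis=1) > 0).astype("int32")

model = keras.Sequential([
    layers.Input(shape=(20,)),
    # Hidden layer of ReLUs with an L2 weight penalty (regularization).
    layers.Dense(64, activation="relu",
                 kernel_regularizer=keras.regularizers.l2(1e-4)),
    layers.Dropout(0.5),          # dropout: randomly zero activations during training
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, batch_size=32, validation_split=0.2)
```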
Deep Learning Internals
- How They Work 
- A Simple Predicting Machine 
- Following Signals Through A Neural Network 
- Sometimes One Classifier Is Not Enough 
- Learning Weights From More Than One Node 
- A Three Layer Example with Matrix Multiplication 
- Training A Simple Classifier 
- Backpropagating Errors To More Layers 
- Classifying is Not Very Different from Predicting 
- Making it easy by looking at logic and math 
- Preparing Data 
- Matrix Multiplication is Useful, Honest!
- Neurons, Nature’s Computing Machines 
- How Do We Actually Update Weights? 
- Weight Update Worked Example 
- Backpropagating Errors with Matrix Multiplication 
- Backpropagating Errors From More Output Nodes 
- DIY with Python 
- Interactive Python = IPython 
- A Very Gentle Start with Python 
- The MNIST Dataset of Handwritten Numbers 
- Python 
- Neural Network with Python 
- Hand rolled Neural Network 
- Creating New Training Data: Rotations 
- Your Own Handwriting 
- Inside the Mind of a Neural Network 
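A hand-rolled three-layer network of the kind discussed above can be written with nothing but NumPy matrix multiplications. The sketch below is illustrative only: the class name, layer sizes, sigmoid activation, and single training pattern are assumptions, and a real exercise (for example on MNIST) would loop over a full dataset.

```python
import numpy as np

class SimpleNetwork:
    """A hand-rolled 3-layer (input-hidden-output) network trained with
    backpropagation, using matrix multiplication for the signal flow."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.1):
        rng = np.random.default_rng(0)
        self.w_ih = rng.normal(0.0, n_in ** -0.5, (n_hidden, n_in))
        self.w_ho = rng.normal(0.0, n_hidden ** -0.5, (n_out, n_hidden))
        self.lr = lr
        self.sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    def query(self, inputs):
        # Feed signals forward: one matrix multiplication per layer.
        hidden = self.sigmoid(self.w_ih @ inputs)
        return self.sigmoid(self.w_ho @ hidden), hidden

    def train(self, inputs, targets):
        inputs = np.array(inputs, ndmin=2).T
        targets = np.array(targets, ndmin=2).T
        outputs, hidden = self.query(inputs)
        # Backpropagate errors, splitting the hidden error by the weights.
        out_err = targets - outputs
        hid_err = self.w_ho.T @ out_err
        # Gradient-descent weight updates (worked out from the chain rule).
        self.w_ho += self.lr * (out_err * outputs * (1 - outputs)) @ hidden.T
        self.w_ih += self.lr * (hid_err * hidden * (1 - hidden)) @ inputs.T

# Tiny usage example on a made-up pattern (illustrative only).
net = SimpleNetwork(n_in=3, n_hidden=5, n_out=2)
for _ in range(2000):
    net.train([0.9, 0.1, 0.1], [0.99, 0.01])
print(net.query(np.array([[0.9], [0.1], [0.1]]))[0].ravel())
```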
Software Tools
- TensorFlow
- Installation 
- Sharing Variables 
- Creating Your First Graph and Running It in a Session 
- Managing Graphs 
- Visualizing the Graph and Training Curves Using TensorBoard 
- Implementing Gradient Descent 
- Lifecycle of a Node Value 
- Linear Regression with TensorFlow 
- Modularity 
- Saving and Restoring Models 
- Name Scopes 
- Feeding Data to the Training Algorithm 
- Keras 
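To give a flavour of the TensorFlow topics above, here is a minimal linear-regression example. The graph-and-session workflow named in the outline corresponds to TensorFlow 1.x; this sketch assumes TensorFlow 2.x, where eager execution and GradientTape replace explicit sessions. The synthetic data and learning rate are illustrative.

```python
import numpy as np
import tensorflow as tf

# Synthetic data for y = 3x + 2 plus noise (illustrative only).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(200, 1)).astype("float32")
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, size=(200, 1)).astype("float32")

w = tf.Variable(0.0)
b = tf.Variable(0.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

for step in range(200):
    with tf.GradientTape() as tape:
        pred = w * x + b
        loss = tf.reduce_mean(tf.square(pred - y))   # mean squared error
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))    # one gradient-descent step

print(f"w ~ {w.numpy():.2f}, b ~ {b.numpy():.2f}")
```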
Artificial Neural Networks Internals
- Training a DNN Using Plain TensorFlow 
- Fine-Tuning Neural Network Hyperparameters 
- Training an MLP with TensorFlow’s High-Level API 
- From Biological to Artificial Neurons 
- Deep Feedforward Networks 
- Hidden Units 
- Architecture Design 
- Back-Propagation and Other Differentiation Algorithms 
- Gradient-Based Learning 
- Learning XOR 
- Optimization for Training Deep Models 
- Random or Unsupervised Features 
- Structured Outputs 
- Convolutional Networks 
- Challenges in Neural Network Optimization
- Efficient Convolution Algorithms 
- Variants of the Basic Convolution Function 
- The Convolution Operation 
- Pooling 
- Parameter Initialization Strategies 
- Motivation 
- How Learning Differs from Pure Optimization 
- Basic Algorithms 
- Optimization Strategies and Meta-Algorithms 
- The Neuroscientific Basis for Convolutional Networks 
- Convolution and Pooling as an Infinitely Strong Prior 
- Data Types 
- Approximate Second-Order Methods 
- Algorithms with Adaptive Learning Rates 
- Regularization for Deep Learning 
- Bagging and Other Ensemble Methods 
- Dataset Augmentation 
- Tangent Distance, Tangent Prop, and Manifold Tangent Classifier 
- Parameter Tying and Parameter Sharing 
- Semi-Supervised Learning 
- Early Stopping 
- Sparse Representations 
- Regularization and Under-Constrained Problems 
- Multi-Task Learning 
- Dropout 
- Norm Penalties as Constrained Optimization 
- Noise Robustness 
- Parameter Norm Penalties 
- Adversarial Training 
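The "Learning XOR" topic is a good, tiny example of why hidden layers matter: XOR is not linearly separable, so a purely linear model cannot fit it, but an MLP with one hidden layer can. Below is a minimal sketch using Keras; it assumes TensorFlow 2.x, and the layer width and optimizer settings are arbitrary choices for the example.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# The four XOR input/output pairs.
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")
y = np.array([[0], [1], [1], [0]], dtype="float32")

# One hidden layer of non-linear units is enough to represent XOR.
model = keras.Sequential([
    layers.Input(shape=(2,)),
    layers.Dense(8, activation="tanh"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=keras.optimizers.Adam(0.05), loss="binary_crossentropy")
model.fit(x, y, epochs=500, verbose=0)

# Predictions should be close to 0, 1, 1, 0.
print(model.predict(x, verbose=0).round(2))
```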
Convolutional Neural Networks Internals
- Pooling layer 
- Image augmentation 
- Convolutional layer 
- History of CNNs 
- Convolutional layers in Keras 
- Code for visualizing an image 
- Input layer 
- How do computers interpret images? 
- Practical example image classification 
- Convolutional neural networks 
- Dropout 
- Attention Mechanism for CNN and Visual Models 
- Types of Attention 
- Glimpse Sensor in code 
- Attention mechanism for image captioning 
- Hard Attention 
- Applying the RAM on a noisy MNIST sample 
- Recurrent models of visual attention 
- Using attention to improve visual models 
- Reasons for sub-optimal performance of visual CNN models 
- Soft Attention 
- Build Your First CNN and Performance Optimization 
- Convolution and pooling operations in TensorFlow 
- Convolutional operations 
- Using tanh 
- Convolution operations in TensorFlow 
- Regularization 
- Fully connected layer 
- Weight and bias initialization 
- Pooling, stride, and padding operations 
- CNN architectures and drawbacks of DNNs 
- Applying pooling operations in TensorFlow 
- Using sigmoid 
- Training a CNN 
- Using ReLU 
- Activation functions 
- Building, training, and evaluating our first CNN 
- Creating a CNN model 
- Defining CNN hyperparameters 
- Model evaluation 
- Dataset description 
- Loading the required packages 
- Running the TensorFlow graph to train the CNN model 
- Preparing the TensorFlow graph 
- Loading the training/test images to generate train/test set 
- Constructing the CNN layers 
- Model performance optimization 
- Applying dropout operations with TensorFlow 
- Building the second CNN by putting everything together 
- Appropriate layer placement 
- Which optimizer to use? 
- Creating the CNN model 
- Dataset description and preprocessing 
- Number of neurons per hidden layer 
- Number of hidden layers 
- Batch normalization 
- Memory tuning 
- Training and evaluating the network 
- Advanced regularization and avoiding overfitting 
- Popular CNN Model Architectures 
- Architecture insights 
- ResNet architecture 
- AlexNet architecture 
- VGG image classification code example 
- Introduction to ImageNet 
- VGGNet architecture 
- GoogLeNet architecture 
- LeNet 
- Traffic sign classifiers using AlexNet 
- Inception module 
- Transfer Learning 
- Multi-task learning 
- Target dataset is small but different from the original training dataset
- Autoencoders for CNN 
- Applications 
- Target dataset is large and similar to the original training dataset 
- Introduction to autoencoders
- Convolutional autoencoder 
- Target dataset is large and different from the original training dataset
- Transfer learning example 
- Feature extraction approach 
- Target dataset is small and is similar to the original training dataset
- An example of compression 
- GAN: Generating New Images with CNN 
- Feature matching 
- GAN code example 
- Deep convolutional GAN 
- Adding the optimizer 
- Training a GAN model 
- Semi-supervised learning and GAN 
- Pix2pix - Image-to-Image translation GAN
- Calculating loss 
- Semi-supervised classification using a GAN example 
- CycleGAN 
- Batch normalization 
- Object Detection and Instance Segmentation with CNN 
- Creating the environment 
- Fast R-CNN (fast region-based CNN) 
- The differences between object detection and image classification 
- Mask R-CNN (Instance segmentation with CNN) 
- Cascading classifiers 
- Haar Features 
- Faster R-CNN (faster region proposal network-based CNN) 
- Traditional, non-CNN approaches to object detection
- R-CNN (Regions with CNN features) 
- Running the pre-trained model on the COCO dataset 
- Why is object detection much more challenging than image classification?
- The Viola-Jones algorithm 
- Preparing the COCO dataset folder structure 
- Downloading and installing the COCO API and Detectron library (OS shell commands)
- Instance segmentation in code 
- Haar features, cascading classifiers, and the Viola-Jones algorithm 
- Installing Python dependencies (Python environment) 
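Pulling several of the CNN topics above together (convolutional layers in Keras, pooling, dropout, and a fully connected output layer), here is a minimal Keras model sketch. It assumes 28x28 grayscale inputs and 10 classes, as for MNIST; the layer sizes and training settings are illustrative, not the course's prescribed architecture.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Assumes 28x28 grayscale images with 10 classes (e.g. MNIST); adjust
# the input shape and the final Dense layer for other datasets.
model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=3, activation="relu"),   # convolutional layer
    layers.MaxPooling2D(pool_size=2),                      # pooling layer
    layers.Conv2D(64, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(pool_size=2),
    layers.Flatten(),
    layers.Dropout(0.5),                                   # regularization
    layers.Dense(10, activation="softmax"),                # fully connected output
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()

# Training would then look like, for example:
# (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# x_train = x_train[..., None] / 255.0
# model.fit(x_train, y_train, epochs=5, validation_split=0.1)
```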
Recurrent Neural Network and Sequence Modelling
- Concrete Recurrent Neural Network Architectures 
- Simple RNN 
- Gated Architectures: LSTM
- Gated Architectures: GRU
- CBOW as an RNN
- Dropout in RNNs
- Gated Architectures: Other Variants
- Recurrent Neural Networks: Modeling Sequences and Stacks 
- Transducer 
- RNN Training 
- RNN Abstraction 
- Common RNN Usage-patterns 
- A Note on Reading the Literature 
- Encoder 
- Multi-layer (stacked) RNNs 
- RNNs for Representing Stacks 
- Acceptor 
- Bidirectional RNNs (biRNN) 
- Modeling with Recurrent Networks 
- Acceptors 
- RNN–CNN Document Classification 
- RNNs as Feature Extractors 
- Subject-verb Agreement Grammaticality Detection 
- Arc-factored Dependency Parsing 
- Part-of-speech Tagging 
- Sentiment Classification 
- Conditioned Generation 
- Applications 
- Sequence to Sequence Models 
- Syntactic Parsing 
- Morphological Inflection 
- Attention-based Models in NLP 
- Computational Complexity 
- Conditioned Generation with Attention 
- Machine Translation 
- Training Generators 
- Interpretability 
- Other Conditioning Contexts 
- Conditioned Generation (Encoder-Decoder) 
- Unsupervised Sentence Similarity 
- RNN Generators 
- Models for Sequence Analysis 
- Dissecting a Neural Translation Network 
- Beam Search and Global Normalization 
- Tackling seq2seq with Neural N-Grams
- Implementing a Sentiment Analysis Model 
- Long Short-Term Memory (LSTM) Units 
- Recurrent Neural Networks 
- Implementing a Part-of-Speech Tagger 
- A Case for Stateful Deep Learning Models 
- Solving seq2seq Tasks with Recurrent Neural Networks
- The Challenges with Vanishing Gradients 
- Augmenting Recurrent Networks with Attention 
- Dependency Parsing and SyntaxNet 
- TensorFlow Primitives for RNN Models 
- Analyzing Variable-Length Inputs 
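As a small, concrete instance of "Implementing a Sentiment Analysis Model" with LSTM units, the sketch below uses the IMDB review dataset that ships with Keras. The vocabulary size, sequence length, embedding width, and number of epochs are assumptions chosen only to keep the example short.

```python
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, max_len = 10_000, 200

# IMDB movie-review sentiment data ships with Keras; reviews are already
# encoded as integer word indices, and we pad them to a fixed length.
(x_train, y_train), _ = keras.datasets.imdb.load_data(num_words=vocab_size)
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_len)

model = keras.Sequential([
    layers.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, 64),   # word indices -> dense vectors
    layers.LSTM(64),                    # gated recurrent layer over the sequence
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=64, validation_split=0.2)
```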
Reinforcement Learning
- Policy Gradient Methods 
- Integrating Learning and Planning 
- Model-Free Control 
- Exploration and Exploitation 
- Markov Decision Processes 
- Case Study: RL in Classic Games 
- Introduction to Reinforcement Learning 
- Planning by Dynamic Programming 
- Model-Free Prediction 
- Value Function Approximation 
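Model-free control can be illustrated with tabular Q-learning on a toy problem. The corridor environment, reward scheme, and hyperparameters below are invented for the example; the point is the epsilon-greedy balance of exploration and exploitation and the Q-learning bootstrap update.

```python
import numpy as np

# A tiny 1-D corridor: states 0..4, start at 0, reward +1 for reaching state 4.
# Actions: 0 = left, 1 = right.  Minimal tabular Q-learning (model-free control).
n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    reward = 1.0 if next_state == n_states - 1 else 0.0
    done = next_state == n_states - 1
    return next_state, reward, done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: explore at random, otherwise exploit the current Q.
        action = rng.integers(n_actions) if rng.random() < epsilon else Q[state].argmax()
        next_state, reward, done = step(state, action)
        # Q-learning update: bootstrap from the best action in the next state.
        Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
        state = next_state

print(np.round(Q, 2))   # the learned values should favour moving right
```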
Deep Generative Models
- Deep Boltzmann Machines 
- Back-Propagation through Random Operations 
- Restricted Boltzmann Machines 
- Boltzmann Machines for Structured or Sequential Outputs 
- Generative Stochastic Networks 
- Boltzmann Machines 
- Other Boltzmann Machines 
- Other Generation Schemes 
- Directed Generative Nets 
- Boltzmann Machines for Real-Valued Data 
- Evaluating Generative Models 
- Drawing Samples from Autoencoders 
- Deep Belief Networks 
- Convolutional Boltzmann Machines 
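A restricted Boltzmann machine trained with one-step contrastive divergence (CD-1) fits in a short NumPy sketch. The two-pattern toy dataset, layer sizes, and learning rate below are illustrative assumptions; a practical RBM would use mini-batches and more careful monitoring.

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(0, 0.1, (n_visible, n_hidden))
b_v = np.zeros(n_visible)          # visible biases
b_h = np.zeros(n_hidden)           # hidden biases
lr = 0.1

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy binary data: two repeating patterns (illustrative only).
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 50, dtype=float)

for epoch in range(200):
    for v0 in data:
        # Positive phase: sample hidden units given the data.
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = (rng.random(n_hidden) < p_h0).astype(float)
        # Negative phase: one step of Gibbs sampling (this is CD-1).
        p_v1 = sigmoid(h0 @ W.T + b_v)
        v1 = (rng.random(n_visible) < p_v1).astype(float)
        p_h1 = sigmoid(v1 @ W + b_h)
        # Contrastive-divergence parameter updates.
        W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
        b_v += lr * (v0 - v1)
        b_h += lr * (p_h0 - p_h1)

print("reconstruction of first pattern:",
      np.round(sigmoid(sigmoid(data[0] @ W + b_h) @ W.T + b_v), 2))
```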
Crash Course in GPU
- Introduction to CUDA and OpenCL 
- Fundamentals of GPU Algorithms (Applications of Sort and Scan)
- Dynamic Parallelism 
- Optimizing GPU Programs 
- The GPU Hardware and Parallel Communication Patterns 
- Parallel Computation Patterns 
- The GPU Programming Model
- Parallel Optimization Patterns
- Fundamentals of GPU Algorithms (Reduce, Scan, Histograms)
- Deep Learning use of GPU 
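On the deep learning side, the frameworks hide most GPU details: TensorFlow, for example, places operations on a GPU automatically when one is visible. A minimal check-and-pin sketch, assuming TensorFlow 2.x is installed:

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; an empty list means CPU-only execution.
gpus = tf.config.list_physical_devices("GPU")
print("visible GPUs:", gpus)

# Placement is automatic, but an operation can also be pinned explicitly.
device = "/GPU:0" if gpus else "/CPU:0"
with tf.device(device):
    a = tf.random.normal((2048, 2048))
    b = tf.random.normal((2048, 2048))
    c = tf.matmul(a, b)    # a large matrix multiply: the kind of work GPUs excel at
print("result computed on:", c.device)
```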
Applications
- Other Applications 
- Computer Vision 
- Natural Language Processing 
- Large Scale Deep Learning 
- Speech Recognition 
Practical Methodology
- Selecting Hyperparameters 
- Debugging Strategies 
- Example: Facial Recognition
- Performance Metrics 
- Determining Whether to Gather More Data 
- Default Baseline Models 
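Performance metrics are usually computed with an off-the-shelf library rather than by hand. The sketch below assumes scikit-learn is available and uses made-up labels and predictions purely to show the calls.

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, confusion_matrix)

# Hypothetical true labels and model predictions for a binary task.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
```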
