**STATISTICS**

1. Exploratory Data Analysis

Elements of Structured Data

Further Reading

Rectangular Data

Data Frames and Indexes

Nonrectangular Data Structures

Further Reading

Estimates of Location

Mean

Median and Robust Estimates

Further Reading

Estimates of Variability

Standard Deviation and Related Estimates

Estimates Based on Percentiles

Further Reading

Exploring the Data Distribution

Percentiles and Boxplots

Frequency Tables and Histograms

Density Estimates

Further Reading

Exploring Binary and Categorical Data

Mode

Expected Value

Further Reading

Correlation

Scatterplots

Further Reading

Exploring Two or More Variables

Hexagonal Binning and Contours (Plotting Numeric versus Numeric Data)

Two Categorical Variables

Categorical and Numeric Data

Visualizing Multiple Variables

2. Data and Sampling Distributions

Random Sampling and Sample Bias

Bias

Random Selection

Size versus Quality: When Does Size Matter?

Sample Mean versus Population Mean

Further Reading

Selection Bias

Regression to the Mean

Further Reading

Sampling Distribution of a Statistic

Central Limit Theorem

Standard Error

Further Reading

The Bootstrap

Resampling versus Bootstrapping

Further Reading

Confidence Intervals

Further Reading

Normal Distribution

Standard Normal and QQ-Plots

Long-Tailed Distributions

Further Reading

Student’s t-Distribution

Further Reading

Binomial Distribution

Further Reading

Poisson and Related Distributions

Poisson Distributions

Exponential Distribution

Estimating the Failure Rate

Weibull Distribution

3. Statistical Experiments and Significance Testing

A/B Testing

Why Have a Control Group?

Why Just A/B? Why Not C, D…?

Further Reading

Hypothesis Tests

The Null Hypothesis

Alternative Hypothesis

One-Way, Two-Way Hypothesis Test

Further Reading

Resampling

Permutation Test

Exhaustive and Bootstrap Permutation Test

Permutation Tests: The Bottom Line for Data Science

Further Reading

Statistical Significance and P-Values

P-Value

Alpha

Type 1 and Type 2 Errors

Data Science and P-Values

Further Reading

t-Tests

Further Reading

Multiple Testing

Further Reading

Degrees of Freedom

Further Reading

ANOVA

F-Statistic

Two-Way ANOVA

Further Reading

Chi-Square Test

Chi-Square Test: A Resampling Approach

Chi-Square Test: Statistical Theory

Fisher’s Exact Test

Relevance for Data Science

Further Reading

Multi-Arm Bandit Algorithm

Further Reading

Power and Sample Size

Sample Size

4. Regression and Prediction

Simple Linear Regression

The Regression Equation

Fitted Values and Residuals

Least Squares

Prediction versus Explanation (Profiling)

Further Reading

Multiple Linear Regression

Assessing the Model

Cross-Validation

Model Selection and Stepwise Regression

Weighted Regression

Further Reading

Prediction Using Regression

The Dangers of Extrapolation

Confidence and Prediction Intervals

Factor Variables in Regression

Dummy Variables Representation

Factor Variables with Many Levels

Ordered Factor Variables

Interpreting the Regression Equation

Correlated Predictors

Multicollinearity

Confounding Variables

Interactions and Main Effects

Testing the Assumptions: Regression Diagnostics

Outliers

Influential Values

Heteroskedasticity, Non-Normality and Correlated Errors

Partial Residual Plots and Nonlinearity

Polynomial and Spline Regression

Polynomial

Splines

Generalized Additive Models

5. Classification

Naive Bayes

Why Exact Bayesian Classification Is Impractical

The Naive Solution

Numeric Predictor Variables

Further Reading

Discriminant Analysis

Covariance Matrix

Fisher’s Linear Discriminant

A Simple Example

Further Reading

Logistic Regression

Logistic Response Function and Logit

Logistic Regression and the GLM

Generalized Linear Models

Predicted Values from Logistic Regression

Interpreting the Coefficients and Odds Ratios

Linear and Logistic Regression: Similarities and Differences

Assessing the Model

Further Reading

Evaluating Classification Models

Confusion Matrix

The Rare Class Problem

Precision, Recall, and Specificity

ROC Curve

AUC

Lift

Further Reading

Strategies for Imbalanced Data

Undersampling

Oversampling and Up/Down Weighting

Data Generation

Cost-Based Classification

Exploring the Predictions

6. Statistical Machine Learning

K-Nearest Neighbors

A Small Example: Predicting Loan Default

Distance Metrics

One Hot Encoder

Standardization (Normalization, Z-Scores)

Choosing K

KNN as a Feature Engine

Tree Models

A Simple Example

The Recursive Partitioning Algorithm

Measuring Homogeneity or Impurity

Stopping the Tree from Growing

Predicting a Continuous Value

How Trees Are Used

Further Reading

Bagging and the Random Forest

Bagging

Random Forest

Variable Importance

Hyperparameters

Boosting

The Boosting Algorithm

XGBoost

Regularization: Avoiding Overfitting

Hyperparameters and Cross-Validation

7. Unsupervised Learning

Principal Components Analysis

A Simple Example

Computing the Principal Components

Interpreting Principal Components

Further Reading

K-Means Clustering

A Simple Example

K-Means Algorithm

Interpreting the Clusters

Selecting the Number of Clusters

Hierarchical Clustering

A Simple Example

The Dendrogram

The Agglomerative Algorithm

Measures of Dissimilarity

Model-Based Clustering

Multivariate Normal Distribution

Mixtures of Normals

Selecting the Number of Clusters

Further Reading

Scaling and Categorical Variables

Scaling the Variables

Dominant Variables

Categorical Data and Gower’s Distance

Problems with Clustering Mixed Data

**MACHINE LEARNING**

1. The Machine Learning Landscape

What Is Machine Learning?

Why Use Machine Learning?

Types of Machine Learning Systems

Supervised/Unsupervised Learning

Batch and Online Learning

Instance-Based Versus Model-Based Learning

Main Challenges of Machine Learning

Insufficient Quantity of Training Data

Nonrepresentative Training Data

Poor-Quality Data

Irrelevant Features

Overfitting the Training Data

Underfitting the Training Data

Stepping Back

Testing and Validating

2. End-to-End Machine Learning Project

Working with Real Data

Look at the Big Picture

Frame the Problem

Select a Performance Measure

Check the Assumptions

Get the Data

Create the Workspace

Download the Data

Take a Quick Look at the Data Structure

Create a Test Set

Discover and Visualize the Data to Gain Insights

Visualizing Geographical Data

Looking for Correlations

Experimenting with Attribute Combinations

Prepare the Data for Machine Learning Algorithms

Data Cleaning

Handling Text and Categorical Attributes

Custom Transformers

Feature Scaling

Transformation Pipelines

Select and Train a Model

Training and Evaluating on the Training Set

Better Evaluation Using Cross-Validation

Fine-Tune Your Model

Grid Search

Randomized Search

Ensemble Methods

Analyze the Best Models and Their Errors

Evaluate Your System on the Test Set

Launch, Monitor, and Maintain Your System

3. Classification

MNIST

Training a Binary Classifier

Performance Measures

Measuring Accuracy Using Cross-Validation

Confusion Matrix

Precision and Recall

Precision/Recall Tradeoff

The ROC Curve

Multiclass Classification

Error Analysis

Multilabel Classification

Multioutput Classification

4. Training Models

Linear Regression

The Normal Equation

Computational Complexity

Gradient Descent

Batch Gradient Descent

Stochastic Gradient Descent

Mini-batch Gradient Descent

Polynomial Regression

Learning Curves

Regularized Linear Models

Ridge Regression

Lasso Regression

Elastic Net

Early Stopping

Logistic Regression

Estimating Probabilities

Training and Cost Function

Decision Boundaries

Softmax Regression

5. Support Vector Machines

Linear SVM Classification

Soft Margin Classification

Nonlinear SVM Classification

Polynomial Kernel

Adding Similarity Features

Gaussian RBF Kernel

Computational Complexity

SVM Regression

Under the Hood

Decision Function and Predictions

Training Objective

Quadratic Programming

The Dual Problem

Kernelized SVM

Online SVMs

6. Decision Trees

Training and Visualizing a Decision Tree

Making Predictions

Estimating Class Probabilities

The CART Training Algorithm

Computational Complexity

Gini Impurity or Entropy?

Regularization Hyperparameters

Regression

Instability

7. Ensemble Learning and Random Forests

Voting Classifiers

Bagging and Pasting

Bagging and Pasting in Scikit-Learn

Out-of-Bag Evaluation

Random Patches and Random Subspaces

Random Forests

Extra-Trees

Feature Importance

Boosting

AdaBoost

Gradient Boosting

Stacking

8. Dimensionality Reduction

The Curse of Dimensionality

Main Approaches for Dimensionality Reduction

Projection

Manifold Learning

PCA

Preserving the Variance

Principal Components

Projecting Down to d Dimensions

Using Scikit-Learn

Explained Variance Ratio

Choosing the Right Number of Dimensions

PCA for Compression

Incremental PCA

Randomized PCA

Kernel PCA

Selecting a Kernel and Tuning Hyperparameters

LLE

9. Up and Running with TensorFlow

Installation

Creating Your First Graph and Running It in a Session

Managing Graphs

Lifecycle of a Node Value

Linear Regression with TensorFlow

Implementing Gradient Descent

Manually Computing the Gradients

Using autodiff

Using an Optimizer

Feeding Data to the Training Algorithm

Saving and Restoring Models

Visualizing the Graph and Training Curves Using TensorBoard

Name Scopes

Modularity

Sharing Variables

10. Artificial Neural Networks

From Biological to Artificial Neurons

Biological Neurons

Logical Computations with Neurons

The Perceptron

Multi-Layer Perceptron and Backpropagation

Training an MLP with TensorFlow’s High-Level API

Training a DNN Using Plain TensorFlow

Construction Phase

Execution Phase

Using the Neural Network

Fine-Tuning Neural Network Hyperparameters

Number of Hidden Layers

Number of Neurons per Hidden Layer

Activation Functions

11. Training Deep Neural Nets

Vanishing/Exploding Gradients Problems

Xavier and He Initialization

Nonsaturating Activation Functions

Batch Normalization

Gradient Clipping

Reusing Pretrained Layers

Reusing a TensorFlow Model

Reusing Models from Other Frameworks

Freezing the Lower Layers

Caching the Frozen Layers

Tweaking, Dropping, or Replacing the Upper Layers

Model Zoos

Unsupervised Pretraining

Pretraining on an Auxiliary Task

Faster Optimizers

Momentum Optimization

Nesterov Accelerated Gradient

AdaGrad

RMSProp

Adam Optimization

Learning Rate Scheduling

Avoiding Overfitting Through Regularization

Early Stopping

ℓ1 and ℓ2 Regularization

Dropout

Max-Norm Regularization

Data Augmentation

Practical Guidelines

12. Distributing TensorFlow Across Devices and Servers

Multiple Devices on a Single Machine

Installation

Managing the GPU RAM

Placing Operations on Devices

Parallel Execution

Control Dependencies

Multiple Devices Across Multiple Servers

Opening a Session

The Master and Worker Services

Pinning Operations Across Tasks

Sharding Variables Across Multiple Parameter Servers

Sharing State Across Sessions Using Resource Containers

Asynchronous Communication Using TensorFlow Queues

Loading Data Directly from the Graph

Parallelizing Neural Networks on a TensorFlow Cluster

One Neural Network per Device

In-Graph Versus Between-Graph Replication

Model Parallelism

Data Parallelism

13. Convolutional Neural Networks

The Architecture of the Visual Cortex

Convolutional Layer

Filters

Stacking Multiple Feature Maps

TensorFlow Implementation

Memory Requirements

Pooling Layer

CNN Architectures

LeNet-5

AlexNet

GoogLeNet

ResNet

14. Recurrent Neural Networks

Recurrent Neurons

Memory Cells

Input and Output Sequences

Basic RNNs in TensorFlow

Static Unrolling Through Time

Dynamic Unrolling Through Time

Handling Variable-Length Input Sequences

Handling Variable-Length Output Sequences

Training RNNs

Training a Sequence Classifier

Training to Predict Time Series

Creative RNN

Deep RNNs

Distributing a Deep RNN Across Multiple GPUs

Applying Dropout

The Difficulty of Training over Many Time Steps

LSTM Cell

Peephole Connections

GRU Cell

Natural Language Processing

Word Embeddings

An Encoder–Decoder Network for Machine Translation

15. Autoencoders

Efficient Data Representations

Performing PCA with an Undercomplete Linear Autoencoder

Stacked Autoencoders

TensorFlow Implementation

Tying Weights

Training One Autoencoder at a Time

Visualizing the Reconstructions

Visualizing Features

Unsupervised Pretraining Using Stacked Autoencoders

Denoising Autoencoders

TensorFlow Implementation

Sparse Autoencoders

TensorFlow Implementation

Variational Autoencoders

Generating Digits

Other Autoencoders

16. Reinforcement Learning

Learning to Optimize Rewards

Policy Search

Introduction to OpenAI Gym

Neural Network Policies

Evaluating Actions: The Credit Assignment Problem

Policy Gradients

Markov Decision Processes

Temporal Difference Learning and Q-Learning

Exploration Policies

Approximate Q-Learning and Deep Q-Learning

Learning to Play Ms. Pac-Man Using the DQN Algorithm

**DEEP LEARNING**

1 Linear Algebra

Scalars, Vectors, Matrices and Tensors

Multiplying Matrices and Vectors

Identity and Inverse Matrices

Linear Dependence and Span

Norms

Special Kinds of Matrices and Vectors

Eigendecomposition

Singular Value Decomposition

The Moore-Penrose Pseudoinverse

The Trace Operator

The Determinant

2 Probability and Information Theory

Why Probability?

Random Variables

Probability Distributions

Marginal Probability

Conditional Probability

The Chain Rule of Conditional Probabilities

Independence and Conditional Independence

Expectation, Variance and Covariance

Common Probability Distributions

Useful Properties of Common Functions

Bayes’ Rule

Technical Details of Continuous Variables

Information Theory

Structured Probabilistic Models

3 Numerical Computation

Overflow and Underflow

Poor Conditioning

Gradient-Based Optimization

Constrained Optimization

4 Machine Learning Basics

Learning Algorithms

Capacity, Overfitting and Underfitting

Hyperparameters and Validation Sets

Estimators, Bias and Variance

Maximum Likelihood Estimation

Bayesian Statistics

Supervised Learning Algorithms

Unsupervised Learning Algorithms

Stochastic Gradient Descent

Building a Machine Learning Algorithm

Challenges Motivating Deep Learning

5 Deep Feedforward Networks

Gradient-Based Learning

Hidden Units

Architecture Design

Back-Propagation and Other Differentiation Algorithms

Historical Notes

6 Regularization for Deep Learning

Parameter Norm Penalties

Norm Penalties as Constrained Optimization

Regularization and Under-Constrained Problems

Dataset Augmentation

Noise Robustness

Semi-Supervised Learning

Multi-Task Learning

Early Stopping

Parameter Tying and Parameter Sharing

Sparse Representations

Bagging and Other Ensemble Methods

Dropout

Adversarial Training

Tangent Distance, Tangent Prop, and Manifold Tangent Classifier

7 Optimization for Training Deep Models

How Learning Differs from Pure Optimization

Challenges in Neural Network Optimization

Basic Algorithms

Parameter Initialization Strategies

Algorithms with Adaptive Learning Rates

Approximate Second-Order Methods

Optimization Strategies and Meta-Algorithms

8 Convolutional Networks

The Convolution Operation

Motivation

Pooling

Convolution and Pooling as an Infinitely Strong Prior

Variants of the Basic Convolution Function

Structured Outputs

Data Types

Efficient Convolution Algorithms

Random or Unsupervised Features

The Neuroscientific Basis for Convolutional Networks

Convolutional Networks and the History of Deep Learning

9 Sequence Modeling: Recurrent and Recursive Nets

Unfolding Computational Graphs

Recurrent Neural Networks

Bidirectional RNNs

Encoder-Decoder Sequence-to-Sequence Architectures

Deep Recurrent Networks

Recursive Neural Networks

The Challenge of Long-Term Dependencies

Echo State Networks

Leaky Units and Other Strategies for Multiple Time Scales

The Long Short-Term Memory and Other Gated RNNs

Optimization for Long-Term Dependencies

Explicit Memory

10 Practical Methodology

Performance Metrics

Default Baseline Models

Determining Whether to Gather More Data

Selecting Hyperparameters

Debugging Strategies

11 Applications

Large-Scale Deep Learning

Computer Vision

Speech Recognition

Natural Language Processing

12 Linear Factor Models

Probabilistic PCA and Factor Analysis

Independent Component Analysis (ICA)

Slow Feature Analysis

Sparse Coding

Manifold Interpretation of PCA

13 Autoencoders

Undercomplete Autoencoders

Regularized Autoencoders

Representational Power, Layer Size and Depth

Stochastic Encoders and Decoders

Denoising Autoencoders

Learning Manifolds with Autoencoders

Contractive Autoencoders

Predictive Sparse Decomposition

Applications of Autoencoders

14 Representation Learning

Greedy Layer-Wise Unsupervised Pretraining

Transfer Learning and Domain Adaptation

Semi-Supervised Disentangling of Causal Factors

Distributed Representation

Exponential Gains from Depth

Providing Clues to Discover Underlying Causes

15 Structured Probabilistic Models for Deep Learning

The Challenge of Unstructured Modeling

Using Graphs to Describe Model Structure

Sampling from Graphical Models

Advantages of Structured Modeling

Learning about Dependencies

Inference and Approximate Inference

The Deep Learning Approach to Structured Probabilistic Models

16 Monte Carlo Methods

Sampling and Monte Carlo Methods

Importance Sampling

Markov Chain Monte Carlo Methods

Gibbs Sampling

The Challenge of Mixing between Separated Modes

17 Confronting the Partition Function

The Log-Likelihood Gradient

Stochastic Maximum Likelihood and Contrastive Divergence

Pseudolikelihood

Score Matching and Ratio Matching

Denoising Score Matching

Noise-Contrastive Estimation

Estimating the Partition Function

18 Approximate Inference

Inference as Optimization

Expectation Maximization

MAP Inference and Sparse Coding

Variational Inference and Learning

Learned Approximate Inference

19 Deep Generative Models

Boltzmann Machines

Restricted Boltzmann Machines

Deep Belief Networks

Deep Boltzmann Machines

Boltzmann Machines for Real-Valued Data

Convolutional Boltzmann Machines

Boltzmann Machines for Structured or Sequential Outputs

Other Boltzmann Machines

Back-Propagation through Random Operations

Directed Generative Nets

Drawing Samples from Autoencoders

Generative Stochastic Networks

Other Generation Schemes

Evaluating Generative Models

**COMPUTER VISION**

1 Basic Image Handling and Processing

PIL – the Python Imaging Library

Matplotlib

NumPy

SciPy

2 Local Image Descriptors

Harris Corner Detector

SIFT – Scale-Invariant Feature Transform

Matching Geotagged Images

3 Image to Image Mappings

Homographies

Warping Images

Creating Panoramas

4 Camera Models and Augmented Reality

The Pin-hole Camera Model

Camera Calibration

Pose Estimation from Planes and Markers

Augmented Reality

5 Multiple View Geometry

Epipolar Geometry

Computing with Cameras and 3D Structure

Multiple View Reconstruction

Stereo Images

6 Clustering Images

K-means Clustering

Hierarchical Clustering

Spectral Clustering

7 Searching Images

Content-based Image Retrieval

Visual Words

Indexing Images

Searching the Database for Images

Ranking Results Using Geometry

Building Demos and Web Applications

8 Classifying Image Content

K-Nearest Neighbors

Bayes Classifier

Support Vector Machines

Optical Character Recognition

9 Image Segmentation

Graph Cuts

Segmentation using Clustering

Variational Methods

10 OpenCV

The OpenCV Python Interface

OpenCV Basics

Processing Video

Tracking

**TENSORFLOW**

1. Up and Running with TensorFlow

2. Understanding TensorFlow Basics

3. Convolutional Neural Networks

4. Working with Text and Sequences, and TensorBoard Visualization

5. Word Vectors, Advanced RNN, and Embedding Visualization

6. TensorFlow Abstractions and Simplifications

7. Queues, Threads and Reading Data

8. Distributed TensorFlow

9. Exporting and Serving Models with TensorFlow

**NATURAL LANGUAGE PROCESSING**

1. Language Processing and Python

2. Accessing Text Corpora and Lexical Resources

3. Processing Raw Text

4. Writing Structured Programs

5. Categorizing and Tagging Words

6. Learning to Classify Text

7. Extracting Information from Text

8. Analyzing Sentence Structure

9. Building Feature-Based Grammars

10. Analyzing the Meaning of Sentences

11. Managing Linguistic Data

**CONVOLUTIONAL NEURAL NETWORKS (CNN)**

1. Rosenblatt’s Perceptron

2. Model Building through Regression

3. The Least-Mean-Square Algorithm

4. Multilayer Perceptrons

5. Kernel Methods and Radial-Basis Function Networks

6. Support Vector Machines

7. Regularization Theory

8. Principal-Components Analysis

9. Self-Organizing Maps

10. Information-Theoretic Learning Models

11. Stochastic Methods Rooted in Statistical Mechanics

12. Dynamic Programming

13. Neurodynamics

14. Bayesian Filtering for State Estimation of Dynamic Systems

15. Dynamically Driven Recurrent Networks

**POC**
