Enroll in the Master Data Science Course Program in San Jose, CA, US, designed by industry experts. Become a master in statistics, analytics, data science, big data, AI, machine learning, and deep learning. The Data Science training helps you master data mining, management, and exploration.

Course Overview

Data science is an interdisciplinary field that applies scientific methods, processes, algorithms, and systems to extract knowledge or insights from data. The Data Science training equips candidates with fundamental data mining and extraction skills.

Learning Objectives

The Data Science course teaches the following skills:

  • Statistics
  • Hypothesis testing
  • Clustering
  • Decision trees
  • Linear and Logistic regression
  • RStudio
  • Data Visualization
  • Regression models
  • Hadoop
  • Spark
  • SAS Macros
  • Statistical procedures
  • Advanced analytics
  • Matplotlib
  • Excel analytics functions
  • Zookeeper
  • Kafka interfaces


The program provides access to high-quality eLearning content, simulation exams, an expert-moderated community, and other resources that support your path to a data scientist role.


Prerequisites: basic knowledge of statistics and a basic understanding of any programming language.

Course Curriculum

  • Topics Covered:

    Analytics Overview

    • Introduction
    • Introduction to Business Analytics
    • Types of Analytics
    • Areas of Analytics
    • Analytical Tools
    • Analytical Techniques

    Introduction to SAS

    • Introduction
    • What is SAS
    • Navigating in the SAS Console
    • SAS Language Input Files
    • DATA Step
    • PROC Step and DATA Step
    • DATA Step Processing
    • SAS Libraries
    • Importing Data
    • Exporting Data

    Combining and Modifying Datasets

    • Introduction
    • Why Combine or Modify Data
    • Concatenating Datasets
    • Interleaving Method
    • One-to-One Reading
    • One-to-One Merging
    • Data Manipulation
    • Modifying Variable Attributes


    PROC SQL

    • Introduction
    • What is PROC SQL
    • Retrieving Data from a Table
    • Selecting Columns in a Table
    • Retrieving Data from Multiple Tables
    • Selecting Data from Multiple Tables
    • Concatenating Query Results
    • Activity

    SAS Macros

    • Introduction
    • Need for SAS Macros
    • Macro Functions
    • Macro Functions Examples
    • SQL Clauses for Macros
    • The %MACRO Statement
    • The Conditional Statement

    Basics of Statistics

    • Introduction to Statistics
    • Statistical Terms
    • Procedures in SAS for Descriptive Statistics
    • Descriptive Statistics
    • Hypothesis Testing
    • Variable Types
    • Hypothesis Testing Process
    • Parametric and Non-parametric Tests
    • Parametric Tests
    • Non-parametric Tests
    • Parametric Tests – Advantages and Disadvantages

    Statistical Procedures

    • Introduction to Statistical Procedures
    • PROC MEANS
    • PROC CORR Options
    • PROC REG
    • PROC REG Options

    Data Exploration

    • Introduction
    • Data Preparation
    • General Comments and Observations on Data Cleaning
    • Data Type Conversion
    • Character Functions
    • SCAN Function
    • Date/Time Functions
    • Missing Value Treatment
    • Various Functions to Handle Missing Value
    • Data Summarization

    Advanced Statistics

    • Introduction
    • Introduction to Cluster
    • Clustering Methodologies
    • K Means Clustering
    • Decision Tree
    • Regression
    • Logistic Regression

    Working with Time Series Data

    • Introduction
    • Need for Time Series Analysis
    • Time Series Analysis — Options
    • Reading Date and Datetime Values
    • White Noise Process
    • Stationarity of a Time Series
    • Stages of ARIMA Modelling
    • Transforming, Transposing, and Interpolating Time Series Data

    Designing Optimization Models

    • Introduction
    • Need for Optimization
    • Optimization Problems

  • Topics Covered:

    Introduction to Business Analytics

    • Introduction
    • Objectives
    • Need for Business Analytics
    • Business Decisions
    • Introduction to Business Analytics
    • Features of Business Analytics
    • Types of Business Analytics
    • Descriptive Analytics
    • Predictive Analytics
    • Prescriptive Analytics
    • Supply Chain Analytics
    • Health Care Analytics
    • Marketing Analytics
    • Human Resource Analytics
    • Web Analytics
    • Application of Business Analytics
    • Business Decisions
    • Business Intelligence (BI)
    • Data Science
    • Importance of Data Science
    • Data Science as a Strategic Asset
    • Big Data
    • Analytical Tools

    Introduction to R

    • Introduction
    • Objectives
    • An Introduction to R
    • Comprehensive R Archive Network (CRAN)
    • Cons of R
    • Companies Using R
    • Understanding R
    • Installing R on Various Operating Systems
    • Installing R on Windows from CRAN Website
    • Install R
    • IDEs for R
    • Installing RStudio on Various Operating Systems
    • Install RStudio
    • Steps in R Initiation
    • Benefits of R Workspace
    • Setting the Workspace
    • Functions and Help in R
    • Access the Help Document
    • R Packages
    • Installing an R Package
    • Install and Load a Package

    R Data Structure

    • Introduction
    • Objectives
    • Types of Data Structures in R
    • Vectors
    • Create a Vector
    • Scalars
    • Colon Operator
    • Accessing Vector Elements
    • Matrices
    • Accessing Matrix Elements
    • Create a Matrix
    • Arrays
    • Accessing Array Elements
    • Create an Array
    • Data Frames
    • Elements of Data Frames
    • Create a Data Frame
    • Factors
    • Create a Factor
    • Lists
    • Create a List
    • Importing Files in R
    • Importing an Excel File
    • Importing a Minitab File
    • Importing a Table File
    • Importing a CSV File
    • Read Data from a File
    • Exporting Files from R

    Apply Functions

    • Introduction
    • Objectives
    • Types of Apply Functions
    • apply() Function
    • lapply() Function
    • sapply() Function
    • tapply() Function
    • vapply() Function
    • mapply() Function
    • dplyr Package
    • Installing the dplyr Package
    • Functions of the dplyr Package
    • Functions of the dplyr Package – select()
    • Use the select() Function
    • Functions of the dplyr Package – filter()
    • Use the filter() Function
    • Use the select() Function
    • Functions of the dplyr Package – arrange()
    • Use the arrange() Function
    • Functions of the dplyr Package – mutate()
    • Functions of the dplyr Package – summarise()
    • Use the summarise() Function

    Data Visualization

    • Introduction
    • Objectives
    • Graphics in R
    • Types of Graphics
    • Bar Charts
    • Creating Simple Bar Charts
    • Editing a Simple Bar Chart
    • Create a Stacked Bar Plot and Grouped Bar Plot
    • Pie Charts
    • Editing a Pie Chart
    • Create a Pie Chart
    • Histograms
    • Creating a Histogram
    • Kernel Density Plots
    • Creating a Kernel Density Plot
    • Create Histograms and a Density Plot
    • Line Charts
    • Creating a Line Chart
    • Box Plots
    • Creating a Box Plot
    • Create Line Graphs and a Box Plot
    • Heat Maps
    • Creating a Heat Map
    • Create a Heatmap
    • Word Clouds
    • Creating a Word Cloud
    • File Formats for Graphics Outputs
    • Saving a Graphic Output as a File
    • Save Graphics to a File
    • Exporting Graphs in RStudio
    • Exporting Graphs as PDFs in RStudio
    • Save Graphics Using RStudio

    Introduction to Statistics

    • Introduction
    • Objectives
    • Basics of Statistics
    • Types of Data
    • Qualitative vs. Quantitative Analysis
    • Types of Measurements in Order
    • Nominal Measurement
    • Ordinal Measurement
    • Interval Measurement
    • Ratio Measurement
    • Statistical Investigation
    • Normal Distribution
    • Example of Normal Distribution
    • Importance of Normal Distribution in Statistics
    • Use of the Symmetry Property of Normal Distribution
    • Standard Normal Distribution
    • Use Probability Distribution Functions
    • Distance Measures
    • Distance Measures – A Comparison
    • Euclidean Distance
    • Example of Euclidean Distance
    • Manhattan Distance
    • Minkowski Distance
    • Mahalanobis Distance
    • Cosine Similarity
    • Correlation
    • Correlation Measures Explained
    • Pearson Product Moment Correlation (PPMC)
    • Pearson Correlation
    • dist() Function in R
    • Perform the Distance Matrix Computations

    Hypothesis Testing I

    • Introduction
    • Objectives
    • Hypothesis
    • Need for Hypothesis Testing in Businesses
    • Null Hypothesis
    • Alternate Hypothesis
    • Null vs. Alternate Hypothesis
    • Chances of Errors in Sampling
    • Types of Errors
    • Contingency Table
    • Decision Making
    • Critical Region
    • Level of Significance
    • Confidence Coefficient
    • Beta Risk
    • Power of Test
    • Factors Affecting the Power of Test
    • Types of Statistical Hypothesis Tests
    • Upper Tail Test
    • Test Statistic
    • Factors Affecting Test Statistic
    • Critical Value Using Normal Probability Table

    Hypothesis Testing II

    • Introduction
    • Objectives
    • Parametric Tests
    • Z-Test
    • Z-Test in R
    • T-Test
    • T-Test in R
    • Use Normal and Student Probability Distribution Functions
    • Testing Null Hypothesis
    • Objectives of Null Hypothesis Test
    • Three Types of Hypothesis Tests
    • Hypothesis Tests About Population Means
    • Decision Rules
    • Hypothesis Tests About Population Means
    • Hypothesis Tests About Population Proportions
    • Chi-Square Test
    • Steps of Chi-Square Test
    • Degree of Freedom
    • Chi-Square Test for Independence
    • Chi-Square Test for Goodness of Fit
    • Chi-Square Test for Independence
    • Chi-Square Test in R
    • Use Chi-Squared Test Statistics
    • Introduction to ANOVA Test
    • One-Way ANOVA Test
    • The F-Distribution and F-Ratio
    • F-Ratio Test
    • F-Ratio Test in R
    • One-Way ANOVA Test
    • One-Way ANOVA Test in R
    • Perform ANOVA

    Regression Analysis

    • Introduction
    • Objectives
    • Introduction to Regression Analysis
    • Use of Regression Analysis
    • Types of Regression Analysis
    • Simple Regression Analysis
    • Multiple Regression Models
    • Simple Linear Regression Model
    • Perform Simple Linear Regression
    • Correlation
    • Correlation Between X and Y
    • Find Correlation
    • Method of Least Squares Regression Model
    • Coefficient of Multiple Determination Regression Model
    • Standard Error of the Estimate Regression Model
    • Dummy Variable Regression Model
    • Interaction Regression Model
    • Non-Linear Regression
    • Non-Linear Regression Models
    • Perform Regression Analysis with Multiple Variables
    • Non-Linear Models to Linear Models
    • Algorithms for Complex Non-Linear Models


    Classification

    • Objectives
    • Introduction to Classification
    • Examples of Classification
    • Classification vs. Prediction
    • Classification System
    • Classification Process
    • Classification Process – Model Construction
    • Classification Process – Model Usage in Prediction
    • Issues Regarding Classification and Prediction
    • Data Preparation Issues
    • Evaluating Classification Methods Issues
    • Decision Tree
    • Decision Tree – Dataset
    • Classification Rules of Trees
    • Overfitting in Classification
    • Tips to Find the Final Tree Size
    • Basic Algorithm for a Decision Tree
    • Statistical Measure – Information Gain
    • Calculating Information Gain for Continuous-Value Attributes
    • Enhancing a Basic Tree
    • Decision Trees in Data Mining
    • Model a Decision Tree
    • Naive Bayes Classifier Model
    • Features of Naive Bayes Classifier Model
    • Bayesian Theorem
    • Naive Bayes Classifier
    • Applying Naive Bayes Classifier
    • Naive Bayes Classifier – Advantages and Disadvantages
    • Perform Classification Using the Naive Bayes Method
    • Nearest Neighbor Classifiers
    • Computing Distance and Determining Class
    • Choosing the Value of K
    • Scaling Issues in Nearest Neighbor Classification
    • Support Vector Machines
    • Advantages of Support Vector Machines
    • Geometric Margin in SVMs
    • Linear SVMs
    • Non-Linear SVMs
    • Support Vector Machine


    Clustering

    • Introduction
    • Objectives
    • Introduction to Clustering
    • Clustering vs. Classification
    • Use Cases of Clustering
    • Clustering Models
    • K-means Clustering
    • K-means Clustering Algorithm
    • Pseudo Code of K-means
    • K-means Clustering Using R
    • K-means Clustering
    • Perform Clustering Using K-means
    • Hierarchical Clustering
    • Hierarchical Clustering Algorithms
    • Requirements of Hierarchical Clustering Algorithms
    • Agglomerative Clustering Process
    • Perform Hierarchical Clustering
    • DBSCAN Clustering
    • Concepts of DBSCAN
    • DBSCAN Clustering Algorithm
    • DBSCAN in R
    • DBSCAN Clustering


    Association Rule Mining

    • Introduction
    • Objectives
    • Association Rule Mining
    • Application Areas of Association Rule Mining
    • Parameters of Interesting Relationships
    • Association Rules
    • Association Rule Strength Measures
    • Limitations of Support and Confidence
    • Apriori Algorithm
    • Applying Apriori Algorithm
    • Step 1 – Mine All Frequent Item Sets
    • Algorithm to Find Frequent Item Set
    • Ordering Items
    • Candidate Generation
    • Step 2 – Generate Rules from Frequent Item Sets
    • Perform Association Using the Apriori Algorithm
    • Perform Visualization on Associated Rules
    • Problems with Association Mining

  • Topics Covered:

    Introduction to Big Data and Hadoop Ecosystem

    • Introduction
    • Overview of Big Data and Hadoop
    • Hadoop Ecosystem

    HDFS and YARN

    • Introduction
    • HDFS Architecture and Components
    • Block Replication Architecture
    • YARN Introduction

    MapReduce and Sqoop

    • Introduction
    • Why MapReduce
    • Small Data and Big Data
    • Data Types in Hadoop
    • Joins in MapReduce
    • What is Sqoop

    Basics of Hive and Impala

    • Introduction
    • Interacting with Hive and Impala
    • Working with Hive and Impala
    • Data Types in Hive
    • Validation of Data
    • What is HCatalog and Its Uses

    Types of Data Formats

    • Introduction
    • Types of File Format
    • Data Serialization
    • Importing MySQL and Creating Hive to
    • Parquet With Sqoop

    Advanced Hive Concept and Data File Partitioning

    • Introduction
    • Overview of the Hive Query Language

    Apache Flume and HBase

    • Introduction
    • Introduction to HBase


    Apache Pig

    • Introduction
    • Getting Datasets for Pig Development

    Basics of Apache Spark

    • Introduction
    • Spark – Architecture, Execution, and Related Concepts
    • RDD Operations
    • Functional Programming in Spark

    RDDs in Spark

    • Introduction
    • RDD Data Types and RDD Creation
    • Operations in RDDs

    Implementation of Spark Applications

    • Introduction
    • Running Spark on YARN
    • Running a Spark Application
    • Dynamic Resource Allocation
    • Configuring Your Spark Application

    Spark Parallel Processing

    • Introduction
    • Parallel Operations on Partitions

    Spark RDD Optimization Techniques

    • Introduction
    • RDD Persistence

    Spark Algorithm

    • Introduction
    • Spark: An Iterative Algorithm
    • Introduction To Graph Parallel System
    • Introduction To Machine Learning
    • Introduction To Three C’s

    Spark SQL

    • Introduction
    • Interoperating with RDDs

    Apache Kafka

    Core Java

  • Topics Covered:

    Data Science

    • Introduction to Data Science
    • Different Sectors Using Data Science
    • Purpose and Components of Python

    Data Analytics

    • Data Analytics Process
    • Exploratory Data Analysis (EDA)
    • EDA – Quantitative Technique
    • EDA – Graphical Technique
    • Data Analytics Conclusion or Predictions
    • Data Analytics Communication
    • Data Types for Plotting
    • Data Types and Plotting

    Statistical Analysis and Business Applications

    • Introduction to Statistics
    • Statistical and Non-statistical Analysis
    • Major Categories of Statistics
    • Statistical Analysis Considerations
    • Population and Sample
    • Statistical Analysis Process
    • Data Distribution
    • Dispersion
    • Histogram
    • Testing

    Python Environment Setup and Essentials

    • Anaconda
    • Installation of Anaconda Python Distribution
    • Data Types with Python
    • Basic Operators and Functions

    Mathematical Computing with Python (NumPy)

    • Introduction to NumPy
    • Activity-Sequence it Right
    • Creating and Printing an array
    • Class and Attributes of array
    • Basic Operations
    • Activity-Slice It
    • Copy and Views
    • Mathematical Functions of NumPy

    Scientific Computing with Python (SciPy)

    • Introduction to SciPy
    • SciPy Sub Package – Integration and Optimization
    • SciPy Sub Package
    • Calculate Eigenvalues and Eigenvectors
    • SciPy Sub Package – Statistics, Weave and IO

    Data Manipulation with Pandas

    • Introduction to Pandas
    • Understanding DataFrame
    • View and Select Data
    • Missing Values
    • Data Operations
    • File Read and Write Support
    • Pandas SQL Operation

    Machine Learning with Scikit-Learn

    • Machine Learning Approach
    • How it Works
    • Supervised Learning Model Considerations
    • Scikit-Learn
    • Supervised Learning Models – Linear Regression
    • Supervised Learning Models – Logistic Regression
    • Unsupervised Learning Models
    • Pipeline
    • Model Persistence and Evaluation

    Natural Language Processing with Scikit-Learn

    • NLP Overview
    • NLP Applications
    • NLP Libraries – Scikit-Learn
    • Extraction Considerations
    • Scikit-Learn – Model Training and Grid Search

    Data Visualization in Python using matplotlib

    • Introduction to Data Visualization
    • Line Properties
    • (x,y) Plot and Subplots
    • Types of Plots

    Web Scraping with BeautifulSoup

    • Web Scraping and Parsing
    • Understanding and Searching the Tree
    • Navigating options
    • Navigating a Tree
    • Modifying the Tree
    • Parsing and Printing the Document

    Python integration with Hadoop MapReduce and Spark

    • Why Big Data Solutions are Provided for Python
    • Hadoop Core Components
    • Python Integration with HDFS using Hadoop Streaming
    • Using Hadoop Streaming for Calculating Word Count
    • Python Integration with Spark using PySpark
    • Using PySpark to Determine Word Count

    Python Basics

  • Topics Covered:

    Introduction to Business Analytics

    • Introduction
    • What Is in It for Me
    • Types of Analytics
    • Areas of Analytics

    Formatting, Conditional Formatting, and Important Functions

    • Introduction
    • What Is in It for Me
    • Custom Formatting Introduction
    • Conditional Formatting Introduction
    • Logical Functions
    • Lookup and Reference Functions
    • VLOOKUP Function
    • HLOOKUP Function
    • MATCH Function
    • INDEX and OFFSET Functions
    • Statistical Functions
    • SUMIFS Function
    • COUNTIFS Function
    • STDEV, MEDIAN, and RANK Functions

    Analyzing Data with Pivot Tables

    • Introduction
    • What Is in It for Me
    • Pivot Table Introduction
    • Concept Video of Creating a Pivot Table
    • Grouping in Pivot Table Introduction
    • Custom Calculation
    • Calculated Field and Calculated Item
    • Slicer Intro
    • Creating a Slicer

    Dashboarding

    • Introduction
    • What Is in It for Me
    • What is a Dashboard
    • Principles of Great Dashboard Design
    • How to Create Chart in Excel
    • Chart Formatting
    • Thermometer Chart
    • Pareto Chart
    • Form Controls in Excel
    • Interactive Dashboard with Form Controls
    • Chart with Checkbox
    • Interactive Chart

    Business Analytics With Excel

    • Introduction
    • What Is in It for Me
    • Concept Video Histogram
    • Concept Video Solver Add-in
    • Concept Video Goal Seek
    • Concept Video Scenario Manager
    • Concept Video Data Table
    • Concept Video Descriptive Statistics

    Data Analysis Using Statistics

    • Introduction
    • Moving Average
    • Hypothesis Testing
    • ANOVA
    • Covariance
    • Correlation
    • Regression
    • Normal Distribution

    Power BI

    • Introduction
    • Power Pivot
    • Power View
    • Power Query
    • Power Map

    Microsoft Power BI Desktop

    Microsoft Power BI Recipes

  • Topics Covered:

    Machine Learning Introduction

    • Techniques of Machine Learning
    • Supervised Learning
    • Unsupervised Learning
    • Semi-supervised Learning and Reinforcement Learning
    • Some Important Considerations in Machine Learning

    Data Preprocessing

    • Data Preparation
    • Feature engineering
    • Feature scaling
    • Datasets
    • Dimensionality reduction

    Math Refresher

    • Eigenvalues, Eigenvectors, and Eigendecomposition
    • Concepts of Linear Algebra
    • Introduction to Calculus
    • Probability and Statistics


    Regression

    • Regression and Its Types
    • Linear Regression: Equations and Algorithms


    Classification

    • Logistic regression
    • K-nearest neighbours
    • Support Vector Machines
    • Kernel SVM
    • Naive Bayes
    • Decision tree classifier
    • Random forest classifier

    Unsupervised learning – Clustering

    • K-Means Clustering
    • Clustering Algorithms

    Introduction to Deep Learning

    • Meaning and importance of deep learning
    • Artificial Neural networks
    • TensorFlow

    Introduction to Artificial Intelligence and Machine Learning

    • Artificial Intelligence
    • Machine Learning
    • Machine Learning Algorithms
    • Applications of Machine Learning

    Python Programming for Beginners

    Python Django From Scratch

  • Topics Covered:

    Introduction to Deep Learning with TensorFlow

    • Introduction to TensorFlow
    • Intro to TensorFlow
    • Computational Graph
    • Key highlights
    • Creating a Graph
    • Regression example
    • Gradient Descent
    • TensorBoard
    • Modularity
    • Sharing Variables
    • Keras


    Perceptrons

    • What is a Perceptron
    • XOR Gate

    Activation Functions

    • Sigmoid
    • ReLU
    • Hyperbolic Functions
    • Softmax

    Artificial Neural Networks

    • Introduction
    • Perceptron Training Rule
    • Gradient Descent Rule

    Gradient Descent and Backpropagation

    • Gradient Descent
    • Stochastic Gradient Descent
    • Backpropagation
    • Some problems in ANN

    Optimization and Regularization

    • Overfitting and Capacity
    • Cross-Validation
    • Feature Selection
    • Regularization
    • Hyperparameters

    Intro to Convolutional Neural Networks

    • Intro to CNNs
    • Kernel filter
    • Principles behind CNNs
    • Multiple Filters
    • CNN applications

    Intro to Recurrent Neural Networks

    • Intro to RNNs
    • Unfolded RNNs
    • Seq2Seq RNNs
    • LSTM
    • RNN

    Deep Learning applications

    • Image Processing
    • Natural Language Processing
    • Speech Recognition
    • Video Analytics
