Course Content

Data Science with R Syllabus

Module 1: Introduction to Data Science

  • What is Data Science?
  • What is Machine Learning?
  • What is Deep Learning?
  • What is AI?
  • Data Analytics & its types

Module 2: Introduction to R 

  • What is R?
  • Why R?
  • Installing R
  • R environment
  • How to get help in R
  • RStudio Overview

Module 3: R Basics 

  • Environment setup
  • Data Types
  • Variables
  • Vectors
  • Lists
  • Matrices
  • Arrays
  • Factors
  • Data Frames
  • Loops
  • Packages
  • Functions
  • Built-in Data Sets

Module 4: R Packages  

  • Data Visualization
    • DataExplorer
    • esquisse
  • Machine Learning
    • mlr
    • parsnip
    • ranger
    • purrr
  • Other Miscellaneous R Packages
    • rtweet
    • reticulate
  • More R Packages!
    • installr
    • githubinstall

Module 5: Importing Data 

  • Reading data from CSV files
  • Writing data to CSV files
  • Reading data from an RDBMS
  • Writing data to an RDBMS

Module 6: Manipulating Data

  • Selecting rows/observations
  • Rounding numbers
  • Selecting columns/fields
  • Merging data
  • Data aggregation
  • Data munging techniques

Module 7: Statistics Basics 

  • Central Tendency
    • Mean
    • Median
    • Mode
    • Skewness
    • Normal Distribution
  • Probability Basics
    • What is meant by probability?
    • Types of Probability
    • Odds ratio
  • Standard Deviation
    • Data deviation & distribution
    • Variance
  • Bias-variance tradeoff
    • Underfitting
    • Overfitting
  • Distance metrics
    • Euclidean Distance
    • Manhattan Distance
  • Outlier analysis
    • What is an Outlier?
    • Interquartile Range (IQR)
    • Box & whisker plot
    • Upper Whisker
    • Lower Whisker
    • Scatter plot
    • Cook’s Distance
  • Missing value treatment
    • What is NA?
    • Central Imputation
    • KNN imputation
    • Dummification
  • Correlation
    • Pearson correlation
    • Positive & Negative correlation

Module 8: Error Metrics 

  • Classification
    • Confusion Matrix
    • Precision
    • Recall
    • Specificity
    • F1 Score
  • Regression
    • MSE
    • RMSE
    • MAPE

Module 9: Machine Learning - Supervised Learning

  • Linear Regression
    • Linear Equation
    • Slope
    • Intercept
    • R-squared value
  • Logistic regression
    • Odds ratio
    • Probability of success
    • Probability of failure
    • ROC curve
    • Bias-variance tradeoff

Module 10: Machine Learning - Unsupervised Learning

  • K-Means
  • Hierarchical Clustering

Module 11: Machine Learning using R 

  • Random forest
  • Naïve Bayes