Data Science Training Program

2014 was the year of Big Data: anyone who knew the concepts could find a place in the field. If you missed the bus in 2014, now is the right time to become competent in Data Science and get into Machine Learning projects.

Python and R are the universal choices for Data Science programming. Any professional with knowledge of SQL, Python, Java, or Advanced Excel can quickly get into Data Science and apply Machine Learning algorithms to solve their problems.

The Data Science job market is diverse. The most popular job descriptions are in Data Science, Data Analytics, Data Engineering, and Statistical Inference. The skills required are Python, R, SQL, and Machine Learning. SPLASH offers customized Data Science with Python training in Pallikaranai, Chennai. We also cover more advanced topics such as hypothesis testing, selecting the right model for a problem, parameter optimization, and evaluating model accuracy.

The course is designed to work through real-world solutions to Machine Learning problems. SPLASH covers all aspects of practical Data Science: loading a dataset, exploratory analysis, pre-processing, feature extraction, statistical inference, and data visualization.
We encourage participants to bring their own data for exploration.

Data Science Training in Pallikaranai, Chennai

Course Content


Module 1 – Introduction to Data Science

  • Let's understand: What is Data Science?
  • Concepts of Machine Learning
  • Deep Learning vs Machine Learning
  • Finally: Artificial Intelligence (AI)

Module 2 – Introduction to Python

  • Why Python for Data Science
  • Creating a Python Environment - Windows/Ubuntu/macOS
  • Python Package Managers - pip vs Conda
  • IDE - Jupyter Notebook Overview

Module 3 – Programming Fundamentals in Python

  • Python Basic Data types
  • List
  • Tuples
  • Dictionaries
  • Python Set
  • Indexing & Slicing
  • Selection by position & Labels
  • If Statements
  • Loops and Nested loops
  • Creating Functions in Python

Module 4 – Python Packages for Data Science

  • Pandas
  • NumPy
  • Matplotlib
  • scikit-learn
  • SciPy

Module 5 – How to Load Machine Learning Data in Python

  • Importing data from CSV/TSV files
  • Exporting data to CSV files
  • Python Package - Pickle
  • Saving Python Objects
  • Loading data from Python Objects
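
The topics in this module can be sketched with Python's standard library alone. The example below is a minimal illustration, not the course material itself: the sample data, the column names (name, spend), and the helper functions load_rows and rows_to_csv are all assumptions made for the example. It parses CSV text, writes it back out, and saves and reloads a Python object with pickle.

```python
import csv
import io
import pickle

# Hypothetical sample data; a real dataset would normally be read from a file.
CSV_TEXT = "name,spend\nasha,120.5\nravi,80.0\n"


def load_rows(text):
    """Parse CSV text into a list of dicts, converting 'spend' to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [{"name": r["name"], "spend": float(r["spend"])} for r in reader]


def rows_to_csv(rows):
    """Write the rows back out as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "spend"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = load_rows(CSV_TEXT)

# Save a Python object as bytes, then load it back unchanged.
blob = pickle.dumps(rows)
restored = pickle.loads(blob)
```

For TSV data, the same reader and writer accept delimiter="\t". Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.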

Module 6 – Data Manipulation in Pandas

  • Selecting rows / ranges of observations
  • Rounding / Absolute numbers
  • Selecting Columns / Attributes
  • Merging Data in Pandas
  • Data munging techniques

Module 7 – Data Pre-processing

  • Missing Value analysis
    • What are NA / NaN / NULL values
    • Data Imputation
    • Mode
  • Data Normalization
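
As a rough sketch of two of the ideas listed above, the example below fills missing entries with the mode and rescales numbers to the range 0 to 1. It rests on several assumptions: None stands in for NA / NaN / NULL, min-max scaling is used as one common form of normalization (the outline does not say which form the course teaches), and the function names and sample values are made up for illustration.

```python
from statistics import mode


def impute_with_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data with two missing entries.
cities = ["Chennai", None, "Chennai", "Madurai", None]
filled = impute_with_mode(cities)

scaled = min_max_normalize([10, 20, 30, 50])
```

Mode imputation suits categorical columns like the one shown; for numeric columns the mean or median is often used instead.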

Module 8 – Statistical Inference

  • Central Tendency
    • Mean
    • Median
    • Mode
  • Statistical Data Dispersion analysis
    • Range
    • Data Variance
    • Standard Deviation
    • Data Skewness
  • Outlier analysis
    • What is an Outlier
    • Outlier influence on central tendency
  • Detecting Outliers
    • Seaborn boxplot (box-and-whisker plot)
    • Using Interquartile Range (IQR)
    • Using Z-Score
  • How to Treat Outliers
    • What is Data Transformation
    • Log Data Transformation
    • Winsorization Transformation
    • To drop or not to drop
  • Interpreting Correlation
    • Bivariate analysis
    • Pearson Correlation
    • Strength of the association
    • Direction of the relationship
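
The two outlier detection rules named in this module can be sketched as follows. This is a minimal illustration using only the standard library, under stated assumptions: the 1.5 × IQR fence and the Z-score threshold of 3 are common conventions rather than values taken from the course, quartiles use the default method of statistics.quantiles, and the spend values are invented.

```python
from statistics import mean, pstdev, quantiles


def iqr_outliers(values, k=1.5):
    """Flag values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def z_score_outliers(values, threshold=3.0):
    """Flag values whose absolute Z-score exceeds the threshold."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # No spread at all, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sample: six ordinary values and one extreme value.
spend = [10, 12, 11, 13, 12, 11, 95]

by_iqr = iqr_outliers(spend)
by_z_default = z_score_outliers(spend)
by_z_relaxed = z_score_outliers(spend, threshold=2.0)
```

On this sample the IQR rule flags 95, while the Z-score rule with the default threshold flags nothing: the outlier itself inflates the standard deviation, and with only seven points no value can reach a Z-score of 3. Lowering the threshold to 2 flags 95. This is one reason the IQR rule is often preferred for small samples.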

Module 9 – Error Metrics for Machine Learning Models

  • Regression
    • MAE
    • MSE
    • RMSE
    • MAPE
  • Classification
    • Confusion Matrix
    • Precision
    • Recall - Sensitivity
    • Specificity
    • F1-Score
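
The metrics listed above follow directly from their textbook definitions. The sketch below computes them in plain Python; the function names and sample numbers are hypothetical. It assumes that no actual value is zero (needed for MAPE), that labels are 1 for positive and 0 for negative, and that every denominator is non-zero.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for paired values."""
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # Assumes no actual value is zero.
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


def classification_metrics(actual, predicted):
    """Return confusion-matrix counts and derived scores for 0/1 labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)  # also called sensitivity
    return {
        "TP": tp, "TN": tn, "FP": fp, "FN": fn,
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical sample values.
reg = regression_metrics([100, 200, 400], [110, 190, 400])
clf = classification_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```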

Data Science Machine Learning Training in Pallikaranai, Chennai


Module 10 – Supervised Machine Learning

  • Linear Regression
    • When to use Linear Regression
    • Univariate Prediction
    • Live Project : Prediction
    • Multivariate analysis
  • Logistic regression
    • Binary Classification
    • Model evaluation
    • Probability of success / failure
    • Live Project : Anomaly Detection
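
As an illustration of univariate prediction, the sketch below fits a straight line by ordinary least squares using the closed-form formulas for slope and intercept. It is a teaching sketch under assumptions: the function names and data points are hypothetical, and in practice a library such as scikit-learn would usually do the fitting.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
forecast = predict(5, slope, intercept)
```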

Module 11 – Unsupervised Machine Learning

  • Understanding Unsupervised Learning
  • What is Data Clustering Analysis
  • K-Means Clustering
    • Finding the centroid
    • Evaluate model with Test Data
    • Live Project : Customer spend analysis
  • Hierarchical Clustering
    • Agglomerative hierarchical clustering
    • Divisive hierarchical clustering
    • Live Project : Customer spend analysis
    • K-Means vs Hierarchical - Result analysis
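
The "finding the centroid" step of K-Means can be sketched in plain Python as below. This is a simplified illustration, not the course's implementation: the starting centroids are supplied by the caller instead of being chosen at random, an empty cluster simply keeps its old centroid, and the points and function names are hypothetical.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]


def update(points, labels, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    moved = []
    for k, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == k]
        if not members:
            moved.append(old)  # empty cluster: keep the previous centroid
            continue
        moved.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
    return moved


def k_means(points, centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        updated = update(points, assign(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical data: two well-separated groups of customers.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, labels = k_means(points, [(0, 0), (10, 10)])
```

Here the first three points end up in cluster 0 and the last three in cluster 1, with each centroid at the mean of its group.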

Module 12 – Classification algorithms in Machine Learning

  • K-Nearest Neighbour
  • Naive Bayes Classifier
  • Decision Tree
  • Support Vector Machines
  • Live Project : News Article Classification
  • Random Forest
  • Live Project : Twitter data analysis

Since we started working with Big Data, we have understood the pain of applying traditional Machine Learning algorithms to it. Running Machine Learning algorithms to find insights in an enormous volume of data is a time-consuming process.

Like us, anyone who needs data science solutions from Big Data must pick the right platform. PySpark, the Python API for Apache Spark, is designed to perform data analysis at scale. Our Data Science with PySpark for Big Data training program is designed to cover both Big Data and Machine Learning.

SPLASH is a data training institute in Pallikaranai, Chennai. We have been providing data training online for the last 4 years. We have clients across the globe, and our training institute is open round the clock.

We also offer Big Data and NoSQL training. At SPLASH, our focus is DATA. Check out our list of data training courses.