Welcome to Introduction to Data Analytics! This course will guide you through the essential techniques for working with data, equipping you with skills used by data experts across industries. You’ll explore how to clean and preprocess data using Python libraries like Pandas and NumPy, laying the groundwork for effective data analysis.
We’ll dive into exploratory data analysis (EDA), where you’ll uncover hidden patterns and insights. You’ll also be introduced to key machine learning algorithms for predicting outcomes and solving real-world problems. Along the way, we’ll cover best practices for evaluating your models and ensuring their reliability.
The course also includes hands-on projects to solidify your learning and practical exercises to apply your skills. By the end, you’ll have a robust toolkit for approaching data-driven challenges confidently, whether you're advancing your career or tackling new opportunities in the data field. Join us on this learning journey!
This module provides a comprehensive introduction to data analytics, covering its definition, importance, key components, and industry applications. Students will learn to apply the four types of data analytics (descriptive, diagnostic, predictive, and prescriptive) to solve business problems and make data-driven decisions. They will also analyse real-world use cases, challenges, and future trends in data analytics across various domains. Additionally, the students will gain an understanding of structured, unstructured, semi-structured, quantitative, and qualitative data from primary, secondary, internal, and external sources, and learn how to apply this knowledge to data analytics projects.
What's included
19 videos, 5 readings, 16 assignments
19 videos•Total 108 minutes
Meet Your Instructor - Prof. Seetha Parameswaran•2 minutes
Meet Your Instructor - Prof. Aneesh Chivukula•1 minute
Course Introductory Video•3 minutes
Definition of Data Analytics•7 minutes
The Importance of Data Analytics•6 minutes
Key Components •6 minutes
Descriptive Analytics•6 minutes
Diagnostic Analytics•6 minutes
Predictive Analytics•6 minutes
Prescriptive Analytics•6 minutes
Industry Applications •7 minutes
Challenges in Data Analytics•6 minutes
Structured Data•6 minutes
Unstructured Data•6 minutes
Semi-Structured Data•7 minutes
Quantitative Data•7 minutes
Qualitative Data•7 minutes
Primary and Secondary Data Sources•6 minutes
Internal and External Data Sources•6 minutes
5 readings•Total 70 minutes
Course Overview•10 minutes
Essential Reading: Data Analytics Process•15 minutes
Essential Reading: Skills Required and Tools and Technologies Used in Data Analytics•15 minutes
Essential Reading: Use Cases and Applications of Data Analytics•15 minutes
Essential Reading: Examples of Data and Data Sources•15 minutes
16 assignments•Total 51 minutes
Definition of Data Analytics•6 minutes
The Importance of Data Analytics•3 minutes
Key Components •3 minutes
Descriptive Analytics•3 minutes
Diagnostic Analytics•3 minutes
Predictive Analytics•3 minutes
Prescriptive Analytics•3 minutes
Industry Applications •3 minutes
Challenges in Data Analytics•3 minutes
Structured Data•3 minutes
Unstructured Data•3 minutes
Semi-Structured Data•3 minutes
Quantitative Data•3 minutes
Qualitative Data•3 minutes
Primary and Secondary Data Sources•3 minutes
Internal and External Data Sources•3 minutes
Python Fundamentals
Module 2•5 hours to complete
Module details
This module focuses on essential Python concepts and techniques for data analytics. The module introduces basic Python concepts, such as the Python interpreter, Jupyter Notebook, input/output, and indentation, enabling students to start developing Python programs for data analytics. Students will learn to apply Python scalar types, objects, attributes, methods, and operators to create and manipulate data structures. They will also apply control statements and iterations, such as conditional statements and loops, to control the flow of execution and process data efficiently. The module covers the use of regular and lambda functions to create reusable and modular code. Additionally, students will learn to apply file-handling techniques to read from and write to files, facilitating data persistence and external data processing. By the end of this module, students will have the necessary Python skills to perform data manipulation, analysis, and processing tasks.
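As a rough illustration of how the topics above fit together, the following minimal sketch combines a lambda function, a regular function, a loop, and file handling. The sample readings, the file name `readings.txt`, and the helper names `to_fahrenheit` and `summarise` are assumptions made for this example and are not taken from the course materials.

```python
import os
import tempfile

# Lambda function: a short anonymous function, here converting Celsius to Fahrenheit.
to_fahrenheit = lambda c: c * 9 / 5 + 32

# Regular function: named, documented, and reusable.
def summarise(readings):
    """Return the minimum, maximum, and mean of a list of numbers."""
    return min(readings), max(readings), sum(readings) / len(readings)

celsius = [18.5, 21.0, 24.5]  # hypothetical sample readings
fahrenheit = [to_fahrenheit(c) for c in celsius]

# Writing: store one value per line in a temporary file.
path = os.path.join(tempfile.mkdtemp(), "readings.txt")
with open(path, "w") as f:
    for value in fahrenheit:
        f.write(f"{value}\n")

# Reading: load the values back and convert each line to a float.
with open(path) as f:
    loaded = [float(line) for line in f]

low, high, mean = summarise(loaded)
print(f"min={low}, max={high}, mean={mean:.2f}")
```

The `with` statement closes the file automatically, which is why no explicit `close()` call appears.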
What's included
21 videos, 5 readings, 17 assignments
21 videos•Total 126 minutes
Python Interpreter•7 minutes
Jupyter Python•6 minutes
Input and Print•6 minutes
Indentations•6 minutes
Lesson 1 Demo•4 minutes
Python Scalar Types •6 minutes
Objects •5 minutes
Attributes•6 minutes
Methods•5 minutes
Operators•6 minutes
Lesson 2 Demo•12 minutes
Conditional Statement•6 minutes
Nested Conditional Statement•5 minutes
For and While Loops•6 minutes
Lesson 3 Demo•9 minutes
Regular Functions•7 minutes
Lambda Functions•7 minutes
Lesson 4 Demo•4 minutes
Reading Files•6 minutes
Writing Files•6 minutes
Lesson 5 Demo•3 minutes
5 readings•Total 55 minutes
Essential Reading: Indentations in Python•15 minutes
Essential Reading: Operator Precedence and Indentation in Python•10 minutes
Essential Reading: Control Statements and Iterations in Python•10 minutes
Essential Reading: Handling Functions•10 minutes
Essential Reading: Handling Files•10 minutes
17 assignments•Total 108 minutes
Graded Quiz for Week 1 and 2•60 minutes
Python Interpreter•3 minutes
Jupyter Python•3 minutes
Input and Print•3 minutes
Indentations•3 minutes
Python Scalar Types •3 minutes
Objects •3 minutes
Attributes•3 minutes
Methods•3 minutes
Operators•3 minutes
Conditional Statement•3 minutes
Nested Conditional Statement•3 minutes
For and While Loops•3 minutes
Regular Functions•3 minutes
Lambda Functions•3 minutes
Reading Files•3 minutes
Writing Files•3 minutes
Fundamental Data Structures and NumPy in Python
Module 3•4 hours to complete
Module details
This module explores essential data structures in Python, covering both immutable and mutable types and the powerful NumPy library. Students will learn to apply tuples and strings, along with their methods, to store and manipulate fixed data. They will also apply lists, dictionaries, and sets, as well as their respective methods and operations, to handle changeable data effectively. The module introduces NumPy, enabling students to create, manipulate, and perform arithmetic operations on NumPy arrays using built-in functions. By the end of this module, students will have a solid understanding of Python data structures and NumPy, equipping them with the necessary tools for efficient data manipulation and numerical computations in data analytics tasks.
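The difference between immutable and mutable types described above can be sketched with built-in Python alone. This example deliberately leaves NumPy out so that it runs without extra packages; all variable names and sample values are assumptions chosen for illustration.

```python
# Tuple: immutable, so item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
except TypeError:
    tuple_is_immutable = True
else:
    tuple_is_immutable = False

# String: also immutable; methods return a new string.
name = "data analytics"
title = name.title()

# List: mutable, supports in-place changes and slicing.
scores = [72, 88, 95]
scores.append(60)
top_two = sorted(scores, reverse=True)[:2]

# Dictionary: mutable mapping, here used to count occurrences.
counts = {}
for word in ["a", "b", "a"]:
    counts[word] = counts.get(word, 0) + 1

# Set: mutable collection of unique items with algebraic operations.
python_users = {"asha", "ben", "chen"}
numpy_users = {"ben", "chen", "dev"}
both = python_users & numpy_users      # intersection
either = python_users | numpy_users    # union

print(tuple_is_immutable, title, top_two, counts, sorted(both))
```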
What's included
18 videos, 3 readings, 15 assignments
18 videos•Total 123 minutes
Tuple•7 minutes
Tuple Methods •5 minutes
Strings•6 minutes
Accessing Strings•6 minutes
Lesson 1 Demo•6 minutes
Lists •5 minutes
Slicing List•6 minutes
List Methods•7 minutes
Dictionary•7 minutes
Set•6 minutes
Set Operations•6 minutes
Lesson 2 Demo•11 minutes
NumPy Arrays•7 minutes
NumPy Data Types•8 minutes
Arithmetic with NumPy•6 minutes
Indexing and Slicing Arrays•6 minutes
NumPy Functions•7 minutes
Lesson 3 Demo•12 minutes
3 readings•Total 45 minutes
Essential Reading: Immutable Data Structures•15 minutes
Essential Reading: Mutable Data Structures•15 minutes
Essential Reading: NumPy Library•15 minutes
15 assignments•Total 45 minutes
Tuple•3 minutes
Tuple Methods•3 minutes
Strings•3 minutes
Accessing Strings•3 minutes
Lists •3 minutes
Slicing List•3 minutes
List Methods•3 minutes
Dictionary•3 minutes
Set - Practice Quiz•3 minutes
Set Operations•3 minutes
NumPy Arrays•3 minutes
NumPy Data Types•3 minutes
Arithmetic with NumPy•3 minutes
Indexing and Slicing Arrays•3 minutes
NumPy Functions•3 minutes
Exploratory Data Analysis (EDA)
Module 4•5 hours to complete
Module details
This module focuses on exploratory data analysis (EDA) and visualisation using the Pandas library and Matplotlib in Python. Students will learn to apply Pandas to create, manipulate, and perform operations on Series and DataFrame objects, enabling efficient data analysis and preprocessing. They will conduct EDA to identify patterns, trends, and relationships in the data. Additionally, students will apply Matplotlib to create informative and visually appealing plots to effectively communicate insights derived from EDA. By the end of this module, students will have the skills to perform comprehensive exploratory data analysis and create meaningful visualisations using Python.
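One relationship measure used in EDA, correlation and covariance, can be sketched in plain Python. This is a dependency-free illustration rather than the Pandas workflow the module teaches; the function names and the `hours` and `marks` data are hypothetical. The covariance here divides by n - 1 (the sample convention).

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

hours = [1, 2, 3, 4, 5]        # hypothetical study hours
marks = [52, 55, 61, 64, 68]   # hypothetical exam marks

print(f"covariance  = {covariance(hours, marks):.2f}")
print(f"correlation = {correlation(hours, marks):.3f}")
```

Covariance depends on the units of both variables, whereas correlation is unitless and bounded between -1 and 1, which makes it easier to compare across pairs of variables.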
What's included
18 videos, 3 readings, 16 assignments
18 videos•Total 138 minutes
Series •8 minutes
DataFrame•6 minutes
Indexing a DataFrame•4 minutes
Selection in a DataFrame•5 minutes
Filtering a DataFrame•5 minutes
Operations on a DataFrame•4 minutes
Lesson 1 Demo•16 minutes
Descriptive Statistics for Numerical Data•8 minutes
Descriptive Statistics for Categorical Data•8 minutes
Data Relationship: Correlation and Covariance•6 minutes
Univariate Analysis•4 minutes
Bivariate Analysis•4 minutes
Lesson 2 Demo•12 minutes
Scatter Plots•8 minutes
Line Plots•8 minutes
Bar Plots•9 minutes
Histograms•7 minutes
Lesson 3 Demo•15 minutes
3 readings•Total 45 minutes
Essential Reading: Pandas Library•15 minutes
Essential Reading: EDA•15 minutes
Essential Reading: EDA Visualisation Using Matplotlib•15 minutes
16 assignments•Total 105 minutes
Graded Quiz for Week 3 and 4•60 minutes
Series •3 minutes
DataFrame•3 minutes
Indexing a DataFrame•3 minutes
Selection in a DataFrame•3 minutes
Filtering a DataFrame•3 minutes
Operations on a DataFrame•3 minutes
Descriptive Statistics for Numerical Data•3 minutes
Descriptive Statistics for Categorical Data•3 minutes
Data Relationship: Correlation and Covariance•3 minutes
Univariate Analysis•3 minutes
Bivariate Analysis•3 minutes
Scatter Plots•3 minutes
Line Plots•3 minutes
Bar Plots•3 minutes
Histograms•3 minutes
Data Cleaning and Preparation
Module 5•4 hours to complete
Module details
This module focuses on data preprocessing techniques essential for preparing data for analysis. Students will learn to apply methods for reading and writing data in text format while identifying and addressing data quality issues. They will handle missing data by filtering out or filling in missing values and applying various data transformation techniques such as removing duplicates, mapping, replacing values, discretisation, outlier detection and filtering, and encoding categorical variables. Additionally, students will apply data aggregation techniques, including grouping, aggregation and combining functions, to summarise and analyse data. By the end of this module, students will have the skills to preprocess and clean datasets effectively, ensuring data quality and readiness for further analysis.
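The cleaning steps listed above can be sketched end to end on a tiny dataset. This example uses plain Python dictionaries instead of Pandas so that it runs anywhere; the records, the age bands, and the `age_band` helper are assumptions made for illustration only.

```python
records = [
    {"id": 1, "age": 34, "city": "Pune"},
    {"id": 2, "age": None, "city": "Goa"},
    {"id": 2, "age": None, "city": "Goa"},   # exact duplicate of the previous row
    {"id": 3, "age": 58, "city": "Pune"},
]

# Removing duplicates: keep the first occurrence of each identical row.
seen = set()
unique = []
for row in records:
    key = (row["id"], row["age"], row["city"])
    if key not in seen:
        seen.add(key)
        unique.append(row)

# Filling in missing data: replace a missing age with the mean of the known ages.
known = [r["age"] for r in unique if r["age"] is not None]
fill = sum(known) / len(known)
cleaned = [{**r, "age": r["age"] if r["age"] is not None else fill} for r in unique]

# Discretisation: map each age to a band (the cut points are arbitrary).
def age_band(age):
    if age < 40:
        return "under 40"
    if age < 60:
        return "40 to 59"
    return "60 and over"

for r in cleaned:
    r["band"] = age_band(r["age"])

# Split-apply-combine: group ages by city, average each group, collect the results.
groups = {}
for r in cleaned:
    groups.setdefault(r["city"], []).append(r["age"])
mean_age_by_city = {city: sum(ages) / len(ages) for city, ages in groups.items()}

print(cleaned)
print(mean_age_by_city)
```

Filling with the mean is only one option; filtering out rows with missing values is the alternative the module also covers, and the better choice depends on how much data is missing and why.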
Essential Reading: Data Transformations•15 minutes
Essential Reading: Data Aggregation•15 minutes
16 assignments•Total 48 minutes
Reading Data from Text Format•3 minutes
Writing Data to Text Format•3 minutes
Data Quality Issues•3 minutes
Filtering out Missing Data•3 minutes
Filling in Missing Data•3 minutes
Removing Duplicates•3 minutes
Transforming Data Using Mapping•3 minutes
Replacing Value•3 minutes
Discretisation and Binning•3 minutes
Encoding Categorical Data•3 minutes
Detecting Outliers•3 minutes
Filtering Outliers•3 minutes
Split - Apply - Combine•3 minutes
Split Step•3 minutes
Apply Step•3 minutes
Combine Step•3 minutes
Feature Engineering
Module 6•8 hours to complete
Module details
This module focuses on advanced data preprocessing techniques for handling large and complex datasets. Students will learn to apply data reduction techniques, including dimensionality reduction, numerosity reduction, and sampling methods, to reduce the size and complexity of datasets while preserving important information. They will also apply feature selection techniques, such as filter methods, wrapper methods, and embedded methods, to identify and select the most relevant features for data analysis. Additionally, students will explore feature extraction techniques, including Principal Component Analysis (PCA) and Covariance Analysis, to transform and extract new, informative features from the original dataset. By the end of this module, students will have the skills to effectively preprocess and optimise datasets for improved performance and insights in data analysis tasks.
What's included
13 videos, 3 readings, 14 assignments, 1 ungraded lab
13 videos•Total 99 minutes
Dimensionality Reduction•8 minutes
Numerosity Reduction•9 minutes
Sampling Methods•5 minutes
Filter Methods•6 minutes
Correlation Based Filters•15 minutes
Entropy-Based Filters•5 minutes
Wrapper Methods•7 minutes
Forward Selection•7 minutes
Backward Elimination•7 minutes
Embedded Methods•6 minutes
Mutual Information•10 minutes
Covariance Analysis•6 minutes
Principal Component Analysis•7 minutes
3 readings•Total 170 minutes
Essential Reading: Data Reduction•50 minutes
Essential Reading: Feature Selection•60 minutes
Essential Reading: Feature Extraction•60 minutes
14 assignments•Total 138 minutes
Graded Quiz for Week 5 and 6•60 minutes
Dimensionality Reduction•6 minutes
Numerosity Reduction•6 minutes
Sampling Methods•6 minutes
Filter Methods•6 minutes
Correlation Based Filters•6 minutes
Entropy-Based Filters•6 minutes
Wrapper Methods•6 minutes
Forward Selection•6 minutes
Backward Elimination•6 minutes
Embedded Methods•6 minutes
Mutual Information•6 minutes
Covariance Analysis•6 minutes
Principal Component Analysis•6 minutes
1 ungraded lab•Total 60 minutes
Practice Lab: ML Engineering•60 minutes
Regression
Module 7•6 hours to complete
Module details
This module focuses on regression analysis, a fundamental technique in predictive modeling and data analysis. Students will learn to apply linear regression techniques, including univariate and multivariate linear models, to analyse and model the relationship between dependent and independent variables in real-world applications. They will also apply model fitting techniques, such as gradient descent, and evaluate regression models using appropriate metrics to select the best-performing model for a given dataset. Additionally, students will explore nonlinear regression techniques, including smoothing methods, regularised models, robust regression, and nonlinear models, to capture and model complex, nonlinear relationships between variables. By the end of this module, students will have the skills to effectively apply regression techniques to solve real-world problems and make data-driven predictions.
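To make the idea of fitting a model by gradient descent concrete, here is a minimal sketch for a univariate linear model y = w*x + b, evaluated with mean squared error. The function names, learning rate, epoch count, and data are assumptions chosen for this example; the course may present the method differently.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point under the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of MSE with respect to w and b.
        grad_w = (2 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def mse(xs, ys, w, b):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]   # generated from y = 2x + 1, so the fit should recover w = 2, b = 1

w, b = fit_line(xs, ys)
print(f"w={w:.4f}, b={b:.4f}, mse={mse(xs, ys, w, b):.2e}")
```

The learning rate matters: too large a value makes the updates diverge, while too small a value needs many more epochs to converge.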
This module focuses on classification techniques, specifically rule-based and parameter-based models. Students will learn to apply decision trees to solve binary and multilabel classification problems and evaluate the performance of these models. They will explore decision tree induction algorithms, considering design issues and measures of impurity, as well as random forests, to build effective and interpretable models. Students will also apply model selection techniques, such as cross-validation, and address overfitting issues to optimise decision tree models and visualise decision boundaries. Additionally, they will learn to apply logistic regression and discriminant analysis, both parameter-based models, to solve classification problems and evaluate their performance. By the end of this module, students will have the skills to effectively apply classification techniques to real-world problems and make data-driven predictions.
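Measures of impurity are what a decision tree induction algorithm uses to score candidate splits. The sketch below shows two common measures, Gini impurity and entropy, together with the weighted impurity of a two-way split. The function names and labels are hypothetical, and this is an illustration of the general idea rather than the specific algorithm taught in the course.

```python
import math
from collections import Counter

def class_probabilities(labels):
    """Fraction of samples belonging to each class."""
    total = len(labels)
    return [count / total for count in Counter(labels).values()]

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class probabilities."""
    return 1 - sum(p ** 2 for p in class_probabilities(labels))

def entropy(labels):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in class_probabilities(labels))

def weighted_impurity(left, right, measure):
    """Impurity of a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)

mixed = ["yes", "yes", "no", "no"]   # evenly mixed node
pure = ["yes", "yes", "yes"]         # node with a single class

print(gini(mixed), entropy(mixed))
print(gini(pure))
print(weighted_impurity(["yes", "yes"], ["no", "no"], gini))
```

A split that separates the classes perfectly has a weighted impurity of zero, so a tree-building procedure would prefer it over any split that leaves the children mixed.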
What's included
16 videos, 4 readings, 17 assignments, 1 ungraded lab
16 videos•Total 82 minutes
Applications•5 minutes
Binary Classification •5 minutes
Multiclass Classification•5 minutes
Building Decision Trees - Part 1•5 minutes
Building Decision Trees - Part 2•2 minutes
Design Issues•5 minutes
Measures of Impurity - Part 1•4 minutes
Measures of Impurity - Part 2•4 minutes
Cross-Validation•6 minutes
Overfitting•5 minutes
Random Forests•5 minutes
Decision Boundaries•9 minutes
Logistic Regression•4 minutes
Discriminant Analysis•4 minutes
Classifier’s Performance Evaluation - Part 1•8 minutes
Classifier’s Performance Evaluation - Part 2•5 minutes
4 readings•Total 240 minutes
Essential Reading: Rule Based Models•60 minutes
Essential Reading: Decision Tree Induction Algorithms•60 minutes
Essential Reading: Model Selection in Decision Trees•60 minutes
Essential Reading: Parameter Based Models•60 minutes
17 assignments•Total 156 minutes
Graded Quiz for Week 7 and 8•60 minutes
Applications•6 minutes
Binary Classification •6 minutes
Multiclass Classification•6 minutes
Building Decision Trees - Part 1•6 minutes
Building Decision Trees - Part 2•6 minutes
Design Issues•6 minutes
Measures of Impurity - Part 1•6 minutes
Measures of Impurity - Part 2•6 minutes
Cross-Validation•6 minutes
Overfitting•6 minutes
Random Forests•6 minutes
Decision Boundaries•6 minutes
Logistic Regression•6 minutes
Discriminant Analysis•6 minutes
Classifier’s Performance Evaluation - Part 1•6 minutes
Classifier’s Performance Evaluation - Part 2•6 minutes
1 ungraded lab•Total 60 minutes
Practice Lab: Model Optimization•60 minutes
Clustering
Module 9•5 hours to complete
Module details
This module focuses on unsupervised learning techniques for clustering, which aim to discover natural groupings and patterns in data without prior knowledge of class labels. Students will learn to apply partitional clustering techniques, specifically the k-Means algorithm, considering similarity measures, distance matrices, and cluster goodness evaluation. They will also explore hierarchical clustering methods, both bottom-up agglomerative and top-down divisive, to create nested clusters and analyse data at different levels of granularity. Additionally, students will apply cluster validation techniques, including external and internal indices, to assess the quality of clustering results and determine the optimal number of clusters for a given dataset. By the end of this module, students will have the skills to effectively apply clustering techniques to real-world problems and gain insights from unlabeled data.
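The k-Means algorithm mentioned above alternates between assigning points to their nearest centroid and recomputing each centroid as the mean of its cluster. The following minimal sketch works on 2D points with squared Euclidean distance. The function name, the data, and the choice of starting centroids are assumptions for illustration; practical implementations usually choose starting centroids more carefully.

```python
def k_means(points, centroids, iterations=10):
    """Basic k-Means on 2D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2 + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                cx = sum(p[0] for p in cluster) / len(cluster)
                cy = sum(p[1] for p in cluster) / len(cluster)
                new_centroids.append((cx, cy))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]   # two obvious groups
centroids, clusters = k_means(points, centroids=[(1, 1), (8, 8)])

print(centroids)
print(clusters)
```

The number of clusters is fixed by the number of starting centroids; choosing that number is where the internal and external validation indices covered in this module come in.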
What's included
13 videos, 3 readings, 13 assignments
13 videos•Total 52 minutes
Applications•3 minutes
Types of Clusters•3 minutes
Types of Clustering Algorithms•3 minutes
Similarity Measures•6 minutes
Distance Matrix•4 minutes
k-Means Algorithm•5 minutes
Fuzzy C-Means Algorithm•6 minutes
Bottom-Up Agglomerative Methods•4 minutes
Top-Down Divisive Methods•4 minutes
Distance Measures in Hierarchical Methods•2 minutes
Distance Measures in Hierarchical Methods•6 minutes
Aspects of Cluster Validation•6 minutes
External Indices•6 minutes
Internal Indices •6 minutes
Ethical Implications
Module 10•7 hours to complete
Module details
This module focuses on the privacy, fairness, and security of data analytics. Students will learn about risk assessment and threat modeling in the practical use of data analytics. Privacy-preserving data mechanisms for model privacy will be surveyed, and the attack strategies and defense mechanisms of model security will be emphasized. Notions of AI fairness and algorithmic bias will be covered at the pre-processing, in-processing, and post-processing stages of data analytics. Cost-sensitive classification and machine learning will be discussed as ways to assess model fairness. Model security will be formalized under frameworks of adversarial data mining for game-theory-based AI, with applications in the cyber kill chain for cybersecurity. Adversarial example games will be summarized for specific targets in terms of adversarial capability, ability, and goals. An adversarial risk analysis of the game theories and associated optimization trade-offs will be presented in the settings of binary, multiclass, and multilabel classification. The relation between adversarial and robust data mining for classifier design will be motivated with respect to the robustness properties that analytics models satisfy under defense mechanisms such as semi-supervised machine learning, adversarial training and learning, empirical risk minimization, and mistake-bound frameworks for adversarial classification. By the end of this module, students will have the skills to effectively apply data analytics techniques to real-world problems and gain insights in a safe, secure, and transparent manner.
What's included
15 videos, 4 readings, 16 assignments, 1 ungraded lab
15 videos•Total 112 minutes
Data Privacy•8 minutes
Model Privacy•9 minutes
Privacy Enhancing Strategies•7 minutes
Data Fairness•4 minutes
Model Fairness•6 minutes
Algorithmic Fairness•7 minutes
Model Security - Part 1•7 minutes
Model Security - Part 2•6 minutes
Cost-Sensitive Classification•6 minutes
Cost-Sensitive Learning•9 minutes
Adversarial Data Mining - Part 1 •9 minutes
Adversarial Data Mining - Part 2•13 minutes
Robust Data Mining - Part 1 •8 minutes
Robust Data Mining - Part 2•6 minutes
Adversarial and Robust Data Mining•8 minutes
4 readings•Total 105 minutes
Essential Reading: Analytics Privacy•30 minutes
Essential Reading: Analytics Fairness•45 minutes
Essential Reading: Analytics Security•20 minutes
Course Summary•10 minutes
16 assignments•Total 150 minutes
Graded Quiz for Week 9 and 10•60 minutes
Data Privacy•6 minutes
Model Privacy•6 minutes
Privacy Enhancing Strategies•6 minutes
Data Fairness•6 minutes
Model Fairness•6 minutes
Algorithmic Fairness•6 minutes
Model Security - Part 1•6 minutes
Model Security - Part 2•6 minutes
Cost-Sensitive Classification•6 minutes
Cost-Sensitive Learning•6 minutes
Adversarial Data Mining - Part 1 •6 minutes
Adversarial Data Mining - Part 2•6 minutes
Robust Data Mining - Part 1 •6 minutes
Robust Data Mining - Part 2•6 minutes
Adversarial and Robust Data Mining•6 minutes
1 ungraded lab•Total 60 minutes
Practice Lab: Neural Networks•60 minutes
Build toward a degree
This course is part of the following degree program(s) offered by Birla Institute of Technology & Science, Pilani. If you are admitted and enroll, your completed coursework may count toward your degree learning and your progress can transfer with you.¹
¹Successful application and enrollment are required. Eligibility requirements apply. Each institution determines the number of credits recognized by completing this content that may count towards degree requirements, considering any existing credits you may have. Click on a specific course for more information.
Birla Institute of Technology & Science, Pilani (BITS Pilani) is one of only ten private universities in India to be recognised as an Institute of Eminence by the Ministry of Human Resource Development, Government of India. It has been consistently ranked high by both governmental and private ranking agencies for its innovative processes and capabilities that have enabled it to impart quality education and emerge as the best private science and engineering institute in India.
BITS Pilani has four campuses, in Pilani, Goa, Hyderabad, and Dubai, and has been offering bachelor's, master’s, and certificate programmes for over 58 years, helping to launch the careers of over 1,00,000 professionals.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I purchase the Certificate?
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.