When you enroll in this course, you'll also be enrolled in this Specialization.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate
There are 5 modules in this course
In this course, you’ll explore three exploratory data analysis (EDA) practices: cleaning, joining, and validating. You'll discover the importance of these practices for data analysis, and you’ll use Python to clean, validate, and join data.
By the end of this course, you will be able to:
• Apply input validation skills to a dataset with Python
• Explain the importance of input validation
• Demonstrate how to transform categorical data into numerical data with Python
• Explain the importance of categorical versus numerical data in a dataset
• Explain the importance of recognizing outliers in a dataset
• Demonstrate how to identify outliers in a dataset with Python
• Understand when to contact stakeholders or engineers regarding missing values
• Explain the importance of ethically considering missing values
• Demonstrate how to identify missing data with Python
Missing or duplicate data can appear in datasets for numerous reasons. The impact of missing values can vary depending on how many are present. In this module, you will learn strategies to address missing data entries, determine when deduplication is needed, and use common Python functions for handling duplicates.
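As a rough illustration of the ideas in this module, the sketch below counts missing values per column and removes exact duplicate rows. It uses only the Python standard library and is not taken from the course materials. The sample records, the column names, and the helper names `count_missing` and `drop_duplicates` are all hypothetical.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"id": 1, "city": "Austin", "sales": 120.0},
    {"id": 2, "city": "Boston", "sales": None},
    {"id": 1, "city": "Austin", "sales": 120.0},  # exact duplicate of the first row
    {"id": 3, "city": None, "sales": 95.5},
]


def count_missing(records):
    """Count missing (None) values per column."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


def drop_duplicates(records):
    """Keep the first occurrence of each fully identical row."""
    seen = set()
    unique = []
    for record in records:
        # Sort by column name so the key does not depend on dict ordering.
        key = tuple(sorted(record.items(), key=lambda item: item[0]))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


missing = count_missing(rows)
deduped = drop_duplicates(rows)

# One possible strategy for missing entries: keep only complete rows.
complete = [r for r in deduped if None not in r.values()]

print(missing)
print(len(deduped), "rows after deduplication")
print(len(complete), "complete rows")
```

Dropping incomplete rows is only one option. Whether to drop, fill, or investigate missing entries depends on how many there are and why they are missing.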
What's included
4 videos, 1 reading, 1 assignment, 3 ungraded labs
4 videos•Total 27 minutes
Introduction to data cleaning•4 minutes
Methods for handling missing data•8 minutes
Work with missing data in a Python notebook•12 minutes
Remy: A day in the life of a data professional•3 minutes
1 reading•Total 8 minutes
Data deduplication with Python•8 minutes
1 assignment•Total 8 minutes
Test your knowledge: The challenge of missing or duplicate data•8 minutes
3 ungraded labs•Total 100 minutes
Annotated follow-along guide: Work with missing data in a Python notebook•20 minutes
Activity: Address missing data•60 minutes
Exemplar: Address missing data•20 minutes
The ins and outs of data outliers
Module 2•1 hour to complete
Outliers are data points that stand out from the rest of a dataset. A tactful approach to outliers recognizes the human stories and real-world effects they represent. In this module, you will learn the types of outliers, how to handle them, and how to visualize them.
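One common way to flag outliers is the interquartile range (IQR) rule, sketched below with the standard library's `statistics` module. This is an assumption about a reasonable technique, not necessarily the method the course teaches. The sample values and the 1.5 multiplier are illustrative.

```python
import statistics

# Hypothetical measurements with one value far from the rest.
values = [12, 14, 15, 15, 16, 17, 18, 19, 21, 95]

# Quartiles split the sorted data into four equal parts.
q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# The conventional IQR fences: 1.5 * IQR beyond each quartile.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
print(outliers)
```

Flagging a point is not the same as deleting it. A flagged value may be an entry error or a real and important observation, so it deserves a closer look before any decision is made.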
What's included
2 videos, 2 readings, 1 assignment
2 videos•Total 20 minutes
Account for outliers•6 minutes
Identify and deal with outliers in Python•14 minutes
2 readings•Total 16 minutes
Protect the people behind the data•8 minutes
Reference guide: How to handle outliers•8 minutes
1 assignment•Total 6 minutes
Test your knowledge: The ins and outs of data outliers•6 minutes
Change categorical data to numerical data
Module 3•1 hour to complete
Data models typically work better with numerical inputs. To facilitate this, categorical data is encoded as numbers for analysis. In this module, you will learn why this transformation is needed, what dummy variables are, and how to select the right encoding method.
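The two encoding ideas named in this module can be sketched in plain Python as follows. The `sizes` column and the `size_` prefix are hypothetical, and the course's own notebooks may use a library rather than hand-written code.

```python
# Hypothetical categorical column.
sizes = ["small", "large", "medium", "small"]

# Label encoding: map each category to an integer code.
# Categories are sorted alphabetically here, so the codes carry no real order.
categories = sorted(set(sizes))
label_map = {category: code for code, category in enumerate(categories)}
labels = [label_map[s] for s in sizes]

# Dummy variables: one 0/1 column per category.
dummies = [
    {f"size_{category}": int(s == category) for category in categories}
    for s in sizes
]

print(label_map)
print(labels)
print(dummies[0])
```

Label encoding is compact, but the integer codes imply an ordering that may not exist in the data. Dummy variables avoid that at the cost of extra columns, which is one reason the choice of encoding method matters.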
What's included
2 videos, 2 readings, 1 assignment
2 videos•Total 13 minutes
Sort numbers versus names•4 minutes
Label encoding in Python•9 minutes
2 readings•Total 16 minutes
Other approaches to data transformation•8 minutes
Reference guide: Data cleaning in Python•8 minutes
1 assignment•Total 6 minutes
Test your knowledge: Changing categorical data to numerical data•6 minutes
Input validation
Module 4•2 hours to complete
Input validation focuses on thoroughly checking data for completeness and eliminating errors. In this module, you will learn why validation minimizes errors, how to detect improper inputs, and why validation is essential before joining datasets.
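To make the link between validation and joining concrete, here is a minimal sketch that checks records before joining them to a lookup table. The field names, the rules, and the `validate_order` helper are hypothetical and chosen only for illustration.

```python
def validate_order(record, known_customer_ids):
    """Return a list of problems found in one record (empty if valid)."""
    problems = []
    if record.get("customer_id") not in known_customer_ids:
        problems.append("unknown customer_id")
    amount = record.get("amount")
    # Reject non-numeric values (and booleans) before comparing with zero.
    if not isinstance(amount, (int, float)) or isinstance(amount, bool) or amount < 0:
        problems.append("amount must be a non-negative number")
    if not str(record.get("country") or "").strip():
        problems.append("country is missing")
    return problems


# Hypothetical lookup table and incoming records.
customers = {101: "Ada", 102: "Grace"}
orders = [
    {"customer_id": 101, "amount": 25.0, "country": "US"},
    {"customer_id": 999, "amount": 10.0, "country": "US"},  # no matching customer
    {"customer_id": 102, "amount": "12", "country": " "},   # wrong type, blank text
]

report = {i: validate_order(order, customers.keys()) for i, order in enumerate(orders)}
valid_orders = [order for i, order in enumerate(orders) if not report[i]]

# Join only the records that passed validation.
joined = [
    {**order, "customer_name": customers[order["customer_id"]]}
    for order in valid_orders
]

print(report)
print(joined)
```

Without the key check, the second record would have no match in the lookup table and the join would either fail or silently lose a row.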
What's included
2 videos, 1 assignment, 2 ungraded labs, 1 plugin
2 videos•Total 14 minutes
The value of input validation•6 minutes
Input validation with Python•8 minutes
1 assignment•Total 6 minutes
Test your knowledge: Input validation•6 minutes
2 ungraded labs•Total 80 minutes
Activity: Validate and clean your data•60 minutes
Exemplar: Validate and clean your data•20 minutes
1 plugin•Total 10 minutes
Identify: Python functions for cleaning data•10 minutes
Review: Clean your data
Module 5•1 hour to complete
Review everything you’ve learned and take the final assessment.
What's included
1 reading, 1 assignment
1 reading•Total 5 minutes
Wrap-up•5 minutes
1 assignment•Total 50 minutes
Course 6 challenge: Data cleaning•50 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Instructor
Instructor ratings
We asked all learners to give feedback on our instructors based on the quality of their teaching style.
Grow with Google is an initiative that draws on Google's decades-long history of building products, platforms, and services that help people and businesses grow. We aim to help everyone – those who make up the workforce of today and the students who will drive the workforce of tomorrow – access the best of Google’s training and tools to grow their skills, careers, and businesses.
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."
Jennifer J.
Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."
Larry W.
Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."
Chaitanya A.
"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."
Learner reviews
4.9 (33 reviews)
5 stars: 94.11%
4 stars: 0%
3 stars: 5.88%
2 stars: 0%
1 star: 0%
Showing 3 of 33
TH, 5 stars, reviewed on Dec 15, 2025
This course helped me understand the basics clearly and improved my practical skills. Well structured and easy to follow.
RA, 5 stars, reviewed on Feb 1, 2026
It's an amazing course with so much useful and practical material. It's ideal to start your Data science/Data Analysis journey with Python.
MM, 5 stars, reviewed on Nov 20, 2025
From the Python course, I learned the foundational skills of programming, such as writing code, using variables, loops, and functions, and understanding how to solve problems using Python.
Organizations of all types and sizes have business processes that generate massive volumes of data. Every moment, all sorts of information is created by computers, the internet, phones, texts, streaming video, photographs, sensors, and much more. In the global digital landscape, data is increasingly imprecise, chaotic, and unstructured. As the speed and variety of data increase exponentially, organizations are struggling to keep pace.
Data science is part of a field of study that uses raw data to create new ways of modeling and understanding the unknown. To gain insights, businesses rely on data professionals to acquire, organize, and interpret data, which helps inform internal projects and processes. Data scientists rely on a combination of critical skills, including statistics, scientific methods, data analysis, and artificial intelligence.
What do data professionals do?
The term data professional describes anyone who works with data or has data skills. At a minimum, a data professional can explore, clean, select, analyze, and visualize data. They may also be comfortable writing code and have some familiarity with the techniques used by statisticians and machine learning engineers, including algorithmic thinking and building machine learning models.
Data professionals are responsible for collecting, analyzing, and interpreting large amounts of data within a variety of different organizations. The role of a data professional is defined differently across companies. Generally speaking, data professionals possess technical and strategic capabilities that require more advanced analytical skills such as data manipulation, experimental design, predictive modeling, and machine learning. They perform a variety of tasks related to gathering, structuring, interpreting, monitoring, and reporting data in accessible formats, enabling stakeholders to understand and use data effectively. Ultimately, the work of data professionals helps organizations make informed, ethical decisions.
Why start a career in data science?
Large volumes of data, and the technology needed to manage and analyze it, are becoming increasingly accessible. Because of this, there has been a surge in career opportunities for people who can tell stories using data, such as senior data analysts and data scientists. These professionals collect, analyze, and interpret large amounts of data within a variety of different organizations. Their responsibilities require advanced analytical skills such as data manipulation, experimental design, predictive modeling, and machine learning.
Do I need to take the courses in a certain order?
We highly recommend taking the courses in the order presented, as the content builds on information from earlier courses. This is the sixth course in a series of six courses that make up the Google Data Analysis with Python Specialization.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.