When you enroll in this course, you'll also be enrolled in this Specialization.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate
There are 4 modules in this course
Finding stories in data using exploratory data analysis (EDA) is all about organizing and interpreting raw data. Python can help you do this quickly and effectively. In this course, you’ll learn how to use Python to perform the EDA practices of discovering and structuring.
By the end of this course, you will be able to:
• Identify ethical issues that may come up during the data “discovering” practice of EDA
• Use Python to merge or join data based on defined criteria
• Use Python to sort and/or filter data
• Use relevant Python libraries for cleaning raw data
• Recognize opportunities for creating hypotheses based on raw data
• Recognize when and how to communicate status updates and questions to key stakeholders
• Apply Python tools to examine raw data structure and format
• Use the PACE workflow to understand whether given data is adequate and applicable to a data science project
• Differentiate between the common formats of raw data sources (JSON, tabular, etc.) and data types
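As a rough illustration of the merge, sort, and filter objectives listed above, here is a minimal pandas sketch. The two tables, their column names, and the values are made up for this example and are not taken from the course materials.

```python
import pandas as pd

# Hypothetical example data; the tables and column names are assumptions.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, 40.0, 15.0, 60.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["East", "West", "East"],
})

# Join the two tables on a shared key (a left join keeps every order).
merged = orders.merge(customers, on="customer_id", how="left")

# Filter with a boolean mask, then sort the remaining rows by amount.
east_orders = merged[merged["region"] == "East"]
east_sorted = east_orders.sort_values("amount", ascending=False)

print(east_sorted[["order_id", "region", "amount"]])
```

A left join is used here so that no order is dropped if a customer record is missing; an inner join would be the choice when only matched rows matter.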
Data professionals must understand data sources, file formats, and responsible parties during exploratory analysis. In this module, you will learn when to contact data owners with questions or issues, how to import data using Python, and how to perform EDA using basic Python functions.
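The import and discovery steps described in this module might look something like the sketch below. It assumes pandas and reads a small in-memory CSV in place of a real file; the column names and values are invented for illustration.

```python
import io
import pandas as pd

# A small in-memory CSV stands in for a real file; the contents are made up.
raw_csv = io.StringIO(
    "date,city,temperature\n"
    "2023-01-01,Austin,15.5\n"
    "2023-01-02,Austin,17.0\n"
    "2023-01-02,Denver,\n"
)

df = pd.read_csv(raw_csv)

print(df.head())        # first rows of the dataset
print(df.shape)         # (number of rows, number of columns)
df.info()               # column dtypes and non-null counts
print(df.describe())    # summary statistics for numeric columns
print(df.isna().sum())  # missing values per column
```

A missing value such as the empty Denver temperature is the kind of gap that may warrant a question to the data owner.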
What's included
5 videos • 3 readings • 1 assignment • 3 ungraded labs
5 videos•Total 34 minutes
Introduction to data exploration•3 minutes
Yaser: Understand data to drive value•2 minutes
Where the data comes from•9 minutes
Find stories using the six exploratory data analysis practices•10 minutes
EDA using basic data functions with Python•10 minutes
3 readings•Total 24 minutes
Reference guide: The EDA process•8 minutes
Reference guide: Import datasets with Python•8 minutes
Reference guide: Pandas methods for the discovery of a dataset•8 minutes
1 assignment•Total 8 minutes
Test your knowledge: Discovering is the beginning of an investigation•8 minutes
3 ungraded labs•Total 100 minutes
Annotated follow-along resource: EDA using basic functions with Python•20 minutes
Activity: Discover what is in your dataset•60 minutes
Exemplar: Discover what is in your dataset•20 minutes
Understand data format
Module 2•1 hour to complete
EDA discovery uses targeted questioning to identify data gaps and missing information. In this module, you will learn how to formulate hypotheses, manipulate datetime strings and create bar graph visualizations.
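To make the datetime manipulation mentioned above more concrete, here is a minimal sketch using only the Python standard library. The date strings and the "MM/DD/YYYY" input format are assumptions chosen for illustration; the course itself may use other tools, such as pandas, for the same task.

```python
from collections import Counter
from datetime import datetime

# Made-up date strings in an assumed "MM/DD/YYYY" format.
date_strings = [
    "01/15/2023", "01/28/2023", "02/03/2023",
    "03/19/2023", "03/21/2023", "03/30/2023",
]

# Parse each string into a datetime object.
parsed = [datetime.strptime(s, "%m/%d/%Y") for s in date_strings]

# Reformat as "YYYY-MM" labels so records can be grouped by month.
month_labels = [d.strftime("%Y-%m") for d in parsed]

# Count records per month; these counts are what a bar graph would plot.
counts = Counter(month_labels)
for month, n in sorted(counts.items()):
    print(month, "#" * n)
```

The text bars printed at the end are a stand-in for a real chart; with a plotting library, the same counts could be passed to a bar plot.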
What's included
2 videos • 1 reading • 1 assignment • 1 ungraded lab
2 videos•Total 20 minutes
Discover what is missing from your dataset•6 minutes
Date string manipulations with Python•14 minutes
1 reading•Total 8 minutes
Reference guide: Datetime manipulation•8 minutes
1 assignment•Total 6 minutes
Test your knowledge: Understand data format•6 minutes
1 ungraded lab•Total 20 minutes
Annotated follow-along guide: Date string manipulations with Python•20 minutes
Create structure from raw data
Module 3•2 hours to complete
Structuring is an EDA practice for organizing data to learn more about it. In this module, you will learn about different types of structuring methods, use pandas tools for structuring datasets, and interpret histograms to understand data distributions.
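Two common structuring operations, grouping and binning, can be sketched in pandas as follows. The data, column names, and bin edges are assumptions for illustration only; the bin counts are the numbers a histogram would display as bar heights.

```python
import pandas as pd

# Made-up data; the column names are assumptions for illustration.
df = pd.DataFrame({
    "category": ["A", "B", "A", "C", "B", "A"],
    "value": [3, 12, 7, 18, 9, 14],
})

# Grouping: summarize the values within each category.
summary = df.groupby("category")["value"].agg(["count", "mean"])

# Binning: assign each value to an interval, then count values per interval.
bins = [0, 5, 10, 15, 20]
binned = pd.cut(df["value"], bins=bins)
bin_counts = binned.value_counts().sort_index()

print(summary)
print(bin_counts)
```

By default, `pd.cut` builds intervals that are closed on the right, so a value of 5 would fall in the first bin rather than the second.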
Grow with Google is an initiative that draws on Google's decades-long history of building products, platforms, and services that help people and businesses grow. We aim to help everyone – those who make up the workforce of today and the students who will drive the workforce of tomorrow – access the best of Google’s training and tools to grow their skills, careers, and businesses.
Organizations of all types and sizes have business processes that generate massive volumes of data. Every moment, all sorts of information gets created by computers, the internet, phones, texts, streaming video, photographs, sensors, and much more. In the global digital landscape, data is increasingly imprecise, chaotic, and unstructured. As the speed and variety of data increase exponentially, organizations are struggling to keep pace.
Data science is part of a field of study that uses raw data to create new ways of modeling and understanding the unknown. To gain insights, businesses rely on data professionals to acquire, organize, and interpret data, which helps inform internal projects and processes. Data scientists rely on a combination of critical skills, including statistics, scientific methods, data analysis, and artificial intelligence.
What do data professionals do?
The term "data professional" describes any individual who works with data or has data skills. At a minimum, a data professional is capable of exploring, cleaning, selecting, analyzing, and visualizing data. They may also be comfortable with writing code and have some familiarity with the techniques used by statisticians and machine learning engineers, including developing algorithmic thinking and building machine learning models.
Data professionals are responsible for collecting, analyzing, and interpreting large amounts of data within a variety of different organizations. The role of a data professional is defined differently across companies. Generally speaking, data professionals possess technical and strategic capabilities that require more advanced analytical skills such as data manipulation, experimental design, predictive modeling, and machine learning. They perform a variety of tasks related to gathering, structuring, interpreting, monitoring, and reporting data in accessible formats, enabling stakeholders to understand and use data effectively. Ultimately, the work of data professionals helps organizations make informed, ethical decisions.
Why start a career in data science?
Large volumes of data — and the technology needed to manage and analyze it — are becoming increasingly accessible. Because of this, there has been a surge in career opportunities for people who can tell stories using data, such as senior data analysts and data scientists. These professionals collect, analyze, and interpret large amounts of data within a variety of different organizations. Their responsibilities require advanced analytical skills such as data manipulation, experimental design, predictive modeling, and machine learning.
Do I need to take the course in a certain order?
We highly recommend taking the courses in the order presented, as the content builds on information from earlier courses. This is the fifth course in a series of six courses that make up the Google Data Analysis with Python Specialization.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.