COURSE OBJECTIVE:
This course teaches participants the following skills:
• Derive insights from data using the analysis and visualization tools on Google Cloud Platform
• Interactively query datasets using Google BigQuery
• Load, clean, and transform data at scale
• Visualize data using Google Data Studio and other third-party platforms
• Distinguish between exploratory and explanatory analytics and when to use each approach
• Explore new datasets and uncover hidden insights quickly and effectively
• Optimizing data models and queries for price and performance
TARGET AUDIENCE:
• Data Analysts, Business Analysts, Business Intelligence professionals
• Cloud Data Engineers who will be partnering with Data Analysts to build scalable data solutions on Google Cloud Platform
COURSE PREREQUISITES:
To get the most out of this course, participants should have:
• Basic proficiency with ANSI SQL
COURSE CONTENT:
Module 1: Introduction to Data on the Google Cloud Platform
Before and Now: Scalable Data Analysis in the Cloud
Topics Covered
• Highlight Analytics Challenges Faced by Data Analysts
• Compare Big Data On-Premise vs on the Cloud
• Learn from Real-World Use Cases of Companies Transformed through Analytics on the Cloud
• Navigate Google Cloud Platform Project Basics
• Lab: Getting started with Google Cloud Platform
Module 2: Big Data Tools Overview
Sharpen the Tools in your Data Analyst toolkit
Topics Covered
• Walkthrough Data Analyst Tasks, Challenges, and Introduce Google Cloud Platform Data Tools
• Demo: Analyze 10 Billion Records with Google BigQuery
• Explore 9 Fundamental Google BigQuery Features
• Compare GCP Tools for Analysts, Data Scientists, and Data Engineers
• Lab: Exploring Datasets with Google BigQuery
Module 3: Exploring your Data with SQL
Get Familiar with Google BigQuery and Learn SQL Best Practices
Topics Covered
• Compare Common Data Exploration Techniques
• Learn How to Code High Quality Standard SQL
• Explore Google BigQuery Public Datasets
• Visualization Preview: Google Data Studio
• Lab: Troubleshoot Common SQL Errors
Module 4: Google BigQuery Pricing
Calculate Google BigQuery Storage and Query Costs
Topics Covered
• Walkthrough of a BigQuery Job
• Calculate BigQuery Pricing: Storage, Querying, and Streaming Costs
• Optimize Queries for Cost
• Lab: Calculate Google BigQuery Pricing
Module 5: Cleaning and Transforming your Data
Wrangle your Raw Data into a Cleaner and Richer Dataset
Topics Covered
• Examine the 5 Principles of Dataset Integrity
• Characterize Dataset Shape and Skew
• Clean and Transform Data using SQL
• Clean and Transform Data using a new UI: Introducing Cloud Dataprep
• Lab: Explore and Shape Data with Cloud Dataprep
Module 6: Storing and Exporting Data
Create new Tables and Exporting Results
Topics Covered
• Compare Permanent vs Temporary Tables
• Save and Export Query Results
• Performance Preview: Query Cache
• Lab: Creating new Permanent Tables
Module 7: Ingesting New Datasets into Google BigQuery
Bring your Data into the Cloud
Topics Covered
• Query from External Data Sources
• Avoid Data Ingesting Pitfalls
• Ingest New Data into Permanent Tables
• Discuss Streaming Inserts
• Lab: Ingesting and Querying New Datasets
Module 8: Data Visualization
Effectively Explore and Explain your Data through Visualization
Topics Covered
• Overview of Data Visualization Principles
• Exploratory vs Explanatory Analysis Approaches
• Demo: Google Data Studio UI
• Connect Google Data Studio to Google BigQuery
• Lab: Exploring a Dataset in Google Data Studio
Module 9: Joining and Merging Datasets
Combine and Enrich your Datasets with more Data
Topics Covered
• Merge Historical Data Tables with UNION
• Introduce Table Wildcards for Easy Merges
• Review Data Schemas: Linking Data Across Multiple Tables
• Walkthrough JOIN Examples and Pitfalls
• Lab: Join and Union Data from Multiple Tables
Module 10: Google BigQuery Table Deep Dive
What sets Cloud Architecture apart?
Topics Covered
• Compare Data Warehouse Storage Methods
• Deep-dive into Column-Oriented Storage
• Examine Logical Views, Date-Partitioned Tables, and Best Practices
• Query the Past with Time Travelling Snapshots
Module 11: Schema Design and Nested Data Structures
Model your Datasets for Scale in Google BigQuery
Topics Covered
• Compare Google BigQuery vs Traditional RDBMS Data Architecture
• Normalization vs Denormalization: Performance Tradeoffs
• Schema Review: The Good, The Bad, and The Ugly
• Arrays and Nested Data in Google BigQuery
• Lab: Querying Nested and Repeated Data
Module 12: Advanced Visualization with Google Data Studio
Create Pixel-Perfect Dashboards
Topics Covered
• Create Case Statements and Calculated Fields
• Avoid Performance Pitfalls with Cache considerations
• Share Dashboards and Discuss Data Access considerations
• Lab: Visualizing Insights with Google Data Studio
Module 13: Advanced Functions and Clauses
Dive Deeper into Advanced Query Writing with Google BigQuery
Topics Covered
• Review SQL Case Statements
• Introduce Analytical Window Functions
• Safeguard Data with One-Way Field Encryption
• Discuss Effective Sub-query and CTE design
• Compare SQL and Javascript UDFs
• Lab: Deriving Insights with Advanced SQL Functions
Module 14: Optimizing for Performance
Troubleshoot and Solve Query Performance Problems
Topics Covered
• Avoid Google BigQuery Performance Pitfalls
• Prevent Hotspots in your Data
• Diagnose Performance Issues with the Query Explanation map
• Lab: Optimizing and Troubleshooting Query Performance
Module 15: Advanced Insights
Think, Analyze, and Share Insights like a Data Scientist
Topics Covered
• Distill Complex Queries
• Brainstorm Data-Driven Hypotheses
• Think like a Data Scientist
• Introducing Cloud Datalab
• Lab: Reading a Google Cloud Datalab notebook
Module 16: Data Access
Keep Data Security top-of-mind in the Cloud
Topics Covered
• Compare IAM and BigQuery Dataset Roles
• Avoid Access Pitfalls
• Review Members, Roles, Organizations, Account Administration, and Service Accounts
FOLLOW ON COURSES:
Gain a wider view of Google Cloud Platform using Big Data and ML Fundamentals (GO8325)