Advanced 8 Steps 7h 30m 46 Credits
Big data, machine learning, and scientific data? It sounds like the perfect match. In this advanced-level quest, you will get hands-on practice with GCP services like BigQuery, Dataproc, and TensorFlow by applying them to use cases that employ real-life scientific data sets. By getting experience with tasks like earthquake data analysis and satellite image aggregation, Scientific Data Processing will expand your skill set in big data and machine learning so you can start tackling your own problems across a spectrum of scientific disciplines.
Prerequisites
This Quest requires hands-on experience with GCP data processing and machine learning services like Dataproc, Dataflow, and Cloud ML Engine. It is recommended that students first earn a Badge by completing the hands-on labs in the Baseline: Data, ML, and AI Quest.
Quest Outline
Introduction to SQL for BigQuery and Cloud SQL
In this lab you will learn fundamental SQL clauses and get hands-on practice running structured queries on BigQuery and Cloud SQL.
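As a rough illustration of the kind of clauses such a lab covers (SELECT, FROM, WHERE, GROUP BY, ORDER BY), here is a minimal sketch. It runs against an in-memory SQLite database from the Python standard library as a stand-in, not against BigQuery or Cloud SQL, whose dialects differ in places. The table name `trips` and its columns are hypothetical and chosen only for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is NOT BigQuery or Cloud SQL.
conn = sqlite3.connect(":memory:")

# Hypothetical table and columns, invented for this illustration.
conn.execute("CREATE TABLE trips (station TEXT, duration_sec INTEGER)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("A", 600), ("A", 1200), ("B", 300), ("B", 90)],
)

# SELECT picks columns, WHERE filters rows, GROUP BY aggregates per station,
# and ORDER BY sorts the aggregated result.
rows = conn.execute(
    "SELECT station, COUNT(*) AS num_trips, AVG(duration_sec) AS avg_duration "
    "FROM trips "
    "WHERE duration_sec >= 120 "
    "GROUP BY station "
    "ORDER BY num_trips DESC, station"
).fetchall()
conn.close()

print(rows)
```

The WHERE filter runs before the aggregation, so the 90-second trip is excluded from station B's count and average.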
Rent-a-VM to Process Earthquake Data
In this lab you spin up a virtual machine, configure its security, access it remotely, and then carry out the steps of an ingest-transform-and-publish data pipeline manually. This lab is part of a series of labs on processing scientific data.
Weather Data in BigQuery
In this lab you analyze historical weather observations using BigQuery and use weather data in conjunction with other datasets. This lab is part of a series of labs on processing scientific data.
Distributed Image Processing in Cloud Dataproc
In this lab, you will learn how to use Apache Spark on Cloud Dataproc to distribute a computationally intensive image processing task across a cluster of machines.
Distributed Computation of NDVI from Landsat Images Using Cloud Dataflow
In this lab you process Landsat data in a distributed manner using Apache Beam and Cloud Dataflow. This lab is part of a series of labs on processing scientific data.
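For context, NDVI (Normalized Difference Vegetation Index) is conventionally computed per pixel as (NIR - Red) / (NIR + Red), where NIR and Red are the near-infrared and red band reflectances. The sketch below shows only that per-pixel arithmetic in plain Python on tiny made-up grids. It does not use Apache Beam or Cloud Dataflow, the sample values are invented, and returning 0.0 when both bands are zero is an assumed convention rather than anything the lab specifies.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # Assumed convention: treat pixels with no signal as NDVI 0.0
        # instead of dividing by zero.
        return 0.0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() element-wise to two equally shaped 2-D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Made-up reflectance values for a 2x2 image, for illustration only.
nir_band = [[0.5, 0.4], [0.3, 0.0]]
red_band = [[0.1, 0.4], [0.6, 0.0]]

result = ndvi_grid(nir_band, red_band)
print(result)
```

Values near 1 suggest dense vegetation, values near 0 suggest bare ground, and negative values are typical of water or clouds. Because each pixel is computed independently, the calculation is a natural fit for distributed processing.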
Analyzing Natality Data Using Datalab and BigQuery
In this lab you analyze a large (137 million rows) natality dataset using Google BigQuery and Cloud Datalab. This lab is part of a series of labs on processing scientific data.
Predicting Baby Weight with TensorFlow on Cloud ML Engine
In this lab you train, evaluate, and deploy a machine learning model to predict a baby’s weight. You then send requests to the model to make online predictions. This lab is part of a series of labs on processing scientific data.
Image Classification of Coastline Images Using TensorFlow on Cloud ML Engine
In this lab, you carry out a transfer learning example based on the Inception-v3 image recognition neural network. The objective is to classify coastline images captured by drones according to their potential for flood damage.