Slide 1:
Ingest & Process:
- Pub/Sub to collect streaming events (e.g., user clicks on an e-commerce platform, IoT
sensor data), i.e., data that arrives continuously
- Dataflow to build unified pipelines handling both batch ETL (nightly workloads) and
streaming transforms (real-time metrics)
- Dataproc for managed Apache Spark/Hadoop clusters when you need familiar
big-data APIs
- Cloud Data Fusion for low-code/no-code data integration.
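Speaker aid: the "unified batch and streaming" idea can be sketched in plain Python. This is not the Dataflow or Apache Beam API; the `Event` fields, the 60-second window, and the sample clicks are all hypothetical. The point is only that one transform can serve both a finite batch and an event-at-a-time source.

```python
from collections import Counter
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: int  # seconds since an arbitrary start (hypothetical field)
    user: str       # hypothetical field


def count_per_window(events: Iterable[Event], window_seconds: int = 60) -> Counter:
    """Count events per fixed (tumbling) window, keyed by window start time."""
    counts: Counter = Counter()
    for event in events:
        window_start = event.timestamp - event.timestamp % window_seconds
        counts[window_start] += 1
    return counts


def click_stream() -> Iterator[Event]:
    """Stand-in for a streaming source: yields events one at a time."""
    for ts, user in [(5, "a"), (42, "b"), (61, "a"), (130, "c")]:
        yield Event(ts, user)


# Batch: the whole data set is available up front (e.g., a nightly load).
batch = [Event(5, "a"), Event(42, "b"), Event(61, "a"), Event(130, "c")]
batch_counts = count_per_window(batch)

# Streaming: the same transform consumes events as they arrive.
stream_counts = count_per_window(click_stream())

print(batch_counts == stream_counts)
```

A real streaming pipeline would also have to handle late and out-of-order events, which this sketch ignores.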
Slide 2:
We have you covered for structured or unstructured data, SQL or NoSQL databases, and scalability needs.
Slide 3:
OLTP: Large number of small, atomic transactions. Common use cases: order entry,
payment processing, inventory management
OLAP: Designed for complex queries and large-scale data analysis. Common use
cases: sales reporting, trend analysis, business intelligence
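Speaker aid: the OLTP/OLAP contrast can be shown with a minimal sketch using Python's built-in `sqlite3` module as a stand-in database. The table names, columns, and sample values are assumptions for illustration only; production systems would typically use separate engines for the two workloads.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (item TEXT PRIMARY KEY, stock INTEGER NOT NULL)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER, month TEXT)")
conn.execute("INSERT INTO inventory VALUES ('widget', 10)")
conn.commit()


def place_order(item: str, qty: int, month: str) -> bool:
    """OLTP-style: one small atomic transaction (stock update + order row)."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE item = ? AND stock >= ?",
                (qty, item, qty),
            )
            if cur.rowcount != 1:
                raise ValueError("insufficient stock")
            conn.execute(
                "INSERT INTO orders (item, qty, month) VALUES (?, ?, ?)",
                (item, qty, month),
            )
        return True
    except ValueError:
        return False


results = [
    place_order("widget", 3, "2024-01"),
    place_order("widget", 4, "2024-02"),
    place_order("widget", 5, "2024-02"),  # rejected: only 3 left in stock
]

# OLAP-style: one query that scans and aggregates many rows.
monthly = conn.execute(
    "SELECT month, SUM(qty) FROM orders GROUP BY month ORDER BY month"
).fetchall()
stock = conn.execute("SELECT stock FROM inventory WHERE item = 'widget'").fetchone()[0]

print(results, monthly, stock)
```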
Slide 4: Analyze & Visualize:
- BigQuery as a fully managed data warehouse (storage + analytics) using SQL over
petabytes
- BigQuery: serverless, highly parallel columnar engine; supports JOINs, window
functions, geospatial, ML extensions
- Looker: semantic modeling layer (LookML); interactive dashboards; data governance
and access controls
- Can be used for business intelligence: dashboards, data modeling, embedded
analytics
- Use Case: run monthly sales aggregation in minutes, then embed charts in internal
portal
Slide 5:
Vertex AI: unified environment for data prep, training, tuning, deployment, and monitoring
- AutoML: no-code model building for vision, language, tabular data
- Workbench / Colab Enterprise: managed notebooks with GPUs/TPUs for custom code
- Vertex AI Studio & Model Garden: catalog of pre-trained large language and multimodal
models for customization
Pre-built solutions:
- Document AI for document parsing and form extraction
- Contact Center AI for conversational agents powered by LLMs
- Retail Search & Recommendations for e-commerce personalization
- Healthcare Data Engine for clinical data analytics
Slide 6:
Slide 1: Defining Artificial Intelligence, Machine Learning, and Generative AI
- Artificial Intelligence: computers mimicking human intelligence (e.g., robots, self-driving
cars)
- The Artificial Intelligence umbrella covers any technique enabling machines to simulate
human reasoning or perception
- Machine Learning: subset of AI where computers learn from data without explicit
programming
- Machine Learning revolves around algorithms that improve performance as they are
exposed to more data
- Deep Learning: subset of machine learning using multi-layer neural networks for
complex feature extraction
- Deep Learning adds hidden layers between inputs and outputs, enabling hierarchical
feature learning (e.g., convolutional neural networks for images)
- Generative Artificial Intelligence: uses large deep-learning models to create new
content on demand
- Generative AI commonly relies on transformer-based architectures (e.g., GPT, Gemini) to
produce text, images, or other media
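Speaker aid: a toy network can show why hidden layers matter. XOR cannot be computed by a single threshold unit, but one hidden layer is enough. The weights below are set by hand for illustration (nothing is trained), and the function names are hypothetical.

```python
def step(x: float) -> int:
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if x > 0 else 0


def xor_network(x1: int, x2: int) -> int:
    # Hidden layer: two units, each a weighted sum of the inputs plus a bias.
    h_or = step(x1 + x2 - 0.5)   # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)  # fires only if both inputs are 1
    # Output layer combines the hidden features: "OR but not AND".
    return step(h_or - h_and - 0.5)


outputs = [xor_network(a, b) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

In practice the weights are learned from data and the layers are far wider and deeper; the sketch only shows how intermediate features feed a later layer.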
Slide 2: Supervised vs Unsupervised Learning
- Supervised Learning: uses labeled data with known outcomes to train models
- Unsupervised Learning: uses unlabeled data to discover hidden patterns or groupings
Speaker Notes:
- Supervised Learning example: classifying images as cats or dogs when labels are
provided
- Unsupervised Learning example: clustering dog breeds without predefined labels to
find natural groupings
- Key distinction: supervised is task-driven and goal-oriented; unsupervised is data-driven
and exploratory
- Labeled data supplies the “answer key”; unlabeled data requires pattern detection
algorithms
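Speaker aid: a minimal sketch of the same toy data handled both ways. The body weights (in kg) and the cat/dog labels are made-up values, and the nearest-centroid classifier and two-cluster k-means are simple stand-ins chosen for brevity, not recommended production methods.

```python
from statistics import mean
from typing import Dict, List, Tuple


def fit_centroids(samples: List[Tuple[float, str]]) -> Dict[str, float]:
    """Supervised: labels (the "answer key") say which points belong together."""
    by_label: Dict[str, List[float]] = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids: Dict[str, float], value: float) -> str:
    """Assign the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values: List[float], iterations: int = 10) -> List[List[float]]:
    """Unsupervised: 1-D k-means with k=2; no labels are used."""
    centers = [min(values), max(values)]  # simple deterministic starting point
    groups: List[List[float]] = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return groups


labeled = [(3.5, "cat"), (4.0, "cat"), (4.5, "cat"),
           (20.0, "dog"), (25.0, "dog"), (30.0, "dog")]
centroids = fit_centroids(labeled)
predicted = predict(centroids, 6.0)

unlabeled = [3.5, 20.0, 4.0, 30.0, 4.5, 25.0]
groups = two_means(unlabeled)

print(predicted, groups)
```

Note that the unsupervised result is two unnamed groups: the algorithm finds the structure, but a person still has to decide what each group means.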
Slide 3: Types of Supervised Learning
- Classification: predicts discrete categories (e.g., cat vs dog)
- Regression: predicts continuous numeric values (e.g., future sales)
Speaker Notes:
- Classification models include logistic regression, decision trees, support vector
machines, and neural networks for categorical outputs
- Regression models include linear regression, ridge regression, and neural networks for
numeric forecasting
- Choose classification when output is a class label; choose regression when output is a
quantity
- Example: predicting customer churn as a yes/no classification; forecasting monthly
revenue as regression
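Speaker aid: the difference in output type can be sketched in a few lines. The revenue figures, the churn features (months inactive, support tickets), and both helper functions are hypothetical; least squares and 1-nearest-neighbour are used only because they fit on a slide.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_label(train: List[Tuple[Tuple[float, float], str]],
                  query: Tuple[float, float]) -> str:
    """1-nearest-neighbour: return the label of the closest training point."""
    def sq_dist(p: Tuple[float, float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(p, query))
    _, label = min(train, key=lambda item: sq_dist(item[0]))
    return label


# Regression: the output is a quantity (revenue forecast for month 5).
months = [1, 2, 3, 4]
revenue = [10.0, 12.0, 14.0, 16.0]
slope, intercept = fit_line(months, revenue)
forecast = slope * 5 + intercept

# Classification: the output is a class label (will the customer churn?).
customers = [((0, 1), "no"), ((1, 0), "no"), ((5, 4), "yes"), ((6, 3), "yes")]
churn = nearest_label(customers, (4, 4))

print(forecast, churn)
```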
Slide 4: Types of Unsupervised Learning
- Clustering: groups similar data points into clusters (e.g., customer segmentation)
- Association: discovers rules that describe relationships between variables (e.g.,
market-basket analysis)
- Dimensionality Reduction: reduces feature count to simplify models and visualize data
(e.g., principal component analysis)
Speaker Notes:
- Clustering algorithms include k-means, hierarchical clustering, and DBSCAN for
identifying natural groupings
- Association rule mining uses algorithms like Apriori to find item-set
correlations (e.g., {bread, milk} → {butter})
- Dimensionality reduction techniques like principal component analysis can help remove
noise and speed up training; t-SNE is mainly used for visualization
- Use clustering for segmentation tasks, association for recommendation systems, and
dimensionality reduction for preprocessing
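Speaker aid: the market-basket rule {bread, milk} → {butter} can be scored with support and confidence, the two measures that Apriori-style mining is built on. The transactions below are made-up, and the sketch covers only the scoring step, not the candidate generation and pruning of the full Apriori algorithm.

```python
from typing import List, Set

# Hypothetical shopping baskets.
transactions: List[Set[str]] = [
    {"bread", "milk", "butter"},
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "butter"},
]


def count(itemset: Set[str]) -> int:
    """Number of baskets that contain every item in the item set."""
    return sum(1 for basket in transactions if itemset <= basket)


def support(itemset: Set[str]) -> float:
    """Fraction of all baskets that contain the item set."""
    return count(itemset) / len(transactions)


def confidence(antecedent: Set[str], consequent: Set[str]) -> float:
    """Of the baskets containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent) / count(antecedent)


rule_support = support({"bread", "milk", "butter"})
rule_confidence = confidence({"bread", "milk"}, {"butter"})

print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

Here two of the three baskets with bread and milk also contain butter, so the rule holds in about two thirds of the cases where it applies.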