
OCI AI Foundations

The document explains the concepts of Artificial General Intelligence (AGI) and Artificial Intelligence (AI), detailing their applications in daily life and various domains such as language, speech, and vision. It outlines the differences between AI, Machine Learning (ML), and Deep Learning (DL), along with their respective algorithms and use cases. Additionally, it provides an overview of Oracle Cloud Infrastructure (OCI) AI services, including pre-trained models for image and text analysis.

Uploaded by

Siwada Somsuk

Human intelligence encompasses our ability to learn through observation, reason through
abstract concepts, communicate verbally and nonverbally, respond to complex situations, and
create original works. When machines replicate these capabilities (such as sensory perception,
motor skills, learning, and problem-solving), this is known as Artificial General Intelligence
(AGI). When AGI is applied to solve narrow, specific tasks, it is called Artificial Intelligence (AI).

AI is already integrated into daily life, often unnoticed, through applications like object
recognition, spam detection, code generation, and price prediction. Due to the exponential
growth of data, AI has become essential in processing information faster and more effectively
than humans.

There are two main drivers behind the adoption of AI:

1. Automation of routine tasks (e.g., credit approvals, insurance claims, product
recommendations).
2. Augmenting creativity and decision-making, enabling machines to assist in
storytelling, design, music, and humor.

AI spans several domains, including:

 Language (translation),
 Vision (image classification),
 Speech (text-to-speech),
 Recommendation systems,
 Anomaly detection (fraud),
 Reinforcement learning (self-driving cars),
 Forecasting (weather), and
 Generative AI (creating images from text).

🔍 Overview: AI Tasks & Data in Language, Speech, and Vision Domains

📘 1. Language AI Tasks

Types of Tasks:

 Text-related AI: Uses input text to produce outputs like:


o Language detection
o Entity/keyphrase extraction
o Translation
 Generative AI: Model creates output text, such as:
o Stories, poems
o Text summarization
o Q&A (e.g., ChatGPT)

How Text is Processed:

 Sequential Data: Sentences → Words → Numbers


 Tokenization: Converts words to numbers
 Padding: Ensures equal sentence length for training
 Embedding: Captures similarity between words/sentences
 Similarity Measures: Dot product, cosine similarity
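
As a rough illustration of the tokenization and padding steps above, the sketch below maps words to integer IDs and pads the resulting sequences to equal length. The vocabulary scheme, the reserved padding ID of 0, and the function names are assumptions made for this example, not the API of any particular library.

```python
def build_vocab(sentences):
    """Assign each distinct lowercase word an integer ID; 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def tokenize(sentence, vocab):
    """Convert a sentence into a list of integer IDs."""
    return [vocab[word] for word in sentence.lower().split()]


def pad(sequences, pad_id=0):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]


sentences = ["the dog barks", "the cat sleeps all day"]
vocab = build_vocab(sentences)
token_ids = [tokenize(s, vocab) for s in sentences]
padded = pad(token_ids)
print(padded)
```

Production systems usually work with subword tokenizers and learned embeddings; whitespace splitting is used here only to keep the idea visible.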

Model Architectures:

 RNNs: Sequential processing with memory


 LSTMs: Retain context using gates
 Transformers: Process all data in parallel using self-attention

🎙️2. Speech/Audio AI Tasks

Types of Tasks:

 Audio-related AI:
o Speech-to-text
o Speaker recognition
o Voice conversion
 Generative AI:
o Music composition
o Speech synthesis

How Audio is Processed:

 Sampling Rate: Commonly 44.1 kHz (44,100 samples/sec)


 Bit Depth: Determines richness of each sample
 Contextual Analysis: Requires multiple samples for meaningful insight
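
To make the sampling-rate and bit-depth figures concrete, the following sketch computes how many samples and bytes one second of uncompressed audio occupies. The 16-bit depth and single channel are assumptions for illustration; the notes only state the 44.1 kHz rate.

```python
def audio_size(sample_rate_hz, bit_depth, seconds, channels=1):
    """Return (number of samples, size in bytes, distinct levels per sample)."""
    samples = sample_rate_hz * seconds * channels
    size_bytes = samples * bit_depth // 8
    levels = 2 ** bit_depth  # how many amplitude values one sample can take
    return samples, size_bytes, levels


# One second of mono audio at 44.1 kHz, assuming 16 bits per sample.
samples, size_bytes, levels = audio_size(44_100, 16, seconds=1)
print(samples, size_bytes, levels)
```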

Model Architectures:

 RNNs / LSTMs / Transformers


 Waveform Models: Work directly on audio waves
 Variational Autoencoders (VAEs): For generative audio
 Siamese Networks: For comparison tasks like speaker verification

🖼️3. Vision AI Tasks


Types of Tasks:

 Image-related AI:
o Image classification
o Object detection
o Facial recognition
 Generative AI:
o Text-to-image generation
o Style transfer / super-resolution
o Creating 3D models or synthetic data

How Images are Processed:

 Pixels: Grayscale or color (RGB)


 Context: A single pixel has no meaning; the entire image provides context

Model Architectures:

 CNNs: Learn visual features hierarchically


 YOLO (You Only Look Once): Real-time object detection
 GANs (Generative Adversarial Networks): Generate realistic images/videos

⚙️Other Notable AI Tasks

Task | Data Type | Use Cases
Anomaly Detection | Time series | Fraud detection, machine failure
Recommendations | User/product data | Product suggestions, personalization
Forecasting | Time series | Weather, stock prices, demand planning

Oracle Cloud Infrastructure (OCI) AI Services Overview


OCI offers powerful pre-trained AI models for:

 ✅ Image Analysis (Vision AI)


 ✅ Text Analysis & Translation (Language AI)
 ✅ Document Understanding (Advanced OCR)

👁️Vision AI Service
1. Image Classification
 Labels objects/concepts in an image.
 Provides confidence scores (e.g., “vegetation: 99.23%”).
 Example: Detects zebra, grassland, mammal.

2. Object Detection

 Draws bounding boxes around detected objects.


 Detects multiple entities (e.g., people, cars, fruit in baskets).
 Labels each object with high accuracy.

3. Text Detection (OCR)

 Extracts printed text from images.


 Detects multiple text styles, sizes, and orientations.
 Can extract:
o License plates
o Posters or signs
o Fonts in varied styles

4. Document AI / Document Understanding

 Extracts:
o Raw text
o Key-value pairs (e.g., date, total, tax)
o Structured tables (e.g., receipts, invoices)
 Example: A receipt was scanned, extracting:
o Date & time
o Itemized totals
o Terminal info

💬 Language AI Service
1. Text Analysis

Applies pre-trained NLP models for:

 Language detection
 Text classification (e.g., domain = science & tech)
 Entity extraction (e.g., food, products, events)
 Key phrase extraction
 Sentiment analysis
o Aspect-based (e.g., “early computers” → positive)
o Sentence-level
 PII Detection (e.g., date of birth, location, events)

2. Text Translation

 Input language is auto-detected or manually set.


 Translates into many languages (e.g., French, Japanese).
 Fast and accurate real-time results.

3. Custom Model Training

 Users can train their own language models with domain-specific data.

🧪 Summary: Why Use OCI AI Services?


Feature | Vision AI | Language AI
Pre-trained Models | ✅ Yes | ✅ Yes
Custom Model Training | ⬜ Not discussed here | ✅ Available
Use Cases | Classification, OCR, Object Detection | Sentiment, Translation, PII
Output | Labels, Bounding Boxes, Tables | Entities, Sentiments, Translations

🤖 AI vs ML vs DL — What's the Difference?


🧠 Artificial Intelligence (AI)

 Definition: AI is the broad field of creating machines that can perform tasks requiring
human-like intelligence.
 Example: A self-driving car making decisions like detecting pedestrians and navigating
traffic.
 Scope: Includes decision-making, language understanding, visual recognition, etc.

📊 Machine Learning (ML)

 Definition: A subset of AI that enables machines to learn from data using algorithms,
without being explicitly programmed.
 Example: A spam filter that learns from user behavior to identify spam emails.
🔁 Types of ML:
Type | Description | Example Use Case
Supervised Learning | Learns from labeled data to predict or classify new data | Credit card approval, email classification
Unsupervised Learning | Finds hidden patterns or clusters in unlabeled data | Customer segmentation, content grouping
Reinforcement Learning | Learns by interacting with the environment using rewards and punishments | Game playing (chess), autonomous vehicles

🤿 Deep Learning (DL)

 Definition: A subfield of ML using deep neural networks to learn from complex data
patterns.
 Example: An app that can identify cats or dogs in images.

🔧 Key Concepts:

 Neural Networks: Modeled after human brain neurons, layers work together to
approximate complex functions.
 Use Cases: Image recognition, voice assistants, autonomous driving, etc.

🧪 Algorithms & Approaches


Area | Models / Algorithms | Purpose
Supervised ML | Linear regression, Decision Trees, Neural Nets | Predict outcomes from labeled data
Unsupervised ML | K-means, PCA, Hierarchical Clustering | Discover structure/patterns in data
Reinforcement ML | Q-learning, Deep Q-Networks (DQN) | Trial-and-error based decision-making
Deep Learning | CNN, RNN, Transformers | Learning high-level features (images, sequences)

🎨 Generative AI
 Definition: A subcategory of ML that creates content (text, images, audio).
 Example: ChatGPT generates human-like responses; other models create music, images,
etc.
 Uses: Chatbots, image generation (DALL·E), synthetic voice, creative design.

🎯 Recap: Key Differences


Feature | AI | ML | DL
Scope | Broad: human-like intelligence | Subset: learning from data | Subset of ML: complex patterns
Example | Self-driving car | Spam detection | Cat vs dog image classification
Dependency | Not always data-driven | Data is essential | Requires large datasets & compute
Algorithms | Logic, rules, ML, etc. | Regression, SVM, KNN | CNN, RNN, Transformers

🧠 Machine Learning Foundations – Summary

What is Machine Learning (ML)?


Machine Learning is a subset of Artificial Intelligence (AI) that enables computer systems to
learn from data and make predictions or decisions without being explicitly programmed.

🔍 How Does It Work?

1. Input Features: Data points like texture, color, etc.


2. Labels: The correct output (e.g., "dog" or "cat").
3. Training: The model learns the relationship between input features and output labels.
4. Inference: The trained model predicts outputs for new data.

🧪 Types of Machine Learning

Type | Description | Use of Labels | Examples
Supervised Learning | Learns from labeled data | ✅ Yes | Disease detection, spam filtering, stock prediction
Unsupervised Learning | Finds patterns in unlabeled data | ❌ No | Customer segmentation, anomaly detection
Reinforcement Learning | Learns from feedback and rewards | ⚠️ Indirect | Robotics, autonomous driving, game playing

💡 Real-World Applications

 E-commerce: Product recommendations


 Streaming Platforms: Personalized content suggestions
 Email: Spam detection
 Autonomous Vehicles: Navigation and decision-making
 Healthcare: Disease prediction

🧠 Supervised Learning: Classification – Summary

Supervised Learning is a machine learning approach where a model learns from labeled data.
There are two types of outputs:

 Continuous output → Regression


 Categorical output → Classification

🔍 What is Classification?

Classification is a supervised ML task where the goal is to assign a label or category to an input
based on its features.

 Binary Classification:
e.g., Spam detection → "Spam" or "Not Spam"
 Multi-class Classification:
e.g., Sentiment analysis → "Positive", "Negative", "Neutral"

🧪 Common Classification Algorithm: Logistic Regression

 Predicts a binary outcome (True/False, Pass/Fail)


 Uses a sigmoid function (S-curve) instead of a straight line
 Outputs a probability between 0 and 1
 Decision threshold (e.g., 0.5) is used to classify the result

Example:
Hours Studied | Probability to Pass | Classification
6 | 0.80 | Pass
4 | 0.20 | Fail
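
The sigmoid-plus-threshold idea can be sketched in a few lines. The weight and bias below are hypothetical values picked by hand so that the output matches the two rows of the example table; in practice they would be learned from training data.

```python
import math


def sigmoid(z):
    """S-shaped curve that squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters, chosen (not learned) to reproduce the table above.
W = math.log(4)        # weight applied to hours studied
B = -5 * math.log(4)   # bias; the probability is exactly 0.5 at 5 hours


def probability_to_pass(hours):
    return sigmoid(W * hours + B)


def classify(hours, threshold=0.5):
    """Apply the decision threshold to the predicted probability."""
    return "Pass" if probability_to_pass(hours) >= threshold else "Fail"


for hours in (6, 4):
    print(hours, round(probability_to_pass(hours), 2), classify(hours))
```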

🌸 Demo: Iris Dataset (Multi-Class Classification)

 Dataset: 150 flower instances from 3 classes


o Iris setosa, Iris versicolor, Iris virginica
 Features:
o Sepal length, Sepal width, Petal length, Petal width
 Model Used: Logistic Regression
 Goal: Predict the flower class based on the features

📘 Supervised Learning: Linear Regression – Summary

🧠 What is Supervised Learning?

Supervised learning is a machine learning approach where:

 The model is trained on labeled data


 It learns a mapping between input features and output labels
 Output can be:
o Categorical → use Classification
o Continuous → use Regression

🏠 Example Use Case: House Price Prediction

 Input (Feature): Size of the house (in sq. ft.)


 Output (Label): Price of the house (in dollars)

The model learns from a dataset of previous house sizes and their corresponding prices, then
predicts the price for a new house size.

📊 Visualizing the Data

 Plotted on a scatter plot


 Shows positive correlation (house size ↑ → price ↑)
 A straight line (best-fit line) is used to approximate the relationship
🧮 Linear Regression Model

 Model equation:

$f(x) = w \cdot x + b$

where:

o $w$: weight (slope), how much the price changes per unit of size
o $b$: bias (intercept), the base price when size is 0
 The model:
o Learns w and b by minimizing the difference between predicted and actual
prices
o Uses a method to reduce loss, which measures prediction error

⚠️Loss and Optimization

 Error = Actual value − Predicted value


 Loss = Squared error
 The algorithm adjusts w and b to minimize loss during training
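
One common way to adjust w and b is gradient descent on the mean squared error; the notes do not name the optimizer, so treat that choice as an assumption. The sketch below uses a tiny hypothetical dataset (sizes in thousands of sq. ft., prices in thousands of dollars) generated from price = 100 × size + 50, so the values the model should recover are known in advance.

```python
# Hypothetical training data, generated from price = 100 * size + 50.
sizes = [1.0, 1.5, 2.0]          # thousands of sq. ft.
prices = [150.0, 200.0, 250.0]   # thousands of dollars


def mean_squared_error(w, b):
    """Average squared difference between predicted and actual prices."""
    return sum((w * x + b - y) ** 2 for x, y in zip(sizes, prices)) / len(sizes)


def train(learning_rate=0.1, steps=5000):
    """Plain gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(sizes)
    for _ in range(steps):
        # Partial derivatives of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(sizes, prices)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(sizes, prices)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
predicted = w * 1.1 + b  # inference for a 1,100 sq. ft. house
print(round(w, 3), round(b, 3), round(predicted, 3))
```

Expressing size in thousands of square feet keeps the inputs small, which lets a simple fixed learning rate converge; with raw square-foot values the features would normally be scaled first.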

✅ Inference (Prediction)

Once trained, the model can:

 Take a new input (e.g., 1,100 sq. ft.)


 Predict its price using the learned function

📌 In Summary

Step | Description
1 | Provide training data (size → price)
2 | Fit a linear function $f(x) = wx + b$
3 | Use a loss function to optimize the model
4 | Predict new prices with the trained model


🧠 Unsupervised Machine Learning – Summary
📌 What is Unsupervised Learning?

Unsupervised learning is a type of machine learning where:

 No labeled outputs are provided.


 The algorithm learns patterns or structures from the data.
 It’s used to group or cluster similar data points without predefined categories.

🧩 Everyday Analogy

 Imagine giving a child a basket of colored LEGO pieces without instructions.


o The child may group them by color, size, or type based on observed similarities.
 Similarly, clustering groups similar data without explicit labels.

🍎 Example

You have a basket of:

 Apples (red & round)


 Bananas (yellow & long)
 Oranges (orange & round)

➡️Without labels, you might cluster:

 All round red fruits together


 All elongated yellow fruits together
➡️That’s unsupervised learning in action.

🔄 Clustering and Outliers

 Clustering: Groups similar data points together.


 Outlier: A data point that doesn’t belong to any cluster (e.g., grapes among apples &
bananas).

🛠️Key Use Cases


Use Case | Description
📊 Market Segmentation | Group customers based on purchase behavior (e.g., fitness products buyers)
💳 Outlier Analysis | Detect fraudulent credit card transactions
🎬 Recommendation Systems | Suggest movies/music based on similar user behavior

📐 Similarity

 Similarity measures how close two data points are.


 Similarity scores are often normalized to range from 0 (not similar) to 1 (very similar).
 Example: Apple and Cherry have high similarity based on color.

📈 Common Similarity Metrics

Metric | Description
Euclidean Distance | Straight-line distance between points (smaller = more similar)
Manhattan Distance | Distance along axes, like city blocks (smaller = more similar)
Cosine Similarity | Measures the angle between vectors (1 = same direction)
Jaccard Similarity | Compares shared vs. total features (0 to 1)
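
The four metrics can be written out directly. This is a minimal pure-Python sketch; the function names and the fruit feature sets are made up for the example.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def jaccard(set_a, set_b):
    """Shared features divided by total distinct features."""
    return len(set_a & set_b) / len(set_a | set_b)


print(euclidean([0, 0], [3, 4]))
print(manhattan([0, 0], [3, 4]))
print(cosine_similarity([1, 2], [2, 4]))
print(jaccard({"red", "round"}, {"red", "round", "small"}))
```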

🧭 Clustering Workflow

1. Prepare Data
o Remove missing values
o Normalize and scale features
2. Create Similarity Matrix
o Choose the right similarity metric based on data and goals
3. Run Clustering Algorithm
o Types include:
 Partition-based (e.g., K-means)
 Hierarchical
 Density-based (e.g., DBSCAN)
 Distribution-based (e.g., GMM)
4. Interpret and Adjust
o Analyze clusters and iteratively refine
o No ground truth, so evaluation is exploratory
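
Step 3 of the workflow can be illustrated with a bare-bones K-means loop. This is a sketch under simplifying assumptions: the points are hypothetical and already cleaned and scaled, squared Euclidean distance is the similarity measure, and the starting centroids are fixed by hand rather than chosen randomly.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-centering."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical 2-D data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

Because there is no ground truth, the result still has to be inspected: a different number of clusters or different starting centroids may give a grouping that is more useful for the task.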

✅ In Summary
Unsupervised learning is about discovering hidden patterns in data:

 No labels are needed.


 You can cluster, detect outliers, or personalize recommendations.
 It's exploratory, iterative, and powerful for understanding structure in your data.

🎓 Reinforcement Learning – Summary


📌 What Is It?

Reinforcement Learning (RL) is a type of machine learning where an agent learns by
interacting with its environment, receiving rewards or penalties based on its actions,
similar to how we train pets through praise or correction.

🧠 Key Terminology

Term | Definition
Agent | The learner or decision-maker (e.g., self-driving car, robotic arm, dog)
Environment | The context in which the agent operates (e.g., road, warehouse)
State | A snapshot of the environment at a given moment
Action | A decision or move the agent can take (e.g., turn left, pick item)
Reward | Feedback from the environment: positive for good actions, negative for bad ones
Policy | The strategy the agent follows to decide which action to take in a given state
Optimal Policy | The best strategy that maximizes cumulative rewards over time

🐕 Simple Analogy

 Training a dog:
o Agent: Dog
o Environment: Training space
o Reward: Treat for right trick
o Penalty: No treat or correction
➡ Over time, the dog learns the right actions to maximize rewards.

🤖 Practical Examples

 Autonomous vehicles: Learn to navigate roads safely using sensor data.


 Smart assistants: Adapt to your voice and preferences (e.g., Siri, Alexa).
 Robotics: Optimize item placement in a warehouse using trial and error.
 Gaming: AI opponents improve as you play more.

🦾 Robotic Arm Case Study

Goal: Teach a robotic arm to place items accurately in a warehouse.

1. Define the environment: Arm, items, layout, targets.


2. Define state: Position of arm, items, and goals.
3. Define actions: Pick, move, place, etc.
4. Define rewards: +1 for correct placement, -1 for dropping or damaging.
5. Train:
o Initially tries random actions.
o Learns which ones lead to high rewards.
o Gradually improves performance using algorithms like Q-Learning or Deep Q-
Networks (DQN).
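
The training loop in step 5 can be sketched with tabular Q-learning on a heavily simplified stand-in for the arm: a hypothetical one-dimensional track with five positions, where reaching the last position counts as a correct placement. The environment, the reward of +1 at the target, and the hyperparameters are all assumptions for illustration; the penalty for dropping items is left out to keep the example short.

```python
import random

N_STATES = 5                 # positions 0..4 on a hypothetical track
GOAL = N_STATES - 1          # position 4 is the target
ACTIONS = ["left", "right"]


def step(state, action):
    """Return (next_state, reward): +1 for reaching the target, 0 otherwise."""
    next_state = max(0, state - 1) if action == "left" else min(GOAL, state + 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _episode in range(episodes):
        state = 0
        for _step in range(100):            # cap the episode length
            action = rng.choice(ACTIONS)    # purely random exploration
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if state == GOAL:
                break
    return q


q = train()
# Greedy policy: in each non-terminal state, pick the action with the highest Q-value.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

A Deep Q-Network replaces the lookup table `q` with a neural network, which matters when the state space (arm position, items, layout) is too large to enumerate.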

🧠 Why It Matters

Reinforcement Learning is useful for problems where:

 You can’t define exact rules.


 Outcomes are only known after actions.
 The environment is dynamic or complex.

🔍 Introduction to Deep Learning

Deep Learning is a subset of machine learning that uses artificial neural networks (ANNs) to
solve tasks like image classification. It’s powerful because:

 It can process raw data (like image pixels).


 It automatically extracts features without manual input.
 It scales well with large datasets using parallel computations (e.g., via GPUs).

🧠 Why Deep Learning?

 Traditional ML requires manual feature engineering.


 Deep learning extracts hierarchical features directly from data.
 Well-suited for complex, high-dimensional data (images, audio, text, video).
🕰️History of Deep Learning (Milestones)

Year | Milestone
1950s | Artificial neuron, perceptron concepts introduced
1980s | Backpropagation algorithm
1990s | CNNs for image analysis
2000s | GPU adoption for training
2010+ | Explosion in applications: CV, NLP, speech
2012–2013 | AlexNet (2012), Deep Q-Network (2013)
2016+ | Rise of generative models (GANs, transformers)

📊 Deep Learning Applications by Data Type

Data Type | Example Applications | Common Architectures
Image | Image classification, object detection | CNN, GAN
Text | Sentiment analysis, translation | RNN, LSTM, Transformers
Audio | Speech-to-text, music generation | RNN, WaveNet
Video | Action recognition, video generation | CNN+RNN, Transformers

🔧 How ANN Works (Simplified)

1. Layers:
o Input layer: receives data (e.g., 28x28 pixels)
o Hidden layers: extract features
o Output layer: returns predictions (e.g., digit 0–9)
2. Neurons:
o Process inputs, apply weights, bias, and activation function
3. Forward Pass:
o Data flows through the network; outputs are generated
4. Backpropagation:
o Compares prediction with true label
o Error is calculated and weights adjusted to minimize loss
o This is done repeatedly (over many images/batches) = training

🧪 Example – Digit Recognition with ANN

 Input: 28×28 pixel images of digits


 Output: 10 neurons representing digits 0–9
 Architecture:
o Input layer: 784 neurons
o Hidden layers: 2 layers with 16 neurons each
o Output layer: 10 neurons
 Training:
o Show labeled images
o ANN adjusts weights via backpropagation
o Learns to recognize digits correctly
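
For a sense of scale, the sketch below counts the trainable parameters (weights plus biases) of the 784-16-16-10 network described above and shows what a single neuron computes. The helper names and the sample inputs are assumptions for illustration.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def relu(z):
    """Activation function: pass positive values through, clamp negatives to 0."""
    return max(0.0, z)


def neuron(inputs, weights, bias):
    """One neuron: weighted sum of the inputs plus bias, passed through ReLU."""
    return relu(sum(x * w for x, w in zip(inputs, weights)) + bias)


total = count_parameters([784, 16, 16, 10])
activation = neuron([0.5, 1.0], weights=[2.0, -1.0], bias=0.5)
print(total, activation)
```

Backpropagation adjusts every one of these parameters a little on each training batch.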

🎯 Summary

 Deep learning enables end-to-end learning from raw data.


 It’s revolutionized fields like computer vision, NLP, and generative AI.
 Core technologies include ANNs, CNNs, RNNs, LSTMs, and transformers.
 Success depends on large datasets, compute power, and proper architectures.

🧠 Deep Learning for Sequential Data


✅ What is Sequential Data?

 Ordered list of data points/events


 Requires models that capture temporal or positional dependencies

📌 Real-World Applications

Domain Use Case


NLP Sentiment analysis, machine translation, text generation
Speech Speech-to-text conversion
Music Music composition & generation
Sign Language Gesture recognition
Finance / Weather Time series forecasting

🔄 Recurrent Neural Networks (RNN)


🔹 Key Characteristics

 Maintains a hidden state (memory) across time steps


 The hidden state at time t depends on the input at time t and the hidden state from time t−1
 Capable of learning temporal dependencies

🔸 RNN Architectures
Architecture | Description | Example
One-to-One | Standard neural net (not sequential) | Image classification
One-to-Many | 1 input → sequence of outputs | Music generation
Many-to-One | Sequence of inputs → 1 output | Sentiment analysis
Many-to-Many | Sequence in → sequence out | Machine translation, NER

⚠️Limitation

 Suffers from vanishing gradient problem → poor at capturing long-term dependencies

⏳ Long Short-Term Memory (LSTM)


🔹 What is LSTM?

 A type of RNN that can remember long-term dependencies


 Uses cell state and gates to control memory

🔸 Components of LSTM

Gate Function
Input Gate Decides what to add to memory
Forget Gate Decides what to discard
Output Gate Decides what to output as hidden state

🧭 Workflow at Each Timestep

1. Takes current input, previous hidden state, and previous cell state
2. Gates filter and process this information
3. Updates the cell state
4. Outputs current hidden state (used for next step)
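
The four workflow steps correspond to the standard LSTM update equations. Below is a minimal sketch for a single scalar input and scalar state; real LSTMs use weight matrices and vectors, and the parameter names and values here are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One timestep of a scalar LSTM cell; p holds hypothetical weights and biases."""
    forget = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])        # what to discard
    inp = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])           # what to add
    candidate = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])   # proposed new memory
    output = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])        # what to expose
    c = forget * c_prev + inp * candidate   # update the cell state
    h = output * math.tanh(c)               # new hidden state, passed to the next step
    return h, c


names = ("wf", "uf", "bf", "wi", "ui", "bi", "wc", "uc", "bc", "wo", "uo", "bo")
params = {name: 0.5 for name in names}

h, c = 0.0, 0.0
for x in [1.0, 0.5, -0.3]:   # a short hypothetical input sequence
    h, c = lstm_step(x, h, c, params)
print(h, c)
```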

📝 Summary
 Use RNN for short-sequence tasks and LSTM for longer dependencies
 Both are essential in processing data where order matters
 LSTM resolves memory decay problems that RNNs face over long sequences

🧠 Convolutional Neural Networks (CNNs) — Overview


CNNs are a type of deep learning model designed for processing grid-like data, especially
images and videos.

📚 Common Deep Learning Architectures

Model Purpose
FNN / MLP Feedforward Neural Network (basic ANN)
CNN Detects spatial/local patterns in images/videos
RNN Handles sequential data (e.g. text, time series)
LSTM Specialized RNN for long-term memory
Autoencoder Feature extraction, anomaly detection
GAN Generates synthetic data (images, audio, text)
Transformer State-of-the-art in NLP and generative tasks

🧩 CNN Architecture
1️⃣ Input Layer

 Accepts 2D image data (unlike flattening in ANN).

2️⃣ Feature Extraction Layers

Layer | Analogy | Description
Convolutional Layer | Blueprint Detector | Applies filters (kernels) to detect patterns like edges or shapes
Activation Function (e.g. ReLU) | Pattern Highlighter | Adds non-linearity to capture complex features
Pooling Layer (e.g. Max Pooling) | Room Summarizer | Reduces spatial dimensions and computational load
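
The three feature-extraction layers can be traced by hand on a tiny example. The sketch below assumes a hypothetical 5×5 image whose left side is dark (0) and right side is bright (1), and a hand-written 2×2 kernel that responds to dark-to-bright vertical edges; in a trained CNN the kernel values are learned.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Replace negative responses with 0."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


image = [[0, 0, 1, 1, 1] for _ in range(5)]   # dark left, bright right
kernel = [[-1, 1], [-1, 1]]                   # responds to dark-to-bright vertical edges

features = relu(convolve2d(image, kernel))    # 4x4 feature map
pooled = max_pool(features)                   # 2x2 summary
print(features)
print(pooled)
```

The pooled output keeps the information that an edge exists on the left half of the feature map while shrinking the data from 16 values to 4.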

3️⃣ Classification Layers

Layer | Analogy | Function
Fully Connected Layer | House Expert | Learns final classification from extracted features
Softmax Layer | Guess Maker | Converts raw scores to class probabilities
Dropout Layer | Quality Checker | Prevents overfitting by randomly dropping neurons during training
🔍 CNN Strengths
✅ Automatically extracts meaningful patterns from raw data
✅ Works well with 2D data (images, videos)
✅ Scales to complex image-related tasks

⚠️CNN Limitations
Issue Description
Computational Cost Requires powerful hardware (GPUs)
Overfitting Sensitive to small data sets or class imbalance
Interpretability Black-box nature makes it hard to explain
Input Sensitivity Small changes in images can affect output

📸 Real-World Applications
 🐱 Image Classification: e.g., Cat vs Dog
 📦 Object Detection: Bounding boxes (YOLO, SSD)
 🧠 Medical Imaging: Tumor detection, X-ray analysis
 🧍 Face Recognition: Facial identification and verification
 🚗 Autonomous Driving: Detecting lanes, signs, pedestrians
 🌍 Remote Sensing: Satellite image segmentation, land use classification

🧾 Summary
CNNs are powerful deep learning models tailored for image data. Their architecture mimics
human visual perception using layers to extract and classify features. While they are excellent for
visual tasks, they require significant computation and careful handling to avoid overfitting and
ensure robust performance.

🧪 Demonstration Overview: Classifying Non-Linear Data with Deep Learning
🌀 Problem

Use a deep learning classifier to separate two concentric circles (non-linear data) generated
with make_circles from scikit-learn.
 Label 0: Outer circle
 Label 1: Inner circle
 Objective: Predict the correct label for new data points

🔧 Dataset Generation
make_circles() Parameters:

Parameter Description
n_samples=300 Total data points (150 per class)
noise=0.1 Adds randomness to points (higher = more scattered)
factor=0.5 Ratio of inner to outer circle radius
random_state=1 Ensures reproducibility

Example:

```python
from sklearn.datasets import make_circles
# 300 points on two noisy concentric circles; y holds the labels (0 = outer, 1 = inner)
X, y = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=1)
```

🧠 Using MLP Classifier


🧮 Classifier: Multi-layer Perceptron (MLP)

 Model: One hidden layer


 Activation: ReLU (to support non-linearity)
 Training: Uses backpropagation

🔢 Impact of Hidden Layer Size

Neurons Result
1 Poor separation; all points labeled 0
2 Slight improvement; still inaccurate
3–6 Increasing complexity of decision boundary, better separation and accuracy

🖼️Visualization (with Interactive Widget)


How It's Built:
 Slider changes the number of hidden neurons
 Each change re-trains and redraws the plot

Plot Features:

 Background shading shows decision boundary


 Red points = Label 0, Green points = Label 1
 Uses matplotlib.pyplot.contourf() for visualization

🧱 Code Logic Summary


1. Create MLP Classifier
```python
from sklearn.neural_network import MLPClassifier

n = 6  # number of hidden neurons; in the demo this value comes from the slider
clf = MLPClassifier(hidden_layer_sizes=(n,), activation='relu', max_iter=1000,
                    random_state=1)
```

2. Train Model
```python
clf.fit(X, y)
```

3. Predict on Grid Data

 Generate a grid of X and Y values over the data range


 Predict on this grid to visualize decision boundary
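
The grid step might look like the following sketch, which assumes NumPy and scikit-learn are installed. The margin, the grid step, and the choice of 6 hidden neurons are assumptions; the names `xx`, `yy`, and `Z` follow the plotting snippet in the next step.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.neural_network import MLPClassifier

# Recreate the data and a trained classifier so the example is self-contained.
X, y = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=1)
clf = MLPClassifier(hidden_layer_sizes=(6,), activation="relu", max_iter=1000,
                    random_state=1)
clf.fit(X, y)

# Grid covering the data range plus a small margin (margin and step are assumptions).
margin, step = 0.5, 0.05
xx, yy = np.meshgrid(
    np.arange(X[:, 0].min() - margin, X[:, 0].max() + margin, step),
    np.arange(X[:, 1].min() - margin, X[:, 1].max() + margin, step),
)

# Flatten the grid into (n_points, 2), predict a label for every grid point,
# and keep Z flat so it can be reshaped to xx.shape when plotting.
Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])
print(xx.shape, Z.shape)
```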

4. Plot with Contour and Scatter


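```python
import matplotlib.pyplot as plt

# Shade the predicted regions; alpha is an assumed styling choice (the original elides these arguments)
plt.contourf(xx, yy, Z.reshape(xx.shape), alpha=0.3)
plt.scatter(X[y==0][:, 0], X[y==0][:, 1], color='red')    # Label 0: outer circle
plt.scatter(X[y==1][:, 0], X[y==1][:, 1], color='green')  # Label 1: inner circle
```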

📌 Key Takeaways
 MLP (deep learning) can model non-linear boundaries better than linear models.
 In this demo, increasing the hidden layer size (from 1 up to 6 neurons) improves accuracy and boundary complexity.
 Visualization helps understand model performance and boundaries.
 make_circles is a great benchmark for testing non-linear classification performance.
🧠 Introduction to Generative AI – Summary

1. What is AI, ML, and Deep Learning?

 AI (Artificial Intelligence): The broader concept of machines mimicking human
intelligence.
 Machine Learning (ML): A subset of AI where algorithms learn from past data to make
predictions or detect patterns.
 Deep Learning (DL): A subset of ML using neural networks to learn from complex
data (e.g., images, speech).

2. What is Generative AI (GenAI)?

 A type of AI that creates new content (text, images, music, video, etc.).
 Based on deep learning, especially neural networks trained on large datasets.
 It learns patterns from data and generates original outputs (not copied).

📌 Example: If trained on pictures of dogs, GenAI can draw new dogs by learning their features
(ears, teeth, etc.).

3. How Generative AI Differs from Traditional Machine Learning:

Traditional ML | Generative AI
Typically trained on labeled data | Typically pre-trained on unlabeled data
Output: predictions or labels | Output: new content (text, images)
Used for classification, forecasts | Used for creation and automation

4. Types of Generative AI Models:

 Text-based GenAI:
Generates text, dialogue, articles, or even code (e.g., ChatGPT, Copilot).
 Multimodal GenAI:
Can understand and generate across multiple media types:
📄 Text, 🖼️Images, 🎵 Audio, 🎥 Video.

5. Applications of GenAI:

 Content creation (marketing, media)


 Image/video generation
 Medical imaging, drug discovery
 Scientific research
 Personalized chatbots
 Code generation
 Education, design, and more

📚 Lesson Summary: Large Language Models (LLMs)

🧠 What is a Language Model?

 A language model is a probabilistic model that predicts the next word in a sequence
based on previous words.
 It assigns probabilities to each word in its vocabulary to determine which is most likely
to follow.
o Example:
“They sent me a ___” → likely choices:
 Dog (0.45)
 Lion (0.03)
 Elephant (0.02)
 EOS (End of sentence)

🔍 How Does It Work?

 Highest probability word is chosen to continue the sentence.


 The output word is appended and the model predicts again.
 The process continues until an EOS (End of Sentence) token is predicted.

👉 Illustration:
Input: “They sent me a” → Model predicts: dog → Then EOS → Sentence ends.
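
The predict-append-repeat loop can be sketched with a toy lookup table standing in for the model. Apart from the three probabilities quoted in the example above, every value in the table is hypothetical, and unlike a real language model this stand-in looks only at the last word rather than the whole context.

```python
# Toy next-word table keyed by the last word of the context.
# Values other than dog/lion/elephant are made up; omitted vocabulary
# words account for the remaining probability mass.
NEXT_WORD = {
    "a": {"dog": 0.45, "lion": 0.03, "elephant": 0.02, "<EOS>": 0.01},
    "dog": {"<EOS>": 0.90, "and": 0.05},
}


def generate(prompt, max_new_tokens=10):
    """Greedy decoding: repeatedly append the most probable next word until EOS."""
    words = prompt.split()
    for _ in range(max_new_tokens):
        distribution = NEXT_WORD.get(words[-1], {"<EOS>": 1.0})
        next_word = max(distribution, key=distribution.get)
        if next_word == "<EOS>":
            break
        words.append(next_word)
    return " ".join(words)


sentence = generate("They sent me a")
print(sentence)
```

Always taking the highest-probability word is only one decoding strategy; sampling from the distribution instead gives more varied output.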

📏 Why “Large” in LLMs?

 "Large" refers to the number of parameters in the model.


 Parameters = weights in the neural network that get optimized during training.
 Some LLMs have hundreds of billions of parameters.
 But more parameters ≠ always better performance (risk of overfitting).

⚙️What Can LLMs Do?

LLMs are capable of a wide range of Natural Language Processing (NLP) tasks:
Task | Example
🧠 Question Answering | “What is the capital of France?” → “Paris”
✍️ Text Generation | Write essays or emails
🌐 Translation | “How are you?” → “Comment allez-vous?”
📊 Sentiment Analysis | Detect tone in a review
🔍 Summarization | Shorten long documents

🔗 What Are LLMs Based On?

 Based on Transformer architecture from deep learning.


 Transformers use self-attention to capture contextual relationships between words.
 Key strength: Ability to understand context and dependencies across long text spans.

📦 Training Data

 LLMs are trained on massive text datasets, including large parts of the internet.
 They do not require labeled data during pre-training.
 Can be fine-tuned later for specific tasks using labeled data.

✅ In Summary

 LLMs predict the most likely next word in a sentence using probabilities.
 They are deep learning models (neural networks) trained on huge text datasets.
 Transformer architecture gives them the ability to model complex dependencies.
 They power many generative AI applications like ChatGPT, Bard, Claude, etc.

🧠 Lesson Summary: Transformer Architecture (Part 1)

📌 The Challenge with Understanding Language

 Sentence example: "Jane threw the Frisbee and her dog fetched it."
o For humans, it's easy to know "it" = "Frisbee"
o For machines, long-range dependencies (like "Frisbee" ↔ "it") are harder to
understand.
🔁 Limitations of RNNs (Recurrent Neural Networks)

 RNNs process one word at a time using a hidden state.


 Struggle with long sentences or distant word relationships.
 Suffer from vanishing gradients, making it hard to retain earlier context.
 Can’t handle long-range dependencies well (e.g., “Frisbee” and “it”).

⚡ Introduction to Transformer Models

 Transformers can process all words in a sentence simultaneously.


 They view the whole sentence like a “bird’s eye view”, not word-by-word.
 This allows the model to understand how all the words relate contextually.

✨ Self-Attention Mechanism

 Key innovation in Transformers.


 Allows the model to:
o Assign importance (weights) to different words.
o Understand the relationship between all words in the input.
 In our sentence, it helps the model know “it” refers to “Frisbee” by comparing all words
together.
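
The weighting step described above can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product self-attention, not a production layer: the two-dimensional vectors are made up, and the same vectors are reused as queries, keys, and values, whereas a real Transformer learns separate projections for each.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token's query against every token's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Mix the value vectors according to the attention weights.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical vectors for three tokens; real models learn these.
tokens = ["Frisbee", "dog", "it"]
vectors = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
_, weights = self_attention(vectors, vectors, vectors)
it_weights = dict(zip(tokens, weights[2]))  # how much "it" attends to each token
```

With these toy vectors, "it" places more weight on "Frisbee" than on "dog", which is the behaviour the sentence example relies on.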

🔧 Transformer Structure

 Introduced in the famous paper: “Attention Is All You Need”


 Consists of two main parts:
1. Encoder: Takes input text and turns it into contextual vectors
2. Decoder: Uses these vectors to generate the output sequence
 Both encoder and decoder use self-attention layers.

🔄 Comparison: RNN vs Transformer

Feature RNN Transformer

Processing Style Sequential (1 word at a time) Parallel (all words at once)

Memory Hidden state (limited context) Global self-attention (full context)



Long-range handling Poor Excellent

Speed & Efficiency Slower Faster (supports parallelism)

✅ In Summary

 Transformers outperform RNNs by analyzing entire sequences at once.


 Self-attention allows Transformers to understand context deeply.
 The encoder-decoder structure is essential in sequence-to-sequence tasks (e.g.,
translation).

🔄 Transformer Architecture – Part 2

🧱 Key Components of Transformer Models

1. Encoder
o Takes input text
o Outputs embeddings (vector representations of tokens)
2. Decoder
o Takes embeddings or previous text
o Predicts the next token, one at a time
3. Encoder-Decoder
o Combines both for sequence-to-sequence tasks (e.g., translation)

🔤 Tokens and Tokenization

 Token = smallest unit a model understands (word, subword, or punctuation)


 Example:
o "apple" = 1 token
o "friendship" = 2 tokens: "friend" + "ship"
o Punctuation (like ,, .) = separate tokens
 Token count:
o Simple text ≈ 1 token per word
o Complex text ≈ 2–3 tokens per word
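
To make the token examples concrete, here is a small sketch of a greedy longest-match subword tokenizer. The vocabulary is hypothetical and tiny; real tokenizers learn tens of thousands of pieces from data and use more refined algorithms, so treat this only as an illustration of how "friendship" can become two tokens.

```python
import re

# Hypothetical vocabulary for illustration only.
VOCAB = {"apple", "friend", "ship", "s", ",", "."}

def tokenize(text):
    """Greedy longest-match split of each word into known vocabulary pieces."""
    tokens = []
    # Separate words from punctuation first.
    for chunk in re.findall(r"\w+|[^\w\s]", text.lower()):
        start = 0
        while start < len(chunk):
            # Try the longest remaining piece first, then shrink.
            for end in range(len(chunk), start, -1):
                if chunk[start:end] in VOCAB:
                    tokens.append(chunk[start:end])
                    start = end
                    break
            else:
                # No known piece: fall back to a single character.
                tokens.append(chunk[start])
                start += 1
    return tokens
```

Calling `tokenize("apple")` yields one token, while `tokenize("friendship")` yields the two pieces "friend" and "ship", and punctuation comes out as separate tokens.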

📈 Embeddings
 Embeddings = numeric/vector representation of text (tokens, sentences, etc.)
 Captures semantic meaning and relationships
 Used in:
o Semantic Search
o Vector Databases
o Retrieval-Augmented Generation (RAG)

🔍 Retrieval-Augmented Generation (RAG)

 Combines:
1. Retrieval from vector database (via embeddings)
2. Generation using a large language model (LLM)
 Use case: Answering questions using internal + external knowledge sources

🧠 Decoder Models

 Input: Partial sequence (e.g., “They sent me a”)


 Output: Next token based on probability
 Decodes one token at a time
 Repeatedly called to generate full sequence (e.g., text completion, article writing)

🔁 Encoder-Decoder Architecture

 Used in sequence-to-sequence tasks (e.g., translation)


 Workflow:
1. Input text → tokenized → encoded to embeddings
2. Embeddings → decoded one token at a time
3. Decoder uses prior outputs (self-loop) to generate next token
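
The one-token-at-a-time loop can be sketched as follows. The probability table is invented and looks only at the last token, whereas a real decoder computes probabilities from the whole sequence with a neural network; the `<end>` marker is also an assumption of this sketch.

```python
# Hypothetical next-token probabilities keyed by the previous token.
NEXT_TOKEN_PROBS = {
    "a": {"letter": 0.6, "gift": 0.3, "dog": 0.1},
    "letter": {"yesterday": 0.7, "<end>": 0.3},
    "yesterday": {"<end>": 1.0},
}

def next_token(tokens):
    """Greedy choice: return the most probable token given the last one."""
    candidates = NEXT_TOKEN_PROBS.get(tokens[-1], {"<end>": 1.0})
    return max(candidates, key=candidates.get)

def generate(prompt, max_new_tokens=10):
    """Call the decoder step repeatedly until it emits the end marker."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token == "<end>":
            break
        tokens.append(token)  # each output becomes part of the next input
    return " ".join(tokens)

completion = generate("They sent me a")
```

Each pass through the loop feeds the sequence generated so far back into the decoder step, which is the "self-loop" in the workflow above.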

🔀 Summary of Transformer Model Types

Model Type Description Use Case Examples


Encoder-only Understands & embeds input text Semantic search, RAG, classification
Decoder-only Generates new text from given input Text generation, summarization
Encoder-Decoder Converts one sequence to another Translation, Q&A, summarization
✅ Key Takeaways

 Tokens and embeddings are foundational for all transformer models.


 Transformer models come in three main variants, each suited to specific tasks.
 Encoder-only: understand input
 Decoder-only: generate output
 Encoder-decoder: translate/transform sequences
 Architecture choice depends on task requirements.

🎯 Prompt Engineering – Overview

🧾 What is a Prompt?

 A prompt is the input text provided to a Large Language Model (LLM).


 The model generates responses by predicting the next token based on this input.

🛠️What is Prompt Engineering?

 The iterative process of crafting or refining prompts to elicit a desired, high-quality
response from an LLM.
 Goal: Convert user intent into input the model can understand and respond to effectively.

🧠 LLM Behavior Recap


 LLMs are fundamentally text completion models.
 Given a phrase like:
"Four score and seven years ago…",
the model continues based on training data patterns (e.g., Lincoln’s Gettysburg Address).
 ➤ Limitation:
These models predict what’s likely next, not necessarily what’s correct or safe.

🧪 Instruction Tuning & RLHF


 Instruction Tuning
→ Fine-tuning LLMs to follow instructions, not just continue text.
 Reinforcement Learning from Human Feedback (RLHF)
o Human labelers rate model responses.
o A reward model is trained to align LLM outputs with human preferences.
o Used in advanced instruction-tuned models like Llama 2 Chat, GPT-4, etc.

✅ Successful Prompting Techniques

1. In-Context Learning & Few-Shot Prompting

Type Description

Zero-Shot Provide only task description.

One-Shot Give one example + task.

Few-Shot Give multiple examples to guide the model's behavior.

📌 Few-shot prompting often outperforms zero-shot prompting, although the gain varies by task and model.

Example:

Task: Translate English to French

 Provide 3 examples in prompt


 Ask: "Cheese → ?"

2. Chain-of-Thought Prompting

 Encourage the model to show its reasoning before giving a final answer.
 Great for math, logic, and multi-step problems.

Example:

Prompt:
"Roger has 5 tennis balls, buys 2 cans (3 balls each). How many total?"

Response with Chain-of-Thought:


5 + (2×3) = 11 balls → Final answer: 11
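
Both prompting techniques come down to how the input string is assembled. The sketch below shows one possible way to build such prompts; the function names, the arrow separator, and the three example pairs are illustrative assumptions, not a format any particular model requires.

```python
def few_shot_prompt(task, examples, query):
    """Build a prompt from a task description, worked examples, and the new input."""
    lines = [task]
    for source, target in examples:
        lines.append(f"{source} -> {target}")
    lines.append(f"{query} ->")  # left open for the model to complete
    return "\n".join(lines)

def chain_of_thought_prompt(question):
    """Ask the model to reason step by step before answering."""
    return f"{question}\nThink step by step, then state the final answer."

examples = [("sea otter", "loutre de mer"),
            ("peppermint", "menthe poivrée"),
            ("bread", "pain")]
few_shot = few_shot_prompt("Translate English to French", examples, "cheese")
zero_shot = few_shot_prompt("Translate English to French", [], "cheese")
cot = chain_of_thought_prompt(
    "Roger has 5 tennis balls, buys 2 cans (3 balls each). How many total?")
```

Passing an empty example list gives a zero-shot prompt, one pair gives one-shot, and several pairs give few-shot.
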
⚠️Hallucination in LLMs
🔍 What is Hallucination?

 When the model generates non-factual or made-up text, despite being grammatically
correct and fluent.

Example:

“In the U.S., people adopted driving on the left…”


🛑 Factually wrong → hallucination.

❗ Challenge:

 Hallucinations can be subtle and hard to detect.


 Mitigation strategies:
o Use retrieval-augmented generation (RAG).
o Research on groundedness metrics is ongoing.

📌 Recap
Topic Summary

Prompt Engineering Crafting inputs to control and improve LLM outputs

Instruction Tuning Training LLMs to follow explicit instructions

In-Context & Few-Shot Prompt Provide examples in prompt to guide model

Chain-of-Thought Encourage reasoning before final answer

Hallucination Outputs not grounded in fact or training data

🎯 Customizing LLMs to Work with Your Own Data


🔁 Two Axes of Customization

Axis | Focus | Example
Context Optimization (Horizontal) | Provide more user-/domain-specific info | Orders, documents, receipts, etc.
LLM Optimization (Vertical) | Adapt model behavior to task/domain | Legal tone, chatbot task flow
🔧 1. Prompt Engineering
 🟢 Easiest to start with: Fast, no cost, iterative.
 ✅ Use when:
LLM already understands your task and general domain.
 ⚙️Example: Few-shot prompting, zero-shot tasks, prompt chaining.

📚 2. Retrieval-Augmented Generation (RAG)


 🧠 Architecture:
o Retrieval → Search enterprise knowledge base (e.g. vector DB).
o Augmented Generation → LLM uses retrieved data to generate grounded
responses.
 ✅ Use when:
o You want factual, up-to-date, or private information.
o You want to reduce hallucination.
 🟢 No fine-tuning required.
 🔴 Needs a high-quality knowledge base and integration.

📦 Example Flow (Chatbot):

1. User: "I want to return a dress."


2. LLM queries return policy in private DB.
3. LLM uses receipt, date, and sale info to validate.
4. Response is accurate and grounded in enterprise data.
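
The retrieve-then-generate flow can be sketched without any external service. Everything here is a stand-in: the word-count "embedding" replaces a real embedding model, the three policy sentences are invented, and the final prompt would be sent to an LLM in a real system.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: word counts. A real system calls an embedding model."""
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine_similarity(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical knowledge base standing in for a private vector database.
DOCUMENTS = [
    "You can return a dress within 30 days with a receipt.",
    "Gift cards cannot be exchanged for cash.",
    "Standard shipping takes five business days.",
]

def retrieve(question, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine_similarity(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question):
    """Augment the question with retrieved context before generation."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("I want to return a dress.")
```

Because the model is asked to answer from the retrieved text rather than from memory, the response stays grounded in the enterprise data.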

🧬 3. Fine-Tuning
 ⚙️Customize a pre-trained LLM using labeled, domain-specific data.
 🚀 Improves:
o Model accuracy
o Style/tone adherence
o Efficiency (shorter prompts, fewer tokens)
 ✅ Use when:
o LLM fails at task
o Domain requires specialized knowledge
o You want long-term consistency across use cases
🔬 OCI T-Few (Advanced)

 Fine-tunes only select layers to reduce cost/training time.

🧭 Choosing the Right Strategy


Method | Pros | Cons | Best For
Prompt Engineering | Easy, zero cost, fast iterations | Limited task specificity | General tasks, prototyping
RAG | Real-time, grounded, uses private/up-to-date info | Complex setup, needs high-quality data | Chatbots, support, enterprise knowledge
Fine-Tuning | Highly accurate, efficient, custom tone/style | Requires labeled data, costly, time-intensive | Domain-specific apps, regulated industries

🛤️Typical Implementation Path


1. Start → Prompt Engineering
2. Add RAG if you need external/private context
3. Fine-tune if:
o Output format/style isn't ideal
o RAG alone is not performant

You can combine all three:


👉 Prompt Engineering + RAG + Fine-Tuning

🧩 Final Framework (Summary Visual)


LLM Optimization ↑
| ◉ Fine-Tuning (custom style/accuracy)
| ◉ RAG + Fine-Tuning (domain-optimized answers)
| ◉ RAG (add context from DB/knowledge base)
|◉ Prompt Engineering (zero/few-shot, chain-of-thought)
+----------------------------------------------→ Context Optimization

📌 Overview of OCI AI Services


Oracle Cloud Infrastructure (OCI) offers pre-built AI services to help organizations leverage
their data for business-specific purposes, without needing to manage infrastructure.
🔧 How OCI AI Services Are Accessed
1. OCI Console – Browser-based UI for managing services and notebook sessions.
2. REST APIs – Programmatic access requiring development expertise.
3. Language SDKs – Available for Java, Python, JavaScript, .NET, Go, etc.
4. OCI CLI – Command-line interface for scripting and direct access.

🧠 Types of OCI AI Services


Service | Functionality
Language | Text analysis: Sentiment analysis, NER, classification, translation, PII detection
Vision | Image analysis: Object detection, classification, OCR (optical character recognition)
Speech | Speech-to-text: Converts audio to JSON/SRT formatted transcripts
Document Understanding | OCR, key-value extraction (e.g., invoices), table extraction, classification
Digital Assistant | AI-powered chatbot platform that routes conversation flows intelligently

🧩 Language Service Breakdown


 Pre-trained models:
Language detection, sentiment analysis, key phrase extraction, NER, text classification,
PII detection
 Custom models:
Train NER and classification with your own data
 Text translation:
Translate across multiple languages using NMT

🖼️Vision Service Breakdown


 Pre-trained models:
Object detection, image classification, OCR
 Custom models:
Custom object detection + bounding boxes, custom image classification
🔈 Speech Service
 Converts media files to text
 Output in JSON and SRT format
 Supports highly accurate transcription for human speech
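
For readers unfamiliar with SRT, the sketch below shows how timed text segments map onto that caption format. The segment tuples are a hypothetical input shape chosen for illustration; this is not the structure of the Speech service's JSON output.

```python
def to_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"

def to_srt(segments):
    """Render (start, end, text) tuples as numbered SRT caption blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{index}\n{to_timestamp(start)} --> {to_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

captions = to_srt([(0.0, 2.5, "Hi, thanks for contacting support."),
                   (2.5, 5.0, "I'm sorry to hear about your Bluetooth issue.")])
```

Each caption block is a sequence number, a start and end time, and the text, separated from the next block by a blank line.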

📄 Document Understanding
 Text Extraction: Line/word-level text with coordinates
 Key-Value Extraction: From structured documents like receipts, IDs
 Table Extraction: Preserves table structure
 Classification: Classify documents into predefined types

💬 Oracle Digital Assistant


 Build conversational interfaces that:
o Route user requests to the right “skills”
o Handle disambiguation, interruptions, and session control
o List available functions upon greeting

✅ OCI Machine Learning Services Overview


OCI offers cloud services to build, train, deploy, and manage machine learning models,
helping data scientists work efficiently across the entire ML lifecycle.

🔁 AI vs. ML in Oracle Cloud Stack

 AI Services: Pre-trained models for specific business use cases (no coding or data
science required).
 ML Services (OCI Data Science): Tools for data scientists to build custom models
using code and open-source frameworks.

💡 What is OCI Data Science?


A fully-managed cloud platform for:

 Rapidly building, training, deploying ML models


 Supporting Python and open-source libraries
 Serving data scientists across the full ML lifecycle

⚙️Three Core Principles


Principle | Description
🚀 Accelerated | Instant access to compute (CPU/GPU) + pre-installed libraries + AutoML + ADS SDK
👥 Collaborative | Shared projects, reproducibility, auditability of models
🛡 Enterprise-grade | Integrated with OCI IAM, logging, patching, and infrastructure security

🧪 Key Components of OCI Data Science


Component | Description
Projects | Collaborative containers for organizing notebooks, models, and artifacts
Notebook Sessions | JupyterLab with CPU/GPU options + auto-managed infrastructure
Conda Environment | Environment manager for Python packages
ADS SDK | Oracle’s Accelerated Data Science Python SDK for data handling & AutoML
Models | Mathematical representations trained from data
Model Catalog | Central repository for storing, tracking, and sharing models
Model Deployments | Turn models into REST APIs (HTTP endpoints) for real-time predictions
Jobs | Define and schedule repeatable ML tasks on OCI infrastructure

🔍 ADS SDK Key Capabilities


 Connect to data
 Explore & visualize
 AutoML model training
 Model evaluation and explanation
 Easy integration with:
o OCI Object Storage
o OCI Model Catalog
o OCI Jobs
🛠 How It Works
1. Create Project → shared workspace
2. Launch Notebook Session → choose compute, use JupyterLab
3. Build Model → use Python + ADS SDK
4. Store Model → in Model Catalog
5. Deploy Model → as HTTP API for predictions
6. Schedule Tasks → with Jobs

🚀 Why GPUs Matter for AI & ML


 AI/ML workloads involve heavy computation—especially during training and inference.
 GPUs are specialized hardware that:
o Perform parallel computations using thousands of cores.
o Accelerate frameworks like TensorFlow, PyTorch, and ONNX Runtime.
o Greatly outperform CPUs for deep learning, batch inference, and large model
processing.

🧠 GPU Architecture & Performance


GPU Model | Architecture | Key Features | Release
A100 | Ampere | Tensor Cores for matrix ops | 2020
H100 | Hopper | Transformer Engine for LLMs | 2022
H200 | Hopper+ | H100 + higher memory | 2024
Blackwell | Blackwell | Optimized for large-scale LLMs | 2025
GB200 (Grace Blackwell) | Combo of 4 Blackwell GPUs + 2 Grace CPUs | High-density LLM/HPC compute | 2025

🖥️NVIDIA Grace CPU


 Built for HPC, AI cloud, and data centers
 Forms the core of GB200 superchip, integrated with Blackwell GPUs.

🔧 OCI GPU Compute Services


Oracle Cloud offers a growing GPU portfolio:

GPU Option Status


H100, L40S Available now
H200, B200, GB200 Taking orders; GA in 2025
Superclusters Based on H200, B200, GB200

 Supercluster scaling:
o H200: 🧱 Scales 10x vs H100
o B200 / GB200: ⛰ APEX performance vs H100

🧪 LLM Training & Deployment on OCI


Using OCI Data Science AI Quick Actions:

1. ✅ Deploy pre-trained popular LLMs to GPU VMs/bare metal.


2. 🔁 Fine-tune LLMs using your data.
3. 🚀 Deploy fine-tuned models for real-time inference.
4. 🔌 Supports models via:
o vLLM
o Next-gen inference
o Text embedding inference containers

📌 Summary
 GPUs are essential for scalable, high-performance AI workloads.
 NVIDIA’s latest hardware (H100, H200, GB200) powers next-gen LLMs and inference.
 OCI offers seamless support for training, fine-tuning, and deploying models with GPU-
backed compute.
 AI Quick Actions simplify the process for developers and data scientists.

🚀 OCI RDMA Supercluster: First Principles Overview


OCI's approach focuses on maximizing performance at minimum cost using RDMA (Remote
Direct Memory Access)—a key enabler of high-throughput, low-latency infrastructure.

🔧 What is RDMA?
 A technology that bypasses CPUs during data transfer between nodes.
 Enables low latency and high bandwidth communication—ideal for:
o Databases (Exadata, Autonomous DB)
o HPC workloads
o Large-scale GPU clusters (e.g. LLM training)

🧠 Supercluster Architecture Overview


🧩 Node and GPU Setup

 Each node: 8x NVIDIA A100 GPUs connected via NVLink


 Each node connects to network fabric at 1.6 Tbps
 Each GPU gets 200 Gbps dedicated bandwidth

🌐 Three-Tier CLOS Network (RDMA Fabric)

 Built for tens of thousands of GPUs (scalable to 100k+ GPUs)


 Organized into blocks (e.g., Block 1 to Block N)
 Each block: three-tier CLOS topology (non-blocking)
 Total RDMA fabric latency:
o Within block: ~6.5 µs
o Cross-block: ~20 µs

🧠 Key Optimizations & Innovations


1. Lossless RDMA Networking

 Packet loss avoided using:


o Buffer tuning
o Intelligent congestion control
 Guarantees reliable high-speed GPU communication

2. Intelligent Workload Placement

 Control plane schedules workloads in optimal locations:


o Small workloads or latency-sensitive apps → Single block → lowest latency
(~6.5 µs)
o Large-scale jobs → Distributed across blocks with optimized traffic paths

3. Network Locality Hints

 GPUs get topology awareness via OCI API


 Workloads are orchestrated to:
o Keep ~85% of traffic local (within the same block)
o Place frequently communicating GPUs in close proximity
 Results in:
o Lower average latency
o Reduced flow collisions
o Higher throughput
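
The figures quoted in this section can be combined into a quick back-of-the-envelope check. The sketch assumes that average latency is simply the traffic-weighted mean of the within-block and cross-block numbers, which ignores congestion and queueing effects.

```python
def expected_latency_us(local_fraction, local_us=6.5, cross_us=20.0):
    """Traffic-weighted average fabric latency in microseconds."""
    return local_fraction * local_us + (1 - local_fraction) * cross_us

# 8 GPUs per node, 200 Gbps each, gives the 1.6 Tbps node bandwidth.
per_node_gbps = 8 * 200

# With ~85% of traffic kept inside a block, the average is about 8.5 µs,
# well below the ~20 µs cross-block figure.
average_us = expected_latency_us(0.85)
```

Raising the local fraction moves the average toward 6.5 µs, which is why the control plane tries to place frequently communicating GPUs in the same block.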

📈 Performance Gains
Scenario | Latency | Throughput
Within block (intra-node or intra-block) | ~6.5 µs | High
Cross-block (full fabric) | ~20 µs | Still higher than traditional cloud networking
Locality-aware workloads | Mixed (avg. lower than 20 µs) | Maximized due to flow isolation

🛠️Use Cases
 LLM training at Supercluster scale
 Massive-scale AI/ML model training
 HPC and database clusters needing predictable low-latency RDMA
 Oracle Exadata and Autonomous DB leveraging RDMA fabric

🧩 Summary
OCI’s RDMA Supercluster achieves:

1. Scalability to 100,000+ GPUs


2. Lossless RDMA Fabric with tuned buffers and silicon
3. Latency-aware control plane scheduling
4. Locality-aware workload orchestration
5. Enterprise-ready fabric for AI, ML, DB, and HPC

🧭 What Is Responsible AI?


Responsible AI ensures that artificial intelligence is:

 Trustworthy
 Ethical
 Safe
 Lawful

📌 Why Responsible AI Is Necessary


AI is now used in:

 Healthcare (diagnosis)
 Transportation (self-driving cars)
 Decision-making (credit scoring, hiring)

🔒 Question: Can we trust AI decisions?


👉 Only if they are developed with strong ethical and legal foundations.

⚖️Guiding Principles of Trustworthy AI


1. Lawfulness
o Must comply with all national and international laws.
o Includes sector-specific laws (e.g., medical device regulations).
2. Ethical
o Must uphold human dignity, freedom, and democracy.
o Protect privacy, support freedom of choice, and avoid manipulation.
3. Robustness
o Should be technically secure and socially aligned.
o Must avoid unintended consequences or harm.

👥 Human Ethics and Rights in AI


 Human Dignity: Respect physical and mental integrity.
 Freedom: Protect expression and privacy.
 Democracy: Do not interfere with democratic systems.
 Equality: Avoid bias or discrimination.

🤖 Responsible AI in Practice
Ethical AI Must:

 Be human-centric
 Allow for human oversight
 Avoid physical or social harm
 Provide transparency and explainability

🔄 Responsible AI Implementation Cycle


1. Governance: Establish rules, oversight, and ethical leadership.
2. Policies & Procedures: Create frameworks for responsible use.
3. Monitoring & Evaluation: Ensure systems meet standards over time.

🧩 Roles Involved:

 Developers – Build AI systems


 Deployers – Implement and monitor usage
 End Users – Interact with AI and are affected by its decisions

🏥 Example: AI in Healthcare
Key Challenges:

 Bias in Training Data: If trained on limited demographics, AI may perform poorly on
others.
 Lack of Explainability: Complex algorithms may be hard to interpret for clinicians or
patients.
 Trust & Accountability: Ensuring reliability and safety in medical decisions.

✅ Solution: Regular testing, fairness evaluation, and transparency to build trust.


📌 Summary Chart: Mapping Ethics to Responsible AI
Ethical Principle Responsible AI Requirement
Human-centric design Enable meaningful human choice and oversight
Safety and security Ensure technical robustness and non-malicious use
Fairness and equality Prevent bias and discrimination
Transparency and accountability Make decisions explainable and traceable

✅ OCI Generative AI Service: Overview

 Fully managed, serverless service to build generative AI applications.


 Provides single API access to multiple foundational models (e.g., Meta, Cohere).
 No infrastructure management required.

🧠 Key Characteristics

1. Pre-trained Foundational Models


o Chat Models:
 Command R+ (128k token context): Advanced, powerful, higher cost.
 Command R (16k): Entry-level, more affordable.
 LLaMA 3–70B Instruct: Meta's instruction-following model.
o Embedding Models:
 Embed English
 Embed Multilingual (100+ languages; supports cross-language semantic
search)
2. Flexible Fine-tuning
o Customize a pre-trained model on your domain-specific data.
o Improves:
 Task-specific performance
 Efficiency (especially with T-Few fine-tuning)
o T-Few Fine-Tuning:
 Inserts new layers.
 Updates only a subset of weights.
 Reduces training cost and time.
3. Dedicated AI Clusters
o GPU-based compute resources with exclusive RDMA networking.
o Designed for:
 Fine-tuning
 Inference
o GPUs are isolated per customer (security-focused).
🧰 Use Cases

 Chat and dialogue generation


 Text summarization, Q&A
 Semantic search (via embeddings)
 Multilingual retrieval (e.g., search French docs using Chinese queries)

✅ OCI Generative AI Service: Demo Summary


🔹 Service Access & Setup

 Accessible from: Analytics & AI → AI Services → Generative AI


 Currently available in select regions (e.g., Germany Central - Frankfurt)

🧪 Playground Interface

 No code required: Used for exploration and prompt tuning.


 Easily switch between:
o Chat models: Command-R, Command-R+, Meta LLaMA 3 (70B)
o Embedding models: English & Multilingual

💬 Chat Model Features

 Conversational memory: Follows contextual continuity


 Example:
o Prompt: “Teach me how to fish”
o Follow-up: “Describe step 3” → maintains context
 Preamble override: Change model’s persona/style (e.g., travel advisor with pirate tone)
 Temperature: Controls output randomness (higher = more creative)
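
The effect of temperature can be illustrated with a softmax over candidate-token scores. This is a generic sketch of how temperature is commonly applied, with made-up scores; it is not a description of the service's actual sampler.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert scores to probabilities; higher temperature flattens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
cool = softmax_with_temperature(logits, 0.5)  # sharper: top token dominates
warm = softmax_with_temperature(logits, 2.0)  # flatter: more varied sampling
```

At a low temperature the top token takes most of the probability mass, so output is more predictable; at a high temperature the alternatives become more likely to be sampled.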

🔧 View & Export Code

 Languages supported: Python, Java


 Automatically generates working code → export and run in IDE/Jupyter easily

📊 Embedding Models

 Converts text to vector (embedding) for semantic search


 Supports:
o English-only
o Multilingual (100+ languages, cross-language search)

Demo Example:

 41 HR articles converted into vectors


 Plotted in 2D to show clustering of semantically similar topics
 Used for:
o Semantic search
o Clustering
o Similarity matching

🧠 Model Customization: Fine-Tuning

 Use case: When pre-trained models don’t meet task-specific needs


 Choose:
o Base model (e.g., Command-R)
o T-Few fine-tuning method (adds custom layers, updates fraction of weights)
 Requires: Dedicated AI cluster

🚀 Dedicated AI Clusters

 GPU-powered backend for:


o Inference (hosting)
o Fine-tuning
 RDMA-enabled for low-latency, high-throughput
 Fully isolated per tenant

🌐 Endpoints for Inference

 Once a model is fine-tuned, you must deploy it to an endpoint:


o Configure cluster, model, and other specs
o Used to serve real-time traffic

🧰 Use Case Scenarios


 Chatbots (context-aware, persona-driven)
 Enterprise Q&A systems
 Semantic document search
 Multilingual content retrieval
 Custom fine-tuned domain-specific LLMs

🧠 If You're Planning to Build...


Goal Recommendation

Quick testing / prototyping Use Playground

Text generation (with history) Use Chat models

Document similarity / search Use Embedding models

Domain-specific accuracy Use Fine-tuning

Production deployment Setup Endpoint on dedicated AI cluster

✅ Oracle Database 23ai – AI Vector Search Summary


🔹 What Is AI Vector Search?

A built-in feature in Oracle Database 23ai that enables semantic similarity search across
structured and unstructured data using vector embeddings.

🧠 Why It Matters

 Powers Gen AI pipelines directly inside the database.


 Combines SQL-native similarity search with enterprise data.
 Supports RAG (Retrieval-Augmented Generation) designs.
 Avoids the need for external vector databases.

🧩 Key Components
1. VECTOR Data Type

 Stores high-dimensional embeddings.


 Can be used alongside relational columns.
 Flexible format (with or without dimension spec).

2. VECTOR_EMBEDDING() Function

 Generates vector embeddings from inputs.


 Supports ONNX models or OCI GenAI APIs (e.g., ResNet50).
 Can use models deployed inside the DB.

3. VECTOR_DISTANCE() Function

 Computes distance between vectors (similarity metric).


 Supports metrics like COSINE, EUCLIDEAN, etc.
 Smaller distance = more similar.

4. Similarity Search via SQL

 Use standard SQL with vector columns + distance functions.


 Example: top 10 jobs matching a resume across cities.
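
To show what a distance function and a top-N similarity query compute, here is a plain-Python sketch of an exact search. The three-dimensional job and resume vectors are invented for illustration, and the helper names are hypothetical; in the database the same idea is expressed in SQL with VECTOR_DISTANCE().

```python
import math

def cosine_distance(a, b):
    """1 minus cosine similarity: 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_n(query, rows, n, distance=cosine_distance):
    """Return the n rows whose vectors are closest to the query (exact search)."""
    return sorted(rows, key=lambda row: distance(query, row[1]))[:n]

# Hypothetical embeddings; real ones have hundreds or thousands of dimensions.
JOBS = [("data engineer", [0.9, 0.1, 0.0]),
        ("chef", [0.0, 0.2, 0.9]),
        ("ml engineer", [0.8, 0.3, 0.1])]
resume = [1.0, 0.2, 0.0]
best = [title for title, _ in top_n(resume, JOBS, 2)]
```

Sorting by ascending distance puts the most similar rows first, matching the rule that a smaller distance means more similar.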

5. Vector Indexes

 Improve performance and optionally approximate results.


 Types:
o INMEMORY NEIGHBOR GRAPH
o NEIGHBOR PARTITIONS
 Can define DISTANCE metric and TARGET ACCURACY.
 Supports APPROXIMATE FETCH TOP N queries.
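
The speed-versus-accuracy trade-off of a partition-based index can be pictured with a toy example. This is a generic sketch of partitioned (inverted-file style) approximate search under invented data and fixed centroids; it is not a description of how Oracle implements its vector indexes.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_partitions(vectors, centroids):
    """Assign every vector to its nearest centroid."""
    partitions = {i: [] for i in range(len(centroids))}
    for vec in vectors:
        nearest = min(range(len(centroids)), key=lambda i: euclidean(vec, centroids[i]))
        partitions[nearest].append(vec)
    return partitions

def approximate_search(query, centroids, partitions, n, probes=1):
    """Scan only the `probes` partitions whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: euclidean(query, centroids[i]))
    candidates = [v for i in order[:probes] for v in partitions[i]]
    return sorted(candidates, key=lambda v: euclidean(query, v))[:n]

CENTROIDS = [[0.0, 0.0], [10.0, 10.0]]  # hypothetical; normally found by clustering
VECTORS = [[0.0, 1.0], [1.0, 0.0], [4.0, 4.0], [9.0, 9.0], [10.0, 11.0]]
PARTITIONS = build_partitions(VECTORS, CENTROIDS)

fast = approximate_search([5.5, 5.5], CENTROIDS, PARTITIONS, 1, probes=1)
thorough = approximate_search([5.5, 5.5], CENTROIDS, PARTITIONS, 1, probes=2)
```

Probing one partition is cheaper but misses the true nearest neighbor in this example, while probing both finds it. A target-accuracy setting expresses the same kind of trade-off.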

6. Joins with Vector Search

 A major differentiator: supports joins across tables with embeddings.


 Allows enriched results from normalized enterprise schemas.
 Example: Join Authors, Books, and Pages with embedded page content.

🔧 GenAI Pipeline Integration


Stage Function
Data Ingestion Load data from DB, CSV, social media
Processing Chunk, summarize, embed
Storage Store vectors in VECTOR type columns
Retrieval Search via VECTOR_DISTANCE() or APPROXIMATE queries
Augmentation RAG: Retrieve + generate via LLM
Integration Connect with LangChain, LlamaIndex, OCI GenAI

🔒 Enterprise-Grade Features
 Converged DB: JSON, XML, Graph, Spatial + Vectors
 SQL Optimizer aware of vector indexes and joins
 Fully managed, high performance, and reliable
 Seamless integration with OCI GenAI and LLM orchestration

🧠 Example Use Cases


 Resume-to-job matching (vectorized resume vs. job descriptions)
 Document semantic search
 Legal contracts comparison
 Product recommendations
 AI chatbots using internal knowledge bases (RAG)

🧩 Summary of Benefits
Feature Benefit
Built-in VECTOR type No need for external vector DBs
SQL-native search Easy integration for DB devs
Joins + Vectors Powerful enterprise use
Approximate search + accuracy control Balance between speed and precision
Embedding model loading (ONNX) Custom ML workflow support
LangChain / LlamaIndex AI app orchestration

🧠 Oracle Autonomous Database – Select AI Overview


🔹 What Is Select AI?

Select AI is a capability in Oracle Autonomous Database that enables users to:

 Query their enterprise data using natural language (NL)


 Automatically generate SQL from natural language via large language models (LLMs)
 Integrate easily with APEX apps, SQL Developer, or other tools
🧩 Key Benefits
Benefit | Description
🔍 Natural Language to SQL | Ask questions like “Top 10 streamed movies”; no SQL or schema knowledge needed
💡 AI-Driven SQL Generation | Automatically translates intent into optimized SQL
🔐 Enterprise-Grade Security | Data remains inside the Oracle tenancy, with no exposure to external LLMs
🔌 Pluggable LLMs | Choose from OCI GenAI, Cohere, OpenAI, or Llama
🔧 Developer Friendly | Easily integrates with tools like APEX and SQL Developer
📱 Mobile + Web Ready | Works seamlessly on mobile apps built with APEX

🛠️How Select AI Works


1. User Input (Natural Language)

 Example: “What are the top 10 streamed movies?”

2. LLM Prompt Engineering

 Oracle Autonomous DB sends prompt to LLM via AI profile.


 Includes relevant metadata (schemas, columns, etc.).

3. SQL Generation

 LLM returns SQL.


 SQL is executed securely on your Autonomous Database.

4. Output

 Results are returned in app, SQL Developer, or APEX report.


 You can inspect SQL using SELECT AI SHOWSQL.
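
Steps 2 and 3 can be pictured with a small sketch of how schema metadata and a question might be combined into one prompt. The schema, the prompt wording, and the function name `build_sql_prompt` are all hypothetical; the source does not show the prompt Select AI actually sends.

```python
# Hypothetical schema metadata; in Select AI this would come from the
# objects listed in the AI profile.
SCHEMA = {
    "MOVIES": ["MOVIE_ID", "TITLE"],
    "STREAMS": ["MOVIE_ID", "STREAMED_AT"],
}

def build_sql_prompt(question, schema):
    """Combine table metadata with the user's question into one LLM prompt."""
    lines = ["Given these tables:"]
    for table, columns in schema.items():
        lines.append(f"- {table}({', '.join(columns)})")
    lines.append(f"Write one SQL query that answers: {question}")
    return "\n".join(lines)

prompt = build_sql_prompt("What are the top 10 streamed movies?", SCHEMA)
```

Only table and column names appear in this prompt, not row data, which mirrors the metadata-only interaction described above.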

🧩 Technical Features
🔷 SELECT AI Syntax
SELECT AI 'List the total number of movies streamed last week.';

🔷 SELECT AI SHOWSQL

 Reveals the generated SQL behind the NL query.

🔷 AI Profiles

 Configurable objects linking:


o LLM provider (OCI GenAI, OpenAI, Cohere, etc.)
o Supported schemas, tables, views
o Token, auth, and rules
 Managed via:

DBMS_CLOUD_AI

🧠 Use Cases
Area Examples
🎥 Media “Which actor appears most in top 10 streamed movies?”
📈 Business Intelligence “List top 5 customer segments by revenue.”
📊 Analytics “Show number of orders by region for last quarter.”
🔍 Exploratory Queries No need to know table/column names

🧰 App Integration: APEX + Select AI


 APEX apps can:
o Accept natural language input
o Use Select AI to fetch data
o Render as reports, charts, or dashboards
 All with no need for users to understand SQL

🔐 Security Considerations
 No data is sent outside the Oracle tenancy
 Queries are executed locally in the Autonomous DB
 LLM interaction is controlled via AI Profiles
🔁 Extensibility
 Supports multi-model AI backends
o Cohere, Llama, OCI GenAI, OpenAI
 Can evolve with future fine-tuned LLMs
 Use AI Profiles to switch models without changing application logic

✅ Summary Chart
Feature Select AI
Input Natural Language
Output SQL or Results
Model Pluggable LLM (OCI, Cohere, OpenAI, etc.)
Tools APEX, SQL Developer, Custom Apps
Security Data stays within Oracle tenancy
Config DBMS_CLOUD_AI + AI Profiles

Summary of the five main capabilities of OCI Language:

1. Language Detection
o Identifies the language of the text.
o Supports 75 languages (e.g., Afrikaans to Welsh).
2. Entity Recognition
o Detects 14 types of named entities (e.g., names, places, dates, emails, currencies,
organizations, phone numbers).
3. Sentiment Analysis
o Determines sentiment (positive, negative, neutral).
o Analyzes sentiment per sentence and per aspect (e.g., “food” = positive,
“service” = negative).
4. Key Phrase Extraction
o Identifies important phrases or ideas in the text.
5. Text Classification
o Categorizes text into 600+ topics and subtopics.

🔍 Accessing OCI Language Service

1. Log in to OCI Console


2. Navigate to:
☰ Menu → Analytics & AI → AI Services → Language
🧭 Language Service Console Overview

 Documentation Links: Quickly access guides, blogs, API references, and SDK docs.
 Service Panel: Includes options like Text Analytics for trying out the language model.

🛠️Trying Out Text Analytics

1. Click Text Analytics


2. Enter or use the default sample text
3. Click Analyze

🧠 Output of Text Analysis

Here's what OCI Language returns:

Feature | Description
Language Detection | Detects text language (e.g., English), shows confidence level.
Text Classification | Predicts category (e.g., "Science & Technology") and subcategory (e.g., "Earth sciences") with probability.
Entity Recognition | Highlights named entities (like dates, products, quantities, locations), shows type and confidence, color-coded.
Key Phrase Extraction | Extracts important phrases representing core ideas in the text.
Sentiment Analysis | Provides sentiment at three levels: document-level (overall sentiment), aspect-based (sentiment by topic/aspect), and sentence-level (individual sentence sentiment).

✅ Key Takeaway:

OCI Language provides deep natural language understanding with no coding needed. The
console UI makes it easy to test and visualize how text is analyzed—ideal for both technical and
business users exploring AI.

🎧 What is OCI Speech?


OCI Speech is a fully managed AI service that automatically converts speech in audio/video
files to text using advanced deep learning models—with no data science expertise required.

🧩 Key Features of OCI Speech

Feature | Description
Multilingual Support | Supports English, Spanish, and Portuguese, with more languages to come.
Audio-to-Text Transcription | Converts audio/video files stored in OCI Object Storage into timestamped, grammatically correct text.
Batch Processing | Submit multiple files in a single request for large-scale processing.
High Performance | Transcribes hours of audio in under 10 minutes by parallel processing via chunking.
Confidence Scores | Provides a confidence score per word and per transcription for accuracy insight.
Punctuation & Readability | Automatically adds punctuation to improve readability and make text ready for downstream NLP systems.
SRT File Support | Outputs SRT closed caption files, enabling subtitle integration in videos.
Normalization | Converts literal transcriptions into human-readable formats: "ten o'clock" → 10:00; "twenty-one main street" → 21 Main St.
Profanity Filtering | Options include: Remove (asterisks), Mask (e.g., s***), Tag (word retained with metadata tag)
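
The three profanity-filtering options can be compared with a short sketch. The word list, the mode names, and the inline tag markup are assumptions made for illustration; the actual service applies its own lists and reports tags in its output metadata.

```python
import re

BLOCKLIST = {"darn"}  # hypothetical word list for illustration
MODES = {"remove", "mask", "tag"}

def filter_profanity(text, mode):
    """Apply one of the three filtering modes to every blocklisted word."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")

    def handle(match):
        word = match.group(0)
        if word.lower() not in BLOCKLIST:
            return word
        if mode == "remove":
            return "*" * len(word)                  # whole word replaced
        if mode == "mask":
            return word[0] + "*" * (len(word) - 1)  # first letter kept
        return f"<profanity>{word}</profanity>"     # word kept, but marked

    return re.sub(r"[A-Za-z']+", handle, text)

sample = "That darn cable"
removed = filter_profanity(sample, "remove")
masked = filter_profanity(sample, "mask")
tagged = filter_profanity(sample, "tag")
```

Remove hides the word entirely, mask leaves a hint of it, and tag keeps the text intact so that a downstream system can decide what to do.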

✅ Ideal Use Cases

 Transcribing business meetings or customer support calls


 Generating captions/subtitles for multimedia content
 Enabling downstream NLP on spoken data (e.g., for chatbots, summarization)

🛠️Steps to Use OCI Speech in the Console

1. Access the Speech Service

 Go to the OCI Console


 Navigate via ☰ Menu → Analytics & AI → AI Services → Speech
2. Pre-requisite

 ✅ Ensure your audio file (e.g., .wav) is uploaded to an Object Storage bucket in the
correct compartment

3. Create a Transcription Job

 Click Create Transcription Job


 Enter a job name (e.g., Training)
 Select the appropriate:
o Compartment
o Object Storage bucket
o Audio file (e.g., audio.wav)

4. Run the Job

 Click Run
 Transcription typically completes in seconds to a few minutes depending on file size

5. View Results

 Select the completed job


 View the transcribed text, which includes:
o 📝 Punctuation added automatically
o 🔤 Speaker segmentation
o 🔄 Normalization (e.g., “one hundred percent” → 100%)

✅ Example Transcription Output

“I'm having an issue with my wireless headphones. I think my Bluetooth connectivity to the TV is
not working. I have high quality headphones…”
Support Agent: “Hi, thanks for contacting support. I'm sorry to hear about your Bluetooth
issue.”

🎯 Benefits Shown in This Demo

 Accurate multi-speaker transcription


 Human-readable output with proper punctuation and formatting
 Simple no-code UI workflow within OCI Console
