# Image Processing Quick Reference Guide
Quick lookup for common image processing tasks, model IDs, and code patterns.
## Model IDs
### Imagen Models
```python
# Image Generation
MODEL_IMAGEN_3 = "imagen-3.0-generate-002"
MODEL_IMAGEN_3_FAST = "imagen-3.0-fast-generate-001"
MODEL_IMAGEN_4 = "imagen-4.0-generate-001"
MODEL_IMAGEN_4_FAST = "imagen-4.0-fast-generate-001"
MODEL_IMAGEN_4_ULTRA = "imagen-4.0-ultra-generate-001"
# Image Editing
MODEL_IMAGEN_EDITING = "imagen-3.0-capability-001"
# Specialized
MODEL_IMAGEN_PRODUCT_RECONTEXT = "imagen-product-recontext-preview-06-30"
MODEL_VTO = "virtual-try-on-preview-08-04"
```
### Gemini Models
```python
# Multimodal Image Generation
MODEL_GEMINI_IMAGE = "gemini-2.5-flash-image-preview"
# General Purpose (with vision)
MODEL_GEMINI_FLASH = "gemini-2.5-flash"
```
## Edit Modes
```python
# Inpainting - Add Content
edit_mode = "EDIT_MODE_INPAINT_INSERTION"
# Inpainting - Remove Content
edit_mode = "EDIT_MODE_INPAINT_REMOVAL"
# Background Replacement
edit_mode = "EDIT_MODE_BGSWAP"
# Extend Image Boundaries
edit_mode = "EDIT_MODE_OUTPAINT"
# General Editing (no mask)
edit_mode = "EDIT_MODE_DEFAULT"
```
## Mask Modes
```python
# Automatic Foreground Detection
mask_mode = "MASK_MODE_FOREGROUND"
# Automatic Background Detection
mask_mode = "MASK_MODE_BACKGROUND"
# Semantic Segmentation (specify class IDs)
mask_mode = "MASK_MODE_SEMANTIC"
segmentation_classes = [8] # dog
# Descriptive (text-based)
mask_mode = "MASK_MODE_PROMPT"
# Custom Mask Image
mask_mode = "MASK_MODE_USER_PROVIDED"
```
## Aspect Ratios
```python
aspect_ratio = "1:1" # Square
aspect_ratio = "16:9" # Widescreen
aspect_ratio = "9:16" # Portrait/Mobile
aspect_ratio = "4:3" # Standard
aspect_ratio = "3:4" # Portrait Standard
```
## Common Semantic Classes
| Class ID | Object | Class ID | Object | Class ID | Object |
|----------|--------|----------|--------|----------|--------|
| 6 | bird | 37 | laptop | 125 | person |
| 7 | cat | 42 | television | 175 | bicycle |
| 8 | dog | 66 | bed | 176 | car |
| 9 | horse | 67 | table | 178 | motorcycle |
| 28 | toilet | 85 | mirror | 179 | airplane |
[Full list of 194 classes in main documentation]
## Code Snippets
### Generate Image (Imagen)
```python
from models.image_models import generate_images
from config.default import Default
cfg = Default()
response = generate_images(
    model=cfg.MODEL_IMAGEN_3_FAST,
    prompt="a beautiful sunset over mountains",
    number_of_images=1,
    aspect_ratio="16:9",
    negative_prompt="people, text, watermark"
)
# Access generated images
for img in response.generated_images:
    gcs_uri = img.image.gcs_uri
    image_bytes = img.image.image_bytes
```
### Edit Image (Inpaint - Add Object)
```python
from models.image_models import edit_image
from config.default import Default
cfg = Default()
edited_uris = edit_image(
    model=cfg.MODEL_IMAGEN_EDITING,
    prompt="a red sports car",
    edit_mode="EDIT_MODE_INPAINT_INSERTION",
    mask_mode="MASK_MODE_FOREGROUND",
    reference_image_bytes=original_image_bytes,
    number_of_images=1
)
```
### Remove Object (Semantic Mask)
```python
edited_uris = edit_image(
model="imagen-3.0-capability-001",
prompt="", # Empty for removal
edit_mode="EDIT_MODE_INPAINT_REMOVAL",
mask_mode="MASK_MODE_SEMANTIC",
reference_image_bytes=image_bytes,
number_of_images=1
)
# Need to pass segmentation_classes separately via MaskReferenceImage
# See full implementation in models/image_models.py
```
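Under the hood, the segmentation classes travel with the mask reference rather than with the main call. As a minimal sketch of what that looks like when calling the `google-genai` SDK directly (the client setup values are placeholders; the `edit_image` wrapper in `models/image_models.py` may structure this differently):
```python
from google import genai
from google.genai import types

# Placeholder client setup; the repo normally configures this centrally
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

raw_ref = types.RawReferenceImage(
    reference_id=0,
    reference_image=types.Image(image_bytes=image_bytes, mime_type="image/png"),
)
mask_ref = types.MaskReferenceImage(
    reference_id=1,
    config=types.MaskReferenceConfig(
        mask_mode="MASK_MODE_SEMANTIC",
        segmentation_classes=[8],  # dog (see semantic class table above)
        mask_dilation=0.01,
    ),
)
response = client.models.edit_image(
    model="imagen-3.0-capability-001",
    prompt="",  # empty for removal
    reference_images=[raw_ref, mask_ref],
    config=types.EditImageConfig(
        edit_mode="EDIT_MODE_INPAINT_REMOVAL",
        number_of_images=1,
    ),
)
edited_bytes = response.generated_images[0].image.image_bytes
```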
### Replace Background
```python
edited_uris = edit_image(
model="imagen-3.0-capability-001",
prompt="a modern minimalist studio with white walls and soft lighting",
edit_mode="EDIT_MODE_BGSWAP",
mask_mode="MASK_MODE_BACKGROUND",
reference_image_bytes=product_image_bytes,
number_of_images=1
)
```
### Outpaint Image
```python
# Requires padding the image first with PIL
from PIL import Image
import io
# Load and pad image
original = Image.open(io.BytesIO(image_bytes))
# ... padding logic (see character_consistency.py)
edited_uris = edit_image(
model="imagen-3.0-capability-001",
prompt="continue the scene naturally",
edit_mode="EDIT_MODE_OUTPAINT",
mask_mode="MASK_MODE_USER_PROVIDED",
reference_image_bytes=padded_image_bytes,
number_of_images=1
)
```
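The padding step elided above usually amounts to centering the original image on a larger canvas and building a mask where white marks the new area to generate. A rough sketch with PIL (the canvas size, placement, and exact mask convention used by `character_consistency.py` may differ):
```python
from PIL import Image
import io

original = Image.open(io.BytesIO(image_bytes))
target_w, target_h = 2048, 1152  # illustrative outpainted size

# Center the original on a larger neutral canvas
canvas = Image.new("RGB", (target_w, target_h), (128, 128, 128))
offset_x = (target_w - original.width) // 2
offset_y = (target_h - original.height) // 2
canvas.paste(original, (offset_x, offset_y))

# Mask: white (255) = generate new content, black (0) = keep original pixels
mask = Image.new("L", (target_w, target_h), 255)
mask.paste(0, (offset_x, offset_y, offset_x + original.width, offset_y + original.height))

# Serialize both for the edit_image call
buf = io.BytesIO()
canvas.save(buf, format="PNG")
padded_image_bytes = buf.getvalue()

buf = io.BytesIO()
mask.save(buf, format="PNG")
mask_image_bytes = buf.getvalue()
```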
### Generate with Gemini
```python
from models.gemini import generate_image_from_prompt_and_images
gcs_uris, execution_time = generate_image_from_prompt_and_images(
prompt="a futuristic cityscape at night with neon lights",
images=[], # Optional reference images
gcs_folder="gemini_generations",
file_prefix="city"
)
```
### Generate with Reference Images
```python
gcs_uris, execution_time = generate_image_from_prompt_and_images(
prompt="Create a similar scene but in winter with snow",
images=["gs://bucket/reference-image.png"],
gcs_folder="gemini_generations",
file_prefix="winter_scene"
)
```
### Virtual Try-On
```python
from google.cloud import aiplatform
from google.cloud.aiplatform.gapic import PredictionServiceClient
client = PredictionServiceClient(
    client_options={"api_endpoint": f"{location}-aiplatform.googleapis.com"}
)
model_endpoint = (
    f"projects/{project_id}/locations/{location}"
    "/publishers/google/models/virtual-try-on-preview-08-04"
)
instances = [{
    "personImage": {"image": {"bytesBase64Encoded": person_b64}},
    "productImages": [{"image": {"bytesBase64Encoded": outfit_b64}}],
}]
response = client.predict(
    endpoint=model_endpoint,
    instances=instances,
    parameters={}
)
```
## PIL/Pillow Common Operations
### Load Image
```python
from PIL import Image
import io
# From bytes
pil_image = Image.open(io.BytesIO(image_bytes))
# From file
pil_image = Image.open("path/to/image.jpg")
# From URL (with requests)
import requests
response = requests.get(image_url)
pil_image = Image.open(io.BytesIO(response.content))
```
### Get Image Info
```python
width, height = pil_image.size
mode = pil_image.mode # 'RGB', 'RGBA', 'L', etc.
image_format = pil_image.format # 'JPEG', 'PNG', etc. (None for images created in memory)
```
### Resize Image
```python
# Resize to exact dimensions
new_image = pil_image.resize((800, 600))
# Resize maintaining aspect ratio (thumbnail)
pil_image.thumbnail((800, 600)) # Modifies in-place
```
### Create New Image
```python
# RGB image with white background
new_image = Image.new("RGB", (800, 600), color=(255, 255, 255))
# Grayscale image (for masks)
mask = Image.new("L", (800, 600), 0) # Black mask
```
### Crop Image
```python
# Define crop box (left, top, right, bottom)
box = (100, 100, 400, 400)
cropped = pil_image.crop(box)
```
### Paste Image
```python
# Paste small_image onto canvas at position (x, y)
canvas.paste(small_image, (100, 100))
# With mask for transparency
canvas.paste(small_image, (100, 100), mask=mask)
```
### Convert to Bytes
```python
# PNG
byte_io = io.BytesIO()
pil_image.save(byte_io, format="PNG")
image_bytes = byte_io.getvalue()
# JPEG with quality
byte_io = io.BytesIO()
pil_image.save(byte_io, format="JPEG", quality=90)
image_bytes = byte_io.getvalue()
```
### Convert Color Mode
```python
# Convert to RGB
rgb_image = pil_image.convert("RGB")
# Convert to grayscale
gray_image = pil_image.convert("L")
# Add alpha channel
rgba_image = pil_image.convert("RGBA")
```
## OpenCV Operations
### Read Video
```python
import cv2
cap = cv2.VideoCapture("video.mp4")
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    # Process frame
cap.release()
```
### Extract Frame
```python
cap = cv2.VideoCapture("video.mp4")
cap.set(cv2.CAP_PROP_POS_FRAMES, frame_number)
ret, frame = cap.read()
cap.release()
```
### Convert Frame to PIL
```python
import cv2
from PIL import Image
# OpenCV uses BGR, PIL uses RGB
frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
pil_image = Image.fromarray(frame_rgb)
```
## Storage Operations
### Upload to GCS
```python
from common.storage import store_to_gcs
gcs_uri = store_to_gcs(
    folder="my_folder",
    file_name="image.png",
    mime_type="image/png",
    contents=image_bytes
)
# Returns: "gs://bucket/my_folder/image.png"
```
### Download from GCS
```python
from common.storage import download_from_gcs
image_bytes = download_from_gcs("gs://bucket/path/to/image.png")
```
### Convert GCS URI to HTTPS URL
```python
from common.utils import gcs_uri_to_https_url
https_url = gcs_uri_to_https_url("gs://bucket/path/to/image.png")
# Returns: "https://storage.googleapis.com/bucket/path/to/image.png"
```
### Convert HTTPS URL to GCS URI
```python
from common.utils import https_url_to_gcs_uri
gcs_uri = https_url_to_gcs_uri("https://storage.googleapis.com/bucket/path/to/image.png")
# Returns: "gs://bucket/path/to/image.png"
```
## Safety and Content Filtering
### Safety Filter Levels
```python
# Imagen generation/editing
safety_filter_level = "BLOCK_LOW_AND_ABOVE" # Most restrictive
safety_filter_level = "BLOCK_MEDIUM_AND_ABOVE" # Balanced
safety_filter_level = "BLOCK_ONLY_HIGH" # Permissive
safety_filter_level = "BLOCK_NONE" # No filtering
```
### Person Generation Settings
```python
# Imagen generation/editing
person_generation = "DONT_ALLOW" # No people
person_generation = "ALLOW_ADULT" # Only adults
person_generation = "ALLOW_ALL" # All ages
```
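Both settings are typically passed together in the generation config. A hedged sketch using the `google-genai` SDK (the repo's `generate_images` wrapper may instead expose these as plain keyword arguments):
```python
from google import genai
from google.genai import types

client = genai.Client(vertexai=True, project="my-project", location="us-central1")
response = client.models.generate_images(
    model="imagen-3.0-generate-002",
    prompt="a busy farmers market on a sunny morning",
    config=types.GenerateImagesConfig(
        number_of_images=1,
        aspect_ratio="16:9",
        safety_filter_level="BLOCK_MEDIUM_AND_ABOVE",
        person_generation="ALLOW_ADULT",
    ),
)
```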
## Error Handling
### Retry Logic
```python
from tenacity import (
    retry,
    retry_if_exception_type,
    stop_after_attempt,
    wait_exponential,
)
@retry(
    wait=wait_exponential(multiplier=1, min=1, max=10),
    stop=stop_after_attempt(3),
    retry=retry_if_exception_type(Exception),
    reraise=True,
)
def generate_with_retry():
    return generate_images(...)
```
### Check for Generation Errors
```python
response = generate_images(...)
if response.generated_images:
    for img in response.generated_images:
        if hasattr(img, 'error') and img.error:
            print(f"Generation error: {img.error}")
        elif hasattr(img, 'image') and img.image:
            # Success
            gcs_uri = img.image.gcs_uri
else:
    print("No images generated")
```
## Best Practices
### Prompting
1. **Be specific**: Include details about style, lighting, composition
2. **Use negative prompts**: Exclude unwanted elements
3. **Set aspect ratio**: Match your use case (16:9 for web, 9:16 for mobile)
4. **Iterate**: Use reference images and refine prompts
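Putting the prompting points together, a request might look like this (values are purely illustrative):
```python
prompt = (
    "Product photo of a ceramic coffee mug on a rustic wooden table, "
    "soft window light from the left, shallow depth of field, 35mm lens, "
    "warm color palette"
)
negative_prompt = "text, watermark, logo, hands, blurry, low quality"
aspect_ratio = "16:9"  # landscape hero image for a web page
```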
### Image Processing
1. **Always validate dimensions**: Check image size before processing
2. **Handle aspect ratios**: Use thumbnail() to maintain proportions
3. **Use appropriate formats**: PNG for transparency, JPEG for photos
4. **Optimize quality**: Balance file size and visual quality
5. **Error handling**: Wrap PIL operations in try-except blocks
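For point 5, a minimal pattern for guarding PIL operations (the logging strategy is up to your application):
```python
import io
import logging
from PIL import Image, UnidentifiedImageError

def load_image_safely(image_bytes: bytes):
    """Return a PIL image, or None if the bytes cannot be decoded."""
    try:
        img = Image.open(io.BytesIO(image_bytes))
        img.load()  # force decoding now so errors surface here
        return img
    except (UnidentifiedImageError, OSError) as exc:
        logging.warning("Could not decode image: %s", exc)
        return None
```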
### API Usage
1. **Implement retry logic**: Network failures happen
2. **Use appropriate models**: Fast models for prototyping, standard for production
3. **Batch operations**: Process multiple images concurrently when possible (see the sketch after this list)
4. **Monitor costs**: Track API usage and optimize
5. **Cache results**: Store generated images in GCS
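For point 3, a simple fan-out pattern with a thread pool, reusing `generate_images` and `cfg` from the snippets above (`process_one` is a hypothetical helper; tune `max_workers` to your quota):
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def process_one(prompt: str):
    # One API call per prompt; combine with the retry decorator shown earlier
    return generate_images(
        model=cfg.MODEL_IMAGEN_3_FAST,
        prompt=prompt,
        number_of_images=1,
    )

prompts = ["a red bicycle", "a blue bicycle", "a green bicycle"]
results = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {pool.submit(process_one, p): p for p in prompts}
    for future in as_completed(futures):
        prompt = futures[future]
        try:
            results[prompt] = future.result()
        except Exception as exc:
            print(f"Generation failed for {prompt!r}: {exc}")
```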
### Security
1. **Validate inputs**: Check file types and sizes (see the sketch after this list)
2. **Use safety filters**: Appropriate for your use case
3. **Sanitize prompts**: Remove potentially harmful instructions
4. **Rate limiting**: Implement on client side
5. **Access control**: Use IAP and proper GCS permissions
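For point 1, a basic validation sketch (the size limit and format whitelist are illustrative, not API limits):
```python
import io
from PIL import Image, UnidentifiedImageError

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # illustrative application limit
ALLOWED_FORMATS = {"PNG", "JPEG", "WEBP"}

def validate_upload(image_bytes: bytes) -> bool:
    if len(image_bytes) > MAX_UPLOAD_BYTES:
        return False
    try:
        img = Image.open(io.BytesIO(image_bytes))
        img.verify()  # cheap integrity check without a full decode
    except (UnidentifiedImageError, OSError):
        return False
    return img.format in ALLOWED_FORMATS
```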
## Performance Tips
### Image Generation
- Use "fast" models for iteration (`imagen-3.0-fast-generate-001`)
- Generate multiple images in one call when possible
- Use lower resolution for drafts, higher for final
- Cache common generations
### Image Processing
- Resize images before processing when possible
- Use thumbnail() instead of resize() to maintain aspect ratio
- Process images in parallel with ThreadPoolExecutor
- Use appropriate JPEG quality (80-90 is usually sufficient)
### Storage
- Store generated images in GCS immediately
- Use appropriate bucket locations (same region as Vertex AI)
- Implement lifecycle policies for temporary images
- Use signed URLs for secure access
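For signed URLs, a sketch with the `google-cloud-storage` client (bucket and object names are placeholders; signing requires credentials that can sign, e.g. a service account key or IAM signBlob):
```python
import datetime
from google.cloud import storage

client = storage.Client()
blob = client.bucket("my-bucket").blob("my_folder/image.png")

# Time-limited read access without making the object public
signed_url = blob.generate_signed_url(
    version="v4",
    expiration=datetime.timedelta(minutes=15),
    method="GET",
)
```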
## Common Issues and Solutions
### Issue: "Image too large for API"
```python
from PIL import Image
max_dimension = 4096
width, height = pil_image.size
if width > max_dimension or height > max_dimension:
    pil_image.thumbnail((max_dimension, max_dimension))
```
### Issue: "Mask doesn't match image size"
```python
# Ensure mask has same dimensions as image
mask = Image.new("L", pil_image.size, 0)
```
### Issue: "RGBA to RGB conversion for JPEG"
```python
if pil_image.mode == "RGBA":
    # Create white background
    background = Image.new("RGB", pil_image.size, (255, 255, 255))
    background.paste(pil_image, mask=pil_image.split()[3]) # Use alpha as mask
    pil_image = background
```
### Issue: "Out of memory with large images"
```python
# Resize before processing
pil_image.thumbnail((2048, 2048))
# Or process in chunks
# (implementation depends on specific use case)
```
---
## Related Documentation
- [Comprehensive Report](./comprehensive-image-processing-report.md) - Full analysis
- [Use Case Examples](./use-case-examples.md) - Practical scenarios
- [Main README](./README.md) - Overview and index
## Repository Files
Key files for reference:
- `models/image_models.py` - Core implementations
- `config/default.py` - Model IDs and configuration
- `components/constants.py` - UI constants and options
- `common/storage.py` - GCS operations
- `common/utils.py` - Utility functions
---
**Last Updated:** 2025-01-22