Classifying Buildings

The document outlines a process for pixel-based image classification and object detection using machine learning in Earth Engine, specifically for building detection. It details steps including data preparation, model training, and classification, emphasizing the importance of high-resolution imagery and quality training data. Additionally, it discusses advanced techniques like deep learning for improved accuracy and feature engineering for distinguishing building types.

Uploaded by Mike Murefu

You're venturing into a more advanced and powerful application of Earth Engine: pixel-based image classification and object detection using machine learning. If the existing Google Building Footprints are insufficient (e.g., they don't cover your AOI, are outdated, or you need more specific building types), you'll need to train your own model. This process generally involves these steps:
1. Data Preparation:
o Imagery: Select suitable high-resolution satellite imagery (e.g., Sentinel-2, Landsat, or commercial imagery like Maxar if available to you). Cloud-free imagery is crucial.
o Labels (Ground Truth): This is the most critical part. You need examples of what constitutes a "building" and "non-building" in your imagery, or examples of different building types if you want to classify them (e.g., residential, commercial).
 Google Building Footprints (V3): Can be an excellent source of labels for "building" vs. "non-building" classification. You can convert these polygons into a raster mask to train on.
 Manual Digitization: For specific building types or areas not covered by Google data, you might need to manually digitize training polygons.
 Other Vector Data: Existing land use/land cover maps, cadastral data, or OpenStreetMap data (with caution regarding accuracy).
o Feature Engineering: Decide which image bands and derived products (e.g., spectral indices like NDVI, NDBI; textural features like GLCM; topographic features like slope/aspect from a DEM) will help your model distinguish between classes.
2. Sampling Training Data: Extract pixel values (features) from your imagery for your labeled building and non-building areas.
3. Model Training: Train a machine learning classifier using the sampled features and labels.
4. Classification/Prediction: Apply the trained model to your entire image (or AOI) to predict the class for every pixel.
5. Post-processing (for building footprints):
o Convert the classified raster (e.g., building pixels) into vector polygons.
o Refine these polygons (smoothing, removing small artifacts, dissolving adjacent pixels).
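The spectral indices mentioned under Feature Engineering are simple band ratios, which can be sketched without Earth Engine at all. A minimal numpy illustration of the NDVI and NDBI formulas (the band arrays are made-up toy reflectance values, not real data):

```python
import numpy as np

def normalized_difference(a, b):
    """Generic normalized-difference index: (a - b) / (a + b)."""
    a = a.astype(float)
    b = b.astype(float)
    return (a - b) / (a + b)

# Toy 2x2 "bands" standing in for Sentinel-2 reflectances (hypothetical values)
red = np.array([[0.10, 0.20], [0.30, 0.10]])   # B4
nir = np.array([[0.50, 0.40], [0.30, 0.20]])   # B8
swir = np.array([[0.25, 0.30], [0.45, 0.15]])  # B11

ndvi = normalized_difference(nir, red)   # vegetation scores high
ndbi = normalized_difference(swir, nir)  # built-up surfaces score high

print(ndvi[0, 0])  # (0.5 - 0.1) / (0.5 + 0.1)
print(ndbi[0, 0])  # (0.25 - 0.5) / (0.25 + 0.5)
```

In Earth Engine itself the equivalent is `image.normalizedDifference(['B8', 'B4'])`, appended to the band stack with `addBands()` before sampling.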

Simplified Example: Binary Building Detection (Building vs. Non-Building) using Random Forest
This example demonstrates a common supervised classification workflow for detecting generic buildings. It uses a Random Forest classifier, which is powerful and available directly within Earth Engine.
Key Idea: We will use the existing Google Building Footprints as positive training
samples (buildings) and generate negative training samples (non-buildings) from areas
outside the footprints.
Python
import ee
import geemap

# Initialize Earth Engine
try:
    ee.Initialize()
    print("Earth Engine initialized successfully.")
except ee.EEException as e:
    print(f"Error initializing Earth Engine: {e}")
    print("Please run 'earthengine authenticate' in your terminal if you haven't.")
    exit()

# --- 1. Define Area of Interest (AOI) and Image Collection ---

# Choose a smaller, representative AOI for faster execution during development
aoi = ee.Geometry.Rectangle([-122.45, 37.75, -122.40, 37.80])  # San Francisco area

# Select an appropriate high-resolution image (e.g., Sentinel-2).
# Filter for a recent, relatively cloud-free image in your AOI.
collection = ee.ImageCollection('COPERNICUS/S2_SR_HARMONIZED') \
    .filterBounds(aoi) \
    .filterDate('2023-01-01', '2023-06-30') \
    .filter(ee.Filter.lt('CLOUDY_PIXEL_PERCENTAGE', 10)) \
    .sort('CLOUDY_PIXEL_PERCENTAGE')

# .first() never returns None on the client side, so check the collection size instead
if collection.size().getInfo() == 0:
    print("No suitable image found for the AOI and date range. Adjust filters.")
    exit()

image = collection.first().clip(aoi)
print(f"Using image: {image.get('system:id').getInfo()}")

# Define bands to use for classification (spectral features).
# Common bands for land cover: B2 (Blue), B3 (Green), B4 (Red), B8 (NIR).
# B11 and B12 (SWIR) are also useful for built-up areas.
bands = ['B2', 'B3', 'B4', 'B8', 'B11', 'B12']
image = image.select(bands)

# --- 2. Prepare Training Labels from Google Building Footprints ---

# Load Google Building Footprints
buildings_fc = ee.FeatureCollection('GOOGLE/Research/open-buildings/v3/polygons') \
    .filterBounds(aoi) \
    .filter(ee.Filter.gt('confidence', 0.7))  # Filter by confidence if desired

# Create 'building' (positive) samples.
# Tag each footprint with a 'building' property (value 1) first, so that
# sampleRegions can copy that label onto every sampled pixel.
buildings_labeled = buildings_fc.map(lambda f: f.set('building', 1))
building_samples = image.sampleRegions(
    collection=buildings_labeled,
    properties=['building'],
    scale=10,      # Resolution of Sentinel-2
    tileScale=4    # Adjust for memory issues on large AOIs
)

# Create 'non_building' (negative) samples.
# Generate random points *outside* the building footprints. Buffer the
# buildings by 10 m first, to avoid sampling too close to building edges.
non_building_area = aoi.difference(buildings_fc.geometry().buffer(10))

# Sample random points in the non-building area.
# stratifiedSample gives a better distribution, but simple random points work too.
# Note: the Python API takes keyword arguments, not a JS-style dict.
non_building_samples = image.sample(
    region=non_building_area,
    scale=10,
    numPixels=building_samples.size().getInfo() * 2,  # More non-building than building points
    seed=0,           # For reproducibility
    geometries=True   # Keep geometry for visualization if needed
).map(lambda feature: feature.set('building', 0))  # Assign 'building' property (value 0)

# Combine positive and negative samples
training_data = building_samples.merge(non_building_samples)

print(f"Total training samples: {training_data.size().getInfo()}")

# --- 3. Train a Classifier (Random Forest) ---
classifier = ee.Classifier.smileRandomForest(numberOfTrees=100).train(
    features=training_data,
    classProperty='building',   # The property containing our class labels (0 or 1)
    inputProperties=bands       # The bands (features) used for classification
)

# --- 4. Classify the Image ---
classified_image = image.classify(classifier)

# --- 5. Post-processing (Optional, for smoother footprints) ---
# Small morphological operations to clean up the classification
classified_image = classified_image.focal_mode(1.5)    # Apply a mode filter
classified_image = classified_image.focal_median(1.5)  # Apply a median filter

# --- 6. Visualize the Results ---
m = geemap.Map(center=aoi.centroid().getInfo()['coordinates'][::-1], zoom=15)

# Add original image
vis_params_rgb = {'bands': ['B4', 'B3', 'B2'], 'min': 0, 'max': 2000, 'gamma': 1.8}
m.add_layer(image, vis_params_rgb, 'Sentinel-2 Image (RGB)')

# Add Google Building Footprints (for comparison)
m.add_layer(buildings_fc.style(color='white', fillColor='white'), {},
            'Google Building Footprints (Source Labels)')

# Add the classified buildings: non-building (0) as black, building (1) as red
classified_vis = {'min': 0, 'max': 1, 'palette': ['black', 'red']}
m.add_layer(classified_image, classified_vis, 'Detected Buildings (Random Forest)')

m.add_layer_control()
m
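The focal_mode post-processing step above is just a sliding-window majority vote. A minimal local sketch of that idea in plain Python (a 3x3 majority filter on a toy binary mask, not the exact Earth Engine kernel):

```python
from collections import Counter

def mode_filter_3x3(grid):
    """Majority (mode) filter over a 3x3 neighborhood of a 2D list of 0/1 values."""
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            votes = []
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        votes.append(grid[rr][cc])
            out[r][c] = Counter(votes).most_common(1)[0][0]
    return out

# A lone 'building' pixel surrounded by non-building is removed as noise
mask = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(mode_filter_3x3(mask))  # all zeros: the isolated pixel is cleaned away
```

This is why the filter removes salt-and-pepper misclassifications while leaving solid building blobs intact.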

Converting Classified Raster to Vector Footprints
If you want to derive new vector building footprints from the classified_image, you would use reduceToVectors():
Python
# (Continue from the previous script)

# --- 7. Convert Classified Raster to Vector Polygons ---
# Keep only pixels classified as 'building' (value = 1); selfMask() masks out
# the zeros so reduceToVectors traces building regions only
buildings_raster_mask = classified_image.eq(1).selfMask()

# Convert the raster mask to vector polygons
new_building_footprints = buildings_raster_mask.reduceToVectors(
    reducer=ee.Reducer.countEvery(),  # We only need the geometry; any reducer works
    geometry=aoi,
    scale=10,             # Must match image resolution
    maxPixels=1e9,        # Increase if you hit memory errors for large areas
    tileScale=4,
    eightConnected=False  # Use 4-connected for smoother shapes
).map(lambda f: f.simplify(1))  # Simplify polygons to reduce vertices

print(f"Number of newly detected building footprints: {new_building_footprints.size().getInfo()}")

# Optionally, add the new building footprints to the map
# (fillColor takes a hex color, optionally with an alpha byte for transparency)
m.add_layer(new_building_footprints.style(color='yellow', fillColor='ffff0060'), {},
            'New Detected Footprints (Vector)')
m
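eightConnected=False means regions are traced with 4-connectivity: pixels touching only at a corner stay in separate polygons. A minimal local sketch of 4-connected component labeling on a toy binary mask, which is conceptually how reduceToVectors groups pixels into footprints:

```python
from collections import deque

def count_components_4conn(mask):
    """Count 4-connected components of 1-pixels in a 2D list of 0/1 values."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and not seen[r][c]:
                count += 1
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:  # flood-fill this blob
                    cr, cc = queue.popleft()
                    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nr, nc = cr + dr, cc + dc
                        if 0 <= nr < rows and 0 <= nc < cols \
                                and mask[nr][nc] == 1 and not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
    return count

# The lone pixel touches the top blob only diagonally, so 4-connectivity
# keeps them as two separate regions (8-connectivity would merge them)
mask = [
    [1, 1, 0],
    [0, 0, 1],
]
print(count_components_4conn(mask))  # 2
```

Two adjacent buildings separated by a one-pixel gap therefore become two polygons rather than one merged blob.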

Important Considerations for Advanced Classification:
1. High-Resolution Imagery: Building detection often requires very high-resolution imagery (sub-meter), which might be commercial (e.g., Maxar, Planet). Sentinel-2 (10 m) is good for broad detection, but 0.5-2 m resolution is ideal for precise footprints.
2. Deep Learning: For state-of-the-art accuracy and more nuanced building classifications (e.g., distinguishing building types from imagery features alone), deep learning models (like U-Net for semantic segmentation) are preferred.
o Challenge in GEE: GEE's built-in ML classifiers are traditional (Random Forest, SVM, CART, etc.). Deep learning models often need to be trained outside GEE (e.g., using TensorFlow/Keras on Google Colab or Vertex AI), and their inference models can sometimes be connected back into GEE using ee.Model.fromVertexAi() or similar for prediction.
3. Feature Engineering for Semantic Types: If you want to classify types of buildings (residential vs. commercial), you'd need features that differentiate them. These could include:
o Area: Very large buildings are less likely to be single-family homes.
o Context: Proximity to roads, other buildings, green spaces, water bodies.
o Texture: Urban patterns often differ between residential and commercial areas.
o Building Height (if a DEM/DSM is available): Tall buildings are often commercial.
o Shadows: Analyzing shadow length can give height information.
o Time Series: Changes in activity over time could suggest commercial use (e.g., spectral patterns changing with business hours).
4. Training Data Quality: The quality and quantity of your training samples directly dictate the model's performance. Clean, diverse, and well-distributed samples are vital.
5. Computational Limits: Training large models or classifying vast areas can hit GEE's memory and computation limits. tileScale and efficient code are important.
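The Area feature from point 3 can be illustrated locally: footprint area via the shoelace formula on projected (meter) coordinates, with a made-up 1,000 m² threshold (purely illustrative, not a calibrated rule) separating small from large footprints:

```python
def polygon_area_m2(coords):
    """Planar polygon area via the shoelace formula.
    coords: list of (x, y) vertices in meters (e.g., projected coordinates)."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]  # wrap around to close the ring
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def size_class(coords, threshold_m2=1000):
    """Toy rule: very large footprints are less likely to be single-family homes."""
    if polygon_area_m2(coords) > threshold_m2:
        return 'large (possibly commercial)'
    return 'small (possibly residential)'

house = [(0, 0), (10, 0), (10, 12), (0, 12)]      # 10 m x 12 m = 120 m^2
warehouse = [(0, 0), (80, 0), (80, 40), (0, 40)]  # 80 m x 40 m = 3200 m^2
print(size_class(house))      # small (possibly residential)
print(size_class(warehouse))  # large (possibly commercial)
```

In Earth Engine the per-footprint equivalent would be mapping `f.set('area_m2', f.geometry().area(1))` over the footprint collection and feeding that property to the classifier alongside the spectral bands.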
This provides a solid foundation for detecting buildings from imagery using a traditional
machine learning approach in GEE. For more advanced, deep learning-based solutions,
you would look into the GEE-TensorFlow integration pathways.
