
Tuesday, April 14, 2020

Recoloring NoIR images on the Raspberry Pi with OpenCV

Not too long ago I was gifted a Raspberry Pi camera. After taking some pictures I noticed that it produced very weird colors, and I discovered that it was a NoIR camera! This means that it has no infrared filter and that it can take pictures in the dark with the help of an infrared LED. Since I never found an application that required taking pictures without proper lighting, I started wondering if I could recolor the images by processing them with some Python magic. The problem is clearly ill posed: once the camera takes a picture sensing the wrong colors, the original colors are lost. However, it is possible to transfer the coloring from one image to another, and this old but gold paper shows a technique that is simple enough to be implemented on a pi. Hence the idea: create a script that recolors the images from the NoIR camera using the colors from images taken with a proper infrared filter.
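For context, the core of the technique in that paper is simple statistics matching: shift and scale each channel of the target image so that its mean and standard deviation match those of the source, working in a decorrelated color space. Here's a minimal sketch of the idea (note the assumptions: it uses OpenCV's Lab space as a stand-in for the lαβ space of the paper, and the function name is mine; the color_transfer package used below handles the details properly):
# a rough sketch of the statistics matching behind color transfer
import cv2
import numpy as np

def match_stats(source, target):
    # Lab decorrelates the channels enough for per-channel matching
    src = cv2.cvtColor(source, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt = cv2.cvtColor(target, cv2.COLOR_BGR2LAB).astype(np.float32)
    for c in range(3):
        # match mean and standard deviation channel by channel
        mu_s, std_s = src[:, :, c].mean(), src[:, :, c].std()
        mu_t, std_t = tgt[:, :, c].mean(), tgt[:, :, c].std()
        tgt[:, :, c] = (tgt[:, :, c] - mu_t) * (std_s / (std_t + 1e-6)) + mu_s
    return cv2.cvtColor(np.clip(tgt, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)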

Here are the elements I gathered to start experimenting:
  • A nice implementation of the color transfer algorithm, easy to install and run on the pi.
  • An installation of OpenCV on the pi. It's possible to have a fully optimized OpenCV installation for your pi by building it from source, but for this project it's okay to install the library from binaries (this command will do the trick: sudo apt-get install python-opencv). A quick way to verify the installation is shown right after this list.
  • A camera stand that I built myself recycling the components of an unused USB fan.
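Before moving on, a quick sanity check confirms that the library installed from binaries can be imported (the reported number will differ depending on the packaged version):
# check that OpenCV is importable and see which version was installed
import cv2
print(cv2.__version__)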


The Python script to acquire and recolor the images turned out to be pretty compact:
from picamera.array import PiRGBArray
from picamera import PiCamera
from sys import argv
# get this with: pip install color_transfer
from color_transfer import color_transfer 

import time
import cv2

# init the camera
camera = PiCamera()
rawCapture = PiRGBArray(camera)

# give the camera time to warm up
time.sleep(0.1)

# capture
camera.capture(rawCapture, format="bgr")
captured = rawCapture.array

# import the color source
color_source = cv2.imread(argv[1])

# transfer the color
result = color_transfer(color_source, captured,
                        clip=True, preserve_paper=False)

cv2.imwrite(argv[2], result)
This script captures an image from the camera and reads another image, which will be the color source, from disk. Then it recolors the captured image and saves the result. The script takes two parameters as input: the color source and the name of the output file. Here's an example of how to run the script on the pi:
$ python capture.py color_source.jpg result.jpg
Here are some sample pictures that were recolored. In each of the figures below, the color source is on the left, the image from the NoIR camera in the middle, and the final result on the right.


Here the source has vivid colors and the details are nice and sharp, while the image from the NoIR camera is almost monochromatic. In the recolored image the colors of the curtain and the wall were recovered, though the result still has quite a low contrast.

This time the resulting image is much sharper and the colors are more intense, even more intense than in the source.


This result is particularly interesting because the NoIR image shows very nasty colors, as there was quite a lot of sunlight when the picture was taken. Recoloring the image recovered the green of some trees and the blue of the sky; however, the walls and the ground took on a greenish appearance while some plants look purple.

In conclusion, this turned out to be a fun experiment that also produced some encouraging results. Next step? Recoloring the images with more modern deep learning techniques.

Sunday, July 8, 2012

Color quantization

The aim of color clustering is to produce a small set of representative colors that captures the color properties of an image. Using this small set of colors found by the clustering, a quantization process can be applied to the image to produce a new version that has been "simplified," both in colors and shapes.
In this post we will see how to use the K-Means algorithm to perform color clustering and how to apply the quantization. Let's see the code:
from pylab import imread,imshow,figure,show,subplot
from numpy import reshape,uint8,flipud
from scipy.cluster.vq import kmeans,vq

img = imread('clearsky.jpg')

# reshaping the pixels matrix and casting to float, as required by kmeans
pixel = reshape(img,(img.shape[0]*img.shape[1],3)).astype(float)

# performing the clustering
centroids,_ = kmeans(pixel,6) # six colors will be found
# quantization
qnt,_ = vq(pixel,centroids)

# reshaping the result of the quantization
centers_idx = reshape(qnt,(img.shape[0],img.shape[1]))
clustered = centroids[centers_idx].astype(uint8) # cast back to 8 bits for display

figure(1)
subplot(211)
imshow(flipud(img))
subplot(212)
imshow(flipud(clustered))
show()
The result should be as follows:


We have the original image on the top and the quantized version on the bottom. We can see that the image on the bottom uses only six colors. Now we can plot the colors found by the clustering in the RGB space with the following code:
# visualizing the centroids into the RGB space
from mpl_toolkits.mplot3d import Axes3D
fig = figure(2)
ax = fig.add_subplot(111, projection='3d')
ax.scatter(centroids[:,0],centroids[:,1],centroids[:,2],c=centroids/255.,s=100)

show()
And this is the result:
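A natural question is how many colors to keep. The kmeans function also returns the average distortion of the clustering, so a rough way to choose is to look for the elbow of the distortion curve. A small sketch, reusing the pixel matrix from the script above:
# compute the average distortion for an increasing number of colors
from pylab import plot,show
from scipy.cluster.vq import kmeans
distortions = []
ks = range(2,11)
for k in ks:
 _,distortion = kmeans(pixel,k)
 distortions.append(distortion)
plot(ks,distortions) # the "elbow" suggests a reasonable number of colors
show()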

Monday, November 7, 2011

Computing a disparity map in OpenCV

A disparity map contains information related to the distance of the objects of a scene from a viewpoint. In this example we will see how to compute a disparity map from a stereo pair and how to use the map to cut the objects far from the cameras.
The stereo pair is represented by two input images taken with two cameras separated by a known distance, and the disparity map is derived from the offset of the objects between them. There are various algorithms to compute a disparity map; one of those implemented in OpenCV is the graph cut algorithm. To use it, we have to call the function CreateStereoGCState() to initialize the data structure needed by the algorithm and the function FindStereoCorrespondenceGC() to get the disparity map. Let's see the code:
def cut(disparity, image, threshold):
 for i in range(0, image.height):
  for j in range(0, image.width):
   # keep closer object
   if cv.GetReal2D(disparity,i,j) > threshold:
    cv.Set2D(disparity,i,j,cv.Get2D(image,i,j))

# loading the stereo pair
left  = cv.LoadImage('scene_l.bmp',cv.CV_LOAD_IMAGE_GRAYSCALE)
right = cv.LoadImage('scene_r.bmp',cv.CV_LOAD_IMAGE_GRAYSCALE)

disparity_left  = cv.CreateMat(left.height, left.width, cv.CV_16S)
disparity_right = cv.CreateMat(left.height, left.width, cv.CV_16S)

# data structure initialization
state = cv.CreateStereoGCState(16,2)
# running the graph-cut algorithm
cv.FindStereoCorrespondenceGC(left,right,
                          disparity_left,disparity_right,state)

disp_left_visual = cv.CreateMat(left.height, left.width, cv.CV_8U)
cv.ConvertScale(disparity_left, disp_left_visual, -16) # scale the values to 8 bits
cv.Save("disparity.pgm", disp_left_visual) # save the map

# cutting the objects farther than the threshold (120)
cut(disp_left_visual,left,120)

cv.NamedWindow('Disparity map', cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage('Disparity map', disp_left_visual)
cv.WaitKey()
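As a side note, for a calibrated rig the disparity d of a point is inversely proportional to its depth: Z = f*B/d, where f is the focal length in pixels and B is the baseline between the two cameras. A tiny sketch (f and B below are made-up values, just for illustration):
# hypothetical focal length (pixels) and baseline (meters)
f = 500.0
B = 0.12

def depth_from_disparity(d):
 # Z = f*B/d: the larger the disparity, the closer the object
 return f*B/d if d > 0 else float('inf')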
These are the two input images I used to test the program (respectively left and right):
Result using threshold = 100

Thursday, November 3, 2011

Face and eyes detection in OpenCV

The goal of object detection is to find an object of a pre-defined class in an image. In this post we will see how to use the Haar Classifier implemented in OpenCV in order to detect faces and eyes in a single image. We are going to use two trained classifiers stored in two XML files:
  • haarcascade_frontalface_default.xml - that you can find in the directory /data/haarcascades/ of your OpenCV installation
  • haarcascade_eye.xml - that you can download from this website.
The first one is able to detect faces and the second one eyes. To use a trained classifier stored in an XML file, we need to load it into memory using the function cv.Load() and call the function cv.HaarDetectObjects() to detect the objects. Let's see the snippet:
imcolor = cv.LoadImage('detectionimg.jpg') # input image
# loading the classifiers
haarFace = cv.Load('haarcascade_frontalface_default.xml')
haarEyes = cv.Load('haarcascade_eye.xml')
# running the classifiers
storage = cv.CreateMemStorage()
detectedFace = cv.HaarDetectObjects(imcolor, haarFace, storage)
detectedEyes = cv.HaarDetectObjects(imcolor, haarEyes, storage)

# draw a green rectangle where the face is detected
if detectedFace:
 for face in detectedFace:
  cv.Rectangle(imcolor,(face[0][0],face[0][1]),
               (face[0][0]+face[0][2],face[0][1]+face[0][3]),
               cv.RGB(155, 255, 25),2)

# draw a purple rectangle where an eye is detected
if detectedEyes:
 for eye in detectedEyes:
  cv.Rectangle(imcolor,(eye[0][0],eye[0][1]),
               (eye[0][0]+eye[0][2],eye[0][1]+eye[0][3]),
               cv.RGB(155, 55, 200),2)

cv.NamedWindow('Face Detection', cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage('Face Detection', imcolor) 
cv.WaitKey()
These images were produced by running the script with two different inputs. The first one comes from an image that contains two faces and four eyes:
And the second one from an image that contains one face and two eyes (the shakira.jpg we used in the post about PCA):
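If the detectors miss some objects or fire on noise, cv.HaarDetectObjects accepts a few extra parameters that control the scan; the values below are just a starting point to experiment with:
# scale_factor, min_neighbors, flags and min_size can all be tuned;
# CV_HAAR_DO_CANNY_PRUNING skips regions unlikely to contain objects
detectedFace = cv.HaarDetectObjects(imcolor, haarFace, storage,
                                    1.2, 4, cv.CV_HAAR_DO_CANNY_PRUNING,
                                    (40, 40))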

Monday, October 24, 2011

Corner Detection with OpenCV

Corner detection is an approach used within computer vision systems to extract certain kinds of features and infer the contents of an image [Ref]. We saw in the previous post how to perform edge detection using the Sobel operator. Starting from the edges, we can define a corner as a point for which there are two dominant and different edge directions in a local neighborhood of the point.
One of the most used tools for corner detection is the Harris Corner Detector. OpenCV provides a function that implements this operator: CornerHarris(...). The corners in the image can be found as the local maxima of the response image it returns. Let's see how to use it in Python:
imcolor = cv.LoadImage('stairs.jpg')
image = cv.LoadImage('stairs.jpg',cv.CV_LOAD_IMAGE_GRAYSCALE)
cornerMap = cv.CreateMat(image.height, image.width, cv.CV_32FC1)
# OpenCV corner detection
cv.CornerHarris(image,cornerMap,3)

for y in range(0, image.height):
 for x in range(0, image.width):
  harris = cv.Get2D(cornerMap, y, x) # get the corner response at (y,x)
  # check the corner detector response against a threshold
  if harris[0] > 10e-06:
   # draw a small circle on the original image
   cv.Circle(imcolor,(x,y),2,cv.RGB(155, 0, 25))

cv.NamedWindow('Harris', cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage('Harris', imcolor) # show the image
cv.SaveImage('harris.jpg', imcolor)
cv.WaitKey()
The following image is the result of the program:
The red markers are the corners found by OpenCV's function.
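Since the corners are defined as local maxima of the response, the thresholding above can be made more faithful by keeping a point only when it is also the maximum of its neighborhood. A sketch of this variant, using dilation as a maximum filter:
# dilation replaces each value with the maximum of its 3x3 neighborhood,
# so a pixel that equals its dilated value is a local maximum
dilated = cv.CreateMat(image.height, image.width, cv.CV_32FC1)
cv.Dilate(cornerMap, dilated)
for y in range(0, image.height):
 for x in range(0, image.width):
  harris = cv.Get2D(cornerMap, y, x)[0]
  if harris > 10e-06 and harris == cv.Get2D(dilated, y, x)[0]:
   cv.Circle(imcolor,(x,y),2,cv.RGB(155, 0, 25))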

Friday, October 14, 2011

Beginning with OpenCV in Python

OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision [Ref]. In this post we will see how to use some of the basic functions of OpenCV in Python.

The following code opens an image from the disk, prints some image properties on the console and shows a window that contains the image.
# load and show an image in gray scale
image = cv.LoadImage('ariellek.jpg',cv.CV_LOAD_IMAGE_GRAYSCALE)

# print some image properties
print 'Depth:',image.depth,'# Channels:',image.nChannels
print 'Size:',image.width,image.height
print 'Pixel values average',cv.Avg(image)

# create the window
cv.NamedWindow('my window', cv.CV_WINDOW_AUTOSIZE)
cv.ShowImage('my window', image) # show the image
cv.WaitKey() # the window will be closed with a (any)key press
This is the image I used for this example.
And this is what the script showed on the console:
Depth: 8 # Channels: 1
Size: 366 550
Pixel values average (80.46735717834079, 0.0, 0.0, 0.0)
Now we can resize the image loaded above:
# resize the image
dst = cv.CreateImage((150,150), 8, 1)
cv.Resize(image,dst,interpolation=cv.CV_INTER_LINEAR)
cv.ShowImage('my window', dst)
cv.WaitKey()
cv.SaveImage('image2.jpg', dst) # save the image
And this is the result.
A Sobel operator can be applied as follows:
# Sobel operator
dstSobel = cv.CreateMat(image.height, image.width, cv.CV_32FC1)
cv.Sobel(image,dstSobel,1,1,3)
cv.ShowImage('my window', dstSobel)
cv.WaitKey()
cv.SaveImage('imageSobel.jpg', dstSobel)
And this is the result on the picture that I'm using:
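A small caveat: dstSobel is a 32-bit float matrix and, depending on the OpenCV version, SaveImage may not handle it well. A safer variant (just a sketch) converts the response to 8 bits first:
# take the absolute value of the response and scale it down to 8 bits
dstSobel8 = cv.CreateMat(image.height, image.width, cv.CV_8U)
cv.ConvertScaleAbs(dstSobel, dstSobel8)
cv.SaveImage('imageSobel.jpg', dstSobel8)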
The final example below combines two operations, a smoothing filter and a subtraction: it applies a simple blur to the original image and then subtracts the result of the filtering from the original image.
# image smoothing and subtraction
imageBlur = cv.CreateImage(cv.GetSize(image), image.depth, image.nChannels)
# filtering the original image with a simple blur
cv.Smooth(image, imageBlur, cv.CV_BLUR, 15, 15)
diff = cv.CreateImage(cv.GetSize(image), image.depth, image.nChannels)
# subtraction (original - filtered)
cv.AbsDiff(image,imageBlur,diff)
cv.ShowImage('my window', diff)
cv.WaitKey()
cv.SaveImage('imageDiff.jpg', diff)
The final output is:

Wednesday, August 10, 2011

Applying a Moebius transformation to an image

The Moebius transformation is defined as

f(z) = (az + b) / (cz + d)

where z is a complex variable different from -d/c, and a, b, c, d are complex numbers with ad - bc ≠ 0.
In this post we will see how to apply the Moebius transform to an image.
Given a set of three distinct points on the complex plane z1, z2, z3 and a second set of distinct points w1, w2, w3, there exists precisely one Moebius transformation f(z) which maps the zs to the ws. So, in the first step we have to determine f(z) from the given sets of points. We will use the explicit determinant formula to compute the coefficients:
from pylab import *
from numpy import *

zp = [157+148j, 78+149j, 54+143j] # the three source points (zs)
wa = [147+143j, 78+140j, 54+143j] # the corresponding target points (ws)

# transformation parameters from the explicit determinant formula
a = linalg.det([[zp[0]*wa[0], wa[0], 1], 
                [zp[1]*wa[1], wa[1], 1], 
                [zp[2]*wa[2], wa[2], 1]])

b = linalg.det([[zp[0]*wa[0], zp[0], wa[0]], 
                [zp[1]*wa[1], zp[1], wa[1]], 
                [zp[2]*wa[2], zp[2], wa[2]]])

c = linalg.det([[zp[0], wa[0], 1], 
                [zp[1], wa[1], 1], 
                [zp[2], wa[2], 1]])

d = linalg.det([[zp[0]*wa[0], zp[0], 1], 
                [zp[1]*wa[1], zp[1], 1], 
                [zp[2]*wa[2], zp[2], 1]])
Now we can apply the transformation to the pixel coordinates.
img = fliplr(imread('mondrian.jpg')) # load an image

r = ones((500,500,3),dtype=uint8)*255 # empty-white image
for i in range(img.shape[0]):
 for j in range(img.shape[1]):
  z = complex(i,j)
  # transformation applied to the pixel coordinates
  w = (a*z+b)/(c*z+d)
  r[int(real(w)),int(imag(w)),:] = img[i,j,:] # copy of the pixel

subplot(1,2,1)
title('Original Mondrian')
imshow(img)
subplot(1,2,2)
title('Mondrian after the transformation')
imshow(roll(r,120,axis=1))
show()
And this is the result.
This is another image obtained by changing the vectors zp and wa.
As we can see, the result of the transformation has some empty areas that we should fill using interpolation. So let's look at another way to implement a geometric transformation on an image. The following code uses scipy.ndimage.geometric_transform to implement the inverse Moebius transform:
from scipy.ndimage import geometric_transform

def shift_func(coords):
 """ Define the moebius transformation, though backwards """
 # turn the first two coordinates into a complex number
 z = coords[0] + 1j*coords[1]
 w = (d*z-b)/(-c*z+a) # the inverse Moebius transform
 #take the color along for the ride
 return real(w),imag(w),coords[2]
    
r = geometric_transform(img,shift_func,cval=255,output_shape=(450,350,3)) 
It gives the following result:
We can see that the geometric_transform provides the pixel interpolation automatically.
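The interpolation is a spline whose order can be chosen with the order parameter (the default is a cubic spline, order=3; order=0 gives nearest neighbor). For example:
# same transformation with bilinear interpolation instead of cubic
r2 = geometric_transform(img,shift_func,cval=255,
                         output_shape=(450,350,3),order=1)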


Wednesday, July 27, 2011

PCA and image compression with numpy

In the previous post we have seen the princomp function. This function performs principal component analysis (PCA) on an n-by-p data matrix and uses all the p principal components to compute the principal component scores. In this new post we will see a modified version of princomp where the representation of the original data in the principal component space is computed with fewer than p principal components:
from numpy import mean,cov,cumsum,dot,linalg,size,flipud,argsort

def princomp(A,numpc=0):
 # computing eigenvalues and eigenvectors of covariance matrix
 M = (A-mean(A.T,axis=1)).T # subtract the mean (along columns)
 [latent,coeff] = linalg.eig(cov(M))
 p = size(coeff,axis=1)
 idx = argsort(latent) # sorting the eigenvalues
 idx = idx[::-1]       # in descending order
 # sorting eigenvectors according to the sorted eigenvalues
 coeff = coeff[:,idx]
 latent = latent[idx] # sorting eigenvalues
 if numpc < p and numpc >= 0:
  coeff = coeff[:,range(numpc)] # cutting some PCs if needed
 score = dot(coeff.T,M) # projection of the data in the new space
 return coeff,score,latent
The following code uses the new version of princomp to compute the PCA of a matrix that represents an image in gray scale. The PCA is computed several times with an increasing number of principal components. The script shows the images reconstructed using fewer than 50 principal components (out of 200).
from pylab import imread,subplot,imshow,title,gray,figure,show,NullLocator
A = imread('shakira.jpg') # load an image
A = mean(A,2) # to get a 2-D array
full_pc = size(A,axis=1) # number of all the principal components
i = 1
dist = []
for numpc in range(0,full_pc+10,10): # 0 10 20 ... full_pc
 coeff, score, latent = princomp(A,numpc)
 Ar = dot(coeff,score).T+mean(A,axis=0) # image reconstruction
 # difference in Frobenius norm
 dist.append(linalg.norm(A-Ar,'fro'))
 # showing the pics reconstructed with less than 50 PCs
 if numpc <= 50:
  ax = subplot(2,3,i,frame_on=False)
  ax.xaxis.set_major_locator(NullLocator()) # remove ticks
  ax.yaxis.set_major_locator(NullLocator())
  i += 1 
  imshow(flipud(Ar))
  title('PCs # '+str(numpc))
  gray()

figure()
imshow(flipud(A))
title('numpc FULL')
gray()
show()
The resulting images:


We can see that 40 principal components are enough to obtain a good approximation of the original image.


At the end of this experiment, we can plot the distance of the reconstructed images from the original image in Frobenius norm (red curve) and the cumulative sum of the eigenvalues (blue curve). Recall that the cumulative sum of the eigenvalues shows the level of variance accounted for by each of the corresponding eigenvectors. On the x axis there is the number of eigenvalues/eigenvectors used.
from pylab import plot,axis
from numpy import array
figure()
perc = cumsum(latent)/sum(latent)
dist = array(dist)/max(dist) # normalize the distances
plot(range(len(perc)),perc,'b',range(0,full_pc+10,10),dist,'r')
axis([0,full_pc,0,1.1])
show()
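As a final remark, the perc curve computed above can also be used to choose the number of components automatically, for example the smallest number that accounts for 90% of the variance (a small sketch reusing the variables of the script):
from numpy import argmax
# index of the first point where the cumulative variance exceeds 90%
numpc_90 = argmax(perc >= 0.9) + 1
print 'PCs needed for 90% of the variance:', numpc_90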