Thursday, August 6, 2009

Activity 10: Preprocessing text

In this activity we set out to clean up a scanned document to make it ready for handwriting recognition. Our main goals were to remove unnecessary details, such as table lines, and to binarize the result. I first rotated the image so that the horizontal lines were parallel to the x axis, checking the alignment by making sure that the maxima of the image's FFT formed a perfectly vertical line. I then removed the horizontal lines by applying a vertical mask over the FFT. Finally, I binarized the image using the im2bw function with a threshold of 0.38.
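The line-removal and thresholding steps can be sketched roughly as follows. This is a minimal NumPy sketch and not the code I actually ran: the function names, the mask half-width and the size of the preserved region around the DC term are all assumptions, and the image is assumed to be grayscale with values in [0, 1].

```python
import numpy as np

def remove_horizontal_lines(gray, half_width=1, keep_dc=2):
    """Suppress horizontal lines by masking the vertical axis of the centred FFT.

    half_width and keep_dc are illustrative values, not the ones used in the activity.
    """
    rows, cols = gray.shape
    cy, cx = rows // 2, cols // 2
    spectrum = np.fft.fftshift(np.fft.fft2(gray))
    mask = np.ones((rows, cols))
    # Horizontal lines vary only along y, so their energy sits on the
    # vertical line through the centre of the shifted spectrum.
    mask[:, cx - half_width:cx + half_width + 1] = 0.0
    # Keep a small block around the DC term so the mean brightness survives.
    mask[cy - keep_dc:cy + keep_dc + 1, cx - half_width:cx + half_width + 1] = 1.0
    return np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real

def binarize(gray, threshold=0.38):
    """Pixels brighter than the threshold become True, as im2bw does."""
    return gray > threshold
```

A wider mask removes the lines more thoroughly but also eats into the strokes of the text, so the half-width is something to tune by eye.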

As an additional objective, we also sought to detect all occurrences of the word "description". I did this using the correlation technique we used in Activity 5.
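For reference, correlation through the FFT can be sketched as below. I am assuming here that the template (the cropped word) is zero-padded to the size of the image, and the helper names and the 0.9 peak fraction are illustrative choices rather than the values from Activity 5.

```python
import numpy as np

def correlate_fft(image, template):
    """Circular cross-correlation of image with template via the FFT."""
    padded = np.zeros(image.shape, dtype=float)
    th, tw = template.shape
    padded[:th, :tw] = template  # template placed in the top-left corner
    f_image = np.fft.fft2(image)
    f_template = np.fft.fft2(padded)
    # Multiplying by the conjugate spectrum gives correlation, not convolution.
    return np.fft.ifft2(f_image * np.conj(f_template)).real

def find_peaks(corr, fraction=0.9):
    """Return (row, col) positions whose correlation is near the global maximum."""
    return np.argwhere(corr >= fraction * corr.max())
```

With this layout each peak marks the top-left corner of a match. Keeping only values close to the maximum is what discards the partial matches.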

RESULTS



Original image / horizontal lines removed / b&w



Basis for correlation



Result of correlation

The removal of the horizontal lines was quite successful. Although the filtering leaves a visibly regular (sinusoidal) erasure pattern, this no longer matters once we convert the image to B&W. The correlation also works quite well, as long as only the maxima are taken into account.

For this activity I give myself an 8/10 since I wasn't able to take advantage of morphological operations in binarizing the image.

Activity 12: Color image segmentation

For this activity we try to distinguish ROIs based on their color hues. By changing our color coordinates so that value (brightness) becomes an independent variable, we are left with a 2D color map that disregards brightness. With value set aside, we can distinguish objects based on color alone.
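One common way to set brightness aside is to use normalized chromaticity coordinates, where each channel is divided by the sum R + G + B. The sketch below assumes that this is the coordinate change meant here; the function name is hypothetical.

```python
import numpy as np

def rgb_to_rg(rgb):
    """Convert an (..., 3) RGB array to normalized chromaticity (r, g).

    Assumes normalized chromaticity coordinates: r = R/(R+G+B), g = G/(R+G+B).
    """
    rgb = np.asarray(rgb, dtype=float)
    total = rgb.sum(axis=-1)
    total = np.where(total == 0, 1.0, total)  # avoid dividing by zero on black pixels
    r = rgb[..., 0] / total
    g = rgb[..., 1] / total
    return r, g
```

Since r + g + b = 1, the third coordinate carries no extra information, which is why a 2D map is enough. A bright and a dim pixel of the same hue land on the same (r, g) point.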

We achieved the actual differentiation in two ways: parametric and non-parametric. In the parametric method, we assume that the color distribution of our object is a Gaussian curve. It is parametric in that we may adjust the standard deviation of the spread to be more or less tolerant. In the non-parametric method, we use the object's actual color distribution and use the resulting curve for back projection.
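Both methods can be sketched in a few lines. This is an illustration under assumptions, not my original code: the inputs are taken to be chromaticity values in [0, 1] for the image (r, g) and for a cropped patch of the object (patch_r, patch_g), the Gaussians are left unnormalized, and the scale parameter and the bin count of 32 are hypothetical choices.

```python
import numpy as np

def gaussian_likelihood(channel, patch_channel, scale=1.0):
    """Unnormalized Gaussian built from the patch mean and standard deviation."""
    mu = patch_channel.mean()
    sigma = patch_channel.std() * scale + 1e-12  # guard against a flat patch
    return np.exp(-((channel - mu) ** 2) / (2.0 * sigma ** 2))

def parametric_map(r, g, patch_r, patch_g, scale=1.0):
    """Parametric method: product of independent Gaussians in r and g."""
    return (gaussian_likelihood(r, patch_r, scale)
            * gaussian_likelihood(g, patch_g, scale))

def backprojection_map(r, g, patch_r, patch_g, bins=32):
    """Non-parametric method: look each pixel up in the patch's 2D histogram."""
    hist, _, _ = np.histogram2d(patch_r.ravel(), patch_g.ravel(),
                                bins=bins, range=[[0, 1], [0, 1]])
    hist = hist / hist.sum()
    ri = np.minimum((r * bins).astype(int), bins - 1)
    gi = np.minimum((g * bins).astype(int), bins - 1)
    return hist[ri, gi]
```

Raising scale above 1 widens the Gaussians and makes the parametric map more tolerant. The histogram version has no such knob apart from the bin size.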

RESULTS

Original Image




http://www.missouriplants.com/Yellowopp/Helianthus_divaricatus_flowers.jpg

Parametric approach



Non-parametric approach



It seems that assuming a Gaussian distribution of colors causes white noise to appear on the ROI. This is probably a reflection of the normal distribution of noise in the original image. When we remove this assumption, we get a much more solid and accurate ROI, since we use prior knowledge of the object's color distribution. I also observed in my peer's (Sison) work that this method is much more effective in differentiating adjacent colors (e.g. orange and red).

I give myself a 10/10 because I was able to execute the procedure completely and efficiently.

Monday, July 20, 2009

Activity 8: Morphological Operations

In this activity we were tasked to explore the properties of dilation and erosion by applying the said operations on a square, a triangle, a circle, a hollow square and a cross. We are to perform the operations using each of the following: 1) 4x4 square, 2) 2x4 rectangle, 3) 4x2 rectangle, 4) a plus sign of width 1 and length 5. My results are listed below. They are arranged in order of the operators above.
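For anyone who wants to check the predictions numerically, binary dilation and erosion can be sketched directly from their set definitions. The helper names are hypothetical, and I am assuming that the origin of each structuring element sits at index (height // 2, width // 2), which matters for the even-sized operators.

```python
import numpy as np

def _shift(image, dy, dx):
    """Shift a boolean image by (dy, dx), filling vacated pixels with False."""
    out = np.zeros_like(image)
    h, w = image.shape
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        image[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def _offsets(selem):
    """Offsets of the 'on' pixels of the structuring element from its origin."""
    cy, cx = selem.shape[0] // 2, selem.shape[1] // 2
    return [(int(y) - cy, int(x) - cx) for y, x in np.argwhere(selem)]

def dilate(image, selem):
    """Union of the image translated by every offset in the element."""
    out = np.zeros_like(image)
    for dy, dx in _offsets(selem):
        out |= _shift(image, dy, dx)
    return out

def erode(image, selem):
    """Pixels where the element, placed at that pixel, fits entirely inside the image."""
    out = np.ones_like(image)
    for dy, dx in _offsets(selem):
        out &= _shift(image, -dy, -dx)
    return out

# Example shapes: a 4x4 square on a 12x12 canvas and two of the operators.
square = np.zeros((12, 12), dtype=bool)
square[4:8, 4:8] = True
square_4x4 = np.ones((4, 4), dtype=bool)
plus_5 = np.zeros((5, 5), dtype=bool)
plus_5[2, :] = True
plus_5[:, 2] = True
```

In this example the 4x4 operator erodes the 4x4 square down to a single pixel, while the plus sign, being 5 pixels long, erodes it away completely.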

DILATION







EROSION







My predictions for the dilations were spot on except for the triangle dilated with a plus sign. I forgot to note that the bottom side should have manifested the plus sign as square 'clips'. As for the erosions, I was correct for all except the hollow squares. I thought the widths of the operators would have been enough to completely erode the corresponding edges. It seems, however, that thin lines still remained in every case except the last one (the plus sign operator).

I give myself a grade of 9/10 because I was able to execute the procedure completely. My predictions were a bit off hence the -1.

Wednesday, July 8, 2009

Activity 6: Properties of the 2D Fourier Transform

We seek to understand the properties of the 2D Fourier Transform by applying it to several test cases. Below are the results I've gathered:

ORIGINAL / FFT







square / square annulus / donut / slits / dots









several cases of the sin function:
3 vertical sin waves of decreasing frequencies /
biased / rotated / sum of sines in x and y / previous case + rotated sines

The FTs of both the square and the square annulus look like stars/crosses. They differ only in that the latter looks striped, as if it skips over some values. The symmetry of the FTs reflects the symmetry of the squares. The FT of a donut has a strong peak at the center, surrounded by a symmetric pattern covering a circular area around it. This reflects the circular symmetry of the object. The FT of the slits looks like a vertical interference pattern at the center. It is as if we had recorded the interference pattern after letting light pass through the slits. This is expected, as the diffraction pattern we get from an aperture is the same as its FT.

The FT of the two dots looks like the superposition of two Airy disks placed symmetrically off center. We already found in the previous experiment that the FT of a circle is an Airy disk. It is expected, then, that the FT of two circles is simply a superposition of two Airy disks.
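The superposition argument rests on the linearity of the FT, which is easy to check numerically. The sketch below uses a hypothetical disk helper and arbitrary positions and radii; it is not the test image from the activity.

```python
import numpy as np

def disk(n, cy, cx, radius):
    """n x n image with a filled circle of the given radius centred at (cy, cx)."""
    y, x = np.ogrid[:n, :n]
    return ((y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2).astype(float)

# Two identical dots placed symmetrically about the centre column.
left = disk(64, 32, 24, 3)
right = disk(64, 32, 40, 3)

ft_sum = np.fft.fft2(left + right)                    # FT of both dots together
sum_ft = np.fft.fft2(left) + np.fft.fft2(right)       # sum of the individual FTs
linear = np.allclose(ft_sum, sum_ft)

# Each dot alone has the same FT magnitude; the offset only changes the phase.
same_magnitude = np.allclose(np.abs(np.fft.fft2(left)),
                             np.abs(np.fft.fft2(right)))
```

Because the two spectra add as complex numbers, the displayed magnitude depends on their relative phase as well as on each Airy pattern.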

As the frequency of the sine wave decreases, the points corresponding to that frequency come closer to the center. As expected, the FT is symmetric along the X axis. Adding a bias simply adds a point at the center of the FT. Rotating the sine wave rotates the FT by the same angle. The last FT turned out as expected, since it is simply a superposition of the previous FTs. It outlines a circle, since we sum up rotations of the same frequency. In addition, we still retain the four points from the previous FT.
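These sine-wave observations can be reproduced with a short sketch. The grid size, the frequencies and the peak_positions helper are assumptions made for the illustration, and the "rotated" case uses a wave along the diagonal so that it stays periodic on the grid.

```python
import numpy as np

def peak_positions(image, count):
    """(ky, kx) offsets from the centre of the strongest points of the shifted FFT."""
    mag = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    flat = np.argsort(mag.ravel())[::-1][:count]
    ys, xs = np.unravel_index(flat, mag.shape)
    cy, cx = image.shape[0] // 2, image.shape[1] // 2
    return sorted(zip((ys - cy).tolist(), (xs - cx).tolist()))

n = 64
X, Y = np.meshgrid(np.arange(n), np.arange(n))

high = np.sin(2 * np.pi * 4 * X / n)             # 4 cycles across the image
low = np.sin(2 * np.pi * 2 * X / n)              # lower frequency
biased = high + 1.0                              # constant offset added
diagonal = np.sin(2 * np.pi * (3 * X + 3 * Y) / n)  # wave along the diagonal

peaks_high = peak_positions(high, 2)
peaks_low = peak_positions(low, 2)
peaks_biased = peak_positions(biased, 3)
peaks_diagonal = peak_positions(diagonal, 2)
```

The peaks move from kx = ±4 to kx = ±2 as the frequency drops, the bias adds a point at (0, 0), and the diagonal wave puts its peaks on the diagonal of the FT.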

I give myself a grade of 9/10 because I was able to execute the procedure but I feel I could do better in interpreting the results.