Announce

PukiWiki contents have been moved into SONOTS Plugin (20070703)

Scientific software, projects, and experiments from my graduate school days

Project: Object Tracking using Particle filtering

A particle filter implements a Bayesian inference filter using Monte Carlo simulation. The Kalman filter is well known to give the analytically optimal Bayesian solution in the linear/Gaussian case. The particle filter is more general and can also model non-linear/non-Gaussian cases. It became popular for object tracking after it was introduced to the computer vision community as the Condensation algorithm.
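
The predict / weight / resample cycle can be sketched in a few lines. This is a minimal 1-D bootstrap particle filter, not the code of this project: the function names, the random-walk motion model, and the noise levels are all assumptions chosen for illustration.

```python
import math
import random


def particle_filter_step(particles, measurement, motion_std, meas_std, rng):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: propagate each particle through an assumed random-walk model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2)
               for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]
    # Estimate: posterior mean, the weighted average of the particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))
    # Resample: draw a new, equally weighted set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(measurements, n_particles=500, seed=0):
    """Run the filter over a measurement sequence; returns one estimate each."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles, estimate = particle_filter_step(particles, z, 0.5, 1.0, rng)
        estimates.append(estimate)
    return estimates
```

Because nothing here requires the motion or measurement model to be linear or Gaussian, either one can be swapped for an arbitrary function, which is the generality the Kalman filter lacks.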

Moved to http://code.google.com/p/opencvx/wiki/ParticleFilter

Tag: ComputerVision ParticleFilter ConDensation C++

Project: Individual Voice Activity Detection

Individual voice activity detection (IVAD) detects the speech regions of a person of interest in an audio recording. I combine a VAD method named PARADE [1] with a speaker identification method based on GMMs [2] to construct an IVAD system.

The report is available at filereport.pdf

ONEFOURZERO_STFT_white0db.png
First Edition: May 2008. Last Modified: May 2008
Tag: Scientific SoundProcessing VAD SpeakerIdentification GMM

Project: A Comparison of Multi-class Support Vector Machine Methods on a Face Recognition Problem

Support vector machines (SVMs) were originally designed for binary classification. How to extend them effectively to multi-class classification is still an open research issue. In this project, several multi-class SVM methods (one-against-all, one-against-one, DAGSVM, Weston's multi-class SVM, and Crammer's multi-class SVM) were compared on a face recognition problem.
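
To illustrate how binary classifiers are combined in the two simplest schemes, here is a small sketch of one-against-one voting and one-against-all scoring. The binary classifiers are hypothetical stand-ins (nearest 1-D class center), not trained SVMs, and all names are made up for this example.

```python
from itertools import combinations


def one_against_one(classes, pairwise_decide, x):
    """Majority vote over all k(k-1)/2 pairwise classifiers."""
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    # Ties are broken in favor of the class listed first.
    return max(classes, key=lambda c: votes[c])


def one_against_all(classes, score, x):
    """Pick the class whose 'this class vs. the rest' score is largest."""
    return max(classes, key=lambda c: score(c, x))


# Hypothetical stand-ins for trained binary SVMs: each class has a 1-D
# center, and a classifier prefers the class whose center is closer to x.
CENTERS = {"A": 0.0, "B": 5.0, "C": 10.0}


def toy_pairwise(a, b, x):
    return a if abs(x - CENTERS[a]) <= abs(x - CENTERS[b]) else b


def toy_score(c, x):
    return -abs(x - CENTERS[c])
```

One-against-one needs k(k-1)/2 classifiers but each is trained on only two classes; one-against-all needs only k, each trained on all the data.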

The report is available at filereport.pdf

First Edition: Dec 2007. Last Modified: Dec 2007
Tag: Scientific ComputerVision PatternRecognition SVM Matlab LIBSVM

Project: Source-Filter Separation of Speech Signal (Small)

Deconvolution of a speech signal into its source component (vocal cords, white noise) and its filter component (oral cavity, coloring, envelope).

spSeparationCepstrumDemo.png

First Edition: April 2008. Last Modified: April 2008
Tag: Scientific SoundProcessing Deconvolution Matlab

Project: Pitch Detection

Pitch determination is important for many speech processing algorithms. In this project, four pitch detection methods are examined: the autocorrelation method, the cepstrum method, the harmonic product spectrum (HPS), and linear predictive coding (LPC).
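
As an illustration of the simplest of these methods, the sketch below estimates the pitch of a single frame from the peak of its autocorrelation. It is not the project's matlab code; the function name and the 50-500 Hz search range are assumptions.

```python
import math


def autocorrelation_pitch(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate the pitch (Hz) of one frame from the autocorrelation peak."""
    # Only lags corresponding to the assumed pitch range are searched.
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag over the overlapping part of the frame.
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # The lag of the strongest peak is the pitch period in samples.
    return sample_rate / best_lag
```

For example, a 200 Hz sine sampled at 8000 Hz repeats every 40 samples, so the autocorrelation peaks at lag 40.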

The report is available at filereport
Download matlab codes and data

PitchFrameLPC.png

First Edition: April 2008. Last Modified: April 2008
Tag: Scientific SoundProcessing Pitch Formant Matlab

Project: Short-Time Fourier Transform (Small)

The Short-Time Fourier Transform (STFT) is a well-studied filter bank. It can be viewed in several ways: as a Fourier transform taken over short time windows, as a low-pass filter applied to a modulated signal, or as a filter bank.
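
The first view, a Fourier transform over short windows, can be sketched directly. This is a minimal illustration rather than the project's code: the Hann window, the function name, and the direct O(N^2) DFT are assumptions made to keep the example self-contained.

```python
import cmath
import math


def stft(signal, frame_len, hop):
    """Return one DFT spectrum (list of complex bins) per windowed frame."""
    # Hann window tapers each frame to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Direct O(N^2) DFT, kept for clarity; an FFT gives the same values.
        # Only the non-negative frequency bins 0 .. frame_len/2 are computed.
        spectrum = [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
                    for k in range(frame_len // 2 + 1)]
        spectra.append(spectrum)
    return spectra
```

In the filter bank view, bin k of every frame, read along the time axis, is the output of one band-pass channel centered at k / frame_len cycles per sample.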

stftTest_signal.png

First Edition: April 2006. Last Modified: April 2008
Tag: Scientific SoundProcessing SignalProcessing STFT Matlab FFT

Project: Image Matching using Scale Invariant Feature Transform (SIFT)

Image matching is a fundamental aspect of many problems in computer vision, including object or scene recognition, solving for 3D structure from multiple images, stereo correspondence, and motion tracking. The scale-invariant feature transform (SIFT), proposed by David Lowe in 2003, is an algorithm for extracting distinctive features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and partially invariant (i.e. robust) to changes in 3D viewpoint, addition of noise, and changes in illumination. They are well localized in both the spatial and frequency domains, reducing the probability of disruption by occlusion, clutter, or noise. Large numbers of features can be extracted from typical images with efficient algorithms. In addition, the features are highly distinctive, which allows a single feature to be correctly matched with high probability against a large database of features, providing a basis for object and scene recognition.

BookSceneMatchLowe.png
First Edition: May 2007. Last Modified: May 2007
Tag: Scientific ComputerVision InterestPointDetection ImageMatching Matlab

Project: Shape and motion from image streams: A factorization method

Structure from motion, the recovery of scene geometry and camera motion from a sequence of images, is an important problem with wide applicability in tasks such as navigation and robot manipulation. Tomasi and Kanade [1] first developed a factorization method to recover shape and motion under an orthographic projection model, and obtained robust and accurate results. Poelman and Kanade [2] extended the factorization method to paraperspective projection. Triggs [3] further extended it to fully perspective projection; his method recovers a consistent set of projective depths (projective scale factors) for the image points.

In this project, we implemented these three factorization methods and compared them.

The report is available at fileFactorization.pdf

hotel.seq0.png
hotel.seq50.png
hotel.seq99.png
1st frame, 50th frame, 100th frame

orthohotel_upper3d.png

First Edition: Nov 2006. Last Modified: Nov 2006
Tag: Scientific ComputerVision SfM Matlab

Project: Texture Segmentation using Gabor Filters and K-means Clustering

This is a report on a course project implementing a texture segmentation system using filtering methods. I basically followed "Unsupervised Texture Segmentation using Gabor Filters" by A. K. Jain and F. Farrokhnia [1].

The completed report is available at fileGaborTextureSegment.pdf

data.20.png
seg.data.20.png
First Edition: Oct 2006. Last Modified: Oct 2006
Tag: Scientific ComputerVision Segmentation Matlab

Project: Normalized Cuts and Image Segmentation

An image segmentation technique based on graph theory: the normalized graph cut (Normalized Cuts).
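
The quantity this technique minimizes can be written down directly. The sketch below only evaluates the normalized cut value of a given two-way partition of a small weighted graph; it does not perform the eigenvector-based minimization, and the function name and the graph representation (a symmetric weight matrix as nested lists) are assumptions.

```python
def normalized_cut(weights, group_a):
    """Ncut(A, B) = cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V).

    weights is a symmetric matrix of edge weights; group_a is the set of
    node indices in A (B is every other node). Both groups must have at
    least one edge, otherwise the division below fails.
    """
    n = len(weights)
    in_a = [i in group_a for i in range(n)]
    cut = assoc_a = assoc_b = 0.0
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            if in_a[i]:
                assoc_a += w          # edges from A to every node
                if not in_a[j]:
                    cut += w          # edges crossing from A to B
            else:
                assoc_b += w          # edges from B to every node
    return cut / assoc_a + cut / assoc_b
```

Dividing by the association terms is what penalizes cutting off a small isolated group, which a plain minimum cut tends to favor.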

s42049.jpg
s42049-1.png s42049-2.png s42049-3.png
s42049-4.png s42049-5.png s42049-6.png

The report is available at fileNcutImageSegment.pdf

First Edition: Oct 2006. Last Modified: Oct 2006
Tag: Scientific ComputerVision Segmentation Matlab

Project: JPEG2000-like wavelet-based codec

Wavelet-based image coding has advanced significantly since the DCT-based codec became the first JPEG standard. In this project, we explore wavelet-based image coding and examine the major factors that affect coding performance. The Embedded Zero-Tree Wavelet (EZW) codec for image compression is implemented. In addition, block-based EZW coding is implemented and different block sizes are compared.

encode

  1. wavelet transform
  2. EZW coding
  3. Huffman coding

decode

  1. Huffman decoding
  2. EZW decoding
  3. inverse wavelet transform
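
The first encoding stage and the matching decoding stage can be sketched with the simplest wavelet. This is one level of a 1-D Haar transform and its inverse; the choice of the Haar wavelet and the function names are assumptions (the wavelet actually used is not stated here), and the EZW and Huffman stages are not shown.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    # Scaled pairwise sums (low-pass) and differences (high-pass).
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Undo haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal
```

Smooth regions produce detail coefficients near zero, and it is these runs of insignificant coefficients across scales that zero-tree coding exploits.
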
First Edition: April 2007. Last Modified: April 2007
Tag: Scientific ComputerVision Compression Matlab

Project: JPEG-like codec

encode

  1. RGB to YCbCr
  2. downsample Cb, Cr
  3. 2-D Block DCT
  4. Quantization (Quantization Table)
  5. zigzag scan
  6. entropy coding

decode

  1. entropy decoding
  2. Inverse zigzag
  3. Inverse Quantization
  4. Inverse 2-D Block DCT
  5. upsample Cb, Cr
  6. YCbCr to RGB
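
As an example of one stage from the lists above, here is a sketch of the zigzag scan and its inverse. The function names are made up for this illustration, and the other stages (color conversion, DCT, quantization, entropy coding) are not shown.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in JPEG zigzag order for an n x n block."""
    # Cells on the same anti-diagonal share row + col. Odd diagonals are
    # walked with the row index ascending, even ones with the column
    # index ascending, which produces the back-and-forth pattern.
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def zigzag_scan(block):
    """Flatten a square block into a list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def inverse_zigzag(values, n=8):
    """Rebuild the n x n block from a zigzag-ordered list."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(values, zigzag_order(n)):
        block[r][c] = value
    return block
```

The scan orders the quantized DCT coefficients roughly from low to high frequency, so the many zeros at high frequencies end up in long runs that the entropy coder can compress well.
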
First Edition: March 2007. Last Modified: March 2007
Tag: Scientific ComputerVision Compression Matlab

Project: Eigenfaces and Fisherfaces

This project is a study of two traditional face recognition methods, the Eigenface [10] and the Fisherface [7]. The Eigenface is regarded as the first successful face recognition technique. It uses Principal Component Analysis (PCA) to linearly project the image space onto a low-dimensional feature space. The Fisherface method enhances the Eigenface method by using Fisher’s Linear Discriminant Analysis (FLDA or LDA) for the dimensionality reduction. LDA maximizes the ratio of between-class scatter to within-class scatter, so it works better than PCA for discrimination. The Fisherface is especially useful when facial images have large variations in illumination and facial expression. In this project, the Eigenface and Fisherface methods are compared on facial images with large illumination variations.

eigenface01.png eigenface02.png eigenface03.png eigenface04.png

The report is available at fileEigenFisherFace.pdf

Tag: Scientific ComputerVision Face Recognition Matlab

Tutorial: OpenCV haartraining (Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like Features)

The OpenCV library provides a very interesting face detection demonstration. It also provides the programs (or functions) that were used to train the classifiers for that face detection system, called HaarTraining, so we can create our own object classifiers with them.

However, I could not reproduce exactly how the OpenCV developers ran haartraining for their face detection system, because they did not provide some of the information, such as the images and parameters they used for training. The objective of this report is to provide step-by-step procedures for those who follow.

My working environment is Visual Studio + Cygwin on Windows XP, or Linux. Cygwin is required because I use several UNIX commands. If you are an engineer or a scientist, I am sure you will use Cygwin (or rather, UNIX commands) for much more than this haartraining in the future.

FYI: I recommend working on something else while haartraining runs, because training takes many days (possibly a week). My typical cycle was: 1. run haartraining on Friday, 2. forget about it completely, 3. check the results the next Friday, 4. run another haartraining (loop).

figure_3.gif
A picture from the OpenCV website

History

  • 10/16/2008 - Additional experimental results.
  • 08/28/2008 - Revised entirely.
  • 06/05/2007 - opencv-1.0.0
  • 03/12/2006 - First Edition (opencv-0.9.7)

Tag: SciSoftware ComputerVision FaceDetection OpenCV

Demo: 2-D Haar Wavelet

This is a note, rather than a project, on using wavedec2 from the Wavelet Toolbox.

raw            10%
lena.png       lenadwt10.png
5%             2%
lenadwt05.png  lenadwt02.png

Demo: Supervised classification using Self-organizing Map

This is an experiment or a note rather than SOM software. The SOM itself is an unsupervised clustering technique, but here we use it for supervised classification.

somTest.png

Demo: Singular Value Decomposition

This is a note, rather than a project, on using the svd function.

raw            10%
lena.png       lenasvd10.png
5%             2%
lenasvd05.png  lenasvd02.png

Demo: Error Backpropagation Neural Networks

This is an experiment rather than software for error backpropagation neural networks (NN). We use the Neural Network toolbox.

In ebpnn.m, the output layer has only one neuron, and we exploit the fact that an error backpropagation NN outputs real values: a sample is assigned to a class if the output value is almost equal to the class's integer label.

In ebpnn2.m, the number of output-layer neurons equals the number of classes, as in perceptron neural networks, and a sample is assigned to the class whose associated neuron fires. Because the outputs are real values, we define the 'fired' neuron as the one whose output is nearest to 1.
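
The two decoding rules can be sketched as follows. This is a Python illustration, not the contents of ebpnn.m or ebpnn2.m; the function names are hypothetical, and "almost equal to the integer label" is interpreted here as "nearest integer label", which is an assumption.

```python
def decode_single_output(output, labels):
    """Single output neuron: pick the integer label nearest to the output."""
    return min(labels, key=lambda label: abs(output - label))


def decode_one_per_class(outputs, labels):
    """One neuron per class: the 'fired' neuron is the one nearest to 1."""
    fired = min(range(len(outputs)), key=lambda i: abs(outputs[i] - 1.0))
    return labels[fired]
```

Note that "nearest to 1" is not the same as "largest": with outputs [1.6, 0.9, 0.0] the second neuron is the fired one, although the first has the larger value.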

ebpnn2Test.png

Code: Mahalanobis distance

This matlab function calculates the Mahalanobis distance between each pair of vectors drawn from two data sets. The Mahalanobis distance is a distance measure that takes the correlations between variables into account [1].

d(\vec{x},\vec{y})=\sqrt{(\vec{x}-\vec{y})^T\Sigma^{-1} (\vec{x}-\vec{y})},

where \Sigma is the covariance matrix.
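
The formula can be evaluated without forming \Sigma^{-1} explicitly. The sketch below is a pure-Python illustration for a single pair of vectors, not the matlab function itself; the helper names are assumptions, and the covariance matrix is taken as given rather than estimated from data.

```python
import math


def solve(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * z[k] for k in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def mahalanobis(x, y, covariance):
    """d(x, y) = sqrt((x - y)^T Sigma^{-1} (x - y))."""
    diff = [a - b for a, b in zip(x, y)]
    # Solving Sigma * z = diff gives z = Sigma^{-1} (x - y).
    z = solve(covariance, diff)
    return math.sqrt(sum(d * v for d, v in zip(diff, z)))
```

With the identity matrix as covariance the result reduces to the Euclidean distance; a variance of 4 along one axis halves distances measured along that axis.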

Code: K-means clustering using Euclidean distance

Implementation of the K-means algorithm and experiments with it: clustering, vector quantization, (supervised) classification, and speaker identification using k-means vector quantization.
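
For reference, the core of the algorithm fits in a short function. This is a generic sketch of Lloyd's algorithm with caller-supplied initial centers, not the matlab implementation; the function name and the stopping rule are assumptions.

```python
def kmeans(points, initial_centers, iterations=100):
    """Lloyd's algorithm with squared Euclidean distance."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda k: sum((p - c) ** 2
                                  for p, c in zip(point, centers[k])))
            for point in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers will too
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, labels
```

For vector quantization the returned centers serve as the codebook, and each vector is encoded by the index of its nearest center.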

kmeans_classifiTest.png

Code: K-nearest neighbor classification using Euclidean distance

This matlab function performs K-nearest neighbor classification. It uses the Euclidean distance for simplicity.
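
The idea is small enough to sketch in full. This is a generic Python illustration, not the matlab function; the function name and the default k = 3 are assumptions.

```python
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Majority vote among the k training points nearest to the query."""
    def squared_distance(point):
        # Squared Euclidean distance gives the same ordering as the distance.
        return sum((a - b) ** 2 for a, b in zip(point, query))

    by_distance = sorted(zip(train_points, train_labels),
                         key=lambda pair: squared_distance(pair[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

An odd k avoids tied votes in two-class problems.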

knnTest.png

Code: Spatial Filters

Basic image processing tools: spatial filters such as the average filter, Sobel filter, high-boost filter, and median filter.
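
As an example of one of these filters, here is a sketch of a 3x3 median filter on an image stored as nested lists. It is an illustration only, not the original c++ or matlab code; the function name and the choice to leave border pixels unchanged are assumptions.

```python
def median_filter3(image):
    """3x3 median filter; border pixels are copied unchanged."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = sorted(image[r + dr][c + dc]
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            out[r][c] = window[4]  # middle of the 9 sorted values
    return out
```

Unlike the average filter, which would smear an isolated bright pixel over its neighbors, the median simply discards it, which is why it suits salt-and-pepper noise.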

Lena.png

IV1LenaRoberts45_96.png

First Edition: 2003/05(c++). Last Modified: 2007/02(matlab).
Tag: Scientific ImageProcessing EdgeDetection Smoothing Sharpening

Code: Gray Level Transformation

Basic image processing tools: gray level transformations such as histogram equalization and histogram stretching.
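
Histogram equalization, for instance, can be sketched as a lookup table built from the cumulative histogram. This is an illustration under assumptions (integer gray levels in 0 .. levels-1, image as nested lists, hypothetical function name), not the original implementation.

```python
def equalize_histogram(image, levels=256):
    """Map gray levels through the normalized cumulative histogram."""
    histogram = [0] * levels
    for row in image:
        for value in row:
            histogram[value] += 1
    total = sum(histogram)
    # Cumulative count of pixels at or below each gray level.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)
    # s = round((L - 1) * cdf(r) / N), the usual discrete mapping.
    mapping = [round((levels - 1) * c / total) for c in cdf]
    return [[mapping[value] for value in row] for row in image]
```

A dark image whose values are crowded into a few low levels is spread over the full range, because each output level is proportional to the fraction of pixels at or below the input level.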

LenaDark.png

IIILena1Histeq.png

First Edition(c++): 2003/05. Last Modified: 2007/02.
Tag: Scientific ImageProcessing Histogram Equalization

Code: Newton Interpolation

In the mathematical field of numerical analysis, a Newton polynomial, named after its inventor Isaac Newton, is the interpolation polynomial for a given set of data points in the Newton form. The Newton polynomial is sometimes called Newton's divided differences interpolation polynomial because the coefficients of the polynomial are calculated using divided differences.

Quoted from Wikipedia.
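
The divided-difference computation described in the quote can be sketched as follows. This is a generic illustration, not the project's code; the function names are assumptions.

```python
def divided_differences(xs, ys):
    """Return the Newton coefficients f[x0], f[x0,x1], ..., f[x0..xn]."""
    coeffs = list(ys)
    for order in range(1, len(xs)):
        # Update from the end so the lower-order values needed for each
        # difference have not been overwritten yet.
        for i in range(len(xs) - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate the Newton-form polynomial by nested multiplication."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result
```

For the points (0, 1), (1, 2), (2, 5), (3, 10), which lie on x^2 + 1, the coefficients come out as 1, 1, 1, 0: the third-order term vanishes because the data is exactly quadratic.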

run_newtoninter_chebyshev2.png

First Edition: 2007/02. Last Modified: 2007/02.
Tag: Scientific NumericalAnalysis Interpolation Newton

Article: The 2-D Gaussian Filter

The Gaussian filter is a smoothing filter that blurs images to suppress noise. In this sense its effect is similar to that of the average filter, but the Gaussian filter behaves more like an ideal low-pass filter than the average filter does.

In this report, I describe the properties of the Gaussian filter and the practical issues to take care of when implementing one.
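
Two such practical issues, truncating the kernel and normalizing it, can be illustrated with a short sketch. This is not the code from the report; the function names and the 3-sigma truncation radius are assumptions.

```python
import math


def gaussian_kernel_1d(sigma, radius=None):
    """Sampled 1-D Gaussian, truncated at +/- radius, normalized to sum 1."""
    if radius is None:
        # Assumed default: about 99.7% of the mass lies within 3 sigma.
        radius = int(math.ceil(3 * sigma))
    values = [math.exp(-(x * x) / (2.0 * sigma * sigma))
              for x in range(-radius, radius + 1)]
    # Renormalize so that filtering does not change the mean brightness.
    total = sum(values)
    return [v / total for v in values]


def gaussian_kernel_2d(sigma, radius=None):
    """The 2-D Gaussian is separable: the outer product of two 1-D kernels."""
    k = gaussian_kernel_1d(sigma, radius)
    return [[a * b for b in k] for a in k]
```

Separability also means a 2-D Gaussian filter can be applied as two 1-D passes, one along rows and one along columns, which is cheaper than a full 2-D convolution.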

Input           Output
texture.20.png  cvGaussFilter2Demo.png
First Edition: 10/01/2006. Last Modified: 08/14/2008.
Tag: SciSoftware ComputerVision Filter

Article: Most cited authors in Computer Vision

A partial list of researchers (mainly in computer vision) taken from "Most cited authors in Computer Science" (CiteSeer). An entry may correspond to multiple authors.

First Edition: May 2007. Last Modified: May 2007.
Tag: Scientific ComputerVision Article

Project: Image retrieval by similarity - World flag image (Undergrad)

Given a national flag as the query, find the most similar national flag.

This is a course project for Digital Image Processing (Lecturer: Nadia Berthouze), a junior-level course at the University of Aizu, Japan.

Please go to DIP Project.

Footnote: I remember struggling to find the best threshold and weight values by hand. Now I could find them automatically using neural networks or something similar.