- Project: Object Tracking using Particle filtering
- Project: Individual Voice Activity Detection
- Project: A Comparison of Multi-class Support Vector Machine Methods with respect to Face Recognition Problem
- Project: Source-Filter Separation of Speech Signal (Small)
- Project: Pitch Detection
- Project: Short-Time Fourier Transform (Small)
- Project: Image Matching using Scale Invariant Feature Transform (SIFT)
- Project: Shape and motion from image streams: A factorization method
- Project: Texture Segmentation using Gabor Filters and K-means Clustering
- Project: Normalized Cuts and Image Segmentation
- Project: JPEG2000-like wavelet-based codec
- Project: JPEG-like codec
- Project: Eigenfaces and Fisherfaces
- Tutorial: OpenCV haartraining (Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like Features)
- Demo: 2-D Haar Wavelet
- Demo: Supervised classification using Self-organizing Map
- Demo: Singular Value Decomposition
- Demo: Error Backpropagation Neural Networks
- Code: Mahalanobis distance
- Code: K-means clustering using Euclidean distance
- Code: K-Nearest neighbor classification using Euclidean distance
- Code: Spatial Filters
- Code: Gray Level Transformation
- Code: Newton Interpolation
- Article: The 2-D Gaussian Filter
- Article: Most cited authors in Computer Vision
- Project: Image retrieval by similarity - World flag image (Undergrad)
A particle filter implements a Bayesian inference filter using Monte Carlo simulation. It is well known that the Kalman filter provides an analytically optimal Bayesian solution for the linear/Gaussian case. The particle filter is more general and can also model non-linear/non-Gaussian cases. It gained popularity for object tracking in the computer vision community after it was introduced there as the Condensation algorithm.
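To make the predict / weight / resample cycle concrete, here is a minimal sketch of a bootstrap particle filter for a one-dimensional state. It is not the project's tracker: the random-walk motion model, the Gaussian measurement likelihood, and all names and parameter values (`particle_filter_step`, `track`, `motion_std`, `meas_std`) are assumptions chosen for illustration.

```python
import math
import random


def particle_filter_step(particles, measurement, motion_std, meas_std, rng):
    """One predict / weight / resample cycle for a 1-D state."""
    # Predict: move every particle with an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: assumed Gaussian likelihood of the measurement given each particle.
    weights = [math.exp(-0.5 * ((measurement - p) / meas_std) ** 2) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the particles (posterior mean approximation).
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


def track(measurements, n_particles=500, motion_std=0.5, meas_std=1.0, seed=0):
    """Run the filter over a measurement sequence and return the estimates."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n_particles)]
    estimates = []
    for z in measurements:
        particles, estimate = particle_filter_step(particles, z, motion_std, meas_std, rng)
        estimates.append(estimate)
    return estimates


# A stationary object observed at position 5.0 for 20 frames.
estimates = track([5.0] * 20)
```

In an image tracker the state would typically hold position (and perhaps scale or velocity), and the likelihood would come from an appearance model rather than a scalar measurement.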
Individual voice activity detection (IVAD) detects the speech regions of a person of interest in audio. I use a VAD method named PARADE and a GMM-based speaker identification method to construct an IVAD system.
The report is available at report.pdf
Tag: Scientific SoundProcessing VAD SpeakerIdentification GMM
Project: A Comparison of Multi-class Support Vector Machine Methods with respect to Face Recognition Problem
Support vector machines (SVMs) were originally designed for binary classification. How to extend them effectively to multi-class classification is still an ongoing research issue. In this project, several multi-class SVM methods (one-against-all, one-against-one, DAGSVM, Weston's multi-class SVM, and Crammer's multi-class SVM) were compared on a face recognition problem.
The report is available at report.pdf
Tag: Scientific ComputerVision PatternRecognition SVM Matlab LIBSVM
Deconvolution of a speech signal into source (vocal cords, white noise) and filter (oral cavity, coloring, envelope) components.
Tag: Scientific SoundProcessing Deconvolution Matlab
Pitch determination is important for many speech processing algorithms. In this project, pitch detection via the autocorrelation method, the cepstrum method, the harmonic product spectrum (HPS), and linear predictive coding (LPC) is examined.
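As an illustration of the first of these methods, the sketch below estimates pitch from the lag at which the autocorrelation peaks. It is a simplified stand-in, not the project's Matlab code; the function name, the 50 to 500 Hz search range, and the test signal are assumptions.

```python
import math


def autocorrelation_pitch(samples, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate pitch (Hz) as sample_rate / lag of the autocorrelation peak."""
    # Restrict the search to lags that correspond to plausible pitch values.
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(samples) - 1)

    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        value = sum(samples[n] * samples[n + lag] for n in range(len(samples) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag


# A 200 Hz sine sampled at 8 kHz has a period of exactly 40 samples.
sample_rate = 8000
signal = [math.sin(2.0 * math.pi * 200.0 * n / sample_rate) for n in range(400)]
pitch = autocorrelation_pitch(signal, sample_rate)
```

Real speech would normally be processed frame by frame, with a voiced/unvoiced decision before the pitch estimate is trusted.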
Tag: Scientific SoundProcessing Pitch Formant Matlab
The Short-Time Fourier Transform (STFT) is a well-studied filter bank. It can be viewed in several ways: as a Fourier transform taken over short time windows, as a low-pass filter applied to a modulated signal, or as a filter bank.
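The first of these views (a Fourier transform over short windows) can be sketched directly. The code below uses a naive DFT and a periodic Hann window for readability; the window choice, the function name `stft`, and the frame parameters are assumptions, and a practical implementation would use an FFT.

```python
import cmath
import math


def stft(signal, frame_len, hop):
    """Return one magnitude spectrum (bins 0 .. frame_len // 2) per frame."""
    # Periodic Hann window (an assumed choice of analysis window).
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed segment at frequency bin k.
            acc = sum(
                segment[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# A sinusoid that falls exactly on bin 4 of a 16-point frame.
signal = [math.sin(2.0 * math.pi * 4 * n / 16) for n in range(64)]
spectrogram = stft(signal, frame_len=16, hop=8)
peak_bin = max(range(len(spectrogram[0])), key=lambda k: spectrogram[0][k])
```

Reading one fixed bin `k` across all frames gives the filter-bank view: each bin is the output of one band-pass channel sampled every `hop` samples.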
Tag: Scientific SoundProcessing SignalProcessing STFT Matlab FFT
Image matching is a fundamental aspect of many problems in computer vision, including object or scene recognition, solving for 3D structure from multiple images, stereo correspondence, and motion tracking. The scale-invariant feature transform (SIFT), first published by David Lowe in 1999 and described in full in his 2004 paper, is an algorithm for extracting distinctive features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and partially invariant (i.e. robust) to changes in 3D viewpoint, addition of noise, and changes in illumination. They are well localized in both the spatial and frequency domains, reducing the probability of disruption by occlusion, clutter, or noise. Large numbers of features can be extracted from typical images with efficient algorithms. In addition, the features are highly distinctive, which allows a single feature to be correctly matched with high probability against a large database of features, providing a basis for object and scene recognition.
Tag: Scientific ComputerVision InterestPointDetection ImageMatching Matlab
Structure from motion, the recovery of scene geometry and camera motion from a sequence of images, is an important problem with wide applicability in tasks such as navigation and robot manipulation. Tomasi and Kanade first developed a factorization method to recover shape and motion under an orthographic projection model, and obtained robust and accurate results. Poelman and Kanade extended the factorization method to paraperspective projection. Triggs further extended it to fully perspective projection; his method recovers a consistent set of projective depths (projective scale factors) for the image points.
In this project, we implemented these three factorization methods and compared them.
The report is available at Factorization.pdf
This is a report on a course project to implement a texture segmentation system using filtering methods. I basically followed "Unsupervised Texture Segmentation using Gabor Filters" by A. K. Jain and F. Farrokhnia.
The completed report is available at GaborTextureSegment.pdf
An image segmentation technique based on graph theory: the normalized graph cut.
The report is available at NcutImageSegment.pdf
Wavelet-based image coding has advanced significantly since the DCT-based codec became the first JPEG standard. In this project, we explore wavelet-based image coding and examine the major factors that affect coding performance. The Embedded Zero-Tree Wavelet (EZW) codec for image compression is implemented. In addition, block-based EZW coding is implemented, and different block sizes are compared.
- wavelet transform
- EZW coding
- Huffman coding
- inverse wavelet transform
- EZW decoding
- Huffman decoding
Tag: Scientific ComputerVision Compression Matlab
- RGB to YCbCr
- downsample Cb, Cr
- 2-D Block DCT
- Quantization (Quantization Table)
- zigzag scan
- entropy coding
- entropy decoding
- Inverse zigzag
- Inverse Quantization
- Inverse 2-D Block DCT
- upsample Cb, Cr
- YCbCr to RGB
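Two of the stages listed above, quantization and the zigzag scan (with its inverse), can be sketched as follows. This is an illustrative sketch only: it uses a 3x3 block to keep the example short (JPEG itself works on 8x8 blocks), and the coefficient values and quantization table are made up.

```python
def zigzag_indices(n):
    """Return the (row, col) pairs of an n x n block in zigzag order."""
    # Walk the anti-diagonals; odd diagonals run top to bottom,
    # even diagonals run bottom to top.
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )


def zigzag_scan(block):
    """Flatten a square block into a 1-D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


def inverse_zigzag(values, n):
    """Rebuild the n x n block from its zigzag-ordered values."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(values, zigzag_indices(n)):
        block[r][c] = value
    return block


def quantize(block, table):
    """Divide each coefficient by its table entry and round to an integer."""
    return [[round(v / q) for v, q in zip(row, q_row)] for row, q_row in zip(block, table)]


def dequantize(block, table):
    """Multiply each quantized coefficient by its table entry."""
    return [[v * q for v, q in zip(row, q_row)] for row, q_row in zip(block, table)]


# Hypothetical DCT coefficients and quantization table (3x3 for brevity).
coefficients = [[52, -9, 3], [8, 5, 0], [-3, 1, 0]]
table = [[4, 4, 8], [4, 8, 8], [8, 8, 16]]

quantized = quantize(coefficients, table)
scanned = zigzag_scan(quantized)
restored = inverse_zigzag(scanned, 3)
```

The zigzag order groups the low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the scan, which helps the entropy coder.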
Tag: Scientific ComputerVision Compression Matlab
This project is a study of two traditional face recognition methods, the Eigenface and the Fisherface. The Eigenface is considered the first successful face recognition technique. The Eigenface method uses Principal Component Analysis (PCA) to linearly project the image space onto a low-dimensional feature space. The Fisherface method enhances the Eigenface method by using Fisher’s Linear Discriminant Analysis (FLDA or LDA) for the dimensionality reduction. LDA maximizes the ratio of between-class scatter to within-class scatter and therefore works better than PCA for discrimination. The Fisherface is especially useful when facial images have large variations in illumination and facial expression. In this project, the Eigenface and Fisherface methods are compared on facial images with large illumination variations.
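The scatter-ratio idea can be illustrated in its simplest setting: two classes of 2-D points, where the direction that maximizes between-class over within-class scatter is proportional to the inverse within-class scatter matrix times the difference of the class means. The sketch below is only this two-class, two-dimensional special case, not the Fisherface pipeline (which applies multi-class LDA to face images after a PCA step); the data and function names are made up.

```python
def mean_2d(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(2)]


def scatter_2d(points, center):
    """2x2 scatter matrix: sum of outer products of deviations from center."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - center[0], p[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant direction w = Sw^-1 (mean_a - mean_b)."""
    ma, mb = mean_2d(class_a), mean_2d(class_b)
    sa, sb = scatter_2d(class_a, ma), scatter_2d(class_b, mb)
    # Within-class scatter is the sum of the per-class scatter matrices.
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Apply the explicit inverse of the 2x2 matrix sw to diff.
    return [
        (sw[1][1] * diff[0] - sw[0][1] * diff[1]) / det,
        (-sw[1][0] * diff[0] + sw[0][0] * diff[1]) / det,
    ]


# Both classes vary widely along x but are separated only along y.
class_a = [(-2.0, 0.0), (2.0, 0.0), (0.0, 0.5), (0.0, -0.5)]
class_b = [(-2.0, 3.0), (2.0, 3.0), (0.0, 3.5), (0.0, 2.5)]
w = fisher_direction(class_a, class_b)
```

Here PCA would pick the x axis (the direction of largest variance), while the Fisher direction points along y, the axis that actually separates the classes.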
The report is available at EigenFisherFace.pdf
Tutorial: OpenCV haartraining (Rapid Object Detection With A Cascade of Boosted Classifiers Based on Haar-like Features)
The OpenCV library provides a very interesting face detection demonstration. It also provides the programs (or functions), called HaarTraining, that were used to train the classifiers for that face detection system, so we can create our own object classifiers with them.
However, I could not reproduce exactly how the OpenCV developers performed haartraining for their face detection system, because they did not publish details such as the images and parameters they used for training. The objective of this report is to provide step-by-step procedures for those who follow.
My working environment is Visual Studio + cygwin on Windows XP, or Linux. Cygwin is required because I use several UNIX commands. If you are an engineer or scientist, I am sure you will use cygwin (or rather, UNIX commands) not only for this haartraining but also for other work in the future.
FYI: I recommend working on something else while haartraining runs, because training takes many days (possibly a week). My typical cycle was: 1. run haartraining on Friday, 2. forget about it completely, 3. check the results the next Friday, 4. run another haartraining (loop).
A picture from the OpenCV website
- 10/16/2008 - Additional experimental results.
- 08/28/2008 - Revised entirely.
- 06/05/2007 - opencv-1.0.0
- 03/12/2006 - First Edition (opencv-0.9.7)
This is a note on using wavedec2 from the Wavelet Toolbox rather than a project.
This is an experiment or a note rather than a SOM software package. SOM itself is an unsupervised clustering technique, but here we use it for supervised classification.
This is a note on using the svd function rather than a project.
This is an experiment rather than an error backpropagation NN software package. We use the Neural Network Toolbox.
In ebpnn.m, the output layer has only one neuron, and we exploit the fact that an error backpropagation NN outputs real values: an input is assigned to a class if the output value is almost equal to that class's integer label.
In ebpnn2.m, the number of output-layer neurons equals the number of classes, as in perceptron neural networks, and an input is assigned to a class if the class's associated neuron fires. Because the output values are real, we define 'the fired neuron' as the neuron whose output is nearest to 1.
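The two ways of turning network outputs into a class decision can be sketched as below. These helpers are hypothetical and are not taken from ebpnn.m or ebpnn2.m; in particular, picking the nearest integer label is an assumed reading of "almost equal".

```python
def decode_single_output(output, class_labels):
    """Single output neuron: pick the integer label nearest to the output."""
    return min(class_labels, key=lambda label: abs(output - label))


def decode_one_per_class(outputs, class_labels):
    """One neuron per class: the 'fired' neuron is the one nearest to 1."""
    fired = min(range(len(outputs)), key=lambda i: abs(outputs[i] - 1.0))
    return class_labels[fired]


labels = [1, 2, 3]
single = decode_single_output(2.2, labels)                  # one real-valued output
per_class = decode_one_per_class([0.1, 0.2, 0.9], labels)   # one output per class
```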
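This MATLAB function calculates the Mahalanobis distance between every pair of vectors drawn from two data sets. Mahalanobis distance is a distance measure based on correlations between variables. For two vectors $x$ and $y$ it is defined as
$$d(x, y) = \sqrt{(x - y)^{T} \Sigma^{-1} (x - y)}$$
where $\Sigma$ is the covariance matrix.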
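A minimal sketch of the pairwise computation, restricted to 2-D vectors so the matrix inverse can be written out by hand. It evaluates sqrt((x - y)^T C^-1 (x - y)) for a given covariance matrix C; the function names and sample data are assumptions, not the original MATLAB interface.

```python
import math


def mahalanobis_2d(x, y, cov):
    """Mahalanobis distance between 2-D vectors x and y for covariance cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = x[0] - y[0], x[1] - y[1]
    # (x - y)^T * inverse(cov) * (x - y), with the 2x2 inverse expanded.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(quad)


def pairwise_mahalanobis(set_a, set_b, cov):
    """Distance from every vector in set_a to every vector in set_b."""
    return [[mahalanobis_2d(x, y, cov) for y in set_b] for x in set_a]


# Variance 4 along the first axis, 1 along the second, no correlation.
cov = [[4.0, 0.0], [0.0, 1.0]]
distances = pairwise_mahalanobis([(2.0, 0.0), (0.0, 2.0)], [(0.0, 0.0)], cov)
```

The same Euclidean offset of 2 counts as distance 1 along the high-variance axis and distance 2 along the low-variance axis; with an identity covariance the measure reduces to Euclidean distance.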
An implementation of the K-means algorithm, with experiments on clustering, vector quantization, supervised classification, and speaker identification using k-means vector quantization.
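For reference, the core of the algorithm (Lloyd's iteration with Euclidean distance) fits in a few lines. This is a generic sketch rather than the code from this page; the initial centers are passed in explicitly to keep the example deterministic, and all names and data are made up.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iterations=100):
    """Alternate nearest-center assignment and centroid update until stable."""
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
centers, labels = kmeans(points, initial_centers=[(0, 0), (10, 0)])
```

For vector quantization, the returned centers act as the codebook and each label is the codeword index of the corresponding vector.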
This MATLAB function performs K-Nearest neighbor classification. It uses Euclidean distance for simplicity.
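The classification rule itself is short: find the k training points nearest to the query and take a majority vote over their labels. The sketch below is a generic illustration, not a port of the MATLAB function; the names and training data are invented.

```python
from collections import Counter


def squared_distance(p, q):
    """Squared Euclidean distance (ranking by it matches Euclidean distance)."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k nearest training points."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: squared_distance(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


train_points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
train_labels = ["a", "a", "a", "b", "b", "b"]
near_origin = knn_classify(train_points, train_labels, (0.5, 0.5))
near_corner = knn_classify(train_points, train_labels, (5.5, 5.5))
```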
Basic image processing tools: spatial filters such as the average filter, Sobel filter, high-boost filter, and median filter.
Tag: Scientific ImageProcessing EdgeDetection Smoothing Sharpening
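As a small illustration of two of these filters, the sketch below applies a 3x3 average filter and a 3x3 median filter to a grayscale image stored as a list of rows. Border pixels are simply left unchanged, which is one of several possible border policies; the helper names and the test image are assumptions.

```python
import statistics


def filter_3x3(image, reducer):
    """Apply reducer to every 3x3 neighborhood; border pixels are copied."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            neighborhood = [
                image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            ]
            out[r][c] = reducer(neighborhood)
    return out


def average_filter(image):
    """3x3 mean (box) smoothing."""
    return filter_3x3(image, lambda values: sum(values) / len(values))


def median_filter(image):
    """3x3 median smoothing, which is robust to impulse noise."""
    return filter_3x3(image, statistics.median)


# A flat image with one bright outlier ("salt" noise) in the center.
image = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
averaged = average_filter(image)
medianed = median_filter(image)
```

The average filter only spreads the outlier (the center becomes 20), while the median filter removes it entirely (the center becomes 10).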
Basic image processing tools: gray level transformations such as histogram equalization and histogram stretching.
Tag: Scientific ImageProcessing Histogram Equalization
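Histogram equalization maps each gray level through the normalized cumulative histogram so that the output levels spread over the full range. The sketch below uses one common form of that mapping on a flat list of integer pixel values; the function name and sample pixels are made up.

```python
def equalize_histogram(pixels, levels=256):
    """Map gray levels through the normalized cumulative histogram."""
    # Histogram of gray levels.
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution (running sum of the histogram).
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(value for value in cdf if value > 0)
    total = len(pixels)
    if total == cdf_min:
        return list(pixels)  # only one gray level present; nothing to spread

    # Rescale so the darkest level maps to 0 and the brightest to levels - 1.
    return [
        round((cdf[p] - cdf_min) * (levels - 1) / (total - cdf_min)) for p in pixels
    ]


# Four gray levels squeezed into the range 50..200.
equalized = equalize_histogram([50, 100, 150, 200])
```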
In the mathematical field of numerical analysis, a Newton polynomial, named after its inventor Isaac Newton, is the interpolation polynomial for a given set of data points in the Newton form. The Newton polynomial is sometimes called Newton's divided differences interpolation polynomial because the coefficients of the polynomial are calculated using divided differences.
Quoted from Wikipedia.
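A short sketch of the divided-difference computation: the coefficients of the Newton form are built in place, and the polynomial is then evaluated with a nested (Horner-like) scheme. Function names and sample points are chosen for illustration only.

```python
def divided_differences(xs, ys):
    """Return the Newton-form coefficients f[x0], f[x0,x1], ..., f[x0..xn]."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Update from the end so lower-order differences are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


def newton_evaluate(xs, coeffs, x):
    """Evaluate c0 + c1 (x - x0) + c2 (x - x0)(x - x1) + ... by nesting."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result


# Three samples of f(x) = x^2; the interpolant reproduces the parabola.
xs, ys = [0, 1, 2], [0, 1, 4]
coeffs = divided_differences(xs, ys)
value_at_3 = newton_evaluate(xs, coeffs, 3)
```

One practical advantage of the Newton form is that adding a new data point only appends one coefficient; the earlier ones stay unchanged.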
Tag: Scientific NumericalAnalysis Interpolation Newton
The Gaussian filter is a smoothing filter used to blur images and suppress noise. In this sense its effect is similar to that of the average filter; however, the Gaussian filter is closer to an ideal low-pass filter than the average filter.
In this report, I describe the properties of the Gaussian filter and the practical issues we have to take care of when implementing one.
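Two implementation issues that commonly come up are where to truncate the kernel and how to normalize it. The sketch below builds a 2-D Gaussian kernel truncated at a radius of about three standard deviations and normalized to sum to 1; both choices are common conventions assumed here, not necessarily the ones discussed in the report.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return a (2 * radius + 1) square Gaussian kernel that sums to 1."""
    if radius is None:
        # Assumed convention: truncate at about three standard deviations.
        radius = max(1, math.ceil(3 * sigma))
    raw = [
        [
            math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            for x in range(-radius, radius + 1)
        ]
        for y in range(-radius, radius + 1)
    ]
    # Normalize so that filtering preserves the mean image brightness.
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


kernel = gaussian_kernel(1.0)
kernel_sum = sum(sum(row) for row in kernel)
```

Because the 2-D Gaussian is separable, the same smoothing can also be done with a 1-D kernel applied along rows and then along columns, which is cheaper for large kernels.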
A partial list of researchers (mainly in Computer Vision) taken from Most cited authors in Computer Science (CiteSeer). An entry may correspond to multiple authors.
Given a national flag as a query, the system finds the most similar national flag.
This is a course project for Digital Image Processing (Lecturer: Nadia Berthouze), a junior-level course at the University of Aizu, Japan.
Please go to DIP Project.
Footnote: I remember struggling to find the best threshold and weight values. Now I could find them automatically using neural networks or something similar.