English abstract
Biometrics refers to the automatic identification of a person based on his or her physiological or behavioural characteristics. By replacing passwords, biometric techniques can potentially prevent unauthorized access to, or use of, ATMs, cellular phones, desktop PCs and computer networks. Various types of biometric systems are used for real-time identification, among them fingerprint matching, face recognition and voice recognition.
Speaker recognition, which is based on voice recognition, is an active research area that has matured considerably over the years. A speaker recognition system typically consists of two main parts: feature-set extraction and pattern matching. Several feature sets are known for text-independent speaker identification systems, most of which depend on spectral information. One of the most successful of these is the Mel-Frequency Cepstrum Coefficients (MFCC). This thesis introduces a new feature set, the Histogram of DCT-Cepstrum Coefficients, which is inspired by the MFCC but is simpler and faster to compute. A text-independent speaker identification system based on the DCT-Cepstrum Histogram and a Gaussian Mixture Model (GMM) is implemented. The new feature set was tested using speech files from the ELSDSR database and the TIMIT corpus. It achieved an identification accuracy of 100% on 23 speakers from the ELSDSR database and 99% on 630 speakers from the TIMIT corpus.
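As a rough illustration of the kind of feature described above, the following Python sketch computes a cepstrum-like vector per speech frame using the DCT and then summarises each coefficient by a histogram over all frames. It is only one plausible reading of "Histogram of DCT-Cepstrum Coefficients", not the procedure defined in the thesis: the DCT, log-magnitude, DCT chain, the function names, the number of coefficients, the number of bins and the bin range are all assumptions made for illustration.

```python
import math


def dct2(x):
    """Unnormalised DCT-II of a real sequence (direct O(n^2) form)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def dct_cepstrum(frame, n_coeffs, eps=1e-10):
    """Assumed DCT cepstrum: DCT, then log magnitude, then DCT again.

    Only the first n_coeffs coefficients are kept. eps avoids log(0).
    """
    spectrum = dct2(frame)
    log_mag = [math.log(abs(c) + eps) for c in spectrum]
    return dct2(log_mag)[:n_coeffs]


def coefficient_histograms(frames, n_coeffs=4, n_bins=8, lo=-50.0, hi=50.0):
    """Concatenate one normalised histogram per cepstral coefficient index.

    The bin range [lo, hi) is a hypothetical choice; values outside it are
    clamped into the first or last bin.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    counts = [[0] * n_bins for _ in range(n_coeffs)]
    width = (hi - lo) / n_bins
    for frame in frames:
        for i, c in enumerate(dct_cepstrum(frame, n_coeffs)):
            b = int((c - lo) / width)
            b = min(max(b, 0), n_bins - 1)
            counts[i][b] += 1
    total = len(frames)
    return [cnt / total for row in counts for cnt in row]


# Toy input: ten 16-sample frames of a synthetic sinusoid, not real speech.
frames = [[math.sin(0.3 * (t + 5 * f)) for t in range(16)] for f in range(10)]
features = coefficient_histograms(frames)
```

The resulting fixed-length vector is the sort of input that could be passed to a pattern-matching stage such as a GMM; that stage is not sketched here.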