Method and apparatus for recognizing unknown spoken words by feature extraction and comparison with reference words