A speech pattern recognition system includes sets of radiation modulation members that transmit radiation of two different types, and two radiation responsive members, each receiving one of the radiation types. A shutter is supported adjacent each set on a reed whose resonant frequency is matched to the frequency of a particular segment of a spoken syllable or word. In use, the vibrational amplitude of each shutter controls the amount of each type of radiation received by the radiation responsive members. When the difference between the outputs of the two radiation responsive members is at a maximum, the system produces an output indicating correct enunciation of the word or syllable.
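
The differential detection principle described above can be illustrated with a rough numerical sketch. Everything in it is an assumption made for illustration and is not taken from the abstract: each reed is modeled as a driven damped oscillator with a hypothetical quality factor `q`, each shutter is assumed to pass a fraction of the first radiation type equal to its normalized vibration amplitude and the complementary fraction of the second type, and the segment frequencies are arbitrary example values.

```python
import math

def reed_amplitude(drive_hz, resonant_hz, q=20.0):
    """Relative steady-state amplitude of a driven damped oscillator.

    Scaled so the value is 1.0 when drive_hz equals resonant_hz.
    The quality factor q is an assumed, illustrative parameter.
    """
    r = drive_hz / resonant_hz
    return 1.0 / (q * math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2))

def differential_output(segment_hz, reed_hz, q=20.0):
    """Difference between the two detector outputs (hypothetical model).

    Assumes shutter i passes a fraction a_i of radiation type A and
    (1 - a_i) of type B, where a_i is the clamped reed amplitude.
    """
    out_a = 0.0
    out_b = 0.0
    for f_seg, f_reed in zip(segment_hz, reed_hz):
        a = min(1.0, reed_amplitude(f_seg, f_reed, q))
        out_a += a          # radiation reaching detector A
        out_b += 1.0 - a    # radiation reaching detector B
    return out_a - out_b

# Reeds tuned to three illustrative segment frequencies (assumed values).
REEDS_HZ = [300.0, 1200.0, 2500.0]

matched = differential_output([300.0, 1200.0, 2500.0], REEDS_HZ)
mismatched = differential_output([350.0, 1000.0, 2800.0], REEDS_HZ)
print(matched > mismatched)
```

Under these assumptions, the difference signal is largest when every spoken segment drives its reed at resonance, which mirrors the condition the abstract associates with correct enunciation.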