A system having multiple speech recognition engines, each operable to recognize spoken data, is described. A speech recognition engine manager detects the available speech recognition engines and selects at least one of them to recognize spoken input received from a user via a user interface. In this way, a speech recognition engine that is particularly suited to the current environment may be selected: for example, an engine that is particularly suited to, or preferred by, the user, or an engine that is particularly suited to a particular type of interface, interface element, or application. Several of the speech recognition engines may be selected and maintained in an active state simultaneously, by maintaining a session associated with each of those engines. Accordingly, users' experience with voice applications may be enhanced and, in particular, users with physical disabilities may interact with software applications more easily.
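
The arrangement described above can be illustrated with a minimal sketch. It is not taken from the described system: the names `Engine`, `Session`, `EngineManager`, `suited_for`, `context`, and `preferred` are all hypothetical, the selection order (user preference first, then suitability for the interface context, then the first detected engine) is an assumed policy, and `recognize` merely echoes its input in place of real speech recognition.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Engine:
    """Hypothetical stand-in for a speech recognition engine."""
    name: str
    # Assumed capability tags: interface types, elements, or applications.
    suited_for: frozenset = frozenset()

    def recognize(self, spoken_input: str) -> str:
        # Placeholder for real recognition: echo the input, tagged by engine.
        return f"{self.name}:{spoken_input}"


@dataclass
class Session:
    """One session per selected engine keeps that engine active."""
    engine: Engine
    active: bool = True


class EngineManager:
    """Detects engines, selects one per request, and tracks sessions."""

    def __init__(self) -> None:
        self._engines: Dict[str, Engine] = {}
        self._sessions: Dict[str, Session] = {}

    def detect(self, engines: List[Engine]) -> None:
        # "Detection" here is simply registration of the engines supplied.
        for engine in engines:
            self._engines[engine.name] = engine

    def select(self, context: str, preferred: Optional[str] = None) -> Engine:
        # Assumed policy: an explicit user preference wins.
        if preferred in self._engines:
            return self._engines[preferred]
        # Otherwise pick the first engine suited to the interface context.
        for engine in self._engines.values():
            if context in engine.suited_for:
                return engine
        # Fall back to the first detected engine, if there is one.
        if not self._engines:
            raise LookupError("no speech recognition engines detected")
        return next(iter(self._engines.values()))

    def recognize(self, spoken_input: str, context: str,
                  preferred: Optional[str] = None) -> str:
        engine = self.select(context, preferred)
        # Reuse the engine's session if one exists; otherwise open one.
        self._sessions.setdefault(engine.name, Session(engine))
        return engine.recognize(spoken_input)

    def active_engines(self) -> List[str]:
        # Names of all engines currently held active by a session.
        return sorted(n for n, s in self._sessions.items() if s.active)
```

In this sketch, sessions are never closed, so every engine that has been selected at least once stays active alongside the others. A fuller design would presumably add a way to end a session when its interface or application is dismissed.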