Voice Activated Home Control System








Overall Block-Diagram


















By Kyle Joseph

Troy Resetich

Advisor: Dr. Alexander Malinowski

Assisted by Dr. Don Schertz

Bradley University Electrical Engineering Department

Senior Project

November 18, 2003











The scope of this project is to create a voice-activated system that remotely controls electronic appliances in a home. Figure 1 shows a hardware block diagram of the proposed system. The system will use a voice-recognition integrated circuit that can learn up to 60 separate commands. This circuit communicates through a hardware interface with an 8051-based microcontroller. The microcontroller interprets the incoming signal and triggers a hardware device that outputs the corresponding infrared (IR) signal. The microcontroller will learn IR commands through a second IR hardware subsystem connected to it. If time permits, an X-10 configuration will also be added to the system.







[Figure 1 block labels: Speaker; Microphone; Sensory Voice Direct II (voice-recognition chip); LCD (liquid crystal display)]





Fig. 1 Overall Hardware Block Diagram




Please refer to the block diagram (Fig. 1) to identify all subsystems discussed.



Subsystem: EMAC Board

This hardware package contains the LCD and keyboard. The microprocessor is also embedded on this board. We will take a systems approach in discussing this subsystem; that is, we will not concern ourselves with the actual signals passed between the keyboard/LCD and the processor. The LCD and keyboard communicate with the microprocessor via a pre-programmed signal processing function. The first real input to this subsystem is the train of TTL-level IR pulses received from the IR receiver. The second input is from the voice activation chip. These signals are also transmitted to the EMAC board at TTL levels and are passed to the microprocessor by way of serial communication. They alert the microprocessor to a received command word, which causes it to output IR pulses accordingly.








The first output of the EMAC board is TTL-level pulses to the IR transmitter. IR codes are stored in memory located on the EMAC board. When the system gets a cue to transmit the IR codes, TTL pulses representing those codes will be sent to the IR transmitter.

The second output of the EMAC board is TTL-level communication to the Sensory Voice Direct II. These signals will request the Voice Direct II to learn a new word, listen for a previously learned word, or replay known words.


Subsystem: IR Receiver

The IR receiver has two purposes. The first is to receive IR signals sent to it by an IR transmitter (from a remote control). These signals arrive as infrared light, that is, photons and electromagnetic waves. The second is to demodulate the IR signal: the receiver will transform a 38 kHz sine wave into 0 volts and the absence of that sine wave into 5 volts. These TTL-level signals will be the outputs of this subsystem. Since the frequency of the sine wave varies with the manufacturer of the remote control, it will be necessary to include 3 or 4 different receivers to demodulate the different frequencies.
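The demodulation rule described above can be illustrated with a short sketch. It is not the project's firmware; it only models the active-low behaviour stated here, and it assumes the incoming signal has already been reduced to a list of (carrier_present, duration) segments. The names `demodulate`, `TTL_LOW`, and `TTL_HIGH` and the sample durations are hypothetical.

```python
TTL_LOW = 0.0   # volts: output while the 38 kHz carrier is present
TTL_HIGH = 5.0  # volts: output while the carrier is absent


def demodulate(segments):
    """Convert (carrier_present, duration_us) segments to (volts, duration_us)."""
    return [(TTL_LOW if present else TTL_HIGH, duration_us)
            for present, duration_us in segments]


# Hypothetical input: 600 us of carrier, 600 us of silence, 1200 us of carrier.
ttl = demodulate([(True, 600), (False, 600), (True, 1200)])
print(ttl)
```

Each burst of carrier becomes a 0-volt pulse of the same length, and each gap becomes a 5-volt pulse, which is the form in which the EMAC board would see the command.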


Subsystem: IR Transmitter

The IR transmitter will output modulated IR codes in the form of infrared light (photons and electromagnetic waves). The input of this subsystem is received from the EMAC board as 5-volt TTL-level pulses.
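As a rough illustration of what modulation means here, the sketch below converts TTL pulses into bursts of carrier cycles. Several things are assumptions rather than project facts: the 38 kHz carrier is borrowed from the receiver description, the choice that a low TTL level switches the carrier on simply mirrors the receiver's polarity, and the function name `modulate` and the pulse durations are hypothetical.

```python
CARRIER_HZ = 38_000  # assumed carrier frequency, taken from the receiver description


def modulate(ttl_pulses, carrier_on_level=0, carrier_hz=CARRIER_HZ):
    """Turn (level, duration_us) TTL pulses into (carrier_on, whole_cycles) bursts.

    carrier_on_level is the TTL level assumed to switch the carrier on.
    """
    bursts = []
    for level, duration_us in ttl_pulses:
        # Whole carrier periods that fit inside this pulse.
        cycles = (duration_us * carrier_hz) // 1_000_000
        bursts.append((level == carrier_on_level, cycles))
    return bursts


# Hypothetical pulse train: 600 us low, 600 us high, 1200 us low.
bursts = modulate([(0, 600), (1, 600), (0, 1200)])
print(bursts)
```

At 38 kHz one carrier period is about 26.3 microseconds, so a 600-microsecond pulse spans 22 whole cycles.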


Subsystem: Sensory Voice Direct II Voice Activation Chip

The first input of this subsystem is from a microphone; the millivolt signal from the microphone is a voltage waveform of a spoken word. This subsystem will decode the signal and compare it with known words. It can also store the word in its memory to be compared with later words. The second input to this subsystem is from the EMAC board. These inputs command this subsystem to perform certain tasks; refer to the explanation of the EMAC board for a description of these tasks.


The first output of this subsystem is to the speaker. The chip can play back the commands stored in its memory, so this electrical output is an analog signal sent to the speaker. The second output of this subsystem is to the EMAC board. This output is a serial TTL link that will alert the EMAC board when a known word is received.


Subsystem: Microphone

The input of this subsystem is a spoken word in the form of sound waves. The output of this subsystem is an analog voltage of the received word.


Subsystem: Speaker

The input of this subsystem is an analog voltage wave that carries a spoken word. The output of this subsystem is a spoken word in the form of sound waves.










Description of User Interface:


The software flowchart shown in Fig. 2 briefly describes the layout of the user interface. The system opens with a menu that allows the user to recall or learn a command; the user selects an option using the 1 or 2 key on the keypad. If a 1 is received by the microcontroller, the Voice Direct II is contacted to begin recording. If no signal is received during the 2.5-second record period, an error message is displayed and the user is returned to the main menu. If a signal is received, the microcontroller compares it to the voice commands saved in external memory. If the signal does not match any of the prerecorded voice commands, an error message is displayed and the user is asked to rerecord. To notify the user that the command has been identified, the stored voice command is played back through a speaker attached to the Voice Direct II.
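The recall branch described above can be sketched as a small decision function. This is only an outline of the flow in the text, not the planned firmware: voice commands are stood in for by plain strings, and the names `Outcome`, `recall`, `commands`, and `ir_codes` are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    NO_SIGNAL = "no signal: show error, return to main menu"
    NO_MATCH = "no match: show error, ask user to rerecord"
    TRANSMIT = "match: play back command, send IR code"


def recall(heard, stored_commands, stored_ir_codes):
    """Decide what menu option 1 does.

    heard is None when nothing arrives in the 2.5-second record period.
    """
    if heard is None:
        return Outcome.NO_SIGNAL, None
    if heard not in stored_commands:
        return Outcome.NO_MATCH, None
    slot = stored_commands.index(heard)
    return Outcome.TRANSMIT, stored_ir_codes[slot]


# Hypothetical stored data: one learned command and its IR pulse list.
commands = ["tv on"]
ir_codes = [[(0, 600), (1, 600)]]
print(recall("tv on", commands, ir_codes))
```

Keeping the three outcomes explicit makes it easy to attach the matching LCD message to each one.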


The IR signal corresponding to that voice command is then output through the IR hardware. If a 2 is received by the microcontroller from the main menu screen, the Voice Direct II is contacted to begin recording. If no signal is received during the 2.5-second record period, an error message is displayed and the user is returned to the main menu. Otherwise, the LCD asks the user to input the corresponding IR signal. If no IR signal is received in the recording period, an error message is displayed and the user is asked to retry the recording. If a signal is received, it is stored in external memory with a reference to the voice command storage location. A message then appears informing the user that the command has been saved and itemized.
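The learn branch can be sketched in the same spirit. The example assumes that external memory can be modelled as a dictionary keyed by the voice command's storage location; the limit of 60 slots comes from the chip's stated capacity, while the names `learn` and `MAX_COMMANDS` and the status strings are hypothetical.

```python
MAX_COMMANDS = 60  # the voice chip is described as learning up to 60 commands


def learn(memory, voice_slot, ir_pulses):
    """Store an IR pulse list under the voice command's storage location.

    voice_slot is None when no voice command was recorded in time;
    ir_pulses is empty when no IR signal arrived in the recording period.
    """
    if voice_slot is None:
        return "error: no voice command, returning to main menu"
    if not 0 <= voice_slot < MAX_COMMANDS:
        raise ValueError("voice_slot outside the chip's command range")
    if not ir_pulses:
        return "error: no IR signal, please retry"
    memory[voice_slot] = list(ir_pulses)
    return f"saved as item {voice_slot}"


memory = {}
status = learn(memory, 3, [(0, 600), (1, 600)])
print(status)
```

Storing the IR code under the same index as the voice command is what lets a later recall go straight from a recognized word to the pulses to transmit.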


If time permits, a full menu will be added that allows the user to view all voice commands by their itemized storage location. The user will be able to delete and move voice commands and play a voice command over the speaker.


Fig. 2 Software Flowchart









[Figure 3 block labels: user remote; remote IR command (modulated sine-wave pulses); IR sensor; IR demodulator; TTL pulses; learn IR; equate IR command sequence with voice command; learned voice command; voice activation (external voice control chip); voice command; button press command; learn voice command; IR command modulator; IR code transmitter; TTL sine pulses; IR transmitter]







Fig. 3 IR Transmitter/Receiver Informational Chart


The diagram in Fig. 3 shows the transmission and reception of the IR signal. The transmitting device outputs the IR signal as a function of an amplified voltage (0 to 5 volts). The IR receiver outputs 0 or 5 volts depending on whether its input is a sine wave or a ground signal, respectively. This output is at TTL levels and can be interfaced to a microcontroller board. The microcontroller records the low and high time of each pulse (this is done many times, as the average IR command is 10 bytes long, though commands can be as long as 50 bytes). The microcontroller then equates these IR commands with voice commands that have been learned by the voice activation chip. The voice activation chip learns commands based on the status of a user-controlled interface. The processes above cover both the learning of IR and voice commands and normal (IR-transmitting) operation.
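Recording the low and high time of each pulse amounts to run-length encoding the receiver output. The sketch below shows the idea on a list of sampled TTL levels. It assumes a fixed sampling period, whereas the actual board would more likely time edges with a hardware timer; the name `pulse_durations` and the sample values are hypothetical.

```python
def pulse_durations(samples, sample_period_us):
    """Run-length encode sampled TTL levels into (level, duration_us) pairs."""
    pulses = []
    for level in samples:
        if pulses and pulses[-1][0] == level:
            # Same level as the previous sample: extend the current pulse.
            pulses[-1][1] += sample_period_us
        else:
            # Level changed: start a new pulse.
            pulses.append([level, sample_period_us])
    return [tuple(pulse) for pulse in pulses]


# Hypothetical receiver output sampled every 100 us (1 = 5 V, 0 = 0 V).
samples = [1, 1, 0, 0, 0, 1, 0, 0]
recorded = pulse_durations(samples, 100)
print(recorded)
```

The resulting list of level and duration pairs is the kind of record that could be stored against a voice command and replayed to the IR transmitter.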