A Voice and Gesture Based Recognition System for Windows System Commands: A tool for the Specially Abled

Vaidheeswaran Archana -

Chennai, Tamil Nadu

Individuals with special needs often find it difficult to use a computer independently. They often need assistance with activities such as providing computer input and navigating web browsers. This Windows system command tool will use Intel's Smart Sound Technology coupled with the Speech Enabling Developer Kit. It will also use the Intel Distribution of OpenVINO Toolkit along with the Intel RealSense Depth Camera for gesture recognition.

Project status: Under Development

RealSense™, Internet of Things, Artificial Intelligence

Groups
Internet of Things, DeepLearning

Intel Technologies
OpenVINO

Overview / Usage

PCs have grown smarter and more reliable over the last few decades. However, comparatively little work or research has gone into helping specially abled individuals, who constitute more than 15% of the world's population, use them independently. This project aims to bring that independence to the specially abled.

In this project, we use voice and gesture recognition to help them use computers. By using specific gestures or voice commands, they can access common applications and programs easily.

Methodology / Approach

The project has two parts: gesture recognition and voice recognition.
Gesture Recognition: Gesture recognition will use an Intel RealSense Depth Camera D415 to capture hand gestures. A convolutional neural network (CNN) will identify the gestures, with OpenCV libraries used to segment, rotate and augment the images. The Intel Distribution of OpenVINO Toolkit will then be used to run the model with optimised inference. Each recognised gesture will be mapped to a Windows system command. For example, "cleanmgr", the command that launches Disk Cleanup, can be mapped to a swipe gesture. The depth data from the RealSense camera should also help capture the swipe more precisely.
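The mapping from a recognised gesture to a Windows command could look roughly like the sketch below. It is only an illustration under stated assumptions: the label names, the "open_palm" to "notepad" pairing, the confidence threshold and the function names are all hypothetical, and only the swipe to "cleanmgr" pairing comes from the description above. The classifier output is assumed to be a label plus a confidence score.

```python
import subprocess
import sys

# Hypothetical gesture labels mapped to Windows commands.
# Only "swipe" -> "cleanmgr" is taken from the project description.
GESTURE_COMMANDS = {
    "swipe": ["cleanmgr"],
    "open_palm": ["notepad"],  # assumed pairing, for illustration only
}


def command_for(label, confidence, threshold=0.8):
    """Return the command for a gesture, or None if it is unknown or uncertain."""
    if confidence < threshold:
        return None
    return GESTURE_COMMANDS.get(label)


def dispatch(label, confidence, dry_run=True):
    """Look up the command and, outside dry-run mode on Windows, launch it."""
    command = command_for(label, confidence)
    if command is None:
        return None
    if not dry_run and sys.platform == "win32":
        subprocess.Popen(command)  # start the program without blocking
    return command


if __name__ == "__main__":
    print(dispatch("swipe", 0.95))  # dry run: prints the command it would launch
```

Ignoring low-confidence predictions is one possible way to avoid launching a program on an accidental hand movement; the threshold would need tuning against real camera data.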

Voice Recognition: A voice-based virtual assistant will identify which command the person wants to execute. For instance, saying "increase volume" will increase the volume of the PC. This will also be implemented using OpenVINO and a convolutional neural network: the audio will be converted into mel spectrograms, and a CNN will then classify each one as a specific command.
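To make the mel spectrogram step concrete, here is a minimal sketch of the conversion using only the Python standard library. It is not the project's implementation: the frame size, hop length, number of mel bands and all function names are assumptions, and a naive DFT stands in for the FFT routine that an audio library would normally provide. The output is a list of frames, each holding log mel-band energies, which is the kind of 2D input a CNN classifier could consume.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the mel scale (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Build n_mels triangular filters over the n_fft // 2 + 1 DFT bins."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(top * i / (n_mels + 1)) for i in range(n_mels + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for m in range(1, n_mels + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        bank.append([
            max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
            for f in bin_hz
        ])
    return bank


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(n^2) DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, x in enumerate(frame)]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))) ** 2
        for k in range(n // 2 + 1)
    ]


def mel_spectrogram(samples, sample_rate, n_fft=256, hop=128, n_mels=20):
    """Return a list of frames, each a list of n_mels log mel-band energies."""
    bank = mel_filterbank(n_mels, n_fft, sample_rate)
    frames = []
    for start in range(0, len(samples) - n_fft + 1, hop):
        power = power_spectrum(samples[start:start + n_fft])
        frames.append([
            math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
            for row in bank
        ])
    return frames


if __name__ == "__main__":
    # Synthetic 1 kHz tone sampled at 8 kHz, standing in for recorded speech.
    tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(1024)]
    spec = mel_spectrogram(tone, 8000)
    print(len(spec), "frames x", len(spec[0]), "mel bands")
```

For real recordings the naive DFT would be too slow, so an FFT from a numerical library would be the usual replacement; the filterbank logic stays the same.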

Technologies Used

OpenVINO
Intel RealSense
WinML
Python3
