Authors: Wen-Chin Li, Kuang-Lin Hsieh, Jeremia Pramudya & Declan Saunders
Abstract The development of Artificial Intelligence (AI) in the field of voice recognition has prompted interest in emotional voice recognition (EVR); EVR is now one of the key challenges in the application of Natural Language Processing (NLP). When conducting an accident investigation, voice data from the cockpit voice recorder (CVR), especially the verbal exchanges between pilots and air traffic controllers, usually provide significant evidence of the pilots' mental state and can support investigators' hypotheses about an occurrence. In the past, emotion analysis relied mainly on image and text analysis technology. With the development of AI and large language models (LLMs) such as ChatGPT, Llama, and Perplexity, EVR has become feasible. This research aims to explore the potential of using a Recurrent Neural Network (RNN) and the open-source Toronto Emotional Speech Set (TESS) to identify pilots' speech and emotions in emergencies. Further research may combine voice with physiological data and facial expressions to serve the purposes of operational and safety monitoring.
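The approach the abstract names, an RNN classifier over speech features from TESS, can be sketched minimally. The details below are assumptions, not the authors' implementation: the seven emotion labels come from the TESS dataset description, per-frame features stand in for something like MFCCs, and the network is an untrained Elman RNN with random weights, written in plain NumPy so the sketch stays self-contained.

```python
import numpy as np

# The seven emotion categories in TESS (assumption based on the
# public dataset description, not taken from this paper).
EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "pleasant surprise", "sadness", "neutral"]

def rnn_forward(x, Wxh, Whh, Why, bh, by):
    """Run a simple Elman RNN over a (T, D) feature sequence and
    return emotion class scores from the final hidden state."""
    h = np.zeros(Whh.shape[0])
    for t in range(x.shape[0]):
        # Recurrent update: new hidden state from current frame + previous state.
        h = np.tanh(Wxh @ x[t] + Whh @ h + bh)
    return Why @ h + by  # one score per emotion class

rng = np.random.default_rng(0)
T, D, H = 50, 13, 32            # frames, feature dims (MFCC-like), hidden size
x = rng.normal(size=(T, D))     # stand-in for per-frame speech features
Wxh = rng.normal(scale=0.1, size=(H, D))
Whh = rng.normal(scale=0.1, size=(H, H))
Why = rng.normal(scale=0.1, size=(len(EMOTIONS), H))
bh, by = np.zeros(H), np.zeros(len(EMOTIONS))

scores = rnn_forward(x, Wxh, Whh, Why, bh, by)
predicted = EMOTIONS[int(np.argmax(scores))]
print(predicted)
```

In practice the weights would be trained on labeled TESS utterances (e.g. with cross-entropy loss), and a gated variant such as an LSTM or GRU is the usual choice for sequences of this length.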