Higher Degree by Research Application Portal
Title | Spiking Neural Networks for Ultra-Low-Power Speech and Audio Event Recognition |
---|---|
Supervisor | Prof Andreas Wicenec |
Dr Dylan Muir | |
A/Prof Richard Dodson | |
Course | Doctor of Philosophy |
Keywords | Neuromorphic computing |
Audio processing | |
Speech recognition | |
Machine learning optimisation | |
Research area | Physical Sciences |
Project description | This is an opportunity to undertake a PhD in a rapidly growing area of machine learning, using ML models that more closely mimic the operation of a real brain by reproducing the spike-triggering nature of real neurons (Spiking Neural Networks, SNNs). The PhD candidate will collaborate closely with a local start-up that is developing finely tuned, highly efficient implementations of these networks. Keyword spotting (KWS) and audio event spotting are increasingly important audio inference tasks in battery-powered, IoT and “extreme edge” devices. These are always-on tasks, so energy-efficient processing is crucial. This project will explore SNN architectures and training approaches for improving the energy efficiency, resource efficiency, performance and capability of audio inference SNNs. KWS and audio event SNNs for these use cases are currently trained using a “template-matching” approach, in which the entirety of a keyword or audio event is identified in an audio input sequence. Research is needed to determine the capacity of the networks under this approach, and to identify ways to increase capacity and improve resource efficiency. Large-vocabulary automatic speech recognition (LVASR): The current approach to LVASR trains ANNs to recognise sequences of phonemes, tri-phones, or letters from continuous audio. The output of a network is then a sequence of tokens, which are post-processed to identify word sequences. A similar approach could be taken for energy-efficient LVASR, with SNNs outputting low-level tokens. A post-processing SNN could also be trained to output probable word sequences. |
School | Graduate Research School |
Contact |
Please contact Andreas Wicenec in the first instance. |
Specific project requirement | The ideal candidate would have a Bachelor Honours or Master's degree in Mathematics, Computer Science, Data Science / ML, or Electronic Engineering, with experience working with neural network models. Experience with audio processing and/or machine learning optimisation would be desirable. |
Additional information |
Skills:
• Python coding in ML pipelines
• Machine learning optimisation methods
• ANN/DNN/CNN experience
Training and Development: The candidate will be responsible for designing, implementing and evaluating new spiking NN architectures for audio and speech processing. The capacity of these architectures would be analysed, and new architectures designed to improve capacity and efficiency. Deployment of networks to energy-efficient SNN processors would result in new state-of-the-art benchmark publications. There is a possibility of collaboration with SynSense, a neuromorphic processor hardware company. |
Description | The Doctor of Philosophy (PhD) is a program of independent, supervised research that is assessed solely on the basis of a thesis, sometimes including a creative work component, that is examined externally. The work presented for a PhD must be a substantial and original contribution to scholarship, demonstrating mastery of the subject of interest as well as an advance in that field of knowledge. Visit the course webpage for full details of this course including admission requirements, course rules and the relevant CRICOS code/s. |
Duration | 4 years |