Llew Adamson's blog:


Deep learning techniques in artificial intelligence for improved neural-network-based pattern recognition

For more than half a century, computer scientists have attempted to create a computational model of intelligence capable of representing the world and making accurate predictions. However, we are still a long way from this goal, especially in domains such as vocal interpretation, which require pattern recognition, context awareness and the processing of semantic concepts. Deep learning architectures offer a way forward by more closely mimicking the structure of the human brain, although only recently have advances in training algorithms made such approaches feasible.

Neural computation has been described as “embarrassingly parallel” because each neuron can be thought of as an independent system whose behaviour is described by a mathematical model. The real challenge, however, lies in modelling neural communication. While the connectivity of neurons has some parallels with that of electrical systems, its high fan-out results in massive data processing and communication requirements, particularly for real-time computation.

It is shown that memory bandwidth is the most significant constraint on the scale of real-time neural computation, followed by communication bandwidth. This leads to a decision to implement a neural computation system on a network of Field Programmable Gate Arrays (FPGAs), built from commercial off-the-shelf components with some custom supporting infrastructure. This brings implementation challenges, particularly a lack of on-chip memory, but also many advantages, particularly high-speed transceivers. An algorithm to model neural communication that makes efficient use of memory and communication resources is developed and then used to implement a neural computation system on the multi-FPGA platform.
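To give a feel for why memory bandwidth dominates, here is a rough back-of-envelope sketch. It is not taken from the thesis. The network size and fan-out are the figures quoted later in this abstract, read as decimal values, while the mean spike rate (10 Hz) and the storage per synapse (4 bytes) are purely illustrative assumptions, and the function names are hypothetical.

```python
def synaptic_update_rate(neurons, fan_out, mean_rate_hz):
    """Synaptic events per second: every spike is delivered to fan_out targets."""
    return neurons * fan_out * mean_rate_hz


def memory_bandwidth_bytes_per_s(neurons, fan_out, mean_rate_hz, bytes_per_synapse):
    """Bytes of synaptic data fetched per second, assuming one fetch per event."""
    return synaptic_update_rate(neurons, fan_out, mean_rate_hz) * bytes_per_synapse


NEURONS = 256_000        # "256k" neurons, read here as a decimal value (assumption)
FAN_OUT = 1_000          # "1k" fan-out, read here as a decimal value (assumption)
MEAN_RATE_HZ = 10.0      # ASSUMPTION: illustrative mean spike rate, not from the source
BYTES_PER_SYNAPSE = 4    # ASSUMPTION: illustrative storage per synapse, not from the source

events = synaptic_update_rate(NEURONS, FAN_OUT, MEAN_RATE_HZ)
bandwidth = memory_bandwidth_bytes_per_s(
    NEURONS, FAN_OUT, MEAN_RATE_HZ, BYTES_PER_SYNAPSE
)

print(f"synaptic events per second: {events:.3e}")
print(f"memory bandwidth needed:    {bandwidth / 1e9:.2f} GB/s")
```

Under these assumptions the network generates about 2.56 billion synaptic events per second and needs roughly 10 GB/s of memory bandwidth just to fetch synaptic data in real time. The estimate scales linearly with each parameter, so a different spike rate or synapse format shifts the figure proportionally.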

Finding suitable benchmark neural networks for a massively parallel neural computation system proves to be a challenge. A synthetic benchmark that has biologically plausible fan-out, spike frequency and spike volume is proposed and used to evaluate the system. The system is shown to be capable of computing the activity of a network of 256k Izhikevich spiking neurons with a fan-out of 1k in real time using a network of four FPGA boards. This compares favourably with previous work, with the added advantage of scalability to larger neural networks using more FPGAs.
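For readers unfamiliar with the Izhikevich model, the sketch below simulates a single neuron in plain Python. It uses the standard published model equations with the commonly cited "regular spiking" parameters (a = 0.02, b = 0.2, c = -65, d = 8). The forward-Euler integration, the 0.5 ms time step, the input currents and the function names are assumptions made for illustration and say nothing about how the FPGA implementation works.

```python
def izhikevich_step(v, u, i_in, a=0.02, b=0.2, c=-65.0, d=8.0, dt=0.5):
    """Advance one neuron by dt milliseconds using forward Euler.

    v is the membrane potential (mV), u the recovery variable,
    i_in the input current. Returns (new_v, new_u, fired).
    """
    fired = v >= 30.0
    if fired:
        # Spike detected: apply the model's reset rule before integrating.
        v = c
        u = u + d
    dv = 0.04 * v * v + 5.0 * v + 140.0 - u + i_in
    du = a * (b * v - u)
    return v + dt * dv, u + dt * du, fired


def simulate(steps, i_in, dt=0.5):
    """Run one neuron for `steps` time steps and return the spike step indices."""
    v = -65.0
    u = 0.2 * v  # start the recovery variable at b * v
    spikes = []
    for t in range(steps):
        v, u, fired = izhikevich_step(v, u, i_in, dt=dt)
        if fired:
            spikes.append(t)
    return spikes


# 2000 steps of 0.5 ms = 1 second of simulated time.
quiet = simulate(2000, i_in=0.0)    # no input current: neuron settles to rest
driven = simulate(2000, i_in=10.0)  # constant input current: neuron spikes repeatedly

print(f"spikes with no input:       {len(quiet)}")
print(f"spikes with constant input: {len(driven)}")
```

Each neuron update touches only its own state, which is the "embarrassingly parallel" part. Delivering each spike to around a thousand targets is the part this loop leaves out, and that is where the memory and communication cost arises.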

It is concluded that communication must be considered as a first-class design constraint when implementing massively parallel neural computation systems. This thesis aims to build upon the current state of the art in deep learning to develop improved real-world classifiers.