Delving into neural networks and deep learning
An IT industry analyst article published by SearchITOperations.
Machine learning is coming to the data center, both to improve internal IT management and to embed intelligence into key business processes. You have probably heard of a mystical “deep learning” that threatens to infuse everything from systems management to self-driving cars. Is deep learning some really smart artificial intelligence that was just created and is about to be unleashed on the world, or is it simply marketing hype aiming to relaunch complex machine learning algorithms in a better light?
Deep learning definitely fires the imagination, but it’s actually not that complicated. At a technical level, deep learning mostly refers to large, compute-intensive neural networks running at scale. These networks are often trained over big data sets that might, for example, include imagery, speech, video and other dense data whose inherently complex patterns are difficult for more logical, rules-based machine learning approaches to master.
Neural networks and deep learning themselves are not new. Almost from the beginning of the modern computer age, neural network algorithms have been researched to help recognize deep patterns hidden in complex data streams. In that sense, deep learning is built on familiar machine learning techniques. Yet the application of newer, more computationally complex forms of neural network algorithms to today’s big data sets creates significant new opportunities. These “deep” models can be created and applied in real time (or at least faster than human time) at large scale, using affordable clouds or commodity scale-out big data architectures.
Impressionable neural networks
Neural networks were first explored back in the ’50s and ’60s as a model for how the human brain works. They consist of layers of nodes that are linked together, like neurons in the brain, into a large network. Each node receives input signals and, in turn, sends an outgoing signal to other nodes according to a predefined “activation function” that determines when that node should turn on. Basically, you can think of how a node works in terms of excitement: as a node gets increasingly excited by the combination of its inputs, it can generate some level of output signal to send downstream. Interestingly, a node can get excited and signal either positively or negatively; some nodes, when activated, actually inhibit other nodes from getting excited.
Nodes are interconnected by links that each have their own weight variable. A link’s weight modifies any signal it carries. Neural networks adapt and learn to recognize patterns by incrementally adjusting their whole network of link weights, so that eventually only recognized patterns create a full cascade of excitement through the network.
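To make the node-and-link picture concrete, here is a minimal Python sketch of a single node. It is only an illustration under stated assumptions: the sigmoid activation function, the bias term and the names node_output, signals, excited and inhibited are not from the article, and inhibition is modeled simply as a negative link weight.

```python
import math

def sigmoid(x: float) -> float:
    """Assumed activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, bias=0.0):
    """Combine weighted incoming signals, then apply the activation function."""
    # Each link's weight modifies the signal it carries before the node sums them.
    excitement = sum(signal * weight for signal, weight in zip(inputs, weights)) + bias
    return sigmoid(excitement)

# Three upstream nodes are all "on".
signals = [1.0, 1.0, 1.0]

# Two excitatory links; the third link carries no weight at all.
excited = node_output(signals, [2.0, 2.0, 0.0])

# Same excitatory links, but the third link is now strongly inhibitory.
inhibited = node_output(signals, [2.0, 2.0, -6.0])

print(f"excited={excited:.3f} inhibited={inhibited:.3f}")
```

With the inhibitory link in place, the same inputs leave the node far less excited, which is the behavior the weights are meant to control.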
Generally, input data is formatted into an incoming signal linked into a first layer of exposed nodes. These nodes in turn send signals into one or more hidden layers, with a final output layer of nodes assembling an “answer” for the outside world. As the learning (i.e., the intelligence) becomes embedded in the link weights, the key to practical use is figuring out how to adjust or train all the hidden link weights to respond to the right patterns. Today, neural networks mainly learn to recognize patterns found in training data by using an incremental technique called back-propagation. This method “rewards” links in proportion to how much they contribute to recognizing good examples and penalizes them when they contribute to misidentifying negative ones.
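The layered flow and the weight adjustment described above can be sketched in a few lines of plain Python. This is a toy illustration, not a production approach: the network size (two inputs, three hidden nodes, one output), the sigmoid activation, the squared-error measure, the learning rate, the logical-OR training pattern and every function name here are assumptions chosen for brevity.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Input layer -> one hidden layer -> a single output node."""
    # Every node also receives a constant 1.0 signal (an assumed bias input).
    hidden = [sigmoid(sum(w * s for w, s in zip(ws, x + [1.0]))) for ws in w_hidden]
    output = sigmoid(sum(w * s for w, s in zip(w_out, hidden + [1.0])))
    return hidden, output

def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One back-propagation update for a single training example."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error at the output node, scaled by the slope of the sigmoid.
    delta_out = (output - target) * output * (1.0 - output)
    # Pass blame back to each hidden node in proportion to its outgoing weight.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Nudge each link weight against its share of the error.
    for j, signal in enumerate(hidden + [1.0]):
        w_out[j] -= rate * delta_out * signal
    for j, ws in enumerate(w_hidden):
        for i, signal in enumerate(x + [1.0]):
            ws[i] -= rate * delta_hidden[j] * signal

def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)

# Toy pattern: the answer is 1 when either input is on (logical OR).
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

rng = random.Random(0)  # fixed seed so the run is repeatable
w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]  # 2 inputs + bias each
w_out = [rng.uniform(-1, 1) for _ in range(4)]                          # 3 hidden + bias

loss_before = mean_squared_error(data, w_hidden, w_out)
for _ in range(2000):
    for x, target in data:
        train_step(x, target, w_hidden, w_out)
loss_after = mean_squared_error(data, w_hidden, w_out)

print(f"error before training: {loss_before:.4f}, after: {loss_after:.4f}")
```

Nothing in the loop is told what the pattern is; the error shrinks only because each pass adjusts the link weights a little in the direction that reduces the mistakes on the training examples.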
However, there is no one right network architecture for any given problem. This is one area in which machine learning expertise is invaluable…(read the complete as-published article there)