Big data and IoT benefit from machine learning, AI apocalypse not imminent

An IT industry analyst article published by SearchITOperations.

Suddenly, everybody is talking about machine learning, AI bots and deep learning. These technologies are showing up in new products that look at “call home data,” in cloud-hosted optimization services and even in new storage arrays!

So what’s really going on? Is this something brand new or just the maturation of ideas spawned out of decades-old artificial intelligence research? Does deep learning require conversion to some mystical new church to understand it, or did our computers suddenly get way smarter overnight? Should we sleep with a finger on the power off button? But most importantly for IT folks, are advances in machine learning becoming accessible enough to readily apply to actual business problems, or is this just another decade of hype?

There are plenty of examples of highly visible machine learning applications in the press recently, both positive and negative. Microsoft’s Tay AI bot, designed to actively learn from 18- to 24-year-olds on Twitter, Kik and GroupMe, unsurprisingly achieved its goal. Within hours of going live, it became a badly behaved young adult, both learning and repeating hateful, misogynistic, racist speech. Google’s AlphaGo beat a world champion at the game of Go by learning the best patterns of play from millions of past games, since the game can’t be solved through brute-force computation with all the CPU cycles in the universe. Meanwhile, Google’s self-driving car hit a bus, albeit at slow speed. It clearly has more to learn about the way humans drive.

Before diving deeper, let me be clear: I have nothing but awe and respect for recent advances in machine learning. I’ve been directly and indirectly involved in applied AI and predictive modeling in various ways for most of my career. Although my current IT analyst work isn’t yet very computationally informed, there are many people working hard to use computers to automatically identify and predict trends for both fun and profit. Machine learning represents the brightest opportunity to improve life on this planet: today leveraging big data, tomorrow optimizing the Internet of Things (IoT).

Do machines really learn?

First, let’s demystify machine learning a bit. Machine learning is about finding useful patterns inherent in a given historical data set. These patterns usually capture correlations between input values that you can observe and output values that you’d eventually like to predict. Although precise definitions depend on the textbook, a model can be a particular algorithm with specific parameters that are tuned, or one that comes to “learn” useful patterns.

There are two broad kinds of machine learning:

…(read the complete as-published article there)

Google Drives a Stake Into The Human Heart? – AlphaGo Beats Human Go Champion

(Excerpt from original post on the Taneja Group News Blog)

Google’s AlphaGo program has just whipped a top human Go champion (Lee Sedol, a 9-dan ranked pro and winner of many top titles) four games to one in a million-dollar match. You might shrug if you are not familiar with the subtleties of playing Go at a champion level, but believe me, this is a significant milestone in machine learning. Software has long proven able to master games whose moves can be calculated out to the end game with enough computing power (like checkers and chess), but Go is (as of yet) not fully computable due to its board size (19×19) and seemingly “intuitive” beginning and even mid-game move options.

…(read the full post)

Big data analytics applications impact storage systems

An IT industry analyst article published by SearchStorage.

Whether driven by direct competition or internal business pressure, CIOs, CDOs and even CEOs today are looking to squeeze more value, more insight and more intelligence out of their data. They no longer can afford to archive, ignore or throw away data if it can be turned into a valuable asset. At face value, it might seem like a no-brainer — “we just need to analyze all that data to mine its value.” But, as you know, keeping any data, much less big data, has a definite cost. Processing larger amounts of data at scale is challenging, and hosting all that data on primary storage hasn’t always been feasible.

Historically, unless data had some corporate value — possibly as a history trail for compliance, a source of strategic insight or intelligence that can optimize operational processes — it was tough to justify keeping it. Today, thanks in large part to big data analytics applications, that thinking is changing. All of that bulky low-level bigger data has little immediate value, but there might be great future potential someday, so you want to keep it — once it’s gone, you lose any downstream opportunity.

To extract value from all that data, however, IT must not only store increasingly large volumes of data, but also architect systems that can process and analyze it in multiple ways.

…(read the complete as-published article there)

Is It Still Artificial Intelligence? Knowm Rolls Out Adaptive Machine Learning Stack

(Excerpt from original post on the Taneja Group News Blog)

When we want to start computing at biological scales and speeds – evolving today’s practical machine learning forward towards long-deferred visions of a more “artificial intelligence” – we’ll need to take advantage of new forms of hardware that transcend the strictly digital.

Digital computing infrastructure, based on switching digital bits and separating the functions of persisting data from processing, is now facing some big hurdles with Moore’s law. Even if there are a couple of orders of magnitude of improvement yet to be squeezed out of the traditional digital design paradigm, it has inherent limits in power consumption, scale, and speed. For example, there simply isn’t enough power available to meet the desires of those wishing to reach biological scale and density with traditional computing infrastructure, whether evolving artificial intelligence or more practically scaling machine learning to ever larger big data sets.

Knowm Inc. is pioneering a brilliant new form of computing that leverages the adaptive “learning” properties of memristive technology to not only persist data in fast memory (as others in the industry like HP are researching), but to inherently – and in one operation – calculate serious compute functions that would otherwise require the stored data to be offloaded into CPUs, processed, and written back (taking more time and power).

The Knowm synapse, their circuit-level integrated unit of calculation and data persistence, was inspired by biological and natural world precedent. At a philosophical level, it takes some deep thinking to fully appreciate the implications and opportunities, but this is no longer just a theory. Today, Knowm is announcing its “neuromemristive” solution, supported by a full stack of technologies – discrete chips, scalable simulators, defined low-level APIs and higher-level machine learning libraries, and a service that can help layer large quantities of Knowm synapses directly onto existing CMOS (Back End of Line or BEOL) designs.

Knowm is aiming squarely at the machine learning market, but here at Taneja Group we think the opportunity is much larger. This approach that intelligently harnesses analog hardware functions for extremely fast, cheap, dense and memory-inherent computing could represent a truly significant change and turning point for the whole computing industry.

I look forward to finding out who will take advantage of this solution first, and potentially cause a massively disruptive shift in not just machine learning, but in how all computing is done.

…(read the full post)

IT pros get a handle on machine learning and big data

An IT industry analyst article published by SearchDataCenter.

Machine learning is the force behind many big data initiatives. But things can go wrong when implementing it, with significant effects on IT operations.

Unfortunately, predictive modeling can be fraught with peril if you don’t have a firm grasp of the quality and veracity of the input data, the actual business goal and the real world limits of prediction (e.g., you can’t avoid black swans).

It’s also easy for machine learning and big data beginners to either make ineffectively complex models or “overtrain” on the given data (learning too many details of the specific training data that don’t apply generally). In fact, it’s quite hard to really know when you have achieved the smartest yet still “generalized” model to take into production.
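To make “overtraining” concrete, here is a minimal Python sketch rather than anything taken from the article. It compares a model that simply memorizes its training data with a plain least-squares line. The data values and the helper names (`memorize`, `fit_line`, `mse`) are made up for illustration.

```python
def memorize(xs, ys):
    """An 'overtrained' model: return the output of the nearest training input."""
    table = list(zip(xs, ys))
    def predict(x):
        return min(table, key=lambda pair: abs(pair[0] - x))[1]
    return predict

def fit_line(xs, ys):
    """A simple generalized model: ordinary least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical training data: roughly y = 2x, with +/-0.5 of noise added.
train_x = [1, 2, 3, 4, 5, 6]
train_y = [2.5, 3.5, 6.5, 7.5, 10.5, 11.5]
# Hypothetical held-out test data following the underlying y = 2x trend.
test_x = [1.2, 2.2, 3.2, 4.2, 5.2]
test_y = [2.4, 4.4, 6.4, 8.4, 10.4]

memorizer = memorize(train_x, train_y)
line = fit_line(train_x, train_y)

for name, model in (("memorizer", memorizer), ("line", line)):
    print(name, "train MSE:", round(mse(model, train_x, train_y), 3),
          "test MSE:", round(mse(model, test_x, test_y), 3))
```

The memorizer scores a perfect zero error on the data it was trained on, yet does worse than the simple line on the held-out points, because it has learned the noise along with the trend. Holding back test data in this way is one common check for overtraining.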

Another challenge is that the metrics of success vary widely depending on the use case. There are dozens of metrics used to describe the quality and accuracy of the model output on test data. Even as an IT generalist, it pays to at least get comfortable with the matrix of machine learning outcomes, expressed with quadrants for the counts of true positives, true negatives, false positives (items falsely identified as positive) and false negatives (positives that were missed).
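The four quadrants described above can be counted in a few lines of Python. This is only an illustrative sketch: the labels are invented, and the `confusion_counts`, `precision` and `recall` helpers are hypothetical names, not part of any particular library.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count the four quadrants for binary labels (True means positive)."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a and p:
            counts["tp"] += 1   # true positive: correctly flagged
        elif not a and not p:
            counts["tn"] += 1   # true negative: correctly ignored
        elif p:
            counts["fp"] += 1   # false positive: falsely identified as positive
        else:
            counts["fn"] += 1   # false negative: a positive that was missed
    return counts

def precision(c):
    """Of everything flagged positive, the fraction that really was positive."""
    flagged = c["tp"] + c["fp"]
    return c["tp"] / flagged if flagged else 0.0

def recall(c):
    """Of everything really positive, the fraction that was flagged."""
    positives = c["tp"] + c["fn"]
    return c["tp"] / positives if positives else 0.0

# Hypothetical test data: did a disk actually fail, and did the model flag it?
actual    = [True, True,  False, False, True, False, False, False]
predicted = [True, False, False, True,  True, False, False, False]

counts = confusion_counts(actual, predicted)
print(dict(counts))
print("precision:", round(precision(counts), 2), "recall:", round(recall(counts), 2))
```

Precision and recall are two of the many metrics derived from these counts. Which one matters more depends on the use case, for example on whether a missed failure costs more than a false alarm.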

…(read the complete as-published article there)

Intro to machine learning algorithms for IT professionals

An IT industry analyst article published by SearchDataCenter.

Our data center machines, due to all the information we feed them, are getting smarter. How can you use machine learning to your advantage?

Machine learning is a key part of how big data brings operational intelligence into our organizations. But while machine learning algorithms are fascinating, the science gets complex very quickly. We can’t all be data scientists, but IT professionals need to learn about how our machines are learning.

We are increasingly seeing practical and achievable goals for machine learning, such as finding usable patterns in our data and then making predictions. Often, these predictive models are used in operational processes to optimize an ongoing decision-making process, but they can also provide key insight and information to inform strategic decisions.

The basic premise of machine learning is to train an algorithm to predict an output value within some probabilistic bounds when it is given specific input data. Keep in mind that machine learning techniques today are inductive, not deductive: they lead to probabilistic correlations, not definitive conclusions.
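As a deliberately tiny illustration of inductive learning, the Python sketch below “trains” by counting how often an outcome followed each input value in historical records. The history and the `learn_rates` helper are hypothetical. Real algorithms are far more sophisticated, but the output has the same character: a probability, not a certainty.

```python
from collections import defaultdict

def learn_rates(history):
    """Estimate P(outcome is True | input value) from observed frequencies."""
    seen = defaultdict(lambda: [0, 0])  # input value -> [positive count, total count]
    for value, outcome in history:
        seen[value][0] += int(outcome)
        seen[value][1] += 1
    return {value: pos / total for value, (pos, total) in seen.items()}

# Hypothetical history: (storage latency level, did the batch job miss its window?)
history = [
    ("high", True), ("high", True), ("high", False), ("high", True),
    ("low", False), ("low", False), ("low", True), ("low", False),
    ("low", False), ("low", False), ("low", False), ("low", False),
]

rates = learn_rates(history)
for level, rate in sorted(rates.items()):
    print(f"latency {level}: estimated chance of a missed window = {rate:.3f}")
```

The model says a missed window is likely when latency is high, not that it will happen, and the estimate is only as good as the history it was induced from.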

…(read the complete as-published article there)

IoT Goes Real-Time, Gets Predictive – Glassbeam Launches Spark-based Machine Learning

(Excerpt from original post on the Taneja Group News Blog)

In-memory processing was all the rage at Strata 2014 NY last month, and the hottest word was Spark! Spark is a big data scale-out cluster solution that provides a way to speedily analyze large data sets in memory using a “resilient distributed dataset” design for fault tolerance. It can deploy into its own optimized cluster, or ride on top of Hadoop 2.0 using YARN (although it is a different processing platform/paradigm from MapReduce; see this post on GridGain for a Hadoop MR in-memory solution).

…(read the full post)