First – Does flash storage really impact a business’s predictive analytics applications?
Then – How exactly?
Here’s another video quick take on AI and Machine Learning.
Watch this to hear about how Machine Learning is starting to seriously impact IT, both from a user workload perspective, and now as it’s being used in IT systems management.
Algorithms control our lives in many and increasingly mysterious ways. While machine learning algorithms change IT, you might be surprised at the algorithms at work in your nondigital life as well.
When I pull a little numbered ticket at the local deli counter, I know with some certainty that I’ll eventually get served. That’s a queuing algorithm in action — it preserves the expected first-in, first-out ordering of the line. Although wait times vary, it delivers a predictable average latency to all shoppers.
Now compare that to when I buy a ticket for the lottery. I’m taking a big chance on a random-draw algorithm, which is quite unlikely to ever go my way. Winning is not only uncertain, but improbable. Still, for many folks, the purchase of a lottery ticket delivers a temporary emotional salve, so there is some economic utility — as you might have heard in Economics 101.
People can respond well to algorithms that have guaranteed certainty and those with arbitrary randomness in the appropriate situations. But imagine flipping those scenarios. What if your deli only randomly selected people to serve? With enough competing shoppers, you might never get your sliced bologna. What if the lottery just ended up paying everyone back their ticket price minus some administrative tax? Even though this would improve almost everyone’s actual lottery return on investment, that kind of game would be no fun at all.
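To make the flipped deli scenario concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: a constant crowd of 20 shoppers, one person served and one new arrival per tick, and the function name `longest_wait`.

```python
import random

def longest_wait(policy, crowd=20, ticks=5000, seed=1):
    """Return the worst wait (in ticks) seen by any shopper.

    The line starts with `crowd` shoppers. On every tick one shopper is
    served and one new shopper arrives, so the line length never changes.
    """
    rng = random.Random(seed)
    line = [0] * crowd          # each entry is the tick a shopper arrived
    worst = 0
    for tick in range(1, ticks + 1):
        # FIFO serves the head of the line; "random" serves anyone.
        index = 0 if policy == "fifo" else rng.randrange(len(line))
        arrived = line.pop(index)
        worst = max(worst, tick - arrived)
        line.append(tick)       # a new shopper takes a ticket
    # Count shoppers who are still waiting when the deli closes.
    return max(worst, ticks - min(line))

fifo_worst = longest_wait("fifo")
random_worst = longest_wait("random")
print("worst wait with numbered tickets:", fifo_worst)
print("worst wait with random service:", random_worst)
```

Both policies serve exactly one shopper per tick, so the average throughput is identical. What changes is the tail: with numbered tickets nobody waits longer than the length of the line, while random service leaves some unlucky shopper waiting far longer.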
Without getting deep into psychology or behavioral economics, there are clearly appropriate and inappropriate uses of randomization. When we know we are taking a long-shot chance at a big upside, we might grumble if we lose. But our reactions are different when the department of motor vehicles closes after we’ve already spent four hours waiting.
Now imagine being subjected to opaque algorithms in various important facets of your life, as when applying for a mortgage, a car loan, a job or school admission. Many of the algorithms that govern your fate are seemingly arbitrary. Without transparency, it’s hard to know if any of them are actually fair, much less able to predict your individual prospects. (Consider the fairness concept the next time an airline randomly bumps you from a flight.)
Machine learning algorithms overview — machines learn what?
So let’s consider the supposedly smarter algorithms designed at some organizational level to be fair. Perhaps they’re based on some hard, rational logic leading to an unbiased and random draw, or more likely on some fancy but operationally opaque big data-based machine learning algorithm.
With machine learning, we hope things will be better, but they can also get much worse. In too many cases, poorly trained or designed machine learning algorithms end up making prejudicial decisions that can unfairly affect individuals.
I’m not exaggerating when I predict that machine learning will touch every facet of human existence.
This is a growing — and significant — problem for all of us. Machine learning is influencing a lot of the important decisions made about us and is steering more and more of our economy. It has crept in behind the scenes as so-called secret sauce or as proprietary algorithms applied to key operations.
But with easy-to-use big data and machine learning tools like Apache Spark, and with the increasing streams of data from the internet of things wrapping all around us, I expect that every data-driven task will be optimized with machine learning in some important way…(read the complete as-published article there)
Infrastructure is getting smarter by the day. It’s reached the point where I’m afraid artificially intelligent IT will soon turn the tables and start telling me how to manage my own personal “lifecycle.” Well, I would be afraid if I believed all those AI vendors suddenly claiming they offer AI-powered infrastructure.
Now, we all want smarter, more automated, self-optimizing infrastructure — especially with storage — but I don’t see storage infrastructure components engaging in a human conversation with people about anything anytime soon. Storage is definitely getting smarter in more practical ways, however, and these changes are being seen in places such as data center storage architecture.
I’m excited by the hot storage trend toward embedding machine learning algorithms aimed at key optimization, categorization, search and pattern detection tasks. Corporate data assets are growing, and so is the potential value that comes from gathering and analyzing big data. It’s difficult to manually find those nuggets of data gold, though. And with the coming onslaught of the internet of things (IoT), data prospecting challenges will add mining huge amounts of fast streaming, real-time machine-generated and operational transactional data to the mix.
To help us take advantage of these potential information riches, storage vendors have started inserting intelligent algorithms into the storage layer directly…(read the complete as-published article there)
One of the hottest IT trends today is augmenting traditional business applications with artificial intelligence or machine learning capabilities. I predict the next generation of data center application platforms will natively support the real-time convergence of online transaction processing with analytics. Why not bring the point of the sword on operational insight to the frontline where business actually happens?
But modifying production application code that is optimized for handling transactions to embed machine learning algorithms is a tough slog. As most IT folks are reluctant — OK, absolutely refuse — to take apart successfully deployed operational applications to fundamentally rebuild them from the inside out, software vendors have rolled out some new ways to insert machine intelligence into business workflows. Microsoft is among them, pushing SQL Server machine learning tools tied to its database software.
Basically, adding intelligence to an application means folding in a machine learning model to recognize patterns in data, automatically label or categorize new information, recommend priorities for action, score business opportunities or make behavioral predictions about customers. Sometimes this intelligence is overtly presented to the end user, but it can also transparently supplement existing application functionality.
In conventional data science and analytics activities, machine learning models typically are built, trained and run in separate analytics systems. But models applied to transactional workflows require a method that enables them to be used operationally at the right time and place, and may need another operational method to support ongoing training (e.g., to learn about new data).
Closeness counts in machine learning
In the broader IT world, many organizations are excited by serverless computing and lambda function cloud services in which small bits of code are executed in response to data flows and event triggers. But this isn’t really a new idea in the database world, where stored procedures have been around for decades. They effectively bring compute processes closer to data, the core idea behind much of today’s big data tools.
Database stored procedures can offload data-intensive modeling tasks such as training, and they can also integrate machine learning functionality directly into application data flows. With such injections, some transactional applications may be able to take advantage of embedded intelligence without any changes to upstream application code. Additionally, applying machine learning models close to the data in a database allows the operational intelligence to be readily shared among different downstream users…(read the complete as-published article there)
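As a rough illustration of scoring close to the data, here is a small sketch using Python's built-in sqlite3 module. It is not SQL Server's machine learning tooling: SQLite has no stored procedures, so a user-defined SQL function stands in for one. The table, the `score` function name and the model weights are all made up for the example.

```python
import math
import sqlite3

# Hypothetical, already-trained logistic model. The weights are invented
# for illustration and do not come from any real training run.
WEIGHTS = {"bias": -3.0, "amount": 0.004, "prior_orders": 0.5}

def upsell_score(amount, prior_orders):
    """Return a 0..1 score for one order row."""
    z = (WEIGHTS["bias"]
         + WEIGHTS["amount"] * amount
         + WEIGHTS["prior_orders"] * prior_orders)
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
# Register the model inside the database engine so plain SQL can call it.
conn.create_function("score", 2, upsell_score)

conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, prior_orders INTEGER)"
)
conn.executemany(
    "INSERT INTO orders (amount, prior_orders) VALUES (?, ?)",
    [(100.0, 0), (900.0, 4), (250.0, 1)],
)

# The calling application only issues SQL; it never imports the model.
rows = conn.execute(
    "SELECT id, score(amount, prior_orders) FROM orders ORDER BY id"
).fetchall()
for order_id, p in rows:
    print(order_id, round(p, 3))
```

The point of the design is that any downstream user who can run a query gets the same scores, and swapping in a retrained model means changing one registered function rather than every application.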
Big data and artificial intelligence will affect the world — and already are — in mind-boggling ways. That includes, of course, our data centers.
The term artificial intelligence (AI) is making a comeback. I interpret AI as a larger, encompassing umbrella that includes machine learning (which in turn includes deep learning methods) but also implies thought. Meanwhile, machine learning is somehow safe to talk about. It’s just some applied math under the hood, built on probabilities, linear algebra and differential equations. But use the term AI and, suddenly, you get wildly different emotional reactions: for example, the Terminator is coming. However, today’s broader field of AI is working toward providing humanity with enhanced and automated vision, speech and reasoning.
If you’d like to stay on top of what’s happening practically in these areas, here are some emerging big data and AI trends to watch that might affect you and your data center sooner rather than later:
Where there is a Spark… Apache Spark is replacing basic Hadoop MapReduce for latency-sensitive big data jobs with its in-memory, real-time queries and fast machine learning at scale. And with familiar, analyst-friendly data constructs and languages, Spark brings it all within reach of us middling hacker types.
As far as production bulletproofing, it’s not quite fully baked. But version two of Spark was just released in mid-2016, and it’s solidifying fast. Even so, this fast-moving ecosystem and potential “Next Big Things” such as Apache Flink are already turning heads.
Even I can do it. A few years ago, all this big data and AI stuff required doctorate-level data scientists. In response, a few creative startups attempted to short-circuit those rare and expensive math geeks out of the standard corporate analytics loop and provide the spreadsheet-oriented business intelligence analyst some direct big data access.
Today, as with Spark, I get a real sense that big data analytics is finally within reach of the average engineer or programming techie. The average IT geek may still need to apply him or herself to some serious study but can achieve great success creating massive organizational value. In other words, there is now a large and growing middle ground where smart non-data scientists can be very productive with applied machine learning even on big and real-time data streams…(read the complete as-published article there)
Machine learning is coming to the data center both to improve internal IT management and to embed intelligence into key business processes. You have probably heard of a mystical “deep learning” threatening to infuse everything from systems management to self-driving cars. Is this deep learning some really smart artificial intelligence that was just created and is about to be unleashed on the world, or simply marketing hype aiming to re-launch complex machine learning algorithms in a better light?
It definitely fires the imagination, but it’s actually not that complicated. At a technical level, deep learning mostly refers to large compute-intensive neural networks running at scale. These networks are often trained over big data sets that might, for example, include imagery, speech, video and other dense data with inherently complex patterns difficult for more logical, rules-based machine learning approaches to master.
Neural networks and deep learning themselves are not new. Almost from the beginning of the modern computer age, neural network algorithms have been researched to help recognize deep patterns hidden in complex data streams. In that sense, deep learning is built on familiar machine learning techniques. Yet the application of newer, more computationally complex forms of neural network algorithms to today’s big data sets creates significant new opportunities. These “deep” models can be created and applied in real-time (at least faster than human time) at large scales, using affordable clouds or commodity scale-out big data architectures.
Impressionable neural networks
Neural networks were first explored back in the ’50s and ’60s as a model for how the human brain works. They consist of layers of nodes that are linked together like neurons in the brain into a large network. Each node receives input signals, and in turn, activates an outgoing signal sent to other nodes according to a pre-defined “activation function” that determines when that node should turn on. Basically you can think of how a node works in terms of excitement — as a node gets increasingly excited by the combination of its inputs, it can generate some level of output signal to send downstream. Interestingly, a node can get excited and signal either positively or negatively; some nodes when activated actually inhibit other nodes from getting excited.
Nodes are interconnected by links that each have their own weight variable. A link’s weight modifies any signal it carries. Neural networks adapt and learn to recognize patterns by incrementally adjusting their whole network of link weights, so that eventually only recognized patterns create a full cascade of excitement through the network.
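A single node is easy to sketch in a few lines. This version assumes a sigmoid as the activation function, which is one common choice rather than the only one, and the weights are arbitrary numbers picked for illustration.

```python
import math

def sigmoid(x):
    """Squash any total input into an output level between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, bias=0.0):
    """Weight each incoming signal, add them up, then apply the activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(total)

# Two active inputs arriving over positively weighted links excite the node.
excited = node_output([1.0, 1.0], [2.0, 2.0])      # roughly 0.98

# Flip one link to a negative weight and the same input now inhibits it.
inhibited = node_output([1.0, 1.0], [2.0, -3.0])   # roughly 0.27

print(round(excited, 2), round(inhibited, 2))
```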
Generally, input data is formatted into an incoming signal linked into a first layer of exposed nodes. These nodes in turn send signals into one or more hidden layers, with a final output layer of nodes assembling an “answer” to the outside world. As the learning (i.e., the intelligence) becomes embedded in the link weights, the key to practical use is figuring out how to adjust or train all the hidden link weights to respond to the right patterns. Today, neural networks mainly learn to recognize patterns found in training data by using an incremental technique called back-propagation. This method proportionally “rewards” links when they contribute in a positive way towards recognizing good examples and penalizes them when they identify negative examples.
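Here is a minimal back-propagation sketch in plain Python, assuming one hidden layer of four sigmoid nodes, a squared-error measure and the classic XOR toy problem. The layer size, learning rate, epoch count and random seed are arbitrary choices for illustration, not tuned recommendations.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    # The last weight in every row is a bias, fed by a constant 1.0 input.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
              for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output

def mean_squared_error(data, w_hidden, w_out):
    return sum((t - forward(x, w_hidden, w_out)[1]) ** 2
               for x, t in data) / len(data)

def train(data, n_hidden=4, epochs=5000, rate=0.5, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    before = mean_squared_error(data, w_hidden, w_out)
    for _ in range(epochs):
        for x, target in data:
            hidden, output = forward(x, w_hidden, w_out)
            # Error signal at the output node, scaled by the sigmoid's slope.
            delta_out = (target - output) * output * (1 - output)
            # Pass that signal backward across each hidden-to-output link.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1 - hidden[j])
                            for j in range(n_hidden)]
            # Nudge every link in proportion to the signal it carried.
            for j, h in enumerate(hidden + [1.0]):
                w_out[j] += rate * delta_out * h
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] += rate * delta_hidden[j] * v
    after = mean_squared_error(data, w_hidden, w_out)
    return w_hidden, w_out, before, after

XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
       ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
w_hidden, w_out, error_before, error_after = train(XOR)
print("error before training:", round(error_before, 4))
print("error after training:", round(error_after, 4))
```

How low the final error gets depends on the starting weights and the settings above, which is a small taste of why choosing and tuning a network still takes some expertise.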
However, there is no one right network architecture for any given problem. This is one area in which machine learning expertise is invaluable…(read the complete as-published article there)
So what’s really going on? Is this something brand new or just the maturation of ideas spawned out of decades-old artificial intelligence research? Does deep learning require conversion to some mystical new church to understand it, or did our computers suddenly get way smarter overnight? Should we sleep with a finger on the power-off button? But most importantly for IT folks, are advances in machine learning becoming accessible enough to readily apply to actual business problems, or is it just another decade of hype?
There are plenty of examples of highly visible machine learning applications in the press recently, both positive and negative. Microsoft’s Tay AI bot, designed to actively learn from 18- to 24-year-olds on Twitter, Kik and GroupMe, unsurprisingly achieved its goal. Within hours of going live, it became a badly behaved young adult, both learning and repeating hateful, misogynistic, racist speech. Google’s AlphaGo beat a world champion at the game of Go by learning the best patterns of play from millions of past games, since the game can’t be solved through brute force computation with all the CPU cycles in the universe. Meanwhile, Google’s self-driving car hit a bus, albeit at slow speed. It clearly has more to learn about the way humans drive.
Before diving deeper, let me be clear: I have nothing but awe and respect for recent advances in machine learning. I’ve been directly and indirectly involved in applied AI and predictive modeling in various ways for most of my career. Although my current IT analyst work isn’t yet very computationally informed, there are many people working hard to use computers to automatically identify and predict trends for both fun and profit. Machine learning represents the brightest opportunity to improve life on this planet, today by leveraging big data and tomorrow by optimizing the Internet of Things (IoT).
First, let’s demystify machine learning a bit. Machine learning is about finding useful patterns inherent in a given historical data set. These patterns usually capture correlations between input values that you can observe and output values that you’d eventually like to predict. Although precise definitions depend on the textbook, a model can be a particular algorithm with specific parameters that are tuned, or one that comes to “learn” useful patterns.
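The simplest possible instance of that idea is fitting a straight line to past observations. The sketch below uses ordinary least squares, and the "historical" numbers (system load versus response time) are invented purely for illustration.

```python
def fit_line(xs, ys):
    """Tune the two parameters of a line (slope, intercept) by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: observed load (input) and response time (output).
loads = [10, 20, 30, 40, 50]
latencies = [12.0, 21.5, 32.5, 41.0, 52.0]

slope, intercept = fit_line(loads, latencies)

# The tuned model can now predict an output for an input it has never seen.
predicted = slope * 60 + intercept
print(round(slope, 3), round(intercept, 3), round(predicted, 2))
```

Here the algorithm is "a straight line" and the learned parameters are the slope and intercept. Fancier models follow the same pattern with many more parameters.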
There are two broad kinds of machine learning:
…(read the complete as-published article there)
Google’s AlphaGo program has just whipped a top human Go-playing champion (Lee Sedol – a 9 Dan ranked pro and winner of many top titles) four games to one in a million-dollar match. You might shrug if you are not familiar with the subtleties of playing Go at a champion level, but believe me that this is a significant milestone in Machine Learning. Software has long proven able to master games whose moves can be calculated out to the end game with enough computing power (like checkers and chess), but Go is (as of yet) not fully computable due to its board size (19×19) and seemingly “intuitive” beginning and even mid-game move options.
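A quick back-of-the-envelope calculation shows why board size matters so much. Each of the 19×19 points can be empty, black or white, which gives a crude upper bound on board configurations. It is only an upper bound, since it counts illegal positions too, and it counts positions rather than full game sequences.

```python
POINTS = 19 * 19                 # intersections on a full-size Go board
STATES_PER_POINT = 3             # empty, black stone or white stone

# Crude upper bound on the number of distinct board configurations.
go_upper_bound = STATES_PER_POINT ** POINTS
digits = len(str(go_upper_bound))

print("points on the board:", POINTS)
print("upper bound has", digits, "decimal digits")
```

A number with that many digits is far beyond anything brute-force enumeration can handle, which is why learned pattern recognition had to carry so much of the load.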
…(read the full post)
Whether driven by direct competition or internal business pressure, CIOs, CDOs and even CEOs today are looking to squeeze more value, more insight and more intelligence out of their data. They no longer can afford to archive, ignore or throw away data if it can be turned into a valuable asset. At face value, it might seem like a no-brainer — “we just need to analyze all that data to mine its value.” But, as you know, keeping any data, much less big data, has a definite cost. Processing larger amounts of data at scale is challenging, and hosting all that data on primary storage hasn’t always been feasible.
Historically, unless data had some corporate value — possibly as a history trail for compliance, a source of strategic insight or intelligence that can optimize operational processes — it was tough to justify keeping it. Today, thanks in large part to big data analytics applications, that thinking is changing. All of that bulky low-level bigger data has little immediate value, but there might be great future potential someday, so you want to keep it — once it’s gone, you lose any downstream opportunity.
…(read the complete as-published article there)