5 trends driving the big data evolution

An IT industry analyst article published by SearchDataManagement.


The speedy evolution of big data technologies is connected to five trends, including practical applications of machine learning and cheap, abundantly available compute resources.

Mike Matchett
Small World Big Data

I’ve long said that all data will eventually become big data, and big data platforms will evolve into our next-generation data processing platform. We have reached a point in big data evolution where it is now mainstream, and if your organization is not neck-deep in figuring out how to implement big data technologies, you might be running out of time.

Indeed, the big data world continues to change rapidly, as I observed recently at the Strata Data Conference in New York. While there, I met with over a dozen key vendors in sessions and on the show floor.

Overall, the folks attending conferences like this one are less and less the slightly goofy, idealistic, open source research-focused geeks of years past, and more and more real-world big data and machine learning practitioners looking to solve real business problems in enterprise production environments. Given that basic vibe, here are my top five takeaways from Strata on the trends that are driving the big data evolution.

1. Structured data

Big data isn’t just about unstructured or semi-structured data anymore. Many of the prominent vendors, led by the key platform providers like Hortonworks, MapR and Cloudera, are now talking about big data implementations as full enterprise data warehouses (EDWs). The passive, often swampy data lake idea seems a bit passé, while there is a lot of energy aimed at providing practical, real-time business intelligence to a wider corporate swath of BI consumers.

I noted that a large number of the big data-based acceleration competitors are applying on-demand analytics against tremendous volumes of structured data, both historical and streaming IoT-style.

Clearly, there is a war going on for the corporate BI and EDW investment. Given what I’ve seen, my bet is on big data platforms to inevitably outpace and outperform monolithic and proprietary legacy EDW.

2. Converged system of action

This leads into the observation that big data evolution includes implementations that host more and more of a company’s entire data footprint — structured and unstructured data together.

We’ve previously noted that many advanced analytical approaches can add tremendous value when they combine many formerly disparate corporate data sets of all different types…(read the complete as-published article there)

Machine learning algorithms make life easier — until they don’t

An IT industry analyst article published by SearchITOperations.


Algorithms govern many facets of our lives. But imperfect logic and data sets can make results worse instead of better, so it behooves all of us to think like data scientists.

Mike Matchett

Algorithms control our lives in many and increasingly mysterious ways. While machine learning algorithms change IT, you might be surprised at the algorithms at work in your nondigital life as well.

When I pull a little numbered ticket at the local deli counter, I know with some certainty that I’ll eventually get served. That’s a queuing algorithm in action — it preserves the expected first-in, first-out ordering of the line. Although wait times vary, it delivers a predictable average latency to all shoppers.

Now compare that to when I buy a ticket for the lottery. I’m taking a big chance on a random-draw algorithm, which is quite unlikely to ever go my way. Winning is not only uncertain, but improbable. Still, for many folks, the purchase of a lottery ticket delivers a temporary emotional salve, so there is some economic utility — as you might have heard in Economics 101.

People can respond well to algorithms that have guaranteed certainty and those with arbitrary randomness in the appropriate situations. But imagine flipping those scenarios. What if your deli only randomly selected people to serve? With enough competing shoppers, you might never get your sliced bologna. What if the lottery just ended up paying everyone back their ticket price minus some administrative tax? Even though this would improve almost everyone’s actual lottery return on investment, that kind of game would be no fun at all.
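To make the flipped deli scenario concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not something from the deli itself: the function names, a line that holds steady at 10 shoppers, one arrival and one service per tick, and a fixed random seed. It compares how long shoppers wait when the counter serves in ticket order versus when it picks a waiting shopper at random.

```python
import random


def pick_fifo(waiting, rng):
    """Numbered-ticket rule: always serve whoever has waited longest."""
    return 0


def pick_random(waiting, rng):
    """Flipped rule: serve a randomly chosen shopper from the line."""
    return rng.randrange(len(waiting))


def simulate(pick, backlog=10, ticks=200, seed=0):
    """Serve one shopper per tick while one new shopper arrives per tick.

    Returns (waits, still_waiting): waits maps each served ticket number
    to the number of ticks that shopper spent in line.
    """
    rng = random.Random(seed)
    waiting = list(range(backlog))          # ticket numbers, oldest first
    arrived = {ticket: 0 for ticket in waiting}
    next_ticket = backlog
    waits = {}
    for tick in range(ticks):
        ticket = waiting.pop(pick(waiting, rng))
        waits[ticket] = tick - arrived[ticket]
        # A new shopper joins the back of the line for the next tick.
        waiting.append(next_ticket)
        arrived[next_ticket] = tick + 1
        next_ticket += 1
    return waits, waiting


if __name__ == "__main__":
    for name, pick in (("fifo", pick_fifo), ("random", pick_random)):
        waits, _ = simulate(pick)
        average = sum(waits.values()) / len(waits)
        print(f"{name:>6}: average wait {average:.1f}, longest wait {max(waits.values())}")
```

Under these assumptions the ticket-order counter never makes anyone wait more than nine ticks, while the random counter leaves some shoppers standing far longer, and nothing guarantees that any particular shopper is ever served.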

Without getting deep into psychology or behavioral economics, there are clearly appropriate and inappropriate uses of randomization. When we know we are taking a long-shot chance at a big upside, we might grumble if we lose. But our reactions are different when the department of motor vehicles closes after we’ve already spent four hours waiting.

Now imagine being subjected to opaque algorithms in various important facets of your life, as when applying for a mortgage, a car loan, a job or school admission. Many of the algorithms that govern your fate are seemingly arbitrary. Without transparency, it’s hard to know if any of them are actually fair, much less to predict your individual prospects. (Consider the fairness concept the next time an airline randomly bumps you from a flight.)

Machine learning algorithms overview: machines learn what?

So let’s consider the supposedly smarter algorithms designed at some organizational level to be fair. Perhaps they’re based on some hard, rational logic leading to an unbiased and random draw, or more likely on some fancy but operationally opaque big data-based machine learning algorithm.

With machine learning, we hope things will be better, but they can also get much worse. In too many cases, poorly trained or designed machine learning algorithms end up making prejudicial decisions that can unfairly affect individuals.

I’m not exaggerating when I predict that machine learning will touch every facet of human existence.

This is a growing — and significant — problem for all of us. Machine learning is influencing a lot of the important decisions made about us and is steering more and more of our economy. It has crept in behind the scenes as so-called secret sauce or as proprietary algorithms applied to key operations.

But with easy-to-use big data and machine learning tools like Apache Spark, and the increasing streams of data from the internet of things wrapping all around us, I expect that every data-driven task will be optimized with machine learning in some important way…(read the complete as-published article there)

Data center storage architecture gets smarter with AI

An IT industry analyst article published by SearchStorage.


Trends such as event-triggered computing, exemplified by Lambda Architectures, are converging on data center storage and hastening the evolution of data center intelligence.

Mike Matchett

Infrastructure is getting smarter by the day. It’s reached the point where I’m afraid artificially intelligent IT will soon turn the tables and start telling me how to manage my own personal “lifecycle.” Well, I would be afraid if I believed all those AI vendors suddenly claiming they offer AI-powered infrastructure.

Now, we all want smarter, more automated, self-optimizing infrastructure — especially with storage — but I don’t see storage infrastructure components engaging in a human conversation with people about anything anytime soon. Storage is definitely getting smarter in more practical ways, however, and these changes are being seen in places such as data center storage architecture.

I’m excited by the hot storage trend toward embedding machine learning algorithms aimed at key optimization, categorization, search and pattern detection tasks. Corporate data assets are growing, and so is the potential value that comes from gathering and analyzing big data. It’s difficult to manually find those nuggets of data gold, though. And with the coming onslaught of the internet of things (IoT), data prospecting challenges will add mining huge amounts of fast streaming, real-time machine-generated and operational transactional data to the mix.

To help us take advantage of these potential information riches, storage vendors have started inserting intelligent algorithms into the storage layer directly…(read the complete as-published article there)