Machine learning algorithms make life easier — until they don’t

An IT industry analyst article published by SearchITOperations.


Algorithms govern many facets of our lives. But imperfect logic and data sets can make results worse instead of better, so it behooves all of us to think like data scientists.

Mike Matchett

Algorithms control our lives in many and increasingly mysterious ways. While machine learning algorithms change IT, you might be surprised at the algorithms at work in your nondigital life as well.

When I pull a little numbered ticket at the local deli counter, I know with some certainty that I’ll eventually get served. That’s a queuing algorithm in action — it preserves the expected first-in, first-out ordering of the line. Although wait times vary, it delivers a predictable average latency to all shoppers.
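
Here's a minimal sketch of that deli queue (Python, with made-up service times): arrival order is preserved exactly, so each shopper's wait is bounded by the line ahead of them, and the average stays predictable.

```python
from collections import deque
import random

random.seed(42)

line = deque()                      # the ticket roll: strictly first in, first out
for ticket in range(1, 11):         # ten shoppers pull numbered tickets
    line.append(ticket)

clock, waits = 0.0, []
while line:
    line.popleft()                  # always serve the lowest outstanding number
    waits.append(clock)             # how long this shopper has been waiting
    clock += random.uniform(1, 3)   # service time varies shopper to shopper

print(f"average wait: {sum(waits) / len(waits):.1f} min, worst: {max(waits):.1f} min")
```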

Now compare that to when I buy a ticket for the lottery. I’m taking a big chance on a random-draw algorithm, which is quite unlikely to ever go my way. Winning is not only uncertain, but improbable. Still, for many folks, the purchase of a lottery ticket delivers a temporary emotional salve, so there is some economic utility — as you might have heard in Economics 101.
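
The economics are easy to make concrete. With invented odds and prizes (a hypothetical game, not any real lottery), the expected value of a ticket is just the probability-weighted payout, which sits far below the price:

```python
# Hypothetical lottery: the odds and prizes here are invented for illustration.
ticket_price = 2.00
prizes = {                    # prize amount: probability of winning it
    1_000_000: 1 / 10_000_000,
    1_000:     1 / 100_000,
    10:        1 / 1_000,
}
expected_payout = sum(amount * p for amount, p in prizes.items())
print(f"expected payout: ${expected_payout:.2f} on a ${ticket_price:.2f} ticket")
# expected payout: $0.12 on a $2.00 ticket -- a reliably losing proposition
```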

People respond well both to algorithms that guarantee certainty and to those built on arbitrary randomness, in the appropriate situations. But imagine flipping those scenarios. What if your deli served people in random order? With enough competing shoppers, you might never get your sliced bologna. What if the lottery simply paid everyone back their ticket price minus some administrative tax? Even though this would improve almost everyone's actual lottery return on investment, that kind of game would be no fun at all.
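
A toy simulation makes the flipped deli's unpleasantness visible (arbitrary parameters: one arrival and one service per time step). Both policies deliver the same average wait, but random selection leaves an unlucky few waiting many times longer than anyone ever waits under FIFO.

```python
import random

random.seed(1)

def simulate(random_pick, steps=5_000, primed=20):
    """One arrival and one service per time step; return every shopper's wait."""
    line = [0] * primed                # twenty shoppers already waiting at t=0
    waits = []
    for t in range(1, steps + 1):
        line.append(t)                 # a new shopper pulls a ticket at time t
        i = random.randrange(len(line)) if random_pick else 0
        waits.append(t - line.pop(i))  # wait = time served minus time arrived
    return waits

for random_pick in (False, True):
    w = simulate(random_pick)
    label = "random pick " if random_pick else "FIFO tickets"
    print(f"{label}: average wait {sum(w) / len(w):5.1f}, worst wait {max(w)}")
```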

Without getting deep into psychology or behavioral economics, it's clear there are appropriate and inappropriate uses of randomization. When we know we are taking a long-shot chance at a big upside, we might grumble if we lose. But our reactions are different when the department of motor vehicles closes after we've already spent four hours waiting.

Now imagine being subjected to opaque algorithms in various important facets of your life, as when applying for a mortgage, a car loan, a job or school admission. Many of the algorithms that govern your fate are seemingly arbitrary. Without transparency, it’s hard to know if any of them are actually fair, much less able to predict your individual prospects. (Consider the fairness concept the next time an airline randomly bumps you from a flight.)

Machine learning algorithms overview — machines learn what?

So let’s consider the supposedly smarter algorithms designed at some organizational level to be fair. Perhaps they’re based on some hard, rational logic leading to an unbiased and random draw, or more likely on some fancy but operationally opaque big data-based machine learning algorithm.

With machine learning, we hope things will be better, but they can also get much worse. In too many cases, poorly trained or designed machine learning algorithms end up making prejudicial decisions that can unfairly affect individuals.
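
A deliberately skewed toy model (invented data, scikit-learn for brevity) shows the mechanism: when historical decisions encode a bias and the sensitive attribute leaks into the features, the trained model faithfully reproduces that prejudice for new, equally qualified applicants.

```python
# A deliberately biased toy example (all names and data invented): a model
# trained on historically skewed loan decisions reproduces that skew.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, n)            # two demographic groups, 0 and 1
score = rng.normal(650, 50, n)           # creditworthiness, identical by group

# Historical labels: group 1 was approved far less often at the same score.
approved = score - 60 * group + rng.normal(0, 10, n) > 650

X = np.column_stack([score, group])      # the group label leaks into the features
model = LogisticRegression(max_iter=1000).fit(X, approved)

# Two equally qualified new applicants, differing only by group:
print(model.predict_proba([[680, 0], [680, 1]])[:, 1])
# group 0 sails through; group 1 is overwhelmingly rejected
```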

I’m not exaggerating when I predict that machine learning will touch every facet of human existence.

This is a growing — and significant — problem for all of us. Machine learning is influencing a lot of the important decisions made about us and is steering more and more of our economy. It has crept in behind the scenes as so-called secret sauce or as proprietary algorithms applied to key operations.

But with easy-to-use big data machine learning tools like Apache Spark, and with increasing streams of data from the internet of things wrapping all around us, I expect that every data-driven task will be optimized with machine learning in some important way…(read the complete as-published article there)
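
As a taste of why tools like Spark lower the bar, here is a hypothetical MLlib pipeline; the dataset path, sensor columns and label are all invented for this sketch.

```python
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("iot-failure-sketch").getOrCreate()

# Hypothetical IoT telemetry; the path and schema are invented for illustration.
df = spark.read.parquet("s3://example-bucket/telemetry/")

features = VectorAssembler(
    inputCols=["temp", "vibration", "load"],   # assumed sensor columns
    outputCol="features",
).transform(df)

# Train a failure predictor in one line; "failed" is an assumed label column.
model = LogisticRegression(labelCol="failed", featuresCol="features").fit(features)
model.write().overwrite().save("s3://example-bucket/models/failure-predictor")
```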

Visualizing (and Optimizing) Cluster Performance

(Excerpt from original post on the Taneja Group News Blog)

Clusters are the scale-out way to go in today’s data center. Why not architect an infrastructure that can grow linearly in capacity and/or performance? Well, one problem is that operations can get quite complex, especially when you start mixing workloads and tenants on the same cluster. In vanilla big data solutions, everyone competes, and not always fairly, for the same resources. This is a growing problem in production environments where big data apps are starting to underpin key business-impacting processes.
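
Here's a toy sketch of one remedy, weighted max-min fair sharing, the policy behind schedulers like YARN's Fair Scheduler; the tenant names and numbers are invented.

```python
def fair_share(total_slots, demands, weights):
    """Progressive filling: repeatedly grant one slot to the tenant whose
    weighted allocation is lowest, until capacity or demand runs out."""
    alloc = {t: 0 for t in demands}
    for _ in range(total_slots):
        active = [t for t in demands if alloc[t] < demands[t]]
        if not active:
            break
        neediest = min(active, key=lambda t: alloc[t] / weights[t])
        alloc[neediest] += 1
    return alloc

# Three tenants competing for 100 slots; "etl" is twice as important.
print(fair_share(100,
                 demands={"etl": 80, "adhoc": 50, "ml": 40},
                 weights={"etl": 2, "adhoc": 1, "ml": 1}))
# {'etl': 50, 'adhoc': 25, 'ml': 25} -- no tenant can starve the others
```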

…(read the full post)

Hooked with a Non-Linear Curve – VMTurbo’s Economic Approach

(Excerpt from original post on the Taneja Group News Blog)

As a long-time capacity planner, if you show me a non-linear curve with a real model behind it, I’ll tend to bite. Predictive analysis alone would have been enough to get my attention, but VMTurbo also talks about optimizing IT from an economics perspective. I spent a lot of years convincing and cajoling folks that capacity planning and infrastructure optimization are basically about investing your money effectively while ensuring the resulting system is efficiently utilized.

It is invigorating to see an experienced team (folks with a SMARTS heritage) approach virtualized IT environments as an economic system with calculable trade-offs and optimizable performance-cost curves. We are told this approach works both for real-time operational control and for forward planning exercises.
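
That non-linear curve is easy to demonstrate with a classic queueing model; this sketch uses toy M/M/1 numbers of my own, not anything from VMTurbo. Response time 1/(mu - lambda) bends sharply, so each capacity increment buys less latency than the last, and the economic question becomes the cheapest capacity that still meets the service-level target.

```python
# Toy capacity-planning sketch (illustrative numbers, not VMTurbo's model):
# M/M/1 mean response time R = 1/(mu - lambda) is non-linear in capacity mu.
ARRIVAL_RATE = 90          # requests/sec offered to the service
COST_PER_UNIT = 120        # dollars per unit of service capacity (made up)
SLA_SECONDS = 0.05         # required mean response time

def response_time(mu, lam=ARRIVAL_RATE):
    return float("inf") if mu <= lam else 1.0 / (mu - lam)

# Walk up the curve: diminishing latency returns for each capacity increment.
for mu in range(95, 151, 5):
    r = response_time(mu)
    flag = " <- cheapest capacity meeting the SLA" if r <= SLA_SECONDS else ""
    print(f"mu={mu:3d}  cost=${mu * COST_PER_UNIT:6d}  R={r:.3f}s{flag}")
    if flag:
        break
```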

It does leave me wondering whether virtualized applications and resources are fully rational economic agents. Lacking a true “view” of the physical world, they might obey a virtual kind of irrational “behavioral economics” (influenced, e.g., by memory ballooning, virtual clock cycles, virtualized I/O…).

In any case, it’s not too early to begin thinking about VMworld 2012, coming up in August. There is so much going on that one needs a hit list of booths to seek out first; high on my list this year is VMTurbo.

…(read the full post)