(Excerpt from original post on the Taneja Group News Blog)
Welcome my son, welcome to the machine
Where have you been? -Pink Floyd
…(read the full post)
With the rise of software-defined storage, in which storage services are implemented as a software layer, the whole idea of data storage is being re-imagined. And as compute and storage increasingly converge, the distinction between a storage platform and a data-processing platform continues to erode.
Storage takes new forms
Let’s look at a few of the ways that storage is driving into new territory:
The changing fundamentals
Powering many of these examples are interesting shifts in underlying technical capabilities. New data processing platforms are handling more metadata per unit of data than ever before. More metadata leads to new, highly efficient ways to innovate …(read the complete as-published article there)
The data privacy and access discussion gets all the more complicated in the age of IoT.
Some organizations might soon suffer from data paucity — getting locked out of, outbid for or otherwise shut out of critical new data sources that could help optimize future business. While I believe every data-driven organization should start planning today to avoid ending up data poor, this concern is just one of many potential data-related problems arising in our new big data, streaming, internet of things (IoT) world. In fact, issues with getting the right data will become so critical that I predict a new strategic data enablement discipline will emerge, one that not only manages and protects valuable data but also ensures access to all the necessary, and valid, data the corporation might need to remain competitive.
In addition to avoiding debilitating data paucity, data enablement will also require IT to manage and address key issues in internet of things data security, privacy and veracity. Deep discussions about the proper use of data in this era of analytics are filling books, and much remains undetermined. But IT needs to prepare for whatever data policies emerge in the next few years.
Piracy or privacy?
Many folks explore data privacy in depth, and I certainly don’t have immediate advice on how to best balance the personal, organizational or social benefits of data sharing, or where to draw a hard line on public versus private data. But if we look at privacy from the perspective of most organizations, the first requirements are to meet data security demands, specifically the regulatory and compliance laws defining the control of personal data. These would include medical history, salary and other HR data. Many commercial organizations, however, reserve the right to access, manage, use and share anything that winds up in their systems unless specifically protected — including any data stored or created by or about their employees.
If you are in the shipping business, using GPS and other sensor data from packages and trucks seems like fair game. After all, truck drivers know their employers are monitoring their progress and driving habits. But what happens when organizations track our interactions with IoT devices? Privacy concerns arise, and the threat of an internet of things security breach looms.
Many people are working hard to bring GPS-style positioning indoors, ostensibly as a public service, using Wi-Fi access points and other devices to triangulate the position of handheld devices and thus locate people in real time, all the time, on detailed blueprints.
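To make the triangulation idea concrete, here is a minimal sketch. All positions and distances below are invented for illustration; a real indoor-positioning system would first have to estimate each distance from noisy Wi-Fi signal-strength readings, which is the genuinely hard part.

```python
# Minimal trilateration sketch: locate a device from its distances to three
# fixed Wi-Fi access points with known (x, y) positions. Distances are given
# directly here; real systems must estimate them from noisy RSSI readings.

def trilaterate(anchors, distances):
    """Solve the 2x2 linear system obtained by subtracting the first
    circle equation from the other two (standard linearization)."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    # 2*(xi - x1)*x + 2*(yi - y1)*y = d1^2 - di^2 + xi^2 - x1^2 + yi^2 - y1^2
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a1 * b2 - a2 * b1  # zero if the three anchors are collinear
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# Access points at three corners of a room; the device is actually at (3, 4).
aps = [(0, 0), (10, 0), (0, 10)]
dists = [5.0, 65 ** 0.5, 45 ** 0.5]
print(trilaterate(aps, dists))  # approximately (3.0, 4.0)
```

With more than three access points, the same linearization becomes an overdetermined system solved by least squares, which is how noisy real-world readings are usually averaged out.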
In a shopping mall, this tracking detail would enable directed advertising and timely deals related to the store a shopper enters. Such data in a business setting could tell your employer who is next to whom and for how long, what you are looking at online, what calls you receive and so on. Should our casual friendships — not to mention casual flirting — bathroom breaks and vending machine selections be monitored this way? Yet the business can make the case that it should be able to analyze those associations in the event of a security breach — or adjust health plan rates if you have that candy bar. And once that data exists, it can be leaked or stolen…(read the complete as-published article there)
Big data is being created everywhere we look, and we are all thinking about how to take advantage of it. I certainly want to come up with some novel big data application and become fabulously wealthy just for the idea. The thing is, most companies — perhaps all — can profit from big data today just by accelerating or refining some piece of their current business, provided they can identify and corral the right information at the right time and place.
There is no need to find a new earth-shattering application to get started. I believe a significant big data payback is right in front of any marketing, sales, production or customer-engagement team. One simply needs to find a way to unlock the buried big data treasure. And, of course, that’s where big data concerns from practical to theoretical bubble to the surface.
A big sticking point has been finding data science expertise, especially experts who can build optimized machine learning models tailored to your exact business needs. But we are seeing some interesting recent efforts to automate and, in some ways, commoditize big data handling and complicated machine learning. These big data automation technologies enable the regular Java Joe or Josie programmer to drop big data analytics into existing, operationally focused, day-to-day business applications.
Not only does this have the democratizing effect of unlocking big data value for non-data scientists, but it also highlights the trend toward a new application style. In the next three to five years, we will see most business applications that we’ve long categorized as transactional converge with what we’ve implemented separately as analytical applications. Put simply, with big data power, “business intelligence” is becoming fast enough and automated enough to deliver inside the operational business process in active business timeframes.
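As a rough illustration of this converged style, consider an ordinary order-handling function that consults a pre-trained model inline, rather than waiting for an offline report. Everything here — the field names, the model weights, the 0.5 cutoff — is invented for illustration; in practice the weights would come from an offline training job.

```python
import math

# Hypothetical pre-trained churn model. The weights are invented for
# illustration; a real model would be trained offline and loaded here.
WEIGHTS = {"days_since_last_order": 0.08, "support_tickets": 0.45, "bias": -2.0}

def churn_score(customer):
    """Logistic score in [0, 1]; higher means more likely to churn."""
    z = WEIGHTS["bias"]
    z += WEIGHTS["days_since_last_order"] * customer["days_since_last_order"]
    z += WEIGHTS["support_tickets"] * customer["support_tickets"]
    return 1.0 / (1.0 + math.exp(-z))

def process_order(order, customer):
    """Transactional path with an analytic decision embedded inline."""
    result = {"order_id": order["id"], "status": "accepted"}
    # Analytics inside the operational flow: attach a retention offer
    # while the transaction is handled, not in next week's batch report.
    if churn_score(customer) > 0.5:
        result["retention_offer"] = "10% discount on next order"
    return result

at_risk = {"days_since_last_order": 40, "support_tickets": 3}
print(process_order({"id": 1}, at_risk))
```

The point of the sketch is architectural: the scoring call sits in the same request path as the transaction, so the "business intelligence" decision lands in active business timeframes rather than in a separate analytical system.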
As these data processing worlds collide, they will reveal big data concerns for IT staff, and for those making decisions on IT infrastructure and data centers. Storage, databases and even networks will all need to adapt. Along with the rise of the internet of things (IoT), hybrid cloud architectures, persistent memory and containers, 2017 is going to be a pivotal year for challenging long-held assumptions and changing IT directions.
While I will undoubtedly focus a lot of time and energy as an industry analyst on these fast-evolving topics in the near term, there is a longer-term big data concern: Some companies might not be able to take advantage of this democratization of data simply because they can’t get access to the data they need.
We’ve heard warnings about how hard it is to manage big data as important data. We need to think about how we can ensure it’s reliable, how we can maintain and ensure privacy — and regulatory compliance — how we can ensure we only implement ethical and moral big data algorithms and so on. But before all that, you first need access to the data — assuming it exists or can be created — that is valuable to your company. I call this the data paucity problem — there’s too little big data in use.
As an example, I don’t believe every IoT device manufacturer will end up getting unfettered access to the data streams generated by their own things, much less to the ecosystem of data surrounding their things in the field. I think it is inevitable that some will be getting locked out of their own data flowback…(read the complete as-published article there)
Cheaper and denser CPUs are driving smarter built-in intelligence into each layer of the data storage infrastructure stack.
Take storage, for example. Excess compute power can be harnessed to deploy agile software-defined storage (e.g., Hewlett Packard Enterprise StoreVirtual), transition to hyper-converged architectures (e.g., HyperGrid, Nutanix, Pivot3, SimpliVity), or optimize I/O by smartly redistributing storage functionality between application servers and disk hosts (e.g., Datrium).
There is a downside to all this built-in intelligence, however. It can diminish the visibility we might otherwise have between our data storage infrastructure and, well, changes — any IT change, really, whether due to intentional patching and upgrades, expanding usage and users, or complex bugs and component failures. Or, to put it another way, native, dynamic optimization enabled by powerful and inexpensive processors is making it increasingly difficult for us humans to figure out what’s going on with our infrastructures.
So while it’s really great when we don’t need to know any details, and can simply rely on low-level components to always do the right thing, until there is an absolutely autonomous data center — and, no, today’s public cloud computing doesn’t do away with the need for internal experts — IT may find baked-in intelligence a double-edged sword. Furthermore, while smarter data storage infrastructure helps us with provisioning, optimization, growth plans and troubleshooting, it can blind or fool us and actively work against our best efforts to bend infrastructure to our “will.”
Still, in spite of all these potential negatives, given the choice, I’d rather live in a smarter and more autonomous IT world than not (even if there is some risk of runaway AI). I’ll explain.
It’s all about the data
Remember when analysis used to be an offline process? Capture some data in a file; open Excel, SAS or other desktop tool; and weeks later receive a recommendation. Today, that kind of analysis latency is entirely too long and naïve.
Given the speed and agility of today's applications and users, not to mention bigger data streams and minute-by-minute elastic cloud brokering, we need insight and answers faster than ever. This kind of intelligence starts with plentiful, reliable data, which today's infrastructures produce more of every day (in fact, we'll soon be drowning in new data thanks to the internet of things), and a way to process and manage all that information.
Storage arrays, for example, have long produced insightful data, but historically they required vendor-specific, complex and expensive storage resource management applications to make good use of it. Fortunately, several developments today are helping us become smarter about IT systems management and better (and faster) users of the data our infrastructures generate: …(read the complete as-published article there)
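One simple example of using infrastructure-generated data directly, rather than through a heavyweight management suite: flag latency samples that jump well above their recent baseline. The telemetry values and thresholds below are invented for illustration; real monitoring pipelines use far more robust statistics, but the shape of the analysis is the same.

```python
from statistics import mean, stdev

def flag_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples more than `threshold` standard
    deviations above the mean of the preceding `window` samples."""
    flagged = []
    for i in range(window, len(samples)):
        base = samples[i - window:i]
        mu, sigma = mean(base), stdev(base)
        if sigma > 0 and samples[i] > mu + threshold * sigma:
            flagged.append(i)
    return flagged

# Invented per-minute read-latency samples (ms); one obvious spike.
latency_ms = [1.1, 1.0, 1.2, 1.1, 1.0, 1.2, 9.5, 1.1, 1.0]
print(flag_anomalies(latency_ms))  # → [6]
```

Even a baseline check this crude turns a raw telemetry stream into an actionable signal, which is the direction the developments above are pushing storage management.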