Internet of things data security proves vital in digitized world

An IT industry analyst article published by SearchITOperations.


Securing IoT data should become a priority as more companies manipulate the volumes produced by these devices. Seemingly innocuous information could allow privacy invasions.

Mike Matchett

The data privacy and access discussion gets all the more complicated in the age of IoT.

Some organizations might soon suffer from data paucity — getting locked, outbid or otherwise shut out of critical new data sources that could help optimize future business. While I believe that every data-driven organization should start planning today to avoid ending up data poor, this concern is just one of many potential data-related problems arising in our new big data, streaming, internet of things (IoT) world. In fact, issues with getting the right data will become so critical that I predict a new strategic data enablement discipline will emerge to not just manage and protect valuable data, but to ensure access to all the necessary — and valid — data the corporation might need to remain competitive.

In addition to avoiding debilitating data paucity, data enablement means IT will also need to manage and address key issues in internet of things data security, privacy and veracity. Deep discussions about the proper use of data in this era of analytics are filling books, and much remains undetermined. But IT needs to prepare for whatever data policies emerge in the next few years.

Piracy or privacy?

Many folks explore data privacy in depth, and I certainly don’t have immediate advice on how to best balance the personal, organizational or social benefits of data sharing, or where to draw a hard line on public versus private data. But if we look at privacy from the perspective of most organizations, the first requirements are to meet data security demands, specifically the regulatory and compliance laws defining the control of personal data. These would include medical history, salary and other HR data. Many commercial organizations, however, reserve the right to access, manage, use and share anything that winds up in their systems unless specifically protected — including any data stored or created by or about their employees.

If you are in the shipping business, using GPS and other sensor data from packages and trucks seems like fair game. After all, truck drivers know their employers are monitoring their progress and driving habits. But what happens when organizations track our interactions with IoT devices? Privacy concerns arise, and the threat of an internet of things security breach looms.

Many people are working hard to make GPS-style positioning work within buildings, ostensibly as a public service, using Wi-Fi access points and other devices to triangulate the position of handheld devices and thus locate people in real time, all the time, on detailed floor plans.
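
To make the mechanics concrete, here is a minimal sketch of how such indoor positioning could work: ranges are estimated from Wi-Fi signal strength with a log-distance path-loss model, then a least-squares fix is computed from three access points. All coordinates, RSSI readings and model constants below are illustrative assumptions, not values from any real deployment.

```python
# Minimal sketch of Wi-Fi trilateration for indoor positioning.
# AP coordinates, RSSI readings and path-loss constants are all
# illustrative assumptions, not from any real system.
import numpy as np

def rssi_to_distance(rssi_dbm, tx_power_dbm=-40.0, path_loss_exp=2.5):
    """Estimate range in meters from signal strength (log-distance model)."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exp))

def trilaterate(aps, dists):
    """Least-squares 2D position fix from three or more access points."""
    # Subtract the last AP's circle equation from the others to
    # linearize |x - p_i|^2 = d_i^2 into the system A x = b.
    A = 2.0 * (aps[:-1] - aps[-1])
    b = (dists[-1] ** 2 - dists[:-1] ** 2
         + np.sum(aps[:-1] ** 2, axis=1) - np.sum(aps[-1] ** 2))
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

aps = np.array([[0.0, 0.0], [30.0, 0.0], [0.0, 30.0]])  # known AP spots (m)
rssi = np.array([-68.7, -73.7, -73.7])                  # observed dBm
print(trilaterate(aps, rssi_to_distance(rssi)))         # approx. (10, 10)
```

With enough access points reporting signal strength continuously, the same few lines of math locate every badge and phone on the floor plan, which is exactly why the privacy questions below arise.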

In a shopping mall, this tracking detail would enable directed advertising and timely deals related to the store a shopper enters. Such data in a business setting could tell your employer who is next to whom and for how long, what you are looking at online, what calls you receive and so on. Should our casual friendships — not to mention casual flirting — bathroom breaks and vending machine selections be monitored this way? Yet the business can make the case that it should be able to analyze those associations in the event of a security breach — or adjust health plan rates if you have that candy bar. And once that data exists, it can be leaked or stolen…(read the complete as-published article there)

The New Big Thing in Big Data: Results From Our Apache Spark Survey

(Excerpt from original post on the Taneja Group News Blog)

In the last few months I've been really bullish on Apache Spark as a big enabler of wider big data solution adoption. Recently we got a great opportunity to conduct some deep Spark market research (with Cloudera's sponsorship) and were able to survey nearly seven thousand (6900+) highly qualified technical and managerial people working with big data around the world.
   
Some highlights: First, across a broad range of industries, company sizes and big data maturities, over half (54%) of respondents are already actively using Spark to solve a primary organizational use case. That's an incredible adoption rate, no doubt due to the many ways Spark makes big data analysis accessible to a much wider audience: not just PhDs, but anyone with a modicum of SQL and scripting skills.
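
As a hedged illustration of that accessibility claim, the sketch below shows the kind of SQL-plus-scripting analysis such a user might run in PySpark; the input path, dataset and column names are hypothetical placeholders, not from the survey.

```python
# Hypothetical PySpark session: a little scripting plus plain SQL.
# The input path and column names are invented for illustration.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("quick-analysis").getOrCreate()

# Load a (hypothetical) semi-structured event log and expose it to SQL.
events = spark.read.json("hdfs:///data/clickstream/")
events.createOrReplaceTempView("events")

# From here, plain SQL does the analytical work.
top_pages = spark.sql("""
    SELECT page, COUNT(*) AS hits
    FROM events
    GROUP BY page
    ORDER BY hits DESC
    LIMIT 10
""")
top_pages.show()
```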
   
When it comes to use cases, in addition to the expected Data Processing/Engineering/ETL use case (55%), we found high rates of forward-looking and analytically sophisticated use cases like Real-time Stream Processing (44%), Exploratory Data Science (33%) and Machine Learning (33%). Support for the more traditional Customer Intelligence (31%) and BI/DW (29%) use cases wasn't far behind. Since those shares sum to well over 100%, many organizations are clearly already applying Spark to more than one important type of use case at the same time, a good sign that Spark supports nuanced applications and offers some great efficiencies (sharing big data, converging analytical approaches).
 
Is Spark going to replace Hadoop and the Hadoop ecosystem of projects? A lot of folks run Spark on its own cluster, but we assess that this is mostly for performance and availability isolation. That is likely just a matter of platform maturity: it's likely that future schedulers (and/or something like Pepperdata) will solve the multi-tenancy QoS issues of running Spark alongside, and converged with, any and all other kinds of data processing solutions (e.g. NoSQL, Flink, search…).
 
Converged analytics are already the big trend in practice, with nearly half of current users (48%) saying they use Spark with HBase, and 41% with Kafka. Production big data solutions are actually pipelines of activities that span from data acquisition and ingest through full data processing and disposition. We believe that as Spark grows its organizational footprint out from initial data processing and ad hoc data science into advanced operational (i.e. data center) production applications, it will truly blossom when supported by the rest of the big data ecosystem's technologies.
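
For a concrete flavor of one such pipeline stage, here is a minimal sketch of Spark Structured Streaming reading from Kafka; the broker address, topic name and console sink are illustrative assumptions standing in for a real HBase or data lake destination.

```python
# Hedged sketch of one converged-pipeline stage: Spark ingesting a
# Kafka topic. Broker address and topic name are placeholders, and the
# spark-sql-kafka connector package must be on the classpath.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("kafka-ingest").getOrCreate()

stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "sensor-events")
          .load())

# Kafka delivers keys/values as binary; cast before downstream work.
decoded = stream.selectExpr("CAST(value AS STRING) AS payload")

query = (decoded.writeStream
         .format("console")      # stand-in for an HBase/data lake sink
         .outputMode("append")
         .start())
query.awaitTermination()
```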

…(read the full post)

Virtual Instruments Finally Gets NAS-ty

(Excerpt from original post on the Taneja Group News Blog)

When Virtual Instruments merged with/acquired Load Dynamix recently, we thought good things were going to happen. VI could now offer its users a full performance management “loop” of monitoring and testing in a common suite. Apparently VI’s clientele agreed, because the company has just finished a stellar first half of the year financially. Now, to sweeten the offer even more, VI is broadening its traditionally Fibre Channel/block-focused monitoring (historically rooted in its original FC SAN probes) to fully encompass NAS monitoring too.

…(read the full post)

Oracle ZS5 Throws Down a Cloud Ready Gauntlet

(Excerpt from original post on the Taneja Group News Blog)

Is anyone in storage really paying close enough attention to Oracle? I think too many people mistakenly dismiss Oracle’s infrastructure solutions as expensive, custom and proprietary Oracle-database-only hardware. But, surprise: Oracle has been successfully evolving the well-respected ZFS into a solid cloud-scale filer, today releasing the fifth version of its ZFS storage array, the Oracle ZS5. And perhaps most surprising, the ZS series powers Oracle’s own fast-growing cloud storage services (at huge scale: over 600 PB and growing).

…(read the full post)

Do ROI Calculators Produce Real ROI?

(Excerpt from original post on the Taneja Group News Blog)

As a former vendor product marketer and now an analyst, I’ve often been asked to help produce an “official” ROI (or full TCO) calculator for some product. I used to love pulling out Excel and chaining together pages of cascading formulas. But I’m getting older and wiser. Now I see that ROI calculators are by and large just big rat holes. In fact, I was asked again this week and, instead of quickly replying “yes, if you have enough money” and spinning out some rehashed spreadsheet (like some other IT analyst firms), I spent some time thinking about why the time and money spent producing detailed ROI calculators is usually a wasted investment, if not a wasted opportunity to do better.
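
For perspective, the arithmetic at the heart of any ROI calculator is trivial; it's the assumptions feeding it that fill all those cascading spreadsheet pages. A toy sketch with invented numbers:

```python
# The core ROI formula. Every input here is an invented assumption,
# which is exactly the point about elaborate calculators.
def simple_roi(gain, cost):
    """ROI as a fraction: (gain - cost) / cost."""
    return (gain - cost) / cost

print(f"{simple_roi(gain=150_000, cost=100_000):.0%}")  # prints 50%
```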

…(read the full post)

Hyperconverged Storage Evolves – Or is it Pivoting When it Comes to Pivot3?

(Excerpt from original post on the Taneja Group News Blog)

Pivot3 recently acquired NexGen (March 2016), and many folks have been wondering what the company is up to. Pivot3 has made a name for itself in the surveillance/video vertical with bulletproof hyperconvergence, built on highly reliable data protection (native erasure coding) and large scalability (no additional east/west traffic with scale). So what does the NexGen IP bring? For starters, multi-tier flash performance and enterprise storage features (like snapshots).

…(read the full post)

We Can No Longer Contain Containers!

(Excerpt from original post on the Taneja Group News Blog)

Despite naysayers (you know who you are!), I’ve been saying this is the year for containers, and halfway into 2016 it’s looking like I’m right. The container community is rapidly maturing enterprise-grade functionality, perhaps modeled on its virtualization predecessors. ContainerX is one of those interesting solutions that fills in a lot of gaps for enterprises looking to stand up containers in production. In fact, they claim to be the “vSphere for Containers”.

…(read the full post)

Unifying Big Data Through Virtualized Data Services – Iguaz.io Rewrites the Storage Stack

(Excerpt from original post on the Taneja Group News Blog)

One of the more interesting new companies to arrive on the big data storage scene is iguaz.io. The iguaz.io team has designed a whole new, purpose-built storage stack that can store and serve the same master data in multiple formats, at high performance and parallel streaming speeds, to multiple different kinds of big data applications. This promises to obliterate today’s spaghetti data flows, with their many moving parts, numerous transformation and copy steps, and the Frankenstein architectures currently required to stitch together increasingly complex big data workflows. We’ve seen that enterprises commonly need to build environments spanning from streaming ingest and real-time processing through interactive query and into larger data lake and historical archive analysis, and they end up making multiple data copies in multiple storage formats across multiple storage services.
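
To illustrate the status quo being criticized (not iguaz.io's own design, which the post doesn't detail), here is a hypothetical sketch of the copy-per-engine pattern; all paths and formats are invented examples.

```python
# Hypothetical status-quo pipeline: the same master data re-copied
# into a different storage format for each downstream engine.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("copy-sprawl").getOrCreate()

# Landing copy #1: raw semi-structured ingest.
master = spark.read.json("hdfs:///ingest/events/")

# Copy #2: columnar format for interactive query and BI tools.
master.write.mode("overwrite").parquet("hdfs:///warehouse/events/")

# Copy #3: row-oriented export for a separate archive service.
master.write.mode("overwrite").json("s3a://archive/events/")
```

Each extra copy adds a transformation step, a consistency question and a storage bill; serving one master copy in multiple formats is the alternative iguaz.io is promising.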

…(read the full post)