Cloudy Object Storage Mints Developer Currency

Having just been to VMworld 2017 and now getting pre-briefs for the upcoming Strata Data NY, I’ve noticed a few hot trends aimed at helping foster cross-cloud, multi-cloud and hybrid operations. Among these are

  1. A focus on centralized management for both on-prem and cloud infrastructure
  2. The accelerating pace of systems Management as a Service (MaaS!?) offerings
  3. Container management “distributions” with enterprise features
  4. And an increasing demand for fluid, cross-hybrid infrastructure storage

We all know about AWS S3, the current lingua franca of object storage for web-scale application development and the de facto REST-based “standard” for object storage APIs. Only it’s not actually a standard. If you want to write apps for deployment anywhere, on any cloud or on-premise server or even your laptop, you might not appreciate being locked into coding directly to the ultimately proprietary S3 API (or waiting for your big enterprise storage solutions to support it).

Which is where the very cool object (“blob”) storage solution from Minio comes into play. Minio is effectively a software-defined object storage layer that, on the back end, virtually transforms just about any underlying storage into a consistent, developer-friendly (and open source) front-end object storage interface that is still S3-compatible. This means you can code to Minio — and actually deploy Minio WITH your application if needed — onto any cloud, storage infrastructure, or local volume. And it adds all kinds of enterprise features, such as advanced reliability with erasure coding, which will no doubt please those IT/DevOps folks who want to manage a consistent storage footprint no matter where the application deploys.

At first this might all sound complicated, but I installed and ran Minio locally with just two commands in less than a minute.

$ brew install minio/stable/minio

$ minio server --address localhost:9500 /users/mike/minio
Minio Install and Launch

Done. We can immediately code against Minio’s universal REST API-based calls at that URL/port. And of course, point a browser at it and you get a nice browser-based interface – this is all you need if you just want something like a photo “dropbox” repository of your own.
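In fact, any S3-compatible client or SDK should be able to talk to that endpoint. As a quick, hedged sketch (the bucket and file names here are just made up, and the keys are whatever the minio server printed at startup), the standard AWS CLI works if you override its endpoint:

$ aws configure    # enter the access/secret keys printed by the minio server start
$ aws --endpoint-url http://localhost:9500 s3 mb s3://photos
$ aws --endpoint-url http://localhost:9500 s3 cp ~/Pictures/vacation.jpg s3://photos/
$ aws --endpoint-url http://localhost:9500 s3 ls s3://photos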

But wait! One more step gets you a lot of really useful functionality. Install the Minio Client (mc) and configure it with your Minio server’s access and secret keys. Now you have a powerful, consistent command-line interface to manage and muck with both object and file stores.

$ brew install minio/stable/mc

# This line (and keys) is provided as output by the minio server start above

$ mc config host add myminio http://localhost:9500 <ACCESS KEY> <SECRET KEY>
Minio Client

The “mc” interface supersets a whole bunch of familiar UNIX command-line utilities. It’s likely many of your existing administrative scripts can be trivially updated to work across hybrid clouds and across both file and object stores (see the quick examples after the help text below).

$ mc

NAME:
  mc - Minio Client for cloud storage and filesystems.

USAGE:
  mc [FLAGS] COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  ls       List files and folders.
  mb       Make a bucket or a folder.
  cat      Display file and object contents.
  pipe     Redirect STDIN to an object or file or STDOUT.
  share    Generate URL for sharing.
  cp       Copy files and objects.
  mirror   Mirror buckets and folders.
  diff     Show differences between two folders or buckets.
  rm       Remove files and objects.
  events   Manage object notifications.
  watch    Watch for files and objects events.
  policy   Manage anonymous access to objects.
  admin    Manage Minio servers
  session  Manage saved sessions for cp and mirror commands.
  config   Manage mc configuration file.
  update   Check for a new software update.
  version  Print version info.
Minio Client Help Text
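To give a feel for it, here are a few hedged examples of everyday mc usage against the local server configured above (the bucket and file names are just made up for illustration):

$ mc mb myminio/photos                              # make a new bucket
$ mc cp ~/Pictures/vacation.jpg myminio/photos/     # copy a local file in as an object
$ mc ls myminio/photos                              # list objects, just like ls
$ mc cat myminio/photos/vacation.jpg > copy.jpg     # stream an object back out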

One can easily see that Minio delivers a nice, really robust object storage service that insulates both developers and resource admins (DevOps) from the underlying infrastructure and architecture.

Minio can be pointed at or deployed on many different cloud providers to enable easier cross-cloud and multi-cloud migration. And it’s locally efficient: when in AWS it uses S3, and when on Azure it uses native Azure Blob storage services.
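As a hedged illustration of that local efficiency (based on the gateway mode Minio documents; details may differ by version), the same S3-style front end can run directly on top of Azure Blob storage, with the placeholder account name and key below standing in for real Azure credentials:

$ export MINIO_ACCESS_KEY=azurestorageaccountname    # placeholder Azure storage account name
$ export MINIO_SECRET_KEY=azurestorageaccountkey     # placeholder Azure storage account key
$ minio gateway azure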

Folks looking at taking advantage of hybridizing operations with something like VMware on AWS might also want to deploy Minio over both their local storage solutions (SANs, NASs, even vSAN et al.) AND on the cloud side (AWS S3). This solves a big challenge when transitioning to hybrid ops – getting consistent, enterprise-grade storage services everywhere.
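One simple, hedged sketch of that hybrid pattern: register both the on-prem Minio endpoint and AWS S3 itself as mc hosts, then mirror a bucket between them (the host aliases, endpoint and bucket names below are placeholders):

$ mc config host add onprem http://minio.mycorp.local:9000 <ACCESS KEY> <SECRET KEY>
$ mc config host add awss3 https://s3.amazonaws.com <ACCESS KEY> <SECRET KEY>
$ mc mirror onprem/backups awss3/mycorp-dr-backups    # keep a cloud-side copy in sync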

There is a lot more to be said about Minio’s advanced feature set and exciting roadmap, but you can read all about those on the Minio website!  Have fun!

 

What’s a Multi-cloud Really?  Some Insider Notes from VMworld 2017

(Excerpt from original post on the Taneja Group News Blog)

As comfortable 65-70 degree weather blankets New England here as we near the end of summer, flying into Las Vegas for VMworld at 110 degrees seemed like dropping into hell. Last time I was in that kind of heat I was stepping off a C-130 into the Desert Shield/Desert Storm theater of operations. At least here, as everyone still able to breathe immediately says, “at least it’s a dry heat.”

…(read the full post)

Persistent data storage in containerized environments

An IT industry analyst article published by SearchStorage.


The most significant challenge to the rise of containerized applications is quickly and easily providing enterprise-class persistent storage for containers.

Mike Matchett

The pace of change in IT is staggering. Fast growing data, cloud-scale processing and millions of new internet of things devices are driving us to find more efficient, reliable and scalable ways to keep up. Traditional application architectures are reaching their limits, and we’re scrambling to evaluate the best new approaches for development and deployment. Fortunately, the hottest prospect — containerization — promises to address many, if not all, of these otherwise overwhelming challenges.

In containerized application design, each individual container hosts an isolatable, and separately scalable, processing component of a larger application web of containers. Unlike monolithic application processes of the past, large, containerized applications can consist of hundreds, if not thousands, of related containers. The apps support Agile design, development and deployment methodologies. They can scale readily in production and are ideally suited for hosting in distributed, and even hybrid, cloud infrastructure.

Unfortunately, containers weren’t originally designed to implement full-stack applications or really any application that requires persistent data storage. The original idea for containers was to make it easy to create and deploy stateless microservice application layers on a large scale. Think of microservices as a form of highly agile middleware with conceptually no persistent data storage requirements to worry about.

Persistence in persisting

Because the container approach has delivered great agility, scalability, efficiency and cloud-readiness, and is lower-cost in many cases, people now want to use it for far more than microservices. Container architectures provide such a better way to build modern applications that we see many commercial software and systems vendors transitioning internal development to container form and even deploying them widely, often without explicit end-user or IT awareness. It’s a good bet that most Fortune 1000 companies already host third-party production IT applications in containers, especially inside appliances, converged approaches and purpose-built infrastructure.


You might find large, containerized databases and even storage systems. Still, designing enterprise persistent storage for these applications is a challenge, as containers can come and go and migrate across distributed and hybrid infrastructure. Because data needs to be mastered, protected, regulated and governed, persistent data storage acts in many ways like an anchor, holding containers down and threatening to reduce many of their benefits.

Container architectures need three types of storage…(read the complete as-published article there)

A serverless architecture could live in your data center

An IT industry analyst article published by SearchITOperations.


Just because you don’t see the server doesn’t mean it’s not there. Serverless frameworks are superseding containers, but is the extra abstraction worth it?

Mike Matchett

Have you figured out everything you need to know about managing and operating container environments already? How to host them in your production data centers at scale? Transform all your legacy apps into containerized versions? Train your developers to do agile DevOps, and turn your IT admins into cloud brokers? Not quite yet?

I hate to tell you, but the IT world is already moving past containers. Now you need to look at the next big thing: serverless computing.

I don’t know who thought it was a good idea to label this latest application architecture trend serverless computing. Code is useless, after all, unless it runs on a computer. There has to be a server in there somewhere. I guess the idea was to imply that when you submit application functionality for execution without caring about servers, it feels completely serverless.

In cloud infrastructure as a service, you don’t have to own or manage your own physical infrastructure. With cloud serverless architecture, you also don’t have to care about virtual machines, operating systems or even containers.

Serving more through serverless architecture?

So what is serverless computing? It’s a service in which a programmer can write relatively contained bits of code and then directly deploy them as standalone, function-sized microservices. You can easily set up these microservices to execute on a serverless computing framework, triggering or scheduling them by policy in response to supported events or API calls.

A serverless framework is designed to scale well with inherently stateless microservices — unlike today’s containers, which can host stateful computing as well as stateless code. You might use serverless functions to tackle applications that need highly elastic, event-driven execution or when you create a pipeline of arbitrary functionality to transform raw input into polished output. This event-pipeline concept meshes well with expected processing needs related to the internet of things. It could also prove useful with applications running in a real-time data stream.

A well-known public cloud example of serverless computing is Amazon Web Services’ Lambda. The Lambda name no doubt refers to the anonymous lambda functions used extensively in functional programming. In languages such as JavaScript or Ruby, a function can be a first-class object, defined as a closure over some code within a prescribed variable scope. Some languages have actual lambda operators that a programmer can use to dynamically create new function objects at runtime (e.g., as other code executes).

So with a serverless framework, where does the actual infrastructure come into the picture? It’s still there, just under multiple layers of abstraction. Talk about software-defined computing. With this latest evolution into serverless computing, we now have perhaps several million lines of system- and platform-defining code between application code and hardware. It’s a good thing Moore’s Law hasn’t totally quit on us…(read the complete as-published article there)

Spark speeds up adoption of big data clusters and clouds

An IT industry analyst article published by SearchITOperations.


Infrastructure that supports big data comes from both the cloud and clusters. Enterprises can mix and match these seven infrastructure choices to meet their needs.

Mike Matchett

If enterprise IT has been slow to support big data analytics in production for the decade-old Hadoop, there has been a much faster ramp-up now that Spark is part of the overall package. After all, doing the same old business intelligence approach with broader, bigger data (with MapReduce) isn’t exciting, but producing operational time predictive intelligence that guides and optimizes business with machine precision is a competitive must-have.

With traditional business intelligence (BI), an analyst studies a lot of data and makes some hypotheses and a conclusion to form a recommendation. Using the many big data machine learning techniques supported by Spark’s MLlib, a company’s big data can dynamically drive operational-speed optimizations. Massive in-memory machine learning algorithms enable businesses to immediately recognize and act on inherent patterns in even big streaming data.

But the commoditization of machine learning itself isn’t the only new driver here. A decade ago, IT needed to either stand up a “baby” high-performance computing cluster for serious machine learning or learn to write low-level distributed parallel algorithms to run on the commodity-based Hadoop MapReduce platform. Either option required both data science expertise and exceptionally talented IT admins who could stand up and support massive physical scale-out clusters in production. Today there are many infrastructure options for big data clusters that can help IT deploy and support big data-driven applications.

Here are seven types of big data infrastructures for IT to consider, each with core strengths and differences:…(read the complete as-published article there)

Machine learning algorithms make life easier — until they don’t

An IT industry analyst article published by SearchITOperations.


Algorithms govern many facets of our lives. But imperfect logic and data sets can make results worse instead of better, so it behooves all of us to think like data scientists.

Mike Matchett

Algorithms control our lives in many and increasingly mysterious ways. While machine learning algorithms change IT, you might be surprised at the algorithms at work in your nondigital life as well.

When I pull a little numbered ticket at the local deli counter, I know with some certainty that I’ll eventually get served. That’s a queuing algorithm in action — it preserves the expected first-in, first-out ordering of the line. Although wait times vary, it delivers a predictable average latency to all shoppers.

Now compare that to when I buy a ticket for the lottery. I’m taking a big chance on a random-draw algorithm, which is quite unlikely to ever go my way. Winning is not only uncertain, but improbable. Still, for many folks, the purchase of a lottery ticket delivers a temporary emotional salve, so there is some economic utility — as you might have heard in Economics 101.

People can respond well to algorithms that have guaranteed certainty and those with arbitrary randomness in the appropriate situations. But imagine flipping those scenarios. What if your deli only randomly selected people to serve? With enough competing shoppers, you might never get your sliced bologna. What if the lottery just ended up paying everyone back their ticket price minus some administrative tax? Even though this would improve almost everyone’s actual lottery return on investment, that kind of game would be no fun at all.

Without getting deep into psychology or behavioral economics, there are clearly appropriate and inappropriate uses of randomization. When we know we are taking a long-shot chance at a big upside, we might grumble if we lose. But our reactions are different when the department of motor vehicles closes after we’ve already spent four hours waiting.

Now imagine being subjected to opaque algorithms in various important facets of your life, as when applying for a mortgage, a car loan, a job or school admission. Many of the algorithms that govern your fate are seemingly arbitrary. Without transparency, it’s hard to know if any of them are actually fair, much less able to predict your individual prospects. (Consider the fairness concept the next time an airline randomly bumps you from a flight.)
Machine learning algorithms overview — machines learn what?

So let’s consider the supposedly smarter algorithms designed at some organizational level to be fair. Perhaps they’re based on some hard, rational logic leading to an unbiased and random draw, or more likely on some fancy but operationally opaque big data-based machine learning algorithm.

With machine learning, we hope things will be better, but they can also get much worse. In too many cases, poorly trained or designed machine learning algorithms end up making prejudicial decisions that can unfairly affect individuals.

I’m not exaggerating when I predict that machine learning will touch every facet of human existence.

This is a growing — and significant — problem for all of us. Machine learning is influencing a lot of the important decisions made about us and is steering more and more of our economy. It has crept in behind the scenes as so-called secret sauce or as proprietary algorithms applied to key operations.

But with easy-to-use big data, machine learning tools like Apache Spark and the increasing streams of data from the internet of things wrapping all around us, I expect that every data-driven task will be optimized with machine learning in some important way…(read the complete as-published article there)

I’m Going Fission

I just spent a couple of weeks in Boston at Red Hat Summit and OpenStack Summit. Containers are clearly the big thing this year – Kubernetes, OpenShift, etc. And increasingly, IT is learning how to take advantage of remote Management as a Service (MaaS) offerings that free up folks to focus more on business value and less on running complex stacks. On that front I talked with folks like Platform9, who also happen to sponsor a “serverless” computing solution called Fission (later in this blog post I’ll show how I got Fission deployed on my Mac).

Because I’m an industry analyst (in my day job), here is a big picture of the evolution happening in application infrastructure: Physically hosted apps (server and O/S) –> Virtual machines (in a hypervisor) –> Cloud platforms (e.g. OpenStack) –> Container “ships” (e.g. OpenShift, Docker, Kubernetes) –> Serverless Computing (e.g. AWS Lambda and Fission).

Applications have always been constructed out of multiple tiers and communicating parts, but generally we are moving towards a world in which functionality is both defined and deployed (distributable, scalable) in small, testable bits (i.e. “units” as in unit testing), while an application “blueprint” defines all the related bits and required service properties in operation.  Some folks are calling the blueprinting part “infrastructure as code”.

(BTW – the next evolutionary step is probably some kind of highly intelligent, dynamic IoT/Big Data/Distributed engine that inherently analyzes and distributes compute functionality out as far as it can go towards the IoT edge while centralizing data only as much as required. Kind of like a database query planner on IoT-size steroids).

So, onto my Mac deployment of Fission. I’ve already got VirtualBox installed for running Hadoop cluster sandboxes and other fun projects, but OpenStack is probably not something I really need or want to run on my own Mac (although apparently I could if I wanted more agility in spinning up and down big data clusters). But – Ah ha! – now a mental lightbulb goes on! (or rather, an LED went on – gotta save power these days).

This Fission project means I can run my own lambda services now too on my little desktop Mac, and then easily deploy really cool stuff to really big clouds when someday I create that killer app (with lambdas that happily interface with other coolness like Spark, Neo4j, Ruby on Rails…).  Ok, this is definitely something I want to play with.  And I’m thinking, wait for it –  Ruby lambdas!  (Ruby is not dead, you fools! You’ll all eventually see why Ruby is the one language that will be used in the darkness to bind them all!)

Well, we’ll come back to Ruby later.  First things first – we’ll start with the default node.js example. Let’s aim for a local nested stack that will run like this:

osx (-> virtualbox (-> minikube (-> fission (-> node.js))))

host server – hypervisor – container cluster – lambda services – execution environment

While the lambda execution will be nested, the CLI commands to interface with minikube/kubernetes (kubectl) and fission (fission) will be available locally at the osx command line (in a terminal window).

Ok, I’ve already got VirtualBox, but it’s out of date for minikube. So I directly download the latest off the web and install – oops, first issue! Mac OS X now has some fancy SIP security layer that prevents anyone from actually getting anything done as root (I swear if they keep making my Unix-based Mac work like iOS I’m gonna convert to Ubuntu!). So after working around security to get that update in place (and thank you, Oracle, for VirtualBox) we are moving on!

$ virtualbox
Oh, and make sure to also have kubectl installed locally. The local kubectl will automatically get its context pointed at the minikube kubernetes environment that will be running inside virtualbox.
$ curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/darwin/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
$ kubectl version

For the minikube install I used brew, which of course I had to update first. And of course, I had to again work around the Mac OS X SIP challenge above (hopefully this is a one-time fix) by setting /usr/local directory ownership to myself (then back to root:wheel after the dust settled).

$ brew update
$ brew cask install minikube
$ minikube start 
# minikube stop
# minikube service [-n NAMESPACE] [--url] NAME
$ minikube ip
$ minikube dashboard

At this point you can deploy containerized apps with kubectl into the minikube “cluster”. This next bit is an example of a simple “echo” server from the minikube GitHub repo.

$ kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --port=8080
$ kubectl expose deployment hello-minikube --type=NodePort
$ kubectl get pod
$ curl $(minikube service hello-minikube --url)

(If you are following along, you might suggest that I should play here with minishift too, but now is not yet the time! Maybe I’ll climb into that PaaS arena in another post.)

Now it’s time for Fission. These next snippets are taken from the Fission GitHub readme page. The first curl gets the fission client command line installed locally. The kubectl lines start the fission services up. The two shell variables are just for convenience of the provided example, and not part of the required install.

$ curl http://fission.io/mac/fission > fission && chmod +x fission && sudo mv fission /usr/local/bin/

$ kubectl create -f http://fission.io/fission.yaml
$ kubectl create -f http://fission.io/fission-nodeport.yaml

$ export FISSION_URL=http://$(minikube ip):31313
$ export FISSION_ROUTER=$(minikube ip):31314    (for these examples)

Ok, so now we have our own lambda services host running. Next we can start deploying lambda functions. Fission does a number of things, like scaling out our services and keeping a few containers ready for fast startup, and probably a bunch of stuff I won’t figure out until some O’Reilly book comes out (oh, I could just read the code…).

$ fission env create --name nodejs --image fission/node-env
$ curl https://raw.githubusercontent.com/fission/fission/master/examples/nodejs/hello.js > hello.js

$ fission function create --name hello --env nodejs --code hello.js
$ fission route create --method GET --url /hello --function hello

First, we create a fission environment, associating a fission environment container image with the name “nodejs”. Then we create a fission function by loading our lambda hello.js “code” into that fission environment. Here we are using JavaScript and node.js, but there are other execution environments available (and we can make our own!). We also then need to map a web services route to our fission function.


module.exports = async function(context) {
    return {
        status: 200,
        body: "Hello, World!\n"
    };
}
hello.js

You can see that a Fission lambda function is just a javascript function. In this case all it does is return a standard HTTP response.

$ curl http://$FISSION_ROUTER/hello
 ->  Hello, World!

Testing it out – we hit the URL with a GET request and tada!  Hello World!
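If you’re curious how Fission keeps startup latency low, you can peek under the covers with kubectl. The namespace names below are what the stock fission.yaml created for me and might differ in your install:

$ fission function list
$ kubectl get pods --namespace fission             # fission control-plane services
$ kubectl get pods --namespace fission-function    # pool of pre-warmed environment pods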

This is quite an onion we’ve built, but you hopefully can appreciate that each layer is adding to the architecture that would enable easy deployment at large scale and wide distribution down the road. Next up though, I personally want Ruby lambdas!

I could build a full native Ruby fission environment (it should be easy enough to start with an existing Red Hat or Docker Ruby container). There is a Python fission example that wouldn’t be hard to emulate. I’d have to decide on key gems to pre-load, and that leads to a big question about what I’d actually like to do and how big and fat that environment might get (which could be slow and bloated). Or we could try to stay very small – there have been small embeddable Rubys like mruby (although that one looks dormant since 2015). There is also some interesting advice out there for building minimal Ruby app containers.

While not actually Ruby, transpiling Ruby-like CoffeeScript code to JavaScript seems the easiest route at the moment, and it just uses the vanilla fission node.js environment we already have above. I could also see embedding “coffee” in a fission environment easily enough so that I could send CoffeeScript code directly to fission (although that would require transpiling on every lambda execution – it’s always a trade-off). To get started with coffee, add it to your local node.js environment (install Node first if you don’t already have it).

$ npm install -g coffee-script
$ coffee

Using coffee is easy enough. Learning it might take a bit of study, although if you like Ruby and only suffer when forced to work with native JavaScript, it’s well worth it.
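As a hedged sketch (I haven’t tested this against every Fission node env version, and the function and route names are just made up), here’s what a CoffeeScript version of our hello lambda could look like, transpiled locally and then deployed with the exact same fission commands as before:

$ cat > hello.coffee <<'EOF'
# CoffeeScript take on hello.js; returns a plain {status, body} object
module.exports = (context) ->
  status: 200
  body: "Hello, CoffeeScript!\n"
EOF
$ coffee -c hello.coffee    # transpiles to hello.js next to the source
$ fission function create --name hellocoffee --env nodejs --code hello.js
$ fission route create --method GET --url /hellocoffee --function hellocoffee
$ curl http://$FISSION_ROUTER/hellocoffee

(If your node environment insists on the async/promise form shown earlier, wrapping the returned object in Promise.resolve should do the trick.)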

But CoffeeScript is not ruby.  Something like Opal (transpiling full ruby syntax to js) is an even more interesting project, and if it was ever solid it could be implemented here with fission in a number of ways – possibly embedding it in a unique Opal ruby fission environment, statically applying it upstream from a node.js fission environment like with CoffeeScript, or even using it dynamically as a wrapper with ruby code sent to the node.js environment.

Another idea is to build a small native Ruby fission solution with something like a nested Ruby Sinatra design. First create a local “super-fission-sinatra” DSL that would deploy Sinatra-like web service definition code to an embedded Ruby/Sinatra fission environment. Kind of meta-meta maybe, but maybe an interesting way to build scalable, instrumented APIs.

All right – that’s enough for now. Time to play! Let me know if you create any Ruby Fission examples!

Open Wide and Say Ahh!

(Excerpt from original post on the Taneja Group News Blog)

I’ve been immersed in “Open” for the last two weeks here in Boston, attending both Red Hat Summit 2017 and then OpenStack Summit. There are quite a few things worth paying attention to, especially if you are an enterprise IT shop still wondering how your inevitable cloud (and services) transformation is really going to play out, including accelerating application migration to containers and the rise of platform Management as a Service.

…(read the full post)

Storage technologies evolve toward a data-processing platform

An IT industry analyst article published by SearchDataCenter.


Emerging technologies such as containers, HCI and big data have blurred the lines between compute and storage platforms, breaking down traditional IT silos.

Mike Matchett

With the rise of software-defined storage, in which storage services are implemented as a software layer, the whole idea of data storage is being re-imagined. And with the resulting increase in the convergence of compute with storage, the difference between a storage platform and a data-processing platform is further eroding.

Storage takes new forms

Let’s look at a few of the ways that storage is driving into new territory:

  • Now in containers! Almost all new storage operating systems, at least under the hood, are being written as containerized applications. In fact, we’ve heard rumors that some traditional storage systems are being converted to containerized form. This has a couple of important implications, including the ability to better handle massive scale-out, increased availability, cloud-deployment friendliness and easier support for converging computation within the storage.
  • Merged and converged. Hyper-convergence bakes software-defined storage into convenient, modular appliance units of infrastructure. Hyper-converged infrastructure products, such as those from Hewlett Packard Enterprise’s SimpliVity and Nutanix, can greatly reduce storage overhead and help build hybrid clouds. We also see innovative approaches merging storage and compute in new ways, using server-side flash (e.g., Datrium), rack-scale infrastructure pooling (e.g., Drivescale) or even integrating ARM processors on each disk drive (e.g., Igneous).
  • Bigger is better. If the rise of big data has taught us anything, it’s that keeping more data around is a prerequisite for having the opportunity to mine value from that data. Big data distributions today combine Hadoop and Spark ecosystems, various flavors of databases and scale-out system management into increasingly general-purpose data-processing platforms, all powered by underlying big data storage tools (e.g., Hadoop Distributed File System, Kudu, Alluxio).
  • Always faster. If big is good, big and fast are even better. We are seeing new kinds of automatically tiered and cached big data storage and data access layer products designed around creating integrated data pipelines. Many of these tools are really converged big data platforms built for analyzing big and streaming data at internet of things (IoT) scales.

The changing fundamentals

Powering many of these examples are interesting shifts in underlying technical capabilities. New data processing platforms are handling more metadata per unit of data than ever before. More metadata leads to new, highly efficient ways to innovate …(read the complete as-published article there)