Cloudy Object Storage Mints Developer Currency

Having just been to VMworld 2017 and gotten pre-briefs ahead of the upcoming Strata Data NY, I’ve noticed a few hot trends aimed at helping foster cross-cloud, multi-cloud, and hybrid operations. Among these are

  1. A focus on centralized management for both on-prem and cloud infrastructure
  2. The accelerating pace of systems Management as a Service (MaaS!?) offerings
  3. Container management “distributions” with enterprise features
  4. And an increasing demand for fluid, cross-hybrid infrastructure storage

We all know about AWS S3, the current lingua franca of object storage for web-scale application development and the de facto REST-based “standard” for object storage APIs.  Only it’s not actually a standard. If you want to write apps for deployment anywhere, on any cloud, on an on-premises server, or even on your laptop, you might not appreciate being locked into coding directly against the ultimately proprietary S3 API (or waiting for your big enterprise storage solutions to support it).

Which is where the very cool object (“blob”) storage solution from Minio comes into play.  Minio is effectively a software-defined object storage layer that, on the back end, virtually transforms just about any underlying storage into a consistent, developer-friendly (and open source) object storage interface on the front end (still S3-compatible). This means you can code to Minio — and actually deploy Minio WITH your application if needed — onto any cloud, storage infrastructure, or local volume. And it adds all kinds of enterprise features, like advanced reliability with erasure coding, which will no doubt please those IT/devops folks who want to manage a consistent storage footprint no matter where the application deploys.

This might sound complicated at first, but I installed and ran Minio locally with just two commands in less than a minute.

$ brew install minio/stable/minio

$ minio server --address localhost:9500 /users/mike/minio
Minio Install and Launch

Done. We can immediately code against Minio’s universal REST API at that URL/port. And of course, point a browser at it and you get a nice browser-based interface; that alone is all you need if you just want something like a photo “dropbox” repository of your own.
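
Since the endpoint speaks the S3 API, any S3-compatible client or SDK should also work against it. Here is a minimal sketch using the AWS CLI, assuming it is installed and configured with the access and secret keys the minio server printed at startup (the bucket and file names below are just placeholders):

$ aws --endpoint-url http://localhost:9500 s3 mb s3://photos
$ aws --endpoint-url http://localhost:9500 s3 cp ./vacation.jpg s3://photos/
$ aws --endpoint-url http://localhost:9500 s3 ls s3://photos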

But wait! One more step gets you a lot of really useful functionality. Install the Minio Client (mc) and configure it with your Minio server keys. Now you have a powerful, consistent command line interface to manage and muck with both object and file stores.

$ brew install minio/stable/mc

# This line (and keys) is provided as output by the minio server start above

$ mc config host add myminio http://localhost:9500 <ACCESS KEY> <SECRET KEY>
Minio Client

The “mc” interface supersets a whole bunch of familiar UNIX command-line utilities.  It’s likely that many of your existing administrative scripts can be trivially updated to work across hybrid clouds and across file and object stores alike (see the example after the help listing below).

$ mc

NAME:
  mc - Minio Client for cloud storage and filesystems.

USAGE:
  mc [FLAGS] COMMAND [COMMAND FLAGS | -h] [ARGUMENTS...]

COMMANDS:
  ls       List files and folders.
  mb       Make a bucket or a folder.
  cat      Display file and object contents.
  pipe     Redirect STDIN to an object or file or STDOUT.
  share    Generate URL for sharing.
  cp       Copy files and objects.
  mirror   Mirror buckets and folders.
  diff     Show differences between two folders or buckets.
  rm       Remove files and objects.
  events   Manage object notifications.
  watch    Watch for files and objects events.
  policy   Manage anonymous access to objects.
  admin    Manage Minio servers
  session  Manage saved sessions for cp and mirror commands.
  config   Manage mc configuration file.
  update   Check for a new software update.
  version  Print version info.
Minio Client Help Text
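
For example, everyday bucket and file chores look a lot like their UNIX cousins (the bucket and path names here are just placeholders):

$ mc mb myminio/backups                        # make a bucket on the local minio server
$ mc cp ~/Documents/notes.txt myminio/backups/
$ mc ls myminio/backups
$ mc cat myminio/backups/notes.txt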

One can easily see that Minio delivers a really robust object storage service, one that insulates both developers and resource admins (devops) from the underlying infrastructure and architecture.

Minio can be pointed at, or deployed on, many different cloud providers to enable easier cross-cloud and multi-cloud migration. And it’s locally efficient. For example, when in AWS it uses S3 as its backing store, and when on Azure it uses native Azure Blob storage services.
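
Minio has been adding dedicated “gateway” modes for exactly this. As a rough sketch of the Azure case at the time of writing (check minio gateway --help for what your version actually supports; the values here stand in for a real Azure storage account name and key):

$ export MINIO_ACCESS_KEY=<AZURE STORAGE ACCOUNT NAME>
$ export MINIO_SECRET_KEY=<AZURE STORAGE ACCOUNT KEY>
$ minio gateway azure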

Folks looking at taking advantage of hybridizing operations with something like VMware on AWS might also want to deploy Minio over both their local storage solutions (SANs, NASes, even vSAN, et al.) AND on the cloud side (AWS S3). This solves a big challenge when transitioning to hybrid ops: getting consistent, enterprise-grade storage services everywhere.
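
A minimal sketch of that with mc, assuming the myminio alias from above plus an alias added for AWS S3 (the bucket names are placeholders):

$ mc config host add awss3 https://s3.amazonaws.com <AWS ACCESS KEY> <AWS SECRET KEY>
$ mc mirror myminio/appdata awss3/appdata-dr       # keep a cloud-side copy in sync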

There is a lot more to be said about Minio’s advanced feature set and exciting roadmap, but you can read all about those on the Minio website!  Have fun!

 

I’m Going Fission

I just spent a couple of weeks in Boston at Red Hat Summit and OpenStack Summit.  Containers are clearly the big thing this year – Kubernetes, OpenShift, etc. And increasingly, IT is learning how to take advantage of remote Management as a Service (MaaS) offerings that free up folks to focus more on business value and less on running complex stacks. On that front I talked with folks like Platform9, who also happen to sponsor a “serverless” computing solution called Fission (later in this blog post I’ll show how I got Fission deployed on my Mac).

Because I’m an industry analyst (in my day job), here is a big picture of the evolution happening in application infrastructure: Physically hosted apps (server and O/S) –> Virtual machines (in a hypervisor) –> Cloud platforms (e.g. OpenStack) –> Container “ships” (e.g. OpenShift, Docker, Kubernetes) –> Serverless Computing (e.g. AWS Lambda and Fission).

Applications have always been constructed out of multiple tiers and communicating parts, but generally we are moving towards a world in which functionality is both defined and deployed (distributable, scalable) in small, testable bits (i.e. “units” as in unit testing), while an application “blueprint” defines all the related bits and required service properties in operation.  Some folks are calling the blueprinting part “infrastructure as code”.

(BTW – the next evolutionary step is probably some kind of highly intelligent, dynamic IoT/Big Data/Distributed engine that inherently analyzes and distributes compute functionality out as far as it can go towards the IoT edge while centralizing data only as much as required. Kind of like a database query planner on IoT-size steroids).

So, on to my Mac deployment of Fission. I’ve already got VirtualBox installed for running Hadoop cluster sandboxes and other fun projects, but OpenStack is probably not something I really need or want to run on my own Mac (although apparently I could if I wanted more agility in spinning big data clusters up and down). But, ah ha!, a mental lightbulb goes on! (Or rather, an LED goes on; gotta save power these days.)

This Fission project means I can now run my own lambda services on my little desktop Mac too, and then easily deploy really cool stuff to really big clouds someday when I create that killer app (with lambdas that happily interface with other coolness like Spark, Neo4j, Ruby on Rails…).  Ok, this is definitely something I want to play with.  And I’m thinking, wait for it:  Ruby lambdas!  (Ruby is not dead, you fools! You’ll all eventually see why Ruby is the one language that will be used in the darkness to bind them all!)

Well, we’ll come back to Ruby later.  First things first – we’ll start with the default node.js example. Let’s aim for a local nested stack that will run like this:

osx (-> virtualbox (-> minikube (-> fission (-> node.js))))

host server – hypervisor – container cluster – lambda services – execution environment

While the lambda execution will be nested, the CLI commands to interface with minikube/kubernetes (kubectl) and fission (fission) will be available locally at the osx command line (in a terminal window).

Ok, I’ve already got VirtualBox, but it’s out of date for minikube. So I directly download the latest off the web and install. Oops, first issue! Mac OSX now has a fancy SIP security layer that prevents anyone from actually getting anything done as root (I swear, if they keep making my unix-based Mac work like iOS I’m gonna convert to Ubuntu!). So after working around security to get that update in place (and thank you, Oracle, for VirtualBox) we are moving on!

$ virtualbox
Oh, and make sure to also have kubectl installed locally. The local kubectl will be configured to talk to the minikube Kubernetes environment running inside VirtualBox (we’ll verify that below once minikube is up).
$ curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/v1.6.0/bin/darwin/amd64/kubectl && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
$ kubectl version

For the minikube install I used brew, which of course I had to update first. And of course, I had to again work around the Mac OSX SIP challenge above (hopefully this is a one-time fix) by temporarily setting /usr/local’s ownership to myself (then back to root:wheel after the dust settled).

$ brew update
$ brew cask install minikube
$ minikube start 
# minikube stop
# minikube service [-n NAMESPACE] [--url] NAME
$ minikube ip
$ minikube dashboard
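
A quick way to confirm that the local kubectl really is wired up to the minikube cluster:

$ kubectl config current-context      # should report "minikube"
$ kubectl get nodes                   # should list the single minikube node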

At this point you can deploy containerized apps with kubectl into the minikube “cluster”.  This next bit is an example of a simple “echo” server taken from the minikube GitHub docs.

$ kubectl run hello-minikube --image=gcr.io/google_containers/echoserver:1.4 --port=8080
$ kubectl expose deployment hello-minikube --type=NodePort
$ kubectl get pod
$ curl $(minikube service hello-minikube --url)

(If you are following along, you might suggest that I should play here with minishift too, but now is not yet the time! Maybe I’ll climb into that PaaS arena in another post.)

Now it’s time for Fission. These next snippets are taken from the Fission GitHub README. The first curl gets the fission command line client installed locally. The kubectl lines start up the fission services. The two shell variables are just for convenience in the provided examples, and not part of the required install.

$ curl http://fission.io/mac/fission > fission && chmod +x fission && sudo mv fission /usr/local/bin/

$ kubectl create -f http://fission.io/fission.yaml
$ kubectl create -f http://fission.io/fission-nodeport.yaml

$ export FISSION_URL=http://$(minikube ip):31313
$ export FISSION_ROUTER=$(minikube ip):31314    # used by the examples below
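
Before creating functions, it’s worth waiting for the fission pods to come up. Assuming the manifests above create a fission namespace (check kubectl get namespaces if yours differs):

$ kubectl get pods --namespace fission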

Ok, so now we have our own lambda services host running. Next we can start deploying lambda functions. Fission does a number of things, like scaling out our services and keeping a few containers warm for fast startup, and probably a bunch of stuff I won’t figure out until some O’Reilly book comes out (oh, I could just read the code…).

$ fission env create --name nodejs --image fission/node-env
$ curl https://raw.githubusercontent.com/fission/fission/master/examples/nodejs/hello.js > hello.js

$ fission function create --name hello --env nodejs --code hello.js
$ fission route create --method GET --url /hello --function hello

First, we create a fission environment, associating a fission environment container image with the name “nodejs”. Then we create a fission function that registers our hello.js lambda “code” into that environment. Here we are using JavaScript and node.js, but there are other execution environments available (and we can make our own!). We also then need to map a web service route to our fission function.


module.exports = async function(context) {
    return {
        status: 200,
        body: "Hello, World!\n"
    };
}
hello.js

You can see that a Fission lambda function is just a JavaScript function. In this case all it does is return a standard HTTP response.

$ curl http://$FISSION_ROUTER/hello
 ->  Hello, World!

Testing it out – we hit the URL with a GET request and tada!  Hello World!

This is quite an onion we’ve built, but hopefully you can appreciate that each layer adds to an architecture that will enable easy deployment at large scale and wide distribution down the road. Next up though, I personally want Ruby lambdas!

I could build a full native ruby fission environment (it should be easy enough to start with an existing Red Hat or Docker ruby container). There is a python fission example that wouldn’t be hard to emulate. I’d have to decide on key gems to pre-load, and that leads to a big question about what I’d actually like to do and how big and fat that environment might get (it could end up slow and bloated). Or we could try to stay very small: there have been small embeddable Rubys like mruby (although that one looks dead since 2015). There is also some interesting advice out there for building minimal ruby app containers.
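
If I do go the native route, registering the environment should look just like the node.js case above once a suitable container image exists. A sketch with purely hypothetical image and file names:

$ eval $(minikube docker-env)                  # build against minikube's docker so the image is visible there
$ docker build -t mike/fission-ruby-env .      # a Dockerfile implementing fission's environment contract
$ fission env create --name ruby --image mike/fission-ruby-env
$ fission function create --name hello-rb --env ruby --code hello.rb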

While not actually ruby, CoffeeScript, which transpiles ruby-like code to JavaScript, seems the easiest route at the moment, and it just uses the vanilla fission node.js environment we already have above. I could also see embedding “coffee” in a fission environment easily enough so that I could send coffeescript code directly to fission (although that would require transpiling on every lambda execution; it’s always a trade-off). To get started with coffee, add it to your local node.js environment (install Node first if you don’t already have that).

$ npm install -g coffee-script
$ coffee

Using coffee is easy enough. Learning it might take a bit of study, but if you like ruby and only suffer when forced to work with native JavaScript, it’s well worth it.
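
As a sketch of that workflow (the function returns a promise so the transpiled result behaves like the async node.js example above; the function name and route here are made up):

$ cat > hello.coffee <<'EOF'
module.exports = (context) ->
  Promise.resolve
    status: 200
    body: "Hello from CoffeeScript!\n"
EOF
$ coffee -c hello.coffee        # transpiles to hello.js
$ fission function create --name hello-coffee --env nodejs --code hello.js
$ fission route create --method GET --url /hello-coffee --function hello-coffee
$ curl http://$FISSION_ROUTER/hello-coffee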

But CoffeeScript is not ruby.  Something like Opal (transpiling full ruby syntax to js) is an even more interesting project, and if it were ever solid enough it could be implemented here with fission in a number of ways: possibly embedded in a dedicated Opal ruby fission environment, applied statically upstream of a node.js fission environment as with CoffeeScript, or even used dynamically as a wrapper around ruby code sent to the node.js environment.

Another idea is to build a small native ruby fission solution with something like a nested ruby Sinatra design: first create a local “super-fission-sinatra” DSL that would deploy Sinatra-like web service definition code to an embedded ruby/Sinatra fission environment. Kind of meta-meta maybe, but it could be an interesting way to build scalable, instrumented APIs.

All right – that’s enough for now. Time to play! Let me know if you create any Ruby Fission examples!

Playing with Neo4j version 3.0

I’ve been playing again with Neo4j now that v3 is out, and hacking through some ruby scripts to load some interesting data I have lying around (e.g. the database for this website, which I’m mainly modeling as “(posts)<-(tags); (posts:articles)<-(publisher)”).

For ruby hacking in the past I’ve used the Neology gem, but now I’m trying out the Neo4jrb set of gems. And though I think an OGM (object-graph mapper) is where it’s at (the next Rails app I build will no doubt use some graph db), I’m starting with just neo4j-core to get a handle on graph concepts and Cypher.

One thing that stumped me for a bit: with the latest version of these gems (maybe now that they support multiple Neo4j sessions), I found it helped to add a “default: true” parameter to the session “open” call to keep everything downstream working at the neo4j-core level. Otherwise Node and other neo4j-core classes seemed to lose the current session and give a weird error (depending on scope?).  Or maybe I just kept clobbering my session context somehow. Anyway, it doesn’t seem to hurt.

require 'neo4j-core'
@_neo_session = nil
def neo_session
  @_neo_session ||= Neo4j::Session.open(:server_db,
    'http://user:password@localhost:7474',
    default: true)
end
#...
neo_session
Neo4j::Node.create({title: "title"}, :Blog)
#...
Neo4j-core Session

The Neo4j v3 Mac OSX “desktop install” has removed terminal neo4j-shell access in favor of the updated, slick browser interface. The updated browser interface is pretty good, but for some things I’d still really like a terminal window command shell.  Maybe I’m just getting old :)… If you still want the neo4j shell, apparently you can instead install the Linux tarball version (but then you don’t get the browser client?). I’m not sure why product managers make either-or packaging decisions like this. It’s not as if the shell was deprecated (e.g. to save much dev time or testing effort).

Anyway, things look pretty cool in the browser interface, and playing with Cypher is straightforward as you can change between table, text, and graph views of results with just a click.

I’ve also been wanting to play with Gephi more, so I’m exporting data from Neo as .csv files (the Gephi community neo4j importer plugin isn’t yet updated for Gephi v0.9), using Cypher statements like these and the browser interface’s download button.

// for the node table export -> Gephi import
MATCH (n) RETURN ID(n) AS Id, LABELS(n) AS Label, n.title AS Title, n.url AS URL, toString(n.date) AS Date, n.name AS Name, n.publisher AS Publisher

// for the edge table export -> Gephi import
MATCH (l)-[rel]->(r) RETURN ID(rel) AS Id, TYPE(rel) AS Label, ID(l) AS Source, ID(r) AS Target
Cypher Queries for Importing into Gephi

Hyperconverged Supercomputers For the Enterprise Data Center

(Excerpt from original post on the Taneja Group News Blog)

Last month NVIDIA, our favorite GPU vendor, dived into the converged appliance space. In fact we might call their new NVIDIA DGX-1 a hyperconverged supercomputer in a 4U box. Designed to support the application of GPUs to Deep Learning (i.e. compute-intensive, deeply layered neural networks that need to train and run in operational timeframes over big data), this beast has 8 new Tesla P100 GPUs inside on an embedded NVLink mesh, pre-integrated with flash SSDs, decent memory, and an optimized container-hosting deep learning software stack. The best part? The price is surprisingly affordable, and it can replace the 250+ server cluster you might otherwise need for effective Deep Learning.

…(read the full post)

Server Powered Storage: Intelligent Storage Arrays Gain Server Superpowers

An IT industry analyst article published by Infostor.


At Taneja Group we are seeing a major trend within IT to leverage server and server-side resources to the maximum extent possible. Servers themselves have become commodities, and dense memory, server-side flash, even compute power continue to become increasingly powerful and cost-friendly. Many datacenters already have a glut of CPU that will only increase with newer generations of faster, larger-cored chips, denser packaging and decreasing power requirements. Disparate solutions from in-memory databases (e.g. SAP HANA) to VMware’s NSX are taking advantage of this rich excess by separating out and moving functionality that used to reside in external devices (i.e. SANs and switches) up onto the server.

Within storage we see two hot trends – hyperconvergence and software defined – getting most of the attention lately. But when we peel back the hype, we find that both are really enabled by this vastly increasing server power – in particular server resources like CPU, memory and flash are getting denser, cheaper and more powerful to the point where they are capable of hosting sophisticated storage processing capabilities directly. Where traditional arrays built on fully centralized, fully shared hardware might struggle with advanced storage functions at scale, server-side storage tends to scale functionality naturally with co-hosted application workloads. The move towards “server-siding” everything is so talked about that it seems inevitable that traditional physical array architectures are doomed.

…(read the complete as-published article there)

Get the most from cloud-based storage services

An IT industry analyst article published by SearchStorage.


We have been hearing about the inevitable transition to the cloud for IT infrastructure since before the turn of the century. But, year after year, storage shops quickly become focused on only that year’s prioritized initiatives, which tend to be mostly about keeping the lights on and costs low. A true vision-led shift to cloud-based storage services requires explicit executive sponsorship from the business side of an organization. But unless you cynically count the creeping use of shadow IT as an actual strategic directive to do better as an internal service provider, what gets asked of you is likely — and unfortunately — to perform only low-risk tactical deployments or incremental upgrades.

Not exactly the stuff of business transformations.

Cloud adoption at a level for maximum business impact requires big executive commitment. That amount of commitment is, quite frankly, not easy to generate.

…(read the complete as-published article there)

Diamanti Reveals Hyperconverged Scale-out Appliances for Containers

(Excerpt from original post on the Taneja Group News Blog)

Diamanti (known pre-launch as Datawise.io) has recently rolled out their brand new hyperconverged “container” appliances. Why would containers, supposedly able to be fluidly hosted just about anywhere, need a specially built host? Kubernetes et al. might take care of CPU allotment, but there are still big obstacles for naked containers in a production data center, especially as containers are now being lined up to host far more than simple stateless microservices.  Their real-world storage and networking needs have to be matched, aligned, and managed, or the whole efficiency opportunity can easily be lost.

…(read the full post)

CI and disaggregated server tech can converge after all

An IT industry analyst article published by SearchDataCenter.


I’ve talked about the inevitability of infrastructure convergence, so it might seem like I’m doing a complete 180 degree turn by introducing the opposite trend: infrastructure disaggregation. Despite appearances, disaggregated server technology isn’t really the opposite of convergence. In fact, disaggregated and converged servers work together.

In this new trend, physical IT components come in larger and denser pools for maximum cost efficiency. At the same time, compute-intensive functionality, such as data protection, that was once tightly integrated with the hardware is pulled out and hosted separately to optimize performance and use cheaper components.

Consider today’s cloud architects building hyper-scale infrastructures; instead of buying monolithic building blocks, they choose to pool massive amounts of dense commodity resources.

…(read the complete as-published article there)

Evaluating hyper-converged architectures: Five key CIO considerations

An IT industry analyst article published by SearchCio.


Plain IT convergence offers IT organizations a major convenience — integrated and pre-assembled stacks of heterogeneous vendor infrastructure, including servers, storage and networking gear, that help accelerate new deployments and quickly support fast-growing applications.

But IT hyper-convergence goes farther to integrate IT infrastructure into simple modular appliances. Where pre-converged racks of infrastructure can provide good value to enterprises that would otherwise buy and assemble component vendor equipment themselves, hyper-converged architectures present a larger opportunity to not only simplify IT infrastructure and save on capital expenditures (CAPEX) but also help transform IT staff from internally focused legacy data center operators into increasingly agile, business-facing service providers.

With hyper-converged architectures, IT organizations can shift focus towards helping accelerate and enable business operations and applications, because they don’t spend as much time on, for example, silo troubleshooting, stack integration and testing, and traditional data protection tasks. The decision to adopt hyper-converged architectures is therefore something that business folks will see and appreciate directly through increased IT agility, cloud-like IT services, realistic BC/DR, and a greatly improved IT cost basis.

…(read the complete as-published article there)

Big Data Enterprise Maturity

(Excerpt from original post on the Taneja Group News Blog)

It’s time to look at big data again. Last week I was at Cloudera’s growing and vibrant annual analyst event to hear the latest from the folks who know what’s what. Then this week Strata (the conference for data scientists) brings lots of public big data vendor announcements. A noticeable shift this year is less focus on how to apply big data and more on maturing enterprise features intended to ease wider data-center-level adoption. A good example is the “mixed big data workload QoS” cluster optimization solution from Pepperdata.

…(read the full post)