Playing with Neo4j version 3.0

I’ve been playing again with Neo4j now that v3 is out, and hacking through some Ruby scripts to load some interesting data I have lying around (e.g. the database for this website, which I’m mainly modeling as “(posts)<-(tags); (posts:articles)<-(publisher)”).

For Ruby hacking in the past I’ve used the Neology gem, but now I’m trying out the Neo4jrb set of gems. And though I think an OGM is where it’s at (the next Rails app I build will no doubt use some graph db), I’m starting with just neo4j-core to get a handle on graph concepts and Cypher.

One thing that stumped me for a bit: with the latest version of these gems – maybe now that they support multiple Neo4j sessions – I found it helped to add a “default: true” parameter to the session “open” call to keep everything downstream working at the neo4j-core level. Otherwise Node and the other neo4j-core classes seemed to lose the current session and give a weird error (depending on scope?). Or maybe I just kept clobbering my session context somehow. Anyway, it doesn’t seem to hurt.

require 'neo4j-core'

# Memoize a single server session; "default: true" keeps Neo4j::Node and
# friends pointed at this session downstream.
@_neo_session = nil
def neo_session
  @_neo_session ||= Neo4j::Session.open(:server_db,
    'http://user:password@localhost:7474',
    default: true)
end
#...
neo_session
Neo4j::Node.create({title: "title"}, :Blog)
#...
Neo4j-core Session
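
With a default session in place, loading the posts/tags/publisher model I sketched above goes something like this. It’s only a rough sketch – the labels, property names, and relationship types are just the shape I’ve settled on for this site’s data, and create_rel is the neo4j-core call I’m using to wire up relationships:

post = Neo4j::Node.create({title: 'Playing with Neo4j version 3.0',
                           date: '2016-06-07'}, :Blog, :Post)
tag  = Neo4j::Node.create({name: 'neo4j'}, :Tag)
tag.create_rel(:TAGS, post)                        # (posts)<-(tags)

article   = Neo4j::Node.create({title: 'some published article',
                                url: 'http://example.com/article'}, :Post, :Article)
publisher = Neo4j::Node.create({name: 'some publisher'}, :Publisher)
publisher.create_rel(:PUBLISHED, article)          # (posts:articles)<-(publisher)
Loading the Blog Graph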

The Neo4j v3 Mac OS X “desktop install” has removed terminal neo4j-shell access in favor of the updated, slick browser interface. The browser interface is pretty good, but for some things I’d still really like a terminal command shell to play with. Maybe I’m just getting old :)… If you still want the neo4j shell, apparently you can instead install the Linux tarball version (but then you don’t get the browser client?). I’m not sure why product managers make either-or packaging decisions like this. It’s not as if the shell was deprecated (e.g. to save much dev time or testing effort).

Anyway, things look pretty cool in the browser interface, and playing with Cypher is straightforward as you can change between table, text, and graph views of results with just a click.

I’ve also been wanting to play with Gephi more, so I’m exporting data from Neo4j (using .csv files, though, as the Gephi community Neo4j importer plugin isn’t yet updated for Gephi v0.9) with Cypher statements like these and the browser interface download button.

// for the node table export -> Gephi import
MATCH (n) RETURN ID(n) AS Id, LABELS(n) AS Label, n.title AS Title, n.url AS URL, toString(n.date) AS Date, n.name AS Name, n.publisher AS Publisher

// for the edge table export -> Gephi import
MATCH (l)-[rel]->(r) RETURN ID(rel) AS Id, TYPE(rel) AS Label, ID(l) AS Source, ID(r) AS Target
Cypher Queries for Importing into Gephi
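
If you’d rather script the export than use the download button, something like this works from Ruby with neo4j-core – assuming session.query accepts a raw Cypher string and returns rows that respond to the aliased column names, which is the behavior I’m seeing (the query API has moved around a bit between neo4j-core versions, so treat this as a sketch):

require 'csv'

# lower-case aliases so the returned rows get friendly Ruby accessors
node_cypher = 'MATCH (n) RETURN ID(n) AS id, LABELS(n) AS label, ' \
              'n.title AS title, n.url AS url'

CSV.open('nodes.csv', 'w') do |csv|
  csv << %w[Id Label Title URL]                    # Gephi node table header
  neo_session.query(node_cypher).each do |row|
    csv << [row.id, Array(row.label).join(';'), row.title, row.url]
  end
end
Scripting the Node Table Export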

Hyperconverged Supercomputers For the Enterprise Data Center

(Excerpt from original post on the Taneja Group News Blog)

Last month NVIDIA, our favorite GPU vendor, dived into the converged appliance space. In fact, we might call their new NVIDIA DGX-1 a hyperconverged supercomputer in a 4U box. Designed to support the application of GPUs to Deep Learning (i.e. compute-intensive, deeply layered neural networks that need to train and run in operational timeframes over big data), this beast has 8 new Tesla P100 GPUs inside on an embedded NVLink mesh, pre-integrated with flash SSDs, decent memory, and an optimized container-hosting deep learning software stack. The best part? The price is surprisingly affordable, and it can replace the 250+ server cluster you might otherwise need for effective Deep Learning.

…(read the full post)

Server Powered Storage: Intelligent Storage Arrays Gain Server Superpowers

An IT industry analyst article published by Infostor.


At Taneja Group we are seeing a major trend within IT to leverage server and server-side resources to the maximum extent possible. Servers themselves have become commodities, and dense memory, server-side flash, and even compute power continue to become increasingly powerful and cost-friendly. Many datacenters already have a glut of CPU that will only increase with newer generations of faster, larger-cored chips, denser packaging and decreasing power requirements. Disparate solutions from in-memory databases (e.g. SAP HANA) to VMware’s NSX are taking advantage of this rich excess by separating out functionality that used to reside in external devices (i.e. SANs and switches) and moving it up onto the server.

Within storage we see two hot trends – hyperconvergence and software defined – getting most of the attention lately. But when we peel back the hype, we find that both are really enabled by this vastly increasing server power – in particular server resources like CPU, memory and flash are getting denser, cheaper and more powerful to the point where they are capable of hosting sophisticated storage processing capabilities directly. Where traditional arrays built on fully centralized, fully shared hardware might struggle with advanced storage functions at scale, server-side storage tends to scale functionality naturally with co-hosted application workloads. The move towards “server-siding” everything is so talked about that it seems inevitable that traditional physical array architectures are doomed.

…(read the complete as-published article there)

Get the most from cloud-based storage services

An IT industry analyst article published by SearchStorage.


We have been hearing about the inevitable transition to the cloud for IT infrastructure since before the turn of the century. But, year after year, storage shops quickly become focused on only that year’s prioritized initiatives, which tend to be mostly about keeping the lights on and costs low. A true vision-led shift to cloud-based storage services requires explicit executive sponsorship from the business side of an organization. But unless you cynically count the creeping use of shadow IT as an actual strategic directive to do better as an internal service provider, what gets asked of you is likely — and unfortunately — limited to low-risk tactical deployments or incremental upgrades.

Not exactly the stuff of business transformations.

Cloud adoption at a level for maximum business impact requires big executive commitment. That amount of commitment is, quite frankly, not easy to generate.

…(read the complete as-published article there)

Diamanti Reveals Hyperconverged Scale-out Appliances for Containers

(Excerpt from original post on the Taneja Group News Blog)

Diamanti (pre-launch known as Datawise.io) has recently rolled out their brand new hyperconverged “container” appliances. Why would containers, supposedly able to be fluidly hosted just about anywhere, need a specially built host? Kubernetes et al. might take care of CPU allotment, but there are still big obstacles for naked containers in a production data center, especially as containers are now being lined up to host far more than simple stateless micro-services. Now their real-world storage and networking needs have to be matched, aligned, and managed, or the whole efficiency opportunity can easily be lost.

…(read the full post)

CI and disaggregated server tech can converge after all

An IT industry analyst article published by SearchDataCenter.


I’ve talked about the inevitability of infrastructure convergence, so it might seem like I’m doing a complete 180-degree turn by introducing the opposite trend: infrastructure disaggregation. Despite appearances, disaggregated server technology isn’t really the opposite of convergence. In fact, disaggregated and converged servers work together.

In this new trend, physical IT components come in larger and denser pools for maximum cost efficiency. At the same time, compute-intensive functionality, such as data protection, that was once tightly integrated with the hardware is pulled out and hosted separately to optimize performance and use cheaper components.

Consider today’s cloud architects building hyper-scale infrastructures; instead of buying monolithic building blocks, they choose to pool massive amounts of dense commodity resources.

…(read the complete as-published article there)

Evaluating hyper-converged architectures: Five key CIO considerations

An IT industry analyst article published by SearchCio.


Plain IT convergence offers IT organizations a major convenience — integrated and pre-assembled stacks of heterogeneous vendor infrastructure, including servers, storage and networking gear, that help accelerate new deployments and quickly support fast-growing applications.

But IT hyper-convergence goes further, integrating IT infrastructure into simple modular appliances. Where pre-converged racks of infrastructure can provide good value to enterprises that would otherwise buy and assemble component vendor equipment themselves, hyper-converged architectures present a larger opportunity: they not only simplify IT infrastructure and save on capital expenditures (CAPEX) but also help transform IT staff from internally focused legacy data center operators into increasingly agile, business-facing service providers.

With hyper-converged architectures, IT organizations can shift focus towards helping accelerate and enable business operations and applications, because they don’t spend as much time on, for example, silo troubleshooting, stack integration and testing, and traditional data protection tasks. The decision to adopt hyper-converged architectures is therefore something that business folks will see and appreciate directly through increased IT agility, cloud-like IT services, realistic BC/DR, and a greatly improved IT cost basis.

…(read the complete as-published article there)

Big Data Enterprise Maturity

(Excerpt from original post on the Taneja Group News Blog)

It’s time to look at big data again. Last week I was at Cloudera’s growing and vibrant annual analyst event to hear the latest from the folks who know what’s what. Then this week Strata (the conference for data scientists) brings lots of public big data vendor announcements. A noticeable shift this year is less focus on how to apply big data and more on maturing enterprise features intended to ease wider data-center-level adoption. A good example is the “mixed big data workload QoS” cluster optimization solution from Pepperdata.

…(read the full post)

Google Drives a Stake Into The Human Heart? – AlphaGo Beats Human Go Champion

(Excerpt from original post on the Taneja Group News Blog)

Google’s AlphaGo program has just whipped a top human Go champion (Lee Sedol – a 9-dan pro and winner of many top titles) four games to one in a million-dollar match. You might shrug if you are not familiar with the subtleties of playing Go at a champion level, but believe me, this is a significant milestone in Machine Learning. Software has long proven able to master games whose moves can be calculated out to the end game with enough computing power (like checkers and chess), but Go is (as yet) not fully computable due to its board size (19×19) and the seemingly “intuitive” nature of its opening and even mid-game move options.

…(read the full post)

Accelerating Cloud Transformation with Microsoft StorSimple’s New Virtual Appliance

(Excerpt from original post on the Taneja Group News Blog)

We’ve admired StorSimple since before Microsoft acquired them. And what was Microsoft up to, getting into the storage infrastructure space? Well, if it wasn’t apparent before, it is now – Microsoft StorSimple is a great on-ramp to the cloud (Azure, naturally), especially if formal, executive-sponsored cloud transformation initiatives just aren’t in the cards for the rank-and-file IT organization. And this week Microsoft announced a virtual StorSimple appliance option that can accelerate cloud transformations even faster.

…(read the full post)