In Memory Big Data Heats Up With Apache Ignite

(Excerpt from original post on the Taneja Group News Blog)

Recently we posted about GridGain contributing their core in-memory solution to the Apache Ignite project. While the project is still incubating, it’s clear that this was a good move for GridGain, and a win for the big data/BI community in general. Today Apache Ignite drops its v1.0 release candidate with some new features, such as built-in support for JCache and an autoloader to help migrate data and schema in from existing SQL databases (e.g. Oracle, MySQL, Postgres, DB2, Microsoft SQL Server).

…(read the full post)

IoT Goes Real-Time, Gets Predictive – Glassbeam Launches Spark-based Machine Learning

(Excerpt from original post on the Taneja Group News Blog)

In-memory processing was all the rage at Strata 2014 NY last month, and the hottest word was Spark! Spark is a big data scale-out cluster solution that provides a way to speedily analyze large data sets in-memory using a “resilient distributed dataset” (RDD) design for fault tolerance. It can deploy into its own optimized cluster, or ride on top of Hadoop 2.0 using YARN (although it is a different processing platform/paradigm from MapReduce – see this post on GridGain for a Hadoop MR in-memory solution).

…(read the full post)

Turn to in-memory processing when performance matters

An IT industry analyst article published by SearchDataCenter.

In-memory processing is faster, and vendors are innovating to make in-memory database technology cheaper and better.

In-memory processing can improve data mining and analysis, and other dynamic data processing uses. When considering in-memory, however, look out for data protection, cost and bottlenecks.

When you need top database speed, in-memory processing provides the ultimate in low latency. But can your organization really make cost-effective use of an in-memory database? It’s hard to know whether that investment will pay off in real business value.

And even if the performance boost is justified, is it possible to adequately protect important data kept live in-memory from corruption or loss? Can an in-memory system scale and keep pace with what’s likely to be exponential data growth?

There’s an ongoing vendor race to address these concerns. Vendors are trying to practically deliver the performance advantages of in-memory processing to a wider IT market as analytics, interactive decision-making and other (near-) real-time use cases become more mainstream.

Memory is the fastest medium

Using memory to accelerate performance of I/O-bound applications is not a new idea; it has always been true that processing data in memory is faster (10 to 1,000 times or more) than waiting on relatively long I/O times to read and write data from slower media — flash included.

Since the early days of computing, performance-intensive products have allocated memory as data cache. Most databases were designed to internally use as much memory as possible. Some might even remember setting up RAM disks for temporary data on their home PCs back in the MS-DOS days to squeeze more speed out of bottlenecked systems.
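As a rough illustration of the "memory as data cache" idea, here is a minimal Python sketch of a read-through cache with least-recently-used eviction. The names `ReadThroughCache` and `slow_read` are hypothetical and chosen only for this example; `slow_read` stands in for any slower medium and is not taken from any product discussed here.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Keeps recently used values in memory; loads from a slower source on a miss."""

    def __init__(self, load, capacity):
        self._load = load              # function that fetches a value from the slow medium
        self._capacity = capacity      # maximum number of entries held in memory
        self._entries = OrderedDict()  # ordered from least to most recently used
        self.misses = 0                # how many times the slow medium was consulted

    def get(self, key):
        if key in self._entries:
            # Cache hit: mark the entry as most recently used and return it.
            self._entries.move_to_end(key)
            return self._entries[key]
        # Cache miss: go to the slow medium, then remember the result.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            # Evict the least recently used entry to stay within capacity.
            self._entries.popitem(last=False)
        return value


def slow_read(block_id):
    # Hypothetical stand-in for a disk or network read.
    return f"block-{block_id}".encode()
```

Repeated reads of the same keys touch the slow source only once per key while the working set fits within `capacity`; once it no longer fits, the evictions send reads back to the slower medium.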

Today’s in-memory processing takes that concept to the extreme, using active memory (dynamic RAM) to hold the currently running database code and active data structures, and to keep the persistent database itself in memory. These databases forget about making any slow trips off the motherboard to talk to external media and instead optimize their data structures for memory-resident processing.

Historically, both the available memory density per server and the relatively high cost of memory were limiting factors, but today there are technologies expanding the effective application of in-memory processing to larger data sets. These include higher per-server memory architectures, inline/online deduplication and compression that use extra (and relatively cheap) CPU capacity to squeeze more data into memory, and cluster and grid tools that can scale out the total effective in-memory footprint.
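To make the compression trade-off concrete, the following Python sketch spends CPU cycles to shrink values before holding them in memory. It uses the standard library's `zlib` module; the `CompressedStore` class is a hypothetical name for illustration and does not represent how any particular vendor implements inline compression or deduplication.

```python
import zlib


class CompressedStore:
    """Holds values compressed in memory, trading CPU time for a smaller footprint."""

    def __init__(self, level=6):
        self._level = level  # zlib compression level (1 = fastest, 9 = smallest)
        self._blobs = {}     # key -> compressed bytes

    def put(self, key, value: bytes):
        # Compress on the way in; this costs CPU but saves memory.
        self._blobs[key] = zlib.compress(value, self._level)

    def get(self, key) -> bytes:
        # Decompress on the way out; callers always see the original bytes.
        return zlib.decompress(self._blobs[key])

    def stored_bytes(self) -> int:
        # Total size of the compressed payloads currently held.
        return sum(len(blob) for blob in self._blobs.values())
```

How much is saved depends entirely on the data: repetitive records shrink dramatically, while already-compressed or random data may not shrink at all.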

Memory continues to get cheaper and denser. Laptops now come standard with more addressable memory than entire mainframes once had. Today, anyone with a credit card can cheaply rent high-memory servers from cloud providers…

…(read the complete as-published article there)