Where serverless storage fits in the serverless computing world

An IT industry analyst article published by SearchStorage

Storage for serverless functions must be external to the compute environment. Learn about the types of storage that work best for serverless computing.

What is serverless computing and how does serverless storage work? Basically, serverless is a way to develop an application as a set of smaller functional components and then submit those functions to a host service for elastically scalable event-driven execution. The service provider takes care of running, scaling and operating the entire IT stack underneath, freeing the application owner to focus more on business value and less on IT operations.

Because the service provider charges per function execution and not for reserved infrastructure, serverless operating costs fully align with application usage. With serverless, cloud expenses can be traced directly to applications and business users.

Serverless isn’t truly without servers; there’s a server somewhere in a cloud. Behind the scenes, there are probably virtual machines, containers, servers and storage. There is OS and application server code, along with layers of IT management that might include cloud orchestration, virtual hosting and container management tools such as Kubernetes. The service provider handles all the networking, availability, performance, capacity and scalability issues an IT shop might normally have to be concerned with when hosting its own applications.

Think of it as functions as a service

A better name for all this might be functions as a service. At its simplest, this kind of function is a little piece of event-driven code, perhaps written in JavaScript, that’s set up to be invoked in response to some trigger like a web-form submission click or IoT device event. Simple functions could simply insert a record into a database, make a log entry, trigger another event, send a notification message, transform a piece of data or return a calculation.
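For illustration, a minimal sketch of such a function is shown below, written as a Python handler in the AWS Lambda style; the event fields and table name are assumptions made up for this example, not part of any particular service.

```python
# Minimal sketch of an event-driven function (AWS Lambda style).
# The event fields and the table name are illustrative assumptions.
import json
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("web_form_submissions")  # hypothetical table name

def handler(event, context):
    """Triggered by a web-form submission; inserts one record and returns."""
    record = {
        "id": event["submission_id"],   # assumed field in the trigger payload
        "email": event["email"],
        "message": event["message"],
    }
    table.put_item(Item=record)         # persist the record outside the function
    return {"statusCode": 200, "body": json.dumps({"stored": record["id"]})}
```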

A complete serverless application would consist of a well-orchestrated cascade of functions. Each function could be independently reused and triggered in massively parallel ways. A function’s execution could also be scaled independently of the other functions, providing tremendous flexibility without traditional bottlenecks.

When a well-designed master graph of functions that trigger each other in a sophisticated cascade of events is deployed at scale, it can not only replace but outperform monolithic applications. One of the premises of serverless is that your applications would scale smoothly and be fully elastic, well beyond the limits of any predefined or prepaid server farm or set of cloud machine instances.
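As a rough sketch of how one link in such a cascade might work, the function below does its own small piece of work and then emits an event onto a queue that triggers the next function, so each stage can scale on its own. The queue URL environment variable and the message fields are assumptions for illustration.

```python
# Hypothetical sketch of function chaining: do one unit of work, then emit an
# event that triggers the next function in the cascade.
import json
import os
import boto3

sqs = boto3.client("sqs")
NEXT_QUEUE_URL = os.environ["NEXT_QUEUE_URL"]  # assumed: queue the next function listens on

def handler(event, context):
    # This function's own small task: total up the line items it was handed.
    result = {"order_id": event["order_id"], "total": sum(event["line_item_prices"])}

    # Hand the result off as a new event; the downstream function is triggered
    # by this message and scales independently of this one.
    sqs.send_message(QueueUrl=NEXT_QUEUE_URL, MessageBody=json.dumps(result))
    return result
```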

This is still a bit futuristic because, of course, applications aren’t made up of just compute functions. There’s still data that needs to be managed, protected and persisted. Storage for serverless must be a consideration, and effective serverless storage must be developed.

Storage for serverless

Like microservices in containers, the initial idea for serverless functions is to construct them to be ephemeral, so they don’t contain any data and aren’t relied upon to persist data internally.

Unlike long-running applications, functions are triggered by an event, do something specific and are then retired rather than becoming long-lived processes. Serverless functions may also be launched in massively parallel ways to scale quickly to meet demand. So where does serverless data live?

Effectively, data storage for serverless functions must be external to the compute environment. Because of the elastic scaling and small event cascades, traditional storage volumes and file systems are going to struggle mightily and become obvious bottlenecks in large-scale serverless application deployments.

Today, serverless function development environments can make direct use of storage service APIs. As scalable storage already suited to web-scale apps and containerized applications, cloud data services are ideal data persistence partners. Here are the storage types that work well as storage for serverless:

  1. Cloud database services architected for multi-tenancy, elastic scalability and web-style access are a good choice for transactional persistence.
  2. Object storage services, such as AWS S3, with simple get/put protocols are ideal for many kinds of web-scale apps and functional designs (see the sketch after this list).
  3. Application memory cache, such as Redis, can work for high-performance data sharing needs.
  4. Journaling logs in which data is written serially to the end, while readable in aggregate, can help protect streaming data in functional designs.
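As a simple illustration of the object storage option above, the sketch below persists a function’s output with a put and reads it back with a get; the bucket name and key layout are assumptions for this example.

```python
# Hypothetical sketch: object storage (AWS S3) as the external persistence
# layer for serverless functions. Bucket name and key layout are assumptions.
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-serverless-app-data"  # assumed bucket name

def save_result(event_id: str, payload: dict) -> None:
    """Put: write a function's output as an object keyed by the event ID."""
    s3.put_object(
        Bucket=BUCKET,
        Key=f"results/{event_id}.json",
        Body=json.dumps(payload).encode("utf-8"),
        ContentType="application/json",
    )

def load_result(event_id: str) -> dict:
    """Get: read a previously stored result back by the same key."""
    obj = s3.get_object(Bucket=BUCKET, Key=f"results/{event_id}.json")
    return json.loads(obj["Body"].read())
```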

When using any data store with serverless designs, great care must be taken with idempotency — an operation that has no additional effect if applied more than one time. Care must also be taken with asynchronous event assumptions and timing or race conditions. In particular, functional designs have to pay attention to parallel data updates and write-locking conditions.
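One common way to guard idempotency is to key every write on the triggering event’s ID and make the write conditional, so replaying the same event has no additional effect. The sketch below shows that pattern with a DynamoDB conditional put; the table and field names are assumptions for illustration.

```python
# Hypothetical sketch: an idempotent serverless update. The write is keyed on
# the triggering event's ID and applied at most once; table and fields are assumed.
import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("processed_events")  # assumed table

def handler(event, context):
    try:
        # Conditional put: only succeeds if this event ID has never been written.
        table.put_item(
            Item={"event_id": event["id"], "amount": event["amount"]},
            ConditionExpression="attribute_not_exists(event_id)",
        )
    except ClientError as err:
        if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
            # A retry or duplicate delivery: applying it again changes nothing.
            return {"status": "already processed"}
        raise
    return {"status": "processed"}
```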

So what exactly is serverless storage?

The serverless ideal is to get IT — and DevOps — out of managing and operating servers of any kind, including physical servers, virtual servers, containers and cloud instances. The idea is for IT to hand over on-call responsibility for any operational issue to the serverless environment provider. If a company is adopting serverless, ultimately, the idea will be to get out of operating and managing storage as well.

However, common requirements for IT to ensure data governance, compliance and protection will mean a long period of hybrid protected storage architectures. And in the long term? Data could just be passed forward from function to function through the event queue and never actually persisted, relying on the event queuing service for data protection. But, of course, there’s still storage in there somewhere.

…(read the complete as-published article there)

Be Careful What You Automate – part 2

Shining a Light on Automation Oversight

Part 8 in a series of posts about Data Protection as a Service. This is the second of two posts on automation… (Also posted on Cobalt Iron’s blog)

The first part of this series explored the need for automation in modern backup solutions: not just automating existing operations, but seeking out and demanding best practices to protect the enterprise's business.

Mike Matchett, Small World Big Data

When thinking about the best way to approach enterprise data backup, automation ranks high on the list of goals. Automation means consistency and reliability, but it also means out of sight, out of mind, and that can be a dangerous position.

Automation Oversight

Automation without oversight is a recipe for failure at scale. The larger the operation being automated, the more checks and balances are required in the form of monitoring and quality and integrity checks. Monitoring goes hand in hand with smart automation (and sometimes provides built-in optimization feedback). It must be built in to help identify when things don’t fully compute, when scripts can’t run, and when and where protection coverage isn’t complete.

Data protection automation at scale also needs to be securely operated. Perhaps even more securely than the data it’s operating over. If automation processes can be hijacked or hoodwinked, recovery may be impossible.

The solution to achieving great data protection coverage — faster, better, and more cost-effective — through automation lies in engaging with companies that have deep technical expertise, have years of experience embedding best practices for applications at scale, and can truly offer intelligent data protection automation. More than that, there is a growing need for a trusted partner that is in business to build and evolve even better automated data protection over time.

Examples of automation with oversight might include integration of the backup solution with enterprise orchestration tools such as ServiceNow and Remedy to ensure that backup is receiving the same level of visibility as other critical business operations. A second example would be a solution that automatically detects and provisions backup services for newly created virtual machines. Another example would be a solution that proactively resolves problems by leveraging the power of analytics.
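As an illustrative sketch only, and not a depiction of any particular product, the second example might look roughly like the following: a recurring job that finds virtual machines with no backup policy and enrolls them, reporting each action through the same channels used for other critical operations. Every class and method name here is a hypothetical placeholder.

```python
# Illustrative sketch only: every class and method here is a hypothetical
# placeholder, not any real product's API.
from dataclasses import dataclass, field

@dataclass
class VM:
    id: str

@dataclass
class BackupService:
    policies: dict = field(default_factory=dict)

    def get_backup_policy(self, vm_id):
        return self.policies.get(vm_id)

    def enroll_in_backup(self, vm_id, policy):
        self.policies[vm_id] = policy

def protect_new_vms(vms, backup, notify):
    """Find VMs with no backup policy, enroll them and report each action."""
    for vm in vms:
        if backup.get_backup_policy(vm.id) is None:
            backup.enroll_in_backup(vm.id, policy="default-daily")
            # Oversight: surface the action where operations staff will see it.
            notify(f"VM {vm.id} was unprotected; enrolled in default-daily backup")

# Usage: two newly created VMs appear; the job enrolls both and reports it.
if __name__ == "__main__":
    protect_new_vms([VM("vm-101"), VM("vm-102")], BackupService(), print)
```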

Autopilot-Protected

One point of view is that IT automation is really about stress reduction, even though most people talk about the beneficial impacts in terms of process efficiency, risk and cost reduction, and even service assurance. This line of thinking about automation advocates that if people are not sitting directly in the critical path, then they can’t disrupt the operation, intentionally or unintentionally.

To be clear, this is not suggesting that people should be taken out of IT. Rather, that when critical recurring jobs just get done automatically — the backups run, the data is validated, and all the systems are protected — smart people can move on to bigger and better uses of their time and creative energy.

So really, what’s better than automating backup best practices, smartly AND securely? If that can be accomplished, rather than just repetitively running through spotty procedures, IT professionals will discover time to innovate and add new value to the business.