SQL Server machine learning goes full throttle on operational data

An IT industry analyst article published by SearchSQLServer.

Artificial intelligence is a hot topic in IT, and Microsoft has made strides to synchronize SQL Server with machine learning tools for use in analyzing operational data pipelines.

Mike Matchett

One of the hottest IT trends today is augmenting traditional business applications with artificial intelligence or machine learning capabilities. I predict the next generation of data center application platforms will natively support the real-time convergence of online transaction processing with analytics. Why not bring the point of the sword on operational insight to the frontline where business actually happens?

But modifying production application code that is optimized for handling transactions to embed machine learning algorithms is a tough slog. As most IT folks are reluctant — OK, absolutely refuse — to take apart successfully deployed operational applications to fundamentally rebuild them from the inside out, software vendors have rolled out some new ways to insert machine intelligence into business workflows. Microsoft is among them, pushing SQL Server machine learning tools tied to its database software.

Basically, adding intelligence to an application means folding in a machine learning model to recognize patterns in data, automatically label or categorize new information, recommend priorities for action, score business opportunities or make behavioral predictions about customers. Sometimes this intelligence is overtly presented to the end user, but it can also transparently supplement existing application functionality.

In conventional data science and analytics activities, machine learning models typically are built, trained and run in separate analytics systems. But models applied to transactional workflows require a method that enables them to be used operationally at the right time and place, and may need another operational method to support ongoing training (e.g., to learn about new data).

Closeness counts in machine learning

In the broader IT world, many organizations are excited by serverless computing and lambda function cloud services in which small bits of code are executed in response to data flows and event triggers. But this isn’t really a new idea in the database world, where stored procedures have been around for decades. They effectively bring compute processes closer to data, the core idea behind many of today’s big data tools.

Database stored procedures can offload data-intensive modeling tasks such as training, but they can also integrate machine learning functionality directly into application data flows. With such injections, some transactional applications may be able to take advantage of embedded intelligence without any changes to upstream application code. Additionally, applying machine learning models close to the data in a database allows the operational intelligence to be readily shared among different downstream users…(read the complete as-published article there)
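As a rough illustration of scoring close to the data, the sketch below registers a scoring function inside a database connection and exposes it through a view, so a client gets scores with plain SQL. It is a stand-in only: it uses Python's built-in sqlite3 module rather than SQL Server, and the table, the view, the function name score_order and the model weights are all hypothetical.

```python
import math
import sqlite3

# Hypothetical logistic-regression weights; a real model would be trained elsewhere.
WEIGHTS = {"bias": -3.0, "order_total": 0.02, "prior_orders": 0.5}

def score_order(order_total, prior_orders):
    """Return a probability-like score between 0 and 1 for one order."""
    z = (WEIGHTS["bias"]
         + WEIGHTS["order_total"] * order_total
         + WEIGHTS["prior_orders"] * prior_orders)
    return 1.0 / (1.0 + math.exp(-z))

conn = sqlite3.connect(":memory:")
# Register the model so it can be called from SQL, next to the data.
conn.create_function("score_order", 2, score_order)

conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, prior_orders INTEGER)")
conn.executemany(
    "INSERT INTO orders (id, total, prior_orders) VALUES (?, ?, ?)",
    [(1, 20.0, 0), (2, 150.0, 4), (3, 60.0, 1)],
)

# The view hides the model call; clients query it like any other table.
conn.execute(
    "CREATE VIEW scored_orders AS "
    "SELECT id, score_order(total, prior_orders) AS score FROM orders"
)

rows = conn.execute("SELECT id, score FROM scored_orders ORDER BY id").fetchall()
for order_id, score in rows:
    print(order_id, round(score, 3))
```

The client only issues a SELECT against the view, which is the sense in which upstream application code can stay unchanged while several downstream users share the same scores.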

Database performance tuning: Five ways for IT to save the day

An IT industry analyst article published by SearchDataCenter.

IT teams can play heroes when database performance issues disrupt applications. Try these five tips for performance tuning before there’s a problem.

When database performance takes a turn for the worse, IT can play the hero. Production database performance can slow dramatically as both data and the business grow. Whenever a key database slows down, widespread business damage can result.

Technically, performance can be tackled at many different levels — applications can be optimized, databases tuned or new architectures built. However, in production, the problem often falls on IT operations to implement something fast and in a minimally disruptive manner.

There are some new ways for IT pros to tackle slowdown problems. However, one question must be addressed first: Why is it up to IT?

Database administrators and developers have many ways to achieve database performance tuning. They can adjust configuration files to better align database requirements with underlying infrastructure, add indexing, implement stored procedures or even modify the schema to (gasp!) denormalize some tables.
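To make the indexing option concrete, here is a minimal sketch that compares a query plan before and after adding an index. It assumes Python's built-in sqlite3 module as a stand-in for a production database, and the orders table and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10000)],
)

def query_plan(sql):
    """Return the planner's description of how it would execute the query."""
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

QUERY = "SELECT total FROM orders WHERE customer_id = 42"

plan_before = query_plan(QUERY)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(QUERY)   # with the index: an index search

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies by SQLite version, but the shift from a scan to an index search is the effect being tuned for. Other engines expose the same information through their own EXPLAIN facilities.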

Developers have significant control over how a database is used; they determine what data is processed and how it is queried. Great developers wield fierce SQL ninja skills to tune client queries, implement caching and build application-side throttling. Or, they rewrite the app to use a promising new database platform, such as a NoSQL variant.

All kinds of databases, however, can eventually suffer performance degradation at operational scales. Worse, many developers simply expect IT to add more infrastructure if things get slow in production, which clearly isn’t the best option.

Five ways [IT] can address database performance issues:…

…(read the complete as-published article there)

A SQL Sequel Called TED – the Distributed TransLattice Elastic Database

(Excerpt from original post on the Taneja Group News Blog)

While the focus in databases recently has been on exciting NoSQL variants designed to tackle big data challenges or build transformational web scale apps, most enterprise applications still have mandatory requirements for transactional (ACID) properties keeping them tied to traditional relational database architectures. However, businesses are feeling pressured to evolve their transactional database solutions to enable highly available, “local” application access to globally consistent data.

Global IT shops face a big challenge in distributing relational data, as most approaches to date have large implementation and operational overhead costs, and often involve serious trade-offs between local access and global consistency. Adding to the burden, local governments might have laws about ensuring data “location” compliance that absolutely must be met.

TransLattice started out delivering an appliance “platform” for distributed business computing. Due to popular demand, they’ve unhinged the database and launched that as its own solution called the TransLattice Elastic Database, or TED for short. Although calling your database TED seems a bit informal, the underlying implementation based on PostgreSQL is anything but. TED’s fully ACID/SQL relational implementation is designed to be logically and geographically distributed, providing policy-driven controls for each table’s replication. Transactions are automatically brokered and routed on the client side to ensure local performance and high availability, with a distributed commit algorithm on the backend to ensure global consistency. When nodes become unavailable, they get routed around, and when back online, automatically get updated.

TED database nodes can be TransLattice appliances, virtual machines, cloud-based instances, or even all of them at once as access, availability, and performance needs dictate. For instance, TED nodes can be launched in Amazon EC2. Nodes can be added to the database dynamically, and data will automatically get distributed according to policy or performance needs.

The best part is that SQL applications don’t have to be modified or become aware that TED isn’t just the same old local SQL database. Since it drops in, scales easily and cost-effectively, and instantly increases global reach and availability, we think TED will go far.

…(read the full post)

MongoDB – Storing Big Data Documents

If a Big Data set (or smaller data) is in the form of documents, then it’s difficult to store them in a traditional schema-defined row and column database. Sure, you can create large blob fields to hold large arbitrary chunks of data, serialize and encode the document in some way, or just store them in a filesystem, but those options aren’t much good for querying or analysis when the data gets big.

MongoDB Document Database

MongoDB is a great example of a document database. There are no predefined schemas for tables (the schema is considered “dynamic”). Rather you declare “collections” and insert or update documents directly into each collection.

A document in this case is basically JSON with some extensions (actually BSON – Binary encoded JSON). It supports nested arrays and other things that you wouldn’t find in a relational database. If you are object oriented, this is fundamentally an object store.

Documents added to a single collection can vary widely from each other in terms of content and composition/structure (although an application layer above could obviously enforce consistency as happens when MongoDB is used under Rails).
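The document model described above can be illustrated without a running server. The sketch below is a plain-Python stand-in, not MongoDB's driver API: a list plays the role of a collection, dicts play the role of documents, and the find helper is hypothetical. It mirrors one real MongoDB behavior, namely that an equality filter on an array field matches documents whose array contains the value.

```python
import json

# A list stands in for a collection; each dict is one JSON-like document.
posts = [
    {
        "_id": 1,
        "title": "First post",
        "tags": ["sql", "nosql"],                       # nested array
        "comments": [{"user": "ann", "text": "Nice"}],  # array of sub-documents
    },
    # Same collection, different shape: no tags or comments, but a rating.
    {"_id": 2, "title": "Second post", "rating": 4.5},
]

def find(collection, field, value):
    """Return documents whose field equals value or whose array field contains it."""
    results = []
    for doc in collection:
        current = doc.get(field)
        if current == value or (isinstance(current, list) and value in current):
            results.append(doc)
    return results

tagged = find(posts, "tags", "nosql")
encoded = json.dumps(posts[0])  # documents serialize naturally to JSON text

print([doc["_id"] for doc in tagged])
print(encoded)
```

Because no schema is declared, the second document needs no placeholder columns for the fields it lacks, which is the flexibility (and the consistency risk) the paragraph above describes.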

MongoDB’s list of key features is a fantastic mini-tutorial in itself.