Dan's Take

MapR 6.0 Introduces 'DataOps'

Dan learns a new buzzword.

MapR Technologies recently released MapR 6.0. The company's goal is reducing the time and effort needed for enterprises to obtain value from data analysis. This is done through what the company calls "DataOps." Although DevOps is a common term in the industry, "DataOps" was something new to me.

MapR 6.0 is a converged data platform, designed to do the following:

  • Automatically assure security and provide a reliable environment
  • Ingest data quickly using flexible tools
  • Produce and store data in a secure but still discoverable form
  • Help data scientists and analysts search and refine their data models using self-service tools built on top of MapR's machine learning/artificial intelligence tools

MapR 6.0 includes an enhanced MapR Control System (MCS) that administers all data and monitors the health of the clustered systems supporting it. MCS automatically manages data volumes, tables and streams. It provides a management environment in a single pane of glass that makes it possible to easily relate data held in one form with data held in other forms.

MapR 6.0 offers a number of security enhancements, such as enforcement of authentication and more comprehensive encryption on the wire that can be enabled with a single click.

The product's "Data Science Refinery" is designed to make it easy for data scientists and analysts to apply machine learning to the process of obtaining speedy, accurate and reliable insights based upon the data.

Defining DataOps
DataOps is the use of automated tools to improve data quality and reduce the time it takes for data analysis to turn huge stores of raw data into useful information. The key is supporting collaborative teams as they address a rapidly changing environment that's producing huge amounts of data.

Like its cousin DevOps, DataOps relies on an iterative process. Automation is introduced, and portions of the data are gathered, cleaned and then analyzed in parallel by collaborating teams, rather than through the sequential processes used in the past.

A key difference is that processes used in the past were designed to produce value only at the end; DataOps is designed to provide a constant stream of insights through continuous processing.
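
To make the contrast concrete, here is a minimal Python sketch of the continuous style described above. It is only an illustration: the stage names (ingest, clean, analyze) and the sample data are hypothetical, and none of this is MapR's API or tooling. Each stage is a generator, so an insight is produced as soon as one batch has passed through, rather than after all the data has been processed.

```python
from statistics import mean
from typing import Iterable, Iterator, List, Optional

Batch = List[Optional[float]]

def ingest(source: Iterable[Batch]) -> Iterator[Batch]:
    """Hypothetical ingestion stage: hand batches on as they arrive."""
    for batch in source:
        yield batch

def clean(batches: Iterable[Batch]) -> Iterator[List[float]]:
    """Hypothetical cleaning stage: drop missing readings from each batch."""
    for batch in batches:
        yield [value for value in batch if value is not None]

def analyze(batches: Iterable[List[float]]) -> Iterator[float]:
    """Hypothetical analysis stage: emit one insight (the mean) per batch."""
    for batch in batches:
        if batch:  # skip batches that were empty after cleaning
            yield mean(batch)

# Made-up sample data standing in for a stream of raw readings.
raw_batches: List[Batch] = [[1.0, None, 3.0], [10.0, 20.0], [None, 5.0]]

# Insights become available batch by batch, not only at the end of the run.
insights = list(analyze(clean(ingest(raw_batches))))
first_insight = next(analyze(clean(ingest(raw_batches))))
```

A real deployment would run such stages on separate workers and against live streams; the generators here only show the shape of the idea, in which every stage hands off work as soon as it has some.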

Dan's Take: Welcome to a Parallel World
MapR says that "data analysis is increasingly being driven by machine learning and artificial intelligence to gain quick, accurate, and actionable insights." What they're leaving unsaid is that enterprises no longer want to take the time required for long, monolithic processes that may, in the end, be too late to allow any type of advantage over competitors or make it possible to provide better service to customers.

When it comes to application development, enterprises are increasingly turning to DevOps to reduce the time it takes to respond to a rapidly changing environment. Rather than deploying the waterfall approach used in the past, they're decomposing the process into separate units, assigning a different team to be responsible for each unit and then building everything in parallel.

DataOps is built upon the same type of thinking. It uses automation to ingest data from many sources, to make sure the data is in acceptable forms, and then to make it possible for data scientists and analysts to quickly produce useful insights based upon what's seen in the data.

MapR is competing with other suppliers offering tools and services designed to facilitate that process. Tamr calls its DataOps strategy "Enterprise Data Unification." Delphix speaks about its "Dynamic Data Platform." Switchboard Software offers its own DataOps platform. What sets MapR apart is its focus on providing useful enterprise solutions, rather than computer science projects.

What I mean by "computer science projects" is that the other solutions each target only one part of the software stack needed to support these types of highly distributed, multi-platform, multi-OS workloads. Putting together a complete solution from them would be more difficult, because tools coming from several suppliers would have to be woven together. MapR holds a more comprehensive view and has built a more comprehensive set of tools.

About the Author

Daniel Kusnetzky, a reformed software engineer and product manager, founded Kusnetzky Group LLC in 2006. He's literally written the book on virtualization and often comments on cloud computing, mobility and systems software. He has been a business unit manager at a hardware company and head of corporate marketing and strategy at a software company.

