Dan's Take

StackIQ: 'Warehouse-Grade' Server Management

The company says its products have automated more than one million servers.

Tim McIntire, CEO and Co-Founder of StackIQ, came by to introduce his company, talk about what it takes to deal with large-scale, "warehouse-grade" server and cluster management issues, and present StackIQ Boss 5.

To summarize our conversation: IT administrators need tools that make large numbers of servers, and clusters of servers, manageable. StackIQ likes to use a warehouse metaphor to discuss its approach to managing servers and clusters. The newest version of StackIQ's Boss is designed to address this challenge and deal with OpenStack, Docker and the complexity of an on-premises cloud environment.

StackIQ Boss 5
StackIQ Boss looks at a datacenter as if it were a warehouse and offers tools to manage its assets through three building blocks: Boss, Pallets and Wire. Here's how the company describes these building blocks:

  • StackIQ Boss, the base product, performs the main duties of server discovery, pre-configuration, base installation and ongoing health and performance monitoring. Boss integrates with virtually any systems management or popular DevOps configuration tool. Boss ensures that the proper settings for site configuration, network configuration, security policies and overall corporate compliance are achieved blazingly fast, via a massively parallelized installation engine.
  • StackIQ Pallets are pre-built configuration packages that contain details specific to certain applications, middleware or physical infrastructure. These Pallets are available as sets or a la carte, depending on customer requirements. The Pallets provide the necessary deep “line of sight” from the application layer (e.g., commercial Hadoop package) all the way through to the underlying server cluster.
  • StackIQ Wire is Boss's programmable framework for third-party management systems to enroll whole clusters, individual nodes (based on machine types or roles), or run health checks for performance monitoring or troubleshooting. Wire also enables IT to create and tailor Pallets to meet the organization's specific needs (e.g., add custom monitoring, install a different version of Java). This makes Pallet installations extremely flexible.

Here are the key new features offered by version 5 of StackIQ Boss:

  • RHEL (Red Hat Enterprise Linux) 7 Support. RHEL 7 natively supports Docker containers and is used in the latest release of Red Hat Enterprise Linux OpenStack Platform. Boss 5 updates the Core Pallet to version 7.0, which now fully supports RHEL 7 and, by extension, delivers Docker automation.
  • Spreadsheet Configuration for Disks and Host Identification. StackIQ Boss now fully automates the pre-installation process. Every IT operations team maintains a spreadsheet inventory of systems that’s either generated by hand or through an asset management tool. Boss converts what is contained in the spreadsheet into a functioning cluster by ingesting all information such as device ID, slot number, and RAID level. A host spreadsheet can be used to set IP address, hostname, primary network interface, and appliance type assignments.
  • Spreadsheet Configuration for All Major Commercial Hadoop Distributions. Hadoop Pallets for StackIQ Boss now include spreadsheet configuration for application-level settings. This capability removes the need for manual configuration within a Hadoop management console at deployment time.
  • Advanced Agent Readiness For Puppet. Boss 5 prepares every compute node with Puppet agents. With a single command, a cluster can be made immediately available to any third-party management tool using an integrated Puppet Master and Puppet agents.
  • Real-time Log File Streaming. Lets IT ops view a combination of logs (e.g., Web log, Hive log, Ambari log) centrally. This allows the identification and tracing of a problem across nodes from a single UI.
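
To make the host-spreadsheet idea concrete, here is a minimal sketch of reading such an inventory and catching obvious mistakes before any machine is touched. It is not StackIQ's file format or API: the column names (hostname, ip, interface, appliance), the sample rows, the Host class and the load_hosts function are all assumptions made up for illustration.

```python
import csv
import io
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """One row of the (hypothetical) host spreadsheet."""
    hostname: str
    ip: ipaddress.IPv4Address
    interface: str
    appliance: str


def load_hosts(csv_text: str) -> list[Host]:
    """Parse a host spreadsheet exported as CSV and reject obvious errors."""
    hosts: list[Host] = []
    seen_names: set[str] = set()
    seen_ips: set[ipaddress.IPv4Address] = set()
    # Row numbering starts at 2 because row 1 is the header line.
    for row_number, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        name = row["hostname"].strip()
        # IPv4Address raises a ValueError subclass on a malformed address.
        ip = ipaddress.IPv4Address(row["ip"].strip())
        if name in seen_names:
            raise ValueError(f"row {row_number}: duplicate hostname {name!r}")
        if ip in seen_ips:
            raise ValueError(f"row {row_number}: duplicate IP address {ip}")
        seen_names.add(name)
        seen_ips.add(ip)
        hosts.append(Host(name, ip, row["interface"].strip(), row["appliance"].strip()))
    return hosts


# Assumed sample data; a real inventory would come from a file or asset tool.
SAMPLE = """hostname,ip,interface,appliance
node-0-0,10.1.1.10,eth0,compute
node-0-1,10.1.1.11,eth0,compute
node-0-2,10.1.1.12,eth0,storage
"""

hosts = load_hosts(SAMPLE)

# Group hostnames by appliance type, the way a provisioning step might consume them.
by_appliance: dict[str, list[str]] = {}
for host in hosts:
    by_appliance.setdefault(host.appliance, []).append(host.hostname)
print(by_appliance)
```

Validating the spreadsheet up front is the design choice worth noting: a duplicate hostname or malformed address is far cheaper to catch in a parse step than after a cluster-wide installation has started.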

Dan's Take: Complex Problems Need Complex Solutions
Modern Web-scale applications are built from general-purpose services harnessed together to create problem-solving tools. Each of these services, in turn, is decomposed into individual functions that may be spread over multiple machines housed in one or more datacenters. This means that configuring, installing, operating and updating a complete workload made up of possibly hundreds or even thousands of servers is challenging and error-prone.

Typically, IT administrators deal with a single machine at a time. More advanced enterprises have adopted tools that make system administration a bit easier, but the admin staff often still has to maintain spreadsheets to track the hardware configuration, the OS version and configuration, database management software, application frameworks, application development tools and the applications themselves. Add a cloud framework such as OpenStack, and the number and variety of things to track and manage become quite large and change continually.

It's far too easy to make a system software change and not keep the spreadsheet up to date. It's also easy to forget to update a system that's part of a large cluster, potentially creating a ticking time bomb. When a failure occurs, it can be hard to isolate the root cause and fix the problem quickly.

StackIQ points to a Vyom Labs table estimating the time it takes to complete various database server tasks:

Task           Single Server   Expectation for 5 Servers   Reality for 5 Servers
Provisioning   8 hours         40 hours                    60 hours
Patching       1 hour          5 hours                     8 hours
Upgrading      4 hours         20 hours                    30 hours
Auditing       2 hours         10 hours                    15 hours
TOTAL          15 hours        75 hours                    ~113 hours
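
The arithmetic behind these figures can be checked with a short sketch. It assumes the "for 5" columns refer to five servers, which matches the numbers (each expectation is five times the single-server time); the variable names are illustrative only.

```python
# Hours per task, copied from the table:
# (single server, expectation for 5, reality for 5).
TASKS = {
    "Provisioning": (8, 40, 60),
    "Patching": (1, 5, 8),
    "Upgrading": (4, 20, 30),
    "Auditing": (2, 10, 15),
}
SERVERS = 5  # assumed meaning of "for 5" in the column headings

for task, (single, expected, actual) in TASKS.items():
    # The expectation column is plain linear scaling of the single-server time.
    assert expected == single * SERVERS
    print(f"{task}: {actual / expected:.2f}x the expected effort")

# Column totals: sum each position across all tasks.
totals = tuple(sum(column) for column in zip(*TASKS.values()))
overall_overhead = totals[2] / totals[1]
print(totals, round(overall_overhead, 2))
```

With these figures the totals come to 15, 75 and 113 hours, and the real effort is roughly 1.5 times the linear expectation for every task.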

These times represent the problem StackIQ aims to alleviate. If your organization is facing the challenges of a very complex environment, and that environment is largely composed of industry-standard x86 systems, it would be wise to become familiar with both the company and its technology.

About the Author

Daniel Kusnetzky, a reformed software engineer and product manager, founded Kusnetzky Group LLC in 2006. He's literally written the book on virtualization and often comments on cloud computing, mobility and systems software. He has been a business unit manager at a hardware company and head of corporate marketing and strategy at a software company.
