Pushing the boundaries of OpenStack – Wait, what are they again?

wbentley

As a Production Support engineer for many years, I love providing operational support for front- and back-end systems.  That love of operations drives me to share knowledge on how you can push the boundaries of OpenStack.  To do that, you must first know the boundaries.

Walter Bentley is a Rackspace technical marketing engineer and author with more than 15 years’ experience in online marketing, financial services, aviation, the food industry, education, and technology.

Have you ever wondered how much workload the OpenStack control plane can handle before it needs to scale horizontally?  How does API performance change as the load increases?  How much overhead do the OpenStack APIs add to an application deployment timeline?  What behavior signals that it is time to add more control plane resources?

Answers to those questions help you get a better understanding of your cloud’s boundaries and how to handle moments of peak traffic.  Of course, the next question is, “Where can I perform these tests?”  Proper production-like resources are critical to your overall results. 

Fortunately, I know a bunch of folks with just the right resources available to take on such benchmarking tasks.  I reached out to OSIC.org and requested access to a bare metal cluster.  Within a few weeks, I was approved and set up to start my tests.

There are many ways to measure performance.  I expect that your approach may differ from mine, but the tests I conducted will provide a reference for you to build on.  Please do not treat this as a baseline standard. You need to adjust the tests to fit your objectives, environments and scenarios.

Testing Strategy

To conduct my tests, it made sense to create a script that mimics normal cloud utilization. The script proceeds through the following steps:

  1. List servers within the project (to test authentication)
  2. Create instances in that project (the number of instances created is configurable)
  3. Allow for instance build time (delay time is configurable)
  4. List servers within the project to obtain the new instance IDs (used for subsequent requests)
  5. Create a snapshot of one of the new instances
  6. Resize one of the new instances
  7. Confirm the resize of that instance
  8. Delete one of the new instances
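The steps above can be sketched roughly as follows. This is not the script used for the tests, only a minimal illustration under stated assumptions: `FakeCompute` is a hypothetical in-memory stand-in, its method names are placeholders rather than a real SDK's API, and the sketch assumes the same new instance is used for the snapshot, resize, confirm and delete steps. To run it against a real cloud, you would swap in a client that exposes equivalent calls.

```python
import time


class FakeCompute:
    """In-memory stand-in for a compute API client (hypothetical interface)."""

    def __init__(self):
        self.servers = {}
        self.next_id = 1

    def list_servers(self):
        return list(self.servers)

    def create_server(self, name):
        server_id = f"srv-{self.next_id}"
        self.next_id += 1
        self.servers[server_id] = {"name": name, "flavor": "small"}
        return server_id

    def create_snapshot(self, server_id, name):
        return f"{name}-of-{server_id}"

    def resize_server(self, server_id, flavor):
        self.servers[server_id]["pending_flavor"] = flavor

    def confirm_resize(self, server_id):
        server = self.servers[server_id]
        server["flavor"] = server.pop("pending_flavor")

    def delete_server(self, server_id):
        del self.servers[server_id]


def run_iteration(compute, instance_count=2, build_delay=0.0):
    """Run the eight steps once; return a list of (step, seconds, ok) tuples."""
    results = []

    def timed(step, func, *args):
        # Time a single API call and record whether it raised an exception.
        start = time.perf_counter()
        try:
            value, ok = func(*args), True
        except Exception:
            value, ok = None, False
        results.append((step, time.perf_counter() - start, ok))
        return value

    before = set(timed("list_servers", compute.list_servers) or [])  # step 1
    for i in range(instance_count):                                  # step 2
        timed("create_server", compute.create_server, f"bench-{i}")
    time.sleep(build_delay)                                          # step 3
    after = timed("list_new_servers", compute.list_servers) or []    # step 4
    new_ids = [s for s in after if s not in before]
    if new_ids:
        target = new_ids[0]  # assumption: one instance serves steps 5 to 8
        timed("snapshot", compute.create_snapshot, target, "snap")   # step 5
        timed("resize", compute.resize_server, target, "medium")     # step 6
        timed("confirm_resize", compute.confirm_resize, target)      # step 7
        timed("delete", compute.delete_server, target)               # step 8
    return results
```

Recording a duration and a success flag for every call keeps the workflow and the measurement in one place, so each iteration yields data that can be aggregated later.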

That script can execute continuously with any number of simulated users to gather API metrics and performance patterns.  It made sense to focus on the compute service (nova) because it is the most heavily utilized service within the OpenStack ecosystem and calls almost all of the other services to perform its actions.  We will cover object and block storage clouds in future posts.
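One way to picture "any number of simulated users" is a pool of threads that each repeat the workflow until a deadline passes. The sketch below is an assumption about how such a driver could look, not the harness used for these tests; `run_users` and `fake_iteration` are hypothetical names, and `fake_iteration` simply sleeps in place of real API calls.

```python
import threading
import time


def run_users(iteration, user_count, time_limit):
    """Run `iteration` in a loop on `user_count` threads until the time limit."""
    deadline = time.monotonic() + time_limit
    counts = [0] * user_count  # completed iterations per simulated user

    def user_loop(index):
        # Each simulated user repeats the workflow until the deadline passes.
        while time.monotonic() < deadline:
            iteration(index)
            counts[index] += 1

    threads = [
        threading.Thread(target=user_loop, args=(i,)) for i in range(user_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counts


def fake_iteration(user_index):
    """Placeholder for one pass through the workflow; sleeps instead of calling an API."""
    time.sleep(0.01)
```

Calling `run_users(fake_iteration, user_count=5, time_limit=1.0)` returns the number of iterations each simulated user completed. Threads are a reasonable fit here because the work is dominated by waiting on API responses rather than by local computation.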

My objective was to put varying loads on the OpenStack APIs, based on a set time limit, using the above script to observe API and control plane server performance. The metrics I gathered answered questions such as:

  • How long did each step take to complete?
  • Did the requested task complete as expected?
  • How much CPU utilization did it cause on the control plane?
  • How many iterations of the script were completed?
  • How many instances were created successfully?
  • What was the overall instance provisioning time?
  • How much time did it take for the APIs to respond?
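Several of these questions reduce to aggregating per-call timing samples. As a rough sketch, assuming each sample is a hypothetical `(step, seconds, ok)` tuple recorded by the test script, a summary per step could be computed like this. CPU utilization on the control plane is not covered here, since it has to be collected on the servers themselves.

```python
from collections import defaultdict
from statistics import mean


def summarize(samples):
    """Aggregate (step, seconds, ok) samples into per-step statistics."""
    by_step = defaultdict(list)
    for step, seconds, ok in samples:
        by_step[step].append((seconds, ok))

    summary = {}
    for step, rows in by_step.items():
        durations = [seconds for seconds, _ in rows]
        successes = sum(1 for _, ok in rows if ok)
        summary[step] = {
            "calls": len(rows),                     # how many times the step ran
            "success_rate": successes / len(rows),  # fraction that completed
            "mean_seconds": mean(durations),        # average response time
            "max_seconds": max(durations),          # slowest response
        }
    return summary
```

Keeping the raw samples and summarizing afterwards makes it possible to add other statistics later, such as percentiles, without rerunning the load test.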

For more information, check out the complete testing strategy and full set of results here.