Why OpenStack matters to me

I’d like to start off with an apology to everyone out there: if, over the past 9 months, I didn’t reply to your email, didn’t answer your phone call, or made your life less interesting by disappearing from Twitter and from sharing my thoughts on this blog, I’m sorry.  I’ll be out and about, alive and available again now that OpenStack is a reality.

Life is about priorities, and hopefully at some point in your life you have had, or will have, an opportunity to work on something that has the ability to really make an impact.  At Rackspace we are a Strengths-based organization.  My top 5 are Learner, Achiever, Competition, Analytical, and Focus.  I’ll use my strengths as a way to explain the past ~9 months.

When we started exploring the strategy around this, all of us had lots to learn.  We’d all used open source software, and some of us on the team had contributed to projects, but we knew we had a lot to learn if we were going to get this right.  The great thing about open source is that its full history is on the Internet.  You can go back and read mailing list archives; you can find out who contributed to a project, who led it, and who had influence; and you can reach out to those people, who are often happy to talk about it.  This is very different from trying to do research on businesses, where information is hard to find: no corporation will share the full mailing list archive covering the history of its decision making (heck, most don’t even have one).  The openness and the ability to learn about things easily was a huge motivator for me.

So began the Learner->Analytical->Focus->Achiever “death spiral” (well, the “death” of my learning anything not involved in this project, that is).  The good news is that those 4 strengths together make me really enjoy learning about new complex systems and figuring out the best way to navigate them; the bad news is that the Focus->Achiever half may let me chase Alice all the way down the rabbit hole to Wonderland.  Sometimes this is counterproductive, where a “good enough” decision could have been made with less analysis, but in this case I’m really happy about it.  When forming an open source community you have a lot of choices to make, each with different benefits and drawbacks, and whether something is perceived as a benefit or a drawback varies with the perspective of the individual or group.

Forming this community is important enough to go all the way down the rabbit hole, because thousands of people will become part of it and each potential member of the community is worth more than an hour of my time.  This gives me a good segue to talk about scale.  If you’re only going to use a piece of software once to solve a single need, you should make it just good enough to get the job done: optimize for min(time coding + time for the code to run [where you have to pay attention to it]).  The opposite end of the spectrum is a project like Linux (or like OpenStack will be; I dream big!) that runs on millions of machines 24/7 all around the globe.  If you can make an operation one minute faster on something that runs on a million machines, you save nearly two years’ worth of system time.  With that same idea, we spent all the time we could making sure we got the community started the right way, because every hour we spend will be multiplied by each of you who joins it.
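The back-of-the-envelope math behind that claim works out like this (a quick sketch; the one-million-machine fleet is the hypothetical from the paragraph above):

```python
# One minute saved per operation, run once on each of a million machines.
machines = 1_000_000       # hypothetical fleet size from the example above
minutes_saved_each = 1     # the operation is one minute faster per machine

total_minutes = machines * minutes_saved_each
years = total_minutes / 60 / 24 / 365  # minutes -> hours -> days -> years

print(f"{years:.1f} years of system time saved")  # about 1.9 years
```

A million minutes is roughly 694 days, so "nearly two years" of aggregate system time from a single one-minute improvement.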

So now here is where my Competition kicks in.  I don’t want to make just an average community and then go watch reruns of “Everybody Loves Raymond” (Ray, hopefully you aren’t offended; you shouldn’t be, yours was the first show I know of that made it to rerun syndication that popped into my head!) on local TV.  I want to make the best community ever.  The problem is… the bar is really high.  It isn’t like I said, “I want to make the biggest ball of rainbow yarn a person with a 9-letter-long name made on a Tuesday afternoon.”  I want to make the best open source community around a distribution of projects out there, and a lot of people have done an excellent job at this.  So we’ve learned as much as we could from past projects to lay the proper foundation.  With that, let me lay out the “4 opens” (I’d like to credit Rick Clark on our team for summarizing these thoughts in a concise and clear manner we can all hopefully understand)…

Open Source: We are committed to creating truly open source software that is usable and scalable. Truly open source software is not feature- or performance-limited and is not crippled. We will use the Apache License 2.0, making the code freely available to all. [Personal commentary: What this means is "we accept patches"; the project won't block a feature contribution because it competes with a commercial feature a community member has.  This doesn't mean all of those commercial entities have to contribute all of their code -- it just means they aren't guaranteed exclusivity.]

Open Design: Every 6 months the development community will hold a design summit to gather requirements and write specifications for the upcoming release.  [Personal commentary: The design summits (so far we've had 2) have been great for getting people aligned and for really getting the complicated items solved.  An example of this is large object support in Object Storage: members of the community had a number of different implementation ideas, and through discussion we've come up with a great way to do it.]

Open Development: We will maintain a publicly available source code repository through the entire development process.  This will be hosted on Launchpad, the same community used by hundreds of projects including the Ubuntu Linux distribution. [Personal commentary: Getting code and designs out in the open as early as possible in the process allows everyone to benefit from the power of a community in the biggest way possible.  This also makes finding and fixing big problems much easier, as each patch can be tracked and its individual impact measured.]

Open Community: Our core goal is to produce a healthy, vibrant development and user community.  Most decisions will be made using a lazy consensus model.  All processes will be documented, open, and transparent. [Personal commentary: Everyone should have a seat at the table at a level that corresponds to the effort and contributions they're putting into the project.  With all of the decision making done in IRC meetings (with transcripts) and over mailing lists, members of the community can see "how the sausage was made" rather than just the end result of the decision -- this is really important for building and maintaining trust.]

We’re off to a fun and exciting start.  Looking at the stats from this week, I’m amazed at the amount of contribution we’re seeing from such a large group of developers (stats for the week of 12/3 to 12/9):

  • OpenStack Compute (NOVA) Data
    • 17 Active Reviews
    • 97 Active Branches – owned by 34 people & 4 teams
    • 472 commits by 26 people in last month
  • OpenStack Object Storage (SWIFT) Data
    • 5 Active Reviews
    • 41 Active Branches – owned by 19 people & 2 teams
    • 184 commits by 15 people in last month

This shows me what we’re doing is working, and given the time to continue to grow and bloom, OpenStack Compute can help IT make the move to automation the same way manufacturing has over the past 50 years.  Yes, I’m saying IT isn’t automated right now.  IT automates other tasks inside the Enterprise, but it hasn’t really automated many of its own tasks (this probably deserves a full post of its own).

Object Storage is potentially even more important than the automation.  This is a topic I’ve been presenting on frequently because I’m very passionate about it (see the Strengths above): it allows an order-of-magnitude improvement in efficiency over the TCO of “the average storage solution”.  It doesn’t serve every storage use case, but the use case it does serve is growing rapidly, and over the next decade it will be clear to everyone that their largest storage platform (in terms of GB stored) will be object-based.

I expect we’ll see additional projects become part of OpenStack over the next year, but as a community we should keep the bar high on what constitutes a major project.  Both Compute and Object Storage provide software for ubiquitous problems that are growing in importance to everyone.  Some items that clear the bar for me (these are critical issues for all users and operators of clouds a decade from now):

“Networking as a Service” — This should be abstracted from the end-point computing service so it can be utilized by all projects and provide connection points to other inter-cloud and non-cloud services.  Here we can define routing, switching, and filtering network devices, and we can automate their integration with other cloud services.

“Inter-cloud Services” — As different clouds become available with varied services, we need an automated way to discover and catalog them, the same way routing protocols advertise network availability, so we can have a loosely coupled global network (you may be familiar with it… the Internet).  OpenStack is a great place to define a reference implementation of the directory and advertising capabilities, as all interested parties can have a seat at the table to contribute their needs.
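As a rough sketch of the advertise-and-discover idea (every name here is a hypothetical illustration, not any real OpenStack interface), a cloud could announce its services into a shared catalog that consumers query, much like a routing protocol announcement:

```python
# Hypothetical inter-cloud service catalog: clouds advertise what they
# offer, and consumers discover providers by service type.  This mirrors
# the "advertise and discover" idea from the post, nothing more.

catalog = {}  # service_type -> list of advertisements

def advertise(cloud, service_type, endpoint):
    """A cloud announces that it offers a service at a given endpoint."""
    catalog.setdefault(service_type, []).append(
        {"cloud": cloud, "endpoint": endpoint}
    )

def discover(service_type):
    """Return every known provider of a service type."""
    return catalog.get(service_type, [])

# Two clouds advertise object storage; one also advertises compute.
advertise("cloud-a", "object-storage", "https://a.example/v1")
advertise("cloud-b", "object-storage", "https://b.example/v1")
advertise("cloud-b", "compute", "https://b.example/compute")

providers = discover("object-storage")
print([p["cloud"] for p in providers])  # ['cloud-a', 'cloud-b']
```

A real directory would need authentication, freshness/withdrawal semantics, and federation across catalogs, which is exactly the kind of design work a reference implementation would pin down.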

Some items I’m on the fence about (the reason I’m on the fence isn’t that they aren’t extremely important to some implementations; it’s that they aren’t important to all implementations):

“Host Provisioning Automation” — For service providers that are constantly growing and re-provisioning assets, automating these tasks is critical.  For an SMB that is going to build a 2-6 cabinet cloud solution once, this isn’t nearly as important.

“Security & Compliance Services” — Everyone wants “some level” of security, but what that level is, and how many resources get dedicated to providing it, varies widely.

“Network Block Storage Services” — As the performance and size of local storage continue to increase, the need for network block storage decreases.  I’m still a big believer in the benefits here for many use cases; it just doesn’t apply to every use case.

I really believe that in 2011 our community has a chance to deliver “the promise of cloud” to the masses through the efforts and commercial implementations created by its members.  As exciting as getting things off the ground in 2010 was, I’m even more excited about the future to come.

  • http://twitter.com/hollanddavids David Holland

    Networking as a service and Inter-Cloud services.

    I strongly agree that Networking as a Service needs to be part of the underpinning of the IaaS model, and I believe it must be an abstraction (read: overlay) that provides the separation between the business of the Provider and the business of the Consumer. This separation is in fact the model of that loosely coupled network you refer to.

    In that model we support the (routing, switching, filtering) network appliances appropriate to the Customer's needs without compromising the stability and separation required by the provider to maintain the equity and security of the infrastructure they sell.

    Furthermore, that model is understood by all and naturally supports Inter-Cloud and Premise to Cloud mobility inasmuch as I have control in my Private Cloud and comparable control in my IaaSP.

    That requires an extra layer in the IaaSP that need not be present in the Private Cloud because in the private cloud that layer already exists at the network boundary. For the IaaSP the boundary must exist on every compute or system image because that is where the edge has moved to.

    Adding that layer for small instances may be approximated in SW but requires HW to scale (IMO). Moving that function into the I/O path makes the most sense to me, since that mechanism can be used to underwrite the cost of deployment (giving a 2:1 capacity boost).

    Ideally that mechanism will support Open vSwitch as the default implementation at each level so it will be open and extensible. If the HW itself (at the customer level) is programmable, then it opens the possibility for new and exciting innovation by the appliance vendors. It should go without saying that the Provider has the necessary control and isolation to innovate and extend without impact to the Customer model. This capability has not yet found its way into the currently available silicon (AFAIK).

    The problem creating this is not that the technology is hard, but that the business interests are misaligned. Why spend the capital if everyone benefits? If I develop that technology in a closed environment optimized for my infrastructure, everyone else is playing catch-up and looking at a year or more to get there.

    Maybe that is what the market really wants, but if you want to compete, don't bring a knife to a gunfight. Find a way to address the issue or cash out.

  • Diego Parrilla

    Great and passionate post…

    2011 will be a great year for OpenStack: I'm sure of that.

  • http://blog.sflow.com/ Peter Phaal

    I enjoyed your post.

    How do you see performance monitoring fitting into the OpenStack architecture? The challenge in providing scale-out performance monitoring is building the necessary instrumentation into the solution stack.

    David mentioned Open vSwitch/OpenFlow as a way to virtualize networking in OpenStack. Open vSwitch also supports the sFlow standard, offering a scalable, integrated solution for monitoring both physical and virtual network resources.

    sFlow instrumentation is easily added to open source hypervisors (using the open source Host sFlow agent – http://host-sflow.sourceforge.net), extending visibility into the physical and virtual server resources to provide a comprehensive, end-to-end view of performance to the cloud service provider.

    There are 3 major areas where scalable, real-time performance monitoring can contribute to OpenStack:
    1. sFlow provides the real-time data needed to optimize workload placement and increase operational efficiency.
    2. sFlow provides detailed network and system performance data that can be used to create differentiated customer billing models. For example, charging different prices for network bandwidth depending on service and locality.
    3. sFlow provides similar benefits to users of the cloud service, providing the data needed to match their deployments to changing demand, controlling costs by releasing excess capacity and proactively adding capacity as needed.

    It’s understandable that the initial efforts focus on provisioning, but without real-time visibility into performance, the cost savings made possible by cloud computing and virtualization are hard to fully realize.

  • Pingback: Being The Largest Hybrid Cloud Customer At Rackspace - The Official Rackspace Blog

  • Pingback: If cloud computing is a commodity, so is real estate | Bret Piatt