Commentary
91 Institute

Event Summary: Lessons Emerging from the JEDI Cloud: Immediate Steps and the Future of Next-Generation IT

Tod Lindberg
Senior Fellow

Despite shifting the deadline for bid submissions to October 12, the Pentagon remains firm in its commitment to award the JEDI cloud contract initially to a single provider. As the acquisition process for this next-generation IT infrastructure moves into the planning and implementation phase, several aspects of how JEDI will work remain unclear.

Why the cloud?

Commercial cloud providers offer collections of machine rooms. Each machine room has lots of computers and storage devices in it. Because these resources are shared among a large number of customers, they can provide not only access to shared processing and data, but access that is elastic, which means that when you need it, there's likely to be the capacity for you to have it. It means that applications that would otherwise be unlikely to share, because they're not co-resident, can do so. And it means that somebody else is managing the hardware and probably the software, which means you don't have to do that for yourself. Because the machine rooms are geo-distributed, there's some measure of reliability, because physical events (bombs going off, say) tend to be geographically local.

Why, then, does it make sense for DoD to move to the cloud or for DoD applications to move there? For one thing, the rest of industry is moving there. And that means large investments are being made in developing applications and services. We could take advantage of that within DoD if we run them on the cloud.

It also would enable DoD to take advantage of two new trends. One is so-called big data and the other is machine-learning, where one uses automated methods to analyze and make inferences. If DoD doesn't position itself onto a platform where these kinds of things are available, then people won't be able to experiment, because the data won't be there and the computing capabilities won't be there.

Yet any time you detach computing capability from the user, as in cloud services, there is going to be a risk that the tether can be severed. And to the extent that we currently outsource serving, whether to a cloud or elsewhere, we’re already running this risk. This points to an important distinction between tactical uses of a cloud and enterprise uses. When our forces deploy, they won’t be taking a full-scale cloud services facility with them.

As we become dependent on enterprise applications, moving in that direction can disrupt our warfighting efforts. And it becomes possible for our adversary to plant stuff in these enterprises over a long period of time and then invoke them as a way of preparing the battlefield before there is an attack. In our march to increased automation, we are going to have risks not only in what you would think of as the high-consequence applications—targeting and so on—but seemingly low-consequence activities, like deploying materiel and scheduling transports, that turn out to be highly disruptive if they don’t run smoothly. We will need to migrate these functions in such a way that they are more secure than they are today.

To the extent we become dependent on using data or processing capabilities that are in this new cloud service for actually fighting, for the warfighter, there is also going to be the possibility that those communications lines get severed and availability will be an issue. We need to design those systems with some capability for graceful degradation, for some Plan B operation. And we also need to make sure we exercise it so that we're experienced in living in that world.
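The idea of a Plan B that is actually exercised can be illustrated with a minimal sketch. It assumes a client that remembers the last value it received from a remote service and serves that copy, clearly marked as degraded, when the link is cut. The names DegradableClient and SimulatedLink are hypothetical and do not describe any DoD or vendor system.

```python
from typing import Callable, Dict, Optional, Tuple


class SimulatedLink:
    """Hypothetical stand-in for a network link that can be severed."""

    def __init__(self) -> None:
        self.up = True

    def fetch(self, key: str) -> str:
        if not self.up:
            raise ConnectionError("link to cloud severed")
        return f"schedule for {key}"


class DegradableClient:
    """Reads from a remote service, falling back to a local copy when the link is down."""

    def __init__(self, fetch_remote: Callable[[str], str]) -> None:
        self._fetch_remote = fetch_remote
        self._cache: Dict[str, str] = {}  # last known value for each key

    def get(self, key: str) -> Tuple[Optional[str], str]:
        """Return (value, mode), where mode is 'live', 'degraded', or 'unavailable'."""
        try:
            value = self._fetch_remote(key)
        except ConnectionError:
            if key in self._cache:
                return self._cache[key], "degraded"  # stale but usable
            return None, "unavailable"
        self._cache[key] = value
        return value, "live"


# Exercising the fallback deliberately, rather than waiting for a real outage.
link = SimulatedLink()
client = DegradableClient(link.fetch)
print(client.get("convoy-7"))  # live read, also fills the local copy
link.up = False                # simulate the severed tether
print(client.get("convoy-7"))  # served from the local copy
print(client.get("convoy-9"))  # never seen before, so nothing to fall back on
```

The last call shows the limit of this approach: data that was never fetched before the outage is simply unavailable, which is one reason the drill needs to be run in advance.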

It's clear that cloud-based IT architectures are going to be a central part of how DoD conducts its operations and supports U.S. military forces. For better or worse, we are well-advanced in this project, and it's likely to be initiated.

Understanding how the organism will operate with most of the data generated and processed outside of the platforms that are engaged in tactical operations, or the support of military operations, or the support of business operations, is a new world for DoD. But it’s one that is likely to be replicated elsewhere in society with governmental institutions and private sector institutions increasingly operating in this manner. So it's very important that we begin to get our hands around the full scope of this issue.

It's become a practice of DoD to give vendors, as well as the Congress, a long-term perspective on how a particular suite of programs or capabilities is going to be managed. For example, they just published a roadmap for the way in which DoD is going to procure unmanned systems between 2018 and 2042. Some systems, of course, are more amenable to long-term forecasts of that sort.

DoD purports to have a “road map” indicating its long-term plan for implementation of its move to the cloud, but it hasn’t been published yet. So it’s difficult to judge what exactly DoD’s expectations are.

Vendor lock-in

In fact, DoD has many clouds already. So a better way to look at this is that DoD is letting a contract to get another cloud, not the cloud. And as long as officials view it that way, and it's the first step, then we'll be in very good shape.

DoD needs to be careful not to get caught with vendor lock-in. We need the flexibility to move to another commercial cloud, and we need the flexibility to interoperate with other clouds at the same time. For example, maybe somebody provides a service that the winner doesn't provide, and we'd like to be able to import it.

It's going to be very important that we think carefully when we revisit things based on our experience, and that we don't make decisions that will inadvertently make it difficult for us to have flexibility moving forward. The kind of flexibility we should think about is not canceling our contract with the winner and going someplace else, but evolving to a community of clouds that are cooperating.

The argument that we should start with one cloud services provider because that would be a good way to get experience, and we'll understand the picture better, is a good argument. But it has to be an argument that's a first step to thinking in terms of being able to take advantage of the whole industry's innovations, and being able to bring to bear the best solutions when we can.

It's very seductive, usually on cost grounds, to do things in such a way that you are locked into a particular cloud services provider. You might start by deciding, “Well, let's not spend the extra money to build our applications in a way that makes them portable, because it's just an extra cost.” Or you might say, “Let's not run the drills where we move to another cloud for the weekend just to convince ourselves we can do it.” Those sorts of decisions are a slippery slope leading to lock-in. Moreover, the initial winner has every incentive to promote those kinds of decisions, for obvious reasons.
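One way to picture the portability cost and the weekend drill is a sketch in which the application is written against a provider-neutral interface and a drill copies everything to a second provider and verifies it. The names ObjectStore, InMemoryStore, and portability_drill are hypothetical; real providers expose their own storage APIs, and adapters for them are assumed here rather than shown.

```python
from typing import Dict, List, Protocol


class ObjectStore(Protocol):
    """Provider-neutral interface the application codes against (an assumption)."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...
    def keys(self) -> List[str]: ...


class InMemoryStore:
    """Stand-in for one provider's storage service; a real adapter would wrap its API."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def keys(self) -> List[str]:
        return sorted(self._objects)


def portability_drill(primary: ObjectStore, alternate: ObjectStore) -> bool:
    """Copy every object to the alternate provider and verify that it reads back intact."""
    for key in primary.keys():
        alternate.put(key, primary.get(key))
    return all(alternate.get(key) == primary.get(key) for key in primary.keys())


provider_a = InMemoryStore("provider-a")
provider_b = InMemoryStore("provider-b")
provider_a.put("manifest", b"pallets: 40")
print(portability_drill(provider_a, provider_b))
```

The extra cost the paragraph describes corresponds to writing and maintaining the adapters; skipping the drill means nobody knows whether the second adapter still works.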

This is one of the cases where we are going into this with good knowledge that lock-in would be a risk for many reasons. And we have to be strong about it, and not do the cost-cutting in ways that would compromise flexibility.

Some in the DoD leadership have spoken of trying to promote an environment that they describe as fiercely competitive, that will facilitate the ability of cloud service providers to present to DoD the full range of services they can offer. DoD is something of a pathfinder for the government as a whole in the move to the cloud. The DoD acquisition therefore has a responsibility to find an effective segue to the government as a whole.

There is an experiment that's run by the government already, and it's operational. That's by the intelligence community (IC). And they happened to choose to develop two clouds simultaneously, one which was commercial and one which was government-developed. They both actually now exist. And they have interoperability problems. Ideally, those involved in the DoD procurement really ought to hold off on giving the contract until the vendor demonstrates that he could interoperate with both of those IC clouds, because they're going to have to. This is not a theoretical problem. The problem will be for DoD to define the test that they will accept as proof. You can't just sort of wing it. And that requires informing the contractor of the test that has to be passed before the contractor begins work so they understand what has to happen.

Consider a counter-example, electronic medical records, where the government chose to impose a constraint on the operations, not on the physical computers. Epic, which is a company that's now very rich, built its system to meet the requirements, but made sure that nobody could get records using any other system. And if you've ever changed doctors, and you've tried to get your records transferred, you discover it's not like pushing a button and getting it sent. So this whole issue of, “What do you do after you have it?” is more than just, “how do I make the programs work and how do I interface with the users?”

There are really significant issues here. One test might be to pick the bombing data for all of the big strategic areas in the world, which is held in various sundry places, and make it a requirement that this cloud can actually go to all of those places, retrieve data, and get it to the guys with the bombs. On any procurement, we need to know what the test is that the vendor has to pass to get paid—and if not, no payment. That’s always a good question to ask.
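A test of this kind only works if its pass condition is written down before work begins. As a rough sketch, assuming the requirement is that the cloud can retrieve non-empty data from every named source, the check and the payment rule might look like this. The function names and the source names are hypothetical.

```python
from typing import Callable, Dict, List


def run_acceptance_test(sources: List[str],
                        retrieve: Callable[[str], bytes]) -> Dict[str, bool]:
    """Try to retrieve data from every required source and record pass or fail for each."""
    results: Dict[str, bool] = {}
    for source in sources:
        try:
            results[source] = len(retrieve(source)) > 0
        except Exception:
            results[source] = False  # any failure to retrieve counts as a failed source
    return results


def payment_due(results: Dict[str, bool]) -> bool:
    """Payment is due only if at least one source was tested and every source passed."""
    return bool(results) and all(results.values())


def fake_retrieve(source: str) -> bytes:
    """Hypothetical retrieval in which one source is unreachable."""
    if source == "site-c":
        raise ConnectionError("cannot reach site-c")
    return b"records from " + source.encode()


results = run_acceptance_test(["site-a", "site-b", "site-c"], fake_retrieve)
print(results, payment_due(results))
```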

We want to be able to show that both secure storage and operations can be conducted in the cloud and that we have a template for how the cloud procurement can assure DoD access to the innovation that's available in the private sector; that is, that DoD has an acquisition process that facilitates its access to a vibrant commercial market. The path DoD is taking, where they are emphasizing the need for competitive procurement in the long term, is going in the right direction.

Security

No system is secure. Some systems are more secure than other systems. People working on cybersecurity live with that reality. You have to decide what you are trying to secure against and what kinds of investments you are prepared to make. Our current machine rooms have insiders and employees who have access to lots of information. There have been a lot of sad stories over the last few years about what has happened as a result.

Cybersecurity is an issue of the weakest link in the network, and so there are a lot of opportunities in the current environment for bad things to happen. Part of the issue of the cloud is to get back to the logical equivalent of the control that existed in the old days.

One security advantage of the cloud is that the provider’s employees actually need not have access to any information in unencrypted form. What you should think about is a building where these racks of machines are in locked cages. There are video cameras pointing to all these locked cages. And the locks don't open while the machines are running. So, if you were a malfeasant employee, and you wanted to get to a machine, you would have to break in. And, of course, the video camera is recording this. While the machine is on, the data might be available. But data that is kept outside of the processor board would be encrypted. So when it's stored in outboard memory, when it's stored on disk, it's encrypted. And the only time you get to the machine is when it's turned off. And when it's turned off, all the data that you can get to is encrypted. All the data on the electronics of the processor will have diffused out.

And so you can think of Mission: Impossible stories where somebody breaks into the cage and has spoofed the cameras and so on. But it's not a matter of trusting the cloud employees. And that seems like an improvement.

Cloud providers also tend to be more attentive to security updates and so on. Stories about bad things happening because someone is running a very outdated version of an operating system are not stories that you will find in a cloud environment.

On the other hand, a cloud would tend to be a monoculture. That is, the way to get to scale is by replicating hardware and replicating software. That means if there's a problem with one, there's a problem with all. And that makes clouds attractive targets or attractive nuisances. And if you're worried about exploiting supply chain vulnerabilities, then knowledge that DoD is using a specific cloud would tell our adversaries what kinds of hardware DoD is using, and that might allow somebody to try to leverage a supply chain attack.

You have not only opportunities for cyber operations against U.S. infrastructure, but also the physical vulnerability of that infrastructure: not only the communications links but, in the case of the cloud, the storage sites themselves, where the hardware exists and the processing takes place.

While the adversary can benefit from access to the data, they can also deny the U.S. the benefit of the data if they physically disrupt the facilities. The initial JEDI procurement is focusing on three sites, each with the full capabilities of the system. These would be in commercial locations, although in principle, they could be on military reservations or established in other ways that improve physical security. Nevertheless, there will be a finite number of these. And as inviting a target as they are for cyber intrusion, their physical security will need to be attended to as well, as we do at other national security sites that are important, few in number, and expose the nation to considerable damage if they are disrupted or destroyed.

The Russians and Chinese recently engaged in a large-scale exercise in central Siberia called Vostok 2018. In one region of central Russia, in the Tula-Volga area, they had a mock attack on the power grid. At the same time, they were conducting these large-scale military operations at a training range in central Siberia. So it illustrates the fact that in addition to the kind of cybersecurity that we worry about day-to-day, we have issues relating to physical security as well, because of the fact that the adversary will have opportunities to engage the entire infrastructure, not just attack it at the most vulnerable spot, as has been our recent experience.