Laying the NFV Roads

Admin

How much SDN do service providers truly need?

Before answering this question, I’d like to start with an analogy that features another road…

Let’s say I am going on a road trip from Boston to New York. Think of the highway network as a Wide Area Network (WAN) infrastructure and the cities and towns as cloud sites. Let’s say we have virtual network functions (VNFs) in the Boston and New York clouds that need to talk to each other, similar to my need to drive from Boston to New York.

While I could build a whole new highway between Boston and New York, I’m probably going to simply select one of the existing routes. It’s ready and waiting, probably a little longer than any new direct highway I could build, but certainly quite a bit cheaper.

If we take that thinking into the network world, should we build a new network path every time there is a service change, or should we try to make optimal use of the existing pre-provisioned paths? Admittedly, the road-building options in the network world may not appear as expensive as building a new highway, and there are some advantages, such as traffic engineering. But there are hidden costs and challenges to this approach.

Didn’t we try this before?
SDN purists tried this in the data center a few years back: Make every switch OpenFlow-programmable so the network becomes ultra-programmable and reactive. This essentially built a new data center highway each time there was a service change. Every single change had the potential to create a new optimized road on the fly.

Unfortunately, this approach ran into trouble quickly. It raised a number of questions: Is every switch going to be OpenFlow-programmable? Is a central control plane going to be reliable and scalable? Is the core network going to scale as it now needs to hold end system state, i.e. knowledge of all the flows? Does it support end system dynamics and mobility efficiently as workloads become more dynamic and mobile? The answer to many of these questions is, “No!”

Instead, network overlays and edge intelligence won the game in the data center and the SDN purists abandoned the idea of trying to program every switch in the network dynamically whenever there was a new service or service change.

So, why should things be any different in the WAN when supporting cloud services? In our WAN world, we have some of the same issues as in the data center. We have deployed network equipment that is never going to support anything other than CLI and SNMP, and we have concerns over the reliability and scalability of a central control plane, among others.

It would seem that the same logic should apply and we should talk of WAN service overlays and edge intelligence. The only SDN we would need then is to program the edge intelligence to optimally select the network paths needed to get from one cloud to another. All policies related to the overlay networks and applications can also be programmed at the edge.
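To make the idea of edge intelligence concrete, here is a minimal sketch of an edge node choosing among paths that already exist rather than asking the network to build a new one. The `Path` fields, the path names and the policy parameters (`need_mbps`, `max_latency_ms`) are all hypothetical, chosen only for illustration; a real edge would learn path state from measurement or from the underlay.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """A pre-provisioned WAN path as seen from the edge (hypothetical model)."""
    name: str
    latency_ms: float
    free_mbps: float
    up: bool = True


def select_path(paths, need_mbps, max_latency_ms):
    """Return the lowest-latency live path that satisfies the policy, or None."""
    candidates = [
        p for p in paths
        if p.up and p.free_mbps >= need_mbps and p.latency_ms <= max_latency_ms
    ]
    return min(candidates, key=lambda p: p.latency_ms, default=None)


# Illustrative inventory of existing Boston-to-New-York "roads" (made-up values).
PATHS = [
    Path("bos-nyc-direct", latency_ms=6.0, free_mbps=200.0),
    Path("bos-nyc-via-hartford", latency_ms=9.0, free_mbps=800.0),
    Path("bos-nyc-via-albany", latency_ms=14.0, free_mbps=1500.0),
]

choice = select_path(PATHS, need_mbps=500, max_latency_ms=12)
print(choice.name if choice else "no path meets policy")
```

With these made-up numbers the direct path lacks capacity and the Albany path is too slow, so the edge picks the Hartford route. Nothing in the core had to be reprogrammed; only the edge policy was evaluated.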

The WAN transport paths can be built by whatever means are available. These paths are provisioned on a slow time scale that reflects major changes in traffic patterns, not every micro-service change. As a result, network programming APIs and tools may not be as critical, and some clunkier tools may suffice. This approach is elegant and has the advantage of leveraging whatever WAN technology is already deployed (e.g. Layer 2, 2.5 or 3). NFV can be as agile as we like without being tied to the utopia of infinite network programmability.

Of course, there is no free lunch. Service overlays and abstraction can have a hidden price. Take failure recovery as an example. If there is a failure in the underlay network, the overlay network may react and decide to take an alternate route. Meanwhile, the underlay network may have its own failure recovery mechanism and reroute traffic. The two reactions together can produce non-optimal routing and/or race conditions. However, service providers have been building overlay networks for a long time. IP networks are built on top of optical networks, which are built on top of fiber networks, which in turn are built on top of physical conduit networks. These issues have been solved architecturally for many of these kinds of overlays, and the solutions can be adopted for NFV service network overlays as well.
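One common way layered networks avoid this race is a hold-off timer: the upper layer waits briefly after a failure so the lower layer gets the first chance to restore. The article does not name a specific mechanism, so the sketch below is an assumption for illustration only, and the function name, parameters and the 200 ms default are hypothetical.

```python
def overlay_should_reroute(failure_at_s, now_s, underlay_restored, hold_off_s=0.2):
    """Decide whether the overlay should switch to an alternate path.

    The overlay holds off for `hold_off_s` seconds after detecting a failure.
    If the underlay restores the path in the meantime, the overlay does nothing,
    so the two layers do not reroute the same traffic at the same time.
    """
    if underlay_restored:
        return False  # underlay already fixed it; stay on the current path
    return (now_s - failure_at_s) >= hold_off_s


# 50 ms after the failure: still inside the hold-off window, so wait.
print(overlay_should_reroute(0.0, 0.05, underlay_restored=False))
# 250 ms after the failure with no underlay recovery: the overlay takes over.
print(overlay_should_reroute(0.0, 0.25, underlay_restored=False))
```

The trade-off is that a longer hold-off avoids duplicate rerouting but lengthens the outage whenever the underlay cannot recover on its own.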

The service agility promise for service providers is not realized by laying the NFV road one brick at a time but rather by appropriate use of edge intelligence that leverages the network roads that already exist. Anything else is either an indulgence in a utopian vision or an exercise in futility.

