Innovations at software speed…

The networking industry is indeed going through an interesting time. One thing is quite clear: innovation at software speed has begun. Within the span of just one year, Nicira, Embrane and Midokura launched their products. PlumGrid and Contrail are probably 6-9 months shy of launching their offerings. What is really interesting about these companies is that they are pure software plays, innovating at software speed. I am quite impressed with what Midokura has done with MidoNet. Pure software – they have taken OVS and turned it into a distributed virtual router, firewall and load balancer. I guess they have started where Nicira left off :-). And BTW, they have done all of this with some 10 engineers. Isn't that impressive? I think we will see more of these companies in the near future.
I have been thinking about how things are evolving in the cloud-scale (private or public) data center space, and here are some of the things I see happening:
  • The barrier to entry in the networking space is decreasing. That is good for the industry, as innovation will thrive, and we already see the proof points.
  • I am convinced that we will see more and more innovation at the network edge (the hypervisor layer). In other words, we will see more overlay-type solutions built in software. Open vSwitch will most likely be the foundation, and companies will innovate on top of that.
  • IMO, the physical network will become a plain L3 fabric with a non-blocking Clos/fat-tree architecture that scales horizontally (see the sketch after this list).
  • Network services are moving to the network edge, and this will accelerate with private cloud adoption.
  • Fundamentally, the network will be fluid – any workload, any network service, anywhere, without a single choking point for network services.
  • Orchestration is the key. In the last couple of months, I can't believe how many times I have heard the comment "I want to build a private cloud based on open stuff." BTW, they didn't mean OpenStack specifically, but it could very well be.
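As a rough sketch of what "scales horizontally" means for the Clos/fat-tree point above, here is a back-of-the-envelope sizing of a two-tier leaf-spine fabric (the switch radix and oversubscription figures are assumed values, purely for illustration):

```python
# Back-of-the-envelope sizing of a 2-tier (leaf-spine) Clos fabric.
# 'radix' is an assumed port count per switch; all figures are illustrative.
def leaf_spine_capacity(radix: int, oversubscription: float = 1.0) -> dict:
    """Server ports supported by a non-blocking (or oversubscribed) leaf-spine fabric."""
    uplinks_per_leaf = int(radix / (1 + oversubscription))  # leaf ports split between up and down
    server_ports_per_leaf = radix - uplinks_per_leaf
    max_leaves = radix             # each spine port connects to one leaf
    max_spines = uplinks_per_leaf  # each leaf uplink goes to a distinct spine
    return {
        "spines": max_spines,
        "leaves": max_leaves,
        "server_ports": max_leaves * server_ports_per_leaf,
    }

print(leaf_spine_capacity(radix=64))                        # non-blocking: 2048 server ports
print(leaf_spine_capacity(radix=64, oversubscription=3.0))  # 3:1 oversubscribed: 3072 server ports
```

The point is that capacity grows by adding more leaves and spines (or a third tier), not by replacing a chassis with a bigger box.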
I like Marc Andreessen's comment that software is eating the world. The networking industry is no different.
Rants and raves are welcome.

Edge vs Core

It is interesting to see how DC networking is evolving. It is also interesting to watch the war between the Edge and the Core. What I mean by Edge vs Core is whether the network intelligence should live at the edge or at the core. I must admit, the edge seems to be winning over the core. Let's look at the choices we have:

1. Intelligent edge and simple core

vs

2. Intelligent core and  simple edge

In reality, the first option has fared well from a scalability perspective. The Internet is a perfect example of that.

Now let's come back to the present-day problems: virtualization and VM mobility. Do we need overlay networks built with VXLAN/NVGRE/STT to solve those problems, and thus create an intelligent edge, or should we use EVB (802.1Qbg) and make the core more intelligent? Interestingly, IBM recently launched a virtual switch that supports EVB. But given the apathy of most vendors toward implementing EVB, I am not sure it is gaining any traction. While EVB can deliver the right VLAN to the right port as a VM moves, it doesn't address some of the problems that overlay technologies are trying to solve, for example inadequate forwarding table sizes in switches and the decoupling of logical and physical configuration.
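To put the forwarding-table concern into rough numbers, here is a back-of-the-envelope sketch (the VM density, host count and table capacity are made-up illustrative figures, not the specs of any particular switch):

```python
# Rough illustration of MAC table pressure in a large, flat L2 domain.
# All numbers are hypothetical, for illustration only.
vms_per_host = 40           # assumed VM density per hypervisor
hosts = 1000                # assumed number of hypervisors in the L2 domain
mac_table_capacity = 32000  # assumed MAC table size of a typical ToR switch

total_macs = vms_per_host * hosts  # every ToR may end up learning all of them
print(f"MAC entries needed: {total_macs}, ToR capacity: {mac_table_capacity}")
print("Table overflow" if total_macs > mac_table_capacity else "Fits")
# With an overlay, the physical switches only learn the hypervisors' outer
# addresses -- roughly 'hosts' entries instead of 'total_macs'.
```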

In my opinion, the intelligent edge has many advantages. First of all, it doesn't need any forklift upgrades. And by separating the logical topology from the underlying physical topology, overlay technology can certainly make the network more agile.

Network virtualization, a new era of overlay networks

Much has been said about network virtualization in the last year. The topic attracted quite a lot of attention after Nicira came out of stealth mode and launched its products last week. So what is network virtualization, and what can be done with this new technology? This blog is my attempt to capture the current challenges in the DC and the promises this new technology offers.

So what are the market needs?

Essentially, what we all want is a simple network topology where any workload can be placed anywhere without needing to change its network attributes. Not a difficult ask, is it? Let's see how that can be achieved. We all know that in computer networking, two nodes can talk to each other either via Layer 2 or Layer 3.

Layer 2 is the simplest topology: all the networking devices share the same network segment and communicate with each other directly using MAC addresses. But the problem with L2 is scalability. That said, there have been many recent innovations from the networking giants (read: FabricPath and QFabric) to flatten the network and improve L2 scalability. The bottom line is that L2 is simple and easy to implement. It is convenient for distributed application servers to talk to one another over L2. Lastly, and most importantly, our goal of placing any workload anywhere can easily be accomplished with L2 without changing network attributes.

Layer 3, on the other hand, is scalable thanks to its hierarchical addressing scheme: route aggregation provides scalability, ECMP provides efficient link utilization, and shortest-path-first routing provides efficiency. However, one significant disadvantage of an L3 architecture is that a workload can't be moved across subnets without changing its network address. That is not an absolutely correct statement, because LISP tries to address this, but the network has to support LISP natively.
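To make route aggregation concrete, here is a minimal sketch using Python's standard ipaddress module (the prefixes are an invented addressing plan):

```python
import ipaddress

# Four contiguous /24 prefixes, e.g. one per rack (hypothetical addressing plan).
rack_prefixes = [
    ipaddress.ip_network("10.1.0.0/24"),
    ipaddress.ip_network("10.1.1.0/24"),
    ipaddress.ip_network("10.1.2.0/24"),
    ipaddress.ip_network("10.1.3.0/24"),
]

# Upstream routers only need the aggregate, not every rack prefix.
aggregate = list(ipaddress.collapse_addresses(rack_prefixes))
print(aggregate)  # [IPv4Network('10.1.0.0/22')]
```

The flip side, as noted above, is that the aggregate only holds if 10.1.2.x workloads really sit behind the 10.1.2.0/24 rack; move one elsewhere and either its address changes or a host route has to be leaked.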

So what do we want?

We want L3 scalability with L2 simplicity and flexibility. Fundamentally, what we are looking for is an abstraction layer on top of the underlying L3 network infrastructure. The abstraction layer would enable users to create Layer 2 segments across Layer 3 boundaries. In a nutshell, that is what network virtualization is.

So how do we achieve that?

Well, primarily what we are trying to do is tunneling: tunnel L2 frames in L3 packets at the edge. That way the existing network infrastructure sees only the outer L3 packets and routes them accordingly. Fundamentally, we are creating an overlay. VXLAN and NVGRE are some of the standards being used to create the overlay at the edge. But what about performance? Can we really scale these overlay technologies for an MSDC environment, which means scaling beyond 20K+ ports? For instance, VXLAN needs multicast at cloud scale, and NVGRE creates an L3 mesh with inefficient link utilization. Well, I think we will find a way to optimize these technologies to operate at cloud scale.
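To show what "tunnel L2 frames in L3 packets" looks like on the wire, here is a minimal VXLAN encapsulation sketch in Python (the VNI, payload and the vxlan_encap helper are invented for illustration; the 8-byte header layout and UDP port 4789 come from the VXLAN spec):

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned destination port for VXLAN

def vxlan_encap(inner_l2_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header to an inner Ethernet frame.

    The result becomes the payload of an outer UDP/IP packet built by the
    sending VTEP (the hypervisor vswitch); the physical network only ever
    routes on the outer IP header.
    """
    flags = 0x08  # 'I' bit set: the VNI field is valid
    header = struct.pack("!B3xI", flags, vni << 8)  # flags, 3 reserved bytes, 24-bit VNI + 8 reserved bits
    return header + inner_l2_frame

# Hypothetical example: a dummy inner frame on virtual segment 5000
inner_frame = b"\x00" * 14 + b"tenant payload"
udp_payload = vxlan_encap(inner_frame, vni=5000)
print(len(udp_payload), "bytes handed to the outer UDP socket, dst port", VXLAN_UDP_PORT)
```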

Wrapping up…

So our ultimate goal of placing any workload anywhere can be achieved with these overlay technologies. And since we are creating these overlays at the edge, many cool things become possible. For example, we can have many virtual segments with the same IP address ranges. This enables enterprise or SP customers to assign any IP addresses to any applications or customers, irrespective of what is already in use in their network. The other cool use case would be IPv6 endpoints communicating with one another over IPv4 infrastructure. I believe network virtualization is here to stay, but only time will tell how fast companies adopt this new technology.
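As a toy illustration of overlapping address ranges, here is a hypothetical per-segment forwarding table keyed by (VNI, IP), so two tenants can both use 10.0.0.5 without conflict (all names and values are invented):

```python
# Hypothetical overlay forwarding state: the lookup key includes the virtual
# network ID (VNI), so identical tenant IPs never collide.
forwarding_table = {
    (5001, "10.0.0.5"): {"vtep": "192.168.1.11", "mac": "52:54:00:aa:01:05"},  # tenant A
    (5002, "10.0.0.5"): {"vtep": "192.168.1.27", "mac": "52:54:00:bb:01:05"},  # tenant B
}

def lookup(vni: int, dst_ip: str) -> dict:
    """Return the outer tunnel endpoint (VTEP) for a tenant destination."""
    return forwarding_table[(vni, dst_ip)]

print(lookup(5001, "10.0.0.5"))  # tenant A's 10.0.0.5
print(lookup(5002, "10.0.0.5"))  # tenant B's 10.0.0.5 -- a different VM entirely
```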

New networking paradigm

Lately, OpenFlow and SDN have generated a lot of buzz. Many startups are already touting OpenFlow switches and SDN applications that can simplify and change the data center networking landscape. But apart from commoditizing switching hardware and speeding up innovation in the networking space (not that those are not enough), what else do OpenFlow and SDN promise to solve? Let's look at the current DC problems first.

So what are the problems….

Well, the problem is that traditional data center design doesn't scale when companies like Google, Facebook, and Amazon are building MSDCs (Massively Scalable Data Centers) with the objective of putting any workload in any place. Workloads can move, thanks to server virtualization, but the traditional network is the bottleneck.

So what are the problems with traditional networking…

Let's start by listing the problems:

1. Static networking prevents workload mobility

2. Server-to-server traffic causes a computational bottleneck

3. The scale-up model doesn't serve today's workload elasticity

 

Let me explain what I mean, one by one.

1. Static networking: If we look at the traditional deployment model, the common practice is to map a VLAN to an application. This rigid VLAN mapping model breaks workload mobility. In addition, VLAN spanning concentrates traffic at the aggregation layer, where router links are oversubscribed. Not to mention that VLAN provisioning can take weeks.

2. Server-to-server traffic: With new types of applications, there is an explosion of server-to-server traffic. According to one study (I don't remember the name), north-south and east-west traffic make up 25% and 75% of total data center traffic, respectively. In the traditional networking model, different server farms are placed in different L2 domains, and communication among them goes through an L3 aggregation switch, which is grossly oversubscribed – at times as high as 250:1 at the top of the hierarchy (a rough worked example follows this list). Here the network becomes the bottleneck for computation.

3. Scale-up model: The traditional networking model is based on scale-up. More performance can be squeezed out of an undersubscribed box, but once the maximum hardware capacity is reached, new hardware is required to replace the old. In contrast, with a scale-out model, additional performance demand can be met by adding one more unit. Many virtual appliances are now built on the scale-out model rather than scale-up, providing the efficient resource utilization and elasticity needed in today's data centers.
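Here is the rough worked example of how oversubscription compounds up the hierarchy, referenced in point 2 above (the port counts and uplink numbers are hypothetical, chosen only to show the multiplication):

```python
# Hypothetical three-tier design; all port counts are illustrative.
# Oversubscription at each tier = downstream bandwidth / upstream bandwidth.
tor_ratio  = (48 * 1) / (4 * 10)   # 48 x 1G server ports, 4 x 10G uplinks   -> 1.2:1
agg_ratio  = (24 * 10) / (2 * 10)  # 24 x 10G from ToRs, 2 x 10G to the core -> 12:1
core_ratio = 16 / 1                # 16 aggregation pods into one core pair  -> 16:1

end_to_end = tor_ratio * agg_ratio * core_ratio
print(f"Worst-case end-to-end oversubscription: {end_to_end:.0f}:1")  # ~230:1
```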

OpenFlow and SDN promise to solve the above problems by providing a global view of the network. I will explore OpenFlow and SDN more in my next blog.