Making Managing and Deploying VMWare vSphere Networking a Cinch

The title is pretty lofty, and having lived through many VMWare deployments, I wouldn’t say networking under VMWare is rocket science, but there are a few things I’d like to offer some tips on.  Now, there are bound to be many more, and I’d invite the hive mind to add their own experiences to mine or tell me that I’ve got it all wrong; either works!

Here’s my top list of things you’ll want to think of when managing networking under VMWare vSphere:

Tag It!  Tag It!  Tag It! Cuz Q is Cool (ok, 802.1Q isn’t that cool, but I still love it)

A common mistake I see is people not tagging their network interfaces.  Whether you only plan to connect one vSwitch to an ethernet card or just put your vmnic0 on it, tag it.  The reason I am so forceful about this: just as in the physical networking world, trunking gives you, for very minimal overhead, the flexibility of carrying multiple layer 2 networks over the same port.  In very small implementations I often see the ports configured as “access” ports, which limits the given physical network cards to a single layer 2 network.

Inevitably, much like Murphy’s Law (I get dibs on him first, he’s so going to get it), the need arises to connect additional virtual LANs to an ESX server.  If tagging is used from the initial onset, it’s “no fuss, no muss”: simply add the VLAN on the switches and create the new vSwitch.  If tagging is not enabled, you’ll be enjoying a nice outage on each ESX server to enable it.  Better to do it upfront while downtime is an option than after you are in production.
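
For illustration, here’s a minimal sketch of what tagging looks like on both sides.  Treat the interface name, VLAN IDs, vSwitch and port group names as placeholders for your own environment, and note that some Cisco platforms also require “switchport trunk encapsulation dot1q” before the trunk command will take:

! Physical switch port facing the ESX host (Cisco sample, example VLANs 10, 20 and 30)
interface GigabitEthernet1/0/1
 description esx01 vmnic0
 switchport mode trunk
 switchport trunk allowed vlan 10,20,30

# On the ESX service console, tag the port group with its VLAN ID (example names)
esxcfg-vswitch -p "VM Network" -v 10 vSwitch0
# List vSwitches and port groups to confirm the VLAN ID took
esxcfg-vswitch -l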

ESX should be treated like an edge switch

Often, network technicians/operators tend to think of a server as just a server.  Luckily this is slowly changing as VMWare has become a common sight in the datacenter.  If you’re not lucky enough to have an enlightened soul, explain that yes, you have a server, but your network interfaces are in fact a bridge into a virtual switching infrastructure.  If they are still doubtful, the fact that ESX supports EtherChannel and 802.3ad often gets them a little curious.  VMWare KB 1004048 explains nicely how it can be implemented.
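
If you want to sketch the idea out for them, the switch side of that KB looks roughly like the sample below.  The port-channel number and member interfaces are placeholders, and keep in mind that a standard vSwitch only supports static aggregation (channel-group mode “on”, no LACP negotiation) with the vSwitch load balancing policy set to “Route based on IP hash”:

! Static EtherChannel towards an ESX host (Cisco sample)
interface Port-channel1
 switchport mode trunk
!
interface GigabitEthernet1/0/1
 channel-group 1 mode on
!
interface GigabitEthernet1/0/2
 channel-group 1 mode on

If the channel-group mode and the vSwitch load balancing policy don’t match, expect flapping MAC addresses and dropped connections rather than a clean failure, which makes it a fun one to troubleshoot.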

Service Style Networks Lessen the Workload

A tendency of network engineers is to only provision the VLANs you require when you request them.  I know I will start a good debate with this point, but I generally ask my networking staff to simply present all VLANs to my ESX clusters.  Yes, there are some drawbacks, such as broadcast traffic (each interface will receive all broadcast traffic for the site, regardless of network: ARPs, BPDUs, et cetera), some possible security concerns (if you already have DMZ, public and private VLANs on the same physical switches … I would recommend keeping an eye on VMWare’s Security Advisories whether you choose to cross security zones or not), and undoubtedly there are other drawbacks (feel free to comment).

My retort often comes in the form of “Hey, every time a new VM requires access to a VLAN we haven’t configured, do you want me to call you to add it to my 30-node ESX farm?” and immediately everyone around the table gets the point.  In short, using an “all VLANs” strategy lessens the workload for the network team and enables more flexibility and quicker turnaround from an ESX point of view.  For a network engineer, it’s quite simple (Cisco sample):

switchport mode trunk

switchport trunk allowed vlan all

Additionally, I have had too many “fun” experiences where a virtual machine migrates to another host and, through a simple human error, “host X just went off the network!  What’s going on?”  You can guess what the error was: a missing VLAN.
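
When that call comes in, a quick sanity check on both sides usually finds the culprit (the interface name below is just an example):

On the physical switch: show interfaces GigabitEthernet1/0/1 trunk  (confirms which VLANs the trunk actually allows and forwards)

On the ESX host: esxcfg-vswitch -l  (lists each vSwitch and the VLAN ID tagged on each port group)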

Come to love the dvSwitch!

So you’ve got an ESX cluster, eh?  5 hosts and things are rolling fine?  Got a couple of VLANs and things are quiet?

Oh wait, here comes the boss, some big project is coming that’s going to make you have to work?  What?!?  20 new ESX servers, and some genius on the network architecture team has decided there need to be 14 new VLANs?  Oh, this ain’t pretty: 280 new vSwitch port groups, plus adding the VLANs to the original ESX cluster!  Guessing it’s time to say bye to your family and build yourself a little spot to sleep above the ceiling tiles; you knew this day was coming.  Wait one sec, what’s this feature called dvSwitches?  Hold the presses!  You might just get to see your family before your 80th birthday after all!

For those of us who’ve been using VMWare ESX since the v2.x days, we’ve come to love our vSwitches and the abilities they give us, making our ESX servers look virtually like a network switch to the networking team.  Of course, anyone who’s managed large ESX clusters knows well that these can also be the bane of your existence.  Between the number of vSwitches to define and vMotion being very picky that the names match exactly, they aren’t quite suited to the 15+ ESX server farms out there.

Since version 4.0, VMWare has given us this little blessing called a “dvSwitch” (aka Distributed Virtual Switch).  The primary difference, from a 10,000-foot view your manager’s manager could easily grasp, is “instead of defining virtual switches on each individual ESX server, the switches are defined at the datacenter level.”  I won’t dive in too deep on “how to implement a dvSwitch” as there are great resources out there, but the sheer manageability it provides, especially in large deployments, is immense.

  • Configure port groups once at the datacenter level (the equivalent of a vSwitch, well, nearly) instead of once for each individual ESX host
  • Provisioning a new host’s dvSwitch takes less than 2 minutes, versus vSwitches, which can take 20 minutes or more (depending on the network complexity)
  • Adding a new VMWare ESX server to an existing cluster is greatly simplified.  Simply clone, present the LUNs, have the switches configured and then run the “Add Host” wizard for the dvSwitch.
  • Lessen the chance of provisioning errors.  A simple mistake naming a vSwitch can cause vMotion or DRS to fail, or even cause outages.

In short, dvSwitches are a way to save a lot of time.  The time savings become especially apparent in very large deployments where a single resource may need to manage 30+ ESX servers alone.

Now, the only caveat: as of the last time I checked (and please correct me if I am wrong), dvSwitches are only available with Enterprise Plus licensing.  For those who are concerned, this can easily be justified by the time savings in commissioning, managing and troubleshooting.

Well, those are my ramblings for the evening.  Enjoy the read and please, feel free to comment.  I love feedback, whether it’s kudos or a good debate!
