Wednesday, May 28, 2014

DELL Force10 : VLT - Virtual Link Trunking

Do you know CISCO's Virtual Port Channel? Do you want the same with DELL datacenter switches? Here we go.

General VLT overview

Virtual Link Trunking or VLT is a proprietary aggregation protocol developed by Force10 and available in their datacenter-class or enterprise-class network switches. VLT is implemented in the latest firmware releases (FTOS from 8.3.10.2) for their high-end switches like the S4810, S6000 and Z9000 10/40 Gb datacenter switches. Although VLT is proprietary to Force10, other vendors offer similar features that let users set up an aggregated link towards two different (logical) switches, whereas a standard aggregated link can only terminate on a single logical switch (either a single physical switch or different members of a stacked switch setup). For example, CISCO's similar proprietary protocol is called Virtual Port Channel (aka vPC) and Juniper has another one called Multichassis LAG (MC-LAG).

VLT is a layer-2 link aggregation protocol between end-devices (servers) connected to (different) access-switches, offering these servers a redundant, load-balancing connection to the core-network in a loop-free environment, eliminating the requirement for the use of a spanning-tree protocol. Where standard link aggregation (static LAG or LACP, standardized in IEEE 802.3ad and later moved to IEEE 802.1AX) requires the different (physical) links to be connected to the same (logical) switch (such as stacked switches), VLT allows a server to connect to the network via two different switches.

Besides connecting end-devices like servers, VLT can also be used for uplinks between (access/distribution) switches and the core switches.

The general VLT description above is from Wikipedia. For more information about VLT see http://en.wikipedia.org/wiki/Virtual_Link_Trunking

DELL published the Force10 VLT Reference Architecture (PDF), which explains VLT in detail. It is highly recommended to read it, together with all product documentation and release notes, before any real planning, design and implementation.

VLT Basic concept and terminology

The VLT peers exchange and synchronize Layer2-related tables to achieve harmonious Layer2 forwarding among the whole VLT domain, but the mechanism involved is transparent.

VLT is a trunk (as per its name) attaching remote hosts or switches.
VLTi is the interconnect link between the VLT peers. For historical reasons that is also called ICL (InterConnect Link) in the command outputs.

All the following rules apply to VLT topologies
 2 units per domain (as of FTOS 8.3.10.2)
 8 links per port-channel or fewer
 Both units should run the same FTOS version
 The backup link should use a different link than the VLTi, and preferably a diverse path

Simple implementation plan

Below is a simplified implementation plan for VLT configuration. It should be handy for any lab or proof of concept deployment.

 The implementation plan is divided into 6 steps.
  1. Check or configure spanning tree protocol
  2. Check or configure LLDP
  3. Check or configure out of band management leveraged for VLT backup link
  4. Configure VLTi link (VLT inter connect)
  5. Configure VLT domain
  6. Configure VLT port-channel

Step 1 - Check or configure spanning tree protocol
Rapid Spanning-Tree should be enabled to protect against configuration and patching mistakes. The STP configuration depends on the customer environment and spanning tree topology preferences. The parameters below are just examples.
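The example configurations below use timers that are shorter than the usual defaults (hello-time 1, max-age 6, forward-delay 4), so it is worth checking that they still respect the relationship IEEE 802.1D asks for between the three values. A minimal Python sketch of that check follows; the function name is made up for illustration and is not part of any Dell tool.

```python
def rstp_timers_consistent(hello_time: int, max_age: int, forward_delay: int) -> bool:
    """Check the IEEE 802.1D timer relationship:
    2 * (forward_delay - 1) >= max_age >= 2 * (hello_time + 1)."""
    return 2 * (forward_delay - 1) >= max_age >= 2 * (hello_time + 1)


# Values used in the example configurations in this post.
print(rstp_timers_consistent(hello_time=1, max_age=6, forward_delay=4))   # True
# A max-age that is too large for the chosen forward-delay.
print(rstp_timers_consistent(hello_time=1, max_age=20, forward_delay=4))  # False
```

If you tune the timers for your own environment, the same check applies to whatever values you pick.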

Switch A - configured to become RSTP root
protocol spanning-tree rstp
 no disable
 hello-time 1
 max-age 6
 forward-delay 4
 ! low priority so that this switch is elected as STP root
 bridge-priority 4096

Switch B - configured as backup root.
protocol spanning-tree rstp
 no disable
 hello-time 1
 max-age 6
 forward-delay 4
 bridge-priority 8192

Step 2 - LLDP configuration
LLDP must be enabled so that each switch advertises its configuration and receives configuration information from the adjacent LLDP-enabled devices.

Switch A
protocol lldp
  advertise management-tlv system-description system-name
  no disable

Switch B
protocol lldp
  advertise management-tlv system-description system-name
  no disable

Step 3 - VLT backup link
The VLT backup link is used to exchange heartbeat messages between the two VLT peers. Configure the management interface on both VLT peers to activate the backup link.

Switch A
interface management 0/0
  ip address switch-A-IP/switch-A-mask
  no shutdown

Switch B
interface management 0/0
  ip address switch-B-IP/switch-B-mask
  no shutdown

Step 4 - VLTi (interconnect) link
Now we configure the VLTi, the connection between the two VLT peers. A static port-channel is recommended for redundancy. Two 40GbE interfaces are enough, and we bundle them into port-channel 127. No special configuration is required at the interface or port-channel level. To become a VLTi (automatically managed by the system), the port-channel should be in default mode (no switchport).

Switch A
interface port-channel 127
  description "VLTi - interconnect link"
  channel-member VLTi_INTERFACE1
  channel-member VLTi_INTERFACE2
  no ip address 
  mtu 12000
  no shutdown

Switch B
interface port-channel 127
  description "VLTi - interconnect link"
  channel-member VLTi_INTERFACE1
  channel-member VLTi_INTERFACE2
  no ip address 
  mtu 12000
  no shutdown

Note 1: Don't forget to run no shutdown on the physical interfaces acting as port-channel members. The port-channel stays down until you bring them up.
Note 2: Neither the port-channel nor its physical member ports may be in switchport mode if they are to be used for the VLTi.
Note 3: If you plan to use jumbo frames (a bigger MTU), you have to configure them on the VLTi links as well (the maximum MTU on Force10 is 12000, so it is a good idea to set it to the maximum).

Use the following configuration for all VLTi interfaces
interface VLTi_INTERFACEx
  no shutdown
  no switchport

Verify port-channel status on both switches
show int po 127 brief

The port-channel should be up and consist of 2 ports.

Step 5 - VLT domain configuration
 We have to configure the domain number and the VLT domain options described below.
  • The peer-link command selects the VLTi interface.
  • The back-up destination command selects the path for the heartbeat message exchange; it takes the IP address of the other VLT peer.
  • The primary-priority command sets the VLT role (primary or secondary). The switch with the lower priority becomes the primary VLT node.
  • The system-mac mac-address value must match on both peers in the VLT domain.
  • The unit-id command assigns unit ID 0 or 1 explicitly, which minimizes the time required for the VLT system to determine the unit ID assigned to each peer switch when one peer switch reboots.

Switch A (primary)
vlt domain 1
  peer-link port-channel 127
  back-up destination switch-B-IP
  primary-priority 1
  system-mac mac-address 02:00:00:00:00:01
  unit-id 0

Switch B (secondary)
vlt domain 1
  peer-link port-channel 127
  back-up destination switch-A-IP
  primary-priority 8192
  system-mac mac-address 02:00:00:00:00:01
  unit-id 1

For verification we can use the commands below
sh vlt brief
sh vlt statistics
sh vlt backup-link
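
Because the two domain stanzas differ only in the backup destination, priority and unit ID, they are easy to generate from one set of inputs, which also guarantees that the system MAC matches on both peers. Here is a small Python sketch of that idea. The function names and the 192.0.2.x addresses are placeholders for illustration only, and the commands it emits are just the ones shown above.

```python
def vlt_domain_config(domain_id, peer_link_po, backup_ip, priority, system_mac, unit_id):
    """Render one peer's 'vlt domain' stanza using the commands from this post."""
    return "\n".join([
        f"vlt domain {domain_id}",
        f"  peer-link port-channel {peer_link_po}",
        f"  back-up destination {backup_ip}",
        f"  primary-priority {priority}",
        f"  system-mac mac-address {system_mac}",
        f"  unit-id {unit_id}",
    ])


def vlt_pair(domain_id, peer_link_po, ip_a, ip_b, system_mac):
    """Return (switch A config, switch B config); each peer points at the other's IP."""
    switch_a = vlt_domain_config(domain_id, peer_link_po, ip_b, 1, system_mac, 0)
    switch_b = vlt_domain_config(domain_id, peer_link_po, ip_a, 8192, system_mac, 1)
    return switch_a, switch_b


# Placeholder management IPs (documentation range), not real addresses.
config_a, config_b = vlt_pair(1, 127, "192.0.2.1", "192.0.2.2", "02:00:00:00:00:01")
print(config_a)
print()
print(config_b)
```

The point of the design is that the shared values (domain number, peer-link, system MAC) are passed once, so they cannot drift apart between the two switches.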

Step 6 - VLT Port Channel
It is recommended to build VLTs facing hosts/switches with LACP, to benefit from the protocol negotiation. However, static port-channels are also supported.

It is also recommended to configure dampening (or an equivalent) on the interfaces of the connected hosts/switches (access switches, not VLT peers). The reason is that at start-up, once its physical ports are active, a newly started VLT peer still needs several seconds to negotiate protocols and synchronize (VLT peering, RSTP, VLT backup link, LACP, VLT LAG sync, etc.). The attached devices are not aware of that activity: as soon as the physical interface comes up, the connected device starts forwarding traffic on the restored link even though the VLT peer is not yet ready, and that traffic is black-holed. Dampening on the connected devices (access switches) holds the interface down temporarily after a VLT peer reload. A reload is detected as a flap: the link goes down and then up. Dampening thus acts as a cold-start delay, ensuring that the VLT peer is up and ready to forward before the physical interface is activated, which avoids temporary black holes. Suggested dampening time: 30 seconds to 1 minute. We use 60 seconds in our example.
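
To get a feel for how long dampening would actually hold a port down, here is a rough Python sketch of the usual exponential-decay model. It assumes that the four arguments of the dampening command used later in this post are half-life, reuse threshold, suppress threshold and maximum suppress time (in that order), and that each flap adds a fixed penalty of 1024. Both are assumptions to verify against the documentation for your platform and release, and the function name is hypothetical.

```python
import math

FLAP_PENALTY = 1024  # assumed penalty added per flap; verify for your platform


def hold_down_seconds(half_life, reuse, suppress, max_suppress, flaps=1):
    """Estimate how long an interface stays suppressed after `flaps` back-to-back flaps."""
    penalty = FLAP_PENALTY * flaps
    if penalty < suppress:
        return 0.0  # the penalty never crosses the suppress threshold
    # The penalty halves every `half_life` seconds; solve
    # penalty * 0.5 ** (t / half_life) == reuse for t.
    decay_time = half_life * math.log2(penalty / reuse)
    # The maximum suppress time caps the result.
    return min(max(decay_time, 0.0), float(max_suppress))


print(round(hold_down_seconds(10, 100, 1000, 60), 1))  # 33.6
```

Under these assumptions a single flap would be suppressed for roughly 34 seconds, with the last value acting only as an upper bound. Treat this as a way to reason about the parameters, not as a statement of how a particular switch behaves.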

So let's finally configure the port channel (dynamic LAG) that interconnects the S4810s (VLT domain) with the upstream S60, which is our hypothetical L3 switch (router).

Switch A
interface port-channel 1
  description "Uplink to S60"
  no ip address
  switchport
  vlt-peer-lag port-channel 1
  no shutdown

interface tengigabit 0/PO1-INTERFACE
  port-channel-protocol lacp
    port-channel 1 mode active
  dampening 10 100 1000 60
  no shutdown

Switch B
interface port-channel 1
  description "Uplink to S60"
  no ip address
  switchport
  vlt-peer-lag port-channel 1
  no shutdown

interface tengigabit 0/PO1-INTERFACE
  port-channel-protocol lacp
    port-channel 1 mode active
  dampening 10 100 1000 60
  no shutdown

I hope it is helpful not only for me but also for someone else. Any comments are welcome.

17 comments:

Anonymous said...

Hi David,

Thanks for your post, it is really helpful. I just have a question regarding the "system-mac mac-address" value. It seems we need to type the same MAC address on both VLT peers. So my question is: do we need to use a real MAC address or a fictitious one?
Thanks in advance...

David Pasek said...

The snip from Configuration Guide ...

(Optional) When you create a VLT domain on a switch, Dell Networking OS automatically creates a VLT-system MAC address used for internal system operations.
VLT DOMAIN CONFIGURATION mode system-mac mac-address mac-address
To explicitly configure the default MAC address for the domain by entering a new MAC address, use the system-mac command. The format is aaaa.bbbb.cccc.
Also, reconfigure the same MAC address on the VLT peer switch.
Use this command to minimize the time required for the VLT system to synchronize the default MAC address of the VLT domain on both peer switches when one peer switch reboots.


So it means that setting of VLT-system MAC address is optional and FTOS would create virtual MAC address automatically. But I've read some white paper recommending to set it explicitly. This mac-address is just internal therefore I use special private (locally administered) MAC addresses defined in IEEE 802 standard.
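
Whether an address is locally administered is decided by a single bit: the second-least-significant bit of the first octet is set. Here is a quick Python check, as a sketch. The function name is made up, and it accepts both the colon form used in this post and the aaaa.bbbb.cccc form from the guide.

```python
def is_locally_administered(mac: str) -> bool:
    """Return True if the MAC has the locally administered (U/L) bit set."""
    digits = "".join(ch for ch in mac if ch not in ":-.")
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    first_octet = int(digits[:2], 16)
    return bool(first_octet & 0x02)


print(is_locally_administered("02:00:00:00:00:01"))  # True: the address used in this post
print(is_locally_administered("00:01:e8:8b:4b:d6"))  # False: vendor-assigned address
print(is_locally_administered("0200.0000.0001"))     # True: same address, dotted format
```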

David Pasek said...

Look at my other blog post "Locally Administered Address Ranges"

http://blog.igics.com/2014/05/locally-administered-address-ranges.html

There are private MAC address ranges.

Didier said...

Hello David,

Thank you for this blog, it helped a lot in configuring the switches, I have a question.

I created the VLTi and the LACP according to the information on this blog, and I added several VLANs to port-channel 1. I have a Dell PowerConnect 6248 with two 10 Gb interfaces that I would like to use with LACP as a trunk to the S4810.

This is the commands I used on the PC6248
swtest01#show running-config interface port-channel 1
description "Conexion a s4810"
switchport mode trunk
switchport trunk allowed vlan add 18,22
mtu 9216

swtest01#show running-config interface ethernet 1/xg1
channel-group 1 mode auto
mtu 9216

swtest01#show running-config interface ethernet 1/xg2
channel-group 1 mode auto
mtu 9216

swtest01#show interfaces port-channel 1

Channel   Ports                         Hash Algorithm Type
-------   ----------------------------- -------------------
ch1       Active: 1/xg1, 1/xg2          3

The status on the s4810
swS4810-04#sh interfaces port-channel 1
Port-channel 1 is up, line protocol is up
Created by LACP protocol
Description: "Uplink to access switch"
Hardware address is 00:01:e8:8b:4b:d6, Current address is 00:01:e8:8b:4b:d6
Interface index is 1258291712
Minimum number of links to bring Port-channel up is 1



When I try to ping the S4810 switches' IPs in the associated VLANs, I get no reply. Is there something I'm missing? Any help is appreciated.

David Pasek said...

It looks like the port-channel works properly but you have a problem with L3 connectivity. Are the VLANs properly configured on the Force10 switches? Please share the config files with me and I can try to troubleshoot. Use my email address david.pasek (at) gmail.com to send me the config files of all three switches.

Martin Zidek said...

Just to clarify the VLT priority settings, because the documentation is a little confusing on this point: be advised that the VLT role is NOT pre-emptive. Once a role is assigned to a switch, it keeps that role even if another switch with a lower priority becomes available; a new election takes place only on failure. So in general, the first VLT switch that comes up will be the primary VLT domain member, and the primary-priority value only applies if an actual election takes place (both switches come online/join the VLT domain at the same time). The only reliable method is to boot the switch you want to run as primary first.

David Pasek said...

@Martin: thanks for sharing your clarification. That actually means there is no operational procedure to change the master node (primary) while the VLT cluster (both nodes) is up and running. The only possibility (besides a switch failure) is to change the priority and restart the primary switch. Do I understand it correctly?

Martin Zidek said...

@David - yes, you understand it correctly. There is no way to enforce the primary VLT role, because there is no preemption. You must reboot the primary to switch roles.

niktips said...

Hi David, thanks for another great post. I have a question about RSTP. Say if I have a HP switch in the network core, which is my RSTP root, then I can leave RSTP priorities on Force10 switches as default, can't I?

David Pasek said...

You can. However, your RSTP root switch (HP in your case) should have a lower priority than the other switches in the L2 network to be sure that it will always be elected as the RSTP root.

For example "bridge-priority 4096" is the best number to set on the switch intended to be a root.

Gerson Acevedo said...
This comment has been removed by the author.
David Pasek said...

Hi Gerson,
thanks for stopping by.

vlt-peer-lag port-channel 1 just informs the VLT peer (the other switch on VLT domain) that the port-channel associated to this VLT is port-channel 1.

Here is the snippet from Force10 CLI documentation.
vlt-peer-lag port-channel id-number
Associate the port channel to the corresponding port channel in the VLT peer for the VLT connection to an attached device.

id-number - Enter the respective vlt port-channel number of the peer device. The range is from 1 to 128.

If not specified, it assumes the same port-channel number is used on both VLT peers, which is best practice anyway.

Gerson Acevedo said...
This comment has been removed by the author.
David Pasek said...

If you do not have an OOB management network and will not use the management interfaces for out-of-band switch management, you can use a point-to-point link between those management interfaces as the VLT backup link. That is perfectly OK for VLT, because the backup link is just another heartbeat/witness to avoid a split-brain scenario.

Anonymous said...

Regarding convergence speed: if I am using a VLT + peer-routing combination, how many seconds does it take to achieve full recovery? I had a solution with the Dell N4000 series. It was MLAG + VRRP, and if one of the peers failed (the master switch was dead) it took around 28-32 seconds to recover connectivity. That was a lot of time; it wasn't a good solution.

So can I tweak VLT performance? or default is good enough?

David Pasek said...

If I remember correctly, the failover was instantaneous. Peer-routing is actually an ARP proxy, which is why failover is very fast. VRRP is different.

Dell S-series (Force10) is IMHO a much more mature datacenter technology than the N-Series, which was more focused on campus networking.

Disclaimer: I do not work anymore directly for Dell (I work for VMware), therefore I'm not aware of the latest networking improvements. I also do not have access to Force10 (S-Series) equipment so I cannot do any tests.