Monday, September 26, 2016

VMware DVS - change default policy

The main advantage of a VMware distributed virtual switch (DVS) over the VMware standard virtual switch (SVS) is centralized configuration which is pushed to the ESXi hosts. This centralized management provides a uniform virtual switch configuration across all ESXi hosts in the DVS scope. Virtual switch specific settings can generally be reconfigured per port-group. In other words, a port-group is a virtual switch construct which defines how a particular group of ports will behave. Port-groups inherit their default settings from the virtual switch.

So far so good. However, DVS and SVS differ. In SVS you can change the default settings of the switch and all port-groups on that particular switch inherit these settings. This is not possible on DVS - at least not through the vSphere Client. Through the GUI you can only change the settings of a particular port-group, as you can see on the screenshot below.

DVS port-group various settings.

It is very annoying and time consuming to change settings for each port-group manually, and there is also a very high probability of inconsistent settings across port-groups. To mitigate the challenges described above, you can leverage PowerCLI and change the default DVS settings (policy) as shown in the example below. This changes the default DVS settings, so all new port-groups will be configured consistently and as required by your particular design.

The PowerCLI script below shows how to change the default settings for traffic shaping (in/out) and the uplink teaming policy. This particular example configures LAG teaming with IP src/dst load balancing, but any other teaming and load balancing method can be used as well.

 $vDSName = "vDS01"
 $vds = Get-VDSwitch $vDSName
 $spec = New-Object VMware.Vim.DVSConfigSpec
 $spec.configVersion = $vds.ExtensionData.Config.ConfigVersion
 $spec.defaultPortConfig = New-Object VMware.Vim.VMwareDVSPortSetting

 # Ingress traffic shaping policy
 $inShapingPolicy = New-Object VMware.Vim.DVSTrafficShapingPolicy
 $inShapingPolicy.Enabled = New-Object VMware.Vim.BoolPolicy
 $inShapingPolicy.AverageBandwidth = New-Object VMware.Vim.LongPolicy
 $inShapingPolicy.PeakBandwidth = New-Object VMware.Vim.LongPolicy
 $inShapingPolicy.BurstSize = New-Object VMware.Vim.LongPolicy
 $inShapingPolicy.Enabled.Value = $true
 $inShapingPolicy.AverageBandwidth.Value = 1000000000
 $inShapingPolicy.PeakBandwidth.Value = 5000000000
 $inShapingPolicy.BurstSize.Value = 19200000000
 $inShapingPolicy.Inherited = $false

 # Egress traffic shaping policy
 $outShapingPolicy = New-Object VMware.Vim.DVSTrafficShapingPolicy
 $outShapingPolicy.Enabled = New-Object VMware.Vim.BoolPolicy
 $outShapingPolicy.AverageBandwidth = New-Object VMware.Vim.LongPolicy
 $outShapingPolicy.PeakBandwidth = New-Object VMware.Vim.LongPolicy
 $outShapingPolicy.BurstSize = New-Object VMware.Vim.LongPolicy
 $outShapingPolicy.Enabled.Value = $true
 $outShapingPolicy.AverageBandwidth.Value = 1000000000
 $outShapingPolicy.PeakBandwidth.Value = 5000000000
 $outShapingPolicy.BurstSize.Value = 19200000000
 $outShapingPolicy.Inherited = $false

 # Uplink teaming policy - LAG with IP src/dst hash load balancing
 $uplinkTeamingPolicy = New-Object VMware.Vim.VmwareUplinkPortTeamingPolicy
 $uplinkTeamingPolicy.policy = New-Object VMware.Vim.StringPolicy
 $uplinkTeamingPolicy.policy.inherited = $false
 $uplinkTeamingPolicy.policy.value = "loadbalance_ip"
 $uplinkTeamingPolicy.uplinkPortOrder = New-Object VMware.Vim.VMwareUplinkPortOrderPolicy
 $uplinkTeamingPolicy.uplinkPortOrder.inherited = $false
 $uplinkTeamingPolicy.uplinkPortOrder.activeUplinkPort = New-Object System.String[] (1) # array size = number of uplinks you will be specifying
 $uplinkTeamingPolicy.uplinkPortOrder.activeUplinkPort[0] = "LAG01"

 # Put the new policies into the spec and reconfigure the switch
 $spec.DefaultPortConfig.InShapingPolicy = $inShapingPolicy
 $spec.DefaultPortConfig.OutShapingPolicy = $outShapingPolicy
 $spec.DefaultPortConfig.UplinkTeamingPolicy = $uplinkTeamingPolicy
 $vds.ExtensionData.ReconfigureDvs_Task($spec)

Please note that the example above has to be altered to fulfill your particular needs, but now you should have a pretty good idea of how to achieve it.
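To double-check that the new defaults were really applied, you can read them back with a few lines of PowerCLI. This is just a minimal sketch using the same vSphere API objects as the script above; adjust the switch name and properties to whatever you changed.

 $vds = Get-VDSwitch "vDS01"
 $default = $vds.ExtensionData.Config.DefaultPortConfig
 $default.UplinkTeamingPolicy.Policy.Value         # expected: loadbalance_ip
 $default.InShapingPolicy.Enabled.Value            # expected: True
 $default.OutShapingPolicy.AverageBandwidth.Value  # expected: 1000000000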

Wednesday, September 14, 2016

VMworld 2016 US sessions worth watching

Here is the list of VMworld sessions I have watched or still have to watch over the coming days and weeks.

After watching a session I categorize it and write a brief description. I also assign category labels and a technical level to each session.

Category labels:
  • Strategy
  • Architecture
  • Operations
  • High Level Product Overview
  • Deep Dive Product Overview
  • Technology Preview
  • Idea for Improvement

Technical levels:
  • Basic
  • Middle
  • Advanced

Note: If you have trouble replaying a particular session from my links below, go to the OFFICIAL VMWORLD SITE and search for the session by its session code. You will need to register there but it should still be free of charge.

Already watched and categorized sessions

Session code: INF9044
Speaker(s): Emad Younis
Category labels: Technology Preview
Technical level: Basic-Middle
Brief session summary: Introduction and demo of vCenter migration tool for easy migration from Windows based vCenter to vCenter Server Appliance (VCSA).


ESXi

Session name: vSphere 6.x Host Resource Deep Dive
Session code: INF8430
Speaker(s): Frank Denneman, Niels Hagoort
Category labels: Architecture

Technical level: Advanced
Brief session summary: Frank presents his findings about NUMA and the implications for vSphere architecture. Frank also explains the differences in proximity between RAM, local NVMe and external storage and the impact on access times. Niels explains very nicely how hardware features like VXLAN, RSS and VMDq, together with device drivers and driver parameters, impact the performance of virtual overlay networking (VXLAN).

VIRTUAL SAN

Session name: VSAN Networking Deep Dive and Best Practices
Session code: STO8165R
Speaker(s): Ankur Pai, John Nicholson
Category labels: Architecture, Deep Dive Product Overview
Technical level: Advanced
Brief session summary: John starts the presentation and discusses what kind of virtual switch to use. You can use standard or distributed; the distributed virtual switch is highly recommended because of configuration consistency and advanced features like NIOC, LACP, etc. Next there is a discussion about multicast used for VSAN. Multicast is used just for state and metadata; VSAN data traffic is transferred via unicast. IGMP snooping for L2 should always be configured to optimize metadata traffic, and PIM only in case the VSAN traffic should go over L3. VSAN uses two multicast addresses - 224.2.3.4 port 23451 (Agent Group Multicast Address) and 224.1.2.3 port 12345 (Master Group Multicast Address). These default multicast addresses have to be changed in case two VSAN clusters are running in the same broadcast domain. John shares esxcli commands for changing the default multicast addresses, but generally it is recommended to keep each VSAN cluster in a dedicated non-routable VLAN (single broadcast domain). John explains that VSAN is supported on both L2 and L3 topologies, but keeping it in an L2 topology is significantly simpler. Physical network equipment, cabling and oversubscription ratios matter. John warns against Cisco FEX (port extenders) and blade chassis switches where oversubscription is usually expected. The next topic is troubleshooting, and the presenters introduce the VSAN health-check plugin which removes the need for CLI (Ruby vSphere Console aka RVC) troubleshooting commands. VSAN does not need jumbo frames but you can use them if dictated by some other requirement. LLDP/CDP should be enabled for better visibility and easier troubleshooting. After about 24 minutes John hands the presentation over to Ankur. Ankur presents Virtual SAN stretched cluster networking considerations. He starts with a high-level overview of the stretched VSAN topology and the witness requirement in a third location. A stretched L2 network is recommended; L3 is supported but more complex. The cross-site round trip has to be less than 5 ms and the recommended bandwidth is 10 Gbps. The witness can be connected over L3 with a round trip of less than 200 ms and 100 Mbps of bandwidth (2 Mbps per 1000 VSAN components). Communication between the witness and the main site is L3 unicast - TCP 2233 (IO) and UDP 23451 (cluster), in both directions. Periodic heartbeating between the witness and the main site occurs every second; failure is declared after 5 consecutive missed heartbeats. Static routes have to be configured between the data hosts and the witness host because a custom ESXi TCP/IP stack for VSAN is not supported at the moment. Ankur presents three form factors of the witness appliance (Large, Medium, Tiny). Later Ankur explains bandwidth calculations for the cross-site interconnect and another calculation for the witness network bandwidth requirements. At the end there is a Q&A for about 5 minutes.

BUSINESS CRITICAL APPLICATIONS

Session name: Performance Tuning and Monitoring for Virtualized Database Servers
Session code: VIRT7511
Speaker(s): David Klee, Thomas LaRock
Category labels: Architecture, Operations
Technical level: Middle
Brief session summary: TBD.

vSPHERE CONFIGURATION MANAGEMENT

Session name: Enforcing a vSphere Cluster Design with PowerCLI Automation
Session code: INF8036
Speaker(s): Duncan Epping, Chris Wahl
Category labels: Idea for Improvement
Technical level: Middle to Advanced
Brief session summary: TBD.


SITE RECOVERY MANAGER

Session name: SRM with NSX: Simplifying Disaster Recovery Operations & Reducing RTO
Session code: STO7977
Speaker(s): Rumen Colov, Stefan Tsonev
Category labels: High Level Product Overview
Technical level: Basic-Medium
Brief session summary: Rumen Colov is the SRM Product Manager, so his part of the session is more of a high-level introduction to the SRM and NSX integration. The session really starts at 8:30; before that it is just introduction. Then Rumen explains the main SRM use cases - Disaster Recovery, Datacenter Migration, Disaster Avoidance - and the differentiation among them. The latest SRM versions support zero-downtime VM mobility in an Active/Active datacenter topology. Then Storage Policy-Based Management integration with SRM is explained, along with NSX 6.2 interoperability with the NSX stretched logical wire (NSX Universal Logical Switch) across two vCenters. Rumen also covers the product licensing editions required for the integration. At 0:22:53 Rumen says that network inventory auto-mapping for NSX stretched networking works only with storage-based replication (SBR) and not with host-based replication (aka HBR or vSphere Replication). It seems to me the reason is that with SBR the information about network connectivity is saved in the VMX file, which is replicated by the storage, whereas with HBR we would need manual inventory mappings. At 0:24:00 the presentation is handed over to Stefan Tsonev (Director R&D). Stefan shows several screenshots of the SRM and NSX integration. It seems that Stefan is deeply SRM oriented with basic NSX knowledge, but it is a very nice introduction to the basics of SRM and NSX integration.

LOG INSIGHT

Session name: Insight into the World of Logs with VMware vRealize Log Insight
Session code: MGT7685R
Speaker(s): Iwan Rahabok, Karl Fultz, Manny Sidhu
Category labels: Operations, Deep Dive Product Overview
Technical level: Middle
Brief session summary: A very nice walkthrough of VMware Log Insight use cases with real examples of how to analyze logs. At the end of the presentation some architecture topologies for log management in a financial institution are explained.

Still have to watch and categorize sessions below

STO7650 - Duncan Epping, Lee Dilworth : Software-Defined Storage at VMware Primer
INF8837 - Rene-Francois Mennecier : How to Relocate Your Physical Data Center Without Downtime in a Few Clicks, Thanks to Automation
INF8260 - Automated Deployment and Configuration of the vCenter Server Appliance
INF8045 - vSphere High Availability Best Practices and Tech Preview
INF8172 - vSphere Client Roadmap: Host Client, HTML5 Client, and Web Client
INF8375 - What's New with vSphere
INF9944R - What's New with vCenter Server
INF8858 - vSphere Identity: Multifactor Authentication Deep Dive
INF9089 - Managing vCenter Server at Scale? Here's What You Need to Know
INF9119 - How to Manage Your Platform Services Controller Like Batman
INF8553 - The Nuts and Bolts of vSphere Resource Management
INF8108 - Extreme Performance Series: vCenter Performance Deep Dive
INF8959 - Extreme Performance Series: DRS Performance Deep Dive—Bigger Clusters, Better Balancing, Lower Overhead
VIRT8530R - Deep Dive on pNUMA & vNUMA - Save Your SQL VMs from Certain DoomA!
INF8780R - vSphere Core 4 Performance Troubleshooting and Root Cause Analysis, Part 1: CPU and RAM
INF8701R - vSphere Core 4 Performance Troubleshooting and Root Cause Analysis, Part 2: Disk and Network
INF9205R - Troubleshooting vSphere 6 Made Easy: Expert Talk
INF8755R - Troubleshooting vSphere 6: Tips and Tricks for the Real World
INF8850 - vSphere Platform Security
VIRT8290R - Monster VMs (Database Virtualization) Doing IT Right
VIRT7621 - Virtualize Active Directory, the Right Way!
INF8845 - vSphere Logs Grow Up! Tech Preview of Actionable Logging with vRealize Log Insight
MGT8641R - A Lot of Insight and No PowerPoint
INF8469 - iSCSI/iSER: HW SAN Performance Over the Converged Data Center
SDDC7808-S - How I Learned to Stop Worrying and Love Consistency: Standardizing Datacenter Designs
INF9048 - An Architect's Guide to Designing Risk: The VCDX Methodology
INF8644 - Getting the Most out of vMotion: Architecture, Features, Performance and Debugging
INF8856 - vSphere Encryption Deep Dive: Technology Preview
INF8914 - Mastering the VM Tools Lifecycle in your vSphere Data Center
INF7825 - vSphere DRS and vRealize Operations: Better Together
INF7827 - vSphere DRS Deep Dive: Understanding the Best Practices, Advanced Concepts, and Future Direction of DRS
INF8275R - How to Manage Health, Performance, and Capacity of Your Virtualized Data Center Using vSphere with Operations Management
INF9047 - Managing vSphere 6.0 Deployments and Upgrades
STO7965 - VMware Site Recovery Manager: Technical Walkthrough and Best Practices
STO7973 - Architecting Site Recovery Manager to Meet Your Recovery Goals
STO8344 - SRM with vRA 7: Automating Disaster Recovery Operations
STO8246R - Virtual SAN Technical Deep Dive and What’s New
STO8750 - Troubleshooting Virtual SAN 6.2: Tips & Tricks for the Real World
STO7904 - Virtual SAN Management Current & Future
STO8179R - Understanding the Availability Features of Virtual SAN
STO7557 - Successful Virtual SAN 6 Stretched Clusters
STO8743 - Extreme Performance Series: Virtual SAN Performance Troubleshooting
STO7645r - Virtual Volumes Technical Deep Dive
INF8038r - Getting Started with PowerShell and PowerCLI for Your VMware Environment
NET7858R - Reference Design for SDDC with NSX and vSphere: Part 2
CNA9993-S - Cloud Native Applications, What it means & Why it matters
CNA7741 - From Zero to VMware Photon Platform
CNA7524 - Photon Platform, vSphere, or Both?
SDDC7502 - On the Front Line: A VCDX Perspective Working in VMware Global Support Services
VIRT9034 - Oracle Databases Licensing on a Hyper-Converged Platform
VIRT9009 - Licensing SQL Server and Oracle on vSphere


All VMworld US 2016 Breakout Sessions

All VMworld US 2016 sessions are listed here.
If you cannot play a session from the list above, go to the OFFICIAL VMWORLD SITE and search for the particular session. You will need to register there but it is free of charge.

Wednesday, September 07, 2016

VMware Virtual Machine Hardware Version and CPU Features

I always thought that the only device not virtualized by VMware ESXi is the CPU. That is generally true, but I have just been informed that the available CPU instruction sets (features) depend on the VM hardware version. CPU features are generally enhanced CPU instruction sets for special purposes. For more information about CPUID and features read this.

My regular readers know that I don't believe anything unless I test it. Therefore I did a simple test. I provisioned a new VM with hardware version 4, installed a FreeBSD guest OS and identified the CPU features. You can see the screenshot below.

VM with hardware version 4 and FreeBSD Guest OS
Next, I provisioned another VM with hardware version 10 and a FreeBSD guest OS on the same ESXi host and listed the CPU features. See the screenshot below.

VM with hardware version 10 and FreeBSD Guest OS
Now, if you compare the CPU features you can see the differences. The following CPU features are added in VM hardware version 10:

  • FMA
  • PCID
  • X2APIC
  • XSAVE
  • OSXSAVE
  • AVX
  • F16C
  • RDRAND
  • HV
Does it matter? Well, it depends on whether your particular application really needs these advanced CPU features.

The vCPU in VMware virtualization is really not virtualized, but some CPU features are masked in older VM hardware versions because the virtual hardware emulates a particular chipset.
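If you want to quickly find VMs that still run an older hardware version (and therefore a reduced CPU feature set), a one-liner like the sketch below will do it. It assumes an existing Connect-VIServer session; the Version property returns values such as v4 or v10.

 # List VM hardware versions across the inventory, oldest first
 Get-VM | Sort-Object Version | Select-Object Name, Version, PowerState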

I did not know that! I have never thought about it! My bad.

Anyway, this is just another proof that every day brings some surprise and you can always learn something new, even in an area where you believe you are good. We never know everything.

Monday, July 25, 2016

How to read BIOS settings from HP server

Sometimes it is pretty handy to be able to read BIOS settings from a modern HP server. Let's assume the server has an out-of-band remote management card (aka HP iLO).

HP iLO 4 and above supports a RESTful API. Here is a snippet from the "HPE iLO 4 User Guide":
iLO RESTful API 
iLO 4 2.00 and later includes the iLO RESTful API. The iLO RESTful API is a management interface that server management tools can use to perform server configuration, inventory, and monitoring via iLO. A REST client, such as the RESTful Interface Tool, sends HTTPS operations to the iLO web server to GET and PATCH JSON-formatted data, and to configure supported iLO and server settings, such as the UEFI BIOS settings.
So you can leverage REST API calls, or if you like PowerShell you can simplify it with the precooked HP cmdlets.
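If you prefer the raw REST route, a minimal sketch with Invoke-RestMethod could look like the snippet below. It assumes iLO 4 firmware 2.00 or later and the /rest/v1/Systems/1/BIOS resource documented for the iLO RESTful API; attribute names such as PowerProfile may differ between firmware versions, so treat it as a starting point rather than a recipe.

 # Read the current BIOS settings from the iLO 4 RESTful API (GET returns JSON)
 # Note: a self-signed iLO certificate has to be trusted first (or use -SkipCertificateCheck on PowerShell 6+)
 $ilo  = "192.168.0.100"        # hypothetical iLO address
 $cred = Get-Credential         # iLO account with read privileges
 $bios = Invoke-RestMethod -Uri "https://$ilo/rest/v1/Systems/1/BIOS" -Credential $cred -Method Get
 $bios.PowerProfile             # assumed attribute name for the power vs. performance profile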

The following PowerShell code, using the HP cmdlets, should show the level of power versus performance configured for the system.

 # Assumes HP's BIOS cmdlets module (HPBIOSCmdlets) is installed; the connection object
 # returned by Connect-HPBIOS is passed to the subsequent cmdlets
 $ilo = "192.168.0.100"
 $connection = Connect-HPBIOS $ilo -Username "username" -Password "password"
 Get-HPBIOSPowerProfile $connection
 Disconnect-HPBIOS $connection


Sunday, July 24, 2016

ESXi PSOD and HeartbeatPanicTimeout

A Purple Screen of Death (PSOD) is a diagnostic screen with white type on a purple background that is displayed when the VMkernel of an ESX/ESXi host experiences a critical error, becomes inoperative and terminates any virtual machines that are running.  For more info look here.

Nobody is happy to see a PSOD on an ESXi host, but it is important to say that it is just another safety mechanism to protect your server workloads, because the PSOD is intentionally initiated by the ESXi vmkernel in situations when something really bad happens at a low level. It is usually related to a hardware, firmware or driver issue. You can find further information in the VMware KB article - Interpreting an ESX/ESXi host purple diagnostic screen (1004250).

The main purpose of this blog post is to explain the timing of the PSOD for just a single type of error message - "Lost heartbeat". If no heartbeat is received within a certain time interval, the PSOD looks like the screenshot below.

no heartbeat
There is no doubt that something serious must have happened in the ESXi vmkernel; however, regardless of what exactly happened, the following two vSphere advanced settings control the time interval in which a heartbeat must be received, otherwise a PSOD is executed.
  • ESXi - Misc.HeartbeatPanicTimeout
  • VPXD (aka vCenter) - vpxd.das.heartbeatPanicMaxTimeout
Let's start with the ESXi advanced setting Misc.HeartbeatPanicTimeout. It defines the interval in seconds after which the vmkernel goes to panic if no heartbeat is received. Please don't mix this "Panic Heartbeat" up with the "HA network heartbeat". These two heartbeats are very different. The "HA network heartbeat" is the heartbeating mechanism between HA cluster members (master <-> slaves) over the ethernet network, while the "Panic Heartbeat" is a heartbeat inside a single ESXi host between the vmkernel and COS software components. You can see the "Panic Heartbeat" settings by issuing the following esxcli command
esxcli system settings advanced list | grep -A10 /Misc/HeartbeatPanicTimeout
 [root@esx01:~] esxcli system settings advanced list | grep -A10 /Misc/HeartbeatPanicTimeout  
   Path: /Misc/HeartbeatPanicTimeout  
   Type: integer  
   Int Value: 14  
   Default Int Value: 14  
   Min Value: 1  
   Max Value: 86400  
   String Value:  
   Default String Value:  
   Valid Characters:  
   Description: Interval in seconds after which to panic if no heartbeats received  

I have tested that Misc.HeartbeatPanicTimeout has different values in different situations. The default value is always 14 seconds, but
  1. if you have a single standalone ESXi host not connected to an HA cluster, the effective value is 900 seconds
  2. if the ESXi host is a member of a vSphere HA cluster, then the value is 14 seconds
So now we know that the value on an ESXi host with HA enabled is 14 seconds (panicTimeoutMS = 14000) and it usually works without any problem. However, if you decide, for whatever reason, to change this value, it is worth knowing that on an HA-enabled ESXi host the HA code enforces a hard cap of 60 seconds. It is a cap, so it does not change the value if it is already less than 60; however, if you use for example the value 900, it will be capped to 60 seconds anyway. I did a test on vSphere 6/ESXi 6 and it works exactly like that, and I assume it works the same way on vSphere 5/ESXi 5.
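If you want to check (or deliberately change) the value across all your hosts, a small PowerCLI sketch like the one below will do it. It assumes an existing Connect-VIServer session; the Set-AdvancedSetting line is commented out on purpose.

 # Report Misc.HeartbeatPanicTimeout on every host managed by the connected vCenter
 foreach ($esx in Get-VMHost) {
   $setting = Get-AdvancedSetting -Entity $esx -Name "Misc.HeartbeatPanicTimeout"
   "{0}: {1} seconds" -f $esx.Name, $setting.Value
   # Set-AdvancedSetting -AdvancedSetting $setting -Value 14 -Confirm:$false  # only if your design requires it
 }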

Side note: It was very different in vSphere 4/ESXi 4 because the HA cluster was rewritten from scratch in vSphere 5, but that is already history and I hope nobody uses vSphere 4 anymore.

Behavior justification:
The behavior described in the paragraph above makes perfect sense if you ask me. If you have a standalone ESXi host and you are experiencing some hardware issue, it is better to wait 900 seconds (15 minutes) before the ESXi host goes into the PSOD state, because the virtual machines running on top of this ESXi host cannot be automatically restarted on other ESXi hosts anyway. And guess what, if the ESXi host has some significant hardware failure, it most probably has a negative impact on the virtual machines running on top of this particular ESXi host, right? Unfortunately, if you have just a single ESXi host, vSphere cannot do anything for you.

On the other hand, if the affected ESXi host is a member of a vSphere HA cluster, then it is better to wait only 14 seconds (by default), or at most 60 seconds, and put the ESXi host into PSOD more quickly, because the HA cluster will restart the affected virtual machines automatically, which helps mitigate the risk of unavailable virtual machines and, with that, of the application services running inside these virtual machines.

So that's the explanation of how the ESXi setting /Misc/HeartbeatPanicTimeout behaves. Now we can look at what the vpxd.das.heartbeatPanicMaxTimeout setting is. My understanding is that vpxd.das.heartbeatPanicMaxTimeout is the vCenter (VPXD) global configuration for the ESXi advanced setting Misc.HeartbeatPanicTimeout. But don't forget that the HA cluster caps Misc.HeartbeatPanicTimeout on the ESXi hosts as described above.

You can read further details about vpxd.das.heartbeatPanicMaxTimeout in VMware KB 2033250, but I think the following description is a little bit misleading.
"This option impacts how long it takes for a host impacted by a PSOD to release file locks and hence allow HA to restart virtual machines that were running on it. If not specified, 60s is used. HA sets the host Misc.HeartbeatPanicTimeout advanced option to the value of this HA option. The HA option is in seconds."
My understanding is that the description should be reworded to something like ...
"This option is in seconds and impacts how long it takes for an ESXi host experiencing a critical issue to go into a PSOD. The vpxd.das.heartbeatPanicMaxTimeout setting is a global setting used for the vCenter-managed ESXi advanced option Misc.HeartbeatPanicTimeout; however, Misc.HeartbeatPanicTimeout is adjusted automatically in certain situations.
On a standalone ESXi host 900s is used. On an ESXi host in a vSphere HA cluster it is automatically changed to 14s and capped at a maximum of 60s. This setting has an indirect impact on the time when file locks are released and hence on how soon the HA cluster can restart virtual machines that were running on the affected ESXi host."
Potential side effects and impacts
  • vSphere HA cluster restart of virtual machines - if your Misc.HeartbeatPanicTimeout is set to 60 seconds, then the HA cluster will most probably try to restart the VMs on other ESXi hosts because the network heartbeat (also 14 seconds) will not be received. However, because the affected host is not in PSOD yet, the file locks still exist and the VM restarts will be unsuccessful.
  • ESXi Host Profiles - if you use the same host profile for HA-protected and non-protected ESXi hosts, then it can report Misc.HeartbeatPanicTimeout as a compliance difference.

Friday, July 15, 2016

DELL Force10 : DNS, Time and Syslog server configuration

It is generally good practice to have time synchronized on all network devices and to configure remote logging (syslog) to a centralized syslog server for proper troubleshooting and problem management. Force10 switches are no exception, therefore let's configure time synchronization and remote logging to my central syslog server - VMware Log Insight in my case.

I would like to use hostnames instead of IP addresses, so let's start with DNS resolution, continue with time settings and finalize the mission with the remote syslog configuration.

Below are my environment details:

  • My DNS server is 192.168.4.21
  • DNS domain name is home.uw.cz
  • I will use the following three internet NTP servers/pools - ntp.cesnet.cz, ntp.gts.cz and cz.pool.ntp.org
  • My syslog server is at syslog.home.uw.cz 

Step 1/ DNS resolution configuration
f10-s60#conf
f10-s60(conf)#ip name-server 192.168.4.21
f10-s60(conf)#ip domain-name home.uw.cz
f10-s60(conf)#ip domain-lookup
f10-s60(conf)#exit
Don't forget to configure "ip domain-lookup" because it is the command which enables domain name resolution.

Now let's test name resolution by ping www.google.com
f10-s60#ping www.google.com
Translating "www.google.com"
...domain server (192.168.4.21) [OK]
Type Ctrl-C to abort.
Sending 5, 100-byte ICMP Echos to 172.217.16.164, timeout is 2 seconds:
!!!!!
Success rate is 100.0 percent (5/5), round-trip min/avg/max = 40/44/60 (ms)
We should also test some local hostname in long format
f10-s60#ping esx01.home.uw.cz        
Translating "esx01.home.uw.cz"
...domain server (192.168.4.21) [OK]
Type Ctrl-C to abort.
Sending 5, 100-byte ICMP Echos to 192.168.4.101, timeout is 2 seconds:
!!!!!
Success rate is 100.0 percent (5/5), round-trip min/avg/max = 0/0/0 (ms)
and short format
f10-s60#ping esx01
Translating "esx01"
...domain server (192.168.4.21) [OK]
Type Ctrl-C to abort.
Sending 5, 100-byte ICMP Echos to 192.168.4.101, timeout is 2 seconds:
!!!!!
Success rate is 100.0 percent (5/5), round-trip min/avg/max = 0/0/0 (ms)
Step 2/ Set current date, time and NTP synchronization

You have to decide whether you want to use GMT or local time. The hardware time should always be set to GMT, and you can configure a timezone and summer-time on top of it if you wish. To keep the clock synchronized, the NTP servers listed above are added in conf mode with the ntp server command (one line per server). So let's configure the GMT time in the first place.
f10-s60#calendar set 15:12:46 july 15 2016
and test it
f10-s60#sho calendar
15:12:39  Fri Jul 15 2016
Ok, so hardware time is set correctly to GMT.

If you really want to play with the timezone and summer-time, you can do it in conf mode with the following commands.

f10-s60(conf)#clock ?
summer-time         Configure summer (daylight savings) time
timezone             Configure time zone   
I prefer to keep GMT time everywhere because it, in my opinion, simplifies troubleshooting, problem management and capacity planning.

Step 3/ Configuration of remote logging

FTOS by default doesn't use date and time for log messages; it uses uptime (time since the last boot), so you can see when something happened relative to the last system boot. However, because we already have the time configured properly, it is a good idea to change this default behavior to use date and time.
f10-s60(conf)#service timestamps log datetime
To be honest, you generally don't need date and time on log messages because the remote syslog server will add its own timestamp to the messages, but I prefer to have both times - the time from the device and the time when the message arrived at the syslog server. If you want to disable time stamping on syslog messages, use no service timestamps [log | debug].

And now, finally, let's configure the remote syslog server with a single configuration command
f10-s60(conf)#logging syslog.home.uw.cz
Translating "syslog.home.uw.cz"
Translating "syslog.home.uw.cz"
...domain server (192.168.4.21) [OK]
And we are done. Now you can see incoming log messages in your syslog server. See screenshot of my VMware Log Insight syslog server.

VMware Log Insight with Force10 log messages.
Hope you find it useful and as always - any comment is very appreciated.

Monday, June 27, 2016

ESXi boot mode - UEFI or BIOS

Legacy BIOS bootstrapping along with a master boot record (MBR) has been used with x86-compatible systems for ages. The concept of MBRs was publicly introduced in 1983 with PC DOS 2.0. It is unbelievable that we are still using the same concept after more than 30 years.

However, there must be some limitations in a 30-year-old technology, mustn't there?

BIOS limitations (such as 16-bit processor mode, 1 MB of addressable space and PC AT hardware) had become too restrictive for the larger server platforms. The effort to address these concerns began in 1998 and was initially called the Intel Boot Initiative, later renamed to EFI. In July 2005, Intel ceased its development of the EFI specification at version 1.10 and contributed it to the Unified EFI Forum, which has evolved the specification as the Unified Extensible Firmware Interface (UEFI).

What is EFI (or UEFI) firmware?

UEFI replaces the old Basic Input/Output System (BIOS). UEFI can be used on

  • a physical server booting the ESXi hypervisor, or 
  • a virtual machine running on top of the ESXi hypervisor. 
This blog post is about the ESXi boot mode; however, just for completeness, I would like to mention that a VMware virtual machine with at least hardware version 7 supports UEFI as well. For further information about VM UEFI look at [1].
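As a side note, the firmware type of existing VMs is exposed through the vSphere API, so you can quickly see which of your VMs already boot with EFI. A small PowerCLI sketch, assuming an existing vCenter connection:

 # Show the configured firmware (bios or efi) for each VM
 Get-VM | Select-Object Name, @{Name="Firmware"; Expression={$_.ExtensionData.Config.Firmware}}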

Originally called Extensible Firmware Interface (EFI), the more recent specification is known as Unified Extensible Firmware Interface (UEFI), and the two names are used interchangeably.

EFI (Extensible Firmware Interface) is a specification for a new generation of system firmware. An implementation of EFI, stored in ROM or Flash RAM, provides the first instructions used by the CPU to initialize hardware and pass control to an operating system or bootloader. It is intended as an extensible successor to the PC BIOS, which has been extended and enhanced in a relatively unstructured way since its introduction. The EFI specification is portable, and implementations may be capable of running on platforms other than PCs.
 
For more information, see  the Wikipedia page for Unified Extensible Firmware Interface.

General UEFI Advantages

UEFI firmware provides several technical advantages over a traditional BIOS system:

  • Ability to boot from large disks (over 2 TB) with a GUID Partition Table (GPT)
  • CPU-independent architecture
  • CPU-independent drivers
  • Flexible pre-OS environment, including network capability
  • Modular design
  • Since UEFI is platform independent, it may be able to enhance the boot time and speed of the computer. This is especially the case when large hard drives are in use. 
  • UEFI can perform better while initializing the hardware devices.
  • UEFI can work alongside BIOS. It can sit on top of BIOS and work independently.
  • It supports MBR and GPT partition types.

Note: Modern systems are only emulating the legacy BIOS. They are EFI native.

UEFI on ESXi

vSphere 5.0 and above supports booting ESXi hosts from the Unified Extensible Firmware Interface (UEFI). With UEFI, you can boot systems from hard drives, CD-ROM drives, USB media, or network.

UEFI benefits

  • ESXi can boot from a disk larger than 2 TB provided that the system firmware and the firmware on any add-in card that you are using supports it. 

UEFI drawbacks

  • Provisioning with VMware Auto Deploy requires the legacy BIOS firmware and is not available with UEFI BIOS configurations. I hope that this limitation will be lifted soon.

Notes: Changing the host boot type between legacy BIOS and UEFI is not supported after you install ESXi 6.0. Changing the boot type from legacy BIOS to UEFI after you install ESXi 6.0 might cause the host to fail to boot.

Conclusion

UEFI is meant to completely replace BIOS in the future and bring in many new features and enhancements that can't be implemented through BIOS. BIOS can still be used on servers that do not require large boot storage. To be honest, even though you can, it is not very common to use boot disks greater than 2 TB for ESXi hosts, so you may be fine with BIOS at the moment, but I would recommend shifting to UEFI, as it is the future while BIOS will slowly fade away.

The ESXi hypervisor supports both boot modes, therefore if you have modern server hardware and don't use VMware Auto Deploy, UEFI should be your preferred choice.

References:
[1] VMware : Using EFI/UEFI firmware in a VMware Virtual Machine 
[2] VMware : Best practices for installing ESXi 5.0 (VMware KB 2005099)
[3] VMware : Best practices to install or upgrade to VMware ESXi 6.0 (VMware KB 2109712)
[4] Usman Khurshid : [MTE Explains] Differences Between UEFI and BIOS
[5] Wikipedia : Unified Extensible Firmware Interface