This post is a writeup of a project for a master’s class in Decision Support Systems at Murray State. This is my first dive into VMware PowerCLI aside from some one shots. All feedback is welcome.
Our Problems
Problem 1: Servers are not being virtualized due to a decentralized procurement process
A decentralized server procurement process presents many problems to an organization. There are many gains with standardizing OS/hardware platforms.
Problem 2: Servers are not being virtualized because knowledge is required to make the “Virtualize/Don’t Virtualize” decision
The benefits of server virtualization are easy to explain and are a part of our culture. However, the organization has not adopted a “virtualize first” mentality. There is still a lack of stakeholder understanding with regards to virtualization.
Due to lack of knowledge, ROI is not maximized. This knowledge exists in two places – the virtual infrastructure itself and as tacit knowledge with the VMware administrator.
PROBLEM ANALYSIS
Problem 1: Servers are not being virtualized due to a decentralized procurement process
This problem is outside of the scope of the CIS645 class. We’re working on it.
Problem 2: Servers are not being virtualized because knowledge is required to make the “Virtualize/Don’t Virtualize” decision
Problem 2 has two major parts.
CAPACITY – CAN OUR VIRTUAL INFRASTRUCTURE SUPPORT THIS APPLICATION?
This question has historically been answered heuristically with ballpark figures. Manually gathering current storage and RAM capacity data is too time consuming.
CANDIDACY – BASED ON SYSTEM REQUIREMENTS AND INDUSTRY KNOWLEDGE, IS VIRTUALIZATION SUITABLE FOR THIS APPLICATION?
This is the harder question. Typically you’ll hear consultants say “it depends”. Answering this question usually involves a phone call with the VMware administrator. The conversation is a series of questions from the administrator to the stakeholder.
RECOMMENDATION
When the two questions have been answered, a recommendation of Virtualize/Don’t Virtualize is made. If a Virtualize decision is made, the VMware administrator must find the optimal storage unit to deploy to and coordinate the deployment with the stakeholder.
SOLUTION DESIGN
USER INTERFACE
The users of this system are already familiar with Excel and would prefer to utilize Excel’s familiarity and What-If scenario planning.
What if we added another 2TB of storage?
What if we upgraded our RAM?
What if we didn’t have to have the license dongle?
Excel quickly enables these questions to be answered. A normal ‘GUI’ application would take more time to develop and would not invite queries of an ad-hoc nature.
CAPACITY
Capacity data resides at several levels: the virtual machine itself, the host, and the data store. The data is put into Excel using VMware’s PowerCLI. PowerCLI is a Windows PowerShell snap-in that integrates with any VMware Virtual Infrastructure. Windows PowerShell also integrates nicely with Excel.
Here are the steps to capacity gathering with the VMware Expert System:
- Open the Excel Spreadsheet
- Clear previously gathered data
- Connect to a vCenter Server
- Gather datastore information
- Gather host information
- Gather virtual machine information
- Write values to ‘Capacity’ Worksheet
- Write values to ‘New Virtual Machine’ Worksheet
- Save Excel Spreadsheet
- Clean up and quit Excel
CANDIDACY
The user of the VMware Expert System will answer a series of questions to determine system candidacy. Through knowledge capture, the conversation with the VMware Administrator does not need to take place. The knowledge is generally accepted by a community of VMware experts.
RECOMMENDATION
After answering the capacity and candidacy questions, the user receives a final recommendation. The recommendation is only “Virtualize” if capacity is available and candidacy is met.
The interface also displays reasons why a machine is not suitable for virtualization to enable What-If analysis.
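The rule above can be sketched in a few lines of Python. This is only an illustration of the logic, not the spreadsheet's actual formulas; the function name `recommend` and its arguments are hypothetical.

```python
def recommend(capacity_ok, candidacy_failures):
    """Return (recommendation, reasons) for a proposed system.

    capacity_ok: True when the virtual infrastructure has room.
    candidacy_failures: reasons the system is not a candidate
    (an empty list means candidacy is met).
    """
    reasons = list(candidacy_failures)
    if not capacity_ok:
        reasons.insert(0, "Insufficient capacity")
    # "Virtualize" only when there is nothing left to object to.
    recommendation = "Virtualize" if not reasons else "Don't Virtualize"
    return recommendation, reasons


print(recommend(True, []))
print(recommend(True, ["License dongle required"]))
```

Returning the reasons alongside the answer is what makes What-If analysis possible: the user can see which single requirement to change.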
DECISION TREE
Modified from VI:OPS P2V Decision Tree
RUNNING THE VMWARE EXPERT SYSTEM
PREREQUISITES
- VMware PowerCLI 4.0 or better
STEPS TO RUN THE VMWARE EXPERT SYSTEM
- Download and extract vmware-expert-system.zip
- Rename launch.tab to launch.bat
- Edit launch.bat, line 2
- Substitute your path to updatespreadsheet.ps1 where you see “C:\users\%username%\Documents\cis645\Project\vmware_expert_system\updatespreadsheet.ps1”; make sure the path is in quotation marks
- Edit updatespreadsheet.ps1, line 11
- Substitute your path to vmware_expert_system.xlsm where you see “C:\users\%username%\Documents\cis645\Project\vmware_expert_system\vmware_expert_system.xlsm”; make sure the path is in quotation marks
- Run ‘launch.bat’
- A screen similar to this will appear:

- Launch the spreadsheet “vmware_expert_system.xlsm” and enable macros
- Enter system requirements
- Press “Send Work Order”
EXAMPLE SYSTEM: NEW WEB SERVER
- Enter the hostname: newwebserver
- The hostname must not already exist and must be a valid hostname (“The Internet Engineering Task Force (IETF)”)
- Enter a functional contact: Andy Hill
- Enter a staff contact: Andy Hill
- Select an Operating System: Windows Server 2003
- Enter a storage requirement: 20 GB
- The storage requirement must be greater than 8 GB and less than the maximum size of a single disk
- Enter a RAM requirement: 1024 MB
- The RAM requirement must be at least 256 MB, must be less than the memory of a single host, and must still leave the environment tolerant of a host failure
- Number of Processors: 1
- Must be numeric, greater than or equal to 1, less than or equal to 4
- Number of NICs: 1
- Must be numeric, greater than or equal to 1, less than or equal to 4
- Average CPU utilization: 5%
- Must be numeric, between 0 and 1, if 4 processors are used average utilization cannot exceed 50%
- Average RAM utilization: 256 MB
- Must not exceed 8GB
- Average NIC utilization: 1 MBps
- Must not exceed 100MBps
- Maximum Disk IO: 10 MBps
- Must not exceed 100MBps
- Answer TRUE/FALSE to the following hardware components
- Modems: FALSE
- Fax Cards: FALSE
- License Dongles: FALSE
- Security Dongles: FALSE
- Hardware Encryption: FALSE
- Answer TRUE/FALSE to Vendor Support: TRUE
- Recommendation: Virtualize!
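The limits listed in this example can be written down as a small set of checks. The Python below is a sketch under assumptions: the dictionary keys are hypothetical names for the worksheet fields, and it assumes that any TRUE hardware component or a FALSE vendor support answer disqualifies the system. The real workbook may weigh these differently.

```python
HARDWARE = ("modems", "fax_cards", "license_dongles",
            "security_dongles", "hardware_encryption")


def candidacy_failures(req):
    """Return the reasons a requested system fails the candidacy rules."""
    failures = []
    if not 1 <= req["processors"] <= 4:
        failures.append("Processors must be between 1 and 4")
    if not 1 <= req["nics"] <= 4:
        failures.append("NICs must be between 1 and 4")
    if not 0 <= req["avg_cpu"] <= 1:
        failures.append("Average CPU utilization must be between 0 and 1")
    elif req["processors"] == 4 and req["avg_cpu"] > 0.5:
        failures.append("4-processor systems cannot average over 50% CPU")
    if req["avg_ram_mb"] > 8 * 1024:
        failures.append("Average RAM utilization exceeds 8 GB")
    if req["avg_nic_mbps"] > 100:
        failures.append("Average NIC utilization exceeds 100 MBps")
    if req["max_disk_io_mbps"] > 100:
        failures.append("Maximum disk IO exceeds 100 MBps")
    for device in HARDWARE:
        if req[device]:  # assumed: special hardware rules out virtualization
            failures.append("Requires " + device.replace("_", " "))
    if not req["vendor_support"]:
        failures.append("Vendor does not support virtualization")
    return failures


# The "new web server" example from the list above
new_web_server = {
    "processors": 1, "nics": 1, "avg_cpu": 0.05, "avg_ram_mb": 256,
    "avg_nic_mbps": 1, "max_disk_io_mbps": 10, "modems": False,
    "fax_cards": False, "license_dongles": False,
    "security_dongles": False, "hardware_encryption": False,
    "vendor_support": True,
}
print(candidacy_failures(new_web_server))  # prints []
```

Flipping a single value, such as `license_dongles`, and re-running the check mirrors the What-If questions the spreadsheet is meant to answer.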
ADD NEW SUPPORTED GUEST OS
VMware’s Guest OS Compatibility Guide (“VMware, Inc.”) is exhaustive and does not line up with Murray State University’s environment. The drop-down list is populated from a hidden worksheet within Excel. For our environment, we limited this drop-down to guest OSes which have regularly maintained templates.
To add, delete, or change an entry in the operating system list follow these steps:
- Toward the bottom of Excel, right click the current worksheet

- From the context menu, select “Unhide…”

- From the Unhide Window, Select ‘Supported Guest Operating Systems’ and press OK

- Navigate to the ‘Supported Guest Operating Systems’ Worksheet. Make changes in Column A. Only changes in Column A will be reflected in the spreadsheet. Save your changes.
Future Considerations
- Support for advanced disk layouts
- Get-Template feeding the ‘Supported Guest OS’ worksheet
- 1 click ‘deploy from template’
- Support for tiered storage
- Graphs of compute resources by host and virtual machine
# VMware Expert System Capacity Gathering
# v0.2
# by Andy Hill
# http://virtualandy.wordpress.com
# gathering data for VMware capacity
$viserver = Read-Host "Enter a vCenter server";
Write-Host "Gathering Excel data...1/8"
$excel = new-object -comobject Excel.Application
# Edit this value to the location of your vmware_expert_system.xlsm
$excelfile = $excel.workbooks.open("C:\Users\andy.hill\Documents\cis645\Project\vmware_expert_system\vmware_expert_system.xlsm")
$worksheet = $excelfile.worksheets.item(3) # Select Capacity Worksheet
Write-Host "Clearing existing capacity data...2/8"
# Clear existing data
$worksheet.Range("A5:N65000").Clear() | out-null
$worksheet.cells.item(1,2) = $viserver
Write-Host "Connecting to $viserver, this may take a moment...3/8"
connect-viserver $viserver -erroraction stop -WarningAction SilentlyContinue | out-null
# datastore information
Write-Host "Gathering disk information...4/8"
$i = 5
$disks = get-datastore
foreach($disk in $disks) {
$worksheet.cells.item($i, 1) = $disk.name;
$worksheet.cells.item($i, 2) = $disk.freespaceMB;
$worksheet.cells.item($i, 3) = $disk.capacityMB;
$i++;
}
$disk_count = $i;
$i = 5
Write-Host "Gathering host information...5/8"
# host information
Get-VMHost | %{Get-View $_.ID} | %{
$esx = "" | select Name, NumCpuPackages, NumCpuCores, Hz, Memory
$esx.NumCpuPackages = $_.Hardware.CpuInfo.NumCpuPackages
$esx.NumCpuCores = $_.Hardware.CpuInfo.NumCpuCores
$esx.Hz = $_.Hardware.CpuInfo.Hz
$esx.Memory = $_.Hardware.MemorySize
$esx.Name = $_.Name
$worksheet.cells.item($i, 6) = $esx.Name
$worksheet.cells.item($i, 7) = $esx.NumCpuPackages
$worksheet.cells.item($i, 8) = $esx.NumCpuCores
$worksheet.cells.item($i, 9) = $esx.hz / 1000 / 1000
$worksheet.cells.item($i, 10) = $esx.memory / 1024 / 1024;
$i++;
}
$host_count = $i;
# vm information
$i = 5
Write-Host "Gathering virtual machine information...6/8"
get-vm | % {
    $worksheet.cells.item($i, 13) = $_.Name
    $worksheet.cells.item($i, 14) = $_.MemoryMB
    $i++;
}
# Create the totals and amount utilized
# VM rows start at row 5; $i now points one row past the last VM
$worksheet.cells.item(($i+1),13) = "Total"
$worksheet.cells.item(($i+1),14) = "=sum(N5:N" + $i + ")"
$vm_count = $i;
Write-Host "Writing values to Excel Spreadsheet...7/8"
#add some formatting
$worksheet.cells.item(($disk_count + 2), 1) = "Datastore with most free space";
$worksheet.cells.item(($disk_count + 3), 1) = "Memory (MB) Available";
$worksheet.cells.item(($disk_count + 4), 1) = "Memory Utilization %";
$worksheet.cells.item(($disk_count + 5), 1) = "Storage Available (GB)";
$worksheet.cells.item(($disk_count + 6), 1) = "Storage Utilization %";
$worksheet.cells.item(($disk_count + 7), 1) = "Most Storage Available on a datastore (GB)";
# add the formulas
$worksheet.cells.item(($disk_count + 2), 2) = "=INDEX((A5:A" + $disk_count + "),MATCH(MAX(B5:B" + $disk_count + "),B5:B" + $disk_count + ",0))";
$worksheet.cells.item(($disk_count + 3), 2) = "=SUM(J5:J" + $host_count + ") - N" + ($vm_count+1);
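$worksheet.cells.item(($disk_count + 4), 2) = "=N" + ($vm_count+1) + "/SUM(J5:J" + ($host_count-1) + ")"; # $host_count-1 is the last populated host row, so this divides by the memory of all hosts (no host is held back for HA failover)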
$worksheet.cells.item(($disk_count + 5), 2) = "=SUM(B5:B" + $disk_count + ")/1024";
$worksheet.cells.item(($disk_count + 6), 2) = "=1-SUM(B5:B" + $disk_count + ")/SUM(C5:C" + $disk_count + ")";
$worksheet.cells.item(($disk_count + 7), 2) = "=INDEX((B5:B" + $disk_count + "),MATCH(MAX(B5:B" + $disk_count + "),B5:B" + $disk_count + ",0))/1024";
Write-Host "Saving Excel Spreadsheet...8/8";
# Select main worksheet
$worksheet = $excelfile.worksheets.item(1);
# Update the 'new virtual machine' worksheet with capacity data
$worksheet.cells.item(8,4) = "=Capacity!B" + ($disk_count + 5) + "-'New Virtual Machine'!B8";
$worksheet.cells.item(8,7) = "=MAX(Capacity!B5:" + "B" + ($disk_count - 1) + ")/1024";
$worksheet.cells.item(9,4) = "=(Capacity!B" + ($disk_count +3) + ")/1024";
$worksheet.cells.item(29,2) = "=Capacity!B" + ($disk_count + 2);
$excel.activeworkbook.save();
$excel.quit();
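To make the worksheet formulas easier to follow, here is the same arithmetic as a Python sketch. The function name and the sample figures are made up for illustration; the calculations mirror the formulas the script writes to the Capacity worksheet (free space and capacity in MB for datastores, memory in MB for hosts and VMs).

```python
def capacity_summary(datastores, host_memory_mb, vm_memory_mb):
    """Summarize capacity the way the Capacity worksheet formulas do.

    datastores: list of (name, free_mb, capacity_mb) tuples.
    host_memory_mb / vm_memory_mb: lists of memory sizes in MB.
    """
    best = max(datastores, key=lambda d: d[1])  # most free space
    total_free = sum(d[1] for d in datastores)
    total_capacity = sum(d[2] for d in datastores)
    host_total = sum(host_memory_mb)
    vm_total = sum(vm_memory_mb)
    return {
        "datastore_with_most_free_space": best[0],
        "memory_available_mb": host_total - vm_total,
        "memory_utilization": vm_total / host_total,
        "storage_available_gb": total_free / 1024,
        "storage_utilization": 1 - total_free / total_capacity,
        "most_storage_on_a_datastore_gb": best[1] / 1024,
    }


# Hypothetical environment: two datastores, two hosts, three VMs
summary = capacity_summary(
    datastores=[("ds1", 102400, 512000), ("ds2", 307200, 512000)],
    host_memory_mb=[32768, 32768],
    vm_memory_mb=[4096, 8192, 4096],
)
for label, value in summary.items():
    print(label, value)
```

With these sample numbers, 16384 MB of the 65536 MB of host memory is assigned to VMs, so memory utilization is 25% and 49152 MB remains available.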
This was our shop’s first real dive into kickstarts. The material I read in Visible Ops really emphasized trackable/repeatable processes for setting up systems. One great way to do that is through kickstart scripts and some kind of version control system. We used Subversion.
I’ve edited a few parts out of this, but I spent a while finding several kickstart scripts that accomplished parts of what we needed. I highly customized one for our environment.
What it does:
- Configures licensing for the host using a license server
- Configures NTP
- Adds users, expires their accounts and configures a sudo group
- MOTD
- Configures NICs and VMware ESX Networking
- Creates a script to download and install IBM iSCSI Host Utilities Kit
- Creates a script to download and install QLA4050C BIOS and firmware updates
Thanks to Leo’s ESX 3.5 Kickstart script – part 3.
You will need to download IBM iSCSI Host Utilities Kit from IBM and the QLA4050C BIOS and Firmware from QLogic to a server with scp capabilities.
# make sure this file is UNIX formatted so the line breaks can be handled.
install
lang en_US.UTF-8
langsupport --default en_US.UTF-8
keyboard us
mouse genericwheelps/2 --device psaux
skipx
network --device eth0 --bootproto static --ip <ip> --netmask <netmask> --gateway <gw> --nameserver <dns1>,<dns2> --hostname <hostname> --addvmportgroup=0 --vlanid=0
# Encrypted root password
rootpw --iscrypted <password>
firewall --enabled
authconfig --enableshadow --enablemd5
timezone America/Chicago
bootloader --location=mbr
# The following is the partition information you requested
# Note that any partitions you deleted are not expressed
# here so unless you clear all partitions first, this is
# not guaranteed to work
vmaccepteula
# test license server
vmlicense --mode=server --server=27000@<vc> --edition=esxFull --features=vsmp,backup
reboot
firewall --enable
clearpart --exceptvmfs --drives=sda
part /boot --fstype ext3 --size=100 --ondisk=sda
part / --fstype ext3 --size=1800 --grow --maxsize=5000 --ondisk=sda
part swap --size=544 --grow --maxsize=544 --ondisk=sda
part /var/log --fstype ext3 --size=100 --grow --ondisk=sda
%packages
grub
@base
%post
cat > /etc/rc.d/rc3.d/S11servercfg << EOF
#Configure NTP
echo "Configuring NTP"
chkconfig --level 345 ntpd on
echo "restrict kod nomodify notrap noquery nopeer" > /etc/ntp.conf
echo "restrict 127.0.0.1" >> /etc/ntp.conf
echo "server <ntp>" >> /etc/ntp.conf
echo "driftfile /var/lib/ntp/drift" >> /etc/ntp.conf
echo "<ntp>" > /etc/ntp/step-tickers
service ntpd start
#Adding users with default password "changeme" generated with `openssl passwd changeme`
echo "Adding users"
adduser <user1> -p MKgX23V6snwoc
chage -d 0 -M 99999 <user1>
adduser <user2> -p MKgX23V6snwoc
chage -d 0 -M 99999 <user2>
adduser <user3> -p MKgX23V6snwoc
chage -d 0 -M 99999 <user3>
usermod -G wheel <user1>
usermod -G wheel <user2>
usermod -G wheel <user3>
echo "Done adding users"
echo "Configuring sudoers"
cat > /etc/sudoers << SUDO
# sudoers file.
#
# This file MUST be edited with the 'visudo' command as root.
#
# See the sudoers man page for the details on how to write a sudoers file.
#
# Host alias specification
# User alias specification
# Cmnd alias specification
# Defaults specification
Defaults syslog=local2
# User privilege specification
root ALL=(ALL) ALL
# Uncomment to allow people in group wheel to run all commands
%wheel ALL=(ALL) ALL
# Same thing without a password
# %wheel ALL=(ALL) NOPASSWD: ALL
# Samples
# %users ALL=/sbin/mount /cdrom,/sbin/umount /cdrom
# %users localhost=/sbin/shutdown -h now
SUDO
echo "Done configuring sudoers"
echo "Configuring MOTD"
echo "MOTD HERE" > /etc/motd
echo "Done configuring MOTD"
echo "Configuring hosts file"
echo "ip hostname.fqdn hostname" >> /etc/hosts
echo "Done configuring hosts file"
# we have 6 nics
echo "Configuring NIC duplex/speeds"
/usr/sbin/esxcfg-nics -s 1000 -d full vmnic0
/usr/sbin/esxcfg-nics -s 1000 -d full vmnic1
/usr/sbin/esxcfg-nics -s 1000 -d full vmnic2
/usr/sbin/esxcfg-nics -s 1000 -d full vmnic3
/usr/sbin/esxcfg-nics -s 1000 -d full vmnic4
/usr/sbin/esxcfg-nics -s 1000 -d full vmnic5
echo "Done configuring NIC duplex/speeds"
echo "Configuring networking"
# VMNetwork
/usr/sbin/esxcfg-vswitch -a vSwitch1
# Blind Switch
/usr/sbin/esxcfg-vswitch -a vSwitch2
# VMkernel
/usr/sbin/esxcfg-vswitch -a vSwitch3
# Add NIC 1 and 3 to vSwitch1 (VMNetwork)
/usr/sbin/esxcfg-vswitch -L vmnic1 vSwitch1
/usr/sbin/esxcfg-vswitch -L vmnic3 vSwitch1
# Add NIC 2 to vSwitch0 (Service Console, already contains NIC 0)
/usr/sbin/esxcfg-vswitch -L vmnic2 vSwitch0
# Add NIC 4 and 5 to vSwitch3 (VMkernel)
/usr/sbin/esxcfg-vswitch -L vmnic4 vSwitch3
/usr/sbin/esxcfg-vswitch -L vmnic5 vSwitch3
# Give appropriate port group labels to vSwitches
/usr/sbin/esxcfg-vswitch -A "Blind Switch" vSwitch2
/usr/sbin/esxcfg-vswitch -A "VMkernel" vSwitch3
/usr/sbin/esxcfg-vswitch -A "VMNetwork" vSwitch1
# Configure IP addresses for service console and VMkernel
/usr/sbin/esxcfg-vswif -i <ip> -n 255.255.255.0 vswif0
/usr/sbin/esxcfg-vmknic -a -i <vmotion address> -n 255.255.255.0 VMotion
/usr/sbin/esxcfg-vswif -E
# Enable SSH Client through firewall
/usr/sbin/esxcfg-firewall -e sshClient
echo "Done configuring networking"
# generate script to download/install HUK, make it executable
echo "Generating host utilities download/install script"
cat > /root/huk-install.sh << HUK
cd /home/user/
scp user@host:/home/user/ibm_iscsi_esx_host_utilities_3_1.tar.gz .
tar -zxf ibm_iscsi_esx_host_utilities_3_1.tar.gz
cd ibm_iscsi_esx_host_utilities_3_1
./install
HUK
echo "Done generating host utilities download/install script"
chmod a+x /root/huk-install.sh
# generate script to download/install iscli and firmware/BIOS updates, make it executable
echo "Generating iscli and firmware update script"
cat > /root/iscli-script.sh << ISCLI
cd /home/user/
scp user@host:/home/user/iscli-1.2.00-15_linux_i386.install.tar.gz user@host:/home/user/ql4022rm.BIN user@host:/home/user/VER4032_03_00_01_53.zip .
tar -xvzf iscli-1.2.00-15_linux_i386.install.tar.gz
unzip VER4032_03_00_01_53.zip
chmod +x iscli.dkms.install.sh
./iscli.dkms.install.sh install
# HBA 0
/usr/local/bin/iscli -f 0 /home/user/qla4022.dl
sleep 5
/usr/local/bin/iscli -bootcode 0 /home/user/ql4022rm.BIN
sleep 5
# HBA 1
/usr/local/bin/iscli -f 1 /home/user/qla4022.dl
sleep 5
/usr/local/bin/iscli -bootcode 1 /home/user/ql4022rm.BIN
sleep 5
reboot
ISCLI
echo "Done generating iscli and firmware script"
# Moves this file so it will not be called on next host boot
mv /etc/rc.d/rc3.d/S11servercfg /root/unsw-setup.sh
rm -f /root/system-info
EOF
/bin/chmod a+x /etc/rc.d/rc3.d/S11servercfg
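Because the kickstart above is published with placeholders such as <ip> and <hostname>, one way to keep per-host files repeatable is to render them from a single template kept in version control. The Python below is a hypothetical sketch of that idea, not how we actually produced our files; the function name, the shortened template line, and the addresses (documentation-range examples) are all assumptions.

```python
def render_kickstart(template, values):
    """Replace each <placeholder> in the template with its per-host value."""
    rendered = template
    for name, value in values.items():
        rendered = rendered.replace("<" + name + ">", value)
    return rendered


# One line of the kickstart, shortened for the example
template = ("network --device eth0 --bootproto static --ip <ip> "
            "--netmask <netmask> --gateway <gw> --hostname <hostname>")

# Hypothetical values for a single host
host = {
    "ip": "192.0.2.10",
    "netmask": "255.255.255.0",
    "gw": "192.0.2.1",
    "hostname": "esx01.example.com",
}
print(render_kickstart(template, host))
```

Only the small dictionary of values differs between hosts, so a change to the shared template shows up as a single diff in Subversion.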
Here’s the ‘script’ we read from while doing our ESX upgrades:
In general:
- Do lots of up front work with kickstarts and analysis
Each ESX Host
- Put host in maintenance mode
- Shut Down
- File a request with the storage administrator to make sure only the boot LUN is visible to the host, as we are about to do some potentially damaging operations
- Put in new HBA (QLA4050)
- Boot to floppy diskette with QLA 4050 BIOS firmware updates
- Upgrade HBA BIOS
- iFlash
- If the system detects a QLx40xx controller, it displays the following message:
- QLx40xx Adapter found at I/O address: xxxxxxxx
- You will need to enter the adapter address
- Select “FB” to flash the BIOS. The iFlash program will write flash to the adapter using ql4022rm.BIN found in the same directory.
- Reboot. Press CTRL+Q on the second (new) HBA to manage boot settings
- Configure Host Adapter according to IP / initiator name
- Configure iSCSI Target
- You will need:
- iSCSI name
- IP Address
- Subnet Mask
- Default Gateway
- iSCSI Target
- IP Address:port
- Target Name
- Host Boot Settings = MANUAL
- Exit and Reboot
- Insert ESX 3.5 U4 CD (We don’t have PXE boot available yet)
- Reboot system to boot from ESX 3.5 U4 CD
- Install ESX 3.5 U4
- Type ‘esx ks=<url to kickstart file> ksdevice=eth0 method=cdrom’
- More on the kickstart file is here
- Press enter. This installs ESX with all appropriate settings. Ask someone for the root password.
- Log in as root
- sh iscli-script.sh (from the kickstart)
- sh huk-install.sh (from the kickstart)
- Launch VirtualCenter
- Disconnect the host from VirtualCenter (Right click, disconnect)
- Reconnect the host to VirtualCenter (Right click, connect)
- Enter maintenance mode (so no VMs are vMotioned on)
- VMotion doesn’t get set up correctly via kickstart because the host does not have shared storage. Contact the SAN Administrator to make the other ESX LUNs visible and rescan.
- Delete the VMKernel Switch
- Add the VMkernel switch (nic4 and nic5), enabling vmotion. <IP address> subnet <subnet> – no default GW since not routed
- Configuration -> Memory -> Increase Service Console RAM to 800MB
- Configure Storage Paths in Active/Passive
- Reboot Host (to enact Service Console RAM changes)
- Exit Maintenance Mode
vCenter Database Server
- Manually backup VMware database
BACKUP DATABASE [VMWare] TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\VMWare\VMWare_backup_preupgrade.bak' WITH NOFORMAT, NOINIT, NAME = N'VMWare-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO
- Manually backup UpdateManager
BACKUP DATABASE [UpdateManager] TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\UpdateManager\UpdateManager_backup_preupgrade.bak' WITH NOFORMAT, NOINIT, NAME = N'UpdateManager-Full Database Backup', SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO
- Grant MSDB owner permissions for SQL user
USE [msdb]
GO
EXEC sp_addrolemember N'db_owner', N'USER'
GO
vCenter Server
- Log in as local administrator
- Back up the License File
copy "C:\Program Files\VMware\VMware License Server\Licenses\vmware.lic" \\server\share\vmware-license-backup.lic
- Mount vCenter DVD ISO
- Back up sysprep files for templates
copy "C:\Documents and Settings\All Users\Application Data\VMware\VMware VirtualCenter\sysprep\*.*" \\server\share
- Run vCenter Install
- Reboot Server
- Notify users of upgrades
- Schedule times for VMware Tools Upgrades
vCenter Database Server
- Revoke MSDB owner permissions for SQL user
USE [msdb]
GO
EXEC sp_droprolemember N'db_owner', N'USER'
GO
We filed support requests with IBM and VMware and went through a very lengthy process without any results.
Each of our hosts had the following iSCSI HBAs:
- QLA4010
- QLA4050C
A while ago we found out QLA4010 is not on the ESX 3.5 HCL even though it runs with a legacy driver.
As our virtual environment grew we noticed storage performance lagging. This was particularly evident with our Oracle 10G Database server running our staging instance of Banner Operational Data Store. We were seeing 1.1 MB/sec and slower for disk writes.
We opened a case with VMware support and later with IBM support. We provided lots of data to both, yet no one at either company mentioned the unsupported HBA. VMware support referred us to KB# 1006821 to test virtual machine storage I/O performance.
We ran HD Speed in a new VM mimicking the setup using RDM and using a dedicated LUN. Similar results.
We ran HD Speed on the same RDM on a physical machine and got 45 MB/sec.
All of our hosts had an entry like this in the logs (grep -i abort /var/log/vmkernel* | less)
vmkernel.36:Mon DD HH:ii:ss vmkernel: 29:02:31:16.863 cpu3:1061)LinSCSI: 3201: Abort failed for cmd with serial=541442, status=bad0001, retval=bad0001
There were hundreds, if not thousands, of these iSCSI aborts in the log files. We punted to IBM and they recommended running the Host Utilities Kit, which optimizes HBA settings specific to IBM storage systems.
My recommendation ended up being twofold: upgrade the ESX hosts because we were on an old build (95xxx), and replace the QLA4010 with a QLA4050C on each host.
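If you want a count rather than paging through grep output, the abort lines can be tallied with a few lines of Python. This is a sketch that assumes the log lines look like the sample above; the function name and regular expression are my own and have only been checked against that one line format.

```python
import re

# Matches the abort message shown in the sample log line
ABORT_RE = re.compile(r"Abort failed for cmd with serial=(\d+), status=(\w+)")


def count_aborts(lines):
    """Count abort failures per status code in vmkernel log lines."""
    counts = {}
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            status = match.group(2)
            counts[status] = counts.get(status, 0) + 1
    return counts


sample = [
    "vmkernel.36:Mon DD HH:ii:ss vmkernel: 29:02:31:16.863 cpu3:1061)"
    "LinSCSI: 3201: Abort failed for cmd with serial=541442, "
    "status=bad0001, retval=bad0001",
    "an unrelated log line",
]
print(count_aborts(sample))  # prints {'bad0001': 1}
```

To run it against real logs, pass the lines of each /var/log/vmkernel* file (for example from `open(path)`) instead of the sample list.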
Now that our ESX upgrade is complete we are seeing much better performance from our iSCSI storage.
Yesterday I made a mistake. We have a virtual machine set up to test Spacewalk which runs CentOS.
It has a virtual disk for the OS on datastore1 and a virtual disk for the data on datastore2. datastore1 had 11 GB free and datastore2 had 300 GB free. I snapshotted the VM, we did some work, and I committed the snapshot. Except it didn’t work. Now the machine won’t stay booted. I remembered reading something from Yellow-Bricks about disk space and snapshots. Oops. Since this VM was on an ESXi host, there were no service console commands to commit the snapshot.
This error popped up, and the VM would power down:
There is no more space for the redo log of VMNAME-000001.vmdk.
I freed up some space on datastore1, but I couldn’t find how to commit the snapshot. There were several -delta.vmdk files in the virtual machine’s folder on datastore1.
Solution: After freeing up some disk space, I created another snapshot from the VI Client. Then I immediately went to “Delete All”. This got rid of the orphaned snapshot as well as the newly created one.
AutoPager is a Firefox extension which follows the “Next” links on lots of pages and loads them inline. If you’re already using the extension, go to AutoPager -> Update Setting -> Update Setting Online.
The authors just added VMTN forums and NetApp Technology Network to their supported sites. This means if you’re reading a long thread you don’t have to click next. You can just keep scrolling — the next page is loaded inline.
It also works on thread lists.
This is a screenshot of the “Loading” indicator in the bottom left. Once you scroll so far, it automatically shows up, then fetches the next page.
In our organization, the storage administrator is completely separate from the VI Administrator. This process requires some coordination with the storage administrator. Here is our process for restoring a VM from our SAN snapshots. A lot of this information was gleaned from Scott Lowe’s posts on FlexClones.
Unfortunately, we do not have SMVI (the jaw dropping video demo is here) at this moment. It appears NetApp has made this process trivial with that application. This is how we’re making it work on a limited budget.
Step 0 – Determine Snapshot to clone from
Working with the VMware admin, determine which Snapshot to clone from based on timestamp and LUN
Step 1 – Create LUN Clone
- Telnet to the filer
- Run this command to create the LUN clone:
lun clone create /vol/volume_name/lun_clone_name -o noreserve -b /vol/volume_name/original_lun_name parent_snapshot_name
- Verify new LUN is created using FilerView in a browser
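Since the clone command is typed by hand over telnet, it is easy to swap an argument. A small helper that assembles the exact string from Step 1 can reduce typos. This Python sketch only builds the text of the command; the function name and the sample volume, LUN and snapshot names are hypothetical.

```python
def lun_clone_command(volume, clone_name, original_lun, snapshot):
    """Build the 'lun clone create' command used in Step 1."""
    return (
        "lun clone create /vol/{0}/{1} -o noreserve "
        "-b /vol/{0}/{2} {3}".format(volume, clone_name, original_lun, snapshot)
    )


# Hypothetical names agreed on with the VMware admin in Step 0
print(lun_clone_command("vmware_vol", "restore_clone", "esx_lun1", "nightly.0"))
```

The output can be pasted into the filer session and compared against the snapshot chosen in Step 0 before pressing enter.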
Step 2 – Map clone LUN
- Log into FilerView for the filer
- In left column click on LUNS, then Manage
- Click on the name of the new LUN clone
- Click on Map LUN near the top
- Click on Add Groups to Map, and add to appropriate group
- Type a number (we typically use 99) into the box labeled LUN ID and click Apply
Step 3 – Enable Volume Resignature
- Launch VirtualCenter
- From VC, select a host
- Select the configuration tab
- Select advanced
- Navigate to LVM
- Change the value of LVM.EnableResignature to 1 (on; the default value is 0)
Step 4 – Rescan for the new LUN
- From the Configuration tab on a selected host, Navigate to Storage Adapters
- Select “Rescan”
- The recovered VMFS datastore will appear with a name similar to “snap_*”
From here, there are two options:
- Add the virtual machine to inventory and run from the recovered LUN
- Copy the virtual machine’s folder to another LUN, then add to inventory
- It is recommended that you copy the virtual machine’s folder to another LUN (non snap_*), and then add the recovered virtual machine to inventory.
Step 5 – Clean up
- Disable LVM.EnableResignature: repeat Step 3 of this document, but change the value back to 0.
- Ensure all VMs running on the recovery LUN are powered off
- From VC, select a host
- Select the configuration tab
- Select Storage
- Select the recovery LUN and click Remove
- Delete the LUN clone after VMware admin has finished removing
The Virtual Machine will be brought up as if it went down from a “dirty” shutdown. In a lot of cases, this is okay. For write intensive applications (like databases) you may have to go a few steps farther in restoring functionality.
Here’s my PlanetV12n Wish List (in no particular order):
- Provide feed customization. Strategy/Administration/Business Case/etc. Virtualization has turned into an extremely broad topic. Too much noise in the feed reader is a loss of value to PlanetV12n.
- Provide more virtualization related feeds from vendors like EMC, NetApp, Dell, and IBM.
- Require full articles. If there is resistance on this, just politely remind publishers that advertising is available via RSS
- Give us the option of having OPML output of PlanetV12n. Personally, I would prefer OPML-only, it gives users more control over what feeds they want to see. OPML can be imported into almost any feed reader. Lots of the bloggers on PlanetV12n are very interested in their subscriber statistics. Being published on PlanetV12n drives those numbers down.
My ideal setup for PlanetV12n: a form to generate an OPML file I can add to Google Reader. VMware’s site is full of these forms, so adding another can’t be that bad, right? ;-)
Select your role within IT: (checkboxes) Business / Strategy / Administration / Performance / Disaster Recovery / Evangelist / etc.
Tell us about your VMware Products: (checkboxes) ESX / ESXi / Workstation / Fusion / etc
Tell us about your vendors: IBM / Dell / NetApp / EMC / etc
… the list goes on. This could be useful for VMware’s marketers as well as end users.
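To show how little it would take, here is a sketch of the back end of such a form in Python: it filters a feed catalog by the reader's checkbox selections and emits OPML. The catalog entries, tags and example.com URLs are made up, and the function name is hypothetical; only the OPML layout (head/title and body/outline elements with an xmlUrl attribute) follows the published format.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog; a real one would list the PlanetV12n feeds
CATALOG = [
    {"text": "Example Storage Blog",
     "url": "https://example.com/storage/feed",
     "tags": {"NetApp", "Administration"}},
    {"text": "Example Strategy Blog",
     "url": "https://example.org/strategy/feed",
     "tags": {"Strategy"}},
]


def build_opml(title, catalog, interests):
    """Return an OPML document containing feeds that match any interest."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for feed in catalog:
        if feed["tags"] & interests:  # keep feeds sharing at least one tag
            ET.SubElement(body, "outline", type="rss",
                          text=feed["text"], xmlUrl=feed["url"])
    return ET.tostring(opml, encoding="unicode")


print(build_opml("My PlanetV12n", CATALOG, {"NetApp"}))
```

The resulting file can be imported into a feed reader, which keeps subscriber counts with the individual bloggers instead of the aggregate feed.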






