PowerShell: DataOnTAP and SID Convertions

This morning while standing up a new vScan A/V server I wanted to look up our McAfee service account.  I knew the account would be a domain account, and I knew it would be a member of the backup operators group on the filer.  With that in mind I ran the following.

[0:4]> Get-NaDomainUser -Group "Backup Operators"

SID
---
S-1-5-21-XXXXXXXX-XXXXXXXXX-XXXXXXXXX-112477
S-1-5-21-XXXXXXXX-XXXXXXXXX-XXXXXXXXX-111419
S-1-5-21-XXXXXXXX-XXXXXXXXX-XXXXXXXXX-146727

Well that’s rather useless… Unfortunately, the OnTAP API doesn’t provide a means to convert a SID to a NTAccount.  This is normally accomplished via the “cifs lookup” command on the Ontap CLI, but that doesn’t help us much from the toolkit.  Fortunately .Net provides a native means to perform this conversion.  This isn’t new to anyone who has been following PowerShell for a while (/\/\o\/\/ first posted these function way back in the Monad days), but that doesn’t make them any less useful!

Function ConvertTo-NTAccount
{
    Param(
        [Parameter(Mandatory=$true,
            HelpMessage="Enter the Sid to translate",
            ValueFromPipeLine=$true,
            ValueFromPipelineByPropertyName=$true
        )]
        [string]
        $SID
    )
    Process {
        $SIDObject = New-Object system.security.principal.securityidentifier($SID)
        write-output $SIDObject.translate([system.security.principal.ntaccount])
    }
}
Function ConvertTo-SID
{
    Param(
        [Parameter(Mandatory=$true,
            HelpMessage="Enter the NTAccount to translate in the form of domain\account",
            ValueFromPipeLine=$true,
            ValueFromPipelineByPropertyName=$true
        )]
        [string]
        $NTAccount
    )
    Process {
        $NTAccountObject = New-Object system.security.principal.NtAccount($NTaccount)
        write-output $NTAccountObject.translate([system.security.principal.securityidentifier])
    }
}

Armed with my trusty functions Let's try this again!
[0:15]> Get-NaDomainUser -Group "Backup Operators" | ConvertTo-NTAccount

Value
-----
GetAdmin\svcAV
GetAdmin\svcBackup
GetAdmin\svcMNV

Now that’s more like it!  This is what I Love about powershell.  In the past I would have had to push back on my sales rep, who would have inturn pushed back on the development team.  fast forward a year, and maybe I would have a workaround.  Or I would have had to try and glue a couple third party exe together (yuck). With PowerShell if I don’t like something I simply extend it in script.  No development, nothing complicated, just a couple line of PowerShell.  Best of all I can then provide this to the vendor as a concreate example of what I want in the next release (hint hint NetApp cifs lookup needs to be in the SDK!)

It really is just great stuff.
~Glenn

NetApp
Powershell

Comments (1)

Permalink

Cacti: Monitor protocol statistics for NetApp volumes

I have made no secret that I use two applications daily to monitor my infrastructure: Nagios and Cacti. I have created a fair number of scripts (and hopefully publishing more soon) to help Nagios monitor the different parts of the infrastructure, however I haven’t published many of my Cacti scripts previously.

One of the most useful is the config that I use to monitor the different protocol stats for volumes. I created an indexed query so that the single script, and accompanying XML file, are capable of monitoring all the volumes, and I can select which graphs to create for each volume. The polling script is loosely based off of the multi-protocol realtime volume statistics script that I created some time ago.

Download the template and script(s) here.

Some examples…

Total Operations, Latency
Cacti Volume Total Operations  Cacti Volume Total Latency
CIFS Operations, Latency
Cacti Volume CIFS Operations  Cacti Volume CIFS Latency
NFS Operations, Latency
Cacti Volume NFS Operations  Cacti Volume NFS Latency
iSCSI Operations, Latency
Cacti Volume iSCSI Operations  Cacti Volume iSCSI Latency

NetApp
Perl

Comments (7)

Permalink

Nagios: Checking for abnormally large NetApp snapshots

My philosophy with Nagios checks, especially with the NetApp, is that unless there are extenuating circumstances then I want all volumes (or whatever may be being checked) to be checked equally and at the same time. This means I don’t want to have to constantly add and remove checks from Nagios as volumes are added, deleted and modified. I would much rather have one check that checks all of the volumes and reports on them en masse. This means I don’t have to think about the check itself, but rather, only what it’s checking.

One of the many things that I regularly monitor on our multitude of NetApp systems is snapshots. We have had issues, especially with LUNs, where the snapshots have gotten out of control.

In order to prevent this, or at least hope that someone is watching the screen…, I wrote a quick script that checks to see if the total size of snapshots on a volume exceed the snap reserve. Since not all of our volumes have a snap reserve, I also put in the ability to check the size of the snaps against the percentage of free space left in the volume.

This last measure is a little strange, but I think it works fairly well. Take, for example, a 100GB volume. If it is 50% full (50GB), there is no snap reserve and the alert percentage is left at the default of 40% free space, then the alert will happen when snapshots exceed about 15GB. “But that’s not 40% of the free space”, I hear you saying. Ahhh, but it is…you see as the snapshot(s) grow, there is less free space, which means that it takes a larger percentage as the free space shrinks. So at 15GB of snapshots, there would be 35GB of free space, and 40% of 35GB is 14GB.

This causes the alerts to happen earlier than you may expect at first. You can adjust this number to be a percentage of the total space in the volume if you like…however, why not just set a snap reserve at that point? I chose to make the script this way in order to attempt to keep a little more free space in the volume, while not making a snap reserve mandatory.

One last word…please keep in mind this script does not check for a volume being filled, you should have other checks for that. This merely checks to see if snapshots have exceeded a threshold of space in the volume to prevent them from taking up too much space.

Bring on the Perl…

Continue Reading »

NetApp
Perl

Comments (2)

Permalink

PowerShell: DataOnTap Realtime Multiprotocol Volume Latency

I had some free time yesterday morning as I couldn’t sleep after the long weekend. I used the time to dig into into the DataOnTap PowerShell Toolkit.  I started with an easy port of one of Andrews performance monitoring scripts.  I won’t go into as it’s very straight forward, but I will say so far I am very pleased with the DataOnTAP toolkit. 

<#
.SYNOPSIS
    Get the different protocol latencies for a specified volume.
.DESCRIPTION
    Get the different protocol latencies for a specified volume.
.PARAMETER Volume
    Volume to retrieve the latency.
.PARAMETER Protocol
    Protocol to collect latency for valid values are 'all','nfs','cifs','san','fcp','iscsi'
.PARAMETER Interval
    The interval between iterations in seconds, default is 15 seconds
.PARAMETER Count
    the number of iterations to execute, default is infinant
.PARAMETER Controller
    NetApp Controller to query.
.EXAMPLE
    .Get-NaVolumeLatency.ps1 -Volume vol0 

    Get the average latency for all protocols on vol0
.EXAMPLE
    Get-NaVol | .Get-NaVolumeLatency.ps1 -Interval 5 -count 5 | ft

    Get the average latency for all protocols, all volumes, 5 samples, 5 seconds apart.
.EXAMPLE
    .Get-NaVolumeLatency.ps1 -Volume vol0 -protocol nfs

    Get the NFS latency for vol0
#>
[cmdletBinding()]
Param(
    [Parameter(Mandatory=$true,
        HelpMessage="Volume name to retrieve latency counters from.",
        ValueFromPipelineByPropertyName=$true,
        ValueFromPipeLine=$true
    )]
    [Alias("Name")]
    [string]
    $Volume
,
    [Parameter(Mandatory=$false)]
    [ValidateSet('all','nfs','cifs','san','fcp','iscsi')]
    [string]
    $Protocol='all'
,
    [Parameter(Mandatory=$false)]
    [int]
    $Interval=15
,
    [Parameter(Mandatory=$false)]
    [string]
    $count
,
    [Parameter(Mandatory=$false)]
    [NetApp.Ontapi.Filer.NaController]
    $Controller=($CurrentNaController)
)
Begin
{
    IF ($Protocol -eq 'all')
    {
       $Counters = @(
            @{
                Counter = 'read_latency'
                Base     = ''
                unit     = ''
            }
       ,
            @{
                Counter = 'write_latency'
                Base     = ''
                unit     = ''
            }
       ,
            @{
                Counter = 'other_latency'
                Base     = ''
                unit     = ''
            }
       ,
            @{
                Counter = 'avg_latency'
                Base     = ''
                unit     = ''
            }
        )
    }
    Else
    {
        $Counters = @(
            @{
                Counter  = "$($Protocol.ToLower())_read_latency"
                Base     = ''
                unit     = ''
            }
        ,
            @{
                Counter = "$($Protocol.ToLower())_write_latency"
                Base     = ''
                unit     = ''
            }
        ,
            @{
                Counter = "$($Protocol.ToLower())_other_latency"
                Base     = ''
                unit     = ''
            }
        )
    }
    foreach ($c in $Counters)
    {
        Get-NaPerfCounter -Name 'volume' -Controller $Controller |
            Where-Object {$_.name -eq $c.Counter} |
            ForEach-Object {
                $c.Base = $_.BaseCounter
                $c.unit = switch ($_.unit) {
                    "microsec"  {10000}
                    "millisec"  {1}
                }
            }
    }
}
Process
{            

    # Check if volume exists.
    if (-Not ((get-navol -Controller $Controller|select -ExpandProperty Name) -contains $Volume)) {
        Write-Warning "$volume doesn't exist!"
        break;
    }
    $iteration = 0
    $first = $null
    #loop untill we're done or Cntr ^c 
    while ((!$Count) -or ($iteration -le $count))
    {
        $second = New-Object Collections.HashTable
        Get-NaPerfData -Name volume `
            -Instances $Volume `
            -Controller $Controller `
            -Counters ($Counters|%{$_.base,$_.counter}) |
            Select-Object -ExpandProperty Counters |
            ForEach-Object {
                $second.add($_.Name,$_.value)
            }            

        if ($first -and $second)
        {
            $results = "" | Select-Object -Property ($Counters|%{$_.base,$_.counter})
            foreach ($v in $Counters)
            {
                IF ($second[$v.Base] -gt $first[$v.Base])
                {
                    #calculate the average over our interval
                    $avg = ($second[$v.Counter] - $first[$v.Counter])/($second[$v.Base] - $first[$v.Base])
                    #conver to ms
                    $results."$($v.Base)" = [math]::Round((($second[$v.Base] - $first[$v.Base])/$Interval))
                    $results."$($v.Counter)" = ("{0} ms" -f [math]::Round($avg/$v.unit))
                }
                Else
                {
                    $results."$($v.Base)" = 0
                    $results."$($v.Counter)" = "0 ms"
                }
            }
            Write-Output $results| Add-Member NoteProperty 'Volume' $Volume -PassThru
        }
        Start-Sleep -Seconds $Interval
        $first = $second.clone()
        $iteration++
    }
}

~Glenn

NetApp
Powershell

Comments (3)

Permalink

Monitoring for orphaned snapshots left by SMVI

NetApp’s SnapManager for Virtual Infrastructure (SMVI) is a great product, but it’s messy. If it encounters the any error, it seemingly forgets to delete the virtual machine snapshots from the Virtual Infrastructure before dying.

To prevent many orphans (I’ve seen as many as 20 on a single virtual machine) from happening, I created a quick Nagios check that simply alerts when it sees them.

This script is very elementary. It very simply uses a regex to check for any snapshots that match the default SMVI naming convention. For each one it finds, a counter is incremented. If any are found, the script returns an error to Nagios, which causes an alert to be sent.

#!/usr/bin/perl -w
#
# check_vi_smvi_snapshots.pl - written by Andrew Sullivan, 2010-06-16
#
# Please report bugs and request improvements at http://get-admin.com/blog/?p=1059
#
# A simple script to look for snapshots that match the name pattern that smvi uses.
# We are merely pulling a list of all snapshots, searching for the string "smvi" in 
# the name, if it's found, we return a warning condition.  This could lead to a 
# "false" positive if it runs while a snapshot series is still ongoing, but since
# the smvi snaps should be very short lived the condidition will not last unless
# the snap is left.
#
# Example:
#   ./check_vi_smvi_snapshots.pl --server your.esx.host --username you --password secret
#
 
use strict;
use warnings;
 
use FindBin;
use lib "$FindBin::Bin/../";
 
use VMware::VIRuntime;
 
# substitute the location of your nagios perl library
use lib "/usr/lib64/nagios/plugins";
use utils qw(%ERRORS);
 
Opts::parse();
Opts::validate();
 
Util::connect();
 
main();
 
Util::disconnect();
 
sub main {
 
	# the number of smvi snapshots
	my $smviSnaps = 0;
 
	# for setting the type of exit we want
	my $exitCondition = "";
 
	# we need MORs for each of the VMs on the host
	my $VMs = Vim::find_entity_views( view_type => 'VirtualMachine' );
 
	foreach my $vm (@$VMs) {
		if ($vm->snapshot) {			
			foreach my $childSnapshot (@{$vm->snapshot->snapshotInfo->rootSnapshotList}) {
				$smviSnaps += getSnaps($childSnapshot);
			}
 
		} else {
			#print $vm->name . " has no snapshots\n";
		}
	}
 
	if ($smviSnaps > 0) {
		print "WARNING - " . $smviSnaps . " SMVI snapshots exist.\n";
		$exitCondition = "WARNING";
 
	} else {
		print "OK - No SMVI snapshots exist.\n";
		$exitCondition = "OK";
 
	}
 
	Util::disconnect();
	exit $ERRORS{ $exitCondition };
}
 
sub getSnaps {
	my ($snapshotTree) = @_;
	my $snapcount = 0;
 
	# uncomment for debugging
	#print "Found snap: " . $snapshotTree->{name} . "\n";
 
	if ( $snapshotTree->{name} =~ /smvi/ ) {
		$snapcount++;
	}
 
	if ($snapshotTree->childSnapshotList) {
		foreach my $childSnapshot (@{$snapshotTree->childSnapshotList}) {
			$snapcount += getSnaps($childSnapshot);
		}
	}
 
	return $snapcount;
}

I’ve set the check to execute once an hour in my environment, as I don’t feel that granularity finer than that is needed…an hour’s worth of change is ok for an SMVI snapshot for me.

Nagios
NetApp
Perl
Scripting
Virtulization

Comments (0)

Permalink

PowerShell: Import NetApp AutoSupport

The first step to any problem is getting said problem into PowerShell! We all know the usual players here WMI, ADSI, .NET, COM, etc… but what about good old text. I get the impression text has been ignored as legacy. When text does get a little attention it is almost always treated like the unix world.  Get-Content | Select-String used in place of cat|grep… I offer a different approach, in my opinion you haven’t really ingested something until it’s in the form of an object. Ala, Import-ASUP, a PowerShell script to ingest a NetApp autosupport. Give it a try and look around at the object it spit’s out. You may be surprised just how much information we all beam back to the mother ship every week!

Continue Reading »

NetApp
Powershell
Scripting

Comments (3)

Permalink

PowerCLI: Remove SMVI snapshots

I wrote this script about a year ago to deal with errant SMVI snapshots, and was drafting this blog post when my rss feed caught me off guard. It appears Matt Robinson has beat me to the punch line.   He has produced a Perl script that cleans up any leftover snapshots, but if you favor a PowerShell approach… I give you Remove-SMVISnapshots.
Continue Reading »

PowerCLI
Powershell
VMware
Virtulization

Comments (5)

Permalink

PowerCLI: Speed boost, Find VM Snapshots by Name.

I use NetApp SnapManager for Virtual Infrastructure(SMVI) to back up 95% of my environment. SMVI isn’t perfect, and it’s kinda a pain, but it gives me the ability to back up 100-250 vm’s in less than 10min! Well the actual NetApp snapshot takes seconds. The rest of the time is spent waiting on ESX snapshots. The only real downside here is from time to time SMVI will fail to delete the ESX snapshots. I have been using PowerCLI to find and delete these snapshot, but the cmdlets are just too slow, for what I’m trying to do. I’ll post my Finalized SMVI cleanup script later. Until then, I give you finding snapshots really fast!

Continue Reading »

NetApp
PowerCLI
Powershell
Scripting
Uncategorized
VMware

Comments (4)

Permalink

PoshOnTap: Manage NetApp SAN from PowerShell Demo

Well it’s the night before VMworld, and I can’t sleep, so I’m catching up on my blog.  A while back I did a presentation to the PowerShell Virtual Users Group.  I demoed my PoshOnTap PowerShell Module, a lot has changed since that presentation.  Mainly I have a new version, but I’m still in the process of trying to get the nod from NetApp.  So in the meantime if you wished you could manage a NetApp SAN from PowerShell go check it out I have the first 30 minutes.  http://marcoshaw.blogspot.com/2009/08/windows-powershell-virtual-user-group.html

With a little luck I’ll get the nod to redistribute the manageontap c# assemblies, and I’ll post Version 2 of my PoshOnTap module!

~Glenn

NetApp
Powershell

Comments (0)

Permalink

Really NetApp?!? You didn’t use your own SDK?

So, this post irked me. Not because of the poster or his post (honest Andy, if you ever read this, I have nothing against you or your post! I’m happy to see another VMware/NetApp blogger!), but because of the script he referenced and the problem encountered. He has a good solution, but the problem shouldn’t exist.

You see, I hate RSH. I don’t know why (well, it is quite insecure, and it can require some configuration), but I hate it. SSH is only marginally better in this case…sure it’s secure, but you have to auth each time, and if you don’t (ssh keys), well, it’s only a little better than RSH (comms are encrypted, but compromise of a single account can lead to bad things on many hosts). The script that is referenced, one that NetApp recommends that admins use to verify that their aggregates have enough free space to hold the metadata for the volumes in OnTAP 7.3 (the metadata gets moved from the volumes to the aggregate in 7.3), uses RSH to execute commands that are then parsed in a somewhat rudimentary way to get information.

Sure, it’s effective, but it’s far from graceful…especially when you have a perfectly good and effective SDK at your disposal.

I was kind of bored, so I decided to rewrite the script using the SDK. This is the end result. It reports the same data, but uses the SDK to gather all of the necessary information to make a determination for the user. The new script is significantly shorter (10KB vs 25KB, 380 lines vs 980), and it requires only one login.

Thanks to NetApp for providing their SDK, and I hope that no one over there minds me refactoring…

Continue Reading »

NetApp
Perl

Comments (4)

Permalink