Monitoring for orphaned snapshots left by SMVI
NetApp’s SnapManager for Virtual Infrastructure (SMVI) is a great product, but it’s messy. If it encounters any error, it seemingly forgets to delete the virtual machine snapshots from the Virtual Infrastructure before dying.
To keep orphans from piling up (I’ve seen as many as 20 on a single virtual machine), I created a quick Nagios check that simply alerts when it sees them.
This script is very elementary. It walks the snapshot tree of every virtual machine and uses a regex to check for any snapshot whose name contains the string "smvi", which matches the default SMVI naming convention. For each one it finds, a counter is incremented. If any are found, the script returns a WARNING state to Nagios, which causes an alert to be sent.
    #!/usr/bin/perl -w
    #
    # check_vi_smvi_snapshots.pl - written by Andrew Sullivan, 2010-06-16
    #
    # Please report bugs and request improvements at http://get-admin.com/blog/?p=1059
    #
    # A simple script to look for snapshots that match the name pattern that smvi uses.
    # We are merely pulling a list of all snapshots, searching for the string "smvi" in
    # the name, if it's found, we return a warning condition. This could lead to a
    # "false" positive if it runs while a snapshot series is still ongoing, but since
    # the smvi snaps should be very short lived the condition will not last unless
    # the snap is left.
    #
    # Example:
    #   ./check_vi_smvi_snapshots.pl --server your.esx.host --username you --password secret
    #

    use strict;
    use warnings;

    use FindBin;
    use lib "$FindBin::Bin/../";

    use VMware::VIRuntime;

    # substitute the location of your nagios perl library
    use lib "/usr/lib64/nagios/plugins";
    use utils qw(%ERRORS);

    Opts::parse();
    Opts::validate();
    Util::connect();

    main();

    Util::disconnect();

    sub main {
        # the number of smvi snapshots
        my $smviSnaps = 0;

        # for setting the type of exit we want
        my $exitCondition = "";

        # we need MORs for each of the VMs on the host
        my $VMs = Vim::find_entity_views( view_type => 'VirtualMachine' );

        foreach my $vm (@$VMs) {
            if ($vm->snapshot) {
                # walk every root snapshot and its descendants
                foreach my $childSnapshot (@{$vm->snapshot->snapshotInfo->rootSnapshotList}) {
                    $smviSnaps += getSnaps($childSnapshot);
                }
            } else {
                #print $vm->name . " has no snapshots\n";
            }
        }

        if ($smviSnaps > 0) {
            print "WARNING - " . $smviSnaps . " SMVI snapshots exist.\n";
            $exitCondition = "WARNING";
        } else {
            print "OK - No SMVI snapshots exist.\n";
            $exitCondition = "OK";
        }

        Util::disconnect();
        exit $ERRORS{ $exitCondition };
    }

    # recursively count the snapshots in a tree whose name contains "smvi"
    sub getSnaps {
        my ($snapshotTree) = @_;
        my $snapcount = 0;

        # uncomment for debugging
        #print "Found snap: " . $snapshotTree->{name} . "\n";

        if ( $snapshotTree->{name} =~ /smvi/ ) {
            $snapcount++;
        }

        if ($snapshotTree->childSnapshotList) {
            foreach my $childSnapshot (@{$snapshotTree->childSnapshotList}) {
                $snapcount += getSnaps($childSnapshot);
            }
        }

        return $snapcount;
    }
I’ve set the check to execute once an hour in my environment, as I don’t feel that finer granularity is needed; an hour’s worth of change in an SMVI snapshot is acceptable for me.
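For anyone wiring this up, an hourly schedule might look roughly like the Nagios object definitions below. This is only a sketch under assumptions: the command name, the service description, the `generic-service` template, and the `you`/`secret` credentials are placeholders rather than my actual configuration, and it assumes the script has been copied into the plugin directory that `$USER1$` points to. It also assumes the default `interval_length` of 60 seconds, so a `check_interval` of 60 works out to one hour.

```
# Sketch only: object names, the template, and the credentials are placeholders.
define command {
    command_name    check_vi_smvi_snapshots
    command_line    $USER1$/check_vi_smvi_snapshots.pl --server $HOSTADDRESS$ --username $ARG1$ --password $ARG2$
}

define service {
    use                     generic-service   ; assumed template from the sample configuration
    host_name               your.esx.host
    service_description     SMVI orphaned snapshots
    check_command           check_vi_smvi_snapshots!you!secret
    check_interval          60                ; 60 x interval_length (60 s by default) = one hour
}
```

Passing the password as a command argument exposes it in the process list while the check runs, so a dedicated read-only account is probably a sensible choice for this check.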