Posted: December 9th, 2010 | Author: Fabio Rapposelli | Filed under: Virtualization | Tags: Copy-on-Write, Snapshots, Storage, VAAI, VMware
This post originally appeared on Juku, but I find it technical enough to be featured on my personal blog as well.
By now everyone and his dog has already written a post about VAAI, so I won't bother you with an extensive explanation of what VAAI is and why it's crucial to virtualization. I will simply refer you to a couple of posts that explain its current implementation in detail:
My focus will be on how I think VAAI could be accelerated even further by enhancing its storage side.
To explain my point of view I will draw an analogy with a common feature found in storage arrays today: point-in-time copies.
Point-in-time (PIT) copies, sometimes referred to as snapshots, are a really valuable feature: they provide a consistent point-in-time image of a specified data set that can be used for tasks such as backups, environment duplication and so on.
Traditionally, PIT copies were made using a technique called Copy-on-Write, which is suitable for a small number of PIT copies of a single LUN, but its performance penalty takes its toll as soon as the first PIT copy is created. The PIT copy concept was pioneered by IBM with its FlashCopy functionality.
NetApp innovated the approach to PIT copies with a different, pointer-based snapshot technique. This almost completely eliminated the performance issue and made a massive number of snapshots per LUN possible, unlocking the full potential of the PIT concept. This post explains in detail how the Compellent Storage Center pointer-based snapshots work; however, this is not specific to Compellent: almost all the next-generation storage arrays (such as IBM XIV, NetApp FAS, 3PAR InServ, Dell EqualLogic, HP LeftHand and many others) use the same approach.
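To make the performance difference concrete, here is a minimal Python sketch that counts back-end I/Os for the first overwrite of a block after a snapshot. The class names and the cost model (one I/O per block read or write, metadata updates treated as free) are assumptions made for illustration; this is not how any specific array is implemented.

```python
class CopyOnWriteVolume:
    """Toy copy-on-write volume: old data is copied out before an overwrite."""

    def __init__(self, nblocks):
        self.blocks = [0] * nblocks   # live data, always updated in place
        self.snapshot_area = {}       # block index -> preserved old data
        self.snapshot_active = False
        self.io_count = 0

    def snapshot(self):
        self.snapshot_active = True
        self.snapshot_area = {}

    def write(self, index, data):
        if self.snapshot_active and index not in self.snapshot_area:
            old = self.blocks[index]          # 1 read of the original block
            self.snapshot_area[index] = old   # 1 write to the snapshot area
            self.io_count += 2
        self.blocks[index] = data             # 1 write of the new data
        self.io_count += 1


class PointerBasedVolume:
    """Toy pointer-based volume: new data lands in a fresh block, the map is repointed."""

    def __init__(self, nblocks):
        self.store = [0] * nblocks             # physical blocks
        self.live_map = list(range(nblocks))   # logical block -> physical block
        self.snapshot_map = None
        self.io_count = 0

    def snapshot(self):
        # Metadata-only operation: freeze a copy of the pointer map.
        self.snapshot_map = list(self.live_map)

    def write(self, index, data):
        self.store.append(data)                # 1 write to a new physical block
        self.live_map[index] = len(self.store) - 1
        self.io_count += 1


cow = CopyOnWriteVolume(8)
ptr = PointerBasedVolume(8)
for volume in (cow, ptr):
    volume.snapshot()
    for block in range(4):
        volume.write(block, 1)

print("copy-on-write I/Os:", cow.io_count)
print("pointer-based I/Os:", ptr.io_count)
```

Under this simplified model, overwriting four blocks after a snapshot costs 12 I/Os with copy-on-write and 4 with the pointer-based layout, while both still return the original data when the snapshot is read.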
So basically we have a great concept (PIT copies) with most of its potential still locked by its implementation (Copy-on-Write), and then we have an innovator that unlocks its full potential with a clever implementation. I'm pretty sure that VAAI is still in its “Copy-on-Write” stage of life.
As you already know, VAAI is implemented using an extended SCSI command set. Let's take as an example the most sought-after feature: the Hardware Offloaded Copy.
The hardware offloaded copy, in my opinion, could be accelerated by something like 100000x, making every cloning task a matter of a few seconds. Here's how:
Keep in mind how a pointer-based snapshot works and bear with me through the explanation:

A 16GB VM sitting in a 128GB Datastore is currently accessed by an ESX host.

Then a VAAI-enabled clone request is issued by the host. The storage array, instead of doing a real block-to-block copy, simply creates a “map” of pointers for the cloned VM on another portion of the datastore, reserving its space but without copying a single block. This operation should take the same time as a normal snapshot: a few seconds.

Then the host starts to write to the newly cloned VM, and the delta differences are stored in the blocks reserved by the “map” previously created.
A similar task can already be done today using snapshots, but it immediately becomes cumbersome because every clone needs to reside on its own LUN and datastore. This approach, instead, can be applied “inside” a datastore, streamlining deployments. Just imagine a VDI infrastructure relying on such a cloning technique!
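The clone steps described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Datastore` class, its method names and the file names are hypothetical, the space reservation step is left out, and nothing here reflects the actual VAAI SCSI primitives or any array firmware.

```python
class Datastore:
    """Toy datastore whose files are lists of pointers to shared physical blocks."""

    def __init__(self):
        self.blocks = []   # physical block contents
        self.files = {}    # file name -> list of physical block indexes

    def create(self, name, data_blocks):
        # Lay the file out on fresh physical blocks.
        start = len(self.blocks)
        self.blocks.extend(data_blocks)
        self.files[name] = list(range(start, start + len(data_blocks)))

    def clone(self, source, target):
        # Offloaded clone: duplicate only the pointer map, copy no data block.
        self.files[target] = list(self.files[source])

    def write(self, name, logical_index, data):
        # Store the delta in a new block and repoint only this file's map.
        self.blocks.append(data)
        self.files[name][logical_index] = len(self.blocks) - 1

    def read(self, name, logical_index):
        return self.blocks[self.files[name][logical_index]]


ds = Datastore()
ds.create("vm01.vmdk", ["a", "b", "c", "d"])
ds.clone("vm01.vmdk", "vm02.vmdk")    # instant: pointers only
ds.write("vm02.vmdk", 1, "B")         # delta lands in a new block

print(ds.read("vm01.vmdk", 1), ds.read("vm02.vmdk", 1), len(ds.blocks))
```

After the clone and one write, the source still reads `b`, the clone reads `B`, and the datastore holds five physical blocks instead of the eight a full copy would need.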
I'm sure that storage vendors will try to integrate and innovate their respective VAAI implementations. I hope this post made you realize how powerful the still-evolving VAAI approach can be.
Posted: December 7th, 2010 | Author: Fabio Rapposelli | Filed under: Storage, Virtualization | Tags: blogging, juku, Storage, Virtualization
It's been a long time since my last post. As you may already know, I've been very busy obtaining the VCDX certification, and I've also been knee-deep in getting a new blog online: Juku.it
Enrico and I decided to channel our blogging efforts into a more open and agnostic form, without being tied to a specific vendor (I work as an Architect at a small consulting firm called Cinetica), and so Juku was born. As told in the “Why Juku?” section, jukus are private Japanese schools intended to help students improve their performance in regular school work and better prepare for exams. That's precisely our goal: not to replace the traditional information channels, but to augment them with our opinions in the most unbiased manner possible.
I will continue to post the more technical articles and my personal thoughts on P2V It!, so don't unsubscribe from it.
Posted: September 14th, 2010 | Author: Fabio Rapposelli | Filed under: Storage | Tags: in, Storage, Tool, Tools
Last week I updated my old WWN Decoder page (renamed “Storage Tools”) with three useful storage widgets: a RAID Space Calculator, a RAW IOPS Calculator and a Replication Bandwidth Calculator.
The IOPS calculator is a bit simplistic right now; I'm trying to improve it to include latency and other determining factors in the IOPS calculation.
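As an illustration of how latency feeds into a raw IOPS figure, here is a minimal Python sketch based on the common rule of thumb that a spinning disk services one random I/O per average seek plus half a rotation. It is an assumption-laden estimate, not the logic behind the Storage Tools page; the function names and the sample seek time are hypothetical.

```python
def disk_iops(rpm, avg_seek_ms):
    """Estimate the random IOPS of a single spinning disk.

    Average rotational latency is the time of half a revolution;
    transfer time and caching effects are ignored.
    """
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000 / (avg_seek_ms + rotational_latency_ms)


def raw_iops(disk_count, rpm, avg_seek_ms):
    """Raw (pre-RAID-penalty) IOPS of a group of identical disks."""
    return disk_count * disk_iops(rpm, avg_seek_ms)


# Example: ten 15000 RPM disks with an assumed 3 ms average seek time.
print(round(disk_iops(15000, 3.0)))
print(round(raw_iops(10, 15000, 3.0)))
```

With these assumed inputs a single 15000 RPM disk comes out at 200 IOPS (2 ms of rotational latency plus 3 ms of seek), and ten of them at 2000 raw IOPS before any RAID write penalty is applied.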
Comments and suggestions are VERY welcome!
Posted: July 29th, 2010 | Author: Fabio Rapposelli | Filed under: Storage | Tags: Active Directory, FAS, in, NetApp, Storage
A couple of weeks ago I was preparing a demo lab for a technology event held by my company here in San Marino, and I had to join a couple of NetApp filers to an Active Directory environment.
The process itself is very simple, but there are a couple of things to keep in mind regarding the time, so I thought it would be nice to share them.
Before starting, here's a bit of background on why the clock is so important:
Active Directory authentication is based on a protocol called Kerberos, which uses a ticketing system to grant access. The system time is very important because:
[...] In order to prevent intruders from resetting their system clocks in order to continue to use expired tickets, Kerberos V5 is set up to reject ticket requests from any host whose clock is not within the specified maximum clock skew of the KDC. Similarly, hosts are configured to reject responses from any KDC whose clock is not within the specified maximum clock skew of the host. The default value for maximum clock skew is 300 seconds, or five minutes. [...]
(taken from the Kerberos V5 System Administrator’s Guide).
So, basically, if the system clock of a machine is not within the 5-minute range, Kerberos denies the authentication with the error “clock skew too great”.
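The rule can be expressed as a simple comparison. The following Python sketch uses the 300-second default from the quote above; the function name and the sample timestamps are hypothetical, and a real deployment would compare the filer clock against the Domain Controller rather than hard-coded values.

```python
from datetime import datetime, timedelta

# Default maximum clock skew quoted from the Kerberos V5 guide: 300 seconds.
MAX_CLOCK_SKEW = timedelta(seconds=300)


def within_clock_skew(host_time, kdc_time, max_skew=MAX_CLOCK_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(host_time - kdc_time) <= max_skew


kdc = datetime(2010, 2, 17, 14, 54)
print(within_clock_skew(kdc + timedelta(minutes=4), kdc))   # inside the window
print(within_clock_skew(kdc - timedelta(minutes=6), kdc))   # "clock skew too great"
```

The check is symmetric: a host that is six minutes ahead is rejected just like one that is six minutes behind.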
In order to avoid this we need to make sure that the clock of our NetApp FAS is within the acceptable range, because even the join cannot complete if the clocks are not aligned. So, first of all, issue a date command with this syntax:
demo02> date 201002171454
Warning: syncing time to an external time source which will eventually override the time set by the date command.
201002171454, in the format YYYYMMDDhhmm, means:
February 17th, 2010, 2:54pm
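If you want to double-check a timestamp before typing it on the filer, the same YYYYMMDDhhmm layout can be parsed with a few lines of Python. This is only a convenience sketch run on a workstation; the helper name is hypothetical and it does not talk to the filer.

```python
from datetime import datetime


def parse_filer_date(stamp):
    """Parse a YYYYMMDDhhmm string such as the one passed to the date command."""
    return datetime.strptime(stamp, "%Y%m%d%H%M")


parsed = parse_filer_date("201002171454")
print(parsed.strftime("%B %d, %Y %I:%M%p"))
```

A malformed value (for example a month of 13) raises `ValueError`, which catches typos before they reach the console.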
And then we need to configure NTP on the filer to keep its time in sync with the Domain Controllers:
demo02> options timed.enable off
demo02> options timed.proto ntp
demo02> options timed.servers <NTP SERVER ADDRESS>
demo02> options timed.max_skew 5m
demo02> options timed.enable on
Now you can proceed with the domain join, which is a very simple wizard-like interactive procedure. The command is cifs setup, and here is a transcript:
demo02> cifs setup
This process will enable CIFS access to the filer from a Windows(R) system.
Use "?" for help at any prompt and Ctrl-C to exit without committing changes.
Your filer does not have WINS configured and is visible only to
clients on the same subnet.
Do you want to make the system visible via WINS? [n]:
A filer can be configured for multiprotocol access, or as an NTFS-only
filer. Since multiple protocols are currently licensed on this filer,
we recommend that you configure this filer as a multiprotocol filer
(1) Multiprotocol filer
(2) NTFS-only filer
Selection (1-2)? [2]: 2
CIFS requires local /etc/passwd and /etc/group files and default files
will be created. The default passwd file contains entries for 'root',
'pcuser', and 'nobody'.
Enter the password for the root user []:
Retype the password:
The default name for this CIFS server is 'DEMO02'.
Would you like to change this name? [n]:
Data ONTAP CIFS services support four styles of user authentication.
Choose the one from the list below that best suits your situation.
(1) Active Directory domain authentication (Active Directory domains only)
(2) Windows NT 4 domain authentication (Windows NT or Active Directory domains)
(3) Windows Workgroup authentication using the filer's local user accounts
(4) /etc/passwd and/or NIS/LDAP authentication
Selection (1-4)? [1]: 1
What is the name of the Active Directory domain? [HANDS-ON.LOCAL]: HANDS-ON.LOCAL
In order to create an Active Directory machine account for the filer,
you must supply the name and password of a Windows account with
sufficient privileges to add computers to the HANDS-ON.LOCAL domain.
Enter the name of the Windows user [Administrator@HANDS-ON.LOCAL]: Administrator@HANDS-ON.LOCAL
Password for Administrator@HANDS-ON.LOCAL:
CIFS - Logged in as Administrator@HANDS-ON.LOCAL.
The user that you specified has permission to create the filer's
machine account in several (2) containers. Please choose where you
would like this account to be created.
(1) CN=computers
(2) OU=Domain Controllers
(3) None of the above
Selection (1-3)? [1]: 1
CIFS - Starting SMB protocol...
It is highly recommended that you create the local administrator
account (DEMO02\administrator) for this filer. This account allows
access to CIFS from Windows when domain controllers are not
accessible.
Do you want to create the DEMO02\administrator account? [y]:
Enter the new password for DEMO02\administrator:
Retype the password:
Currently the user "DEMO02\administrator" and members of the group
"HANDS-ON\Domain Admins" have permission to administer CIFS on this
filer. You may specify an additional user or group to be added to the
filer's "BUILTIN\Administrators" group, thus giving them
administrative privileges as well.
Would you like to specify a user or group that can administer CIFS? [n]: n
Welcome to the HANDS-ON.LOCAL (HANDS-ON) Active Directory(R) domain.
CIFS local server is running.
As you can see, it's a really simple and straightforward process, and you can even fire up compmgmt.msc from your Windows box and point it to the NetApp to see and map shares!
Posted: July 26th, 2010 | Author: Fabio Rapposelli | Filed under: Storage | Tags: Aggregate, extend, in, NetApp, Storage
In the last couple of days I had the pleasure of playing around with a FAS2020, the smallest unified storage system made by NetApp. It's a very nice machine indeed: it's really “user friendly” (from a UNIX admin perspective), is packed with great features (deduplication, snapshots and so on) and gives you the maximum degree of flexibility when it comes down to troubleshooting.
During my tests with this FAS I found myself with the wrong aggregate layout:
fas2020-01> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
Plex /aggr0/plex0 (online, normal, active)
RAID group /aggr0/plex0/rg0 (normal)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0c.00.0 0c 0 0 SA:B - SAS 15000 272000/557056000 274845/562884296
parity 0c.00.1 0c 0 1 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.2 0c 0 2 SA:B - SAS 15000 272000/557056000 274845/562884296
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 0c.00.3 0c 0 3 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.4 0c 0 4 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.5 0c 0 5 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.6 0c 0 6 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.7 0c 0 7 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.8 0c 0 8 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.9 0c 0 9 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.10 0c 0 10 SA:B - SAS 15000 272000/557056000 274845/562884296
spare 0c.00.11 0c 0 11 SA:B - SAS 15000 272000/557056000 274845/562884296
As you can see, my aggregate “aggr0” comprised just 3 disks. In fact this is a kind of “best practice” in the NetApp world, because the system volume “vol0” resides on the first aggregate and is usually kept separate from the real data to preserve the system in case something bad happens to the data disks.
But in my test situation I had to extend aggr0 to span 11 disks (leaving just 1 as a spare), using this command:
aggr add aggr0 8@300G
Immediately a stream of messages shows up on the console, stating that the disks have been added to aggr0:
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.10 Shelf 0 Bay 10 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXJLK5C] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.9 Shelf 0 Bay 9 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXJM20C] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.8 Shelf 0 Bay 8 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXJLT7C] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.7 Shelf 0 Bay 7 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXK5P2C] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.6 Shelf 0 Bay 6 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXHWVGC] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.5 Shelf 0 Bay 5 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXK4T2C] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.4 Shelf 0 Bay 4 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXK5VXC] to aggregate aggr0 has completed successfully
Wed Mar 31 13:46:21 GMT [raid.vol.disk.add.done:notice]: Addition of Disk /aggr0/plex0/rg0/0c.00.3 Shelf 0 Bay 3 [NETAPP X287_HVPBP288A15 NA00] S/N [JLXJZ2VC] to aggregate aggr0 has completed successfully
Addition of 8 disks to the aggregate has completed.
And if we check the system configuration again, we find that our aggregate has been extended:
fas2020-01> sysconfig -r
Aggregate aggr0 (online, raid_dp) (block checksums)
Plex /aggr0/plex0 (online, normal, active)
RAID group /aggr0/plex0/rg0 (normal)
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
dparity 0c.00.0 0c 0 0 SA:B - SAS 15000 272000/557056000 274845/562884296
parity 0c.00.1 0c 0 1 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.2 0c 0 2 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.3 0c 0 3 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.4 0c 0 4 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.5 0c 0 5 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.6 0c 0 6 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.7 0c 0 7 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.8 0c 0 8 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.9 0c 0 9 SA:B - SAS 15000 272000/557056000 274845/562884296
data 0c.00.10 0c 0 10 SA:B - SAS 15000 272000/557056000 274845/562884296
Spare disks
RAID Disk Device HA SHELF BAY CHAN Pool Type RPM Used (MB/blks) Phys (MB/blks)
--------- ------ ------------- ---- ---- ---- ----- -------------- --------------
Spare disks for block or zoned checksum traditional volumes or aggregates
spare 0c.00.11 0c 0 11 SA:B - SAS 15000 272000/557056000 274845/562884296
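To put numbers on the change, here is a small Python sketch of the data capacity before and after the extension. It assumes a single RAID-DP group with two parity disks (as in the output above) and uses the 272000 MB per-disk figure from the “Used” column; WAFL and snapshot reserves are ignored, so the space actually available to volumes will be lower. The function name is hypothetical.

```python
RAID_DP_PARITY_DISKS = 2   # one parity and one dparity disk per RAID-DP group


def data_capacity_mb(total_disks, used_mb_per_disk, parity_disks=RAID_DP_PARITY_DISKS):
    """Capacity contributed by the data disks of a single RAID group, in MB."""
    if total_disks <= parity_disks:
        raise ValueError("a RAID group needs at least one data disk")
    return (total_disks - parity_disks) * used_mb_per_disk


before = data_capacity_mb(3, 272000)    # original 3-disk aggr0
after = data_capacity_mb(11, 272000)    # aggr0 after adding 8 disks

print(before, after, after // before)
```

Going from 3 to 11 disks takes the aggregate from 1 data disk (272000 MB) to 9 data disks (2448000 MB), a ninefold increase in both capacity and data spindles.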
Now, if we have volumes on this aggregate that we would like to “restripe” across the new disks, we can issue the reallocate command, like this:
reallocate start -f /vol/vol0
and then check the progress with reallocate status:
fas2020-01> reallocate status
Reallocation scans are on
/vol/vol0:
State: Reallocating: Inode 35941, block 0 of 1 (0%)
Schedule: n/a
Interval: 1 day
Optimization: 1
It really is as simple as that.