SAN FRANCISCO (10/10/2003) - Many storage solutions can expand only up to a point, limited by the number of disk enclosures their controllers support. After crossing that threshold, you're back to managing discrete storage systems, although not as many as you would with DAS (direct attached storage). An ideal storage system should accept additional capacity as needs arise, without compromising manageability or performance.
LeftHand Networks Inc.'s DSM (Distributed Storage Matrix) comes very close to that ideal with a clustered storage solution for Linux and Windows servers based on modular storage arrays linked over an IP network. It scales easily and is largely self-administering, which will keep IT happy.
I Want My DSM
Previous DSM products were based on four-drive NSM (Network Storage Module) 100 boxes; DSM 4.2 includes the NSM hardware, the SCC (Storage Control Console) management software, and the NSM OS. Its modular nature means companies can order a DSM with one or more NSMs, based on their needs.
Also, DSM now supports larger eight-drive, 2U-sized NSM 200 boxes. It can provision storage for Windows Server 2003 and offers optional Remote IP Copy software that supports asynchronous replication of volumes to remote sites.
I ran DSM 4.2 on three rack-mountable NSM 200 modules. Each unit mounts eight hot-swappable drives in 160GB or 250GB capacity, includes two GbE (Gigabit Ethernet) ports, and has a redundant, field-replaceable power supply.
I connected the NSMs to my GbE switch, set their IP addresses, and installed the Java-based SCC management software on a Mandrake Linux machine (Windows is also an option).
LeftHand Networks developed its own connectivity protocol, AEBS (Advanced Ethernet Block Storage), while iSCSI was still on the drawing board, so I had to install AEBS drivers on my Windows Server 2003 and Windows 2000 machines. iSCSI support is slated to arrive by year's end, and although they weren't in my test set, AEBS drivers for Linux should be available at the end of October.
Each NSM box combines controller and storage enclosure in a single unit, and NSMs cooperate peer-to-peer to form storage systems with exceptional, self-managed resilience, performance, and scalability. The SCC management software acts as the glue, grouping NSMs into homogeneous storage pools.
Using SCC, I discovered the three NSMs on my LAN automatically and assigned them to a Management Group. Administrators can use Management Groups to separate storage for production and test environments or for different departments, placing capacity where it is needed most.
Within each Management Group, you can further aggregate NSMs into clusters, the storage pools from which volumes are carved. To optimize performance, volumes automatically spread across all NSMs in a cluster according to built-in algorithms. Adding a new NSM to a cluster automatically reallocates existing volumes to take advantage of the additional spindles, without disrupting client access.
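LeftHand's actual placement algorithms are proprietary, but the idea behind spreading and restriping can be sketched with a toy round-robin model (the NSM names and block counts here are illustrative, not drawn from the product):

```python
# Toy model of cluster striping: map a volume's logical blocks onto the
# NSMs in a cluster, round-robin. Not LeftHand's real algorithm.

def block_layout(num_blocks, nsms):
    """Map each logical block of a volume to an NSM, round-robin."""
    return {block: nsms[block % len(nsms)] for block in range(num_blocks)}

# A six-block volume on a one-NSM cluster lives entirely on that unit.
before = block_layout(6, ["nsm-1"])

# Adding two NSMs triggers restriping: blocks are re-spread so that
# reads and writes now hit three times as many spindles.
after = block_layout(6, ["nsm-1", "nsm-2", "nsm-3"])

moved = [b for b in range(6) if before[b] != after[b]]
print(f"blocks relocated during restriping: {moved}")
```

In this sketch, two-thirds of the blocks migrate when the cluster triples in size; the product does the equivalent shuffle in the background while clients keep reading and writing.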
To test this, I started with a cluster containing a single NSM box, created a 3GB volume, assigned that volume to an Authorization Group (similar to a Fibre Channel SAN zone), and granted access by adding one of my servers to the same group.
Moving to that server, I acquired the new volume using AEBS, then formatted it with the standard Windows Disk Manager. Using a script, I wrote some data to the volume and launched Iometer to measure performance. The average response time was 48 milliseconds.
Back in SCC, I added the other two NSMs to my cluster. In the SCC GUI, my volume changed from normal to a "restriping" state as the system automatically re-spread my volume across the NSMs. During restriping, I still had full read-write access to my volume.
When complete, I measured a faster average response time of 36 milliseconds, which I expected as volume access was now spread over 24 physical drives instead of eight. This self-tuning ability is probably the most striking characteristic of DSM and should save on management cost and complexity.
Multilayer resiliency is another interesting aspect of DSM. At the lowest layer, each NSM can set its disk drives to RAID 0 or RAID 10; the latter can preserve the content of an NSM 200 even if three disks fail. Replacing damaged disks automatically triggers a healing process, again without client disruption.
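Why RAID 10 on an eight-drive box can ride out three failed disks is easy to verify with a short simulation. Assuming the eight drives form four mirrored pairs (a typical RAID 10 layout; the NSM 200's actual pairing isn't documented here), data is lost only when both members of the same pair fail:

```python
# Sketch: count which three-disk failure patterns a four-pair RAID 10
# array survives. The pair layout is an assumption for illustration.
from itertools import combinations

PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7)]  # four mirrored pairs

def survives(failed):
    """True if no mirrored pair has lost both of its drives."""
    return all(not (a in failed and b in failed) for a, b in PAIRS)

# Check every possible set of three simultaneous disk failures.
outcomes = [survives(set(c)) for c in combinations(range(8), 3)]
print(f"{sum(outcomes)} of {len(outcomes)} three-disk failure patterns survive")
```

In this model, 32 of the 56 possible three-disk failure patterns are survivable, while two failures landing in the same mirrored pair are always fatal; hence "can preserve" rather than "always preserves."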
Administrators can also enforce volume-level resilience, creating volumes that automatically keep one or two replicas of their content. DSM spreads redundant copies on alternate NSMs so one failed unit will not affect all replicas. Clusters are the third level of resilience: To make them impervious to failures, administrators can assign a hot spare NSM that automatically replaces an ailing module.
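The volume-level rule — no NSM holds every copy of a block — can be sketched the same way (again, a hypothetical placement policy, not LeftHand's):

```python
# Sketch of volume-level replication: each block's replica lands on a
# different NSM than its primary, so losing one unit never destroys
# all copies. Illustrative policy only.

def place_replicas(num_blocks, nsms, copies=2):
    """Assign each block `copies` distinct NSMs, rotating for balance."""
    n = len(nsms)
    return {
        block: [nsms[(block + i) % n] for i in range(copies)]
        for block in range(num_blocks)
    }

layout = place_replicas(6, ["nsm-1", "nsm-2", "nsm-3"])

# Every block's copies land on two different units, so no single NSM
# failure can take out all replicas of any block.
assert all(len(set(units)) == 2 for units in layout.values())
print("every block keeps replicas on two distinct NSMs")
```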
Built-in remote connectivity gives IP-based SANs a significant advantage over FC; from SCC, you can transparently access a Management Group located at the remote end of a WAN.
However, WAN connections are usually less reliable and slower than local links, so activities such as replicating a database at a remote branch require tools that can tolerate connection hiccups. That's where Remote IP Copy comes in: an asynchronous volume-copy tool that reliably completes the replica to a remote location, surviving temporary disconnections.
Remote IP Copy was easy to implement from SCC and worked well on a simulated broken WAN link, resuming the copy automatically when I restored the connection. For companies with distributed processing requirements, Remote IP Copy is the icing on the DSM cake.
The only concern I have about DSM 4.2 is the lack of iSCSI support, which could prevent use of HBAs with TCP offload on CPU-bound servers. Otherwise, DSM 4.2 offers a SAN with good management tools and excellent scalability, resilience, and performance at a good price for companies looking beyond DAS.