New storage project, part two: Setting priorities
Thursday, March 15, 2007 01:59 PM
Scott Lowe sets his priorities in looking for a new storage solution. As he moves ahead on his project, follow along and see how he lays the groundwork for a major IT purchasing decision.
In the first part of this series, I described my environment at a small, liberal arts college and our current storage dilemma. I also explained what I consider important when it comes to making an appropriate storage decision. In part two of this series, I will expand on my priority list and explain what I'm trying to achieve.
Block-level shared storage that easily connects to servers
When it comes to shared storage in an enterprise environment, there are two choices: network-attached storage (NAS) and block-level shared storage in the form of a storage area network (SAN). A NAS device, which generally provides file-level access to its contents, is not what I am seeking for this particular project. Instead, I am looking for a device that provides block-level storage similar to what local disks provide. In fact, to the server, a volume housed on a SAN behaves as if it were locally attached. With the exception of initial volume creation, SAN-based volumes are managed using OS-specific tools. For example, although the volume space is allocated on the SAN using the SAN's management tools, under Windows, formatting the volume and other management tasks are done through Disk Management. Think of the initial SAN-based volume space as an empty disk partition.
While NAS' file-level sharing is useful in some instances, such as when you intend to use a NAS device as a file server, a SAN's block-level storage provides a number of benefits, including:
- Very high availability.
- Ability to be used with a wider variety of applications.
- Higher levels of performance.
- Better overall use of the available storage space.
Three block-level SAN technologies are available on the market: Fibre Channel, iSCSI, and ATA over Ethernet (AoE). Unfortunately, AoE has not yet really taken off in the marketplace, leaving Fibre Channel and iSCSI as the two main contenders for my project.
100 percent redundant
Let’s face the brutal facts: Data storage is a pretty darn important component of a data center of any size. A whole lot of money is spent at the time of server purchase on things like RAID controllers and enough reliable disks to make up, in many cases, a RAID 5 array. After all, if you’re running in a non-redundant scenario, a single failed disk can ruin your whole day, as well as the days of a whole lot of people who can no longer access the affected services.
Why would anyone buy a SAN that did not live up to the reliability numbers provided by local storage?
In my search, I’m looking for a solution that is fully redundant and that can withstand the failure of any single component, including the infrastructure between the SAN and the servers. Any solution I consider must support dual controllers, have redundant power supplies, and run in a reasonable RAID configuration.
As I stated, I also plan to make sure the linking network infrastructure, be it Fibre Channel switches or gigabit Ethernet switches for iSCSI, is bullet-proof. Every server will get multiple connections to the storage, via separate switches, creating a mesh architecture that can withstand the loss of any NIC/Fibre Channel HBA, cable, or switch.
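To make the "loss of any single component" requirement concrete, here is a minimal sketch of how such a cabling plan could be checked on paper. The component names (`hba-a`, `switch-a`, and so on) are hypothetical, and the model only covers the path from one server to the array; it says nothing about the array's internals.

```python
from itertools import chain


def surviving_paths(paths, failed):
    """Return the paths that do not pass through the failed component."""
    return [p for p in paths if failed not in p]


def single_points_of_failure(paths):
    """List every component whose loss alone would cut off all paths."""
    components = set(chain.from_iterable(paths))
    return sorted(c for c in components if not surviving_paths(paths, c))


# Hypothetical server with two adapters, each cabled to its own switch,
# and each switch cabled to a different controller on the array.
redundant = [
    ("hba-a", "cable-a", "switch-a", "controller-a"),
    ("hba-b", "cable-b", "switch-b", "controller-b"),
]

# Same server, but both adapters are plugged into one switch.
shared_switch = [
    ("hba-a", "cable-a", "switch-a", "controller-a"),
    ("hba-b", "cable-b", "switch-a", "controller-b"),
]

print(single_points_of_failure(redundant))      # []
print(single_points_of_failure(shared_switch))  # ['switch-a']
```

The second layout looks redundant at the server, but the shared switch undoes it, which is why each connection needs to go through a separate switch.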
Reasonably priced
Like many of you, I'm operating on a pretty tight IT budget. So, I have to be pretty careful about what I buy. I probably won't be sitting with my CFO discussing the merits of EMC's Symmetrix product line. While that product line is absolutely appropriate in some environments, I just don’t need it. Heck, even something on the lower end of the EMC scale, such as a Clariion AX150 (with redundant controllers, of course), might work for me.
This is also a good time to talk about overall solution performance. I don’t need something that costs a ton of money just to squeeze out a few more IOPS (input/output operations per second). In fact, I’m still undecided between iSCSI and lower-end Fibre Channel. On paper, Fibre Channel, running at 2 Gbps or 4 Gbps, is far superior in theoretical speed to iSCSI, which tops out at 1 Gbps over gigabit Ethernet. In reality, however, the difference is not that extraordinary for the type of traffic I need to support. My solution needs to support:
- About 1,250 Exchange mailboxes. We’ll also be going to Exchange 2007’s Unified Messaging solution this summer, but this just becomes a part of the Exchange information store.
- A few SQL Server 2000 databases, including the ones that support our primary student administrative system, fundraising efforts, and help desk.
- A significant (for us) SharePoint 2007 data store, running on SQL Server 2005.
- Some virtual machines running atop VMware ESX 3, which now supports a number of iSCSI SANs.
- File storage. We will be moving our files away from our existing NAS device.
Just about any reasonable iSCSI or low-end Fibre Channel SAN can support these needs with one controller tied behind its backplane. Any potential bottlenecks are not likely to be the storage connection, even if that connection is running at "only" 1 Gbps.
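For a rough sense of what those link speeds mean, the arithmetic can be sketched as follows. This is back-of-the-envelope only: it converts the nominal rates quoted above into megabytes per second and ignores encoding and protocol overhead, so real throughput will be lower. The 100 GB figure is a made-up example, not a measurement from my environment.

```python
def nominal_mb_per_s(gbps):
    """Convert a nominal link rate in Gbps to megabytes per second.

    Ignores encoding and protocol overhead, so real throughput is lower.
    """
    return gbps * 1000 / 8


def minutes_to_move(gigabytes, gbps):
    """Minutes needed to move the given amount of data at the nominal rate."""
    seconds = gigabytes * 1000 / nominal_mb_per_s(gbps)
    return seconds / 60


# Link rates named in the article; 100 GB is a hypothetical bulk transfer.
links = [
    ("iSCSI over gigabit Ethernet", 1),
    ("Fibre Channel 2 Gbps", 2),
    ("Fibre Channel 4 Gbps", 4),
]

for name, gbps in links:
    print(f"{name}: {nominal_mb_per_s(gbps):.0f} MB/s nominal, "
          f"{minutes_to_move(100, gbps):.1f} min per 100 GB")
```

These are ceilings for large sequential transfers. Mail and database workloads tend to be dominated by small random reads and writes, where the disks themselves usually give out long before the link does.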
Snapshots
Snapshots can be a life-saver. A snapshot is exactly what it sounds like: a point-in-time "picture" of a volume of SAN-housed data. In my previous position, snapshots saved our behinds when someone ran a query against a SQL database and corrupted it. We were able to restore the database from a snapshot that was only 10 minutes old, thus preserving nearly all of the work that had been done in the database that day.
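The idea can be illustrated with a toy model. This is only a sketch of the concept: the `Volume` class and its block contents are hypothetical, and it copies every block on each snapshot, whereas a real array would typically track only the blocks that change.

```python
class Volume:
    """Toy block volume with point-in-time snapshots (illustration only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = {}

    def snapshot(self, name):
        # A full copy keeps the sketch short; real arrays avoid this cost.
        self.snapshots[name] = list(self.blocks)

    def write(self, index, data):
        self.blocks[index] = data

    def restore(self, name):
        # Roll the live volume back to the saved point in time.
        self.blocks = list(self.snapshots[name])


vol = Volume(["orders-v1", "customers-v1"])
vol.snapshot("10-minutes-ago")
vol.write(0, "corrupted")       # the bad query
vol.restore("10-minutes-ago")
print(vol.blocks)               # ['orders-v1', 'customers-v1']
```

Anything written between the snapshot and the restore is lost, which is why the snapshot interval matters as much as having the feature at all.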
While I would love a solution that has great snapshot capability, my main driving factors are product reliability and cost. However, at least some level of snapshotting is essential. Some vendors think this is still an optional feature and charge more for it.
Replication
Like snapshots, this is on my "would be nice if" list for this project, as we are working on a disaster recovery plan. With replication, I would be able to add a second unit in the future, somewhere else on campus or at a facility in another location entirely, and replicate the contents of my primary array to it. If a disaster affected my data center, the data would be protected and fairly up-to-date. Ideally, my disaster recovery plan would include replication plus virtual machines housed on the SAN and running VMware’s VMotion tool, with just a couple of hot-standby servers at the remote location.
Summary
Now you know exactly what I’m looking for and some of the reasoning behind it. In part three of this series, I’ll talk about how I narrowed the playing field and selected the solutions for my short list, and I will also present my short list.








