Monday, 21 January 2013

Storage 101

So what are the differences between the various storage systems available? We hear about all kinds of offerings, but most users' basic requirements are covered by one of three types: DAS, NAS or SAN.

DAS, or Direct Attached Storage, refers to hard drives installed directly in a server or client computer, such as the C: drive on your desktop or laptop. Desktop systems do not normally support RAID, although some manufacturers have begun to offer it of late. Server-based DAS supports RAID configurations through local RAID controllers, which usually offer several RAID levels along with spare drive capability. DAS is usually seen in small shops that do not need or cannot afford larger, centralized shared storage.

NAS, or Network Attached Storage, is a file-based storage system catering to users who only require flat-file, non-database storage. Examples of these file types are Word and Excel files. Typically, these shared folders are hosted on a server and presented through the operating system as mapped network drives on the users' personal systems. File access is typically through the CIFS or NFS protocols, which use the Ethernet network to serve the file shares to client systems and users. Creating a file share on a Windows server is the typical method of sharing this storage with client computers.

SANs, or Storage Area Networks, are medium to large storage array systems that typically hold block-based data for programs like Oracle or Microsoft SQL Server. These systems can also hold file-based structures such as Word and Excel documents but are less efficient for this purpose. Typically these storage arrays are connected to the servers through a high-speed Fibre Channel network, but they are also available with iSCSI (Internet Small Computer System Interface) or SAS (Serial Attached SCSI) connectivity.


| Protocol | Speed |
|---|---|
| Fibre Channel | 2, 4, 8 and 16 Gb/s |
| iSCSI | 1 and 10 Gb/s |
| SAS | 3 and 6 Gb/s |
 
RAID (redundant array of independent disks) based systems offer several levels of disk protection. The primary levels are RAID 0, 1, 5 and 6, and each has its own uses, pros and cons.

| Level | Description | Minimum # of drives** | Space efficiency | Fault tolerance | Array failure rate*** | Read perf. | Write perf. |
|---|---|---|---|---|---|---|---|
| RAID 0 | Block-level striping without parity or mirroring | 2 | 1 | 0 (none) | 1 − (1 − r)ⁿ | nX | nX |
| RAID 1 | Mirroring without parity or striping | 2 | 1/n | n − 1 drives | rⁿ | nX | 1X |
| RAID 5 | Block-level striping with distributed parity | 3 | 1 − 1/n | 1 drive | ½n(n − 1)r² | (n − 1)X* | (n − 1)X* |
| RAID 6 | Block-level striping with double distributed parity | 4 | 1 − 2/n | 2 drives | ⅙n(n − 1)(n − 2)r³ | (n − 2)X* | (n − 2)X* |

* Assumes hardware is fast enough to support this rate
** Assumes a non-degenerate minimum number of drives
*** Assumes independent, identical rate of failure amongst drives
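
The space efficiency and array failure rate columns can be worked through numerically. The following is a minimal sketch rather than a sizing tool: it assumes n is the number of drives in the array and r is the probability that any one drive fails over the period of interest, and the function names are hypothetical. The RAID 5 and RAID 6 expressions are the table's small-r approximations, not exact probabilities.

```python
from math import comb


def space_efficiency(level: int, n: int) -> float:
    """Usable fraction of raw capacity for an n-drive array (table column 4)."""
    if level == 0:
        return 1.0
    if level == 1:
        return 1.0 / n
    if level == 5:
        return 1.0 - 1.0 / n
    if level == 6:
        return 1.0 - 2.0 / n
    raise ValueError(f"unsupported RAID level: {level}")


def array_failure_rate(level: int, n: int, r: float) -> float:
    """Array failure rate from the table, assuming independent drives
    that each fail with the same probability r."""
    if level == 0:
        return 1.0 - (1.0 - r) ** n      # any single failure loses the array
    if level == 1:
        return r ** n                    # every mirror copy must fail
    if level == 5:
        return comb(n, 2) * r ** 2       # n(n-1)/2 * r^2, two drives fail
    if level == 6:
        return comb(n, 3) * r ** 3       # n(n-1)(n-2)/6 * r^3, three drives fail
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    drives, drive_tb, r = 8, 2.0, 0.03   # illustrative values only
    for level in (0, 1, 5, 6):
        usable = space_efficiency(level, drives) * drives * drive_tb
        risk = array_failure_rate(level, drives, r)
        print(f"RAID {level}: {usable:.1f} TB usable, failure rate {risk:.6f}")
```

Running it for the same drive count at each level makes the trade-off in the following paragraphs concrete: capacity falls as protection rises.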

RAID 0 is the fastest level for reading and writing but offers no protection in case of a disk failure. It might be used for non-critical data which, after a disk failure, can be rebuilt with minimum effort and zero impact on the business. The higher the cost of data reconstruction, the more advanced the RAID protection needs to be. Among the striped levels, RAID 6 offers the highest protection: it tolerates the failure of up to 2 drives, and data can no longer be read only once a third drive fails.

Many arrays on the market today also allow for multiple spare disks, which are not part of the data RAID set but are automatically brought into the set when a disk fails. The disk controllers calculate the missing blocks of data from the remaining drives in the set and write them to the new drive.
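
How a controller can calculate missing blocks from the remaining drives is easiest to see with single parity, as used in RAID 5. The sketch below is a simplified illustration, not how any particular controller is implemented: it assumes one parity block that is the XOR of the data blocks, and the function names and sample data are hypothetical. Real arrays work on stripes, rotate parity across drives, and RAID 6 adds a second, different calculation that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks):
    """Recover the one missing block of a stripe.

    XOR-ing all surviving blocks (data plus parity) cancels out everything
    except the contents of the missing block.
    """
    return xor_blocks(surviving_blocks)


if __name__ == "__main__":
    # One stripe spread over three data drives, plus its parity block.
    data = [b"DISK", b"DATA", b"HERE"]
    parity = xor_blocks(data)

    # Simulate losing the second drive and rebuilding onto a spare.
    survivors = [data[0], data[2], parity]
    rebuilt = rebuild_missing(survivors)
    print(rebuilt == data[1])
```

The same calculation is repeated for every stripe during a rebuild, which is why rebuilds on large drives take time and load the remaining disks.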

It is vitally important to manage the array so that failed drives are replaced promptly and the optimal amount of spare capacity is maintained. Drive systems that are completely ignored and unmanaged tend to fail through neglect over time rather than through failed drives or hardware. Like your car, do not ignore the gauges.

The second line of defence is the controllers themselves. RAID systems come as single cards within a server and as external arrays with single-controller, active/passive or active/active designs. A single controller is a single point of failure: when it fails, all paths to storage are lost. This is true of both internal card-based arrays and external arrays.

Active/passive arrays have a single controller active to handle data transmission and, in the event of a controller failure, pass the data I/O and connectivity seamlessly to a “hot” standby controller. Usually no manual intervention is needed to continue computing, but the failed controller needs to be addressed because the array is in a reduced functional state. Failure of the second controller will mean downtime. With this type of RAID system, it is important to ensure a single controller can handle the entire I/O load from the servers.

Active/active arrays allow you to “split” the I/O load between two active controllers. This has the advantage of multi-pathing the I/O through different cards and switches to the storage system, and the array will automatically recover from failures of server card ports, interconnect switches and array controllers. Heavy use of the storage by a single server, such as a database server, will also not affect the other servers, as they can be load balanced over to the less busy controller. Again, it is important to ensure a single controller can handle the entire I/O load from the servers in case of a controller issue.
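
The sizing rule for both designs, that one controller must be able to carry the whole load, comes down to simple arithmetic. The sketch below is only an illustration of that check: the server names, IOPS figures and function name are hypothetical, and a real assessment would use measured peak load and the vendor's rated controller performance.

```python
def survives_controller_failure(server_iops, controller_max_iops):
    """Return True if a single controller can carry the combined I/O load.

    server_iops: mapping of server name to its peak I/O operations per second.
    controller_max_iops: what one controller can sustain on its own.
    """
    return sum(server_iops.values()) <= controller_max_iops


if __name__ == "__main__":
    # Hypothetical peak loads for three servers sharing the array.
    loads = {"db01": 18000, "file01": 4000, "mail01": 6000}

    for limit in (30000, 25000):
        ok = survives_controller_failure(loads, limit)
        print(f"controller rated at {limit} IOPS: "
              f"{'enough headroom' if ok else 'overloaded after failover'}")
```

If the check fails, the array will keep running after a controller failure but at reduced performance until the controller is replaced.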

Do not make the mistake of looking only at the cheapest type of storage for your organization. Understand that your data is what drives your business, and put the proper system in place. Downtime may be measured in lost production and sales because your storage system failed to support the business. Please feel free to contact us with your storage questions and needs.

By Bill Wilson, CD CET MASE
      Enterprise Solutions Engineer at Audcomp
