From: www.itworld.com

Setting up RAID volumes with Solaris Volume Manager

by Sandra Henry-Stocker

May 18, 2005

 

Imagine you've just received shipment of a new Sun server, pre-installed with Solaris 9 and ready for setup as a web server for your small company. Now imagine that you want to take advantage of Solaris Volume Manager (previously called DiskSuite) to configure your new server to be resistant to disk failures. How do you go about configuring the system, and how do you set up its disks to provide fault-resistant data storage?



One choice is to set up your application partition as a RAID 5 volume. Why might you want to use RAID 5? Well, RAID 5 allows your system to continue running when any one of its component disks or partitions fails. It does this by calculating parity values that allow it to reconstruct the content of any single disk that might fail from the content of the other volume components.



RAID 5 works by computing parity values -- usually generated using a series of simple XOR calculations. On a five-disk RAID 5 volume, four pieces of data would be XORed with each other to generate a fifth value to serve as the parity. If any one of the four values or the parity were unavailable due to a component failure, the remaining values could be used to reconstruct the missing one. To understand how this works, let's look at some simple examples written in Perl.



In the following line of Perl, we've XORed four values together and computed a parity value of 2. Note that ^ represents a bitwise XOR. Because XOR is associative, the parentheses are not strictly required; they are there only to make the grouping explicit.

> perl -e '$parity = ((( 6 ^ 7 ) ^ 8 ) ^ 11 ); print "$parity\n"'
2


Now let's pretend that one of the values has become unavailable. Pick any of the four values listed in the original expression and replace it with the computed parity.

> perl -e '$missing = ((( 6 ^ 2 ) ^ 8 ) ^ 11 ); print "$missing\n"'
7


Notice how the value of $missing is the same as the value you dropped (7 in this case). RAID 5 devices use a calculation of this variety (though not necessarily with such small pieces of data) to construct parity values and recover data when a volume component fails.
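The same arithmetic can be tried in plain shell. The following is only a minimal sketch, not how Solaris Volume Manager is implemented; the function name xor_all is made up for this illustration. It XORs any number of values together, so the one function serves both to compute the parity and to recover a lost value.

```sh
#!/bin/sh
# xor_all (hypothetical helper): XOR every argument together and print the result.
xor_all() {
    acc=0
    for v in "$@"; do
        acc=$(( acc ^ v ))
    done
    echo "$acc"
}

# Parity of the four data values used in the Perl example.
parity=$(xor_all 6 7 8 11)
echo "parity:  $parity"

# Pretend the value 8 was lost: XOR the three survivors with the parity.
missing=$(xor_all 6 7 11 "$parity")
echo "missing: $missing"
```

Whichever of the four values you leave out, XORing the remaining three with the parity gives it back.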



RAID 5 is also relatively economical with disk space. Whether you have a three-disk system or a five-disk system, for example, only the equivalent of one disk is used to provide this redundancy (the parity itself is spread across all of the components). The space equivalent of the others serves as usable storage. If you built a RAID 5 volume on a four-disk system, for example, you would have the space of three of your disks for data storage and still be able to lose any one of them without going down (although you would run at reduced efficiency because the missing data would have to be recomputed each time it was needed).
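The space arithmetic is easy to check. Here is a rough sketch, assuming equally sized components and ignoring the small overhead of state database replicas and labels; the function name raid5_usable and the 72 GB figure are illustrative only.

```sh
#!/bin/sh
# raid5_usable DISKS SIZE (hypothetical helper): print the usable capacity
# of a RAID 5 volume built from DISKS equal components of SIZE units each.
raid5_usable() {
    disks=$1
    size=$2
    if [ "$disks" -lt 3 ]; then
        echo "RAID 5 needs at least three components" >&2
        return 1
    fi
    # One component's worth of space goes to parity.
    echo $(( (disks - 1) * size ))
}

for n in 3 4 5; do
    echo "$n disks of 72 GB -> $(raid5_usable "$n" 72) GB usable"
done
```

The more components in the volume, the smaller the fraction of space given over to parity.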



Since our hypothetical system has a nice pre-install of Solaris 9 on disk 1 and an acceptable partition map, we can even set up our RAID volumes without having to reinstall. Let's see how this can be done.



Partitioning



The partitioning on the original disk seems quite acceptable. We have a root partition of more than 10 GB (holding the system software, including /usr), a separate /var of roughly 6 GB to keep log files and the like from overrunning important application space, and a large /space partition into which we can load our applications. We also have 2 GB of swap.



Assuming we want to keep this partitioning, the first thing we need to do is partition the other two drives. Since components in a RAID array should all be the same size, we probably want to mirror the partitioning of the first drive on the other two -- at least for those partitions we plan to put into a RAID volume. Let's say we decide to partition the other two drives identically to the first.
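One common way to copy a partition table from one Solaris disk to another is to pipe the output of prtvtoc into fmthard. The sketch below only prints the commands rather than running them, since fmthard overwrites the target disk's label. It assumes that the three disks have the same geometry and that slice 2 represents the whole disk, as is conventional; the function name print_vtoc_copy is made up for this example.

```sh
#!/bin/sh
# print_vtoc_copy SRC TARGET... (hypothetical helper): print, but do not run,
# the commands that would copy SRC's partition table onto each TARGET disk.
print_vtoc_copy() {
    src=$1
    shift
    for tgt in "$@"; do
        echo "prtvtoc /dev/rdsk/${src}s2 | fmthard -s - /dev/rdsk/${tgt}s2"
    done
}

# Disk names follow the ones used in this column.
print_vtoc_copy c0t0d0 c0t1d0 c0t2d0
```

Review the printed commands, and double-check the target disk names, before running anything as root.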



Once all three disks are partitioned, we will have something that looks like this:

+-------+       +-------+       +-------+
| 0 /   |       | 0     |       | 0     |
+-------+       +-------+       +-------+
| 1 swap|       | 1     |       | 1     |
+-------+       +-------+       +-------+
| 5 var |       | 5     |       | 5     |
+-------+       +-------+       +-------+
| 7     |       | 7     |       | 7     |
| space |       |       |       |       |
+-------+       +-------+       +-------+


Putting /space into a RAID 5 array, therefore, requires only that we claim all three instances of partition 7 and issue the Volume Manager commands to create our RAID volume.



The first Volume Manager command that must be run to create RAID volumes is metadb. This command sets up the database that will maintain information about the RAID volume. This database is sometimes stored in its own partition, but let's say we decide to store it in partition 7. We also decide to store the database in several locations, so we use a command like this:

# metadb -a -f c0t0d0s7 c0t1d0s7 c0t2d0s7


The -a argument says to attach the new database device. The -f (force) argument creates the initial state database.



Since the database that keeps track of the volumes can't be built on top of an existing file system, we've placed it in a set of unused slices.



To confirm the databases that were just set up, we use the command metadb -i.



The -i argument is an inquiry into the state of the replicas.

# metadb -i
        flags           first blk       block count
     a        u         16              8192            /dev/dsk/c0t0d0s7
     a        u         16              8192            /dev/dsk/c0t1d0s7
     a        u         16              8192            /dev/dsk/c0t2d0s7
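Sun's guidance is to keep at least three state database replicas, so a quick count can be a useful sanity check. The following is a small sketch that counts replica lines in metadb -i output read from standard input; the function name count_replicas is hypothetical, and the sample lines are the ones shown above.

```sh
#!/bin/sh
# count_replicas (hypothetical helper): count the replica lines in
# "metadb -i" output supplied on standard input.
count_replicas() {
    awk '/\/dev\/dsk\// { n++ } END { print n + 0 }'
}

# On a live system you would run:  metadb -i | count_replicas
# Here the output shown in this column is fed in instead.
n=$(count_replicas <<'EOF'
        flags           first blk       block count
     a        u         16              8192            /dev/dsk/c0t0d0s7
     a        u         16              8192            /dev/dsk/c0t1d0s7
     a        u         16              8192            /dev/dsk/c0t2d0s7
EOF
)
echo "replicas: $n"
if [ "$n" -lt 3 ]; then
    echo "warning: fewer than three replicas" >&2
fi
```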


We are now ready to group our three partition 7s into a RAID 5 array. We do this with the metainit command:

# metainit d7 -r c0t0d0s7 c0t1d0s7 c0t2d0s7
d7: RAID is setup


The first argument in this command, d7, provides a name for the new volume that we are creating. We will use this name in place of the normal partition names (such as c0t0d0s0) to identify the RAID 5 volume. Its new name will be d7 or, fully expressed, /dev/md/dsk/d7.
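A RAID 5 volume needs at least three components, so a wrapper script might check the component count before calling metainit. The sketch below only builds and prints the command line in the form used above; the function name raid5_metainit_cmd is an invention for this example.

```sh
#!/bin/sh
# raid5_metainit_cmd VOLUME SLICE... (hypothetical helper): print the
# metainit command for a RAID 5 volume, refusing fewer than three slices.
raid5_metainit_cmd() {
    vol=$1
    shift
    if [ "$#" -lt 3 ]; then
        echo "RAID 5 needs at least three components, got $#" >&2
        return 1
    fi
    echo "metainit $vol -r $*"
}

raid5_metainit_cmd d7 c0t0d0s7 c0t1d0s7 c0t2d0s7
```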



To confirm the setup of our RAID 5 volume, we issue the metastat command:

# metastat
d7: RAID
    State: Initializing
    Initialization in progress: 12.2% done		<== not done yet
    Interlace: 32 blocks
    Size: 208618176 blocks (99 GB)
Original device:
    Size: 208627648 blocks (99 GB)
        Device     Start Block  Dbase        State Reloc  Hot Spare
        c0t0d0s7      10506       Yes Initializing   Yes
        c0t1d0s7      10506       Yes Initializing   Yes
        c0t2d0s7      10506       Yes Initializing   Yes

Device Relocation Information:
Device   Reloc  Device ID
c0t0d0   Yes    id1,sd@SSEAGATE_ST373307LSUN72G_3HZ9YZWN000075284SPY
c0t1d0   Yes    id1,sd@SFUJITSU_MAT3073N_SUN72G_000515B0358P____AAN0P540358P
c0t2d0   Yes    id1,sd@SFUJITSU_MAT3073N_SUN72G_000515B036CR____AAN0P54036CR


Note, however, that the configuration of this device is not yet complete. Its state is "Initializing" (see the second line of the output) and the initialization is only 12.2% done (see the third line).
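If you would rather have a script watch for the end of initialization than rerun metastat by hand, the State line can be picked out with awk. This is only a sketch that assumes the output layout shown above; the function name raid_state is hypothetical.

```sh
#!/bin/sh
# raid_state (hypothetical helper): print the text after "State:" from the
# first State line of metastat output supplied on standard input.
raid_state() {
    awk '$1 == "State:" { sub(/^[ \t]*State:[ \t]*/, ""); print; exit }'
}

# On a live system, a polling loop might look like this (not run here):
#   while [ "$(metastat d7 | raid_state)" = "Initializing" ]; do sleep 60; done

# Demonstration using the first lines of the output shown in this column.
printf 'd7: RAID\n    State: Initializing\n' | raid_state
```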



While we're waiting for the new volume to finish initializing, we can edit our /etc/vfstab file so that the RAID volume will be mounted when the system boots.

/dev/md/dsk/d7   /dev/md/rdsk/d7   /opt  ufs     3    yes	-


Notice that this looks the same as the vfstab entry for a file system on a single partition. Only the names of the devices are different. And, of course, before we can mount the new volume, we first need to create a file system on it. For this, we use the newfs command just as we would with any normal partition.

# newfs /dev/md/rdsk/d7
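For reference, the seven vfstab fields are: device to mount, device to fsck, mount point, file system type, fsck pass, mount at boot, and mount options. The sketch below builds such an entry from a volume name and a mount point, reusing the ufs type, fsck pass 3, and "yes" values from the entry shown earlier; the function name md_vfstab_entry is made up for this illustration.

```sh
#!/bin/sh
# md_vfstab_entry VOLUME MOUNTPOINT (hypothetical helper): print a
# tab-separated vfstab line for a UFS file system on a metadevice.
md_vfstab_entry() {
    vol=$1
    mnt=$2
    # Fields: block device, raw device, mount point, FS type,
    #         fsck pass, mount at boot, mount options
    printf '%s\t%s\t%s\tufs\t3\tyes\t-\n' \
        "/dev/md/dsk/$vol" "/dev/md/rdsk/$vol" "$mnt"
}

md_vfstab_entry d7 /opt
```

The block device (/dev/md/dsk) is the one that gets mounted, while the raw device (/dev/md/rdsk) is what fsck and newfs operate on.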


By the way, the "md" component of the volume name stands for "metadisk" -- a term used to differentiate RAID volumes from simple disk partitions.



In the second part of this column, we'll look at a couple of trickier maneuvers with Solaris Volume Manager -- where we want to set up RAID 5 volumes or mirrors on existing file systems.