RAID
1, Software RAID vs. hardware RAID:
a> Software RAID is an abstraction layer in the OS between the physical and logical disks, and this abstraction layer consumes some CPU resources. Hardware RAID does not have this problem;
b> Hardware RAID can support hot-swappable disks, the benefit being that a damaged disk can be replaced online. But a new generation of SATA software RAID can also support hot-swapping (the disk must first be removed from the array on the command line before being pulled out);
c> Hardware RAID usually also comes with a number of extra vendor-provided features;
d> Hardware RAID requires a separate, dedicated hardware device, which means paying more money;
2, Besides software RAID and hardware RAID there exists one more kind: semi-hardware, semi-software RAID. It hands the I/O work over to the CPU to complete, and such controllers generally do not support RAID 5;
3, When building a software RAID, pay attention to the mkfs block size and the stride value (details later);
4, RAID 0:
a> The best performance among the non-nested RAID levels;
b> RAID 0 is not limited to two disks; three or more also work, but the performance gain has a marginal effect (i.e., if a single disk delivers 50MB per second, a two-disk RAID 0 delivers about 96MB per second, and a three-disk RAID 0 perhaps 130MB rather than 150MB per second);
c> Capacity = capacity of the smallest disk x number of disks (some software RAID implementations, including Linux's, can lift this restriction); storage efficiency is 100%;
d> If one disk goes bad, all the data is gone;
e> Note: read performance does not get as big a boost as write performance; the main improvement is in writes;
f> Scenario: non-critical data that is written frequently and requires high-speed write performance;
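A minimal sketch of building such an array with Linux mdadm (device names and chunk size are placeholders):
    mdadm --create /dev/md0 --level=0 --raid-devices=2 --chunk=64 /dev/sdb1 /dev/sdc1
    cat /proc/mdstat    # verify the array is active before putting a filesystem on it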
5, RAID 1:
a> Mirroring. Can consist of two or more disks; data is copied to every disk in the group, so as long as one disk survives, no data is lost. The strongest reliability;
b> The array's capacity is only that of the smallest disk; storage efficiency is (100 / N)%, where N is the number of disks in the array. This is the lowest utilization of all the RAID levels;
c> A minimum of two disks (two is the recommendation);
d> Write performance drops slightly, read performance is significantly enhanced (for software RAID the read boost requires a multi-threaded operating system, such as Linux);
e> The disks in the group should be close in performance (preferably identical models), since only then can the load be balanced;
f> Can be used as a hot-backup scheme (add a disk to the RAID 1, and once it has synced, pull it out; what you pulled is a backup disk). See the sketch after this list;
g> Scenario: high redundancy requirements, with less sensitivity to cost, performance, and capacity;
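A sketch of the hot-backup trick in f> using mdadm, assuming an existing two-disk mirror /dev/md0 and a placeholder third disk /dev/sdd1:
    mdadm /dev/md0 --add /dev/sdd1                        # joins as a spare
    mdadm --grow /dev/md0 --raid-devices=3                # promote it; resync starts
    cat /proc/mdstat                                      # wait for the resync to finish
    mdadm /dev/md0 --fail /dev/sdd1 --remove /dev/sdd1    # detach the synced copy
    mdadm --grow /dev/md0 --raid-devices=2                # back to a two-way mirror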
6, RAID 2:
a> A modified version of RAID 0;
b> Can be composed of a minimum of three disks;
c> Data is encoded (Hamming code) before being distributed across the disks;
d> Adds ECC checksums, so it consumes somewhat more disk space than RAID 0;
7, RAID 3:
a> Also distributes data across different physical disks after encoding it;
b> Uses bit-interleaving (bit-level interleaved storage) to spread the data;
c> The parity values used for recovery are written to a separate, dedicated disk;
d> Because of the bit-level partitioning, reading each bit of data may require activating all of the disks, so it is suited to reading large amounts of data;
8, RAID 4:
a> Mostly the same as RAID 3;
b> The difference is that it uses block-interleaving instead of bit-interleaving to partition data;
c> Whichever disk's data is being read or written, the parity disk must be accessed, so the parity disk's load is significantly higher than the other disks', and it is therefore very prone to failure;
9, RAID 5:
a> Uses disk striping to partition data. The layouts are left-symmetric, left-asymmetric, and right-symmetric; RHEL defaults to left-symmetric, because it spends less time seeking, making it the highest-read-performance choice;
b> Does not store copies of the data but a parity value (using XOR, for example; in practice it actually is XOR). When a data block is lost, the data is recovered from the remaining data blocks of the group together with the parity value (see the sketch after this list);
c> The parity values are distributed across every disk rather than kept on a single disk, which avoids one disk's load being significantly higher or lower than the others';
d> Requires at least three disks to form (the Linux implementation accepts two, but that degenerates into mirroring); both performance and redundancy get some improvement; an eclectic compromise between RAID 0 and RAID 1;
e> Storage utilization is high, at 100 x (1 - 1/N)%, where N is the number of physical disks. The equivalent of one disk's capacity is consumed, but that cost is spread across every disk rather than borne by a single physical disk;
f> Performance can be tuned by adjusting the stripe size (balancing reads against writes);
g> If one disk breaks, the data can still be reconstructed, but because reconstruction (degraded operation) must compute the data from the parity values, performance drops sharply while rebuilding, and the rebuild is a long process;
h> Application scenario: high read-performance requirements, modest write-performance requirements, with cost a consideration;
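A tiny bash illustration of the XOR recovery described in b> (the byte values are arbitrary placeholders):
    D1=$((0xA5)); D2=$((0x3C))    # two data blocks of one parity group
    P=$(( D1 ^ D2 ))              # the parity block is the XOR of the data blocks
    LOST=$(( D1 ^ P ))            # disk 2 died: XOR the survivors to rebuild it
    printf 'parity=0x%X recovered=0x%X\n' "$P" "$LOST"    # recovered == 0x3C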
10, RAID 5 write performance is poor, because each write actually triggers four (or more) I/Os, although in practice caching mechanisms reduce the I/O overhead. The sequence is as follows:
a> The data that is about to be overwritten is read from the disk;
b> The new data is written, but at this point the parity value has not yet been updated;
c> All the other data blocks of the same group are read in, along with the parity block;
d> The new parity value is computed and written back to the parity block;
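In the common read-modify-write shortcut, steps c> and d> collapse into pure XOR arithmetic: new parity = old parity XOR old data XOR new data, which is exactly the two reads plus two writes that make up the four I/Os. A bash sketch with placeholder values:
    OLD_D=$((0xA5)); NEW_D=$((0x5A)); OLD_P=$((0x99))
    NEW_P=$(( OLD_P ^ OLD_D ^ NEW_D ))    # reads: old data + old parity; writes: new data + new parity
    printf 'new parity=0x%X\n' "$NEW_P"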
11, RAID 6:
a> Includes a variety of mechanisms; the partitioning is basically the same as in RAID 5;
b> The difference is that it stores two parity values computed by different algorithms (textbooks describe them as covering different groupings, e.g., blocks 2-4 yield parity value 1 and blocks 1-3 yield parity value 2), so it consumes the capacity of two disks (likewise spread across all the disks);
c> Composed of a minimum of four disks; allows two disks to fail at the same time;
d> A higher degree of redundancy than RAID 5, while write performance is even worse than RAID 5 (in fact, because each write triggers many actual I/Os, it is very bad);
e> Storage utilization is 100 x (1 - 2/N)%, where N is the number of physical disks;
f> Rebuilds faster than RAID 5, performs better than RAID 5 during the rebuild, and runs a lower risk of failure while rebuilding (during a RAID 5 rebuild it is much easier for another disk to break under the load);
g> Scenario: redundancy demands higher than RAID 5, high read-performance demands, low write-performance demands, with cost a consideration;
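A minimal mdadm sketch of the four-disk minimum in c> (device names are placeholders):
    mdadm --create /dev/md0 --level=6 --raid-devices=4 /dev/sd[b-e]1
    # usable capacity = 100 x (1 - 2/4)% = 50%, i.e. two of the four disks
    mdadm --detail /dev/md0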
12, RAID 7:
RAID 7 is not an open RAID standard but the name of a patented hardware product of Storage Computer Corporation.
RAID 7 was developed on the basis of RAID 3 and RAID 4, strengthened to address some of their original limitations.
In addition, it uses a large cache and a dedicated real-time processor for asynchronous array management,
allowing RAID 7 to handle a large volume of I/O requests, so its performance can even surpass that of many products implementing the other standard RAID levels.
But for the same reason, its price is very high.
13, RAID 10:
a> Nested RAID; can be subdivided into 1+0 and 0+1;
b> Requires at least four disks; storage efficiency is (100 / N)%, where N is the number of mirrors;
c> Balances performance and redundancy, at the price of slightly lower utilization;
d> When creating a software RAID 10, the usual approach is to create two RAID 1s first and then create a RAID 0 on top of them (see the sketch below);
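A sketch of that nested construction with mdadm (four placeholder disks):
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdc1
    mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1
    mdadm --create /dev/md10 --level=0 --raid-devices=2 /dev/md1 /dev/md2
Note that Linux md also offers a native --level=10 that builds the equivalent layout in one step.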
14, Common nested RAID levels also include RAID 50 and RAID 53;
15, RAID 2, 3, and 4 see little practical application, because RAID 5 already covers the required functionality. So RAID 2, 3, and 4 are mostly realized only in research settings, while practical deployments center on RAID 5 or RAID 6.
16, RAID DP (dual parity) is NetApp's design; it follows the design concept of RAID 4 (not, as often claimed, RAID 6, because its parity disks are independent, dedicated ones), handling everything with just two dedicated parity disks. Like RAID 6 it tolerates two disks failing at the same time, and the WAFL file system they developed over many years is specifically optimized for RAID DP, making it more efficient than RAID 6;
17, Optimizing RAID parameters (striping parameters):
a> Adjusting the striping parameters has a great impact on RAID 0, RAID 5, and RAID 6;
b> chunk-size. The RAID fills one disk with a chunk-sized amount of data before moving to the next; sometimes this is simply the striping granularity. The chunk size should be an integer multiple of the filesystem block size;
c> Reducing the chunk-size means a file is split into more pieces spread over more physical disks. This improves transfer throughput but may reduce seek efficiency (some hardware waits until a full stripe is filled before actually writing, which offsets some of the seek cost);
d> Increasing the chunk-size has the opposite effects;
e> stride (the step value) is a value for ext2-family filesystems, used for seeking within ext2-style data structures. It is specified like this: mke2fs -E stride=N, where N (the stride) should be set to chunk-size / filesystem block size. (Example: mke2fs -j -b 4096 -E stride=16 /dev/md0)
f> Tuning the two values above improves the RAID's parallelism across multiple disks, so that the performance of the whole RAID scales up as physical disks are added; a worked example follows this list;
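A worked example of the rule in e>, assuming a 64 KiB chunk and 4 KiB filesystem blocks:
    # stride = chunk-size / block size = 64 KiB / 4 KiB = 16
    mdadm --create /dev/md0 --level=5 --raid-devices=3 --chunk=64 /dev/sd[b-d]1
    mke2fs -j -b 4096 -E stride=16 /dev/md0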
18, For more on RAID principles see http://blog.chinaunix.net/space.php?uid=20023853&do=blog&id=1738605;
19, Software RAID information can be viewed in several places on the system:
a> /proc/mdstat. Easy to see the rebuild progress bar here, as well as which disk has gone bad, and so on;
b> mdadm --detail /dev/md0. Shows the details;
c> /sys/block/mdX/md. Easy to see cache_size and other information here (unavailable elsewhere);
20, Major/Minor numbers. On a Linux system the former identifies the device type/driver, and the latter is the ID that distinguishes multiple devices of the same type/driver;
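For example (the md driver's major number is 9):
    ls -l /dev/md0    # prints something like: brw-r----- 1 root disk 9, 0 ... /dev/md0
Here 9 is the major number (the md driver) and 0 is the minor number (the first md device).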
21, RAID technology was originally developed by IBM, not invented by Bell Labs (http://en.wikipedia.org/wiki/RAID#History);
22, RAID supports online capacity expansion;
23, Hard drives basically all leave the factory with bad sectors, but vendors set aside part of the tracks (usually on the outer ring) to substitute for the capacity that bad sectors consume. Enterprise-class disks reserve more capacity than consumer-grade ones, which is one reason they are more expensive and longer-lived;
24, Under RHEL the software RAID configuration file (used to auto-discover the RAID at boot) is /etc/mdadm.conf. It can be generated like this: mdadm --verbose --examine --scan > /etc/mdadm.conf. In this file the administrator also needs to configure how mail notification happens when the RAID breaks;
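A hedged sketch of what the generated file plus the mail setting might look like (the UUID is a placeholder; MAILADDR is mdadm.conf's keyword for failure notification):
    DEVICE partitions
    ARRAY /dev/md0 level=raid5 num-devices=3 UUID=<uuid-of-your-array>
    MAILADDR root@localhost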
25, The Critical Section is where the RAID's metadata is stored. It is metadata; if it is lost, all the data on the RAID is lost. Of the various RAID levels, only RAID 1 writes its Critical Section at the tail of the disk (the others write it at the head), so RAID 1 is better suited to the system disk (the metadata is easy to recover).
26, When a software RAID is reshaped (say, adding a disk, or changing the chunk size or RAID level), backing up the Critical Section is very important if you want to keep the existing data. Internally the reshape process goes like this:
a> Set the RAID to read-only;
b> Back up the Critical Section;
c> Redefine the RAID;
d> On success, delete the original Critical Section and restore write access;
27, It is best to specify the Critical Section backup location manually (ideally somewhere outside the RAID being changed), because if the reshape fails, the Critical Section must be restored by hand. If no location is given manually, the Critical Section backup is kept on a spare disk of the RAID, and if the RAID has no spare disk it is kept in memory (easily lost, dangerous);
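With mdadm this is the --backup-file option; a sketch of growing a three-disk RAID 5 to four disks (paths and devices are placeholders):
    mdadm /dev/md0 --add /dev/sde1
    mdadm --grow /dev/md0 --raid-devices=4 --backup-file=/root/md0-grow.bak
    # keep the backup file off the array itself; if the reshape is interrupted,
    # hand it back via: mdadm --assemble ... --backup-file=/root/md0-grow.bak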
28, If the RAID is changed (e.g., grown), remember to make the corresponding change to the filesystem above it (for example, enlarging it as well);
29, The order in which the system's rc.sysinit script starts services is, by default: udev -> selinux -> time -> hostname -> lvm -> soft RAID. So building LVM on top of soft RAID is not reasonable, unless you manually change the startup script;
30, Expanding a RAID 5's capacity can be done by gradually replacing the RAID 5's disks, one at a time, with larger ones (do not forget to resize the filesystem); see the sketch below;
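A sketch of one round of that disk-by-disk swap (device names are placeholders; repeat for each member, then grow):
    mdadm /dev/md0 --fail /dev/sdb1 --remove /dev/sdb1    # retire one small disk
    mdadm /dev/md0 --add /dev/sdf1                        # add its larger replacement
    cat /proc/mdstat                                      # wait for the resync to finish
    # after every member has been replaced:
    mdadm --grow /dev/md0 --size=max
    resize2fs /dev/md0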
31, Different software RAID groups on the same machine can share spare disks;
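The sharing is handled by mdadm's monitor mode: give the arrays the same spare-group name in /etc/mdadm.conf and run the monitor, which moves a spare to whichever array degrades. A sketch (UUIDs are placeholders):
    ARRAY /dev/md0 UUID=<uuid0> spare-group=shared
    ARRAY /dev/md1 UUID=<uuid1> spare-group=shared
    # then: mdadm --monitor --scan --daemonise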
32, If you want to migrate a software RAID between machines, remember that before migration you should rename the RAID to an md device name (mdX) that does not exist on (does not conflict with) the target machine. When renaming, you can identify the RAID to be renamed by its Minor number;
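A hedged sketch for old 0.90-metadata arrays, assembling the moved disks under a new, non-conflicting name while updating the minor number stored in the superblock (devices are placeholders):
    mdadm --assemble /dev/md2 --super-minor=0 --update=super-minor /dev/sdb1 /dev/sdc1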
33, Each bit of the Bitmap records whether a block of the RAID is actually in sync across the physical disks. The bitmap is written out periodically. Enabling it can greatly speed up the RAID recovery process;
34, The Bitmap can be stored inside the RAID or outside it. If you tell the RAID that the Bitmap lives at an absolute path within the RAID itself, the RAID will deadlock. If it is stored outside the RAID, only extX file systems are supported;
35, The precondition for adding a Bitmap to an active RAID is that its Superblock is good;
36, If the Bitmap is enabled, you can also enable the Write Behind mechanism. It returns success to the application once the write has succeeded on the designated disks and writes the data to the other disks asynchronously, which is particularly useful for RAID 1;
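A sketch of both features in mdadm (device names are placeholders; --write-mostly marks the disk that write-behind applies to):
    mdadm --grow /dev/md0 --bitmap=internal    # add a bitmap to a running array
    mdadm --create /dev/md1 --level=1 --raid-devices=2 \
          --bitmap=internal --write-behind=256 \
          /dev/sdb1 --write-mostly /dev/sdc1   # writes to sdc1 are asynchronous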
37, The RAID mechanism does not actively detect bad blocks. Only when it is forced to read a bad block and the read fails will it try to repair it (find a reserved block as a substitute); if the disk cannot repair it (the reserved blocks have run out), the disk is marked faulty, kicked out of the RAID, and a spare is enabled;
38, Because RAID handles bad blocks passively, the rebuild of a RAID 5 is quite likely to uncover hidden bad blocks, causing the recovery to fail. The larger the disks, the greater the risk;
39, Scheduling a regular bad-block check from crontab is a good idea. Since kernel 2.6.16 a check can be triggered like this:
echo check >> /sys/block/mdX/md/sync_action
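A sketch of the crontab idea (the schedule is arbitrary; md0 is a placeholder):
    # /etc/crontab: scrub the array every Sunday at 03:00
    0 3 * * 0  root  echo check > /sys/block/md0/md/sync_action
    # watch progress in /proc/mdstat; mismatches are counted in /sys/block/md0/md/mismatch_cnt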