Rick Lane sent this to me recently.
I know this has always been a somewhat confusing topic, so when I came
across this on the Progress board I thought it would be good to pass along:
(written by the VP of Technology at Progress)
1. Does the Progress database work with disk arrays in a RAID 5
configuration?
Yes.
Assuming the storage system itself is reliable, there are no inherent
reliability problems that occur purely because the storage system is
configured for RAID 5. You do not run the risk of database corruption just
by using RAID 5.
2. Does the Progress database work /well/ with disk arrays in a RAID 5
configuration?
NO. RAID 5 configurations should /never/ be used for database storage
because you will experience poor performance, possibly /extremely/ poor
performance. This is true for all database systems, not just Progress.
Read on.
2. Why is RAID 5 performance poor for databases?
All the various RAID configurations involve tradeoffs and compromises of a
variety of factors. RAID 5 is optimized for the wrong thing: cost instead
of disk accesses. Disk accesses per second is a precious commodity.
There are several reasons RAID 5 is bad for database performance:
a) At a minimum, all write operations to RAID 5 arrays require writing
the data to one disk and writing an equal amount of "parity" or
error-correction information to a second disk. In many cases, a single
write operation will actually require 4 disk i/o operations -- two reads to
get the previous data and parity information, and two writes to update the
new data and parity information.
b) Write operations always consume /half/, and sometimes more than half,
of the total available disk bandwidth. For example, with a 4 disk array,
only two simultaneous writes are possible since each write operation always
updates two disks.
But when there is already a write taking place, there is a 50%
probability that a second write operation will be delayed because it
requires updating one of the two disks that are already busy from the first
write.
c) Because write operations consume so much of the disk bandwidth, read
performance will also be reduced.
d) The parity information recorded by the array enables recovery of
lost data if one disk should fail. But the recovery process requires
reading all the data from all the remaining disk drives while a failed disk
is being reconstructed. This causes overall performance to be very bad
during the recovery operation. The system may become unusable.
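Points (a) and (d) can be illustrated with a minimal sketch of XOR parity, which is the usual way RAID 5 parity is computed. The block contents and the helper name xor_blocks are made up for illustration; a real array works on much larger blocks and rotates parity across the disks.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data blocks forming one stripe, plus their parity block.
d0, d1, d2 = b"\x11\x22", b"\x0f\xf0", b"\xaa\x55"
parity = xor_blocks(d0, d1, d2)

# Small write to d1: the array reads the old data and the old parity
# (2 reads), then writes the new data and the new parity (2 writes).
new_d1 = b"\x12\x34"
new_parity = xor_blocks(parity, d1, new_d1)
assert new_parity == xor_blocks(d0, new_d1, d2)

# Rebuild after losing d0: every surviving block in the stripe must be read.
rebuilt_d0 = xor_blocks(new_d1, d2, new_parity)
assert rebuilt_d0 == d0
```

The small write touches two disks four times in total, and the rebuild has to read every remaining disk for every stripe, which is why both are expensive.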
These performance disadvantages become worse and worse as the system
workload and disk activity increase.
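As a rough back-of-the-envelope sketch of how the 4-I/O write cost from (a) eats into the array's budget of disk accesses, the function below estimates how many logical operations per second an array can sustain. The disk count, the per-disk figure of 150 I/Os per second, and the name logical_iops are assumptions for illustration only. The model ignores caching, contention between disks, and full-stripe writes.

```python
def logical_iops(physical_iops: float, write_fraction: float, write_penalty: int) -> float:
    """Estimate the logical operations/sec an array can sustain.

    Each logical read is assumed to cost 1 physical I/O; each logical
    write is assumed to cost `write_penalty` physical I/Os.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    cost_per_op = (1.0 - write_fraction) + write_fraction * write_penalty
    return physical_iops / cost_per_op


# Hypothetical array: 4 disks at 150 I/Os per second each.
PHYSICAL_IOPS = 4 * 150

for write_fraction in (0.0, 0.1, 0.3, 0.5):
    raid5 = logical_iops(PHYSICAL_IOPS, write_fraction, write_penalty=4)
    print(f"{write_fraction:.0%} writes: about {raid5:.0f} logical ops/sec")
```

Under these assumptions the sustainable rate drops quickly as the share of writes grows, which matches the point that a busier, more write-heavy database suffers the most.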
3. Doesn't caching solve these problems?
It helps to mitigate the disadvantages of RAID 5, but it does not eliminate
them. And the need for large, reliable cache memories adds greatly to the
overall complexity and expense of the disk array.
a) In order for the disk subsystem to be reliable in the face of power
outages and other failures, the cache memory must be provided with battery
power.
b) In order for the cache memory to be useful, it must be fairly large.
Certainly it will help performance a bit, but disk capacity has
become so great that caches are generally a tiny percentage of the total
storage capacity.
c) Caching reads is fine. But caching writes is another matter.
When writes are cached, the database will believe that data has been
written to disk when it has not been. If any of the cached data are lost due
to some type of failure, for example a power failure, your database will be
smashed and /cannot/ be recovered. Some disk subsystems have
battery-backed-up cache memories. You have to be very, very certain that a
failure will not cause the cache contents to be lost. The number of things
that can go wrong is large. How sure are you that you can fix the problem
before the batteries are used up? And what if you have to disassemble the
computer to fix it?
4. Can I use RAID 5 if I don't update my database?
Yes. But if you are not updating the database, then you are not writing to
the disks and then you don't gain much benefit from RAID 5 anyway.
5. If I shouldn't use RAID 5, why do the storage vendors recommend it?
Because they compete with each other on price and don't care about
performance. They may also tout the labor savings from increased
manageability, without bothering to mention the performance.
6. Do you have any data to support your assertions?
Please see the following:
<http://www.netapp.com/tech_library/3022.html>
This paper by Network Appliance discusses the results obtained by running a
benchmark called PostMark to test various filesystems. While this is not a
database workload, the results are still worth examining. In Section V,
there are several tables comparing the performance of various
configurations. Note that in /every/ test, RAID 5 performs worse than all
the others.
6. What about software RAID?
a) RAID is always software; it is just a question of /where/ the
software is: in the operating system, in the disk controller, or in a
separate disk subsystem. You don't want it in the operating system because
then it uses the same processors that you want to use. RAID in the OS can
use anywhere from 2 to 10 percent of the CPU cycles, depending on processor
speed, bus speed, number of disks, and other factors.
b) Aside from performance, regardless of level, RAID done in the OS
tends to be a bit less reliable than in the disk subsystem. This is because
the RAID software is mixed in with a lot of other stuff instead of isolated.
If something goes wrong with the other stuff, it could affect the disk
buffers.
7. What about vendor x's "magical wonderful mumbo-jumbo RAID"?
Read the descriptions of what it does and how it works carefully. If it
turns out to be RAID 5 dressed up as something else, beware. Remember that
YOU will suffer the consequences of choosing RAID 5, not the storage vendor.
It must be your decision, not the vendor's.
--
|regards,
|gus
|-------------------------------------------------------------------
| Gus Bjorklund, Wizard and Vice President, Technology
| Progress Software, Bedford MA.
Rick Lane
Intelligent Systems Integration, Inc.
600 Weber Drive
Wadsworth, OH 44281
PH: 330-335-5291
FX: 330-335-7275
www.intelligentsi.com
"Helping Business Make Intelligent Use of Technology"
-----Original Message-----
From: Jasper Recto [mailto:jrecto@...]
Sent: Thursday, October 24, 2002 3:59 PM
To: Vantage Groups (E-mail)
Subject: [Vantage] RAID Level?
Hello,
Does anybody know which RAID level Progress recommends for configuring
our server?
Jasper Recto
Director of Information Technology
LOC Performance
734.453.2300