I would like help identifying whether I have a disk performance issue or not. First the background: we have a J2EE application using the MS SQL 2005 JDBC driver and Hibernate on 4 application servers, and an active-passive SQL Server 2005 cluster. All of the servers reside in the same physical rack and on the same switch.
Our application is typically bound by CPU on the app servers or by throughput from the database. Several months ago we were on SQL 2000 and would often max out the CPU on the database server before anything else, but the database could usually keep up. Now we have 2005 on a much more powerful machine and more app servers, yet we seem to be running up against a throughput problem with the database.
The issue is not CPU. The total CPU average, as monitored in perfmon at 30-second intervals, stays consistently below 40%. What concerns me is the Average Disk Read Queue Length, particularly for our E: drive. On this machine, the transaction log, the system and temp dbs, and our application's database are all on separate EMC partitions, connected via fibre channel. The E drive houses the app data and is a 15-way meta device (fifteen 10GB logical devices striped at 960k for a 150GB device) in a RAID-S configuration.
I have read various articles online describing how to interpret the Average Disk Read Queue Length performance counter with regard to SQL Server. Some say it should not exceed the number of physical spindles * 2; if the fifteen members of our meta device map to fifteen physical spindles, that works out to a ceiling of about 30. We are seeing values of 32 consistently, averaging over 60 during peak processing hours, and spiking to well over 100, on a 1.0 scale with a 3-second sample interval.
Our application servers seem to be waiting on their database calls (a lot of inserts, plus frequent selects with small result sets), and they show no I/O issues with their local storage, memory, or network interfaces. The database server, again, has no CPU, network, or memory issues. I should add that the Average Disk Write Queue Length counter shows no problems; it is always below 1 (on a 1.0 scale). The EMC Celerra array has both read and write caching. The indexes of the application database are rebuilt weekly and defragmented every day, with stats rebuilt after the defrag.
So how can I further determine where my performance problem lies? All thoughts appreciated! Thanks!
-tuj
First of all, you are always waiting on something: the CPU, the memory, or the hard drives. The question is whether the performance is unacceptable, and how much it will cost to fix.
Generally, faster processors will help a CPU bottleneck. More memory helps both the application and the disk cache, cutting down trips to the hard drives, which are generally the slow dog in the race.
From what you describe, your bottleneck must be the disk access.
There are many schools of thought on how to configure RAID properly. I am concerned by your comment: "The E drive houses the app data and is a 15-way meta device (fifteen 10GB logical devices striped at 960k for a 150GB device) in a RAID-S configuration."
Are you saying you have split physical drives into logical drives and then built RAID arrays out of those? You never, ever want to do that. Also, a stripe size of 960K is HUGE. That means every single read/write has to read/write 960K.
Intel has some basic guidelines on configuring RAID here http://www.intel.com/support/motherboards/server/srcs16/sb/cs-020782.htm.
As a general rule: create a RAID-5 array using all the physical drives in a unit. If you have 15 drives in a unit, make a 15 drive RAID-5 array. THEN, you split that into logical drives using the OS. Otherwise, you are killing the performance of the RAID-5.
I am sure we will get conflicting information from other people.
|||Please understand, we are using EMC storage which uses its own proprietary version of RAID-5 called RAID-S. This is an enterprise-class Symmetrix system. The storage array is connected to the SQL Server via fibre channel and lives in the same rack.
Back to my original question: how can I quantify whether I have a disk performance issue? My storage folks tell me that the storage is performing fine, at least from their perspective. How can I tell if I am maxing out my I/O capacity?
Disk queue length is one indicator of a disk I/O issue.
For SQL Server, you can use fn_virtualfilestats to sample the I/O stalls periodically (say, every 3 minutes) for the database or file you are concerned about. Check the difference in IoStallMS between two samples and calculate the I/O stalls per minute (see BOL for details).
If the IoStallMS per minute is high, your DB is suffering an I/O issue.
One of my clients once had 50 sec/min of I/O stalls during certain periods. That convinced the SAN team to reconfigure the storage.
|||Ok, so I set up a process to use the fn_virtualfilestats function to sample the IoStallMS value every second:
DECLARE @myDBID int
SET @myDBID = db_id()
SELECT * FROM ::fn_virtualfilestats (@myDBID, -1)
go
On my data file, I see IoStallMS values like '1754339800'. I stored the values in a table (a sketch of the setup is below), then found the max IoStallMS value for each minute with the query that follows.
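Roughly, the sampling table and the insert the job runs each second look like this (the column types here are approximate; the names match what the later queries reference):
CREATE TABLE io_stall (
    ts datetime NOT NULL,      -- time the sample was taken
    file_id smallint NOT NULL, -- FileId from fn_virtualfilestats
    iostallms bigint NOT NULL  -- cumulative IoStallMS at that point
)

DECLARE @myDBID int
SET @myDBID = db_id()
-- one row per file, appended every second by the sampling job
INSERT INTO io_stall (ts, file_id, iostallms)
SELECT getdate(), FileId, IoStallMS
FROM ::fn_virtualfilestats (@myDBID, -1)
The grouping query below then pulls the max cumulative IoStallMS per minute out of that table: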
select datepart(hh, ts) as hr, datepart(n, ts) as min,
min(ts) as ts, max(iostallms) as stalls, file_id
from io_stall i1
group by datepart(hh, ts), datepart(n, ts), file_id
I then joined the grouping to itself, shifting the join minute by 1. This allowed me to calculate the IoStallMS per minute.
select a.hr, a.min, b.min, a.stalls, b.stalls, b.stalls - a.stalls as diff
from
(select datepart(hh, ts) as hr, datepart(n, ts) as min,
min(ts) as ts, max(iostallms) as stalls, file_id
from io_stall i1
group by datepart(hh, ts), datepart(n, ts), file_id) a
inner join
(select datepart(hh, ts) as hr, datepart(n, ts) as min,
min(ts) as ts, max(iostallms) as stalls, file_id
from io_stall i1
group by datepart(hh, ts), datepart(n, ts), file_id) b
on a.hr = b.hr and a.min + 1 = b.min
and a.file_id = b.file_id
where a.file_id = 1
order by a.hr, a.min
So... my results look like this for my data file:
hr  min  min2  stalls_a  stalls_b  diff (b - a)
14 20 21 1677684043 1679200642 1516599
14 21 22 1679200642 1679671839 471197
14 22 23 1679671839 1680146618 474779
14 23 24 1680146618 1681345822 1199204
14 24 25 1681345822 1681816656 470834
14 25 26 1681816656 1682260233 443577
14 26 27 1682260233 1682689827 429594
14 27 28 1682689827 1684180645 1490818
These seem to be exceptionally high, no?
On my log file for this database, I get these results:
hr  min  min2  stalls_a  stalls_b  diff (b - a)
15 11 12 17382490 17473367 90877
15 12 13 17473367 17473370 3
15 13 14 17473370 17473372 2
15 14 15 17473372 17473374 2
15 15 16 17473374 17473375 1
15 16 17 17473375 17473377 2
15 17 18 17473377 17540468 67091
15 18 19 17540468 17543980 3512
15 19 20 17543980 17544013 33
These in general seem much more reasonable, but it appears to still be spiking.
1. Have I calculated the IO stalls per minute correctly?
2. If so, it looks like I clearly have an I/O problem to my SAN, right?
|||Run this command:
DECLARE @myDBID int
SET @myDBID = db_id()
SELECT ReadBytesPerMS = (BytesRead/IoStallReadMS),
WriteBytesPerMS = (BytesWritten/IoStallWriteMS),
DBFileName = physical_name
FROM ::fn_virtualfilestats (@myDBID, null) vs
JOIN sys.database_files df ON vs.FileID = df.file_id
My results are all >9000, but my situation might be different than yours.
|||My results (ReadBytesPerMS, WriteBytesPerMS, DBFileName):
4169 2860 E:\MSSQL2K5\Data\TW.mdf
18802 5731 F:\MSSQL2K5\Data\TW_log.ldf
29696 7235 G:\MSSQL2K5\Data\TW_log2.ldf
So it looks like my reads and writes to E: are much slower?
|||Yes, it does look like your reads and writes are much slower on E. The writes in general are not real fast. Do you have write caching turned on? How much cache is on the array controller? Reads are generally twice as fast as writes, but in your case they are three times as fast. That would indicate the controller is not performing well on writes.
Is the E drive a logical partition on the same physical array? Can you move the log file to a local drive, or another drive, and see if performance increases?
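If you want to try that, something along these lines should work on 2005; substitute your actual database name, logical log file name, and target path (the ones below are just placeholders):
-- repoint the log file (database name, logical name, and path are examples only)
ALTER DATABASE TW MODIFY FILE (NAME = 'TW_log', FILENAME = 'D:\LocalData\TW_log.ldf')
-- the new path takes effect once the database comes back online
ALTER DATABASE TW SET OFFLINE WITH ROLLBACK IMMEDIATE
-- copy the physical .ldf file to the new location while the database is offline, then:
ALTER DATABASE TW SET ONLINE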
|||Yes, our EMC is configured to cache both reads and writes. Interestingly, the Average Disk Write Queue Length for the E drive is almost always < 1, while the Average Disk Read Queue Length is often over 100. That makes me think the writes are being cached and written fine, although the numbers from SQL Server suggest otherwise.
The F and G partitions are separate physical SAN shares from the E drive SAN share. So they should all be physically independent, although they do all live in the same array and utilize the same FC connection to the server. We moved the tempdb and transaction log to separate SAN shares a while back to try to reduce I/O against E.
We see no average disk queue length problems, for reads or writes, on any drive besides E. We cannot put the data or transaction log on a local drive because the system is a cluster.
|||
The IoStall value is pretty high. While running the job, you can also trace the following counters under Physical Disk in Performance Monitor:
Avg. Disk sec/Read, Avg. Disk sec/Write, Disk Writes/sec, Disk Reads/sec
Based on the RAID configuration (RAID 5, RAID 10, etc.) and the hard drive (spindle) specs, you can estimate how many spindles you need to eliminate the bottleneck.
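As a rough illustration (the numbers here are only examples, not your measurements): if Performance Monitor showed 1,200 Disk Reads/sec and 300 Disk Writes/sec at peak, a 15K FC spindle is usually good for roughly 150-180 random IOPS, and a parity RAID like RAID 5 costs about 4 back-end I/Os per front-end write, then the back-end load is about 1,200 + (300 * 4) = 2,400 IOPS, or 2,400 / 150 = 16 spindles just to keep up, before leaving any headroom.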