May 07, 2001, 3:34 PM —
Q: I have a performance problem. What do you need to know to help me?
A: I see a lot of questions
from users or administrators who have decided that they have a
performance problem but don't know where to start or what
information to provide when they ask for help. I have seen e-mail
from people who just say "my system is slow" and give no additional
information at all. I have also seen 10-megabyte e-mail messages
with 20 attachments containing days of iostat reports, but with no
indication of what application the machine is supposed to be
running. In this section, I'll lead you through the initial
questions that need to be answered. This may be enough to get you on
the right track to solving the problem yourself, and it will make it
easier to ask for help effectively.
1. What is the business function of the system?
What is the system used for? What is its primary application? It could
be a file server, database server, end-user CAD workstation, Internet
server, or embedded control system.
2. Who and where are the users?
How many users are there? How do they use the system, and what kind of
work patterns do they have? They might be a classroom full of students,
people browsing the Internet from home, data entry clerks, development
engineers, real-time data feeds, or batch jobs. Are the end users directly
connected? From what kind of device?
3. Who says there is a performance problem? What is slow?
Are the end users complaining, or do you have some objective business
measure like batch jobs not completing quickly enough? If there are no
complaints, then you should be measuring business-oriented throughput
and response times, together with system utilization levels. Don't
waste time worrying about obscure kernel measurements. If you have
established a baseline of utilization, business throughput, and
response times, then it is obvious when there is a problem because the
response time will have increased, and that is what drives user
perceptions of performance. It is useful to have real measures of
response times or a way to derive them. You may get only subjective
measures -- "it feels sluggish today" -- or have to use a stopwatch to
time things.
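If you have nothing better than a stopwatch, a few lines of script can play the same role. The sketch below is only an illustration, not a tool from this column: the function names, the five repeats, and the 1.5x tolerance are all assumptions chosen for the example. It times an operation several times and compares the median response time against a baseline you recorded when the system was behaving well.

```python
import statistics
import time


def time_operation(operation, repeats=5):
    """Run `operation` several times; return elapsed wall-clock seconds per run."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples


def compare_to_baseline(samples, baseline_seconds, tolerance=1.5):
    """Compare the median sample against a known-good baseline.

    `tolerance` is a hypothetical threshold: a median more than
    `tolerance` times the baseline is flagged as slower.
    """
    median = statistics.median(samples)
    ratio = median / baseline_seconds
    return {"median": median, "ratio": ratio, "slower": ratio > tolerance}


if __name__ == "__main__":
    # Stand-in for a real user transaction (assumption for the example).
    def sample_transaction():
        sum(range(100_000))

    samples = time_operation(sample_transaction)
    # The baseline value here is made up; use one you measured yourself.
    report = compare_to_baseline(samples, baseline_seconds=0.01)
    print(report)
```

The median is used rather than the mean so that one unusually slow run does not dominate the result. Replace the stand-in transaction with whatever operation your users actually complain about.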
4. What is the system configuration?
How many machines are involved; what is the CPU, memory, network, and
disk setup; what version of Solaris is running; what relevant patches