Introduction to Memory Bottlenecks
If your Windows Server 2003 machine is running slowly, the first place to look for a bottleneck
is memory. Put another way, machines with plenty of RAM rarely give problems.
A bonus of plenty of memory is that, to a degree, abundant RAM compensates for
strain on other resources.
On old servers, lack of memory would give you the full sensory experience: you could hear the disk paging, see the light flashing, and 'Mad' Mick swears you could smell the disk thrashing. Even with these
sensory clues, it is still worthwhile monitoring memory with Performance Logs. Please also remember the big picture: once you have had a quick look at memory, check the processor and
disk counters as well.
The servers most likely to suffer from memory shortage are pure database servers,
for example Oracle or SQL Server. Email servers also require plenty of RAM.
Pure domain controllers are less likely to experience memory problems.
Memory Topics
The more available memory, the faster the server can respond. When I check a server's memory with Performance Monitor, the first
counter that I add to the log is Memory\Available Bytes. As long as the trace
indicates more than 10 MB of free memory, I conclude that the server has
sufficient RAM.
Diagram 1 shows a white descending line, and the legend confirms
that Available Bytes is down to 3 MB. Clearly this machine needs more memory.
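The 10 MB rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function names, the sample figures and the "more than half of the samples" rule are my assumptions, not something Performance Monitor provides.

```python
# Threshold from the text: more than 10 MB free is taken as sufficient RAM.
MIN_AVAILABLE_BYTES = 10 * 1024 * 1024


def low_memory_samples(samples, threshold=MIN_AVAILABLE_BYTES):
    """Return the Available Bytes samples that fall below the threshold."""
    return [s for s in samples if s < threshold]


def needs_more_ram(samples, threshold=MIN_AVAILABLE_BYTES):
    """True if more than half of the samples are below the threshold.

    The 'more than half' rule is an assumption, chosen so that a single
    dip in the trace does not decide the verdict on its own.
    """
    if not samples:
        return False
    return len(low_memory_samples(samples, threshold)) > len(samples) / 2


# Hypothetical trace (in bytes) descending to about 3 MB, like Diagram 1.
trace = [40_000_000, 22_000_000, 9_000_000, 5_000_000, 3_000_000]
print(needs_more_ram(trace))
```

In practice the samples would come from a saved Performance Log rather than a hand-typed list.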
Suppose a spreadsheet wants to start a new thread, or a database needs to
sort data: what each needs is memory. The operating system
provides this memory at least 100 times faster from RAM than it could from
a disk-based pagefile. This is why a
large pool of free memory is so important to an application server.
Take care to distinguish between these two paging counters: 1) Pages /sec
(hard page faults) and 2) Page Faults /sec (hard plus soft page faults).
The Page Faults /sec counter is likely to be at least twice the value of Pages /sec.
There are two problems with monitoring in general: firstly, no counter should
be taken in isolation; secondly, spikes should be ignored, or at least played
down. The less paging, the better your server's performance. Most authorities
agree that Memory: Pages / sec is a key memory counter. This counter
measures 'hard' page faults; in other words, the page is nowhere in memory,
so the VMM (Virtual Memory Manager) has to fetch the data from the pagefile
on the disk. In computing terms that takes an age.
I am reluctant to disagree with other authorities, but from my experience
I would put the threshold as high as 20 pages /sec before blaming paging as
the bottleneck. Moreover, I would not trust Pages /sec as an indicator of a
bottleneck without confirmation from low Available Bytes (see above).
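Here is a minimal Python sketch of that reasoning: blame memory only when Pages /sec is high and Available Bytes is low at the same time. The use of the median to play down spikes, and all the names and sample values, are my own assumptions for illustration.

```python
from statistics import median

PAGES_PER_SEC_LIMIT = 20                 # threshold suggested in the text
MIN_AVAILABLE_BYTES = 10 * 1024 * 1024   # 10 MB, as used for Available Bytes


def memory_bottleneck(pages_per_sec, available_bytes):
    """Report a bottleneck only if both counters agree.

    The median is used (an assumption) so that a brief spike in either
    trace does not trigger a false alarm.
    """
    if not pages_per_sec or not available_bytes:
        return False
    heavy_paging = median(pages_per_sec) > PAGES_PER_SEC_LIMIT
    short_of_ram = median(available_bytes) < MIN_AVAILABLE_BYTES
    return heavy_paging and short_of_ram


# Hypothetical samples: sustained paging with only 3 to 5 MB free.
paging = [35, 42, 300, 38, 40]
free = [4_000_000, 3_000_000, 5_000_000, 3_000_000, 4_000_000]
print(memory_bottleneck(paging, free))
```

A single spike of 250 pages /sec in an otherwise quiet trace would not be reported, which matches the advice to play down spikes.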
In truth, if you put five experts in the same room, they could all spot a
memory bottleneck, but when they wrote up their notes they would use
different time slices and different thresholds. Consequently, it would seem that there
was a conflict where none actually existed.
2) Memory: Page Faults / Sec
Page Faults /sec is the sum of hard and soft page faults. Soft page faults occur
where the data is found elsewhere in RAM. For example, Word has opened
the spellchecker, and now Outlook wishes to use it; there is no need for
another call to the disk as the spellchecker is already in memory. Hard
page faults are generated whenever the VMM has to fetch data from the pagefile
on the disk.
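Since Page Faults /sec is the sum of the two kinds, the soft fault rate can be estimated by subtraction. The Python sketch below follows this article's simplification of using Pages /sec as the hard fault rate. Strictly, that counter counts pages rather than faults, so it can occasionally exceed Page Faults /sec, which is why the results are clamped. The function names and figures are hypothetical.

```python
def soft_faults_per_sec(page_faults_per_sec, hard_faults_per_sec):
    """Estimate soft faults as total faults minus hard faults.

    Clamped at zero because the hard-fault figure is only an approximation
    and may exceed the total in a given sample.
    """
    return max(page_faults_per_sec - hard_faults_per_sec, 0.0)


def hard_fault_share(page_faults_per_sec, hard_faults_per_sec):
    """Fraction of all page faults that had to go to disk (0.0 to 1.0)."""
    if page_faults_per_sec <= 0:
        return 0.0
    return min(hard_faults_per_sec / page_faults_per_sec, 1.0)


# Hypothetical readings: 120 faults/sec in total, 15 of them hard.
print(soft_faults_per_sec(120, 15))
print(hard_fault_share(120, 15))
```

A low hard fault share suggests that most faults are being satisfied from RAM, which is the cheap case.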
3) Page File: % Usage
While the pagefile is less likely to be a bottleneck, it is easy to check and
satisfying to fix. You could also confirm that it is on the most suitable
disk and, if possible, split the pagefile over two disks and thus improve access
times. (Note that the object here is Page File, not Memory.)
Because the changes are so gradual, you are better off using this Page File: % Usage counter
as an alert rather than a log. I suggest setting the alert to trigger
when the value goes over 70.
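To show what "alert rather than log" means, here is a small Python sketch that reports only the moments when % Usage rises over the limit, instead of recording every reading. It is an illustration of the idea, not how the Performance Logs and Alerts snap-in is implemented, and the function name and readings are made up.

```python
PAGEFILE_USAGE_LIMIT = 70  # percent, the limit suggested in the text


def alert_points(usage_samples, limit=PAGEFILE_USAGE_LIMIT):
    """Return the indexes at which usage rises from at-or-below the limit to above it.

    Firing only on the crossing means a slow, steady climb produces one
    alert rather than a stream of them.
    """
    alerts = []
    above = False
    for index, usage in enumerate(usage_samples):
        if usage > limit and not above:
            alerts.append(index)
        above = usage > limit
    return alerts


# Hypothetical % Usage readings taken at regular intervals.
readings = [55, 62, 68, 71, 74, 69, 72]
print(alert_points(readings))
```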
Creating a Memory Bottleneck
If you really want to see a memory bottleneck that you can measure with
Performance Monitor, then add the /MAXMEM switch, which limits the amount of
RAM (in MB) that Windows will use, to your server's boot.ini. For example:
multi(0)disk(0)rdisk(0)partition(1)\Windows="Windows Server 2003" /MAXMEM=256
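If you edit boot.ini on several test machines, a quick check that the switch is really present can save a wasted reboot. The Python sketch below simply reads the /MAXMEM value out of an entry like the one above. The function name is hypothetical and it only inspects a line of text; it does not touch the real file.

```python
import re


def maxmem_mb(boot_entry):
    """Return the /MAXMEM value in MB from a boot.ini entry, or None if absent."""
    match = re.search(r"/MAXMEM=(\d+)", boot_entry, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


# The example entry from the text, as a raw string so the backslash is kept.
entry = r'multi(0)disk(0)rdisk(0)partition(1)\Windows="Windows Server 2003" /MAXMEM=256'
print(maxmem_mb(entry))
```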