Apache capacity planning

LogFormat "%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" proxy
SetEnvIf X-Forwarded-For "^.*\..*\..*\..*" forwarded
CustomLog "logs/access_log" combined env=!forwarded
CustomLog "logs/access_log" proxy env=forwarded

The Apache MPM

http://articles.slicehost.com/2010/5/19/configuring-the-apache-mpm-on-centos

http://www.howtoforge.com/configuring_apache_for_maximum_performance

Part of the Apache web server installation is the “MPM”, which stands for “Multi-Processing Module”.
The MPM determines the mechanism Apache uses to handle multiple connections. Now that we have an idea of where Apache keeps its configs, we’ll cover in detail
how the main MPMs are configured and how you might optimize their settings for your environment.

The difference

The first thing to know is that there are several MPMs that Apache can use, but the main MPMs are “worker” and “prefork”.

The worker MPM primarily handles connections by creating new threads within a child process, while the prefork MPM uses a separate single-threaded child process for each connection.
The worker MPM is considered more efficient, but some modules aren’t stable when running under the worker MPM.
The yum Apache package defaults to the prefork MPM for the best compatibility with modules.
Most users won’t notice a difference in performance between the MPMs, but it’s good to know they’re there.
If you find your site is having trouble scaling, for example, you might want to switch to the worker MPM even though it isn’t recommended by a module you’re using.
PHP, for instance, will switch Apache to the prefork MPM when aptitude installs it, but newer versions of PHP can be compiled with worker MPM support.
For a change like that, you’ll want to consult your module’s documentation to see what it may have to say about Apache MPMs.

The prefork MPM

Default:

StartServers 8
MinSpareServers 5
MaxSpareServers 20
ServerLimit 256
MaxClients 256
MaxRequestsPerChild 4000

StartServers

This is the number of child server processes created at startup, ready to handle incoming connections.
If you’re expecting heavy traffic you might want to increase this number so the server is ready to handle a lot of connections right when it’s started.

MinSpareServers

The minimum number of child server processes to keep in reserve.

MaxSpareServers

Maximum number of child server processes that will be held in reserve. Any more than the maximum will be killed.

ServerLimit

The ServerLimit directive sets an absolute limit on the MaxClients directive. The reasons for this aren’t interesting enough to go into here, so the main thing to know about this directive is that
it should usually be set to the same value as MaxClients, and probably shouldn’t be set at all if you set MaxClients lower than 256.
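
As a sketch of the rule above, assuming the prefork MPM and a hypothetical target of 400 simultaneous requests (a number picked only for illustration), raising MaxClients past 256 means raising ServerLimit along with it:

```apache
<IfModule prefork.c>
    # ServerLimit caps MaxClients, so it is listed first and set to the same value.
    # 400 is an illustrative figure, not a recommendation.
    ServerLimit 400
    MaxClients  400
</IfModule>
```

Apache ignores changes to ServerLimit on a graceful restart, so a change like this needs a full stop and start to take effect.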

MaxClients

Sets the maximum number of simultaneous requests that Apache will handle. Anything over this number will be queued until a process is free to handle the request.

MaxClients is not the same as the maximum number of visitors you can have. It is the maximum number of requests that can be fielded at the same time.

Remember the KeepAliveTimeout? This was set low so the connections used by idle web clients can be recycled more quickly to handle new web clients.
Each active connection uses memory and counts toward the MaxClients total. If you hit the number of connections in the MaxClients setting,
web clients will be stuck waiting for a connection slot to free up.

The trick with MaxClients is that you want the number to be high enough that visitors don’t have to wait before connecting to your site,
but not so high that Apache needs to grab more memory than is available on your server. If you go over the available memory for your server
it will start dipping into swap, which is slow and ugly, and trust me, you don’t want to do that.
With the prefork MPM, every connection being handled occupies its own process. That means MaxClients sets the maximum number of processes Apache
will create to handle incoming clients. Memory can definitely be a limiting factor here.
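
Put together, the two settings might look something like this on a small server. This is only a sketch: the timeout of 2 seconds and the limit of 50 processes are assumed values for illustration, not figures from any particular setup.

```apache
# Illustrative values; tune them against your own memory measurements.
KeepAlive On
# An idle keep-alive connection gives its slot back after 2 seconds.
KeepAliveTimeout 2

<IfModule prefork.c>
    # Each slot is one process, so this caps memory use as well as concurrency.
    MaxClients 50
</IfModule>
```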

MaxRequestsPerChild

Sets how many requests a child process will handle before terminating. A value of zero means the process never expires.

Why change this if the Max numbers are set as shown above? Well, it can help in managing your server’s memory usage.

If you set a non-zero value, you give each child a finite number of requests to serve before it dies.

This will, in effect, reduce the number of processes in use when the server is not busy, thus freeing memory.

Freeing it for what though? If other software needed memory then it would also need it when the server is under load. It is unlikely you will have anything that requires memory only when the server is quiet.
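
For reference, the two choices look like this in the config. It is a sketch that reuses the 4000 from the prefork defaults listed earlier. A finite limit is also a common safeguard against a module that slowly leaks memory, since each child is eventually replaced by a fresh one.

```apache
<IfModule prefork.c>
    # Recycle each child after it has served 4000 requests.
    MaxRequestsPerChild 4000
    # Alternative: 0 lets a child serve requests indefinitely.
    # MaxRequestsPerChild 0
</IfModule>
```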

The worker MPM

Defaults:

StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
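
The same defaults again, annotated to show how the numbers relate to each other under the worker MPM. The IfModule wrapper is an assumption about how the block appears in your httpd.conf.

```apache
<IfModule worker.c>
    # Child processes started at boot: 2 x 25 threads = 50 threads ready.
    StartServers          2
    # Total threads across all children, i.e. the simultaneous-request ceiling.
    MaxClients          150
    # Idle-thread floor and ceiling, counted across the whole server.
    MinSpareThreads      25
    MaxSpareThreads      75
    # Threads in each child: 150 / 25 = at most 6 child processes.
    ThreadsPerChild      25
    # 0 means children are never recycled by request count.
    MaxRequestsPerChild   0
</IfModule>
```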

Configuring Apache for Maximum Performance

2.1 Load only the required modules:

Run Apache with only the required modules. This reduces the memory footprint and hence improves server performance. Statically compiling modules will save RAM that’s used for supporting dynamically loaded modules,
but one has to recompile Apache whenever a module is to be added or dropped.
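
With dynamically loaded modules, trimming usually means commenting out LoadModule lines. A minimal sketch follows; which modules are actually unneeded is an assumption made for a hypothetical static site, so check your own config before copying it.

```apache
# Kept: needed to serve index files and set content types.
LoadModule dir_module  modules/mod_dir.so
LoadModule mime_module modules/mod_mime.so

# Commented out on the assumption that this site uses none of them.
#LoadModule autoindex_module modules/mod_autoindex.so
#LoadModule userdir_module   modules/mod_userdir.so
#LoadModule cgi_module       modules/mod_cgi.so
```

Running apachectl configtest afterwards catches any directive that depended on a module you just removed.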

2.2 Choose appropriate MPM:

Worker MPM uses multiple child processes. It’s multi-threaded within each child, and each thread handles a single connection.
Worker is fast and highly scalable, and its memory footprint is comparatively low. It’s well suited for multiple processors. On the other hand,
worker is less tolerant of faulty modules, and a faulty thread can affect all the threads in a child process.

Prefork MPM uses multiple child processes, and each child handles one connection at a time. Prefork is well suited for single or double CPU systems,
its speed is comparable to that of worker, and it’s highly tolerant of faulty modules and crashing children. But its memory usage is high:
more traffic leads to more memory usage.

3.1 DNS lookup:

The HostnameLookups directive enables DNS lookups so that hostnames can be logged instead of IP addresses. This adds latency to every request, since the DNS lookup has to complete before the request is finished.
HostnameLookups is Off by default in Apache 1.3 and above. Leave it Off and use a post-processing program such as logresolve to resolve IP addresses in Apache’s access log files.
logresolve ships with Apache.

When using Allow from or Deny from directives, use an IP address instead of a domain name or a hostname.
Otherwise a double DNS lookup is performed to make sure that the domain name or the hostname is not being spoofed.
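
Both points can be sketched in a few lines. The directory path and the 192.0.2.0/24 documentation range are placeholders, and the Order/Allow/Deny syntax assumes the same Apache 2.2 era as the rest of these notes (Apache 2.4 expresses this with Require ip instead).

```apache
# Log IP addresses; resolve them later with logresolve if needed.
HostnameLookups Off

<Directory "/var/www/html/private">
    Order deny,allow
    Deny from all
    # An address range needs no DNS lookup; "Allow from example.com" would.
    Allow from 192.0.2.0/24
</Directory>
```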

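3.2 AllowOverride:

If AllowOverride is not set to ‘None’, Apache will attempt to open a .htaccess file (as named by the AccessFileName directive) in each directory that it visits. For example:

DocumentRoot /var/www/html
<Directory />
AllowOverride all
</Directory>

With this configuration, a request for /index.html makes Apache look for .htaccess in /, /var, /var/www and /var/www/html, and each of those extra file system lookups adds latency.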
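
If only one directory really needs .htaccess, overrides can be switched off everywhere and enabled just there. A minimal sketch, assuming a hypothetical /var/www/html/blog directory:

```apache
DocumentRoot /var/www/html

# No .htaccess lookups anywhere by default.
<Directory />
    AllowOverride None
</Directory>

# Hypothetical directory that genuinely needs per-directory overrides.
<Directory "/var/www/html/blog">
    AllowOverride All
</Directory>
```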

3.3 FollowSymLinks and SymLinksIfOwnerMatch:

If the FollowSymLinks option is set, the server will follow symbolic links in this directory. If SymLinksIfOwnerMatch is set, the server will follow symbolic links only if the target file or directory is owned by the same user as the link.

If SymLinksIfOwnerMatch is set, Apache has to issue additional system calls to verify that the ownership of the link and the target file match.
Additional system calls are also needed when FollowSymLinks is NOT set. For example:

DocumentRoot /var/www/html
<Directory />
Options SymLinksIfOwnerMatch
</Directory>

Here a request for /index.html makes Apache call lstat() on /var, /var/www, /var/www/html and /var/www/html/index.html.
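
The cheaper arrangement is to follow symlinks everywhere and keep the ownership check for the one place that needs it. A sketch, where /var/www/html/uploads is a hypothetical directory invented for the example:

```apache
DocumentRoot /var/www/html

# Follow symlinks without the extra ownership checks.
<Directory />
    Options FollowSymLinks
</Directory>

# Hypothetical directory where the ownership check is worth its cost.
<Directory "/var/www/html/uploads">
    Options SymLinksIfOwnerMatch
</Directory>
```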

3.4 Content Negotiation:

Avoid content negotiation for fast responses. If content negotiation is required for the site, use type-map files rather than the Options MultiViews directive. With MultiViews,
Apache has to scan the directory for files, which adds to the latency.
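
Switching from MultiViews to type maps might look roughly like this. The directory is an assumption, and .var is simply the extension conventionally used for type-map files. The .var files themselves, which list each variant explicitly, still have to be written by hand.

```apache
<Directory "/var/www/html">
    # Stop Apache from scanning the directory for negotiable variants.
    Options -MultiViews
    # Treat *.var files as type maps that name the variants explicitly.
    AddHandler type-map .var
</Directory>
```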

3.5 MaxClients:

MaxClients sets the limit on the maximum number of simultaneous requests the server can support. No more than this number of child processes are spawned.
It shouldn’t be set so low that new connections are queued and eventually time out while server resources sit unused.
Setting it too high will cause the server to start swapping, and response time will degrade drastically.
An appropriate value for MaxClients can be calculated as:

MaxClients = Total RAM dedicated to the web server / Max child process size [4]

The child process size for serving static files is about 2-3 MB. For dynamic content such as PHP, it may be around 15 MB. The RSS column in the output of ps shows the resident (non-swapped) memory each process is using.
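
A worked example of that calculation, with made-up inputs: suppose 3000 MB of RAM is set aside for Apache and the largest child observed is 15 MB. Both figures are assumptions for illustration, so measure your own before settling on a value.

```apache
# Assumed figures, for illustration only:
#   RAM dedicated to Apache : 3000 MB
#   largest child process   :   15 MB
#   MaxClients = 3000 / 15  =  200
<IfModule prefork.c>
    # 200 is below 256, so ServerLimit does not need to be raised.
    MaxClients 200
</IfModule>
```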

3.6 MinSpareServers, MaxSpareServers, and StartServers:

3.7 MaxRequestsPerChild:

3.8 KeepAlive and KeepAliveTimeout:

4 HTTP Compression & Caching

5 Separate server for static and dynamic content