What is sysctl?
sysctl is an interface for viewing and changing kernel parameters at runtime on Linux and other Unix-like operating systems. On Linux, most tunable kernel settings can be changed via sysctl. The same parameters are also available as files under /proc/sys in the virtual /proc filesystem.
How do I use sysctl?
To read values, you have two options:
reading parameters
# Option 1: Using the sysctl command to read current parameters:
sysctl net.ipv4.ip_forward # display specific parameter
sysctl net.ipv4 # display all net.ipv4.* parameters
sysctl -a # display all parameters
# Option 2: Using the /proc filesystem:
cat /proc/sys/net/ipv4/ip_forward
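The dotted parameter names and the paths below /proc/sys map onto each other one to one. A minimal sketch of that mapping, assuming that no path component contains a literal dot (interface names such as eth0.100 would need extra handling). The function names are made up for illustration:

```shell
#!/bin/sh
# Convert a dotted sysctl name into its /proc/sys path.
# Assumption: no path component contains a literal dot.
sysctl_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Convert a /proc/sys path back into the dotted sysctl name.
path_to_sysctl() {
    printf '%s\n' "${1#/proc/sys/}" | tr '/' '.'
}

sysctl_to_path net.ipv4.ip_forward            # prints /proc/sys/net/ipv4/ip_forward
path_to_sysctl /proc/sys/net/ipv4/ip_forward  # prints net.ipv4.ip_forward
```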
To write values, you can use the same two options (both need root privileges):
changing parameters
# Option 1: Using the sysctl command to change a parameter:
sysctl net.ipv4.ip_forward=1
# Option 2: Using the /proc filesystem to change a parameter:
echo 1 >/proc/sys/net/ipv4/ip_forward
However, changes made this way are not persistent. To keep them after a reboot, configure them in /etc/sysctl.conf or in a .conf file under /etc/sysctl.d/.
sysctl configuration files
/etc/sysctl.conf
/etc/sysctl.d/
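Please note that configuration changes are not detected automatically. You have to trigger a reload manually. Without a filename, sysctl -p loads /etc/sysctl.conf; sysctl --system loads all configuration files, including those in /etc/sysctl.d/:
reload sysctl configuration
sysctl -p [filename]
sysctl --system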
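The configuration files use a simple format: one "name = value" pair per line, with lines starting with # or ; treated as comments. Because a reload is easy to forget, it can help to compare a file against the running values. The following is only a sketch: the function names are invented here, and the optional second argument of check_settings (an alternative root directory instead of /proc/sys) is a hypothetical hook so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Print "name=value" for every setting in a sysctl.conf-style file,
# skipping blank lines and comment lines (starting with "#" or ";").
list_settings() {
    awk '
        /^[ \t]*$/     { next }
        /^[ \t]*[#;]/  { next }
        {
            eq = index($0, "=")
            if (eq == 0) next                 # ignore lines without "="
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)  # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            print key "=" val
        }
    ' "$1"
}

# Compare every setting in file $1 with the value found below $2
# (default: /proc/sys) and report OK, DIFF or MISSING per parameter.
check_settings() {
    root=${2:-/proc/sys}
    list_settings "$1" | while IFS='=' read -r name want; do
        path="$root/$(printf '%s' "$name" | tr '.' '/')"
        if [ ! -r "$path" ]; then
            echo "MISSING $name"              # not present or not readable
            continue
        fi
        # Multi-value entries such as tcp_rmem are tab separated in /proc/sys.
        have=$(tr '\t' ' ' < "$path" | tr -s ' ')
        want=$(printf '%s\n' "$want" | tr '\t' ' ' | tr -s ' ')
        if [ "$have" = "$want" ]; then
            echo "OK $name"
        else
            echo "DIFF $name (have: $have, want: $want)"
        fi
    done
}
```

A call such as check_settings /etc/sysctl.conf would then list every parameter whose running value differs from the configured one.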
Tuning Linux with sysctl
Kernel
To automatically reboot a system after a kernel panic, set the following parameter to the number of seconds to wait before the reboot:
reboot system after kernel panic
kernel.panic = 60
Linux kernels provide a magic SysRq key, which allows the user to issue low-level commands regardless of the system’s state. To enable all functions of this magic key, set:
enable magic SysRq key
kernel.sysrq = 1
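Instead of 1, kernel.sysrq also accepts a bitmask that enables only selected functions. The sketch below combines a few of the bit values listed in the kernel's SysRq documentation. The function name and the choice of functions are only an illustration, not a recommendation:

```shell
#!/bin/sh
# Bit values as listed in the kernel's SysRq documentation.
SYSRQ_SYNC=16        # allow the sync command
SYSRQ_REMOUNT_RO=32  # allow remounting filesystems read-only
SYSRQ_REBOOT=128     # allow reboot/poweroff

# Combine the given bit values into a single kernel.sysrq value.
sysrq_mask() {
    mask=0
    for bit in "$@"; do
        mask=$(( mask | bit ))
    done
    echo "$mask"
}

# Prints: kernel.sysrq = 176
echo "kernel.sysrq = $(sysrq_mask "$SYSRQ_SYNC" "$SYSRQ_REMOUNT_RO" "$SYSRQ_REBOOT")"
```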
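By default, processes that have changed privileges (for example setuid binaries) do not write core dumps. To let them dump core in a form that is readable by root only, set the following parameter:
allow core dumps for privileged processes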
fs.suid_dumpable = 2
It can be useful to have the PID appended to the filename of core dumps, especially when debugging multi-threaded applications. It is easy to set up:
add PID to core dumps
kernel.core_uses_pid = 1
To raise the upper limit for process IDs, set the following parameter:
increase maximum PID
kernel.pid_max = 65536
Memory
To tune the memory (VM) behaviour in Linux, you can set some vm.* parameters.
For example, to tell the kernel how aggressively it should swap memory pages out to disk, change the swappiness value. The higher the value, the more aggressive the swapping (the kernel default is 60):
swappiness
vm.swappiness
Filesystem writes usually go through a cache: modified (“dirty”) pages are held in memory before they are written to disk. The limit for this dirty memory is expressed as a percentage of the available memory and can be defined with:
maximum dirty memory
vm.dirty_ratio = 40
When the defined percentage is reached, processes that write data are blocked until enough dirty pages have been flushed to disk by pdflush (replaced by per-device flusher threads since kernel 2.6.32). This is suboptimal, because on a healthy system you don’t want blocked I/O writes at all. Therefore there is another parameter, which defines the percentage of dirty memory at which the background flusher starts to write out dirty pages:
background filesystem cache flushes
vm.dirty_background_ratio = 10
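To get a feeling for what these percentages mean in absolute terms, the thresholds can be estimated from /proc/meminfo. This is a rough sketch only: it uses the MemAvailable field as a stand-in for the kernel's internal notion of available memory, which is computed differently, and the function name and its optional file argument are made up for illustration.

```shell
#!/bin/sh
# Estimate a dirty-memory threshold in KiB.
# $1 = percentage (e.g. the value of vm.dirty_ratio)
# $2 = meminfo file to read (optional, default /proc/meminfo)
# Prints nothing if the file has no MemAvailable line (very old kernels).
dirty_threshold_kib() {
    awk -v pct="$1" '
        /^MemAvailable:/ { printf "%d\n", $2 * pct / 100 }
    ' "${2:-/proc/meminfo}"
}
```

With the values above, dirty_threshold_kib 10 approximates the point where background flushing starts, and dirty_threshold_kib 40 the point where writers are blocked.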
As described above, pdflush is in charge of flushing dirty pages to disk. You can optionally change how often it wakes up by setting the following parameter (in hundredths of a second, e.g. 500 = 5s):
pdflush interval
vm.dirty_writeback_centisecs = 500
pdflush also needs to know when dirty data is old enough to be written out. Sometimes it makes sense to increase how long “untouched” dirty data may stay in the cache before it is marked as expired. Just override the following parameter (again in hundredths of a second):
dirty data expiry
vm.dirty_expire_centisecs = 3000
If you want more information about the memory on your system, have a look at:
display memory information
cat /proc/meminfo
Filesystem
To increase the system-wide maximum number of open file handles, use:
increase maximum file descriptors
fs.file-max = 65535
Exec Shield
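Exec Shield is a protection against worms and other automated remote attacks on Linux systems. It was invented by Red Hat in 2002. The kernel.exec-shield parameter only exists on kernels that carry the Exec Shield patch (such as older Red Hat kernels). kernel.randomize_va_space controls address space layout randomization: 1 randomizes the stack, mmap base and VDSO, and 2 additionally randomizes the heap. To enable Exec Shield:
enable Exec Shield protection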
kernel.exec-shield = 1
kernel.randomize_va_space = 1
Network Core
Some applications are tuned for performance and can make use of large socket buffers. To increase the maximum buffer size that an application can request for any socket with SO_RCVBUF or SO_SNDBUF (the TCP auto-tuning limits are set separately with net.ipv4.tcp_rmem and net.ipv4.tcp_wmem), you can use:
increase max buffer size
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
When a system is under heavy load and an interface receives a lot of packets, the kernel might not process them fast enough. You can increase the number of packets held in the queue (backlog) by changing:
increase maximum backlog size for net devices
net.core.netdev_max_backlog = 5000
IPv4
First of all, we recommend tuning ICMP a bit. Ignoring ICMP echo requests sent to broadcast addresses protects you from being abused in broadcast-based ICMP floods (Smurf attacks). We also ignore bogus responses to broadcast frames (a violation of RFC 1122), so that the log isn’t filled with kernel warnings:
hardening ICMP
net.ipv4.icmp_echo_ignore_broadcasts = 1
net.ipv4.icmp_ignore_bogus_error_responses = 1
SYN floods are a type of denial-of-service attack and can harm your system. To protect against them, enable SYN cookies, resize the SYN backlog (queue size) and reduce SYN/ACK retries:
enable SYN cookies
# Turn on SYN cookies to protect from SYN flood attacks.
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 2048
net.ipv4.tcp_synack_retries = 3
To log packets with impossible addresses (so-called martians), simply enable:
log impossible IPv4 addresses
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.default.log_martians = 1
To disable IP source routing (SRR), so that nobody can tell us which path a packet should take:
deny packets with SRR option
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.default.accept_source_route = 0
By default, routers route everything, even packets that don’t belong to their network(s). To avoid that, make sure strict reverse path filtering is enabled as defined in RFC 3704:
enable strict reverse path filtering
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.default.rp_filter = 1
Some applications support larger read and write buffers for sockets. The TCP buffer size parameters consist of 3 values in bytes (min, default, max). To increase the maximum buffer size, set:
increase max TCP buffer size
net.ipv4.tcp_rmem = 4096 87380 8388608
net.ipv4.tcp_wmem = 4096 87380 8388608
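A common rule of thumb for choosing the maximum is the bandwidth-delay product (BDP): the amount of data that can be in flight on a path at once. The following sketch computes it with shell arithmetic. The function name and the example link are assumptions for illustration, not values taken from this article's setup:

```shell
#!/bin/sh
# Bandwidth-delay product in bytes.
# $1 = path bandwidth in Mbit/s, $2 = round-trip time in milliseconds
bdp_bytes() {
    # Mbit/s * 1,000,000 / 8 bits per byte * ms / 1000 = Mbit/s * ms * 125
    echo $(( $1 * $2 * 125 ))
}

# A 1 Gbit/s path with 67 ms round-trip time: prints 8375000,
# which is close to the 8388608 bytes (8 MiB) used above.
bdp_bytes 1000 67
```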
To get better throughput in a network, it might make sense to enable TCP window scaling as defined in RFC 1323 (since superseded by RFC 7323):
enable TCP window scaling
net.ipv4.tcp_window_scaling = 1
Disable ICMP redirects altogether. Please note that the send_redirects parameters should stay enabled on routers:
disable redirects
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.default.secure_redirects = 0
# Don’t disable the following two on routers!
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.send_redirects = 0
Finally, disable IPv4 forwarding on non-routing systems:
disable forwarding
net.ipv4.ip_forward = 0
IPv6
Those who don’t use IPv6 at all should disable it:
disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
If you’re already using IPv6 you might be interested in the following parameters.
On non-routing systems you should disable router solicitations:
disable router solicitations
net.ipv6.conf.default.router_solicitations = 0
net.ipv6.conf.all.router_solicitations = 0
You should also not accept router preferences from router advertisements:
disable router preferences in RA
net.ipv6.conf.default.accept_ra_rtr_pref = 0
net.ipv6.conf.all.accept_ra_rtr_pref = 0
Don’t try to learn prefix information in router advertisements:
don’t learn prefix information in RA
net.ipv6.conf.default.accept_ra_pinfo = 0
net.ipv6.conf.all.accept_ra_pinfo = 0
Don’t learn the default router from router advertisements:
don’t accept default router from RA
net.ipv6.conf.default.accept_ra_defrtr = 0
net.ipv6.conf.all.accept_ra_defrtr = 0
Disable IPv6 auto configuration, so that no unicast addresses can automatically be configured on your interface from a router advertisement:
disable auto configuration from RA
net.ipv6.conf.default.autoconf = 0
net.ipv6.conf.all.autoconf = 0
If you don’t want your system to send neighbour solicitations for duplicate address detection (DAD), set the number of DAD probes to 0. Note that this turns duplicate address detection off:
disable duplicate address detection
net.ipv6.conf.default.dad_transmits = 0
net.ipv6.conf.all.dad_transmits = 0
Unless you need more than one global unicast address, you should limit the number of autoconfigured addresses per interface to 1:
limit addresses per interface
net.ipv6.conf.default.max_addresses = 1
net.ipv6.conf.all.max_addresses = 1
.all & .default
A lot of sysctl parameters exist in several variants, because there is a .default, a .all and sometimes even an interface-specific one.
According to a comment on the linux-kernel mailing list, there is one major difference:
The default value will only be applied ONCE, at the point when an interface is created.
The all value will ALWAYS be applied in addition.
This means that when an interface is created, the default value is applied to it once. However, you can overwrite it with the interface-specific parameter. The global .all parameter is always applied in addition, and in the end it depends on the logical operator what the “final value” looks like.
For example, there are parameters where all settings need to be 1 (aka AND), where only one of the settings needs to be 1 (aka OR), or where the highest value is used (aka MAX).
So it’s important to know that existing interfaces might have a different value than the one you’ve set as default or all.
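The three ways of combining the .all value with the interface-specific value can be sketched as a small function. This only illustrates the logic described above; which operator applies to which parameter has to be looked up in the kernel's ip-sysctl documentation (rp_filter, for instance, is documented there as using the maximum of both values). The function name is made up:

```shell
#!/bin/sh
# combine OP ALL_VALUE INTERFACE_VALUE
# Prints the effective value for the operators and, or and max.
combine() {
    case $1 in
        and)
            if [ "$2" -ne 0 ] && [ "$3" -ne 0 ]; then echo 1; else echo 0; fi
            ;;
        or)
            if [ "$2" -ne 0 ] || [ "$3" -ne 0 ]; then echo 1; else echo 0; fi
            ;;
        max)
            if [ "$2" -gt "$3" ]; then echo "$2"; else echo "$3"; fi
            ;;
        *)
            echo "unknown operator: $1" >&2
            return 1
            ;;
    esac
}

combine and 1 0   # prints 0: both values would have to be 1
combine or  1 0   # prints 1: one value set to 1 is enough
combine max 1 2   # prints 2: the higher value wins
```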