Linux log analyzing
From Oxxus Wiki
Log analyzing overview
Log analysis is an important part of checking on your system and troubleshooting problems. Linux servers keep their main logs in a dedicated directory, /var/log.
There you will find several log files that rotate automatically: the most recent records are in the file with the plain log name, while older records are stored in log.1, log.2, and so on. Some of the rotated logs may be compressed in .gz format to save disk space.
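Compressed rotated logs do not need to be unpacked to be read: zcat and zgrep work on gzip files directly. Here is a minimal sketch using a locally created sample file (the file name is illustrative; on a real server you would point these commands at something like a rotated file under /var/log):

```shell
# Create a sample "rotated" log and compress it, as logrotate would
printf 'old entry one\nold entry two\n' > sample.log.1
gzip -f sample.log.1              # produces sample.log.1.gz

# zcat prints the decompressed contents to stdout without unpacking the file
zcat sample.log.1.gz

# zgrep searches inside the compressed file directly
zgrep "entry two" sample.log.1.gz
```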
Applications that are installed on your server might do the logging in their own directory structure. For example, Tomcat server does logging in logs/ directory relative to its base directory. JBOSS in server/default/logs/ and so on.
Using head/tail to monitor the logs
The most commonly used tool for monitoring a log file is the tail command. By default it prints the last 10 lines of the file, but you can request any number of lines, e.g. tail -n 150 (prints the last 150 lines of the file).
The counterpart of tail is the head command, which does the same but from the beginning of the file.
The most useful feature of tail is real-time reading of a growing file. With the command tail -f /var/log/messages, you can watch new entries appear in /var/log/messages as they are logged.
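A quick sketch of head and tail on a generated sample file (the file name and seq-generated contents are made up for the demonstration):

```shell
# Generate a 20-line sample file
seq 1 20 > sample.txt

head -n 3 sample.txt   # first three lines: 1 2 3
tail -n 3 sample.txt   # last three lines: 18 19 20

# tail -f keeps the file open and prints new lines as they are appended;
# it never exits on its own, so press Ctrl+C to stop it:
# tail -f /var/log/messages
```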
Using grep to find log entries
grep is a tool that matches regular expressions against the files and patterns you specify. The most useful grep switches are:
grep -i - ignores case when searching (e.g. grep -i oct /var/log/secure)
grep -v - displays all but the matched lines; useful when you want to exclude a pattern (e.g. grep -v Oct /var/log/secure # excludes October entries from the output)
grep -e - matches a regular expression against the text (e.g. grep -e "^$" file # finds all empty lines in the file)
grep -c - prints only the number of matched lines
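The switches above can be tried out on a small sample file; the log contents here are made up purely to exercise each switch:

```shell
# Build a tiny sample log: two "Oct"/"oct" lines, one "Nov" line, one empty line
printf 'Oct 1 login ok\noct 2 login failed\nNov 3 login ok\n\n' > demo.log

grep -i oct demo.log           # case-insensitive: matches both "Oct" and "oct"
grep -v Oct demo.log           # everything except lines containing "Oct"
grep -e "^$" demo.log | wc -l  # count empty lines via a regular expression
grep -c login demo.log         # just the number of lines containing "login"
```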
Piping the commands to get the relevant entries
The pipe character "|" in Linux redirects the standard output of one program to the standard input of another.
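A minimal pipe in action, with made-up sample text: printf's output becomes grep's input, and grep's output feeds wc -l, which counts the matching lines:

```shell
# Count how many of the three sample lines contain "error"
printf 'error: disk full\ninfo: backup done\nerror: timeout\n' | grep error | wc -l
```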
Let's say, for example, that the following entries appear in /var/log/secure:
Nov 5 15:31:04 test sshd[9580]: Failed password for root from 91.205.189.15 port 37814 ssh2
Nov 5 15:31:04 test sshd[9581]: Received disconnect from 91.205.189.15: 11: Bye Bye
Nov 5 15:31:04 test sshd[9582]: Failed password for invalid user bksales from 91.205.189.15 port 58339 ssh2
Nov 5 15:31:04 test sshd[9583]: Received disconnect from 91.205.189.15: 11: Bye Bye
Nov 5 15:31:14 test sshd[9584]: Did not receive identification string from UNKNOWN
Nov 5 15:31:14 test sshd[9585]: Did not receive identification string from 91.205.189.15
If you want to find all the failed password attempts, you would run grep "Failed password" /var/log/secure.
The output will look something like this:
Nov 5 15:30:52 test sshd[9570]: Failed password for root from 91.205.189.15 port 37097 ssh2
Nov 5 15:30:55 test sshd[9572]: Failed password for root from 91.205.189.15 port 57664 ssh2
Nov 5 15:30:56 test sshd[9574]: Failed password for invalid user bksales from 91.205.189.15 port 37345 ssh2
Nov 5 15:31:00 test sshd[9576]: Failed password for root from 91.205.189.15 port 37625 ssh2
Nov 5 15:31:01 test sshd[9578]: Failed password for root from 91.205.189.15 port 57974 ssh2
You can exclude the invalid user entries by adding one more grep command: grep "Failed password" /var/log/secure | grep -v "invalid user", which gives the following output:
Nov 5 15:30:52 test sshd[9570]: Failed password for root from 91.205.189.15 port 37097 ssh2
Nov 5 15:30:55 test sshd[9572]: Failed password for root from 91.205.189.15 port 57664 ssh2
Nov 5 15:31:00 test sshd[9576]: Failed password for root from 91.205.189.15 port 37625 ssh2
Nov 5 15:31:01 test sshd[9578]: Failed password for root from 91.205.189.15 port 57974 ssh2
You can chain as many commands as you like, and they are not limited to a single tool such as grep. The next section shows how to combine different commands to format the output.
Using cut, awk, sort and uniq commands for log analyzing
These four commands are very useful for extracting and formatting information from log files:
cut - splits text into fields
awk - text processing tool
sort - sorts lines of text
uniq - removes adjacent duplicate lines (the input needs to be sorted for it to work)
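Before combining them in a pipeline, here is each tool in isolation on a toy input file (the field values are made up for illustration):

```shell
# Four colon-delimited sample lines, with one duplicate
printf 'b:2\na:1\nb:2\nc:3\n' > fields.txt

cut -d ":" -f 1 fields.txt           # first field of each line: b a b c
awk -F ":" '{print $2}' fields.txt   # second field via awk: 2 1 2 3
sort fields.txt                      # lines in sorted order
sort fields.txt | uniq               # duplicates removed (input must be sorted first)
```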
In the following example, we will use all four tools to get the list of IP addresses in /var/log/secure that have failed password attempts. A sample of the log appears in the previous section. First we use cut to split each line into fields at the delimiter ":" and print just the fourth field (notice that the timestamp itself contains ":", so the message text ends up in the fourth field).
Command: grep "Failed password" /var/log/secure | cut -d ":" -f 4
Output:
Failed password for root from 91.205.189.15 port 57041 ssh2
Failed password for root from 91.205.189.15 port 36840 ssh2
Failed password for root from 91.205.189.15 port 57376 ssh2
Failed password for root from 91.205.189.15 port 37097 ssh2
Failed password for root from 91.205.189.15 port 57664 ssh2
Failed password for invalid user bksales from 91.205.189.15 port 37345 ssh2
We now use grep to remove the 'invalid user' lines and add an awk command to print the sixth field.
Command: grep "Failed password" /var/log/secure | cut -d ":" -f 4 | grep -v "invalid user" | awk '{print $6}'
Output:
92.53.107.226
92.53.107.226
92.53.107.226
91.205.189.15
91.205.189.15
...
Now we just need to sort the output and add uniq to the command to get the list of unique IP addresses.
Command: grep "Failed password" /var/log/secure |cut -d ":" -f 4 |grep -v "invalid user" |awk '{print $6 }' |sort -n |uniq
Output:
83.103.59.130
91.205.189.15
92.53.107.226
93.185.34.100
It's simple ;)
The awk tool
Awk, as mentioned before, is a text processing utility. While it has many features, we will focus on the few most useful for log analysis, continuing the IP address example from the previous section. We add the -c switch to uniq, which prints the number of occurrences next to each line it keeps.
Command: grep "Failed password" /var/log/secure | cut -d ":" -f 4 | grep -v "invalid user" | awk '{print $6}' | sort -n | uniq -c
Output:
26 91.205.189.15
26 92.53.107.226
50 93.185.34.100
5 115.236.59.123
2 119.161.145.205
Before we go on, we need to define a few rules:
1. awk scripts use the dollar sign for fields: $1 refers to the first field, $2 to the second, and so on, while $0 refers to the whole line.
2. In awk you can use conditions, such as if ($var) print ...
3. The print command prints text to standard output.
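The three rules can be demonstrated on a single made-up line of input:

```shell
echo "5 alpha" | awk '{ print $1 }'             # first field: 5
echo "5 alpha" | awk '{ print $0 }'             # whole line: 5 alpha
echo "5 alpha" | awk '{ if ($1 > 3) print $2 }' # condition on field 1: prints alpha
echo "2 alpha" | awk '{ if ($1 > 3) print $2 }' # condition fails: prints nothing
```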
We will now print only the IPs that are found more than 10 times:
Command: grep "Failed password" /var/log/secure | cut -d ":" -f 4 | grep -v "invalid user" | awk '{print $6}' | sort -n | uniq -c | awk '{ if ($1>10) print $2 }'
Output:
91.205.189.15
92.53.107.226
93.185.34.100
You can do even more with awk by adding regular expressions for matching. If you want to learn awk beyond the basics, there are good online tutorials on the topic.
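As a small sketch of awk's regular expression matching, the ~ operator filters on a field before printing. The counted-IP sample lines below are made up to mirror the uniq -c output from earlier:

```shell
# Print only the IPs (second field) that start with a 9
printf '26 91.205.189.15\n5 10.0.0.1\n50 93.185.34.100\n' |
awk '$2 ~ /^9/ { print $2 }'
```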
Regular expression and grep
While regular expressions form a large language of their own, you only need a few basic rules to add to your grep power. Use grep -e or egrep to match the patterns.
The ^ sign represents the beginning of the line: placed before a word, it matches only when the word appears at the very start of the line. The $ sign does the same for the end of the line. If you need to find lines that start with "security", you would run grep -e "^security" file.
You can also specify character ranges. For example, to list all the lines that begin with a digit, you would use: grep -e "^[0-9]" file
Or, if you want to find all the lines ending with an alphabetic character, you would use grep -e "[a-zA-Z]$" file
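The anchors and ranges above can be exercised together on a small sample file (the three lines are made up so that each pattern matches something different):

```shell
printf 'security update\n1 failed login\nend of log\n' > regex.txt

grep -e "^security" regex.txt   # lines starting with the word "security"
grep -e "^[0-9]" regex.txt      # lines starting with a digit
grep -e "[a-zA-Z]$" regex.txt   # lines ending with an alphabetic character
```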
If you find the above instructions too complicated, leave your VPS's health in the hands of our capable engineers by submitting a support ticket, and check out our latest offers.