Why Perl?
- shell or awk programming with
- grep, cut, sort, and sed
Access Log in WebLogic
By default, WebLogic Server (WLS) keeps a log of all HTTP transactions in a text file. The file is named access.log and is located in the
$DOMAIN_HOME/servers/Xxx/logs
directory.
The log provides true timing information from WebLogic, in terms of how long each individual application request takes. This timing information can be important in troubleshooting a slow system.
For more details, see [2] or the more recent documentation on Oracle's official site.
Awk is a pattern scanning and processing language that is well suited to extracting or transforming text, for example to produce formatted reports. See [4] for more details.
Perl is a powerful programming language thanks to its strong regular expression and string parsing abilities. In Perl, you can use patterns to locate the parts of a string that you want to change, and then change them with its "search and replace" operator.
Search and replace is performed using s/regex/replacement/modifiers. The replacement is a Perl double-quoted string that replaces whatever the regex matched in the string. If there is a match, s/// returns the number of substitutions made; otherwise it returns false.
Case Study
In this article, we will use the following sample access log entry for illustration:
2020-08-23 15:54:02 0.031 479 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/system/skins/activetheme 404 "4cb1bd49-1deb-4b6f-84c5-1153f22e3739-0000000c" "1.4cb1bd49-1deb-4b6f-84c5-1153f22e3739-0000000c;kXKwo3hCQtRLGmjE0ZJOoOTLkKPOoLRKlSODoITT_G" - -
Note that the fields above are separated by horizontal tabs (shown as ht in the character dump below), not spaces.
0000000 2 0 2 0 - 0 8 - 2 3 ht 1 5 : 5 4
0000020 : 0 2 ht 0 . 0 3 1 ht 4 7 9 ht G E
0000040 T ht / b i - c o n t e n t s t o
0000060 r a g e / a p i / v 1 / i n s t
0000100 a n c e s / b o o t s t r a p /
0000120 a r t i f a c t s / n a m e s p
0000140 a c e s / c o n t e n t : c a t
0000160 a l o g / a t t r i b u t e s /
0000200 s y s t e m / s k i n s / a c t
0000220 i v e t h e m e ht 4 0 4 ht " 4 c
0000240 b 1 b d 4 9 - 1 d e b - 4 b 6 f
0000260 - 8 4 c 5 - 1 1 5 3 f 2 2 e 3 7
0000300 3 9 - 0 0 0 0 0 0 0 c " ht " 1 .
0000320 4 c b 1 b d 4 9 - 1 d e b - 4 b
0000340 6 f - 8 4 c 5 - 1 1 5 3 f 2 2 e
0000360 3 7 3 9 - 0 0 0 0 0 0 0 c ; k X
0000400 K w o 3 h C Q t R L G m j E 0 Z
0000420 J O o O T L k K P O o L R K l S
0000440 O D o I T T _ G " ht - ht - nl
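A dump in this style can be produced with od -a, which prints named characters such as ht and nl. The sketch below builds a shortened, made-up stand-in for one log line with printf (the URI /some/uri is a placeholder), dumps it, and then counts the tab-separated fields with awk.

```shell
# Build a shortened, hypothetical stand-in for one access.log line.
# printf expands \t to a horizontal tab.
line=$(printf '2020-08-23\t15:54:02\t0.031\t479\tGET\t/some/uri\t404')

# od -a names each character, so tabs appear as "ht" and the newline as "nl".
printf '%s\n' "$line" | od -a

# Splitting on tabs only (-F'\t') should report 7 fields for this line.
nfields=$(printf '%s\n' "$line" | awk -F'\t' '{ print NF }')
echo "fields: $nfields"
```

Setting the field separator explicitly is a design choice: awk's default splitting treats spaces and tabs alike, which gives the same result here only because none of the fields contains a space.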
Awk
For a well-formatted WLS access.log, awk is handy for extracting fields such as:
- cs-method — The request method, for example GET or POST. This field has type <name>, as defined in the W3C specification.
- cs-uri — The full requested URI. This field has type <uri>, as defined in the W3C specification.
- sc-status — Status code of the response, for example (404) indicating a "File not found" status. This field has type <integer>, as defined in the W3C specification.
bash-4.2$ awk '{ print $5, $6, $7 }' sample.log | grep "\s404" | sort -r | uniq -c
1 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/users/18446744073709551615 404
1 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/system/skins/activetheme 404
1 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/maintenancemode 404
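An alternative that avoids the separate grep step is to test the status field inside awk itself. This is a minimal sketch, assuming the same field order as the sample entry; the temporary file and its two shortened entries are made up for illustration.

```shell
# Create a small, made-up tab-separated log with two shortened entries.
log=$(mktemp)
printf '2020-08-23\t15:54:02\t0.031\t479\tGET\t/a/b\t404\t"id"\t"ctx"\t-\t-\n'  > "$log"
printf '2020-08-23\t15:54:03\t0.012\t120\tGET\t/a/c\t200\t"id"\t"ctx"\t-\t-\n' >> "$log"

# Split on tabs only, keep lines whose 7th field (sc-status) is 404,
# and print method, URI, and status before counting duplicates.
report=$(awk -F'\t' '$7 == 404 { print $5, $6, $7 }' "$log" | sort -r | uniq -c)
echo "$report"

rm -f "$log"
```

Only the /a/b request should appear in the report, since the other entry has status 200.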
Perl
bash-4.2$ perl -n -p -e 's/^[^A-Z]*([A-Z]+)\s([^\"]*)\s(\".*)/$1 $2/g' sample.log | grep "\s404" | sort -r | uniq -c
1 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/users/18446744073709551615 404
1 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/system/skins/activetheme 404
1 GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/maintenancemode 404
Note that the above Perl example is not the optimal command for this purpose; it is written to demonstrate as many of Perl's features as possible in one example.
In the above example, our substitution operator is:
s/^[^A-Z]*([A-Z]+)\s([^\"]*)\s(\".*)/$1 $2/g
or
regex: "^[^A-Z]*([A-Z]+)\s([^\"]*)\s(\".*)"
replacement: "$1 $2"
modifiers: "g"
where
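- regex
  - ^[^A-Z]* matches everything before the first uppercase letter, i.e., 2020-08-23 15:54:02 0.031 479 and the tab that follows it
  - The first capturing group ([A-Z]+), or $1, matches the request method GET
  - \s matches a single whitespace character (here, an ht)
  - The second capturing group ([^\"]*), or $2, matches the URI and the status code, still separated by a tab: /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/system/skins/activetheme 404
  - The third capturing group (\".*), or $3, matches the rest of the line starting at the first double quote; it is discarded because the replacement does not use it
- replacement
  - The whole line (except the trailing newline) is replaced with "$1 $2", that is:
  - GET /xx-contentyyyyyyy/api/v1/instances/bootstrap/artifacts/namespaces/content:catalog/attributes/system/skins/activetheme 404
- modifiers
  - The global modifier /g tells the substitution to replace every match in the string rather than only the first. Because the regex is anchored with ^, it can match at most once per line, so /g is not needed here and is included for illustration only.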
You can read this Perl script example to learn more of its features.
Acknowledgement
The author would like to thank his co-worker Mohan Tadepalli for providing the Perl examples and for inspiring him to write this article.