Insights on Oracle & Tech: Linux

Sunday, August 30, 2020

NTP—How to Check It Is Working?

Network Time Protocol (NTP) synchronizes timekeeping among a set of distributed time servers and clients. This synchronization allows events to be correlated when system logs are created and other time-specific events occur.

Clock Skew

Clock skew is a phenomenon in computers in which the same sourced clock signal arrives at different components at different times. The instantaneous difference between the readings of any two clocks is called their skew.

Causes of clock skew can include:

No running NTP services
Improperly configured NTP services^[1]
NTP attacks^[2,7-9]

Some of these attacks shift the time on NTP clients.
Another threat is a malicious insider, who could modify the system time in an attempt to hide events or manipulate time-sensitive transactions.

A congested or lossy network^[3]

NTP—How Does It Work?

NTP uses the User Datagram Protocol (UDP) as its transport protocol. All NTP communication uses Coordinated Universal Time (UTC). An NTP network usually receives its time from an authoritative time source, such as a radio clock or an atomic clock attached to a time server. NTP distributes this time across the network.

NTP is extremely efficient; no more than one packet per minute is necessary to synchronize two machines to within a millisecond of each other.

Stratum

NTP uses the concept of a “stratum” to describe how many NTP “hops” away a machine is from an authoritative time source. A “stratum 1” time server typically has an authoritative time source (such as a radio or atomic clock, or a GPS time source) directly attached, a “stratum 2” time server receives its time via NTP from a “stratum 1” time server, and so on.

Was Your NTP Service Properly Configured?

If NTP is not properly configured, your web server's system time can drift far into the future or the past. Accurate system time is critical for:

Application logic
Scheduled jobs
Logging

If the system time is off, log forensics and correlation of security events across systems become a nightmare, especially in virtual machine based deployments.

How to Sync the Clock on VMs?

If you use Red Hat Enterprise Linux, here are some existing knowledge base documents on how to sync the clock on VMs, such as:

How to check NTP is working?

You can use the following commands to check:

ntpq — standard NTP query program
ntpstat — show network time synchronisation status
timedatectl — show or set info about ntp using systemd

ntpq

The ntpq utility program is used to monitor NTP daemon ntpd operations and determine performance.

$ ntpq

ntpq> pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
-isipc6.cairn.ne .GPS1.           1 u   18   64  377   65.592   -5.891   0.044
+saicpc-isiepc2. pogo.udel.edu    2 u  241  128  370   10.477   -0.117   0.067
+uclpc.cairn.net pogo.udel.edu    2 u   37   64  177  212.111   -0.551   0.187
*pogo.udel.edu   .GPS1.           1 u   95  128  377    0.607    0.123   0.027

The tattletale symbol at the left margin displays the synchronization status of each peer. The currently selected peer is marked *, while additional peers designated acceptable for synchronization, but not currently selected, are marked +.
Peers marked * and + are included in the weighted average computation to set the local clock; the data produced by peers marked with other symbols are discarded. See ntpq for the meaning of these symbols.

remote

Corresponds to the server and peer entries listed in the configuration file; however, the DNS names might not agree if the names listed are not the canonical DNS names.

refid

Shows the current source of synchronization

st

Shows the stratum

t

Shows the type (u = unicast, m = multicast, l = local, - = don't know)

when (in secs)

Shows the time since the peer was last heard in seconds

poll (in secs)

The poll interval

reach

Shows the status of the reachability register (see RFC-1305) in octal.

delay (in ms)

Shows the latest round-trip delay

offset (in ms)

Shows the latest offset

Offset generally refers to the difference in time between an external timing reference and time on a local machine.
The greater the offset, the more inaccurate the timing source is. Synchronized NTP servers will generally have a low offset.

jitter (in ms)

Shows the latest jitter (or estimated error)

The jitter associated with a timing reference indicates the magnitude of variance, or dispersion, of the signal. Different timing references have different amounts of jitter. The more accurate a timing reference, the lower the jitter value.
Note that in NTP Version 4 what used to be the dispersion column has been replaced by the jitter column.
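
To make the reach column more concrete: the value is an 8-bit shift register printed in octal, in which the least significant bit records the most recent poll. The following is a minimal Python sketch that expands it into per-poll results. The function name decode_reach is our own and is not part of ntpq.

```python
def decode_reach(reach_octal):
    """Expand the octal reach value into eight poll results.

    Index 0 is the most recent poll; True means a reply was received.
    """
    register = int(reach_octal, 8) & 0xFF
    return [bool((register >> bit) & 1) for bit in range(8)]


if __name__ == "__main__":
    for value in ("377", "370", "177"):
        print(value, decode_reach(value))
```

Read this way, 377 means that all of the last eight polls were answered, while 370 means that the three most recent polls went unanswered.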

To avoid possible distractions due to name resolution problems, run the ntpq program using the -n switch.

-n

Output all host addresses in dotted-quad numeric format rather than converting to the canonical host names.

# ntpq -np
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*xxx.254.169.yyy 192.168.0.151    2 u  141 1024  377    0.545    0.066   0.131

ntpstat

ntpstat is a script which prints a brief summary of the system clock's synchronization status when the ntpd or chronyd daemon is running.

# ntpstat
synchronised to NTP server (xxx.254.169.yyy) at stratum 3
   time correct to within 56 ms
   polling server every 1024 s
# echo $?

You can also use the exit status (return value) to verify synchronization from a shell script or from the command line. The exit status is:

0 – Clock is synchronized
1 – Clock is not synchronized
2 – If clock state is indefinite or questionable, for example if ntpd is not contactable
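
The exit statuses listed above can also be checked from a script in another language. Below is a small Python sketch; the helper names describe_ntpstat and check_ntpstat are hypothetical, and the sketch assumes that ntpstat is installed and on the PATH.

```python
import subprocess

# Meanings of the ntpstat exit statuses, as listed in the text above.
NTPSTAT_MEANING = {
    0: "clock is synchronized",
    1: "clock is not synchronized",
    2: "clock state is indefinite (for example, ntpd is not contactable)",
}


def describe_ntpstat(exit_status):
    """Translate an ntpstat exit status into a readable message."""
    return NTPSTAT_MEANING.get(exit_status, "unexpected exit status %d" % exit_status)


def check_ntpstat():
    """Run ntpstat and describe its exit status."""
    try:
        result = subprocess.run(["ntpstat"], capture_output=True, text=True)
    except FileNotFoundError:
        return "ntpstat is not installed"
    return describe_ntpstat(result.returncode)


if __name__ == "__main__":
    print(check_ntpstat())
```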

How Was "time correct to within 56 ms" Calculated?

The query outputs of ntpd and chronyd are different. The following discussion is based on ntpd, for which the ntpstat script computes:

distance=$(echo "$delay $disp" | awk '{ printf "%.3f", $1 / 2.0 + $2 }')
if [ -n "$distance" ]; then
    printf " time correct to within %.0f ms" "$distance"
fi

Therefore, distance = delay / 2 + dispersion. With

delay = 0.649
dispersion = 55.480

the distance rounds to 56, and "time correct to within 56 ms" was printed.

Raw Data

# ntpq -c rv
associd=0 status=0615 leap_none, sync_ntp, 1 event, clock_sync,
version="ntpd 4.2.6p5@1.2349-o Tue Jun 23 15:14:56 UTC 2020 (1)",
processor="x86_64", system="Linux/4.14.35-1902.10.8.el7uek.x86_64",
leap=00, stratum=3, precision=-24, rootdelay=0.649, rootdisp=55.480,
refid=xxx.254.169.yyy,
reftime=e2f66690.7a8a26cd Sun, Aug 30 2020 17:55:28.478,
clock=e2f66814.1623cd2d Sun, Aug 30 2020 18:01:56.086, peer=35146,
tc=10, mintc=3, offset=-0.116, frequency=13.462, sys_jitter=0.000,
clk_jitter=0.057, clk_wander=0.009

# ntpstat
synchronised to NTP server (xxx.254.169.yyy) at stratum 3
   time correct to within 56 ms
   polling server every 1024 s
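
As a cross-check, the same figure can be reproduced from the raw data. The Python sketch below pulls rootdelay and rootdisp out of the `ntpq -c rv` text and applies distance = delay / 2 + dispersion. The function name and the simple regular expression are our own assumptions, not part of ntpstat.

```python
import re


def ntpstat_distance_ms(rv_output):
    """Return rootdelay / 2 + rootdisp (in ms) from `ntpq -c rv` text."""
    # Collect only the purely numeric name=value pairs.
    fields = dict(re.findall(r"(\w+)=(-?[\d.]+)", rv_output))
    return float(fields["rootdelay"]) / 2.0 + float(fields["rootdisp"])


if __name__ == "__main__":
    sample = "leap=00, stratum=3, precision=-24, rootdelay=0.649, rootdisp=55.480,"
    distance = ntpstat_distance_ms(sample)
    print("time correct to within %.0f ms" % distance)
```

With the values shown above, 0.649 / 2 + 55.480 = 55.8045 ms, which rounds to 56.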

timedatectl

If you are using a systemd-based system, timedatectl may be used to query and change the system clock and its settings.

Run the following command to check the service status:

# timedatectl status
      Local time: Sun 2020-08-30 17:12:19 UTC
  Universal time: Sun 2020-08-30 17:12:19 UTC
        RTC time: Sun 2020-08-30 17:12:20
       Time zone: UTC (UTC, +0000)
     NTP enabled: yes
NTP synchronized: yes
 RTC in local TZ: no
      DST active: n/a

How can I see the Time Difference between Client and Server?

Normally ntpd maintains an estimate of the time offset. To inspect these offsets, you can use the following commands:^[5]

ntpq -np will display the offsets for each reachable server in milliseconds
ntpdc -c loopinfo will display the combined offset in seconds, as seen at the last poll. If supported, ntpdc -c kerninfo will display the current remaining correction, just as ntptime does.

The first command shows what ntpd currently thinks the offset and jitter are relative to each server, including the preferred/current one. The second can tell you something about the estimated offset/error all the way to the stratum 1 source.
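
If you want to monitor these offsets from a script, the `ntpq -np` output can be parsed line by line. The following Python sketch assumes the ten-column layout shown earlier in this article; the function name parse_ntpq_peers is hypothetical.

```python
def parse_ntpq_peers(output):
    """Parse `ntpq -np` text into dicts with tally, remote, offset_ms, jitter_ms."""
    peers = []
    for line in output.splitlines():
        fields = line.split()
        # Peer rows have ten columns; skip the header, separator, and blank lines.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        peers.append({
            "tally": line[0],               # first character: *, +, -, space, ...
            "remote": line[1:].split()[0],  # address after the tally character
            "offset_ms": float(fields[8]),
            "jitter_ms": float(fields[9]),
        })
    return peers


if __name__ == "__main__":
    sample = (
        "     remote           refid      st t when poll reach   delay   offset  jitter\n"
        "==============================================================================\n"
        "*xxx.254.169.yyy 192.168.0.151    2 u  141 1024  377    0.545    0.066   0.131\n"
    )
    for peer in parse_ntpq_peers(sample):
        print(peer)
```

The row whose tally is * is the currently selected peer, so its offset is the one to alert on.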

References

Sunday, August 2, 2020

How ABRT Avoids Storing Duplicate Crashes: Deduplication

Processes crash for a multitude of reasons and it’s often difficult to understand the root causes that contribute to such crashes. The Automatic Bug Reporting Tool, commonly abbreviated as ABRT, could offer help for forensic investigation.

ABRT

ABRT consists of the abrtd daemon and a number of system services and utilities to process, analyze, and report detected problems.

The daemon runs silently in the background most of the time, and springs into action when an application crashes or a kernel oops is detected. The daemon then collects the relevant problem data such as a core file if there is one, the crashing application's command-line parameters, and other data of forensic utility.

Why ABRT?

Previously, when applications crashed, core dumps were generated without any limit, which could quickly fill up the disk.

A solution is to use ABRT. For example, it can

Rotate cores within a size limit by deleting the oldest^[11]
Avoid storing duplicate crashes by deduplication^[9]

Elements Collected by ABRT

The table below shows a shortened list of elements collected by ABRT and their descriptions. For a full list, see [4]. These elements are stored as files in a single directory per detected problem (such a directory is called a 'dump directory').

core_backtrace	Machine readable backtrace with no private data
coredump	Coredump of the crashing process
count	Number of times this problem occurred
crash_function	Function which crashed
dmesg	Copy of dmesg
docker_inspect	Output of docker inspect $(container_id)
dso_list	List of dynamic libraries loaded at the time of crash
duphash	Hash of the crash's backtrace
environ	Dump of process environment variables along with their values
event_log	Messages produced by ABRT tools during processing the detected problem
executable	Executable path of the component which caused the problem.
global_pid	Value of %P as passed by kernel to the core_pattern helper (see man core for more details)
hostname	Hostname of the affected machine
kernel	Kernel version string
kernel_log	Results of vmcore crash analysis performed by retrace-server
kernel_tainted_long	Tainted kernel description
kernel_tainted_short	Kernel tainted flags (For more information about tainted flags see [1])
last_occurrence	Time of the last occurrence (unixtime)

Deduplication

When ABRT catches a new crash, it compares it to the rest of the stored problems to avoid storing duplicate crashes:

It first checks whether there is a core_backtrace or uuid item in the problem directory it is processing.
If there is a core_backtrace, it iterates over all other dump directories and computes the similarity to their core backtraces (if any). If one of them is similar enough to be considered a duplicate, event processing is stopped and only the notify-dup event is fired.
If there is a uuid item (and no core backtrace), a simple comparison of uuid hashes is used for duplicate detection.

You can read abrt-action-analyze-backtrace for more information.^[6]
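
The decision flow described above can be sketched in Python as follows. This is only an illustration of the flow, not ABRT's implementation: the similarity measure (difflib's SequenceMatcher over lists of frame names), the 0.9 threshold, and the dictionary layout are all assumptions made for this example.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.9  # hypothetical cut-off, not ABRT's real value


def similarity(frames_a, frames_b):
    """Stand-in similarity measure between two lists of frame names."""
    return SequenceMatcher(None, frames_a, frames_b).ratio()


def find_duplicate(new_problem, stored_problems):
    """Return the name of the stored problem that new_problem duplicates, or None.

    Problems are dicts with an optional 'core_backtrace' (list of frame names)
    and an optional 'uuid' (hash string).
    """
    backtrace = new_problem.get("core_backtrace")
    if backtrace:
        for name, other in stored_problems.items():
            other_bt = other.get("core_backtrace")
            if other_bt and similarity(backtrace, other_bt) >= SIMILARITY_THRESHOLD:
                return name
        return None
    if new_problem.get("uuid"):
        for name, other in stored_problems.items():
            if other.get("uuid") == new_problem["uuid"]:
                return name
    return None
```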

count & last_occurrence

After the forensic investigation, you can use:

abrt-cli rm <path to the problem directory>

to remove the specified problem data directory with all its contents.

[abrt]# abrt-cli rm ccpp-2019-08-21-13:59:02-31929
PrivateReports is disabled. Run abrt-cli-root to see all problems detected by ABRT.
rm 'ccpp-2019-08-21-13:59:02-31929'

However, note that ABRT performs a detection of duplicate problems by comparing new problems with all locally saved problems.

For a repeating crash, ABRT requires you to act upon it only once. But, if you delete the crash dump of that problem, the next time this specific problem occurs, ABRT will treat it as a new crash: ABRT will alert you about it, prompt you to fill in a description, and report it. To avoid having ABRT notifying you about a recurring problem, do not delete its problem data.

If you did not remove a specific problem data directory, here is what happens when ABRT catches a new crash:

ABRT compares it to the rest of locally stored problems
If it's a new problem, a new problem directory will be created
Otherwise, ABRT will update the recurring problem by:

Incrementing "count" by one
Updating "last_occurrence" with a new epoch
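
The update of a recurring problem can be pictured with the following Python sketch. It is not ABRT's code: the helper name record_recurrence is hypothetical, and the sketch assumes that a missing count file means the problem has been seen exactly once so far.

```python
from pathlib import Path


def record_recurrence(problem_dir, now):
    """Increment 'count' and refresh 'last_occurrence' in a problem directory.

    'now' is the epoch time of the new occurrence. Returns the new count.
    """
    problem_dir = Path(problem_dir)
    count_file = problem_dir / "count"
    # Assumption: no count file yet means a single occurrence so far.
    count = int(count_file.read_text().strip()) if count_file.exists() else 1
    count += 1
    count_file.write_text(str(count))
    (problem_dir / "last_occurrence").write_text(str(now))
    return count
```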

[ccpp-2019-08-21-13:59:02-31929]# ls -lrt
total 868572
-rw-r-----. 1 abrt abrt 3 Aug 21 2019 uid
-rw-r-----. 1 abrt abrt 10 Aug 21 2019 time
-rw-r-----. 1 abrt abrt 32 Aug 21 2019 os_release
-rw-r-----. 1 abrt abrt 30 Aug 21 2019 kernel
-rw-r-----. 1 abrt abrt 24 Aug 21 2019 hostname
-rw-r-----. 1 abrt abrt 6 Aug 21 2019 architecture
-rw-r-----. 1 abrt abrt 70033 Aug 21 2019 maps
-rw-r-----. 1 abrt abrt 1323 Aug 21 2019 limits
-rw-r-----. 1 abrt abrt 88 Aug 21 2019 cgroup
-rw-r-----. 1 abrt abrt 4 Aug 21 2019 type
-rw-r-----. 1 abrt abrt 90 Aug 21 2019 reason
-rw-r-----. 1 abrt abrt 39 Aug 21 2019 pwd
-rw-r-----. 1 abrt abrt 5 Aug 21 2019 pid
-rw-r-----. 1 abrt abrt 2072 Aug 21 2019 open_fds
-rw-r-----. 1 abrt abrt 48 Aug 21 2019 executable
-rw-r-----. 1 abrt abrt 14722 Aug 21 2019 environ
-rw-r-----. 1 abrt abrt 48 Aug 21 2019 cmdline
-rw-r-----. 1 abrt abrt 4 Aug 21 2019 analyzer
-rw-r-----. 1 abrt abrt 5 Aug 21 2019 abrt_version
-rw-r-----. 1 abrt abrt 886996992 Aug 21 2019 coredump
-rw-r-----. 1 abrt abrt 7 Aug 21 2019 username
-rw-r-----. 1 abrt abrt 1846076 Aug 21 2019 sosreport.tar.xz
-rw-r-----. 1 abrt abrt 0 Aug 21 2019 event_log
-rw-r-----. 1 abrt abrt 93 Aug 21 2019 machineid
-rw-r-----. 1 abrt abrt 378414 Aug 21 2019 core_backtrace
-rw-r-----. 1 abrt abrt 40 Aug 21 2019 uuid
-rw-r-----. 1 abrt abrt 1424 Aug 21 2019 dso_list
-rw-r-----. 1 abrt abrt 199 Aug 21 2019 var_log_messages
-rw-r-----. 1 abrt abrt 2 Jul 25 08:00 count
-rw-r-----. 1 abrt abrt 10 Jul 25 08:00 last_occurrence

[ccpp-2019-08-21-13:59:02-31929]# cat count

[ccpp-2019-08-21-13:59:02-31929]# cat last_occurrence
1595664006

[ccpp-2019-08-21-13:59:02-31929]# date -u -d @1595664006
Sat Jul 25 08:00:06 UTC 2020

[ccpp-2019-08-21-13:59:02-31929]# cat reason
Process /u01/app/xxx/server/bin/yyy was killed by signal 11 (SIGSEGV)
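
The same epoch-to-date conversion can be done in Python when `date -d` is not available. This is a minimal sketch; the function name epoch_to_utc is our own.

```python
from datetime import datetime, timezone


def epoch_to_utc(epoch):
    """Format a Unix epoch (as stored in last_occurrence) as a UTC timestamp."""
    moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S UTC")


if __name__ == "__main__":
    print(epoch_to_utc("1595664006"))
```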

ABRT Configuration Files

A standard ABRT installation currently provides the following ABRT-specific configuration files:

/etc/abrt/abrt.conf — allows you to modify the behavior of the abrtd service.
/etc/abrt/abrt-action-save-package-data.conf — allows you to modify the behavior of the abrt-action-save-package-data program.
/etc/abrt/plugins/CCpp.conf — allows you to modify the behavior of ABRT's core catching hook.

For example, the default location where problem data directories are created and in which problem core dumps and all other problem data are stored is:

/var/spool/abrt

[~]# cd /var/spool/abrt
[abrt]# ls -lrt
total 32
-rw-------. 1 root root 23 Mar 8 05:18 last-via-server
-rw-------. 1 root root 48 Jul 25 08:00 last-ccpp
drwxr-x---. 2 abrt abrt 4096 Jul 28 15:22 ccpp-2019-08-21-13:59:02-31929

Read [11] for all the details of ABRT configuration files.

References

1. ABRT Documentation (Release 2.14)
2. How to properly delete a report problem in ABRT
3. AUTOMATIC BUG REPORTING TOOL (ABRT)
4. Elements collected by ABRT
5. Basic ABRT components
6. abrt-action-analyze-backtrace
   Analyzes C/C++ backtrace, generates duplication hash, backtrace rating, and identifies crash function in problem directory DIR
   Then it saves this data as new elements global_uuid, rating, crash_function in this problem directory
7. abrt-backtrace
8. ABRT FAQ
9. ABRT Design
10. backtrace_rating (Red Hat doc)
    Numerical representation of quality of backtrace based on ratio of unrecognized frames among all backtrace frames
11. ABRT SPECIFIC CONFIGURATION

Thursday, April 26, 2018

How to Debug "java.io.IOException: Connection reset by peer"?

There are many reasons why WebLogic Server may throw the following exception:

java.io.IOException: Connection reset by peer

In this article, we will use one specific case for discussion.

Stack Trace

####<Apr 26, 2018, 8:42:37,381 AM UTC> <Error> <HTTP> <myserver> <CloudConsoleServer_MyServices> <[ACTIVE] ExecuteThread: '23' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <XaUWE0Co100000000> <1524732157381> <[severity-value: 8] [rid: 0:1:2] [partition-id: 0] [partition-name: DOMAIN] > <BEA-101019> <[ServletContext@15863685[app:cp-myservices.ear module:mycloud path:null spec-version:3.1 version:_18.2.4.0.0_180422.1400]] Servlet failed with an IOException.
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
    at weblogic.socket.NIOOutputStream$SingleBufferWrite.writeTo(NIOOutputStream.java:841)
    at weblogic.socket.NIOOutputStream$BlockingWriter.flush(NIOOutputStream.java:455)
    at weblogic.socket.NIOOutputStream$BlockingWriter.write(NIOOutputStream.java:334)
    at weblogic.socket.NIOOutputStream.write(NIOOutputStream.java:220)
    at weblogic.servlet.internal.ChunkOutput.writeChunkTransfer(ChunkOutput.java:625)
    at weblogic.servlet.internal.ChunkOutput.writeChunks(ChunkOutput.java:587)
    at weblogic.servlet.internal.ChunkOutput.flush(ChunkOutput.java:471)
    at weblogic.servlet.internal.ChunkOutput$3.checkForFlush(ChunkOutput.java:757)
    at weblogic.servlet.internal.ChunkOutput.write(ChunkOutput.java:373)
    at weblogic.servlet.internal.ChunkOutputWrapper.write(ChunkOutputWrapper.java:165)
    at weblogic.servlet.internal.ServletOutputStreamImpl.write(ServletOutputStreamImpl.java:186)
    at java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:167)
    at oracle.adfinternal.view.faces.caching.filter.ResponseOutputStream.writeContentTo(ResponseOutputStream.java:74)
    at oracle.adfinternal.view.faces.caching.filter.AdfFacesCachingResponse._flushContent(AdfFacesCachingResponse.java:147)
    at oracle.adfinternal.view.faces.caching.filter.AdfFacesCachingResponse.flush(AdfFacesCachingResponse.java:136)

How to Debug

In this case, our server is connected to many applications on other servers. So, the suspect peer could be either a browser or an application running in our infrastructure.

Given the stack trace, the first thing to do is to look for clues in it. For example, in this case, we saw:

weblogic.servlet.internal.ServletOutputStreamImpl.write(ServletOutputStreamImpl.java:186)
java.io.ByteArrayOutputStream.writeTo(ByteArrayOutputStream.java:167)
oracle.adfinternal.view.faces.caching.filter.ResponseOutputStream.writeContentTo(ResponseOutputStream.java:74)
oracle.adfinternal.view.faces.caching.filter.AdfFacesCachingResponse._flushContent(AdfFacesCachingResponse.java:147)
oracle.adfinternal.view.faces.caching.filter.AdfFacesCachingResponse.flush(AdfFacesCachingResponse.java:136)

which means that the WebLogic server was writing the servlet response back to the client when the IOException was thrown.

If this client was a browser, what happened could be:

The browser shut down the connection: either the browser crashed, or it closed the connection explicitly because the user closed the page or cancelled navigation on it.

If this client was an application (e.g., Selenium or another testing tool), what happened could be:

The tool timed out waiting for the response and closed the socket. In that case, its logs may show that it closed the socket after some time.
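
To see the mechanism in isolation, the following Python sketch reproduces it on the loopback interface: a client abruptly closes its socket while the server is still writing, and the server's write fails. It is a demonstration under assumptions, not WebLogic's behavior: it assumes a POSIX system, uses SO_LINGER with a zero timeout to force a reset on close, and the function name provoke_reset is our own.

```python
import socket
import struct
import time


def provoke_reset():
    """Write to a peer that has reset the connection; return the exception name."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # any free port on loopback
    listener.listen(1)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    server_side, _ = listener.accept()
    # Linger enabled with a zero timeout: close() sends RST instead of FIN.
    client.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    client.close()
    time.sleep(0.2)  # give the reset time to arrive
    try:
        for _ in range(10):
            server_side.sendall(b"x" * 65536)  # the "response" being written
            time.sleep(0.05)
        return "no error"
    except (ConnectionResetError, BrokenPipeError) as exc:
        return type(exc).__name__
    finally:
        server_side.close()
        listener.close()


if __name__ == "__main__":
    print(provoke_reset())
```

The point of the sketch is that the error surfaces on the writer's side, in the middle of a response, even though the cause is entirely on the peer's side.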

HTTP Keep-Alive

In [1], the author surmised that it could be:

" Very likely an issue with HTTP keepalive (persistent connections)."

However, this is not our case because:

Keep-alive is to make sure the socket stays open between requests. Our case is in the middle of a request, so there is no keepalive in use at that time. But conceptually it is sort of the same thing: if the client (i.e., browser) decides that the response isn't coming, it closes the socket.

If You Find Out Who's the Peer

Let's assume the client is another Linux application. Here are possible debugging steps:

The only thing to check at the system level is whether the machine was up the entire time: you can check its uptime and look at dmesg for messages about the link going up or down. Otherwise, the application logs may tell you whether the process restarted or crashed, which is the more likely cause.

Could tcpdump help in this case? Probably not because

There will probably be too much data from a tcpdump unless you know how to filter what you are looking for.

References

Possible Causes for "Connection reset by peer" when using NIO

Sunday, April 16, 2017

Idiosyncrasies of ${HOME} that is an NFS Share

NFS is perhaps best for more 'permanent' network mounted directories such as /homedir or regularly accessed shared resources. In this article, we will cover the following topics:

Set up NFS share via automounter
Idiosyncrasies of /homedir that is an NFS share

Automounter

One drawback to using /etc/fstab is that, regardless of how infrequently a user accesses the NFS mounted file system, the system must dedicate resources to keep the mounted file system in place. This is not a problem with one or two mounts, but when the system is maintaining mounts to many systems at one time, overall system performance can be affected.

An alternative to /etc/fstab is to use the kernel-based automount utility. An automounter consists of two components:^[1]

A kernel module, which implements a file system
A user-space daemon, which performs all of the other functions

The automount utility can mount and unmount NFS file systems automatically (on-demand mounting), thereby saving system resources. It can also be used to mount other file systems, including AFS, SMBFS, CIFS, and local file systems.

${HOME}

When your home directory is an automounted NFS share, it behaves differently from a local file system because it is shared. For example, you could run into the following two issues:

cp: cannot stat "KeePass-2.14.zip": Permission denied^[2]
".bashrc" E509: Cannot create backup file (add ! to override)

In the sections below, we will discuss these two issues in more detail.

cp: cannot stat "KeePass-2.14.zip" : Permission denied

In [2], the author describes an issue in which she tried to copy a file from her home directory to /usr:

$ chmod 777 KeePass-2.14.zip
$ cp KeePass-2.14.zip /usr/keepass/
cp: cannot create regular file `/usr/keepass/KeePass-2.14.zip': Permission denied
$ sudo cp KeePass-2.14.zip /usr/keepass/
cp: cannot stat `KeePass-2.14.zip': Permission denied

However, sudo cp can't stat KeePass-2.14.zip because ${HOME} is on an NFS mount and the NFS server doesn't grant root on your machine root permission on the NFS share.

To work around this "cannot stat: Permission denied" issue, copy the file to another directory (e.g., /tmp) first:

cp KeePass-2.14.zip /tmp
sudo cp /tmp/KeePass-2.14.zip /usr/keepass/

".bashrc" E509: Cannot create backup file (add ! to override)

One time when I edited and saved my $HOME/.bashrc, the editor reported the following message:

".bashrc" E509: Cannot create backup file (add ! to override)

Then I used the df command to check the disk space available in my home directory:

$ df -h .
Filesystem            Size  Used Avail Use% Mounted on
server1:/export/home4/myusername
                      5.0T  1.4T  3.7T  28% /home/myusername

It showed that there was still plenty of space. However, because the NFS share hosts the home directories of many users, every user is assigned a disk quota. To find out how much quota you have been assigned for your home directory, you can run:

$ quota -Q -s
Disk quotas for user myusername (uid 40000):
     Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
server1:/export/home4/myusername
                  1624M   2048M   2048M               0       0       0

So, to resolve this issue, you can simply remove junk files from your home directory to free up some quota for saving the file.
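
To see how much room is left under the quota, the human-readable sizes printed by `quota -s` can be converted to bytes and subtracted. The Python sketch below makes two assumptions: the suffixes K, M, G, and T are powers of 1024, and a bare number is a count of 1 KiB blocks. The function names are hypothetical.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def to_bytes(size):
    """Convert a size such as '1624M' to bytes."""
    unit = size[-1].upper()
    if unit in UNITS:
        return int(float(size[:-1]) * UNITS[unit])
    return int(size) * 1024  # assumption: bare numbers are 1 KiB blocks


def quota_headroom(used, quota):
    """Return how many bytes can still be written before the quota is reached."""
    return to_bytes(quota) - to_bytes(used)


if __name__ == "__main__":
    left = quota_headroom("1624M", "2048M")
    print("%d MiB left" % (left // UNITS["M"]))
```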

References

Saturday, March 19, 2016

Linux: How to Read Large Text File—/var/log/messages

To support Cloud Services, IaaS is the hardware and software that powers it all – servers, storage, networks, operating systems. These days Linux (or Windows) servers used in IaaS are more and more powerful. Hence they also generate more log files.

Very often we will run into message files larger than 1 GB. Log files can normally be viewed with regular text editors; however, most text editors cannot handle files over a certain size.

In this article, we will cover how to read large message files (e.g., /var/log/messages) generated on Linux systems.

/var/log/messages

To debug issues in Cloud environments, it's essential for you to know where the log files are and what is contained in each log file. On Linux servers, over a dozen log files are located in /var/log directory. Here we only focus on one of them:

/var/log/messages^[7]

This log aims at storing "general system activity" messages.

There are several things that are logged in /var/log/messages including mail, cron, daemon, kern, auth, etc.
The severity of messages could be

[INFO]
[DEBUG]
[WARNING]
[ERR]
etc

Older message files are archived periodically with their name annotated with the date.

If your Linux system uses the rsyslogd utility, its configuration file is

/etc/rsyslog.conf

in which you can specify logging rules (i.e., selector + action). For example, you can log anything of level info or higher, except mail, cron, and private authentication messages:

*.info;mail.none;authpriv.none;cron.none /var/log/messages

Matching messages are then written to the file /var/log/messages.
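
To illustrate how such a selector is evaluated, here is a simplified Python sketch. It handles only the forms used in the rule above (facility.priority, the * wildcard, and none) and assumes that a later part of the selector overrides an earlier one for the facilities it names. It is not rsyslog's implementation, and the function name selector_matches is our own.

```python
# Syslog severities from least to most severe.
SEVERITIES = ["debug", "info", "notice", "warning", "err", "crit", "alert", "emerg"]


def selector_matches(selector, facility, severity):
    """Return True if a message with this facility and severity is selected."""
    matched = False
    for part in selector.split(";"):
        facilities, _, priority = part.partition(".")
        if facilities != "*" and facility not in facilities.split(","):
            continue  # this part says nothing about our facility
        if priority == "none":
            matched = False
        else:
            matched = (priority == "*"
                       or SEVERITIES.index(severity) >= SEVERITIES.index(priority))
    return matched


if __name__ == "__main__":
    rule = "*.info;mail.none;authpriv.none;cron.none"
    for facility, severity in [("kern", "warning"), ("mail", "err"), ("daemon", "debug")]:
        print(facility, severity, selector_matches(rule, facility, severity))
```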

Limitations of Text Editors

Some editors limit the size of the text files they support. For example, the following popular editors on Windows have documented limits:

Notepad^[3]

64 kilobytes (KB)

Wordpad^[4]

It is said to have no size limit, but the real problem is performance.
Depending on the version of Wordpad, some people say it can handle files of up to 20 MB without performance issues.

Textpad^[8]

It can handle file sizes up to the largest contiguous chunk of 32-bit virtual memory.

Solutions

Basically, there are two ways to deal with large text files:

Find a more capable text editor
Divide and conquer

If you search the web for "large text file", you will find many suggestions for large text file readers. Some editors may be able to open and read large text files, but operations such as searching for a pattern can be slow.

On Linux systems, a good approach is "divide and conquer" using the split command, for example:

split -b1000m messages-20160315T2201 split-messages

After splitting, a good text editor such as Textpad will be able to read each 1000 MB piece easily.
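
One caveat of splitting by byte count is that a log line can be cut in half at a chunk boundary. As an alternative, the Python sketch below splits by line count while streaming, so the file is never loaded into memory as a whole. The function name and the chunk naming scheme are our own choices.

```python
def split_by_lines(path, lines_per_chunk, prefix):
    """Split a file into chunks of at most lines_per_chunk lines.

    Chunks are written as prefix0000, prefix0001, ... and their paths returned.
    """
    chunk_paths = []
    out = None
    with open(path, "rb") as src:  # binary mode: no decoding errors on odd bytes
        for number, line in enumerate(src):
            if number % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                chunk_path = "%s%04d" % (prefix, len(chunk_paths))
                out = open(chunk_path, "wb")
                chunk_paths.append(chunk_path)
            out.write(line)
    if out is not None:
        out.close()
    return chunk_paths


if __name__ == "__main__":
    import sys
    # Example: python split_lines.py messages-20160315T2201 1000000 split-messages
    print(split_by_lines(sys.argv[1], int(sys.argv[2]), sys.argv[3]))
```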

References

Wednesday, December 30, 2015

Jumbo Frames—Design Considerations for Efficient Network

Each network has some maximum packet size, or maximum transmission unit (MTU). Ultimately there is some limit imposed by the technology, but often the limit is an engineering choice or even an administrative choice.^[1]

Many Gigabit Ethernet switches and Gigabit Ethernet network interface cards support jumbo frames.^[2] Enabling jumbo frames (MTU: 9000) has performance benefits. However, existing transmission links may still impose a smaller MTU (e.g., 1500). This can cause issues along transit paths, which we refer to here as MTU mismatch.

In this article, we will examine issues manifested by MTU mismatch and their design considerations.

How to Accommodate MTU Differences

When a host on the Internet wants to send some data, it must know how to divide the data into packets. In particular, it needs to know the maximum packet size.

Jumbo frames are Ethernet frames with more than 1500 bytes of payload.^[3] Conventionally, jumbo frames can carry up to 9000 bytes of payload, but variations exist and some care must be taken using the term. In this article, we will use MTU: 9000 and MTU: 1500 as our examples to discuss MTU-mismatch issues.

Issues

The MTU is a maximum: you tell a network device not to drop frames unless they are larger than that maximum. A device with an MTU of 1500 can still communicate with a device with an MTU of 9000. However, when large packets are sent from an MTU-9000 device to an MTU-1500 device, the following happens:

If the DF (Don't Fragment) flag is set

Packets are dropped, and the router is required to return an ICMP Destination Unreachable message to the source of the datagram, with the code indicating "fragmentation needed and DF set"

If the DF flag is not set

Packets are fragmented to accommodate the MTU difference, which comes at a cost^[4]
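
To get a feel for that cost, the number of fragments can be estimated with a few lines of Python. The sketch assumes IPv4 with a 20-byte header and no options, and uses the fact that every fragment except the last must carry a payload that is a multiple of 8 bytes. The function name fragment_count is our own.

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options


def fragment_count(packet_len, mtu):
    """Return how many IPv4 fragments a packet of packet_len bytes needs."""
    payload = packet_len - IPV4_HEADER
    # Fragment payloads (except the last) must be multiples of 8 bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil(payload / per_fragment)


if __name__ == "__main__":
    print(fragment_count(9000, 1500))
```

Under these assumptions, a 9000-byte packet crossing an MTU-1500 link is split into 7 fragments, each carrying its own IP header.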

How to Test Potential MTU Mismatch

The ping, tracepath, or traceroute (with the --mtu option) command can be used to test for potential MTU mismatches.

For example, you can verify that the path between two end nodes has at least the expected MTU using the ping command:

ping -M do -c 4 -s 8972 <destination>

The -M do option causes the DF flag to be set.

The -c option sets the number of pings.

The -s option specifies the number of bytes of padding to add to the echo request. In addition to this number there will be 20 bytes for the IP header and 8 bytes for the ICMP header. The amount of padding should therefore be 28 bytes less than the network-layer MTU that you are trying to test (9000 − 28 = 8972).
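
If you test several MTU values, it is easy to get the padding arithmetic wrong. The small Python sketch below builds the ping argument list for a given MTU; the function name ping_mtu_command is hypothetical, and it assumes IPv4 without IP options.

```python
IPV4_HEADER = 20  # bytes, no IP options
ICMP_HEADER = 8   # bytes


def ping_mtu_command(mtu, destination, count=4):
    """Build the ping arguments that probe a path for the given MTU."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    return ["ping", "-M", "do", "-c", str(count), "-s", str(payload), destination]


if __name__ == "__main__":
    print(" ".join(ping_mtu_command(9000, "10.252.136.96")))
```

The resulting list can be passed to subprocess.run as is.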

If the test is unsuccessful, then you should see an error in response to each echo request:

$ ping -M do -c 4 -s 8972 10.252.136.96

PING 10.252.136.96 (10.252.136.96) 8972(9000) bytes of data.
From 10.249.184.27 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 10.249.184.27 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 10.249.184.27 icmp_seq=1 Frag needed and DF set (mtu = 1500)
From 10.249.184.27 icmp_seq=1 Frag needed and DF set (mtu = 1500)

--- 10.252.136.96 ping statistics ---
0 packets transmitted, 0 received, +4 errors

Similarly, you can use the tracepath command to test:

$ tracepath -n -l 9000 <destination>

The -n option disables host name lookup (i.e., IP addresses are printed numerically only).

The -l option sets the initial packet length to pktlen instead of 65536 for tracepath or 128000 for tracepath6.

The last line of the tracepath output summarizes information about the whole path to the destination: it shows the detected path MTU, the number of hops to the destination, and a guess at the number of hops from the destination back to us, which can differ when the path is asymmetric.

/* a packet of length 9000 cannot reach its destination */
$ tracepath -n -l 9000 10.249.184.27
1: 10.241.71.129 0.630ms
2: 10.241.152.60 0.577ms
3: 10.241.152.0 0.848ms
4: 10.246.1.49 1.007ms
5: 10.246.1.106 0.783ms
6: no reply
...
31: no reply
Too many hops: pmtu 9000
Resume: pmtu 9000

/* a packet of length 1500 reached its destination */
$ tracepath -n -l 1500 10.249.184.27
1: 10.241.71.129 0.502ms
2: 10.241.152.62 0.419ms
3: 10.241.152.4 0.543ms
4: 10.246.1.49 0.886ms
5: 10.246.1.106 0.439ms
6: 10.249.184.27 0.292ms reached
Resume: pmtu 1500 hops 6 back 59
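
When checking many paths, the summary line can be extracted automatically. The following Python sketch parses the "Resume:" line in the format shown in the two outputs above; the function name parse_tracepath_summary is our own.

```python
import re

SUMMARY = re.compile(r"Resume: pmtu (\d+)(?: hops (\d+))?(?: back (\d+))?")


def parse_tracepath_summary(output):
    """Return pmtu, hops, and back from the last 'Resume:' line, or None."""
    match = None
    for line in output.splitlines():
        match = SUMMARY.search(line) or match
    if match is None:
        return None
    pmtu, hops, back = match.groups()
    return {
        "pmtu": int(pmtu),
        "hops": int(hops) if hops else None,  # None: destination not reached
        "back": int(back) if back else None,
    }


if __name__ == "__main__":
    print(parse_tracepath_summary("Resume: pmtu 1500 hops 6 back 59"))
```

A result without a hops value, as in the 9000-byte run above, indicates that the destination was never reached at that packet size.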

When to Enable Jumbo Frames?

Enabling jumbo frame mode (for example, on Gigabit Ethernet network interface cards) can offer the following benefits:

Less bandwidth consumed by non-data protocol overhead, which increases network throughput
A lower packet rate, which reduces server overhead

A large MTU allows the operating system to send fewer, larger packets to reach the same network throughput.
For example, you will see a decrease in CPU usage when transferring large files.

These factors are especially important in speeding up NFS or iSCSI traffic, which normally has a larger payload size.
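
Both benefits can be quantified with some simple arithmetic. The Python sketch below assumes TCP over IPv4 without options (40 bytes of headers per packet) and 38 bytes of per-frame Ethernet overhead on the wire (preamble, start delimiter, header, FCS, and inter-frame gap). These constants and the function names are assumptions of this example.

```python
import math

ETHERNET_OVERHEAD = 38  # preamble 7 + SFD 1 + header 14 + FCS 4 + inter-frame gap 12
IP_TCP_HEADERS = 40     # IPv4 20 + TCP 20, no options


def tcp_efficiency(mtu):
    """Fraction of the bytes on the wire that is TCP payload."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETHERNET_OVERHEAD)


def frames_needed(total_bytes, mtu):
    """Number of full-size frames needed to carry total_bytes of TCP payload."""
    return math.ceil(total_bytes / (mtu - IP_TCP_HEADERS))


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(mtu, round(tcp_efficiency(mtu), 4), frames_needed(10 ** 9, mtu))
```

Under these assumptions, payload efficiency rises from about 94.9% to about 99.1%, and transferring 1 GB takes roughly one sixth as many frames with an MTU of 9000.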

Design Considerations

When jumbo frame mode is enabled, the trade-offs include:

Bigger I/O buffer

Required for both end nodes and intermediate transit nodes

MTU mismatch

May cause IP fragmentation or even loss of data

Therefore, some design considerations are required. For example, you can:

Avoid situations where jumbo-frame-enabled host NICs talk to non-jumbo-frame-enabled host NICs.

One design trick is to send your NFS or iSCSI traffic via a dedicated NIC and your normal host traffic via a non-jumbo-MTU interface.

If your workload only includes small messages, the larger MTU size will not help.

Be sure to test with commands that set the Don't Fragment (DF) bit, to confirm that hosts configured for jumbo frames can actually communicate with each other using jumbo frames.

Enable Path MTU Discovery (PMTUD)^[18]

When possible, use the largest MTU size that the adapter and network support, constrained by the Path MTU.
Make sure the packet filter on your firewall handles ICMP packets correctly, since PMTUD depends on ICMP "Frag needed" messages getting through.

RFC 4821, Packetization Layer Path MTU Discovery, describes a Path MTU Discovery technique which responds more robustly to ICMP filtering.

Be aware of extra non-data protocol overhead if you configure encapsulation such as GRE tunneling or IPsec encryption.
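
Two of the considerations above come down to simple header arithmetic. The Python sketch below is only an illustration: the helper names are hypothetical, the header sizes assume IPv4 and TCP without options and a basic 4-byte GRE header, and IPsec is left out because its overhead depends on mode and cipher.

```python
IPV4_HEADER = 20   # bytes, no IP options
ICMP_HEADER = 8    # bytes, ICMP echo header
TCP_HEADER = 20    # bytes, no TCP options
GRE_HEADER = 4     # bytes, basic GRE without key/checksum/sequence (assumption)


def ping_payload_for_mtu(mtu: int) -> int:
    """Largest `ping -s` value that still fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def df_ping_command(host: str, mtu: int, count: int = 4) -> str:
    """Build a Linux (iputils) ping command line with the Don't Fragment bit set."""
    return f"ping -M do -c {count} -s {ping_payload_for_mtu(mtu)} {host}"


def gre_tunnel_mtu(link_mtu: int) -> int:
    """MTU left for the inner packet after the outer IPv4 and GRE headers."""
    return link_mtu - IPV4_HEADER - GRE_HEADER


def tcp_mss(mtu: int) -> int:
    """TCP maximum segment size that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    print(df_ping_command("10.249.184.27", 9000))
    print(gre_tunnel_mtu(1500), tcp_mss(gre_tunnel_mtu(1500)))
```

For a 9000-byte MTU the DF test should therefore use a payload of 8972 bytes, and a plain GRE tunnel over a 1500-byte link leaves 1476 bytes for the inner packet, which corresponds to a TCP MSS of 1436.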

References

The TCP Maximum Segment Size and Related Topics
Jumbo/Giant Frame Support on Catalyst Switches Configuration Example
Ethernet Jumbo Frames
IP Fragmentation: How to Avoid It? (Xml and More)
The Great Jumbo Frames Debate
Resolve IP Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPSEC
Sites with Broken/Working PMTUD
Path MTU Discovery
TCP headers
bad TCP checksums
MSS performance consideration
Understanding Routing Table
route (Linux man page)
Docker should set host-side veth MTU #4378
Add MTU to lxc conf to make host and container MTU match
Xen Networking
TCP parameter settings (/proc/sys/net/ipv4)
Change the MTU of a network interface

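To enable TCP MTU probing (packetization-layer PMTUD, which Linux turns on once an ICMP black hole is detected when the value is 1), type: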

echo 1 > /proc/sys/net/ipv4/tcp_mtu_probing
echo 1024 > /proc/sys/net/ipv4/tcp_base_mss

MTU manipulation
Jumbo Frames, the gotcha's you need to know! (good)
Understand container communication (Docker)
calicoctl should allow configuration of veth MTU #488 - GitHub
Linux MTU Change Size
Changing the MTU size in Windows Vista, 7 or 8
Linux Configure Jumbo Frames to Boost Network Performance
Path MTU discovery in practice
Odd tracepath and ping behavior when using a 9000 byte MTU
How to Read a Traceroute

Monday, August 12, 2013

How to Investigate: Failed to Bind to Port on Linux

In the WebLogic server log file (i.e., CRMCommonServer_1.log), I found the following messages:

####<Aug 12, 2013 10:40:43 AM PDT> <Emergency> <Security> <myserver> <CRMCommonServer_1> <[STANDBY] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1376329243268> <BEA-090087> <Server failed to bind to the configured Admin port. The port may already be used by another process.>
####<Aug 12, 2013 10:40:43 AM PDT> <Error> <Server> <myserver> <CRMCommonServer_1> <DynamicListenThread[Default]> <<WLS Kernel>> <> <> <1376329243268> <BEA-002606> <Unable to create a server socket for listening on channel "Default". The address 10.241.88.31 might be incorrect or another process is using port 9004: java.net.BindException: Address already in use.>

In this article, I will show you how to investigate:

Which process is using port 9004?

Netstat Command on Linux

To investigate a failed-to-bind-to-port issue, netstat comes in handy on Linux systems. The netstat command can be used to:

Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships

In this detective work, we have used the following options:

-a, --all

Show both listening and non-listening sockets. With the --interfaces option, show interfaces that are not marked

-p, --program

Show the PID and name of the program to which each socket belongs.

The results are shown below:

$ netstat -ap | grep 9004
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 myserver.us.ora:interserver myserver.oracle.com:9004 ESTABLISHED 12550/oidldapd
tcp 0 0 myserver.us.oracle.com:9004 myserver.ora:interserver ESTABLISHED 22328/java

From the output, we know a Java application (i.e., process 22328) is using port 9004. Once a socket is bound to that port, no other socket can be bound to port 9004 as long as the first socket remains open. To find out which application it is, we check that process's command line:

$ vi /proc/22328/cmdline

On the command line, we have found the following information:

-Dweblogic.Name=AdminServer

Also, BIDomain was mentioned there. So, that process is the AdminServer of BIDomain.
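
The content of /proc/<pid>/cmdline is a list of arguments separated by NUL bytes, which is why it looks run together in an editor. As a hedged alternative to reading it by eye, the Python sketch below splits the arguments and collects the -D system properties. All three helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def split_cmdline(raw: bytes) -> List[str]:
    """Split the NUL-separated content of /proc/<pid>/cmdline into arguments."""
    return [arg.decode(errors="replace") for arg in raw.split(b"\0") if arg]


def java_properties(args: List[str]) -> Dict[str, str]:
    """Collect -Dname=value JVM options into a dictionary."""
    props: Dict[str, str] = {}
    for arg in args:
        if arg.startswith("-D") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            props[name] = value
    return props


def read_cmdline(pid: int) -> List[str]:
    """Read the arguments of a running process (Linux only)."""
    return split_cmdline(Path(f"/proc/{pid}/cmdline").read_bytes())
```

For the process in this post, java_properties(read_cmdline(22328)).get("weblogic.Name") would be expected to give AdminServer, matching what we found by hand.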

Port 7020

Similarly, we saw that port 7020 was reported as in use in another server's log file:

<BEA-002606> <Unable to create a server socket for listening on channel "Default". The address 10.241.88.31 might be incorrect or another process is using port 7020: java.net.BindException: Address already in use.>

When we tried:

# netstat -ap |grep 7020

no entries were returned. However, with:

# netstat -an |grep 7020

we found one entry:

tcp 0 0 ::ffff:10.241.88.31:7020 :::* LISTEN

In this case, we need to use the following command line:

# netstat -ap --numeric-ports |grep 7020

tcp 0 0 slcag044.us.oracle.com:7020 *:* LISTEN 21696/java

So, we know process 21696 is using port 7020. To investigate further, we typed:

# netstat -ap |grep 21696

tcp 0 0 slcag044.us.oracle.:dpserve *:* LISTEN 21696/java

It shows dpserve in place of 7020, because netstat translates port numbers into service names (from /etc/services) unless it is told to print them numerically. That is why our first search returned no entries. Port 7020 is registered for the dpserve service^[2,3].
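
The lookup above can also be scripted. The following Python sketch assumes the column layout that netstat -ap --numeric-ports printed above (proto, recv-q, send-q, local address, foreign address, state, PID/program). The function name owners_of_port is hypothetical.

```python
from typing import List, Optional, Tuple


def owners_of_port(netstat_output: str, port: int) -> List[Tuple[str, Optional[int], str]]:
    """Return (state, pid, program) for TCP sockets whose local address uses `port`.

    Expects lines as printed by `netstat -ap --numeric-ports`.
    """
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue  # header, warning text, or a line without an owner column
        local, state, owner = fields[3], fields[5], fields[6]
        # The port is whatever follows the last colon of the local address.
        if local.rsplit(":", 1)[-1] != str(port):
            continue
        pid_text, _, program = owner.partition("/")
        pid = int(pid_text) if pid_text.isdigit() else None  # "-" when not visible
        found.append((state, pid, program))
    return found
```

Because it compares the text after the last colon with the port number, it finds nothing in output where the port was replaced by a service name such as dpserve. This is the same trap we fell into with grep, so the input should always come from a run with --numeric-ports (or -n).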

Our Solution

In our case, we needed to re-order our start-up steps (see [4] for another approach). Instead of starting BIDomain first, we need to start it last. To fix our issue, we did the following:

Shut down BIDomain
Start up CRMDomain
Start up BIDomain
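
When start-up order matters as it does here, a start-up script can check that a port is free before launching the server that needs it. The Python sketch below is one possible way to do that and is not part of WebLogic. The name port_is_free is hypothetical. It works by attempting the same bind operation that failed with BindException in the log.

```python
import errno
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False  # same condition as "Address already in use"
            raise  # e.g. permission denied or an address not on this host
        return True
```

A script could call port_is_free("10.241.88.31", 9004) before starting CRMCommonServer_1. The check is only advisory, because another process can still grab the port between the check and the actual start-up.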
