Friday, March 25, 2016

How-to: When a Missing Python Module Error Was Thrown

When updating Python from 2 to 3, you may want to get familiar with the following topics first:
  1. Can you install multiple Python versions in Linux?
  2. What should you do when a missing module error is thrown?
  3. Learn about the search path used to locate modules in Python
  4. Know the differences between Python 2 and 3[1]
  5. How to resolve a missing Python module
    • ImportError: No module named 'encodings'
  6. Where is a specific Python module located?

Multiple Python Installations


In our system, we have both Python 2 and 3 installed under /usr/bin as such:
  • /usr/bin/python
  • /usr/bin/python3
To choose a specific version for your Python scripts, you can specify a shebang (or hashbang) line as follows:
#!/usr/bin/python3

Python2

$ python
Python 2.4.3 (#1, Feb 24 2012, 13:04:26)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.


Python3

$ python3
Python 3.5.1 (default, Mar 24 2016, 20:01:47)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux
Type "help", "copyright", "credits" or "license" for more information.

In the rest of the article, we will use Python 3.5.1 for illustration unless stated otherwise.  Read the section "Python 2 vs 3" to learn about the differences between them.

Missing Python Module


A Python module is a file (e.g., with a suffix such as .py, .pyc, or .pyo):[13,17]
  • Containing Python definitions and statements 
  • Can be imported in a script or in an interactive instance of the interpreter
    • Imported only once per interpreter session
      • Simply for efficiency reasons
      • If you change your modules, you must restart the interpreter
      • If it’s just one module you want to test interactively, you can also use importlib.reload().[14]
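
The "imported only once" behavior can be sketched as follows. This is a minimal illustration, not part of the original setup: the module name demo_mod and the temporary directory are made up for the demo.

```python
import importlib
import os
import sys
import tempfile

# Skip writing .pyc files so cached bytecode cannot interfere with the demo.
sys.dont_write_bytecode = True

# Create a throwaway module named demo_mod (hypothetical) in a temp directory.
tmp_dir = tempfile.mkdtemp()
mod_path = os.path.join(tmp_dir, "demo_mod.py")
with open(mod_path, "w") as f:
    f.write("VALUE = 1\n")

sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

import demo_mod
first = demo_mod.VALUE

# Change the module on disk.
with open(mod_path, "w") as f:
    f.write("VALUE = 2222\n")

import demo_mod                  # no effect: the module is already in sys.modules
after_reimport = demo_mod.VALUE

importlib.reload(demo_mod)       # re-executes the module's source file
after_reload = demo_mod.VALUE

print(first, after_reimport, after_reload)   # 1 1 2222
```

A second import statement is a lookup in sys.modules, which is why only reload() (or restarting the interpreter) picks up the edit.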
Oftentimes, you may run into a missing Python module, reported as an ImportError exception like this:

$ python3.5
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
In such cases, you may need to fix sys.path to include missing library paths.

sys.path


The sys.path variable stores a list of strings that specifies the search path for modules. It is initialized from these locations:
  1. The directory containing the input script (or the current directory when no file is specified)
  2. PYTHONPATH (a list of directory names)
    • With the same syntax as the shell variable PATH
  3. The installation-dependent default

A program is free to modify this list for its own purposes. Only strings and bytes should be added to sys.path; all other data types are ignored during import.  See also Module site — This describes how to use .pth files to extend sys.path.
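
As a rough sketch of how a program can modify this list, the snippet below appends a directory to sys.path only if it is not already present. The helper name add_search_dir and the path /opt/example/lib are hypothetical and only serve the illustration.

```python
import os
import sys


def add_search_dir(directory):
    """Append directory to sys.path unless it is already listed."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


# Hypothetical library directory; replace it with the path you are missing.
extra = add_search_dir("/opt/example/lib")

# Show the resulting search path, one entry per line.
for entry in sys.path:
    print(entry)
```

A change made this way lasts only for the running interpreter; for a permanent change, PYTHONPATH or a .pth file is the usual route.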

Python 2 vs 3  


In this section, we will show you how to display the value of sys.path from the command line without entering interactive mode.  To do that, we use print, which is a statement in Python 2 and a built-in function in Python 3.  As shown below, the syntax for invoking it therefore differs between the two versions.


Python 2

$ python -c 'import sys; print "\n".join(sys.path)'
/usr/lib64/python24.zip
/usr/lib64/python2.4
/usr/lib64/python2.4/plat-linux2
/usr/lib64/python2.4/lib-tk
/usr/lib64/python2.4/lib-dynload
/usr/lib64/python2.4/site-packages
/usr/lib64/python2.4/site-packages/Numeric
/usr/lib64/python2.4/site-packages/PIL
/usr/lib64/python2.4/site-packages/gtk-2.0
/usr/lib/python2.4/site-packages

Python 3

$ python3 -c 'import sys; print("\n".join(sys.path))'
/usr/lib/python35.zip
/usr/lib/python3.5
/usr/lib/python3.5/plat-linux
/usr/lib/python3.5/lib-dynload
/scratch/perf/.local/lib/python3.5/site-packages
/usr/lib/python3.5/site-packages

Where Is a Python Module Located?


When a module, say, encodings, is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches the list of directories given by the variable sys.path for a file named encodings.py or, as in this case, a package directory named encodings.

To find out where an imported module is located, you can use its attribute __file__.[11]  For example, module encodings is located under /usr/lib/python3.5 in our system:

$ python3
<snipped>
>>> import encodings
>>> print(encodings.__file__)
/usr/lib/python3.5/encodings/__init__.py
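
A module can also be located without importing it first. The sketch below uses importlib.util.find_spec from the standard library; the helper name module_location is made up for illustration, and json is used only as an example of a standard-library package.

```python
import importlib.util


def module_location(name):
    """Return where the import system would load `name` from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None          # the module cannot be found on sys.path
    return spec.origin       # a file path for modules loaded from source


print(module_location("json"))                 # path ending in json/__init__.py
print(module_location("no_such_module_xyz"))   # None
```

A None result here is the same situation that produces "ImportError: No module named ..." at import time.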

References

  1. What does “SyntaxError: Missing parentheses in call to 'print'” mean in Python?
    • This error message means that you are attempting to use Python 3 to follow an example or run a program that uses the Python 2 print statement:
  2. Install / Update Python 3.5.0 at Linux machine. (Youtube)
  3. Python 3.5.1 
  4. Python Module
  5. upgrade Python to 2.7.2
  6. How can I troubleshoot Python “Could not find platform independent libraries
  7. Py_Initialize: Unable to get the locale encoding in OpenSuse 12.3
  8. Environment Variables (Python)
  9. Python script header
  10. Standard modules (Python)
  11. How do I find the location of Python module sources?
  12. sys module — System-specific parameters and functions
  13. What do the python file extensions, .pyc .pyd .pyo stand for?
  14. How do I unload (reload) a Python module?
  15. Purpose of #!/usr/bin/python3 (important)
  16. shebang (or hashbang)
    • Under Unix-like operating systems, when a script with a shebang is run as a program, the program loader parses the rest of the script's initial line as an interpreter directive; the specified interpreter program is run instead, passing to it as an argument the path that was initially used when attempting to run the script.
  17. Importing Python Modules



Wednesday, March 23, 2016

How-to: Can't locate IPTables/IPv4/IPQueue.pm

In this article, we will cover the following topics:
  • How to resolve a missing Perl module issue
  • Know about CPAN (Comprehensive Perl Archive Network)
  • Learn how to configure the CPAN module (i.e., CPAN.pm)

Missing Perl Module


When a Perl script using IPTables::IPv4::IPQueue[1] was executed:
BEGIN { push @INC, "/scratch/perf/.../perl/5.8.8/x86_64-linux-thread-multi"; }
use strict;
use warnings;
use IPTables::IPv4::IPQueue qw(:constants);

It threw the following error message:[6]
  • Can't locate IPTables/IPv4/IPQueue.pm in @INC (@INC contains: ...
A Perl module is a package defined in a .pm file; in object-oriented Perl it plays the role of a class. Modules are organized using namespaces (much like Java packages), and the file structure mirrors the namespace structure. For instance, IPTables::IPv4::IPQueue could be located in your file system somewhere like:
  • /usr/local/lib64/perl5/auto/IPTables/IPv4/IPQueue
To resolve the missing module issue, you need to install it by entering:
cpan[1]> force install IPTables::IPv4::IPQueue
But, before you do it, make sure you understand the following sections first.

CPAN (Comprehensive Perl Archive Network)


CPAN is a software repository of more than 150,000 modules written in the Perl programming language. The modules can be downloaded from metacpan.org and also from mirror sites worldwide. The resources found on CPAN are easily accessible with the CPAN.pm module.

From metacpan.org home page, you can search for any Perl Module you need. For example, enter "IPTables::IPv4" in the search field. You will find the documentation for IPTables::IPv4 here.

CPAN Module (CPAN.pm)


To use the CPAN module, start the CPAN shell, which provides an interactive mode, in one of two ways:

perl -MCPAN -e shell
--or--
cpan

Configuration Steps


If you want to use CPAN.pm, lots of things have to be configured. So, when you use it the first time, you will be prompted to configure them. After the configuration, don't forget to commit by entering:

cpan[19]> o conf commit

This makes the configuration permanent. The configuration data is saved in the following file:
  • /usr/share/perl5/CPAN/Config.pm

Only one CPAN process can run at a time; this is enforced with the following lock file:
  • /root/.cpan/.lock


How to Connect to the Internet behind a Proxy


After the first-time configuration effort, you can still modify configured data by entering:
cpan[20]> o conf init
You will then be asked whether you would like to configure as much as possible automatically. To avoid going through all the configuration steps again, you can also specify which settings to configure. For example, if your server is behind a proxy server, you may run into the following issue:
  • As you did not allow me to connect to the internet you need to supply a valid CPAN URL now.


To work around this, you can configure a proxy for CPAN by entering:[4,5]

cpan[21]> o conf init /proxy/
If you're accessing the net via proxies, you can specify them in the CPAN configuration or via environment variables. The variable in the $CPAN::Config takes precedence.
Your ftp_proxy? []

At the "Your http_proxy? " prompt, we have entered the following:
  • http://146.xx.xx.29:80

and it works fine afterwards. Besides proxy configuration, you may also want to configure a urllist to specify which mirror(s) to use for downloading:

cpan[22]> o conf init urllist

The 235 registered sites around the world make up the N part of CPAN (the Network); you can find the full list here.

Saturday, March 19, 2016

Linux: How to Read Large Text File—/var/log/messages

IaaS is the hardware and software that powers Cloud Services: servers, storage, networks, and operating systems. These days the Linux (or Windows) servers used in IaaS are more and more powerful, and they also generate more log output.

Very often we run into message files larger than 1 GB. Log files can normally be viewed with a regular text editor. However, most text editors cannot open files beyond a certain size.

In this article, we will cover how to read large message files (e.g., /var/log/messages) generated on Linux systems.

/var/log/messages


To debug issues in Cloud environments, it's essential for you to know where the log files are and what is contained in each log file. On Linux servers, over a dozen log files are located in /var/log directory. Here we only focus on one of them:
  • /var/log/messages[7]
    • This log aims at storing "general system activity" messages.
      • There are several things that are logged in /var/log/messages including mail, cron, daemon, kern, auth, etc.
      • The severity of messages could be
        • [INFO]
        • [DEBUG]
        • [WARNING]
        • [ERR]
        • etc
    • Older message files are archived periodically with their name annotated with the date.
If your Linux system uses the rsyslogd utility, its configuration file is
/etc/rsyslog.conf
in which you can specify logging rules (i.e., selector + action). For example, the following rule logs anything of level info or higher, except mail, cron, and private authentication messages:
*.info;mail.none;authpriv.none;cron.none /var/log/messages
Here the action is to write the matching messages to the file /var/log/messages.
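
To make the selector semantics concrete, here is a simplified Python sketch of how such a selector could be evaluated. It is an illustration under stated assumptions, not rsyslog's actual implementation: it only understands the facility.priority form with "*" and "none", lets a later part override an earlier one, and ignores modifiers such as "=" and "!". The function name selector_matches is made up.

```python
# Syslog severities from least to most severe.
SEVERITIES = ["debug", "info", "notice", "warning", "err", "crit", "alert", "emerg"]


def selector_matches(selector, facility, severity):
    """Return True if a message with this facility/severity is selected."""
    level = SEVERITIES.index(severity)
    matched = False
    for part in selector.split(";"):
        fac, _, prio = part.strip().partition(".")
        if fac != "*" and fac != facility:
            continue                      # this part is about another facility
        if prio == "none":
            matched = False               # explicitly exclude this facility
        elif prio == "*":
            matched = True                # every severity
        else:
            matched = level >= SEVERITIES.index(prio)   # "prio or higher"
    return matched


rule = "*.info;mail.none;authpriv.none;cron.none"
print(selector_matches(rule, "kern", "warning"))   # True
print(selector_matches(rule, "mail", "err"))       # False
print(selector_matches(rule, "daemon", "debug"))   # False
```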

Limitations of Text Editors


Some editors limit the size of the text files they can open. For example, the following popular editors on Windows have these documented limitations:
  • Notepad[3]
    • 64 kilobytes (KB)
  • Wordpad[4]
    • It is said to have no size limit, but the real problem is performance.
    • Depending on the version of Wordpad, some people say it can support files of up to 20 MB without performance issues.
  • Textpad[8]
    • It can handle file sizes up to the largest contiguous chunk of 32-bit virtual memory.

Solutions


Basically, there are two ways of dealing with large text files:
  1. Find a more capable text editor
  2. Divide and conquer
If you search Google for "large text file", you will find many suggestions for large text file readers. Some editors may be able to open and read large text files. However, their performance (e.g., when searching for a pattern) could be slow.

On Linux systems, a good approach is "divide and conquer" using the split command. For example, the following cuts the file into pieces of at most 1000 MB each, named with the prefix split-messages:
split -b1000m messages-20160315T2201 split-messages
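
Where the split command is not available, the same idea can be sketched in Python. This is a minimal sketch, not a replacement for split: the function name split_file is made up, pieces get numeric suffixes (000, 001, ...) instead of split's aa, ab, ..., and, like split -b, it cuts on byte boundaries, so a line may be divided between two pieces.

```python
import os
import tempfile


def split_file(path, chunk_bytes, prefix, block=1024 * 1024):
    """Copy `path` into pieces of at most chunk_bytes; return the piece names."""
    pieces = []
    with open(path, "rb") as src:
        index = 0
        data = src.read(min(block, chunk_bytes))
        while data:
            piece = "%s%03d" % (prefix, index)
            written = 0
            with open(piece, "wb") as dst:
                while data:
                    dst.write(data)
                    written += len(data)
                    remaining = chunk_bytes - written
                    if remaining <= 0:
                        break             # this piece is full
                    data = src.read(min(block, remaining))
            pieces.append(piece)
            index += 1
            if data:                      # piece was full: start the next one
                data = src.read(min(block, chunk_bytes))
    return pieces


# Small demonstration on a 25-byte sample file (hypothetical file name).
tmp_dir = tempfile.mkdtemp()
sample = os.path.join(tmp_dir, "messages-sample")
with open(sample, "wb") as f:
    f.write(b"x" * 25)

pieces = split_file(sample, 10, os.path.join(tmp_dir, "split-messages"))
print([os.path.getsize(p) for p in pieces])   # [10, 10, 5]
```

Data is copied in blocks of at most 1 MB, so memory use stays small even when each piece is around 1000 MB.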

Tuesday, March 8, 2016

Excel: Get Every Third Row with Formula: INDEX and ROWS*3

I have used TextPad to clean up data with bookmarks and macros as described in [1].



The next task is to extract start time and end time from column A to calculate elapsed time of each individual event. Start time and End time are located in different rows:
  • Start Time: A1, A4, ..., A{N*3+1}
  • End Time: A3, A6, ..., A{N*3+3}
where N = 0 to 64.

This article follows an excellent video describing how to achieve the task using the INDEX and ROWS functions in Excel.

Formula Used


To retrieve the end time (rows 3, 6, 9, ...), here is the formula I have defined in cell J1:
=INDEX($A$1:$A$195, ROWS($J$1:J1)*3)

To retrieve the start time (rows 1, 4, 7, ...), the formula in cell K1 is defined as:
=INDEX($A$1:$A$195,ROWS($K$1:K1)*3-2)

Details of Formula


Dollar Sign ($)

When you copy J1 and paste it to J2, the formula will be changed from
=INDEX($A$1:$A$195, ROWS($J$1:J1)*3)
to
=INDEX($A$1:$A$195, ROWS($J$1:J2)*3)
The relative reference (i.e., J1) in the formula points to the cell that contains the formula. After being pasted to J2, it is automatically adjusted to J2. But notice that the following references:
$A$1
$A$195
$J$1
remain the same after being pasted into J2 because we have prefixed both their column and row with a dollar sign ($), which makes them absolute references. For example, we wrote $A$1 instead of A1.

INDEX Function

INDEX function has two forms:
  • Array form
    • INDEX(array, row_num, [column_num])
  • Reference form
    • INDEX(reference, row_num, [column_num], [area_num])
Our formula uses the array form by specifying a range of cells as:
$A$1:$A$195
and row_num is defined using ROWS function.


ROWS Function

ROWS function has the following syntax:
ROWS(array)
where array can be an array, an array formula, or a reference to a range of cells for which you want the number of rows.

Our formula defines array as a range of cells. For example, in cell J2, the range is J1:J2 and, in cell J3, the range is J1:J3. In other words, we just count the number of rows from the first cell in column J to the current cell. Then we multiply that count by three to retrieve every third row from the array.
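
The arithmetic behind the two formulas can be checked with a small Python sketch. The functions index and rows below only mimic the single-column behavior of Excel's INDEX and ROWS for this illustration, and the column contents are placeholder labels rather than real data.

```python
def index(array, row_num):
    """Mimic INDEX(array, row_num) for a single column (row_num is 1-based)."""
    return array[row_num - 1]


def rows(first_row, current_row):
    """Mimic ROWS($J$1:Jn): the number of rows from first_row to current_row."""
    return current_row - first_row + 1


# Placeholder for A1:A195; each cell just holds its own address.
column_a = ["A%d" % r for r in range(1, 196)]

# ROWS(...)*3 picks rows 3, 6, 9, ...; ROWS(...)*3-2 picks rows 1, 4, 7, ...
every_third = [index(column_a, rows(1, n) * 3) for n in range(1, 66)]
every_third_minus_2 = [index(column_a, rows(1, n) * 3 - 2) for n in range(1, 66)]

print(every_third[:3])           # ['A3', 'A6', 'A9']
print(every_third_minus_2[:3])   # ['A1', 'A4', 'A7']
```

With 65 formula rows, the largest row number requested is 65*3 = 195, which is exactly the last row of the range $A$1:$A$195.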

References

  1. TextPad: How to Remove All Lines Except the Ones Containing a Pattern
  2. Excel Magic Trick 1142: Get Every Other Row with Formula: INDEX and ROWS*2
  3. EXCEL TIP: The dollar sign ($) in a formula - Fixing cell references


Tuesday, March 1, 2016

SSH: How to Simplify Connection Using Configuration Files

ssh (SSH client) is a program for logging into a remote machine and for executing commands on a remote machine. ssh obtains configuration data from the following sources in the following order (for each parameter, the first obtained value will be used):
  1. command-line options
    • If a configuration file is given on the command line (i.e., ssh -F configfile), the system-wide configuration file (/etc/ssh/ssh_config) will be ignored
  2. user's configuration file
    • ~/.ssh/config
  3. system-wide configuration file
    • /etc/ssh/ssh_config
In this article, we will focus on the specifications of directives via ssh's configuration file (specifically user's configuration file).

Advantages of Using Configuration File


There are some advantages to using a configuration file to specify ssh directives:
  1. Can use shorthand to avoid long keystrokes
  2. Avoid mistakes
    • Especially when you have lots of parameters to be specified and/or some of them using non-standard connection values.
  3. Can provide options in different scopes (per-host vs per-user)

User's Configuration File


Here are the sample contents from a user's configuration file (i.e., ~/.ssh/config):
Host dev
    HostName dev.example.com
    Port 22000
    User fooey
Host github.com
    IdentityFile ~/.ssh/github.key
Instead of specifying:
  • ssh fooey@dev.example.com -p 22000

now you can just use the shorthand "dev" and the options will be read from the configuration file:
  • ssh dev

An ssh session will normally prompt you for a password. However, you can also set up public/private keys for password-less logins.[4]

Format of Configuration File


To get you started, here are the basics:[3]
  • Section
    • Separated by "Host" specifications
      • A single ‘*’ as a pattern can be used to provide global defaults for all hosts
        • See here for more information on patterns
      • The matched host name is usually the one given on the command line
  • Comment
    • Empty lines and lines starting with ‘#’ are comments.
  • Keyword
    • Case-insensitive
    • Examples
      • Host, Match, etc.
  • Directive/Argument
    • Directive
      • Used to specify session details including:
        • Identity
          • Username
        • Bind address
          • [bind_address:]port:host:hostport
        • Address family
          • “any”, “inet” (use IPv4 only), or “inet6” (use IPv6 only)
        • Other options
      • See the directive reference here
    • Argument
      • Arguments are case-sensitive
      • Arguments may optionally be enclosed in double quotes (") in order to represent arguments containing spaces.
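
The basic format rules above can be illustrated with a small Python parser. This is only a sketch under simplifying assumptions, not how ssh itself reads the file: it ignores Match blocks, the keyword=argument form, pattern matching, and any options that appear before the first Host line. The function name parse_ssh_config is made up.

```python
def parse_ssh_config(text):
    """Return {host_pattern: {lowercased_keyword: argument}} for simple configs."""
    sections = {}
    current = []                       # option dicts of the current Host line
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                   # empty lines and comments are skipped
        parts = line.split(None, 1)
        keyword = parts[0].lower()     # keywords are case-insensitive
        argument = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "host":
            # A Host line starts a new section; it may list several patterns.
            current = [sections.setdefault(p, {}) for p in argument.split()]
        else:
            for options in current:
                # Arguments keep their case; surrounding double quotes are dropped.
                options.setdefault(keyword, argument.strip('"'))
    return sections


sample = """
Host dev
    HostName dev.example.com
    Port 22000
    User fooey
Host github.com
    IdentityFile ~/.ssh/github.key
"""

config = parse_ssh_config(sample)
print(config["dev"]["hostname"])   # dev.example.com
print(config["dev"]["port"])       # 22000
```

setdefault keeps the first value seen for a keyword within a section, echoing ssh's rule that the first obtained value is used.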

/var/log/secure


Linux has an extensive set of log files under the /var/log directory.[5] This directory is the central place where all applications and programs put their log files. Most log files are text files that can be viewed using a standard text editor.

/var/log/secure – This file contains all security related messages on the system. This includes authentication failures, possible break-in attempts, SSH logins, failed passwords, sshd logouts, invalid user accounts etc.

-rw------- 1 root root 3091237 Sep 14 11:54 secure
-rw------- 1 root root 2429153 Aug 18 01:50 secure-20130818
-rw------- 1 root root 4695728 Aug 25 03:29 secure-20130825
-rw------- 1 root root 12348973 Sep 1 02:24 secure-20130901
-rw------- 1 root root 7211819 Sep 8 01:22 secure-20130908


As shown above, old secure files are archived periodically with their name annotated with the date. 
/var/log/messages – This file contains messages of various programs and services including the SSH server.[6,7] Old message files are also archived periodically with their name annotated with the date.

References

  1. How to Keep Alive SSH Sessions
  2. Simplify Your Life With an SSH Config File
  3. ssh_config(5)
  4. How do I set up ssh so that I don't have to use a password? (Xml and More)
  5. 20 Linux Log Files that are Located under /var/log Directory
  6. How do I debug SSH problems?
  7. Difference between /var/log/messages, /var/log/syslog, and /var/log/kern.log?
  8. Verifying SSH Key Fingerprint and More (Xml and More)