Saturday, March 26, 2016

How-to: Installing Python 3.5.1 in Linux

When installing new software on Linux, you can expect two kinds of experience:
  • Expected
    • For example, if you have read this companion article (or watched this video), you would know that:
      • You can install multiple versions of Python on the same Linux server (each under a different PYTHONHOME).[1]
      • There are differences between Python 2 and 3. So, be careful when reading installation articles: check which version of Python (2 or 3) they refer to.
  • Unexpected
    • Surprises always happen even with careful planning. For example, we have run into at least two issues:
      • /usr/bin/install: cannot change permissions of `/usr/local/lib': No such file or directory
      • The directory '/home/<usrname>/.cache/pip' or its parent directory is not owned by the current user

In this article, we will cover the installation of Python 3.5.1 in Linux and how to resolve the issues encountered.

Downloads


You can click on this link to download Python 3.5.1 in gzipped source tarball format. Go to that page and scroll down to the Files section.

Version | Operating System | Description | MD5 Sum | File Size | GPG
Gzipped source tarball | Source release | | be78e48cdfc1a7ad90efff146dce6cfe | 20143759 | SIG

If you want to download it from the command line, you can run the following (the two export lines are needed only if you are behind a proxy):

$ export http_proxy=http://www-proxy.us.xxx.com:80
$ export https_proxy=http://www-proxy.us.xxx.com:80
$ wget https://www.python.org/ftp/python/3.5.1/Python-3.5.1.tgz
$ tar -xvf Python-3.5.1.tgz
$ cd Python-3.5.1
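Before unpacking, it can be worth checking the tarball against the MD5 sum listed in the Files table. The following is a minimal Python sketch, assuming the file was saved as Python-3.5.1.tgz in the current directory; the helper name md5_of_file is made up for illustration.

```python
import hashlib
import os

# MD5 sum listed for the gzipped source tarball in the Files table
EXPECTED_MD5 = "be78e48cdfc1a7ad90efff146dce6cfe"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


tarball = "Python-3.5.1.tgz"  # assumed download location
if os.path.exists(tarball):
    actual = md5_of_file(tarball)
    print("OK" if actual == EXPECTED_MD5 else "MISMATCH: " + actual)
else:
    print(tarball + " not found in the current directory")
```

Reading in chunks keeps memory use small, which matters little for a 20 MB tarball but costs nothing.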

Configure and Install (Overview)


In the Python-3.5.1 folder, there is a README file. Read it and follow the steps it describes:

On Unix, Linux, BSD, OSX, and Cygwin:

$ ./configure
$ make
$ make test
$ sudo make install

This will install Python as python3. Note that only make install needs to be run as the root user.

Considerations of Configuration


Before you execute the configure command, plan in advance for your new PYTHONHOME.[1] This is especially important if you have:
  • Multiple versions of Python installed on the system, or
  • Parts of the file system that are read-only

Enter ./configure --help to learn how to customize installation directories in the configuration step.

$ ./configure --help
Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX [/usr/local]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX [PREFIX]

By default, `make install' will install all the files in `/usr/local/bin', `/usr/local/lib' etc. You can specify an installation prefix other than `/usr/local' using `--prefix', for instance `--prefix=$HOME'.

For better control, use the options below.

Fine tuning of the installation directories:
  --bindir=DIR            user executables [EPREFIX/bin]
  --sbindir=DIR           system admin executables [EPREFIX/sbin]
  --libexecdir=DIR        program executables [EPREFIX/libexec]
  --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
  --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
  --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
  --libdir=DIR            object code libraries [EPREFIX/lib]
  --includedir=DIR        C header files [PREFIX/include]
  --oldincludedir=DIR     C header files for non-gcc [/usr/include]
  --datarootdir=DIR       read-only arch.-independent data root [PREFIX/share]
  --datadir=DIR           read-only architecture-independent data [DATAROOTDIR]
  --infodir=DIR           info documentation [DATAROOTDIR/info]
  --localedir=DIR         locale-dependent data [DATAROOTDIR/locale]
  --mandir=DIR            man documentation [DATAROOTDIR/man]
  --docdir=DIR            documentation root [DATAROOTDIR/doc/python]
  --htmldir=DIR           html documentation [DOCDIR]
  --dvidir=DIR            dvi documentation [DOCDIR]
  --pdfdir=DIR            pdf documentation [DOCDIR]
  --psdir=DIR             ps documentation [DOCDIR]

On our system, both default installation directories (/usr/local/bin and /usr/local/lib) are read-only, so we need to configure the build with a different PREFIX and EPREFIX as follows:

$ ./configure --prefix=/usr --exec-prefix=/usr
creating Modules/Setup
creating Modules/Setup.local
creating Makefile

After resolving all installation issues, you could find out where the final installation directories are by entering:[2]

$ python3 -c 'import sys; print("\n".join(sys.path))'
/usr/lib/python35.zip
/usr/lib/python3.5
/usr/lib/python3.5/plat-linux
/usr/lib/python3.5/lib-dynload
/home/<usrname>/.local/lib/python3.5/site-packages
/usr/lib/python3.5/site-packages
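Besides sys.path, the interpreter can report the prefixes it was built with. The sketch below is one possible way to check them, using only the standard sys and sysconfig modules; the function name installation_layout is hypothetical. With the configure options shown above, both prefixes would be expected to read /usr.

```python
import sys
import sysconfig


def installation_layout():
    """Collect the prefixes and key directories of the running interpreter."""
    paths = sysconfig.get_paths()
    return {
        "prefix": sys.prefix,                # normally the configure --prefix value
        "exec_prefix": sys.exec_prefix,      # normally the --exec-prefix value
        "stdlib": paths["stdlib"],           # pure-Python standard library
        "platstdlib": paths["platstdlib"],   # platform-specific library files
        "purelib": paths["purelib"],         # site-packages for pure-Python packages
    }


for name, value in installation_layout().items():
    print(name + ": " + value)
```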

Potential Issues


Without specifying PREFIX and/or EPREFIX, you might run into the following issues (read [2] for further help):

$ python3
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'


To resolve the second issue, shown below, try "sudo -H make install" as the message suggests (you may want to clean up first; see the next section):

$ sudo make install
The directory '/home/<usrname>/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/<usrname>/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
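The warning is about directory ownership: under plain sudo, HOME can still point at the invoking user's home directory, so a pip process running as root finds a cache it does not own, whereas sudo -H sets HOME to the target user's home. The sketch below is a rough way to inspect the ownership yourself; it is not pip's actual check, and the helper name owned_by_current_user is hypothetical.

```python
import os


def owned_by_current_user(path):
    """Return True/False for an existing path, or None if the path is missing."""
    try:
        return os.stat(path).st_uid == os.geteuid()
    except FileNotFoundError:
        return None


cache_dir = os.path.expanduser("~/.cache/pip")
print(cache_dir, "owned by current user:", owned_by_current_user(cache_dir))
```

Running it once as yourself and once under sudo (with and without -H) shows which directory is consulted and whether its owner matches.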

Cleanup and Retry


If you have run into any issues, you can clean up your environment and retry with fixes. In your Python-3.5.1 folder, type the following:

# make clean
find . -depth -name '__pycache__' -exec rm -rf {} ';'
find . -name '*.py[co]' -exec rm -f {} ';'
find . -name '*.[oa]' -exec rm -f {} ';'
find . -name '*.s[ol]' -exec rm -f {} ';'
find . -name '*.so.[0-9]*.[0-9]*' -exec rm -f {} ';'
find build -name 'fficonfig.h' -exec rm -f {} ';' || true
find: build: No such file or directory
find build -name '*.py' -exec rm -f {} ';' || true
find: build: No such file or directory
find build -name '*.py[co]' -exec rm -f {} ';' || true
find: build: No such file or directory
rm -f pybuilddir.txt
rm -f Lib/lib2to3/*Grammar*.pickle
rm -f Programs/_testembed Programs/_freeze_importlib
rm -rf build

References

  1. Environment Variables (Python)
    • sys.path
      • A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default.
  2. How-to: When a Missing Python Module Error Was Thrown
  3. What does “SyntaxError: Missing parentheses in call to 'print'” mean in Python?
    • This error message means that you are attempting to use Python 3 to follow an example or run a program that uses the Python 2 print statement:
  4. Install / Update Python 3.5.0 at Linux machine. (Youtube)
  5. Python 3.5.1
  6. Python Module
  7. upgrade Python to 2.7.2
  8. How can I troubleshoot Python “Could not find platform independent libraries
  9. Py_Initialize: Unable to get the locale encoding in OpenSuse 12.3
  10. Python script header
  11. Standard modules (Python)
  12. How do I find the location of Python module sources?
  13. sys module — System-specific parameters and functions
  14. What do the python file extensions, .pyc .pyd .pyo stand for?
  15. How do I unload (reload) a Python module?
  16. Python Packaging User Guide
  17. Purpose of #!/usr/bin/python3 (important)

Friday, March 25, 2016

How-to: When a Missing Python Module Error Was Thrown

When updating Python from 2 to 3, you may want to get familiar with the following topics first:
  1. Can you install multiple Python versions on Linux?
  2. What should you do when a missing module error is thrown?
  3. How does Python use the search path to locate modules?
  4. What are the differences between Python 2 and 3?[1]
  5. How do you resolve a missing Python module error?
    • ImportError: No module named 'encodings'
  6. Where is a specific Python module located?

Multiple Python Installations


In our system, we have both Python 2 and 3 installed under /usr/bin as such:
  • /usr/bin/python
  • /usr/bin/python3
To choose a specific version for your Python scripts, you can specify a shebang (or hashbang) line as follows:
#!/usr/bin/python3
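As a small illustration, the script below starts with that shebang line and reports which interpreter actually ran it. It is only a sketch; the file name which_python.py used in the usage note is made up.

```python
#!/usr/bin/python3
"""Report which interpreter is running this script."""
import sys


def interpreter_summary():
    """Return the interpreter path and version as a single line."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    return "{} (Python {})".format(sys.executable, version)


print(interpreter_summary())
```

Saved as which_python.py and made executable with chmod +x, it can be run as ./which_python.py; the program loader then picks /usr/bin/python3 regardless of what the plain python command points to.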

Python2

$ python
Python 2.4.3 (#1, Feb 24 2012, 13:04:26)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.


Python3

$ python3
Python 3.5.1 (default, Mar 24 2016, 20:01:47)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux
Type "help", "copyright", "credits" or "license" for more information.

In the rest of this article, we will use Python 3.5.1 for illustration unless stated otherwise. Read the section "Python 2 vs 3" to learn the differences between them.

Missing Python Module


A Python module is a file (e.g., with a suffix such as .py, .pyc, or .pyo) that:[13,17]
  • Contains Python definitions and statements
  • Can be imported in a script or in an interactive instance of the interpreter
    • Imported only once per interpreter session
      • This is simply for efficiency reasons
      • If you change your modules, you must restart the interpreter
      • If it’s just one module you want to test interactively, you can also use importlib.reload().[14]
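The import-once behavior and importlib.reload() can be tried out with a throwaway module. The sketch below writes a hypothetical module named demo_settings into a temporary directory, imports it, edits the file, and reloads it. Bytecode writing is switched off so that a quick edit within the same second cannot be masked by a cached .pyc file.

```python
import importlib
import os
import sys
import tempfile

# Avoid stale bytecode caches when the source changes within the same second.
sys.dont_write_bytecode = True

# Create a throwaway module (demo_settings is a made-up name) in a temp directory.
module_dir = tempfile.mkdtemp()
module_path = os.path.join(module_dir, "demo_settings.py")
with open(module_path, "w") as handle:
    handle.write("GREETING = 'hello'\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

import demo_settings
first = demo_settings.GREETING

# Edit the source on disk; the already-imported module object does not change.
with open(module_path, "w") as handle:
    handle.write("GREETING = 'hello again'\n")
unchanged = demo_settings.GREETING

# reload() re-executes the module source in the existing module object.
importlib.reload(demo_settings)
reloaded = demo_settings.GREETING

print(first, "|", unchanged, "|", reloaded)
```

Only the value read after reload() reflects the edit; names that other modules copied with "from demo_settings import GREETING" would still hold the old value.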
Oftentimes, you may run into a missing Python module, reported as an ImportError like:

$ python3.5
Fatal Python error: Py_Initialize: Unable to get the locale encoding
ImportError: No module named 'encodings'
In such cases, you may need to fix sys.path to include missing library paths.

sys.path


sys.path variable stores a list of strings that specifies the search path for modules. It is initialized from these locations:
  1. The directory containing the input script (or the current directory when no file is specified)
  2. PYTHONPATH (a list of directory names)
    • With the same syntax as the shell variable PATH
  3. The installation-dependent default

A program is free to modify this list for its own purposes. Only strings and bytes should be added to sys.path; all other data types are ignored during import.  See also Module site — This describes how to use .pth files to extend sys.path.
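As a minimal sketch of a program modifying the list, the helper below appends a directory only if it is not already present. The function name ensure_on_path and the directory /opt/mylibs are hypothetical and used purely for illustration.

```python
import os
import sys


def ensure_on_path(directory):
    """Append directory to sys.path unless it is already listed."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.append(directory)
    return directory


# /opt/mylibs is a made-up location; it does not need to exist for this demo.
added = ensure_on_path("/opt/mylibs")
print(added in sys.path)
```

A change made this way lasts only for the current process. Setting PYTHONPATH in the shell before starting Python affects every interpreter launched from that shell.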

Python 2 vs 3  


In this section, we will show you how to display the value of sys.path from the command line without entering interactive mode. To do that, we use print. However, as noted below, the syntax differs between the two versions: print is a statement in Python 2 and a built-in function in Python 3.


Python 2

$ python -c 'import sys; print "\n".join(sys.path)'
/usr/lib64/python24.zip
/usr/lib64/python2.4
/usr/lib64/python2.4/plat-linux2
/usr/lib64/python2.4/lib-tk
/usr/lib64/python2.4/lib-dynload
/usr/lib64/python2.4/site-packages
/usr/lib64/python2.4/site-packages/Numeric
/usr/lib64/python2.4/site-packages/PIL
/usr/lib64/python2.4/site-packages/gtk-2.0
/usr/lib/python2.4/site-packages

Python 3

$ python3 -c 'import sys; print("\n".join(sys.path))'
/usr/lib/python35.zip
/usr/lib/python3.5
/usr/lib/python3.5/plat-linux
/usr/lib/python3.5/lib-dynload
/scratch/perf/.local/lib/python3.5/site-packages
/usr/lib/python3.5/site-packages

Where Is a Python Module Located?


When a module, say, encodings, is imported, the interpreter first searches for a built-in module with that name. If not found, it then searches for a file named encodings.py, or a package directory named encodings containing __init__.py, in the list of directories given by the variable sys.path.

To find out where an imported module is located, you can use its attribute __file__.[11]  For example, module encodings is located under /usr/lib/python3.5 in our system:

$ python3
<snipped>
>>> import encodings
>>> print(encodings.__file__)
/usr/lib/python3.5/encodings/__init__.py
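The location can also be queried without importing the module yourself, through importlib.util.find_spec. The following is a small sketch; the helper name module_origin and the deliberately missing module name are made up. Built-in modules report "built-in" rather than a path, and on newer Python versions some modules loaded at startup may report "frozen".

```python
import importlib.util


def module_origin(name):
    """Return where the import system would load module `name` from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


for name in ("json", "sys", "no_such_module_xyz"):
    print(name, "->", module_origin(name))
```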

References

  1. What does “SyntaxError: Missing parentheses in call to 'print'” mean in Python?
    • This error message means that you are attempting to use Python 3 to follow an example or run a program that uses the Python 2 print statement:
  2. Install / Update Python 3.5.0 at Linux machine. (Youtube)
  3. Python 3.5.1 
  4. Python Module
  5. upgrade Python to 2.7.2
  6. How can I troubleshoot Python “Could not find platform independent libraries
  7. Py_Initialize: Unable to get the locale encoding in OpenSuse 12.3
  8. Environment Variables (Python)
  9. Python script header
  10. Standard modules (Python)
  11. How do I find the location of Python module sources?
  12. sys module — System-specific parameters and functions
  13. What do the python file extensions, .pyc .pyd .pyo stand for?
  14. How do I unload (reload) a Python module?
  15. Purpose of #!/usr/bin/python3 (important)
  16. shebang (or hashbang)
    • Under Unix-like operating systems, when a script with a shebang is run as a program, the program loader parses the rest of the script's initial line as an interpreter directive; the specified interpreter program is run instead, passing to it as an argument the path that was initially used when attempting to run the script.
  17. Importing Python Modules



Wednesday, March 23, 2016

How-to: Can't locate IPTables/IPv4/IPQueue.pm

In this article, we will cover the following topics:
  • How to resolve Perl module missing issue
  • Know about CPAN (Comprehensive Perl Archive Network)
  • Learn how to configure CPAN module (i.e, CPAN.pm)

Missing Perl Module


When a Perl script using IPTables::IPv4::IPQueue[1] was executed:
BEGIN {
    push @INC, "/scratch/perf/.../perl/5.8.8/x86_64-linux-thread-multi";
}
use strict;
use warnings;
use IPTables::IPv4::IPQueue qw(:constants);

It threw the following error message:[6]
  • Can't locate IPTables/IPv4/IPQueue.pm in @INC (@INC contains: ...
A Perl module is a reusable package of Perl code, typically stored in a .pm file; it plays a role similar to that of a class in OOP. Modules are organized using namespaces (much like Java packages), and the file structure mirrors the namespace structure. For instance, IPTables::IPv4::IPQueue could be located somewhere in your file system like:
  • /usr/local/lib64/perl5/auto/IPTables/IPv4/IPQueue
To resolve the missing module issue, you need to install it by entering:
cpan[1]> force install IPTables::IPv4::IPQueue
But, before you do it, make sure you understand the following sections first.
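The mapping from module name to file path explains the wording of the error message: the "::" separators become directory separators and ".pm" is appended. The sketch below mimics that lookup in Python (chosen only to keep the examples in this blog in one language); both function names are hypothetical, and it is not how perl itself is implemented.

```python
import os


def perl_module_relpath(module_name):
    """Map a Perl module name such as A::B::C to the relative path A/B/C.pm."""
    return "/".join(module_name.split("::")) + ".pm"


def find_in_dirs(module_name, search_dirs):
    """Return the first existing file for the module in search_dirs, or None."""
    relpath = perl_module_relpath(module_name)
    for directory in search_dirs:
        candidate = os.path.join(directory, relpath)
        if os.path.isfile(candidate):
            return candidate
    return None


print(perl_module_relpath("IPTables::IPv4::IPQueue"))
```

The printed relative path is the one named in the "Can't locate" message; the search_dirs argument stands in for the directories listed in @INC.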

CPAN (Comprehensive Perl Archive Network)


CPAN is a software repository of more than 150,000 modules (150,929 when this was written) for the Perl programming language. The modules can be downloaded from metacpan.org and also from mirrored sites worldwide. The resources found on CPAN are easily accessible with the CPAN.pm module.

From the metacpan.org home page, you can search for any Perl module you need. For example, enter "IPTables::IPv4" in the search field. You will find the documentation for IPTables::IPv4 here.

CPAN Module (CPAN.pm)


To use the CPAN module, start the CPAN shell, which provides an interactive mode, in either of two ways:

perl -MCPAN -e shell
--or--
cpan

Configuration Steps


CPAN.pm has many settings to configure, so the first time you use it you will be prompted to set them. After the configuration, don't forget to commit by entering:

cpan[19]> o conf commit

This makes the configuration permanent; the configuration data is saved in the following file:
  • /usr/share/perl5/CPAN/Config.pm

Only one CPAN process can run at a time; this is enforced by the following lock file:
  • /root/.cpan/.lock


How to Connect to the Internet behind a Proxy


After the first-time configuration, you can still modify the configured data by entering:
cpan[20]> o conf init
You will then be asked whether you would like to configure as much as possible automatically. To avoid going through all the configuration steps again, you can also specify which items to configure. For example, if your server is behind a proxy server, you may run into the following issue:
  • As you did not allow me to connect to the internet you need to supply a valid CPAN URL now.


To work around this, you can configure a proxy for CPAN by entering:[4,5]

cpan[21]> o conf init /proxy/
If you're accessing the net via proxies, you can specify them in the CPAN configuration or via environment variables. The variable in the $CPAN::Config takes precedence.
Your ftp_proxy? []

At the "Your http_proxy?" prompt, we entered the following:
  • http://146.xx.xx.29:80

and it worked fine afterwards. Besides the proxy configuration, you may also want to configure a urllist to specify which mirror(s) to use for downloading:

cpan[22]> o conf init urllist

There are 235 registered sites around the world that make up the N part of CPAN (the Network); you can find the full list here.

Saturday, March 19, 2016

Linux: How to Read Large Text File—/var/log/messages

IaaS is the hardware and software that powers Cloud Services: servers, storage, networks, and operating systems. The Linux (or Windows) servers used in IaaS are increasingly powerful, and they also generate more and larger log files.

We often run into message files larger than 1 GB. Log files can be viewed with regular text editors, but most text editors cannot handle files above a certain size.

In this article, we will cover how to read large message files (e.g., /var/log/messages) generated on Linux systems.

/var/log/messages


To debug issues in Cloud environments, it's essential for you to know where the log files are and what is contained in each log file. On Linux servers, over a dozen log files are located in /var/log directory. Here we only focus on one of them:
  • /var/log/messages[7]
    • This log aims at storing "general system activity" messages.
      • There are several things that are logged in /var/log/messages including mail, cron, daemon, kern, auth, etc.
      • The severity of messages could be
        • [INFO]
        • [DEBUG]
        • [WARNING]
        • [ERR]
        • etc
    • Older message files are archived periodically with their name annotated with the date.
If your Linux system uses the rsyslogd utility, its configuration file is
/etc/rsyslog.conf
in which you can specify logging rules (i.e., selector + action). For example, the following rule logs anything of level info or higher, except mail, cron, and private authentication messages:
*.info;mail.none;authpriv.none;cron.none /var/log/messages
The matching messages are written to the file /var/log/messages.

Limitations of Text Editors


Some editors limit the size of the text files they can open. For example, the following popular editors on Windows have these documented limitations:
  • Notepad[3]
    • 64 kilobytes (KB)
  • Wordpad[4]
    • It is said to have no size limit; the real problem is performance.
    • Depending on the version of Wordpad, some people say it can handle files of up to 20 MB without performance issues.
  • Textpad[8]
    • It can handle file sizes up to the largest contiguous chunk of 32-bit virtual memory.

Solutions


Basically, there are two ways of dealing with large text files:
  1. Find a more capable text editor
  2. Divide and conquer
If you search Google for "large text file", you will find many suggestions for large text file readers. Some editors can open and read large text files, but operations such as searching for a pattern can be slow.

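On Linux systems, a good approach is "divide and conquer" using the split command, for example:
split -b1000m messages-20160315T2201 split-messages
This cuts the file into pieces of 1000 MB each, named split-messagesaa, split-messagesab, and so on.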
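A related way to avoid loading a whole log into memory is to stream it line by line and keep only what you are looking for. The generator below is a minimal sketch of that idea; the function name matching_lines and the script name in the usage note are hypothetical.

```python
import re
import sys


def matching_lines(path, pattern):
    """Yield (line_number, line) for lines matching pattern, one line at a time."""
    regex = re.compile(pattern)
    # errors="replace" keeps going if the log contains undecodable bytes.
    with open(path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    for number, line in matching_lines(sys.argv[1], sys.argv[2]):
        print("{}: {}".format(number, line))
```

Saved as, say, grep_log.py, it could be run as "python3 grep_log.py /var/log/messages kernel". Because the file object is iterated lazily, memory use stays flat regardless of the file size.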

Tuesday, March 8, 2016

Excel: Get Every Third Row with Formula: INDEX and ROWS*3

I have used TextPad to clean up the data with bookmarks and a macro, as described in [1].



The next task is to extract the start time and end time from column A to calculate the elapsed time of each individual event. The start time and end time are located in different rows:
  • Start Time: A1, A4, ..., A{N*3+1}
  • End Time: A3, A6, ..., A{N*3+3}
where N = 0 to 64.

This article has followed an excellent video describing how to achieve the task using INDEX and ROWS functions in Excel.

Formula Used


To retrieve the end time (rows 3, 6, 9, ...), here is the formula I have defined in cell J1:
=INDEX($A$1:$A$195, ROWS($J$1:J1)*3)

To retrieve the start time (rows 1, 4, 7, ...), the formula in cell K1 is defined as:
=INDEX($A$1:$A$195,ROWS($K$1:K1)*3-2)

Details of Formula


Dollar Sign ($)

When you copy J1 and paste it into J2, the formula changes from
=INDEX($A$1:$A$195, ROWS($J$1:J1)*3)
to
=INDEX($A$1:$A$195, ROWS($J$1:J2)*3)
The unprefixed reference (i.e., J1) is a relative reference: in J1 it points to the cell itself, and after being pasted into J2 it is automatically adjusted to J2. But notice that the following references:
$A$1
$A$195
$J$1
remain the same after being pasted into J2 because we have prefixed both the column and the row with a dollar sign ($), which makes them absolute references. For example, instead of A1, we write $A$1.

INDEX Function

INDEX function has two forms:
  • Array form
    • INDEX(array, row_num, [column_num])
  • Reference form
    • INDEX(reference, row_num, [column_num], [area_num])
Our formula uses the array form by specifying a range of cells as:
$A$1:$A$195
and row_num is defined using ROWS function.


ROWS Function

ROWS function has the following syntax:
ROWS(array)
where array can be an array, an array formula, or a reference to a range of cells for which you want the number of rows.

Our formula defines array as a range of cells. For example, in cell J2, the range is J1:J2 and, in cell J3, the range is J1:J3. In other words, we just count the number of rows from current cell to the first cell in column J. Then we use that count multiplied by three to retrieve every third row from the array.
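The row arithmetic can be checked outside Excel. The Python sketch below imitates filling the formula down a column: ROWS(...) contributes the running count 1, 2, 3, ..., so count*3 selects rows 3, 6, 9 and count*3-2 selects rows 1, 4, 7. The function name every_third and the nine-row stand-in for column A are made up for illustration.

```python
def every_third(values, offset=0):
    """Mimic INDEX(values, ROWS(...)*3 - offset) filled down a column.

    values is a list standing in for column A (values[0] is A1).
    """
    picked = []
    count = 1  # ROWS($J$1:J1) is 1 in the first row, 2 in the second, ...
    while count * 3 - offset <= len(values):
        row_num = count * 3 - offset        # 1-based row number, as in Excel
        picked.append(values[row_num - 1])  # Python lists are 0-based
        count += 1
    return picked


column_a = ["A{}".format(i) for i in range(1, 10)]  # stand-in for A1..A9
print(every_third(column_a))            # rows 3, 6, 9
print(every_third(column_a, offset=2))  # rows 1, 4, 7
```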

References

  1. TextPad: How to Remove All Lines Except the Ones Containing a Pattern
  2. Excel Magic Trick 1142: Get Every Other Row with Formula: INDEX and ROWS*2
  3. EXCEL TIP: The dollar sign ($) in a formula - Fixing cell references


Tuesday, March 1, 2016

SSH: How to Simplify Connection Using Configuration Files

ssh (SSH client) is a program for logging into a remote machine and for executing commands on a remote machine. ssh obtains configuration data from the following sources in the following order (for each parameter, the first obtained value will be used):
  1. command-line options
    • If a configuration file is given on the command line (i.e., ssh -F configfile), the system-wide configuration file (/etc/ssh/ssh_config) will be ignored
  2. user's configuration file
    • ~/.ssh/config
  3. system-wide configuration file
    • /etc/ssh/ssh_config
In this article, we will focus on the specifications of directives via ssh's configuration file (specifically user's configuration file).

Advantages of Using Configuration File


Using a configuration file to specify ssh directives has several advantages:
  1. You can use shorthand names to save keystrokes
  2. You avoid mistakes
    • Especially when you have many parameters to specify and/or some of them use non-standard connection values.
  3. You can provide options at different scopes (per-host vs per-user)

User's Configuration File


Here are the sample contents from a user's configuration file (i.e., ~/.ssh/config):
Host dev
    HostName dev.example.com
    Port 22000
    User fooey
Host github.com
    IdentityFile ~/.ssh/github.key
Instead of specifying:
  • ssh fooey@dev.example.com -p 22000

now you can just use the shorthand "dev" and the options will be read from the configuration file:
  • ssh dev

An ssh session will normally prompt you for a password. However, you can also set up public/private keys for password-less logins.[4]

Format of Configuration File


To get you started, here are the basics:[3]
  • Section
    • Separated by "Host" specifications
      • A single ‘*’ as a pattern can be used to provide global defaults for all hosts
        • See here for more information on patterns
      • The matched host name is usually the one given on the command line
  • Comment
    • Empty lines and lines starting with ‘#’ are comments.
  • Keyword
    • Case-insensitive
    • Examples
      • Host, Match, etc.
  • Directive/Argument
    • Directive
      • Used to specify session details including:
        • Identity
          • Username
        • Bind address
          • [bind_address:]port:host:hostport
        • Address family
          • “any”, “inet” (use IPv4 only), or “inet6” (use IPv6 only)
        • Other options
      • See the directive reference here
    • Argument
      • Arguments are case-sensitive
      • Arguments may optionally be enclosed in double quotes (") in order to represent arguments containing spaces.
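To make these rules concrete, here is a deliberately simplified Python sketch of how such a file could be read: comments and empty lines are skipped, keywords are lower-cased, Host lines open a new section, and the first value found for a keyword wins. It is not ssh's actual parser; it ignores Match blocks, negated patterns, and the Keyword=value form, and the "Host *" block with User defaultuser is a made-up addition to the sample.

```python
import fnmatch
import shlex

SAMPLE = """\
# Per-host settings
Host dev
    HostName dev.example.com
    Port 22000
    User fooey

Host *
    User defaultuser
"""


def parse_config(text):
    """Parse Host sections into a list of (patterns, options) pairs."""
    # Options that appear before the first Host line apply to all hosts.
    sections = [(["*"], {})]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # empty lines and comments are ignored
        parts = shlex.split(line)  # honors double-quoted arguments
        keyword, args = parts[0].lower(), parts[1:]  # keywords are case-insensitive
        if keyword == "host":
            sections.append((args, {}))
        else:
            # Within a section, keep the first value seen for each keyword.
            sections[-1][1].setdefault(keyword, " ".join(args))
    return sections


def lookup(sections, host):
    """Merge options from every matching section; the first value wins."""
    result = {}
    for patterns, options in sections:
        if any(fnmatch.fnmatchcase(host, pattern) for pattern in patterns):
            for keyword, value in options.items():
                result.setdefault(keyword, value)
    return result


print(lookup(parse_config(SAMPLE), "dev"))
print(lookup(parse_config(SAMPLE), "some-other-host"))
```

Because the first obtained value is used, host-specific sections belong near the top of the file and general defaults such as "Host *" at the end.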

/var/log/secure


Linux has an extensive set of log files under the /var/log directory.[5] This directory is the central place where all applications and programs put their log files. Most log files are text files that can be viewed using a standard text editor.

/var/log/secure – This file contains all security related messages on the system. This includes authentication failures, possible break-in attempts, SSH logins, failed passwords, sshd logouts, invalid user accounts etc.

-rw------- 1 root root 3091237 Sep 14 11:54 secure
-rw------- 1 root root 2429153 Aug 18 01:50 secure-20130818
-rw------- 1 root root 4695728 Aug 25 03:29 secure-20130825
-rw------- 1 root root 12348973 Sep 1 02:24 secure-20130901
-rw------- 1 root root 7211819 Sep 8 01:22 secure-20130908


As shown above, old secure files are archived periodically with their name annotated with the date. 
/var/log/messages – This file contains messages of various programs and services including the SSH server.[6,7] Old message files are also archived periodically with their name annotated with the date.

References

  1. How to Keep Alive SSH Sessions
  2. Simplify Your Life With an SSH Config File
  3. ssh_config(5)
  4. How do I set up ssh so that I don't have to use a password? (Xml and More)
  5. 20 Linux Log Files that are Located under /var/log Directory
  6. How do I debug SSH problems?
  7. Difference between /var/log/messages, /var/log/syslog, and /var/log/kern.log?
  8. Verifying SSH Key Fingerprint and More (Xml and More)