Python Packages and Virtual Environments

If you need a python package that isn't installed on Kodiak, contact HPRCS staff and request it. We can usually install what you need, but if there are compatibility issues or you need a different version, you may have to install it yourself using the instructions below.

Before trying to install a python package (or requesting that one be installed) make sure that you have loaded a python module and are not trying to run one of the system default copies of python. You can check this with which python (or which python3). If it returns "/usr/bin/python(3)" then you need to load either python/2.7.10, python/2.7.14, or python/3.7.2. Then run pip list and see if the python package is already installed.
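
If you'd rather script this check, here is a minimal sketch. The helper name is_system_python is made up for illustration, and it assumes (as described above) that the system default copies live under /usr/bin.

```shell
# Return success if the given path is one of the system default copies of
# python, i.e., it lives under /usr/bin. (Hypothetical helper name.)
is_system_python() {
    case "$1" in
        /usr/bin/*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Look up whichever python3 is first on PATH (empty if there is none).
py_path="$(command -v python3 || true)"

if [ -z "$py_path" ]; then
    echo "python3 not found; load a python module first"
elif is_system_python "$py_path"; then
    echo "$py_path is the system copy; load a python module first"
else
    echo "using $py_path"
fi
```

Once a python module is loaded, pip list (or pip3 list) will tell you whether the package you want is already there.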

Note: If you require the "tensorflow" package, do not pip install it, either as a "--user installation" or within a virtual environment as described below. We have found that the various prebuilt versions that can be downloaded often have compatibility issues with Kodiak. Instead, we have built tensorflow from source and installed that on Kodiak. To use it, load one of the tensorflow/1.13.1 modules (cpu or gpu). This installation will use its own internal copy of python 3.7.2 so if you need packages that aren't installed in the "tensorflow python" you may still need to create a virtual environment or otherwise install packages using the instructions below. (But request that it be installed first...)

You can create your own private "copy" of python, i.e., a virtual environment, in which you can install specific packages that aren't installed with the system version. You will need to use a version of python that includes venv or virtualenv and pip (python's package installer).

There are two ways to create a virtual environment. The current, officially blessed, method is to use "venv". This is supported in python 3.5 and later. The older method is "virtualenv" and is required for python 2.x. Although virtualenv still works with python 3.x, you should use venv.
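
The rule above can be written down as a small shell function. This is only a sketch: choose_env_tool is a hypothetical name, and it expects a version string of the form MAJOR.MINOR or MAJOR.MINOR.PATCH.

```shell
# Print which tool to use for a given python version string of the form
# MAJOR.MINOR or MAJOR.MINOR.PATCH. (Hypothetical helper name.)
choose_env_tool() {
    major="${1%%.*}"      # text before the first dot
    rest="${1#*.}"        # text after the first dot
    minor="${rest%%.*}"   # text before the next dot, if any
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 5 ]; }; then
        echo venv
    else
        echo virtualenv
    fi
}

choose_env_tool 3.7.2    # prints: venv
choose_env_tool 2.7.14   # prints: virtualenv
```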

Creating a Virtual Environment With "venv"

This requires python 3.5 or later. On Kodiak, there are modules for two installations of python 3.7.2. The module for the regular installation is python/3.7.2, which has several packages installed. There is also a module, python/3.7.2-virtualenv, which is a minimal installation, i.e., no packages other than pip. The "3.7.2-virtualenv" module is the one you will want to use.

Note: It is possible, when creating your virtual environment, to include access to the system packages installed in the regular version of python if you want them. To do so, load the python/3.7.2 module instead of python/3.7.2-virtualenv, and add the --system-site-packages venv option when creating your virtual environment. This document assumes you want to create a virtual environment without any of the default packages.
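
For reference, here is a rough sketch of what that alternative looks like. The two function names are hypothetical. The --system-site-packages option and the pyvenv.cfg file are standard parts of venv, but we're assuming you have already loaded the python/3.7.2 module before running this.

```shell
# Create a virtual environment that can also see the packages installed in
# the python it was created from. Extra venv options may follow the directory.
# (Hypothetical helper name; run it with the python/3.7.2 module loaded.)
make_venv_with_system_packages() {
    venv_dir="$1"
    shift
    python3 -m venv --system-site-packages "$@" "$venv_dir"
}

# Report whether an existing environment has access to the system packages.
# venv records this choice in the environment's pyvenv.cfg file.
uses_system_packages() {
    grep -q '^include-system-site-packages = true' "$1/pyvenv.cfg"
}
```

For example, make_venv_with_system_packages my-python would create the environment, and uses_system_packages my-python && echo yes would confirm the setting afterwards.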

First, load the python/3.7.2-virtualenv module. You may want/need to module purge first to make sure no other python modules are loaded, just in case.

$ module purge

$ module list
No Modulefiles Currently Loaded.

$ module avail python
------------------------ /usr/local/Modules/modulefiles ------------------------
python/2.7.10(default)  python/3.6.6  python/3.7.2             
python/2.7.14           python/3.7.0  python/3.7.2-virtualenv  

$ module load python/3.7.2-virtualenv

Note that python 2.7.10 is still the default version, so if you just module load python you will get that one instead and venv will not work. Also, when using python 3.x, be sure to run python3 and not python. The latter may run the system default version of python which is not what you want.

You can check this with the which command, which reports the copy of python3 that will be run:

$ which python3
/usr/local/python/3.7.2-virtualenv/bin/python3

To create a virtual environment, cd to the directory where you want the virtual environment to live. Here, we'll put it in /home/bobby/projects but you can put it wherever you want. You'll need to create the directory if it doesn't exist yet. Note that this is not the virtual environment directory itself, but the parent directory in which the virtual environment directory will be created.

$ mkdir /home/bobby/projects
$ cd /home/bobby/projects

Now use venv to create the virtual environment. We'll create one and call it "my-python" but you can name it whatever you want. This will create a directory called "my-python" inside of the projects directory that will be the virtual environment copy of python.

$ python3 -m venv my-python

$ ls
my-python

$ ls my-python
bin  include  lib  lib64  pyvenv.cfg

$ ls my-python/bin
activate      activate.fish  easy_install-3.7  pip3    python
activate.csh  easy_install   pip               pip3.7  python3

To switch to this version of python, you'll run the activate script within the my-python/bin directory. Note that you have to source this script and not just run it so that it will modify your current session.

$ source /home/bobby/projects/my-python/bin/activate
(my-python) $ 

You should see the current virtual environment (my-python) in your prompt. Now make sure that you really are using this virtual environment with the which command again. You should see the copy of python3 within my-python/bin directory and not /usr/local/python.

(my-python) $ which python3
/home/bobby/projects/my-python/bin/python3
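
Besides the prompt and which, you can also check the VIRTUAL_ENV environment variable, which the activate script sets to the environment's directory. A minimal sketch (in_venv is a hypothetical helper name):

```shell
# Succeed if a virtual environment is active in this shell. The activate
# script sets VIRTUAL_ENV to the environment's directory and deactivate
# unsets it. (Hypothetical helper name.)
in_venv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_venv; then
    echo "active virtual environment: $VIRTUAL_ENV"
else
    echo "no virtual environment is active"
fi
```

This can be handy in scripts, where there is no prompt to look at.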

At this point, you should be able to install individual python packages with pip (or pip3). If you run pip3 list you will see which packages were copied by the venv command. There should only be 2 right now. (You could also pip3 list -v to show the paths to the packages if you want to confirm their location.)

(my-python) $ pip3 list
Package    Version
---------- -------
pip        18.1   
setuptools 40.6.2 
You are using pip version 18.1, however version 19.1.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

If you get the message about upgrading pip, just run the command above.

(my-python) $ pip3 install --upgrade pip
Collecting pip
  Using cached .../pip-19.1.1-py2.py3-none-any.whl
Installing collected packages: pip
  Found existing installation: pip 18.1
    Uninstalling pip-18.1:
      Successfully uninstalled pip-18.1
Successfully installed pip-19.1.1

Now, to install a package, just pip install [package-name].

(my-python) $ pip3 install numpy
Collecting numpy
  Downloading (17.3MB)
     |████████████████████████████████| 17.3MB 10.5MB/s 
Installing collected packages: numpy
Successfully installed numpy-1.16.4

Run pip3 list again and you should see that numpy was installed.

(my-python) $ pip3 list
Package    Version
---------- -------
numpy      1.16.4 
pip        19.1.1 
setuptools 40.6.2 
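
If you want to go one step further than pip3 list, you can check that python3 can actually import the package. This is just a sketch; can_import is a hypothetical helper name, and numpy is used only because it's the package installed above.

```shell
# Succeed if the python3 on PATH can import the named module.
# (Hypothetical helper name.)
can_import() {
    python3 -c "import $1" >/dev/null 2>&1
}

if can_import numpy; then
    # Show the version and where it was loaded from.
    python3 -c 'import numpy; print(numpy.__version__, numpy.__file__)'
else
    echo "numpy is not importable from $(command -v python3 || echo 'python3 (not found)')"
fi
```

With the virtual environment active, the path printed should be inside the my-python directory.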

The virtual environment python copy will automagically get deactivated when you log out or your qsub-ed job exits. But if you want to stay logged in and do other, "non-my-python" related, python work, you can explicitly deactivate it with the deactivate command. You can which python3 to verify that it's no longer using your copy.

(my-python) $ deactivate

$ which python3
/usr/local/python/3.7.2-virtualenv/bin/python3

In your qsub-ed jobs that need to use the "my-python" version of python, be sure to add source /home/bobby/projects/my-python/bin/activate to your script.
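
A job script along these lines might look something like the sketch below. The VENV_DIR variable and the activate_venv function are hypothetical names, the scheduler directives are left out, and python3 --version is just a stand-in for your own program.

```shell
#!/bin/bash
# (Your usual scheduler directives would go here.)

# Location of the virtual environment created above. VENV_DIR is a
# hypothetical variable; set it to override the default.
VENV_DIR="${VENV_DIR:-/home/bobby/projects/my-python}"

# Activate the environment in the given directory, or complain if it is
# missing. The "." command is the same as "source".
activate_venv() {
    if [ ! -f "$1/bin/activate" ]; then
        echo "no virtual environment found in $1" >&2
        return 1
    fi
    . "$1/bin/activate"
}

if activate_venv "$VENV_DIR"; then
    # Replace this line with your own program, e.g., python3 my_script.py
    python3 --version
else
    echo "skipping the python step" >&2
fi
```

Checking for the activate script first means a mistyped path shows up as a clear message in the job's error output instead of the job quietly running with some other copy of python.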

Creating a Virtual Environment With "virtualenv"

Coming real soon now...