Embedding Python in Android

Standard Library

April 25, 2014




This article is part of a tutorial series.

  1. Embedding Python in Android
  2. Java Native Interface
  3. Embedding the Python Interpreter
  4. Adding log to the Python Interpreter
  5. Include Python Standard Library
  6. Compile C++ Cython Extensions
  7. Compile C Cython Extensions

On the previous articles I've shown how to cross-compile Python for Android, build a native interface for accessing the interpreter from Java code and run some code. However, I still haven't show how to include modules from the standard library. For instance, the following code produces the following output:

PyRun_SimpleString("import this");

...
I/pyjni   ( 9460): Traceback (most recent call last):
I/pyjni   ( 9460):   File "<string>", line 1, in <module>
I/pyjni   ( 9460): ImportError: No module named this

In this article, I will show how to include modules from the standard library on our embedded Python Android app.

Location of the Standard Library files – Assets

By default, Android APK's are installed in /data/app. If you list the contents of that folder (you must have root permissions), you'll see a bunch of Apks in that folder. However, if you think that the files inside the APK are uncompressed somewhere else, you're wrong. As far as I know, the design of Android is to access all files inside the Apk in specific manners. For instance, to access Resources (files in /res) from Java code you must use things such as R.resourceID, etc. This poses a problem for our embedded Python interpreter, since it expects the standard library to be some files in some location in the file structure. Also, the contents of /res seems not to be organized in a file structure (and it seems that it can't handle folders inside, although I'm not sure).

But, there is an alternative. There is a location called /assets where Android expects to find raw files organized as a typical file system. It even compiles the files to the apk as they are, and provides mechanisms to list and read files as byte streams. Unfortunately, although one can access the byte streams both from the Java code as from the native code, again, Python expects a typical file system, and not some byte stream. For instance, when one imports "foo", the Python interpreter searches for a file named "foo.py" in the file system. Python would have to be patched before it could access Android APK byte streams (hint for Python core developers.. ).

However, there is a solution! By default, on installation, Android creates a directory on its internal file structure which app developers can use to keep their databases and other files (/data/data/package_name/). In fact, native shared libraries are uncompressed from the Apk to /data/data/package_name/lib. If we want to use the file structure to host modules from the Standard Library, we can use that location (we could also put them in the SD card, but if they are essential for the application, I suggest to use the internal file structure).

I've made a small class to extract automatically the contents of the /assets folder to /data/data/package_name/files/ for every first run, including app updates. Basically, it checks the Apk creation date and compares it with a file located in the target folder. Whenever the application is newer (which happens on first install and subsequent reinstalls), it extracts the contents of /assets (check the code to see which folders won't be extracted).

package com.example.pytest;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import android.app.Activity;
import android.util.Log;

import android.content.pm.PackageInfo;
import android.content.pm.PackageManager;
import android.content.res.AssetManager;

/**
 * Handles the extraction of assets from an APK file.
 *
 * When the run() method is called, it compares the APK timestamp
 * with the timestamp in the extracted assets. If the APK is newer
 * (or there are no extracted assets), it extracts them to the
 * application's data folder and resets the extracted assets' timestamp.
 *
 * The APK is newer whenever there is a new installation or an update.
 *
 */
public class AssetExtractor {

    private final static String LOGTAG = "AssetExtractor";

    private Activity mActivity;
    private AssetManager mAssetManager;
    private static String mAssetModifiedPath = "lastmodified.txt";


    public AssetExtractor(Activity activity) {
        mActivity = activity;
        mAssetManager = mActivity.getAssets();
    }

    /* Returns the path to the data files */
    public String getDataFilesPath() {
        return mActivity.getApplicationInfo().dataDir + "/files/";
    }

    /* Sets the asset modification time  */
    private void setAssetLastModified(long time) {
        String filename = this.getDataFilesPath() + mAssetModifiedPath;
        try {
            BufferedWriter bwriter = new BufferedWriter(new FileWriter(new File(filename)));
            bwriter.write(String.format("%d", time));
            bwriter.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /* Returns the asset modification time  */
    private long getAssetLastModified() {
        String filename = this.getDataFilesPath() + mAssetModifiedPath;
        try {
            BufferedReader breader = new BufferedReader(new FileReader(new File(filename)));
            String contents = breader.readLine();
            breader.close();
            return Long.valueOf(contents).longValue();
        } catch (IOException e) {
            e.printStackTrace();
            return 0;
        }
    }

    /*
     * Returns the time of this app's last update
     * Considers new installs and updates
     */
    private long getAppLastUpdate() {
        PackageManager pm = mActivity.getPackageManager();
        try {
            PackageInfo pkgInfo = pm.getPackageInfo(mActivity.getPackageName(),
                    PackageManager.GET_PERMISSIONS);
            return pkgInfo.lastUpdateTime;
        } catch (Exception e) {
            e.printStackTrace();
            return 1;
        }
    }

    /* Recursively deletes the contents of a folder*/
    private void recursiveDelete(File file) {
        if (file.isDirectory()) {
            for (File f : file.listFiles())
                recursiveDelete(f);
        }
        Log.i(LOGTAG, "Removing " + file.getAbsolutePath());
        file.delete();
    }

    /**
     * Copy the asset at the specified path to this app's data directory. If the
     * asset is a directory, its contents are also copied.
     *
     * @param path
     * Path to asset, relative to app's assets directory.
     */
    private void copyAssets(String path) {
        // Ignore the following asset folders
        if (path.equals("images")
                || path.equals("sounds")
                || path.equals("webkit")
                || path.equals("databases") // Motorola
                || path.equals("kioskmode")) // Samsung
                    return;
        try {
            String[] assetList = mAssetManager.list(path);
            if (assetList == null || assetList.length == 0)
                throw new IOException();

            // Make the directory.
            File dir = new File(this.getDataFilesPath(), path);
            dir.mkdirs();

            // Recurse on the contents.
            for (String entry : assetList) {
                if (path == "")
                    copyAssets(entry);
                else
                    copyAssets(path + "/" + entry);
            }
        } catch (IOException e) {
            copyFileAsset(path);
        }
    }

    /**
     * Copy the asset file specified by path to app's data directory. Assumes
     * parent directories have already been created.
     *
     * @param path
     * Path to asset, relative to app's assets directory.
     */
    private void copyFileAsset(String path) {
        File file = new File(this.getDataFilesPath(), path);
        Log.i(LOGTAG, String.format("Extracting %s to %s", path, file.getAbsolutePath()));
        try {
            InputStream in = mAssetManager.open(path);
            OutputStream out = new FileOutputStream(file);
            byte[] buffer = new byte[1024];
            int read = in.read(buffer);
            while (read != -1) {
                out.write(buffer, 0, read);
                read = in.read(buffer);
            }
            out.close();
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }


    public void run() {
        // Get the times of last modifications
        long appLastUpdate = this.getAppLastUpdate();
        long assetLastModified = this.getAssetLastModified();
        if (appLastUpdate > assetLastModified) {
            // Clear previous assets
            Log.i(LOGTAG, "Removing private assets");
            File file = new File(this.getDataFilesPath());
            this.recursiveDelete(file);
            file.mkdir();
            // Extract new assets
            Log.i(LOGTAG, "Extracting assets");
            this.copyAssets("");
            // Update extract asset's timestamp
            this.setAssetLastModified(appLastUpdate);
            Log.i(LOGTAG, "Done!");
        } else {
            // Extracted assets are up to date
            Log.i(LOGTAG, "Assets are up to date");
        }
    }
}

To use the AssetExtractor, copy this code to the onCreate method of your application.

// Extract assets on first run/update
new AssetExtractor(this).run();

Please note that you extract the assets, but you can't delete them from inside the Apk. This means that the assets will always be duplicated (inside the Apk, in /data/app/package.apk, and the extracted assets in /data/data/package_name/files/) and this is a severe limitation on Android's design. For big files you can always download them later from an external server, or consider using Apk expansion files.

Copy Python modules to /assets

Although you can use whatever structure you like for the Python modules you may which to include, my suggestion is to use the following structure which mimics somewhat what Python expects on a file system. Consider, however, that later you can always use sys.path to add other folders to the Python interpreter search path.

Basically, you have a /lib folder which contains a file named boot.py and a directory /lib/python2.7. Inside /lib/python2.7 you find module this.py and two folders which could not be there but will be used later. Inside python-for-android/dist/default/private/lib/python2.7/ you can find compiled (.pyo) modules.

I suggest to use compiled modules, or else python will create .pyc files on each import, something that will increases even more the size occupied by your application data. Just to clarify, using the AssetExtractor class as above, assets/lib/python2.7/this.py will be extracted to /data/data/package_name/files/lib/python2.7/this.py. You can copy you own scripts to anywhere inside the assets/ folder, just take care of setting the sys.path.

(Note: Don't leave empty directories inside the assets folder as they will get copied as files. This is a limitation of AssetExtractor because of another limitation from Android's own AssetManager)

Using boot.py to boot your own scripts

So far, we can copy some files to our device (which, by coincidence, can be Python modules), but we still have to make our Python interpreter to know their location and execute file scripts. We have to make the interpreter to know where are these files located in the file system. In other words, before we initialize the interpreter in our pyjni.c we have to set up Python's system path.

One solution is to hardcode the location on a string. But a finer solution is to pass the location of /data/data/package_name/files/ from Java code to our C/C++ code using JNI. In this case, I'm going to pass a string with the path on PythonWrapper.start(). We just need to rewrite the start() method of the PythonWrapper class to:

public static native int start(String datapath);

And on the onCreate method, we just need to pass the absolute path of the application's data location, such as:

// Start Python interpreter, pass the data files location
PythonWrapper.start(getFilesDir().getAbsolutePath());

Don't forget to rebuild the JNI header (see the JNI article), and update the start() function on pyjni.c to include the new string parameter. Since we now know the location of the app's data (datapath argument), we can now rewrite the C/C++ start function to something like:

JNIEXPORT jint JNICALL Java_com_example_pytest_PythonWrapper_start
  (JNIEnv *env, jclass jc, jstring datapath)
{
    LOG("Initializing the Python interpreter");
    // Get the location of data files
    jboolean iscopy;
    const char *data_path = (*env)->GetStringUTFChars(env, datapath, &iscopy);
    // Set Python environment variables
    setenv("PYTHONHOME", data_path, 1);
    setenv("PYTHONPATH", data_path, 1);
    // Initialize Python interpreter and log
    Py_OptimizeFlag = 1; // Allow to run .pyo files outside zip (one of [0,1,2])
    Py_Initialize();
    init_androidembed();
    // Execute boot script
    PyRun_SimpleString("import boot");
    return 0;
}

Simply explained, we set some Python environment variables (lines 9-10), so that Python searches for modules on our app data folder. Lines 13-14 start the interpreter and log (similarly as before), but now, in line 16, we import module boot.py. Since boot.py file will be found in sys.path, it will be executed, and we can use it to set other variables, etc.

Other solution to execute boot.py could be to use PyRun_SimpleFile, after passing the Android data location to the interpreter, and use boot.py to set sys.path. I prefer to keep things simple and this is how my boot.py looks like:

import sys
# True as of https://docs.python.org/2/library/sys.html#sys.path
ANDROID_DATA_PATH = sys.path[0]

print "Booting Python embedded on Android.."
print "Android data location: ", ANDROID_DATA_PATH

# Add site-packages to search path
sys.path.append(ANDROID_DATA_PATH + "/lib/python2.7/site-packages")
print sys.path

# Import a standard library module
import this

And this is what we get on Android's logcat:

I/pyjni   ( 3002): Booting Python embedded on Android..
I/pyjni   ( 3002): Android data location:  /data/data/com.example.pytest/files
I/pyjni   ( 3002): ['/data/data/com.example.pytest/files', '/data/data/com.example.pytest/files/lib/python27.zip', '/data/data/com.example.pytest/files/lib/python2.7/', '/data/data/com.example.pytest/files/lib/python2.7/plat-linux3', '/data/data/com.example.pytest/files/lib/python2.7/lib-tk', '/data/data/com.example.pytest/files/lib/python2.7/lib-old', '/data/data/com.example.pytest/files/lib/python2.7/lib-dynload', '/data/data/com.example.pytest/files/lib/python2.7/site-packages']
I/pyjni   ( 3002): The Zen of Python, by Tim Peters
I/pyjni   ( 3002): Beautiful is better than ugly.
I/pyjni   ( 3002): Explicit is better than implicit.
I/pyjni   ( 3002): Simple is better than complex.
I/pyjni   ( 3002): Complex is better than complicated.
I/pyjni   ( 3002): Flat is better than nested.
I/pyjni   ( 3002): Sparse is better than dense.
I/pyjni   ( 3002): Readability counts.
I/pyjni   ( 3002): Special cases aren't special enough to break the rules.
I/pyjni   ( 3002): Although practicality beats purity.
I/pyjni   ( 3002): Errors should never pass silently.
I/pyjni   ( 3002): Unless explicitly silenced.
I/pyjni   ( 3002): In the face of ambiguity, refuse the temptation to guess.
I/pyjni   ( 3002): There should be one-- and preferably only one --obvious way to do it.
I/pyjni   ( 3002): Although that way may not be obvious at first unless you're Dutch.
I/pyjni   ( 3002): Now is better than never.
I/pyjni   ( 3002): Although never is often better than *right* now.
I/pyjni   ( 3002): If the implementation is hard to explain, it's a bad idea.
I/pyjni   ( 3002): If the implementation is easy to explain, it may be a good idea.
I/pyjni   ( 3002): Namespaces are one honking great idea -- let's do more of those!

Just consider that, in boot.py, we have to manually add site-packages to the system path, since it does not get added automatically when we set the environment variables (although I still am not using that folder). And, as you can see, import this correctly prints the Zen of Python.

To include other modules from the standard library, you just need to copy them to your assets/lib/python2.7 folder, either as .py, or preferably, as .pyc or .pyo, because .py files get compiled to .pyc and will occupy more space. As I said previously, you can find the compiled standard library modules in python-for-android/dist/default/private/lib/python2.7/.

As a final note, as can be seen on the previous logcat, one of the sys.path locations is '/data/data/com.example.pytest/files/lib/python27.zip'. This means that, if you prefer to save some space, you can copy your modules and compress them to the root of a assets/lib/python27.zip file.

In the next article, I will show how you can add external modules (essentially, compiled ones) to your embedded distribution.