Background
Foreign function interface
Many operating systems expose their system calls primarily through a C API. That is a simplification, but it is a reasonable working assumption, because we almost always interact with the operating system through a C API, directly or indirectly. Have you ever wondered how non-C languages manage to interact with the operating system? Before answering that question, we need some background on the foreign function interface (FFI).
The term FFI refers to the language features that let code written in one language call code written in another. Many non-C languages interact with the operating system through their FFI. Many languages refer to their FFI as ‘language bindings’, such as Python bindings. Some languages have their own terminology, such as Java’s JNI. With an FFI, languages like Python can easily make system calls.
Python bindings
Python bindings allow you to call C APIs from pure Python or to run Python scripts from a C program.
There are two basic ways to implement Python bindings.
- The first way is to use ctypes, a foreign function library in the Python standard library.
- The second way is to use the Python/C API, the C interface provided by CPython.
We can also use third-party tools and libraries to make our lives easier.
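As a rough illustration of the first approach, the sketch below uses ctypes to call strlen from the C standard library. It assumes a POSIX-like system where the C library can be located (or is already loaded into the process); the helper name c_strlen is made up for this example.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. If find_library() returns None,
# CDLL(None) on Linux and macOS exposes the symbols already loaded into
# the current process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Hypothetical helper: length in bytes of text, as computed by C."""
    return libc.strlen(text.encode("utf-8"))


if __name__ == "__main__":
    print(c_strlen("hello"))
```

Declaring argtypes and restype up front lets ctypes check and convert the arguments instead of guessing.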
Create Python bindings via Cython
Introduction
Cython is a programming language that makes writing C extensions for Python as easy as writing Python itself. It comes with a compiler that translates Python and Cython code to C. Because Cython is compatible with Python and compiles to C, it lets you write Python-like code that controls the GIL manually. It is also very convenient to use C/C++ libraries from Cython. You can install Cython with pip, for example by running pip install Cython.
Basic usages of Cython
The simplest example
Almost any valid Python code is also valid Cython code, so the simplest example is plain Python. Save the following code in hello.pyx.
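A minimal sketch of what hello.pyx could contain, assuming a single hypothetical function named say_hello. Because it is plain Python, it runs unchanged under the regular interpreter and can also be compiled by Cython.

```python
# hello.pyx (stand-in content; plain Python is accepted by Cython as-is)

def say_hello(name):
    """Return a greeting for the given name."""
    return "Hello, {}!".format(name)
```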
Now we can compile and build our extension. Save the following code in setup.py.
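A minimal setup.py for this step might look like the sketch below. It assumes setuptools and Cython are installed and that hello.pyx sits next to setup.py; cythonize is Cython's helper for turning pyx files into extension modules.

```python
# setup.py (sketch; assumes hello.pyx is in the same directory)
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="hello",
    # cythonize() translates hello.pyx to C and returns the Extension list
    ext_modules=cythonize("hello.pyx"),
)
```

A script like this is typically run as python setup.py build_ext --inplace, which places the compiled module next to the source file.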
Use the following command to build the Cython file. Because we did not install the module, we can only import it from the directory that contains setup.py.
We can use this Cython module now! Just open the Python interpreter and import it as if it were a regular Python module.
Using static types
In Cython, we can use static types to improve performance in some cases. The cdef statement declares statically typed variables. Besides the basic C types, we can also declare structs, unions, enums, and pointers to them.
Function definitions
There are three ways to define a function in Cython: def, cdef, and cpdef. Functions defined with def are Python functions. They take Python objects as parameters and return Python objects. Functions defined with cdef are C functions. They take either Python objects or C values as parameters and can return either Python objects or C values. Functions defined with cpdef are hybrids: they can be called from Python like def functions, but they use the faster C calling convention when called from other Cython code. Be careful: only functions defined with def or cpdef can be called from Python code. The following table shows their supported features.
Statement | Python object parameters/return values | C value parameters/return values | Callable from Python |
---|---|---|---|
def | √ | × | √ |
cdef | √ | √ | × |
cpdef | √ | √ | √ |
The following code block shows the usage of these statements. When you need to call a C function from Python, you can use def or cpdef to write a wrapper function to call that C function.
Call C standard library
The Cython package ships predefined modules (pxd files) for libc, libcpp, and posix. The complete list is in the Cython/Includes directory of the Cython source tree. The following code block shows how to use these predefined modules.
The libc math library is special in that it is not linked by default on some Unix-like systems, so we need to configure the build to link against the shared library m.
Build the module with the following command.
Build with C code
In this example, we will implement two functions in C. The first, fibo, calculates the Nth Fibonacci number. The second, calcDistance, calculates the distance between two points.
The C code
The point and distance are defined in the header file foolib.h and the two functions are implemented in the source file foolib.c.
Here are the header file and source file of foolib.
Then we compile the source code into a shared library with the following commands.
In the first line, we compile the source code into position-independent code (PIC) and get an object file named foolib.o. In the second line, we turn this object file into a shared library. Be careful: the shared library has to be named in the format lib{name}.so (its so-name). So, in this example, we need to name the shared library libfoolib.so. Both PIC and the so-name are described in the Explanation section at the end.
The Cython code
Now we write the Cython code. We first create a file called pfoolib.pxd. In the pxd file, we reference the foolib.h header file and declare its struct and functions for Cython. Python code can’t call the functions or use the structures declared in the pxd file, so we need wrapper functions that call these C functions and wrapper classes that Python code can use. We define the wrapper functions and classes in pfoolib.pyx. We can think of the pxd file as the header file of the pyx file, except that here it only declares the things implemented in C. The following code blocks are the pxd file and pyx file of pfoolib.
In the pxd file, we only need to declare the fields that we actually use. If we don’t need any field of the struct, we can just put pass in the body of the struct declaration.
In the pyx file, there are a few things we need to be careful with.
- The class name needs to be different from the name declared in the pxd file. The official documentation uses the same name, but when I did that I got a redeclaration error. A likely cause is that pfoolib.pxd shares its base name with pfoolib.pyx, so Cython treats it as the module’s own declaration file and both names end up in the same namespace.
- Because name is a char array, we have to use strcpy to copy the string into it. If you just use assignment, you will get a runtime index error. If you define name as char*, you can use assignment directly.
- The bytes type in Python 3 corresponds to the C string, so we need encode and decode to convert between str and bytes.
- The Cython compiler generates a C file with the same base name for every pyx file. If your C files and pyx files are in the same directory, you’d better give them different names.
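The string handling in the second and third points can be tried out in plain Python. The sketch below uses a ctypes string buffer as a stand-in for a fixed-size C char array; the 16-byte size and the helper names are assumptions for illustration and are not taken from foolib.h.

```python
import ctypes

NAME_SIZE = 16  # assumed capacity of the C field, e.g. `char name[16]`


def to_c_name(text: str) -> ctypes.Array:
    """Encode a Python str and copy it into a fixed-size char buffer."""
    raw = text.encode("utf-8")        # str -> bytes, the C-compatible form
    if len(raw) >= NAME_SIZE:         # leave room for the trailing NUL byte
        raise ValueError("name too long for the buffer")
    buf = ctypes.create_string_buffer(NAME_SIZE)
    buf.value = raw                   # byte-wise copy, similar in spirit to strcpy
    return buf


def from_c_name(buf: ctypes.Array) -> str:
    """Read the NUL-terminated bytes back and decode them to str."""
    return buf.value.decode("utf-8")
```

The explicit length check mirrors what a careful wrapper should do before calling strcpy, since strcpy itself does not check the destination size.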
The following file is the setup file of our module. We can build and install our module with this file. Pay attention to the libraries field of the Extension class: it tells the build which libraries to link against. Here we pass foolib, so the linker looks for libfoolib.so at link time.
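A setup file along these lines could look like the sketch below. It assumes setuptools and Cython are installed and that libfoolib.so was built in the current directory; the library_dirs and runtime_library_dirs values are assumptions that depend on where the shared library actually lives.

```python
# setup.py (sketch; assumes libfoolib.so sits in the current directory)
from setuptools import Extension, setup
from Cython.Build import cythonize

extensions = [
    Extension(
        name="pfoolib",
        sources=["pfoolib.pyx"],
        libraries=["foolib"],        # passed to the linker as -lfoolib
        library_dirs=["."],          # assumption: where to find it at link time
        runtime_library_dirs=["."],  # assumption: where the loader finds it at run time
    )
]

setup(name="pfoolib", ext_modules=cythonize(extensions))
```

If the library is installed in a standard system location instead, the two directory options can usually be dropped.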
Now the work is almost done. Run the following command to build the Cython module.
We can use the foolib shared library from Python now. In the build directory (or anywhere, if you installed the module), start python in a terminal and try it.
The makefile
We can write a simple Makefile to avoid typing the long shell commands. The following code block shows the content of the Makefile.
Now, we can simply use the make command to compile, build and clean.
Python callback functions
Sometimes we need to pass callbacks to our library. This section gives a simple example of how to pass a Python callback function to a C library. In this example, we implement a function that filters the elements of an array. The function takes an array and a predicate function as input.
The C code
The following code blocks show the header file and source file.
In the header file, we declare the predicate type for callback functions. The predicate has a void* parameter that receives the context. The context carries the real Python predicate function. Without a void* parameter, we could only use callback functions that need no context.
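The role of the void* context can be tried out without compiling anything. The sketch below uses ctypes instead of Cython (a deliberate substitution, not the code of this post): a C-compatible callback receives the address of a Python callable as its context and calls it. The predicate signature int (*)(int, void*) and every name here are assumptions, and using id() as an object address is specific to CPython.

```python
import ctypes

# Assumed C-side type: typedef int (*predicate)(int value, void *context);
PREDICATE = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_void_p)


def _trampoline(value, context):
    # `context` arrives as an integer address; turn it back into the
    # Python callable that was passed in and ask it about this element.
    py_predicate = ctypes.cast(context, ctypes.py_object).value
    return 1 if py_predicate(value) else 0


# C-callable function pointer wrapping the trampoline.
c_callback = PREDICATE(_trampoline)


def filter_ints(values, py_predicate):
    """Stand-in for the C filter: invokes the callback once per element."""
    src = (ctypes.c_int * len(values))(*values)  # fixed-length C int array
    # CPython detail: id() is the object's address. py_predicate stays
    # alive for the whole call because it is a local parameter.
    context = ctypes.c_void_p(id(py_predicate))
    return [v for v in src if c_callback(v, context)]


if __name__ == "__main__":
    print(filter_ints([1, 2, 3, 4, 5, 6], lambda v: v % 2 == 0))
```

The single trampoline serves every Python predicate, because the predicate travels in the context rather than in the function pointer.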
The Cython code
The following code blocks show the pxd file and pyx file.
cCallBack is the callback function that we pass to the filter function. It retrieves the Python callback function from the void* parameter and calls it to decide whether to keep an element of the array. The filter function takes an integer array, which in C is equivalent to a pointer. Cython can’t automatically convert a Python list to a C array unless the array has a fixed length. So we need to malloc an array on the heap, copy the elements into it, and free it after use.
The setup script and build command are the same as the previous example. Now, try to use this filter function.
Conclusion
The examples in this post only show the basic usage of Cython, but they cover many of the situations that come up when integrating Python code with C code. In the future, I’ll introduce more advanced usage of Cython.
Explanation
- What is position-independent code (PIC)? PIC is code that works no matter where in memory it is placed. Because several different programs can all use one instance of your shared library, the library cannot store things at fixed addresses, since the location of that library in memory will vary from program to program. [2]
- Every shared library on a Unix-like system has a ‘so-name’. By default, the so-name needs a lib prefix and a .so suffix, so the so-name of the foo library is libfoo.so. When you pass the ‘-lfoo’ option to GCC, the linker looks for libfoo.so at link time. Some basic C libraries are not subject to this constraint.