Buffer protocol

Python supports an extremely general and convenient approach for exchanging data between plugin libraries. Types can expose a buffer view [1], which provides fast direct access to the raw internal data representation. Suppose we want to bind the following simplistic Matrix class:

class Matrix {
    Matrix(size_t rows, size_t cols) : m_rows(rows), m_cols(cols) {
        m_data = new float[rows*cols];
    float *data() { return m_data; }
    size_t rows() const { return m_rows; }
    size_t cols() const { return m_cols; }
    size_t m_rows, m_cols;
    float *m_data;

The following binding code exposes the Matrix contents as a buffer object, making it possible to cast Matrices into NumPy arrays. It is even possible to completely avoid copy operations with Python expressions like np.array(matrix_instance, copy = False).

py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
   .def_buffer([](Matrix &m) -> py::buffer_info {
        return py::buffer_info(
  ,                               /* Pointer to buffer */
            sizeof(float),                          /* Size of one scalar */
            py::format_descriptor<float>::format(), /* Python struct-style format descriptor */
            2,                                      /* Number of dimensions */
            { m.rows(), m.cols() },                 /* Buffer dimensions */
            { sizeof(float) * m.rows(),             /* Strides (in bytes) for each index */
              sizeof(float) }

Supporting the buffer protocol in a new type involves specifying the special py::buffer_protocol() tag in the py::class_ constructor and calling the def_buffer() method with a lambda function that creates a py::buffer_info description record on demand describing a given matrix instance. The contents of py::buffer_info mirror the Python buffer protocol specification.

struct buffer_info {
    void *ptr;
    size_t itemsize;
    std::string format;
    int ndim;
    std::vector<size_t> shape;
    std::vector<size_t> strides;

To create a C++ function that can take a Python buffer object as an argument, simply use the type py::buffer as one of its arguments. Buffers can exist in a great variety of configurations, hence some safety checks are usually necessary in the function body. Below, you can see an basic example on how to define a custom constructor for the Eigen double precision matrix (Eigen::MatrixXd) type, which supports initialization from compatible buffer objects (e.g. a NumPy matrix).

/* Bind MatrixXd (or some other Eigen type) to Python */
typedef Eigen::MatrixXd Matrix;

typedef Matrix::Scalar Scalar;
constexpr bool rowMajor = Matrix::Flags & Eigen::RowMajorBit;

py::class_<Matrix>(m, "Matrix", py::buffer_protocol())
    .def("__init__", [](Matrix &m, py::buffer b) {
        typedef Eigen::Stride<Eigen::Dynamic, Eigen::Dynamic> Strides;

        /* Request a buffer descriptor from Python */
        py::buffer_info info = b.request();

        /* Some sanity checks ... */
        if (info.format != py::format_descriptor<Scalar>::format())
            throw std::runtime_error("Incompatible format: expected a double array!");

        if (info.ndim != 2)
            throw std::runtime_error("Incompatible buffer dimension!");

        auto strides = Strides(
            info.strides[rowMajor ? 0 : 1] / sizeof(Scalar),
            info.strides[rowMajor ? 1 : 0] / sizeof(Scalar));

        auto map = Eigen::Map<Matrix, 0, Strides>(
            static_cat<Scalar *>(info.ptr), info.shape[0], info.shape[1], strides);

        new (&m) Matrix(map);

For reference, the def_buffer() call for this Eigen data type should look as follows:

.def_buffer([](Matrix &m) -> py::buffer_info {
    return py::buffer_info(,                /* Pointer to buffer */
        sizeof(Scalar),          /* Size of one scalar */
        /* Python struct-style format descriptor */
        /* Number of dimensions */
        /* Buffer dimensions */
        { (size_t) m.rows(),
          (size_t) m.cols() },
        /* Strides (in bytes) for each index */
        { sizeof(Scalar) * (rowMajor ? m.cols() : 1),
          sizeof(Scalar) * (rowMajor ? 1 : m.rows()) }

For a much easier approach of binding Eigen types (although with some limitations), refer to the section on Eigen.

See also

The file tests/test_buffers.cpp contains a complete example that demonstrates using the buffer protocol with pybind11 in more detail.



By exchanging py::buffer with py::array in the above snippet, we can restrict the function so that it only accepts NumPy arrays (rather than any type of Python object satisfying the buffer protocol).

In many situations, we want to define a function which only accepts a NumPy array of a certain data type. This is possible via the py::array_t<T> template. For instance, the following function requires the argument to be a NumPy array containing double precision values.

void f(py::array_t<double> array);

When it is invoked with a different type (e.g. an integer or a list of integers), the binding code will attempt to cast the input into a NumPy array of the requested type. Note that this feature requires the pybind11/numpy.h header to be included.

Data in NumPy arrays is not guaranteed to packed in a dense manner; furthermore, entries can be separated by arbitrary column and row strides. Sometimes, it can be useful to require a function to only accept dense arrays using either the C (row-major) or Fortran (column-major) ordering. This can be accomplished via a second template argument with values py::array::c_style or py::array::f_style.

void f(py::array_t<double, py::array::c_style | py::array::forcecast> array);

The py::array::forcecast argument is the default value of the second template parameter, and it ensures that non-conforming arguments are converted into an array satisfying the specified requirements instead of trying the next function overload.

Structured types

In order for py::array_t to work with structured (record) types, we first need to register the memory layout of the type. This can be done via PYBIND11_NUMPY_DTYPE macro, called in the plugin definition code, which expects the type followed by field names:

struct A {
    int x;
    double y;

struct B {
    int z;
    A a;

// ...
    // ...

    PYBIND11_NUMPY_DTYPE(A, x, y);
    PYBIND11_NUMPY_DTYPE(B, z, a);
    /* now both A and B can be used as template arguments to py::array_t */

Vectorizing functions

Suppose we want to bind a function with the following signature to Python so that it can process arbitrary NumPy array arguments (vectors, matrices, general N-D arrays) in addition to its normal arguments:

double my_func(int x, float y, double z);

After including the pybind11/numpy.h header, this is extremely simple:

m.def("vectorized_func", py::vectorize(my_func));

Invoking the function like below causes 4 calls to be made to my_func with each of the array elements. The significant advantage of this compared to solutions like numpy.vectorize() is that the loop over the elements runs entirely on the C++ side and can be crunched down into a tight, optimized loop by the compiler. The result is returned as a NumPy array of type numpy.dtype.float64.

>>> x = np.array([[1, 3],[5, 7]])
>>> y = np.array([[2, 4],[6, 8]])
>>> z = 3
>>> result = vectorized_func(x, y, z)

The scalar argument z is transparently replicated 4 times. The input arrays x and y are automatically converted into the right types (they are of type numpy.dtype.int64 but need to be numpy.dtype.int32 and numpy.dtype.float32, respectively)

Sometimes we might want to explicitly exclude an argument from the vectorization because it makes little sense to wrap it in a NumPy array. For instance, suppose the function signature was

double my_func(int x, float y, my_custom_type *z);

This can be done with a stateful Lambda closure:

// Vectorize a lambda function with a capture object (e.g. to exclude some arguments from the vectorization)
    [](py::array_t<int> x, py::array_t<float> y, my_custom_type *z) {
        auto stateful_closure = [z](int x, float y) { return my_func(x, y, z); };
        return py::vectorize(stateful_closure)(x, y);

In cases where the computation is too complicated to be reduced to vectorize, it will be necessary to create and access the buffer contents manually. The following snippet contains a complete example that shows how this works (the code is somewhat contrived, since it could have been done more simply using vectorize).

#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

py::array_t<double> add_arrays(py::array_t<double> input1, py::array_t<double> input2) {
    auto buf1 = input1.request(), buf2 = input2.request();

    if (buf1.ndim != 1 || buf2.ndim != 1)
        throw std::runtime_error("Number of dimensions must be one");

    if (buf1.size != buf2.size)
        throw std::runtime_error("Input shapes must match");

    /* No pointer is passed, so NumPy will allocate the buffer */
    auto result = py::array_t<double>(buf1.size);

    auto buf3 = result.request();

    double *ptr1 = (double *) buf1.ptr,
           *ptr2 = (double *) buf2.ptr,
           *ptr3 = (double *) buf3.ptr;

    for (size_t idx = 0; idx < buf1.shape[0]; idx++)
        ptr3[idx] = ptr1[idx] + ptr2[idx];

    return result;

    py::module m("test");
    m.def("add_arrays", &add_arrays, "Add two NumPy arrays");
    return m.ptr();

See also

The file tests/test_numpy_vectorize.cpp contains a complete example that demonstrates using vectorize() in more detail.

Direct access

For performance reasons, particularly when dealing with very large arrays, it is often desirable to directly access array elements without internal checking of dimensions and bounds on every access when indices are known to be already valid. To avoid such checks, the array class and array_t<T> template class offer an unchecked proxy object that can be used for this unchecked access through the unchecked<N> and mutable_unchecked<N> methods, where N gives the required dimensionality of the array:

m.def("sum_3d", [](py::array_t<double> x) {
    auto r = x.unchecked<3>(); // x must have ndim = 3; can be non-writeable
    double sum = 0;
    for (size_t i = 0; i < r.shape(0); i++)
        for (size_t j = 0; j < r.shape(1); j++)
            for (size_t k = 0; k < r.shape(2); k++)
                sum += r(i, j, k);
    return sum;
m.def("increment_3d", [](py::array_t<double> x) {
    auto r = x.mutable_unchecked<3>(); // Will throw if ndim != 3 or flags.writeable is false
    for (size_t i = 0; i < r.shape(0); i++)
        for (size_t j = 0; j < r.shape(1); j++)
            for (size_t k = 0; k < r.shape(2); k++)
                r(i, j, k) += 1.0;
}, py::arg().noconvert());

To obtain the proxy from an array object, you must specify both the data type and number of dimensions as template arguments, such as auto r = myarray.mutable_unchecked<float, 2>().

If the number of dimensions is not known at compile time, you can omit the dimensions template parameter (i.e. calling arr_t.unchecked() or arr.unchecked<T>(). This will give you a proxy object that works in the same way, but results in less optimizable code and thus a small efficiency loss in tight loops.

Note that the returned proxy object directly references the array’s data, and only reads its shape, strides, and writeable flag when constructed. You must take care to ensure that the referenced array is not destroyed or reshaped for the duration of the returned object, typically by limiting the scope of the returned instance.

The returned proxy object supports some of the same methods as py::array so that it can be used as a drop-in replacement for some existing, index-checked uses of py::array:

  • r.ndim() returns the number of dimensions
  •, 2, ...) and r.mutable_data(1, 2, ...)` returns a pointer to the const T or T data, respectively, at the given indices. The latter is only available to proxies obtained via a.mutable_unchecked().
  • itemsize() returns the size of an item in bytes, i.e. sizeof(T).
  • ndim() returns the number of dimensions.
  • shape(n) returns the size of dimension n
  • size() returns the total number of elements (i.e. the product of the shapes).
  • nbytes() returns the number of bytes used by the referenced elements (i.e. itemsize() times size()).

See also

The file tests/test_numpy_array.cpp contains additional examples demonstrating the use of this feature.