Cross compiling for WebAssembly with Emscripten
Prerequisites
You need CMake, compilers, and the other tools listed in the normal build instructions. Before building with Emscripten, you also need to install the Emscripten SDK (emsdk) and activate it using the commands below (see https://emscripten.org/docs/getting_started/downloads.html for details).
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
# replace <version> with the desired EMSDK version.
# e.g. for Pyodide 0.26, you need EMSDK version 3.1.58
# the versions can be found in the Makefile.envs file in the Pyodide repo:
# https://github.com/pyodide/pyodide/blob/10b484cfe427e076c929a55dc35cfff01ea8d3bc/Makefile.envs
./emsdk install <version>
./emsdk activate <version>
source ./emsdk_env.sh
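After sourcing emsdk_env.sh, it can help to confirm that the expected compiler is actually the one on PATH. The following is a minimal sketch, not part of the official instructions: extract_version is a hypothetical helper, and EXPECTED reuses the 3.1.58 example given above for Pyodide 0.26.

```shell
# Hypothetical helper: print the first x.y.z version number found on stdin.
extract_version() {
  grep -Eo '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Example value from the text above (Pyodide 0.26); adjust for your target.
EXPECTED="3.1.58"

if command -v emcc >/dev/null 2>&1; then
  ACTUAL="$(emcc --version | extract_version)"
  if [ "$ACTUAL" = "$EXPECTED" ]; then
    echo "emcc $ACTUAL is active"
  else
    echo "emcc $ACTUAL is active, expected $EXPECTED" >&2
  fi
else
  echo "emcc not found on PATH; was emsdk_env.sh sourced in this shell?" >&2
fi
```

The environment set up by emsdk_env.sh only applies to the current shell, so this check is worth repeating in any new terminal before building.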
If you want to build PyArrow for Pyodide, you also need pyodide-build installed via pip, and you must run the same version of Python that Pyodide is built for, along with the same versions of the emsdk tools.
# install Pyodide build tools.
# e.g., for version 0.26 of Pyodide, pyodide-build 0.26 and later work
pip install "pyodide-build>=0.26"
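Because the host Python has to match the Python that Pyodide is built for, a quick comparison before building can save a failed build. This is a sketch under stated assumptions: same_minor is a hypothetical helper, and the value of PYODIDE_PYTHON assumes that Pyodide 0.26 is based on Python 3.12, which you should confirm against the Pyodide release notes for your version.

```shell
# Hypothetical helper: succeed when two versions share the same major.minor.
same_minor() {
  [ "$(printf '%s' "$1" | cut -d. -f1,2)" = "$(printf '%s' "$2" | cut -d. -f1,2)" ]
}

# Assumption: Pyodide 0.26 is built on Python 3.12; verify for your release.
PYODIDE_PYTHON="3.12"

if command -v python3 >/dev/null 2>&1; then
  HOST_PYTHON="$(python3 -c 'import platform; print(platform.python_version())')"
  if same_minor "$HOST_PYTHON" "$PYODIDE_PYTHON"; then
    echo "Python $HOST_PYTHON matches Pyodide's $PYODIDE_PYTHON"
  else
    echo "Python $HOST_PYTHON does not match Pyodide's $PYODIDE_PYTHON" >&2
  fi
fi
```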
Then build with the ninja-release-emscripten CMake preset, as shown below:
emcmake cmake --preset "ninja-release-emscripten"
ninja install
This installs a static library build of libarrow into the Emscripten sysroot cache, so anything you build that depends on libarrow will find it.
For example, if you want to build for Pyodide, run the commands above, then go to arrow/python and run:
pyodide build
This should produce a wheel targeting the currently enabled version of Pyodide in the dist subdirectory.
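To check that the build produced a WebAssembly wheel rather than a native one, you can inspect the file names in dist. The sketch below assumes that Pyodide wheel platform tags end in wasm32. is_wasm_wheel is a hypothetical helper, not part of pyodide-build.

```shell
# Hypothetical helper: succeed when a file name looks like a wasm32 wheel.
# Assumption: Pyodide/Emscripten platform tags end in "wasm32".
is_wasm_wheel() {
  case "$1" in
    *wasm32.whl) return 0 ;;
    *) return 1 ;;
  esac
}

# Inspect whatever wheels ended up in dist (run from arrow/python).
for wheel in dist/*.whl; do
  [ -e "$wheel" ] || continue   # the glob did not match any file
  if is_wasm_wheel "$wheel"; then
    echo "ok: $wheel"
  else
    echo "unexpected platform tag: $wheel" >&2
  fi
done
```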
Manual Build
If you want to build manually for Emscripten, take a look at the CMakePresets.json file in the arrow/cpp directory for a list of things you will need to override. In particular, you will need:
Build dependencies set to BUNDLED, so it uses properly cross compiled build dependencies.
CMAKE_TOOLCHAIN_FILE set by using emcmake cmake instead of just cmake.
ARROW_ENABLE_THREADING set to OFF for builds targeting single-threaded Emscripten environments such as Pyodide.
ARROW_FLIGHT and anything else that uses the network probably won't work.
ARROW_JEMALLOC and ARROW_MIMALLOC probably need to be OFF as well.
ARROW_BUILD_STATIC set to ON and ARROW_BUILD_SHARED set to OFF is the combination most likely to work.
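The overrides above could be collected into a single configure command along the following lines. This is a sketch rather than the project's official invocation: arrow_emscripten_flags is a hypothetical helper, mapping "build dependencies set to BUNDLED" onto ARROW_DEPENDENCY_SOURCE is an assumption, and the real preset in CMakePresets.json may set further options. The command is only printed (dry run), so nothing is configured until you remove the leading echo.

```shell
# Hypothetical helper: print the CMake cache options from the list above,
# one per line.
arrow_emscripten_flags() {
  cat <<'EOF'
-DARROW_DEPENDENCY_SOURCE=BUNDLED
-DARROW_ENABLE_THREADING=OFF
-DARROW_FLIGHT=OFF
-DARROW_JEMALLOC=OFF
-DARROW_MIMALLOC=OFF
-DARROW_BUILD_STATIC=ON
-DARROW_BUILD_SHARED=OFF
EOF
}

# Dry run: show the configure command instead of executing it.
# Remove the leading "echo" to run it from arrow/cpp with emsdk activated;
# emcmake supplies CMAKE_TOOLCHAIN_FILE for the wrapped cmake call.
echo emcmake cmake -S . -B build -G Ninja $(arrow_emscripten_flags)
```

Keeping the flags in one function makes it easier to compare them against the preset when the Arrow version changes.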