Set up#
Install and set up Git#
The Arrow project uses Git for version control. Git is readily available for all common operating systems.
To install it, follow the quickstart instructions on GitHub, where the Arrow repository is hosted.
When Git is set up, do not forget to configure your name and email
$ git config --global user.name "Your Name"
$ git config --global user.email your.email@example.com
and authenticate with GitHub, so that you can interact with GitHub without typing a username and password each time you execute a Git command.
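To confirm that the configuration took effect, a small check along these lines may help. This is a minimal sketch: the function name `check_git_identity` is a hypothetical helper, not part of Git or Arrow, and it relies only on `git config --get`, which exits with a non-zero status when a key is unset.

```shell
# Hypothetical helper: report whether user.name and user.email are configured.
check_git_identity() {
    rc=0
    for key in user.name user.email; do
        # `git config --get` prints the value, or fails if the key is unset.
        if value=$(git config --get "$key"); then
            printf '%s = %s\n' "$key" "$value"
        else
            printf '%s is not set\n' "$key" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Calling `check_git_identity` after the two `git config` commands above should print both values and return a zero exit status.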
Note
This guide assumes you are comfortable working from the command line. Some IDEs allow you to manage a Git repository, but may implicitly run unwanted operations when doing so (such as creating project files).
For example, cloning the repository in RStudio treats the whole repository as an RStudio project and creates a .Rproj file in the root directory.
For this reason, it is highly recommended to clone the repository using the command line or a Git client.
Get the source code#
Fork the repository#
The Arrow GitHub repository contains the Arrow C++ library as well as libraries for other languages, such as Go, Java, MATLAB, Python, and R. The first step to contributing is to create a fork of the repository in your own GitHub account.
Go to apache/arrow.
Press Fork in the top right corner.
Choose to fork the repository to your username so that the fork is created at https://github.com/<your username>/arrow.
Clone the repository#
Next, clone your fork of the repository
$ git clone https://github.com/<your username>/arrow.git
and add the Apache Arrow repository as a remote called upstream.
$ cd arrow
$ git remote add upstream https://github.com/apache/arrow
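If the clone may already have an upstream remote (for example, when the setup is repeated), `git remote add` fails with an error. The sketch below shows one way to make the step repeatable. The function name `add_upstream` is hypothetical, and it assumes a Git version that provides `git remote get-url` (2.7 or later).

```shell
# Hypothetical helper: create the upstream remote, or repoint it if it exists.
add_upstream() {
    # Default to the official repository URL used in this guide.
    url=${1:-https://github.com/apache/arrow}
    if git remote get-url upstream >/dev/null 2>&1; then
        # The remote already exists, so update its URL instead of adding it.
        git remote set-url upstream "$url"
    else
        git remote add upstream "$url"
    fi
}
```

Run `add_upstream` from inside the arrow directory. Running it a second time leaves a single upstream remote in place.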
Verify your upstream#
Your upstream should be pointing at the Arrow GitHub repo.
Run the following in the shell:
$ git remote -v
This should give you a result similar to this:
origin https://github.com/<your username>/arrow.git (fetch)
origin https://github.com/<your username>/arrow.git (push)
upstream https://github.com/apache/arrow (fetch)
upstream https://github.com/apache/arrow (push)
If you did everything correctly, you should now have a copy of the code in the arrow directory and two remotes that refer to your own GitHub fork (origin) and the official Arrow repository (upstream).
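The same check can be scripted rather than read off the `git remote -v` output by eye. The following is a sketch under stated assumptions: `verify_remotes` is a hypothetical helper name, and the URL patterns assume GitHub HTTPS or SSH URLs with or without a trailing `.git`.

```shell
# Hypothetical helper: check that origin is a fork and upstream is apache/arrow.
verify_remotes() {
    origin_url=$(git remote get-url origin 2>/dev/null) || {
        echo "missing remote: origin" >&2
        return 1
    }
    upstream_url=$(git remote get-url upstream 2>/dev/null) || {
        echo "missing remote: upstream" >&2
        return 1
    }
    # upstream must point at the official repository.
    case "$upstream_url" in
        *github.com[:/]apache/arrow|*github.com[:/]apache/arrow.git) ;;
        *)
            echo "upstream does not point at apache/arrow: $upstream_url" >&2
            return 1
            ;;
    esac
    # origin must point somewhere else, normally your own fork.
    case "$origin_url" in
        *github.com[:/]apache/arrow|*github.com[:/]apache/arrow.git)
            echo "origin points at apache/arrow, not at your fork" >&2
            return 1
            ;;
    esac
    echo "origin:   $origin_url"
    echo "upstream: $upstream_url"
}
```

Run `verify_remotes` from inside the arrow directory. A non-zero exit status means that one of the remotes is missing or points at the wrong repository.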