by cedd burge
How to use GitHub as a PyPi server
I was looking for a hosted private PyPi Python Package server that used credentials the team already has (such as GitHub).
I didn’t want to create an on-premise server. For us, it would make it impossible to use cloud-based servers, and it is another moving part that can go wrong. There are also potential issues with fine-grained security and speed. (We have a worldwide team, so serving the content via a CDN would be helpful.)
I didn’t want to force the team to create accounts with another provider. They already have Active Directory and GitHub accounts. It is an annoyance for them and creates a governance burden for me.
Sadly, I couldn’t find such a service. GemFury is excellent but doesn’t support GitHub authorization (at the team / organisation level), and Packagr doesn’t support GitHub authorization at all. MyGet is also excellent and does allow me to use GitHub authorization, but it doesn’t host Python packages. Azure DevOps has something that looks promising, but it’s in private beta at the moment.
Happily, this is possible using cloud Git repositories such as GitHub, GitLab and BitBucket.
Pip can install packages from a Git repo
I have hosted a Python package on GitHub (python_world), which you can install with the following command.
pip install git+https://github.com/ceddlyburge/python_world#egg=python_world
Pip provides options to install from head, from a branch, from a tag or from a commit. I usually tag each release and install from these tags. See the pip install documentation for full details.
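As a rough illustration of how these variants differ, the sketch below builds the requirement string that pip accepts for a Git repo, optionally pinned to a branch, tag or commit with an `@ref` suffix. The helper name `git_requirement` and the tag `v0.0.1` are hypothetical, chosen only for the example.

```python
def git_requirement(repo_url, egg_name, ref=None):
    """Build a pip requirement string for a package hosted in a Git repo.

    ref may be a branch name, a tag or a commit hash. When it is None,
    pip installs from the head of the default branch.
    """
    url = "git+" + repo_url
    if ref is not None:
        url += "@" + ref
    return url + "#egg=" + egg_name


repo = "https://github.com/ceddlyburge/python_world"

# Install from head.
print(git_requirement(repo, "python_world"))

# Install from a tag ("v0.0.1" is a hypothetical tag name).
print(git_requirement(repo, "python_world", ref="v0.0.1"))
```

The resulting string can be passed straight to `pip install` or placed on its own line in a requirements file.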
This repository is public, but it works just the same with a private repo, as long as you have permission. There is no special magic in the package (it's a vanilla Python package). Setup.py does most of the work of creating the package.
If you are new to creating Python Packages, the Packaging Python Projects tutorial is worth a quick read.
Setuptools can install dependencies from Git
Setuptools is how most people create Python packages. If it can’t find a package during installation, it will look in dependency_links. As in the example above, we can use Git as a source in dependency_links. There are some details in the Setuptools documentation. The relevant bits are below.
install_requires specifies that python_world is a required dependency and dependency_links tells Setuptools where to find it. The name and version in the #egg=python_world-0.0.1 part of the link are used to resolve the dependency (the name of the repository is the same in this case, but is not required to be).
install_requires=[
    'python_world>=0.0',
],
dependency_links=[
    'git+https://github.com/ceddlyburge/python_world#egg=python_world-0.0.1',
]
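To make the role of the `#egg=` fragment more concrete, here is a small sketch that pulls a name and version out of a dependency link. It is purely illustrative and is not how Setuptools itself parses links. The helper name `split_egg_fragment` is hypothetical, and it assumes that a version, when present, follows the last dash and starts with a digit.

```python
from urllib.parse import urldefrag


def split_egg_fragment(link):
    """Return (name, version) taken from the #egg= part of a link.

    version is None when the fragment carries a name only.
    Assumption: a version follows the last dash and starts with a digit.
    """
    _, fragment = urldefrag(link)
    if not fragment.startswith("egg="):
        raise ValueError("link has no #egg= fragment: " + link)
    egg = fragment[len("egg="):]
    name, dash, version = egg.rpartition("-")
    if dash and version[:1].isdigit():
        return name, version
    return egg, None


link = "git+https://github.com/ceddlyburge/python_world#egg=python_world-0.0.1"
print(split_egg_fragment(link))  # ('python_world', '0.0.1')
```

With this name and version, the link can satisfy the `python_world>=0.0` requirement regardless of what the repository itself is called.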
You can install this package using the command below. It will also download the dependent python_world package.
pip install git+https://github.com/ceddlyburge/python_hello#egg=python_hello --process-dependency-links
You will probably see a warning that --process-dependency-links is deprecated. This flag serves a well-known use case, so is very unlikely to be removed without an alternative being available. You can read more about it on the Pip repo. However, by the end of this article, we will have removed the need for it.
As everyone who has used Python without an environment knows, environments save a lot of frustration and wasted time. So we need to support those.
If you download the repo, you can install the dependencies into a virtualenv with the following command.
pip install -r requirements.txt --process-dependency-links
If you are using conda you can use this command:
conda env create -n use-hello-world
So far we are able to install packages from our private Git repositories. These packages can, in turn, define dependencies to other private repositories. There still isn’t a PyPi server in sight.
We could stop at this point. However, the syntax for defining dependencies is a bit mysterious, and it will be difficult for the team to discover which packages are available. We are also still using the deprecated --process-dependency-links flag.
To fix this we can set up a PyPi index that conforms to PEP 503. This specification is quite simple, so I have just created the index by hand. If this becomes too cumbersome, I can generate it from the GitHub API.
I created this PyPi Index using GitHub Pages. There are equivalent things for GitLab and BitBucket. You can see that the source code is very simple. GitHub Pages sites are always public (and there is probably no sensitive information in your index). However, if you need them to be private you can use a service such as PrivateHub.
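For a sense of how little such an index contains, here is a minimal sketch that renders the two kinds of page PEP 503 describes: a root page with one link per project, and a project page with one link per installable release. The function names are hypothetical, and the sketch assumes that project directory names are already normalised and that each release link is a Git URL of the form used earlier in this article. It is not the code behind the index above, which was written by hand.

```python
from html import escape


def render_page(anchors):
    """Wrap (href, text) pairs in a bare HTML page, one anchor per line."""
    lines = ["<!DOCTYPE html>", "<html>", "  <body>"]
    for href, text in anchors:
        lines.append(
            '    <a href="{}">{}</a>'.format(escape(href, quote=True), escape(text))
        )
    lines += ["  </body>", "</html>"]
    return "\n".join(lines) + "\n"


def render_root_index(project_dirs):
    """Root page: one link per project, pointing at its sub-directory."""
    return render_page((name + "/", name) for name in project_dirs)


def render_project_index(releases):
    """Project page: one (href, text) link per installable release."""
    return render_page(releases)


# Content for index.html at the root of the site.
print(render_root_index(["python-hello"]))

# Content for python-hello/index.html.
print(render_project_index([
    ("git+https://github.com/ceddlyburge/python_hello#egg=python_hello", "python_hello"),
]))
```

The two strings would be written to `index.html` and `python-hello/index.html` in the repository that GitHub Pages serves.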
One thing to look out for is the name normalisation of the specification. This requires the python_hello package information to be present at python-hello/index.html (note the change from an underscore to a dash).
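The normalisation rule itself is short. PEP 503 defines it as lowercasing the name and collapsing every run of dashes, underscores and dots into a single dash, which can be sketched as follows (the second input is a made-up name, used only to show the collapsing):

```python
import re


def normalize(name):
    """Normalise a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize("python_hello"))         # python-hello
print(normalize("Python.Hello__World"))  # python-hello-world
```

The directory for each package in the index should use the normalised form, whatever the package is called in setup.py.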
Now that we have a PyPi server, there is no need for --process-dependency-links. You can install packages using the command below. Again, make sure that you trust me.
pip install python_hello --extra-index-url https://ceddlyburge.github.io/python-package-server/
So that you can see this working with environments, I have created another repo (use_hello_world-from-server) that defines the python_hello dependency using this PyPi index instead of direct GitHub links. If you are trying it with Conda, version >4.4 is required.
At this point, we could go back and remove the direct Git link from dependency_links in the setup.py of python_hello (as Setuptools will be able to find the package from our server).
Using a cloud-hosted Git provider as a PyPi server is a viable option. If you are already using one, that means that you can reuse the credentials and permissions that you already have. It will work with Cloud build servers and is likely to be provided via a CDN, so will be fast worldwide. It requires more knowledge to set up than a hosted server, but probably the same or less than hosting your own server on premises.
Hints and tips
Serving the index locally can help to troubleshoot problems (such as name normalization). It’s easy to see what requests are being made. You can use the inbuilt Python HTTP server for this (python -m http.server 8000). This led me to find out that pip search uses POST requests, so won’t work with GitHub Pages.
You can run python setup.py install to check your pip packages locally, before pushing them to Git.