Hugging Face Spaces with Jupyter Notebooks and GPU

Also known as the poor man's Google Colab alternative

This article could also be titled: how to have a good time exploring LLMs without going broke!

If you have ever scratched the surface of “modern” AI (mostly generative stuff), you have surely noticed that, sooner or later, either a gigantic CPU or a decent GPU is needed, even just to run model inference without dying of boredom while waiting for your results.

There are many (too many!!) vendors out there, and they all want your money, badly! Among this bunch of thieves (sorry, providers), some are more affordable. For example, it is really difficult to find a better solution than Google Colab: the Pro plan (12 EUR/month) gives you a nice allowance of GPU and CPU minutes, and you choose how to use them!

Google Colab is great, but even the Pro version has some serious limitations. For example, I find it annoying that:

  • you cannot have background processes without the Web UI fully loaded in your browser and actively used, as the instance gets closed after a period of user inactivity
  • you cannot easily SSH into the system (damn, there are workarounds but… seriously, are we here to experiment with networking?)
  • GPU usage is costly (from a few cents to a few euros per hour, which can be a lot for what you get!)
  • storage is ephemeral (although you can save notebooks and some files to Google Drive)
  • you cannot version your notebooks, unless you use additional workarounds such as mounting Google Drive, using git, etc.

Nevertheless, Colab is extremely useful and powerful. I highly recommend it if you need specific hardware and have clear objectives!

Why Hugging Face instead of Google Colab?

Hugging Face (https://huggingface.co/) is a grown-up startup which in August 2023 announced a funding round at a 4.5 billion USD valuation, and these guys are definitely investing back, giving their users good services at a reasonable price.
This article focuses on their Pro account, which costs 9 USD (not EUR!!) per month:

If you look closely at the picture, you will read a nice and mysterious sentence:

Dev mode for spaces

The Dev mode for Spaces is a beta feature which allows users to SSH into a running container and troubleshoot the deployed code in place.
In a few words:

We can run a container with a Python interpreter on an HF server, SSH into it, connect a Visual Studio Code instance and start running custom code.

The ZeroGPU project

Recently HF made it possible to run Gradio applications (in practice, Python code) on a shared GPU.
This is a very interesting and promising feature: when experimenting with LLMs we do not need a GPU reserved 100% of the time, only while we execute something on it!
This is actually one of Colab's downsides: once you turn on an instance with, for example, an A100 GPU, you pay from the first second until the moment you turn it off.

My prediction is that shared GPU environments will be the future.

Reserving a GPU can sometimes be a waste of money, especially in interactive environments, which push you towards a “trial and error” approach.
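
To make the "GPU only while something runs" idea concrete, here is a minimal sketch. ZeroGPU Spaces expose the shared GPU through the `spaces.GPU` decorator of the `spaces` package. The no-op fallback decorator, the `device_report` helper name and the optional use of PyTorch are my own assumptions for illustration, and I have not verified that the decorator behaves the same from a notebook kernel as it does from app.py.

```python
# Sketch: ask for the shared GPU only while one function is running.
# Assumption: the `spaces` package is present only inside a ZeroGPU Space,
# so a no-op fallback keeps this snippet runnable everywhere else.
try:
    import spaces
    gpu = spaces.GPU
except ImportError:
    def gpu(fn):
        """Fallback decorator (hypothetical): run the function unchanged."""
        return fn


@gpu
def device_report():
    """Report which device PyTorch sees inside the decorated call."""
    try:
        import torch  # assumption: PyTorch may or may not be installed
    except ImportError:
        return "torch not installed"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(device_report())
```

Outside the decorated function no GPU is held, which is the difference from a reserved Colab instance.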

Dev mode and ZeroGPU, an example step by step

We need a new space:

The configuration is trivial. After choosing a name:

  1. Gradio is the only type of Space which supports ZeroGPU (for now!)
  2. The blank template is enough
  3. ZeroGPU must be selected
  4. The Dev mode can also be turned on later, although it makes no sense to not enable it now!
  5. This is not a public Gradio application, therefore the space should be private.

The HF Pro account has a number of limitations. For example, you can have only 10 Spaces running on ZeroGPU, but we just need one for our Colab alternative!

When the space is created, the browser will show the default Gradio example application:

The Dev mode is the most important part, as well as the fact that HF is telling you that ZeroGPU hardware is enabled! You can find more information about the Dev mode here: https://huggingface.co/spaces/dev-mode-explorers/README

Open the container in Visual Studio Code

The beauty of VSCode is that you can run it in your browser!
Actually, the experience you get is very similar to Google Colab (in my opinion it is even much better!).

Clicking the run button will open a new instance of VSCode (in another tab of your browser).
The Space filesystem only contains the demo Gradio stuff: one app.py Python file, a readme.md and a few other git files which you’d better not touch!

(my instance is not vanilla, that’s why you see also other files!)

Install the Python and Jupyter extensions in VSCode

This article assumes that you already know how to run notebooks in VSCode. If not, search on Google how to do it - it is pretty easy!

The container filesystem is ephemeral

The Space container is ephemeral. When it is created, HF just pulls your repository HEAD, and any change you make from VSCode is not automatically persisted unless you commit and push it back.

From VSCode you can run Git commands in the integrated console, or you can push using the integrated Git features in the UI.

If you wish to persist changes made while Dev Mode is enabled, you need to use git from inside the Space container (using VS Code or SSH). For example:

# Add changes and commit them
git add .
git commit -m "Persist changes from Dev Mode"

# Push the commit to persist them in the repo
git push
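
If you would rather run these same three commands from a notebook cell, a small wrapper around `subprocess` is enough. This is only a sketch: the `persist_commands` and `persist` names are hypothetical helpers of mine, not part of any HF tooling, and the commit message is just the one used above.

```python
import subprocess


def persist_commands(message):
    """Return the git commands shown above as argument lists."""
    return [
        ["git", "add", "."],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


def persist(message, cwd="."):
    """Run add, commit and push in order, stopping at the first failure.

    Note: `git commit` exits with a non-zero status when there is nothing
    to commit, so in that case this raises subprocess.CalledProcessError.
    """
    for cmd in persist_commands(message):
        subprocess.run(cmd, cwd=cwd, check=True)


# Example call, left commented out so nothing is pushed by accident:
# persist("Persist changes from Dev Mode")
```

Keeping the command list separate from the function that executes it makes it easy to print the commands and check them before anything touches the repository.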

We need a virtual-env

The user logged into the Space is clearly not a superuser; moreover, messing with the container's Python distribution is not very convenient.
My suggestion is to start your journey by creating a Python virtual environment.
This can be easily achieved by opening the console:

And typing:

python -m venv .venv

At this point VSCode will also detect that you are working in a virtual environment and will offer to switch the Python interpreter to it (confirm the operation in the popup).

The virtual env can also be activated in the console:

source .venv/bin/activate

Once all the steps are done correctly, the visual feedback should look similar to this:

From the virtual env we can pip install as many packages as we want.
A good way to keep these operations simple and grouped is to create a new init.sh file in your Space root:

#!/bin/bash
python -m venv .venv
.venv/bin/pip install ipykernel ipywidgets

(remember to chmod +x init.sh)

Every time the container is restarted, this init file can be launched from the VSCode console.
You can also list all the needed Python packages in a requirements.txt file and ask pip to install them, but I actually prefer to have !pip in my notebook!
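
The same bootstrap can be sketched in Python with the standard-library `venv` module, which is handy if you want it to skip the creation step when the environment already exists. The helper names (`venv_python`, `ensure_venv`, `pip_install_cmd`) are hypothetical and only illustrate the idea; the package list is the one from init.sh.

```python
import os
import venv
from pathlib import Path


def venv_python(env_dir):
    """Return the path of the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def ensure_venv(env_dir=".venv", with_pip=True):
    """Create the virtual environment only if it is not there yet."""
    env_dir = Path(env_dir)
    if not (env_dir / "pyvenv.cfg").exists():
        venv.create(env_dir, with_pip=with_pip)
    return venv_python(env_dir)


def pip_install_cmd(python, packages):
    """Build a pip command that targets the given interpreter."""
    return [str(python), "-m", "pip", "install", *packages]


# Example usage, commented out because it downloads packages:
# import subprocess
# py = ensure_venv()
# subprocess.check_call(pip_install_cmd(py, ["ipykernel", "ipywidgets"]))
```

Calling pip as `python -m pip` through the venv interpreter guarantees that the packages land in the virtual environment, whichever shell or kernel launched the command.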

Create a new notebook

Using the VSCode Command Palette (Ctrl + Shift + P) we can create a new Jupyter Notebook:

Each notebook must have a kernel selected:

This is the final result:

If VSCode detects that your venv does not have a kernel installed, it will offer to install one for you!!

Test out the whole system:
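
A first notebook cell for this check could look like the sketch below. It only confirms that the kernel runs from the virtual environment and that the notebook packages are importable; the helper names are hypothetical, and the package list assumes you installed what init.sh installs.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def missing_packages(names):
    """Return the packages from `names` that cannot be imported."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("interpreter:", sys.executable)
print("inside a venv:", in_virtualenv())
print("missing packages:", missing_packages(["ipykernel", "ipywidgets"]))
```

If the interpreter path does not point into .venv, the notebook is using the wrong kernel and should be switched with the kernel picker.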