Falcon 180B: Can It Run on Your Computer?


There is also a chat version. Both models are available on the Hugging Face hub.

Falcon 180B is completely free and state-of-the-art. But it’s also a huge model.

Can it run on your computer?

Unless your computer is ready for very intensive computing, it can’t run Falcon 180B out-of-the-box. You will need to upgrade your computer and use a quantized version of the model.

In this article, I explain how you can run Falcon-180B on consumer hardware. We will see that it can be reasonably affordable to run a 180 billion parameter model on a modern computer. I also discuss several techniques that help reduce the hardware requirements.

The first thing you need to know is that Falcon 180B has 180 billion parameters stored as bfloat16. A bfloat16 parameter, like a float16 one, occupies 2 bytes in memory.

When you load a model, the standard PyTorch pipeline works like this:

  1. An empty model is created: 180B parameters * 2 bytes = 360 GB
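The arithmetic in this first step can be checked with a few lines of Python. This is only a rough sketch: the helper `weight_memory_gb` and the `BYTES_PER_PARAM` table are hypothetical names introduced for illustration, the 8-bit and 4-bit rows assume idealized quantization with no metadata overhead, and the estimate covers the weights only, not activations or any other runtime memory.

```python
# Rough weight-memory estimate: number of parameters * bytes per parameter.
# Assumption: 1 GB = 10**9 bytes, and only the weights are counted.

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,  # the format Falcon 180B is stored in
    "float16": 2.0,
    "int8": 1.0,      # assumed idealized 8-bit quantization
    "int4": 0.5,      # assumed idealized 4-bit quantization
}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GB."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


FALCON_180B_PARAMS = 180e9  # 180 billion parameters

for dtype in BYTES_PER_PARAM:
    gb = weight_memory_gb(FALCON_180B_PARAMS, dtype)
    print(f"{dtype:>8}: {gb:.0f} GB")
```

The bfloat16 row reproduces the 360 GB figure, and the lower-precision rows suggest why a quantized version is the realistic route on consumer hardware.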




This post originally appeared on TechToday.