How to run GPT4ALL in a private cloud
TL;DR: check out https://github.com/drbh/gmessage for the latest code
Background
Large language models like ChatGPT and LLaMA are amazing technologies that act kind of like calculators for simple knowledge tasks like writing text or code.
I won't get into the weeds, but at their core, these technologies use statistical modeling to generate the text that is most likely to occur next.
The models can do this because they have been trained on a huge amount of text (far more than any human could read), and they optimize their statistical guesses using that text as the source of truth.
LLMs have already proven very helpful for a number of tasks, but because of their technical complexity there are very few providers of these models.
Fortunately, some really smart people have taken the open source models shared by large organizations and drastically reduced their size and hardware requirements, making it possible to run these models on everyday machines.
I especially want to point out the work done by ggerganov's llama.cpp, which enables much of the low-level mathematical work, and Nomic AI's GPT4ALL, which provides a comprehensive layer for interacting with many LLM models. I encourage readers to check out these awesome projects!
In addition to running models locally, I’ve been seeking a way to run my own LLM in a personal private network and interact with it in my browser similar to ChatGPT.
Since GPT4ALL had just released their Golang bindings I thought it might be a fun project to build a small server and web app to serve this use case.
Later that day gmessage was born: https://github.com/drbh/gmessage
gmessage is yet another web interface for gpt4all, with a couple of features that I found useful like search history, a model manager, themes, and a topbar app.
While the application is still in its early days, it's reaching a point where it might be fun and useful to others, and maybe inspire some Golang or Svelte devs to come hack on open source AI tech!
What’s the end result
Before scrolling down and dedicating 5 minutes to setting this up, what is the end result? Whadda I get?
🔍 Search
🎨 Themes
⚙️ Model Download Manager
There are other features currently implemented, like text-to-speech and both a desktop and a web version, as well as many more features being worked on now.
How to run right now
docker run -p 10999:10999 drbh/gmessage:v0.0.0
or if you just want to see the UI and test it out you can test it here
How to deploy in your own cloud
First clone the repo
git clone https://github.com/drbh/gmessage.git
Next, since we want to deploy this remotely we’ll need to use a cloud provider.
For simplicity and low cost we’ll use Fly.io to deploy our Dockerfile with only a few commands
fly launch
Great! That was easy. Now we just wait for the process to complete, and Fly will share a custom URL with you.
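`fly launch` generates a `fly.toml` for you, so you shouldn't need to write one by hand. As a rough sketch of what matters here (the app name is hypothetical), the key detail is exposing gmessage's port 10999:

```toml
# Hypothetical fly.toml sketch -- `fly launch` generates the real one.
app = "my-gmessage"

[build]
  dockerfile = "Dockerfile"

[[services]]
  internal_port = 10999  # the port gmessage listens on inside the container
  protocol = "tcp"

  [[services.ports]]
    handlers = ["http"]
    port = 80

  [[services.ports]]
    handlers = ["tls", "http"]
    port = 443
```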
Note: your server is not secured by any authorization or authentication, so anyone who has that link can use your LLM.
Additionally if you want to run it via docker you can use the following commands.
docker build -t gmessage .
docker run -p 10999:10999 gmessage
In production it's important to secure your resources behind an auth service; currently I simply run my LLM within a personal VPN so only my devices can access it.
If you're interested in, or currently building, large distributed networks with open source LLMs, please reach out!
Conclusion
It’s early days for open source LLMs and privacy preserving AIs. While the models are not perfect now, progress is happening fast and every day new advancements are made!
It's also early days for gmessage, and we can use all the help we can get to build a fun and helpful chat interface. Please think about contributing or trying out the project!