ChatGPT-J: The Privacy-First, Self-Hosted Chatbot Built on GPT-J's Powerful AI


	ChatGPT-J: The Privacy-First, Self-Hosted Chatbot Built on GPT-J's Powerful AI (colab.research.google.com)
	54 points by jarrell_mark 1193 days ago

3 comments

two_handfuls 1192 days ago

"Privacy-First", but it also runs in a Colab notebook, meaning it runs on someone else's machine? That doesn't seem very private.


was_a_dev 1192 days ago

Download the notebook and run locally?


jarrell_mark 1191 days ago

Yes, the GitHub has the Jupyter .ipynb notebook that can be run locally: https://github.com/jarrellmark/chatgpt-j

And even in Colab, it's privacy first in the sense that user input or model output isn't being sent anywhere. The data is local to your Colab session.


serendipty01 1192 days ago

GitHub repo: https://github.com/jarrellmark/chatgpt-j


johntash 1193 days ago

Can this be run locally without beefy GPUs by any chance?


quesomaster9000 1192 days ago

ggml (https://github.com/ggerganov/ggml) has a GPT-J example. The 6B-parameter model runs happily on the CPU with 16 GB of RAM and 8 cores at a couple of words per second, no GPUs necessary.

    gptj_model_load: ggml ctx size = 13334.86 MB
    gptj_model_load: memory_size =  1792.00 MB, n_mem = 57344
    gptj_model_load: model size = 11542.79 MB / num tensors = 285
    main: number of tokens in prompt = 12

    main: mem per token = 16179460 bytes
    main:     load time =  7463.20 ms
    main:   sample time =     3.24 ms
    main:  predict time =  4887.26 ms / 232.73 ms per token
    main:    total time = 13203.91 ms
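
The timings in the log above can be turned into a throughput figure. A minimal sketch, assuming the `predict time` line reports total milliseconds followed by milliseconds per token; the helper name `tokens_per_second` is made up for illustration and is not part of ggml:

```python
import re

def tokens_per_second(predict_line: str) -> float:
    """Parse a 'predict time' log line and return generated tokens per second."""
    match = re.search(
        r"predict time\s*=\s*([\d.]+)\s*ms\s*/\s*([\d.]+)\s*ms per token",
        predict_line,
    )
    if match is None:
        raise ValueError("not a predict-time line")
    ms_per_token = float(match.group(2))
    return 1000.0 / ms_per_token

# Line copied from the log output above.
line = "main:  predict time =  4887.26 ms / 232.73 ms per token"
rate = tokens_per_second(line)
print(f"{rate:.1f} tokens/s")  # prints "4.3 tokens/s"
```

At roughly 4 tokens per second, this is in the same ballpark as "a couple of words per second", since a word is often more than one token.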


jerpint 1193 days ago

There have been CPU implementations of LLaMA (7B parameters, comparable in size) with very impressive performance.


ops 1193 days ago

I haven't used this yet, but I am currently running GPT-J on my Mac Studio, so I suspect so.


jarrell_mark 1191 days ago

It should work with about 12 GB of GPU RAM.

I got it to load on a GTX 1070 with 8GB GPU RAM, but then it crashed before it could generate a response.

It needs less RAM than regular GPT-J because the weights are converted to 8-bit.
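
A back-of-the-envelope sketch of why 8-bit weights help, assuming a round 6 billion parameters and counting the weights only. Activations, the attention cache, and framework overhead come on top and are not modeled here:

```python
def weight_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

N_PARAMS = 6e9  # assumption: "6B" taken as a round figure

for bits in (32, 16, 8):
    print(f"{bits:>2}-bit: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
# 32-bit: 22.4 GiB
# 16-bit: 11.2 GiB
#  8-bit: 5.6 GiB
```

Under these assumptions the 8-bit weights alone take about 5.6 GiB, which would be consistent with the model loading on an 8 GB card but leaving little headroom for generation.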
