Hacker News
by superkuh 1186 days ago
Does anyone know what format models have to be in for use with textsynth? I looked at the gpt2 example binary (gpt2_117M.bin), and it seems the "normal" params.json is embedded as a header, followed by ASCII strings like "attn/c_attn/" and then the binary weights.
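For anyone who wants to repeat that kind of inspection, here is a minimal sketch that pulls the printable ASCII runs out of the start of a file, much like the `strings` utility. It makes no assumptions about the actual textsynth format; the function names `printable_runs` and `peek_header` and the 4096-byte window are just illustrative choices.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def peek_header(path: str, n: int = 4096) -> list[str]:
    """Read the first n bytes of a file and list the ASCII runs found there."""
    with open(path, "rb") as f:
        return printable_runs(f.read(n))
```

Calling `peek_header("gpt2_117M.bin")` on the example binary should surface any embedded JSON text and tensor-name strings near the start of the file, assuming they fall within the first 4096 bytes.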

I tried using the Stanford Alpaca fine-tuned version of the llama 7B weights that work with llama.cpp, but textsynth didn't like that (ggml-alpaca-7b-q4.bin: invalid file header). Having a textsynth HTTP API would save me a lot of hassle. I'm currently wrapping the stdin/stdout of a modified llama.cpp binary, and that's extremely messy.
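As a rough illustration of the stdin/stdout wrapping, here is a sketch of a one-shot wrapper around a child process. It assumes the binary reads the whole prompt from stdin, writes its completion to stdout, and then exits; the modified llama.cpp build may behave differently. `run_once` and `ECHO_UPPER` are hypothetical names, and `ECHO_UPPER` is only a stand-in command so the sketch runs without any model.

```python
import subprocess
import sys


def run_once(cmd: list[str], prompt: str, timeout: float = 120.0) -> str:
    """Send prompt to the child's stdin and return everything it writes to stdout."""
    result = subprocess.run(
        cmd,
        input=prompt,          # written to the child's stdin, then stdin is closed
        capture_output=True,   # collect stdout and stderr instead of inheriting them
        text=True,             # work with str rather than bytes
        timeout=timeout,       # raises subprocess.TimeoutExpired if the child hangs
        check=True,            # raises CalledProcessError on a non-zero exit code
    )
    return result.stdout


# Stand-in for the real binary (an assumption for illustration only):
# a tiny Python child that upper-cases whatever it reads from stdin.
ECHO_UPPER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
```

Swapping `ECHO_UPPER` for the real command line would be the only change needed under those assumptions. Starting a new process per request reloads the model every time, which is part of why a persistent HTTP server is attractive.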

1 comment

Question for you (sorry I don't have an answer for you): Where are you storing the models relative to the ts_server folder?