by awayto 1021 days ago
This reminds me of an interesting experiment I did earlier this year with ChatGPT.

First, I came upon this Reddit post [1], which describes being able to convert text into some ridiculous symbol soup that still makes sense to ChatGPT.

Then, I considered the structure of my TypeScript type files, e.g. [2], which are pretty straightforward and uniform, all things considered.

Playing around with the Reddit compression prompt, I realized it performed poorly when I just passed in my type structures. So I made a simple script which essentially turned my types into a story.

Given a type definition:

    type IUserProfile = {
        name: string;
        age: number;
    }
It's somewhat trivial to make a script to turn these into sentence structures, given the type is simple enough:

"IUserProfile contains: name which is a string; age which is a number; .... IUserProfiles contains: users which is an array of IUserProfile" and so on.
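
For illustration, here is a minimal sketch of what such a script could look like. It is not the original script: `typeToStory` and `describeType` are hypothetical names, and it assumes flat `type Name = { ... }` definitions whose fields are plain types or `T[]` arrays (no optional fields, unions, or nesting).

```typescript
// Hypothetical helper, not the original script.
// Turns a field type into a phrase: "string" -> "a string", "Foo[]" -> "an array of Foo".
function describeType(fieldType: string): string {
  if (fieldType.endsWith("[]")) {
    return `an array of ${fieldType.slice(0, -2).trim()}`;
  }
  return `a ${fieldType}`;
}

// Assumes a flat definition of the form: [export] type Name = { key: type; ... }
function typeToStory(source: string): string {
  const match = source.match(/(?:export\s+)?type\s+(\w+)\s*=\s*\{([^}]*)\}/);
  if (!match) {
    throw new Error("Unsupported type definition");
  }
  const typeName = match[1] ?? "";
  const body = match[2] ?? "";

  const fields = body
    .split(";")
    .map((field) => field.trim())
    .filter((field) => field.length > 0)
    .map((field) => {
      // Split on the first colon only, so "key: type" becomes [key, type].
      const colon = field.indexOf(":");
      const key = field.slice(0, colon).trim();
      const fieldType = field.slice(colon + 1).trim();
      return `${key} which is ${describeType(fieldType)}`;
    });

  return `${typeName} contains: ${fields.join("; ")};`;
}

const story = typeToStory(`type IUserProfile = {
    name: string;
    age: number;
}`);
// story === "IUserProfile contains: name which is a string; age which is a number;"
```

A regex is enough here only because the types are uniform; anything with nested objects or generics would probably need the TypeScript compiler API instead.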

Passing this into the compression prompt was much more effective, and I ended up with a compressed version of my type system [3].

Regardless of the variability of the exercise, I can definitely say the prompt was able to generate some sensible components which more or less correctly implemented my type system when asked to, with some massaging. Not scalable, but interesting.

[1] https://www.reddit.com/r/ChatGPT/comments/12cvx9l/compressio...

[2] https://github.com/jcmccormick/wc/blob/c222aa577038fb55156b4...

[3] https://github.com/keybittech/wizapp/blob/f75e12dc3cc2da3a41...

1 comments

I’m curious, did you actually run it through the tokenizer and see if it was fewer tokens than the uncompressed version? I have seen a lot of people try these “compression” schemes, and token usage can end up higher.
It's definitely fewer tokens, at least in my contrived case. Looking at the compressed text, I can make out what is what, and see that it's just minimizing words to their root parts.

TypeScript (22 tokens):

    export type IAssist = { id: string; prompt: string; promptResult: string[]; };
Story (26 tokens):

    IAssist contains: id which is a string; prompt which is a string; promptResult which is an array of strings.
Compressed (13 tokens):

    IAsst{id,prompt,promptR}
And again I'll just call this interesting, because is it really going to know promptResult is a string array in most cases? Definitely not unless it gets some help in the component description, maybe.
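
To make that loss concrete, here is a small sketch that parses the compressed form back into a structure. The `Name{field,field}` grammar and `parseCompressed` are assumptions inferred from this single example, not a format the prompt guarantees.

```typescript
// Hypothetical parser; the "Name{a,b,c}" shape is assumed from the one example above.
function parseCompressed(text: string): { name: string; fields: string[] } {
  const match = text.trim().match(/^(\w+)\{([^}]*)\}$/);
  if (!match) {
    throw new Error("Unrecognized compressed form");
  }
  const name = match[1] ?? "";
  const fields = (match[2] ?? "")
    .split(",")
    .map((field) => field.trim())
    .filter((field) => field.length > 0);
  return { name, fields };
}

const parsed = parseCompressed("IAsst{id,prompt,promptR}");
// parsed.name is "IAsst" and parsed.fields is ["id", "prompt", "promptR"].
// Only abbreviated names survive; nothing says promptR was a string[].
```

Everything recoverable is a list of shortened names, so the field types have to come from somewhere else, such as the component description.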