Hacker News
by rwhyan 1179 days ago
Looks great!

I've been playing around with GPT information extraction, and I think your prompt can be simplified to save on token costs:

Instead of:

`The company name (field name: "companyName", field type: string)`

I use a prompt that looks like:

`... The JSON should consist of the following information, using the format <field name: field type>: The company name <companyName: string>`
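A rough Python sketch of how both field formats could be generated from one field list. The `FIELDS` entries (including `jobTitle`) and the helper names are made up for illustration, and character count is only a crude stand-in for token count:

```python
# Hypothetical field list: (description, field name, field type).
FIELDS = [
    ("The company name", "companyName", "string"),
    ("The job title", "jobTitle", "string"),
]

# The compact format is explained once, up front, instead of per field.
HEADER = (
    "The JSON should consist of the following information, "
    "using the format <field name: field type>:"
)


def verbose_line(description, name, type_):
    # Original style: spells out "field name" and "field type" for every field.
    return f'{description} (field name: "{name}", field type: {type_})'


def compact_line(description, name, type_):
    # Compact style: only the description plus <name: type>.
    return f"{description} <{name}: {type_}>"


verbose_prompt = "\n".join(verbose_line(*f) for f in FIELDS)
compact_prompt = HEADER + "\n" + "\n".join(compact_line(*f) for f in FIELDS)

# Character counts as a rough proxy; a real tokenizer would be more accurate.
print(len(verbose_prompt), len(compact_prompt))
```

The header is a fixed cost, so the compact form should only pay off once the prompt lists more than a handful of fields.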

I've also played around using JSON structure in the prompt, such as:

`Return a JSON object with following model, with the format <field type: instructions to extract> { "companyName": <string: The company name>, ... }`

In my experience, the attribute name alone is often enough for GPT to infer how to extract the information (e.g. `{ "companyName": string, ... }`).
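For the JSON-structure variants, a minimal sketch of rendering the skeleton from a schema dict might look like this. The `SCHEMA` contents (including the `remote` field) and the function names are assumptions for illustration, not anything from the original project:

```python
# Hypothetical schema: field name -> (field type, instructions to extract).
SCHEMA = {
    "companyName": ("string", "The company name"),
    "remote": ("boolean", "Whether the job is remote"),
}


def skeleton_with_instructions(schema):
    # Renders { "companyName": <string: The company name>, ... }
    body = ", ".join(
        f'"{name}": <{type_}: {hint}>' for name, (type_, hint) in schema.items()
    )
    return "{ " + body + " }"


def skeleton_names_only(schema):
    # Renders { "companyName": string, ... } and leaves the meaning
    # of each field for the model to infer from its name.
    body = ", ".join(f'"{name}": {type_}' for name, (type_, _) in schema.items())
    return "{ " + body + " }"


prompt = (
    "Return a JSON object with following model, with the format "
    "<field type: instructions to extract> " + skeleton_with_instructions(SCHEMA)
)
print(prompt)
print(skeleton_names_only(SCHEMA))
```

The names-only version is the shortest, but it leans entirely on descriptive attribute names, so it is worth comparing extraction quality between the two before switching.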

1 comment

Thank you very much! I will definitely try out your suggestions! However, at least with GPT-3.5 and the amount of data I have to deal with in this case, my main concern is the quality of the extractions. With about 500 posts per month, the cost is manageable. But for larger datasets, saving tokens is definitely important.