Hacker News
by harisec 682 days ago
I agree. I suspect the HuggingFace dataset I used is not evenly distributed across topics and mostly contains prompts related to those themes. The way it works is that I pick 5 prompts at random from the dataset and use them as seeds for generating new prompts. The complete DeepSeek prompt used to generate new prompts can be found here: https://github.com/harisec/llm-dreams
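To make the seeding step concrete, here is a rough sketch of what "pick 5 prompts at random and use them as seeds" could look like. This is not the code from the repo: the function name `build_seed_prompt`, the toy `dataset` list, and the instruction wording are all made up for illustration, and the real DeepSeek prompt is the one in the linked repository.

```python
import random


def build_seed_prompt(dataset, k=5, rng=None):
    """Pick k distinct prompts at random and wrap them in a meta-prompt.

    Hypothetical helper for illustration only; the instruction text below
    is a placeholder, not the actual prompt from the repo.
    """
    rng = rng or random.Random()
    # random.sample draws k distinct items, uniformly, without replacement.
    seeds = rng.sample(dataset, k)
    numbered = "\n".join(f"{i}. {p}" for i, p in enumerate(seeds, 1))
    instruction = (
        "Here are some example prompts:\n"
        f"{numbered}\n\n"
        "Write one new prompt in a similar spirit."
    )
    return seeds, instruction


# Toy stand-in for the HuggingFace dataset (assumption: a plain list of strings).
dataset = [f"example prompt {n}" for n in range(20)]
seeds, instruction = build_seed_prompt(dataset, k=5, rng=random.Random(0))
print(instruction)
```

Note that uniform sampling like this preserves whatever skew the dataset has: if most entries are about a handful of themes, most seed sets will be too, which would explain the repetition in the generated prompts.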