SOTA Code Retrieval with Efficient Code Embedding Models (qodo.ai)
11 points by jimminyx 474 days ago
2 comments

Anyone else concerned that training models on synthetic, LLM-generated data might push us into a linguistic feedback loop? Relying on LLM text for training could bias the next model toward even heavier overuse of words like "delve", "showcasing", and "underscores".
SOTA? Lora? Seems like people are trying to usurp ham radio names for things.