by Hasnep
638 days ago
Actually, the original code does nothing if dataset is set to 'animals', 'turtle' or 'formal_turtle'; most of the branches are unreachable. Also, the extra else clause that raises an error and the line elif dataset = 'formal_turtle': are both invalid syntax. I think 'clean up' here means something closer to 'convert this to what I'm trying to write'.
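To illustrate the two syntax problems, here is a minimal sketch. The snippets are hypothetical reconstructions (the code under discussion is not quoted in this comment), and is_valid_python is a helper name made up for this example; it only asks Python's own parser whether a piece of source is accepted.

```python
import ast

# Hypothetical reconstruction: '=' (assignment) used where '==' (comparison) is needed.
snippet_assignment = """
if dataset == 'animals':
    pass
elif dataset = 'formal_turtle':
    pass
"""

# Hypothetical reconstruction: a second else clause attached to the same if statement.
snippet_double_else = """
if dataset == 'animals':
    pass
else:
    pass
else:
    raise ValueError(dataset)
"""


def is_valid_python(source: str) -> bool:
    """Return True if the source parses as Python, False on a SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python(snippet_assignment))
print(is_valid_python(snippet_double_else))
```

Both snippets are rejected by the parser before any code runs, so no amount of reasoning about which branch is taken applies until the syntax is fixed.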
|
It's almost impossible for an LLM to identify all the invalid rows at once, since the data cannot fit into the context window. If we prompt the model to do thorough data cleaning, there will be many trial-and-error steps. This happens to me as a human too: I clean some rows and expect my program to run with the data, only to find more malformed data. LLMs cannot get this right for now; in fact, I see many cases where the LLM fails because it wants to convert types (e.g. string to date).
Based on my experience, the best way is simply to skip the data cleaning step in the planning stage (you can provide feedback asking the tool not to do any cleaning steps).
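For context on the trial-and-error problem with type conversion, here is a rough sketch of scanning a whole column in one pass and reporting every value that fails to parse as a date, instead of stopping at the first failure. This is not something the tool itself does; find_bad_dates, the "when" column, and the date format are all assumptions made up for this example.

```python
from datetime import datetime


def find_bad_dates(rows, column, fmt="%Y-%m-%d"):
    """Return (row_index, raw_value) for every row whose value fails to parse."""
    bad = []
    for i, row in enumerate(rows):
        raw = row.get(column)
        try:
            datetime.strptime(raw, fmt)
        except (TypeError, ValueError):
            # TypeError covers non-string values such as None;
            # ValueError covers strings that do not match the format.
            bad.append((i, raw))
    return bad


# Made-up sample data: one valid row and three malformed ones.
rows = [
    {"when": "2024-01-31"},
    {"when": "31/01/2024"},
    {"when": ""},
    {"when": None},
]

for index, value in find_bad_dates(rows, "when"):
    print(index, repr(value))
```

Collecting all failures up front at least shows how many rows are affected before deciding whether cleaning is worth attempting at all.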