|
|
|
|
|
by zaidw
1514 days ago
|
|
We can blame CSV, or we can blame the way people use CSV. Either way CSV is so unreliable that I try to “fail-fast” as soon as possible in automated pipeline. At work, we explicitly define data structuring process, converting CSV to Parquet with strict schema and technical/structural validation. We assign interns and new grad engineers for this, which is nicely within their capabilities too with minimal training. |
|