Hacker News
by mbreese 1036 days ago
You can import the CSV file instead of opening it, which lets you choose the data type for each column. It's clunky, but it does what you want (if you set the types correctly).

Excel is a known hazard for many text data files. My least favorite Excel-ism is when it changes the names of genes to dates. Several genes were renamed in recent years to avoid exactly this problem. As an example: MARCH1 became MARCHF1, and SEPT1 became SEPTIN1. Even so, if you're working with gene-related data, you have to be very careful that you don't accidentally save the file with a few "dates" in place of gene names. (This is even more confusing once you realize that Excel stores dates as integers.)
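
To make the "dates are stored as integers" point concrete, here is a minimal Python sketch of how such a serial number maps back to a calendar date. It assumes the workbook uses Excel's default 1900 date system; the function names are made up for illustration. If a gene symbol were read as 1 September 2020, for instance, the cell would actually hold the number 44075.

```python
from datetime import date, timedelta

# Assumption: Excel's default 1900 date system.
# Using 1899-12-30 as day zero compensates for Excel treating 1900 as a
# leap year, so the mapping is only valid from serial 61 (1900-03-01) onward.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial: int) -> date:
    """Convert a 1900-system Excel date serial to a calendar date."""
    if serial < 61:
        raise ValueError("serials below 61 are affected by Excel's 1900 leap-year quirk")
    return EXCEL_EPOCH + timedelta(days=serial)


def date_to_excel_serial(d: date) -> int:
    """Inverse mapping: calendar date to 1900-system serial."""
    return (d - EXCEL_EPOCH).days


print(excel_serial_to_date(44075))             # 2020-09-01
print(date_to_excel_serial(date(2020, 1, 1)))  # 43831
```

So once the file is saved, the original text is gone: what remains is an integer plus a date format, and nothing in it says the cell used to be a gene symbol.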

2 comments

I was doing a Coursera project, and it kept flagging duplicates among what I thought were random strings of digits and letters, which should therefore have been unique. It took me a while to figure out what was happening:

https://sites.google.com/view/ryzvonusef/process

Wasted a week's worth of effort, but I learnt a valuable lesson.

Spoiler: The data read into Excel/Google Sheets was interpreted as scientific notation, which caused the uniqueness errors.
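
A rough Python sketch of how that kind of collision can arise. The `parse_like_spreadsheet` helper is a hypothetical stand-in for spreadsheet type guessing (Python's `float` rules are not identical to what Excel or Google Sheets do), and the IDs are made up for illustration.

```python
def parse_like_spreadsheet(cell: str):
    """Hypothetical stand-in for type guessing: numeric-looking text becomes a float."""
    try:
        return float(cell)
    except ValueError:
        return cell  # anything that does not parse as a number stays text


# Made-up IDs: distinct as text, but three of them look like the same number.
ids = ["1E5", "100000", "1E05", "A1B2"]

as_text = set(ids)
as_guessed = {parse_like_spreadsheet(cell) for cell in ids}

print(len(as_text))     # 4
print(len(as_guessed))  # 2
```

Keeping the column as text all the way through (for example by importing it with an explicit text type) avoids the collision.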
My least favorite Excel-ism is when it changes correct dates to American dates.

https://xkcd.com/1179/