HN Mirror


by zAy0LfpBZLC8mAC 4449 days ago

How do you know that the strategy worked reliably if you never compared the results to the results obtained using a reliable method (which you presumably didn't, because then you could just have used the reliable method)? The larger the data you have to deal with, the more likely it is that corner cases will occur in it, and the less likely that you will notice anomalies, thus the more important that you are very strict in your logic if you want to derive any meaningful results.

As such, the two viewpoints really are: not really caring about the soundness of your results, versus solving the actual problem.

Now, maybe you really can show that the bugs in the methods you use only cause negligible noise in your results, in which case it might be perfectly fine to use those methods. But just ignoring errors in your deduction process because you don't feel like doing the work of actually solving the problem at hand is not pragmatism. You'll have to at least demonstrate that your approach does not invalidate the result.
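One way to demonstrate that is to run both the quick method and a reliable one on a random sample of the data and measure how often they disagree. The following is only a minimal sketch of that idea; `disagreement_rate`, `fast`, and `reference` are hypothetical names, not anything from the discussion above, and a sample can only estimate the error rate, not prove the absence of corner cases.

```python
import random
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def disagreement_rate(
    records: Sequence[T],
    fast: Callable[[T], R],
    reference: Callable[[T], R],
    sample_size: int = 1000,
    seed: int = 0,
) -> float:
    """Estimate how often the quick method differs from the reliable one.

    `fast` and `reference` are placeholders for whatever two methods
    are being compared; both are assumed to be pure functions.
    """
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    items = list(records)
    sample = rng.sample(items, min(sample_size, len(items)))
    if not sample:
        return 0.0
    mismatches = sum(1 for item in sample if fast(item) != reference(item))
    return mismatches / len(sample)
```

The reliable method only has to be run on the sample, so it can be slow, as long as the sample is large enough to make the estimate meaningful.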

1 comment

lignuist 4449 days ago

Nitpicking much?

As I wrote above, by making sure that I use a placeholder that does not appear in the data, I make sure that it does not cause the issues you describe. And if I were wrong about that assumption, I can at least minimize the effect by choosing a very unlikely sequence as the placeholder.

I really see no issue here. How do you find valid grammars for fuzzy data in practice?
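The placeholder approach described in this comment could look roughly like the sketch below. It is an illustration under assumptions, not the commenter's actual code: the function names, the comma separator, and the backslash escape are all made up for the example. The placeholder is random, which makes a collision unlikely, and it is also checked against the data, which rules a collision out for that input.

```python
import secrets


def pick_placeholder(data: str, length: int = 16, max_tries: int = 100) -> str:
    """Return a random token that is verified not to occur in `data`."""
    for _ in range(max_tries):
        # NUL bytes plus random hex: unlikely in text, and checked anyway.
        candidate = "\x00" + secrets.token_hex(length) + "\x00"
        if candidate not in data:
            return candidate
    raise RuntimeError("could not find an unused placeholder")


def split_with_escapes(line: str, sep: str = ",", escape: str = "\\") -> list[str]:
    """Split on `sep`, treating `escape + sep` as a literal separator.

    The escaped separators are swapped for a placeholder, the line is
    split, and the placeholder is then swapped back in each field.
    """
    placeholder = pick_placeholder(line)
    protected = line.replace(escape + sep, placeholder)
    return [field.replace(placeholder, sep) for field in protected.split(sep)]
```

For example, `split_with_escapes("a\\,b,c")` yields the fields `a,b` and `c`. Note that the check only guards against the placeholder colliding with the data; other corner cases, such as an escaped escape character directly before a separator, are not handled by this sketch.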
