by wolfgang42
3253 days ago
First, a small correction: the if statement in my original code can actually be written as the following, which is what I had in the actual code:

    if (items.Count(item2 => item.Upc == item2.Upc) != 1)

With that correction out of the way, here's the code if I used a HashSet (you suggest 'a. k. a. "dictionaries"' but that is slower and I don't need the other half of the mapping which they would provide):

    var set = new HashSet<string>();
    foreach (var item in items) {
        if (set.Contains(item.Upc)) {
            item.Duplicate = true;
        } else {
            set.Add(item.Upc);
        }
    }

To my eye, this is less readable; in addition to having twice as many lines and an extra variable, it also takes extra work to figure out what it does; whereas my version does exactly what it says: if the UPC appears more than once, it marks the item as a duplicate.

For a final comparison, here's an implementation of the extra loops and manual indexing I suggested:

    for (var i = 0; i < items.Count; i++) {
        for (var j = i + 1; j < items.Count; j++) {
            if (items[i].Upc != items[j].Upc) continue;
            items[i].Duplicate = true;
            items[j].Duplicate = true;
            break;
        }
    }

This isn't significantly longer than the HashSet-based version, but it is even faster to run (I just benchmarked it) and also doesn't use an extra object which takes memory and needs to be managed and garbage-collected later.

In conclusion, while I appreciate your suggestion for a possible alternative implementation, I do not appreciate your impugning of my abilities with the suggestion that "something went wrong with your thought process", and implying that I may not be aware of basic data structures.
|
Not to mention that:
> you suggest 'a. k. a. "dictionaries"' but that is slower
In what way do you think mappings differ in implementation from sets that would make them slower? They only need to carry one more pointer, and you're already writing in a language quite detached from bare metal.
One more thing, since you decided to "correct" yourself instead of letting the mistake slip: your "working" code is still wrong, unless something very weird happens in the data (but then calling it more readable this way is fundamentally wrong).
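
For anyone who wants to compare the variants discussed above, here is a minimal sketch. It is not anyone's actual code from the thread: `Item`, `DuplicateDemo`, `Sample`, `Flags`, and the method names are made up for illustration, and the `foreach` around the `Count` check is an assumption, since the original loop is not shown. The dictionary variant is just one possible way to flag every occurrence in two linear passes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the item type used in the thread.
public class Item
{
    public string Upc { get; set; } = "";
    public bool Duplicate { get; set; }
}

public static class DuplicateDemo
{
    // LINQ Count check, wrapped in an assumed foreach:
    // flags every item whose UPC occurs more than once. Quadratic time.
    public static void MarkWithCount(List<Item> items)
    {
        foreach (var item in items)
        {
            if (items.Count(item2 => item.Upc == item2.Upc) != 1)
                item.Duplicate = true;
        }
    }

    // HashSet loop as posted: the first occurrence of a UPC is only
    // recorded, so just the second and later occurrences get flagged.
    public static void MarkWithHashSet(List<Item> items)
    {
        var set = new HashSet<string>();
        foreach (var item in items)
        {
            if (set.Contains(item.Upc)) item.Duplicate = true;
            else set.Add(item.Upc);
        }
    }

    // Dictionary sketch: count each UPC, then flag every item
    // whose UPC was seen more than once. Two linear passes.
    public static void MarkWithDictionary(List<Item> items)
    {
        var counts = new Dictionary<string, int>();
        foreach (var item in items)
        {
            counts.TryGetValue(item.Upc, out var n); // n is 0 when the key is absent
            counts[item.Upc] = n + 1;
        }
        foreach (var item in items)
        {
            if (counts[item.Upc] > 1) item.Duplicate = true;
        }
    }

    // Helper: renders the Duplicate flags as "1"/"0" for easy comparison.
    public static string Flags(List<Item> items) =>
        string.Join(",", items.Select(i => i.Duplicate ? "1" : "0"));

    // Helper: fresh sample data with UPC "A" appearing three times.
    public static List<Item> Sample() =>
        new[] { "A", "B", "A", "C", "A" }.Select(u => new Item { Upc = u }).ToList();
}
```

On the sample UPCs A, B, A, C, A, `MarkWithCount` and `MarkWithDictionary` both flag items 1, 3, and 5, while `MarkWithHashSet` flags only items 3 and 5. Whether that difference matters depends on what "duplicate" is supposed to mean for the data at hand.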