by EdwardRaff 3112 days ago
Really, the benign-vs-malicious question is an oversimplification. But that's what we have data for, and what most people focus on.

The reality is that there is a big gray area between the two classes. Some cases are really hard to call, and those are the ones that would lead to errors in production. Some examples:

What if it was written with malicious intent, but the author messed it up and so it doesn't do anything? Is it still malware?

What if it's a legitimate encryption program meant for security, but malware uses it to create ransomware? Is it malicious now?

What if it's a benign program, but a bug causes it to destroy files? Is it malicious?

Some programs are maybe not malicious, but just annoying (like browser toolbar installers). What do we call those? Some systems have a "Potentially Unwanted Software" category for these guys.

Ultimately, it's not easy. Thankfully most binaries are fairly cut-and-dried in terms of which side of the fence they belong on. The hope is that with enough labeled data, we can do a good job for the majority of cases. We don't expect it to ever be perfect. Handling the hard-to-distinguish samples is definitely something to dig into in the future.