by _a_a_a_ 927 days ago
define 'semantic similarity'

would your hoped-for tool recognise that

  1
and

  sin(x)^2 + cos(x)^2 
are the same? (I think that identity holds, but if not you get the picture)
3 comments

That looks like a case where "analyse the AST after constant folding" might be a theoretical path if you had a language frontend that could emit the AST at that point.
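A minimal sketch of the "compare ASTs after constant folding" idea, using Python's standard `ast` module. The `FoldConstants` transformer and the `same_after_folding` helper are hypothetical names made up for this illustration, and the folder only handles literal `+`, `-` and `*`. It would not recognise the trig identity above, which needs symbolic simplification rather than constant folding.

```python
import ast
import operator

# Only a few operators are folded; this is an illustrative subset.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class FoldConstants(ast.NodeTransformer):
    """Replace `literal <op> literal` with the computed literal."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = _OPS.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
        ):
            try:
                value = op(node.left.value, node.right.value)
            except Exception:
                return node  # e.g. "a" - 1: leave it unfolded
            return ast.copy_location(ast.Constant(value=value, kind=None), node)
        return node


def folded_dump(source):
    tree = FoldConstants().visit(ast.parse(source))
    # ast.dump omits line/column attributes by default, so layout is ignored.
    return ast.dump(tree)


def same_after_folding(a, b):
    return folded_dump(a) == folded_dump(b)


print(same_after_folding("x = 2 * 3 + 1", "x = 7"))
print(same_after_folding("x = y + 1", "x = 1 + y"))
```

The second comparison shows the limit of the approach: `y + 1` and `1 + y` are kept as different trees, because nothing here knows that addition commutes.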

I suspect that things like "these two functions both start with the same conditional+early return" would be more useful to -me- given the sort of things I tend to be working on. Also a 'fuzzy possible copy+paste detector' in general to help identify refactoring targets.
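One rough way to sketch a "fuzzy possible copy+paste detector": reduce each function to its sequence of AST node types (so identifier names and literal values drop out) and score two sequences with `difflib.SequenceMatcher`. The `shape` and `similarity` helpers and the sample functions are hypothetical, and this scoring is just one assumption about what "similar" should mean.

```python
import ast
import difflib


def shape(source):
    """Sequence of AST node type names, ignoring names and literal values."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


def similarity(a, b):
    """Ratio in [0, 1]; 1.0 means the two node-type sequences are identical."""
    return difflib.SequenceMatcher(None, shape(a), shape(b)).ratio()


F = """
def f(user):
    if user is None:
        return 0
    return user.age + 1
"""

G = """
def g(order):
    if order is None:
        return 0
    return order.total + 1
"""

H = """
def h(xs):
    return [x * 2 for x in xs]
"""

print(similarity(F, G))  # same structure, different names
print(similarity(F, H))  # structurally different
```

`F` and `G` share the "same conditional + early return" opening and differ only in names, so they score as identical; a real tool would presumably want a threshold below 1.0 and some way to report which part matched.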

It also strikes me that something that was mostly 'just' a structure-aware diff (so e.g. you got diffs within-if-body and similar) would be useful, but I'm now into vigorous hand waving because it's been ages since I've thought about this and I probably need more coffee.

I -did- do a pure maths degree many years ago but I don't generally seem to end up working on computational code.

Not with floats it isn't.
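A quick way to check this yourself, as a sketch: evaluate the expression over a range of inputs and count how often the result is not exactly `1.0`. The sample points are arbitrary, and the exact count depends on the platform's math library, so none is quoted here.

```python
import math


def identity_value(x):
    return math.sin(x) ** 2 + math.cos(x) ** 2


xs = [i / 100 for i in range(1, 1001)]  # 0.01 .. 10.00, arbitrary sample
values = [identity_value(x) for x in xs]

not_exactly_one = sum(1 for v in values if v != 1.0)
max_error = max(abs(v - 1.0) for v in values)

print(f"{not_exactly_one} of {len(xs)} results differ from 1.0")
print(f"largest deviation: {max_error:.3e}")

# A tolerance-based comparison is the usual workaround.
print(all(math.isclose(v, 1.0) for v in values))
```

So an equivalence checker for numeric code has to decide whether "equal" means bit-for-bit or within some tolerance.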
umm, touche
to the downvoter: I thought this was a reasonable question? Semantic equivalence is IIRC undecidable in general. Some languages (Backus' FL?) try to deal with that but I dunno.
> Semantic equivalence is IIRC undecidable in general.

They did mention code, and said "similarity" rather than equivalence.

But, as a trivial example, two different pieces of code can compile down to the same AST, or bytecode, or assembler.
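For instance, in CPython (an assumption: other implementations and versions may behave differently) the compiler folds constant arithmetic, so these two source strings end up as the same instruction sequence. The `instructions` helper is a hypothetical name for this sketch; `dis.get_instructions` is the standard-library call.

```python
import dis


def instructions(source):
    """(opname, argval) pairs for module-level code, ignoring line numbers."""
    code = compile(source, "<demo>", "exec")
    return [(ins.opname, ins.argval) for ins in dis.get_instructions(code)]


a = "seconds_per_day = 60 * 60 * 24"
b = "seconds_per_day = 86400"

print(instructions(a) == instructions(b))
print(instructions(a) == instructions("seconds_per_day = 86401"))
```

Comparing at this level ignores comments and layout, and treats folded constants as equal, but it still distinguishes code that merely computes the same result by a different route.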