Hacker News new | ask | show | jobs
by smu 4775 days ago
Indeed! During a security assessment of a large code base, I wrote a simple ruby source code parser that spat out Prolog terms. After that, it was a breeze to find certain kinds of logic errors, such as code paths that used user input before sanitation...

It's beautiful when it all works together, checking everything manually would certainly have been a PITA and would have resulted in a lower quality result.

Definitely an aha moment!

2 comments

Every time I use prolog for something (or more like I think "Ha! This is meat for Prolog!") I feel like I'm just using an awesomely sharp knife. As I like to put it, it's like a database on steroids.

A few months ago I was "bored" and decided to write a "scrapper" (kind of) to get all my tweets (this was before twitter opened tweets for download). It was relatively easy: write a little javascript that scrolls to bottom and then regex-matches all the (correctly loaded now) tweet data in the webpage. Then collect the tweet text in a plain ol' list that then is dumped on-screen as prolog terms... and saved as plain text.

With this simple thing I could make interesting queries (who I mention more often, when I'm more prone to tweet...) in a breeze, and I could made even better queries with a little more work/available data.

Could you open source it? I'm very interested in seeing how one would do that, I've been thinking about writing one myself, for PHP (I've a somewhat large codebase to inherit, that only works in a old version of PHP, and I might need to migrate it).
I don't have the IP for the code so I can't copy it here, but what I did was very simple and the gist of it is described below:

The parser used regular expressions to recognise function definitions and calls in those definitions. I used file names as the function scope, this was good enough because there were no two functions with identical names in the same file. Function calls became the following terms: "calls(x,y,z).", meaning that function x in file y calls z.

These "calls" terms actually define a directed graph. If you google "prolog path through directed graph", there are lots of hits that will help you out. The following (untested) code should get you started:

    %there is a path if there is a call
    path(Caller, File, Called, [calls(Caller, File, Called)]) :- calls(Caller, File, Called). 
    %there is a path if there is a call to some function and there is a path from that function
    path(Caller, File, Called, [calls(Caller, File, A) | P]) :- calls(Caller, File, A), path(A, _, Called).
After that, you can find all possible paths with "findall/3" and check for existence of a certain known good/bad function with "member/2" (again, google is your friend)

Due to some properties of the code, this simple approach worked well enough for me. Hopefully this helps you out.

Probably needs a green cut in the first term... Or maybe not. I'm quite backtracking-puzzled about green cuts in generic terms I write, so thinking about them in others' code is even more puzzling :)