Hacker News
Show HN: A new way to diff websites (diffd.com)
46 points by patrickrogers 3883 days ago
4 comments

Looks cool. I wanted to build a service that would monitor an external web page for me and notify me about content changes. The problem was that it constantly detected changes in ads, menus, and other things unrelated to the content itself, which was what I was interested in. I didn't have time to take it any further. One use case for me was to monitor the Washington State traffic laws (or all RCWs if needed) and get notified when changes were published, with a diff that I could actually read rather than just the URL of the changed page. I've found over the years that the laws were being changed in very small ways: a one-character diff, like changing "2" days to "5" days (giving LEOs more time to file traffic tickets with the court), or just a few letters, like changing "may" to "must" (requiring judges to dismiss tickets not filed in a timely manner, as opposed to giving them the option).
With CasperJS and a CSS selector it would be easy to do, a few lines actually: https://gist.github.com/Ivanca/aef2e58dbbf9eb3e1bd4
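For anyone who would rather not run a headless browser, here is a rough sketch of the same idea in plain Python. It is not what the gist does; it only illustrates the technique under some assumptions: the two snapshots are already downloaded as strings, the content you care about sits in an element with a known id (a stand-in for a full CSS selector), and the markup is well nested. The names `IdTextExtractor`, `extract_text` and `content_diff` are made up for this example.

```python
import difflib
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class IdTextExtractor(HTMLParser):
    """Collects the text inside the first element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target; 0 means outside
        self.done = False   # set once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, target_id):
    """Return the non-empty text chunks found inside the target element."""
    parser = IdTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return parser.chunks


def content_diff(old_html, new_html, target_id):
    """Unified diff of the target element's text only; ads and menus are ignored."""
    old = extract_text(old_html, target_id)
    new = extract_text(new_html, target_id)
    return list(difflib.unified_diff(old, new, "old", "new", lineterm=""))
```

Calling `content_diff(old_html, new_html, "law")` returns an empty list when only markup outside the `law` element changed, so a rotating ad would not trigger a notification.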
An automatically curated list of changes made to various state and federal laws would be super cool. I remember reading a NYT article [1] that highlighted the need for something like this to monitor changes that are made to Supreme Court rulings years after they are issued.

[1] http://www.nytimes.com/2014/05/25/us/final-word-on-us-law-is...

I personally use the Page Monitor chrome extension: https://chrome.google.com/webstore/detail/page-monitor/pemhg...

It also has Selector functionality (in addition to full-page monitoring), which lets you explicitly choose which elements on the page to monitor for changes.

I've recently been looking for something similar to this as well, but haven't found the perfect product for me yet. Ideally, it would be an application or service where you add URLs to monitor (perhaps with a list of ids/classes), and for each website you get a long, timestamped list of changes going back to when you added it, along with the option to diff each change against the current version.
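To make that wish list concrete, here is a small sketch of the bookkeeping such a tool might do. It is only an illustration: `PageHistory`, `record` and `diff_against_current` are hypothetical names, the text is assumed to have been fetched and extracted elsewhere, and everything lives in memory rather than in a real store.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PageHistory:
    """Timestamped snapshots of one monitored URL, oldest first."""
    url: str
    snapshots: list = field(default_factory=list)  # (timestamp, text) pairs

    def record(self, text, when=None):
        """Store a snapshot only if it differs from the latest one.

        Returns True if a new snapshot was stored.
        """
        if self.snapshots and self.snapshots[-1][1] == text:
            return False
        self.snapshots.append((when or datetime.now(timezone.utc), text))
        return True

    def diff_against_current(self, index):
        """Unified diff of snapshot `index` against the most recent snapshot."""
        old_time, old_text = self.snapshots[index]
        new_time, new_text = self.snapshots[-1]
        return list(difflib.unified_diff(
            old_text.splitlines(), new_text.splitlines(),
            fromfile=old_time.isoformat(), tofile=new_time.isoformat(),
            lineterm=""))
```

Because `record` skips unchanged text, the snapshot list doubles as the list of changes, and any entry can be compared with the current version by index.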
I recently discovered http://visualping.io/, which does a good job of this.
If you're not interested in this because it's SaaS, check out BBC's Wraith instead:

http://responsivenews.co.uk/post/56884056177/wraith

Wraith is a cool open source project that we played around with a bunch before starting Diffd. This is roughly what the same diff of Stripe as shown in our demo will look like with Wraith: http://www.diffd.com/imagemagick_compare.png
If you're not interested in Wraith because you want something that non-programmers can use, you can check out Scylla:

http://getscylla.com/

(disclaimer: I wrote it)

Nice. I've written a few scripts to do things like this, such as creating heat maps of the areas of change between two sites and reporting the percentage difference for CI builds.
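A minimal sketch of what such a script could compute, assuming the two screenshots have already been decoded into same-sized grids of pixel values (lists of rows). The function name `diff_stats` and the `cell` parameter are invented for this example, and a real script would likely load the images with an imaging library first.

```python
def diff_stats(img_a, img_b, cell=2):
    """Compare two same-sized pixel grids.

    Returns (percent_changed, heat), where heat[r][c] counts the changed
    pixels inside the cell x cell block at block-row r, block-column c.
    """
    height = len(img_a)
    width = len(img_a[0]) if height else 0
    if height != len(img_b) or any(len(row) != width for row in img_a + img_b):
        raise ValueError("images must have identical dimensions")

    rows = -(-height // cell)  # ceiling division
    cols = -(-width // cell)
    heat = [[0] * cols for _ in range(rows)]
    changed = 0
    for y in range(height):
        for x in range(width):
            if img_a[y][x] != img_b[y][x]:
                changed += 1
                heat[y // cell][x // cell] += 1

    total = height * width
    percent = 100.0 * changed / total if total else 0.0
    return percent, heat
```

A CI step could then fail the build when `percent` exceeds a chosen threshold and render `heat` as an overlay to show where the changes cluster.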

Do you have any plans to open source this? That's the big factor in a lot of organisations' decisions as to whether they'll make use of software or not. For example, we have a policy that we won't rely on any proprietary software within our build / test pipeline.

We have been considering going down that route and releasing Diffd as an open source project while offering a hosted version that is the easiest way to do things in parallel and across multiple browsers and operating systems.

However, we expect that initially most will add Diffd as a manual step at the end of their pipeline. So, it's not like their ability to push code is completely dependent on a third party, it's just another check.

Taking a dependency on Diffd is going to be less risky than depending on a third party for your CI server, like Travis CI, CircleCI or CodeShip.

We don't depend on a third party for our CI though? We use GitLab CI to build all our code / Docker images and run our tests, which not only works very well but is also extremely fast.
Congrats on releasing. I was doing a similar thing with a 'backup' twist, but never got to the MVP: http://clicktwin.com/
Thanks lazyant. Clicktwin looks cool. I think people might find the instant deployment to S3 useful if they had a dynamic website that was falling over from high load.

You might also be able to achieve a similar sort of backup with Cloudflare.