Show HN: Sieve - filter webpages and watch changes (getsieve.com)
44 points by ajitk 4772 days ago
9 comments

Cool concept, but name collision: Sieve has been the name of a server-side mail filtering language for quite some time. https://en.wikipedia.org/wiki/Sieve_%28mail_filtering_langua...
This is exactly where my thoughts went when seeing the name, as well as the title on this submission.
Glad that you liked the concept. Wasn't aware of the unfortunate name collision. I am open to suggestions. Thanks!
Liked this suggestion. Unfortunately the domain names getsiv.com and sivapp.com are taken.

Edit: siv.io could be nice.

if sifter is out, then something like sift could be winnow or winnower. might be a bit obscure for non-agrarians, but it has 'win' in it!
Herodotus or shorten it to Herodo

GetHerodotus.com and GetHerodo.com are both open

DiffWeb?

Detect? (amusingly not taken within tech, seemingly)

well, a sieve sifts, how about sifter?
Thanks. I like sifter. Going to check if it is available.
Probably want to avoid that as well - https://sifterapp.com - #1 Google result for sifter...
strainer?
This is very awesome! The implementation is different to what I expected (or to what I would have done).

They seem to run a browser on the server and let the user interact with it to choose DOM elements to monitor for changes. I would have just taken a screenshot (perhaps with http://urlbox.io/ or perhaps with CutyCapt) and allowed the user to draw boxes over the screenshot. Then repeatedly screenshot the site and whenever the contents of that box changes, alert the user.

The Sieve method has the advantage that you are able to tell the user what the new text is. The screenshot method is significantly simpler.

EDIT: So, in startup terms, the screenshot method could be the MVP :). Provided, of course, that it is considered "viable" not to know the text. If I were a user I would consider it viable - it is still a vast improvement over having to check the site manually.
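A minimal sketch of the screenshot method described in this comment, assuming a screenshot is already available as rows of RGB pixels. Capturing it (for example with CutyCapt) is left out. The names `crop`, `region_changed`, and the box layout are hypothetical and not taken from Sieve.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B)
Image = List[List[Pixel]]         # rows of pixels, indexed image[y][x]
Box = Tuple[int, int, int, int]   # (left, top, right, bottom), right/bottom exclusive


def crop(image: Image, box: Box) -> Image:
    """Return the pixels inside the user-drawn box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def region_changed(previous: Image, current: Image, box: Box) -> bool:
    """True if any pixel inside the box differs between two screenshots."""
    return crop(previous, box) != crop(current, box)


# Two tiny 4x4 "screenshots": a white page, then the same page
# with one pixel changed at x=2, y=1.
WHITE, RED = (255, 255, 255), (255, 0, 0)
before = [[WHITE] * 4 for _ in range(4)]
after = [row[:] for row in before]
after[1][2] = RED

watched_box = (2, 0, 4, 2)   # top-right corner, contains the changed pixel
ignored_box = (0, 2, 2, 4)   # bottom-left corner, untouched

print(region_changed(before, after, watched_box))
print(region_changed(before, after, ignored_box))
```

An exact pixel comparison like this fires on any rendering difference inside the box, so a real tool would probably need a tolerance threshold. It also cannot report what the new text is, which is the trade-off the comment points out.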

Thanks for the great feedback. :)

Getting started would certainly be much simpler when using a screenshot or the page HTML for comparison. Sieve could do that too! More work needs to be done to make startup faster.

On the other hand, having filtered text provides multiple advantages. The filtering becomes accurate. The text can be used to evaluate rules that trigger conditional actions. Notification delivery via email and SMS would make more sense as well.

Running a browser on the server enables another cool function: the user could record a macro and run it when the text satisfies a precondition.

Edit: typo
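The point about rules and conditional actions on filtered text can be sketched roughly as below. This is only an illustration under assumptions: the `Rule` type, the `price_below` helper, and the notification strings are hypothetical and say nothing about how Sieve actually represents rules.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    """Hypothetical rule: run `action` when `condition` holds for the text."""
    name: str
    condition: Callable[[str], bool]
    action: Callable[[str], str]


def first_number(text: str) -> Optional[float]:
    """Pull the first decimal number out of the filtered text, if any."""
    match = re.search(r"\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None


def price_below(limit: float) -> Callable[[str], bool]:
    """Build a condition that holds when the text contains a number below `limit`."""
    def check(text: str) -> bool:
        value = first_number(text)
        return value is not None and value < limit
    return check


def evaluate(rules: List[Rule], text: str) -> List[str]:
    """Return the result of every action whose condition matched."""
    return [rule.action(text) for rule in rules if rule.condition(text)]


rules = [
    Rule("cheap", price_below(20.0), lambda t: f"SMS: price dropped ({t})"),
    Rule("sold-out", lambda t: "sold out" in t.lower(), lambda t: f"Email: {t}"),
]

notifications = evaluate(rules, "Price: $18.50")
print(notifications)
```

Conditions like these only work because the monitored value is available as text. With a screenshot comparison there would be nothing to test the rule against.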

I've been working on http://imnosy.com for a while, and thought about that approach, but ran into difficulties when slight variations to the screen were made. Right now, we're using a diff engine. Sieve looks pretty rad though. Will check that out. Fun space :)
This is what the selection boxes are useful for - you can make it ignore changes to irrelevant areas of the page (though obviously that doesn't apply so well to imnosy).
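For comparison, a text diff between two snapshots can be sketched with Python's standard `difflib`. This is an assumption about what "a diff engine" might look like in its simplest form, not a description of what imnosy does. The `changed_lines` helper and the sample snapshots are made up.

```python
import difflib
from typing import List


def changed_lines(old_text: str, new_text: str) -> List[str]:
    """Return only the removed ('- ') and added ('+ ') lines between snapshots."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


old_snapshot = "Senior engineer\nJunior engineer\nContact us"
new_snapshot = "Senior engineer\nStaff engineer\nContact us"

changes = changed_lines(old_snapshot, new_snapshot)
print(changes)
```

Unlike a pixel comparison, the result says what was removed and what was added, but a whole-page diff will still pick up ads and other noise unless the input is first narrowed to the part of the page that matters.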
I'd been mulling over the same concept for quite a while, as a sort of intelligent update to IE5 for Mac's Subscription manager[1]. It was a very useful tool in my toolbox, and I mourned losing it as that browser decayed.

The main issue with the Subscriptions was that they were global, and would not inform you what changed, just that there were changes. With the increased dynamicness of the web since good ol' Y2K (especially with ads), this model is much less feasible, whereas a DOM-based model is more robust, and allows further automatic data processing.

I never got past small prototypes, so I look forward to Sieve's release since it is basically someone doing my work for me! :)

[1] http://www.macoptions.com/tips/images/iesub2.gif
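A minimal sketch of the DOM-based model mentioned in this comment, using only Python's standard `html.parser`. It assumes the watched element can be identified by an `id` attribute and that the page HTML has already been fetched. The class name, the `watched_text` helper, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class ElementText(HTMLParser):
    """Collect the text inside the first element whose id matches `target_id`."""

    # Void elements never get a closing tag, so they must not affect nesting depth.
    VOID = {"br", "img", "hr", "input", "meta", "link"}

    def __init__(self, target_id: str) -> None:
        super().__init__()
        self.target_id = target_id
        self.depth = 0          # > 0 while inside the watched element
        self.done = False       # set once the watched element has closed
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in self.VOID:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in self.VOID:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def watched_text(html: str, target_id: str) -> str:
    """Return the whitespace-normalized text of the watched element."""
    parser = ElementText(target_id)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


before = '<div id="ad">Buy now!</div><p id="price">Price: <b>$20</b></p>'
after = '<div id="ad">50% off today</div><p id="price">Price: <b>$18</b></p>'

print(watched_text(before, "price"), "->", watched_text(after, "price"))
```

Because only the selected element is compared, a change to the ad block alone would not count as an update, which is the robustness argument made above.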

Ah IE5! It has been a very long wait. :) We are hard at work to launch Sieve. Optimistically we should be ready in a couple of months.
Hello HN! I am excited to show Sieve. It's in alpha state and under active development.

Would appreciate your feedback. Check out the Visual Selector, which lets the user filter content from a webpage.

If what you are doing is essentially a diff for news, then you might be onto something very interesting.

The way news is consumed is currently tiered by the temporal interval it covers: breaking news, daily news, weekly, monthly, annual summaries. A diff for news can help process the different tiers from a single reader, without the "breaking news" tier taking over.

At its core, the essential work is to detect changes important to the user. Adding summarization techniques to further weed out the noise would be a very important improvement.
Maybe I'll give it a try. I use http://www.changedetection.com/ for some pages I like to monitor (typically job sites), but it works only if you can clearly address the page you want to monitor.
Using Sieve it is possible to monitor any kind of webpage. It should be possible to monitor websites that require you to log in to check updates too.
Can I ask what sort of libraries you're using for the canvas-in-page-browser-selector? Is it server or client side?

The technology here is spectacular. Great job!

Thanks for your kind words. The stack is a mix of both. The browser runs on the server and sends updates to the client via a websocket, and the page is then painted on the client. On the client side, part of the noVNC JS library is used to capture input, which is sent to the browser running on the server.
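To make the two directions of traffic concrete, here is a rough sketch of what the messages on such a websocket could look like. Everything in it is an assumption for illustration: the reply does not describe Sieve's wire format, and the JSON layout and the names `FrameUpdate`, `PointerEvent`, and `parse_message` are hypothetical. No actual websocket is opened; the code only encodes and decodes messages.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class FrameUpdate:
    """Server -> client: repaint one rectangle of the remote browser view."""
    x: int
    y: int
    width: int
    height: int
    pixels: bytes   # raw RGB bytes for the rectangle

    def to_message(self) -> str:
        body = asdict(self)
        # JSON cannot carry raw bytes, so the pixel data is base64-encoded.
        body["pixels"] = base64.b64encode(self.pixels).decode("ascii")
        return json.dumps({"type": "frame", **body})


@dataclass
class PointerEvent:
    """Client -> server: a mouse event captured on the client canvas."""
    x: int
    y: int
    button: int

    def to_message(self) -> str:
        return json.dumps({"type": "pointer", **asdict(self)})


def parse_message(raw: str):
    """Turn a text message back into the matching dataclass."""
    body = json.loads(raw)
    kind = body.pop("type")
    if kind == "frame":
        body["pixels"] = base64.b64decode(body["pixels"])
        return FrameUpdate(**body)
    if kind == "pointer":
        return PointerEvent(**body)
    raise ValueError(f"unknown message type: {kind}")


update = FrameUpdate(x=10, y=20, width=1, height=1, pixels=b"\xff\x00\x00")
click = PointerEvent(x=5, y=6, button=1)

print(parse_message(update.to_message()) == update)
print(parse_message(click.to_message()) == click)
```

Sending only the rectangles that changed, rather than whole frames, is the usual way remote-display protocols keep bandwidth down. Whether Sieve does this is not stated in the thread.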
Pretty slick. Could be the seed of a lightweight Browserling competitor, if you could scale it economically.
That's an excellent suggestion. We do have plans to modularize the browser component to offer browser as a web service.
Might I suggest, re the name collision, to simplify the name to Siv? Sieve is prone to misspellings, and "Siv" has more of a brand feel to it.
One of my friends has a summer job keeping an eye on the heat rates of some electrical engines (via a web page, it seems), and he will surely love this.
Is the selection of the relevant part of the page not possible on the client side? Through some iframe magic or something?
It is not possible to select content using iframes due to client-side security restrictions imposed by the browser.
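Assuming the restriction meant here is the browser's same-origin policy, the check can be sketched as a comparison of scheme, host, and port. The helper names and the example paths are hypothetical, and real browsers apply more detailed rules than this.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return the (scheme, host, port) triple that identifies a web origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(page_url: str, frame_url: str) -> bool:
    """True if a script on `page_url` would share an origin with `frame_url`."""
    return origin(page_url) == origin(frame_url)


# A selector page embedding a third-party site in an iframe: different origins,
# so scripts on the outer page cannot read the framed document's DOM.
print(same_origin("https://getsieve.com/app", "https://example.com/jobs"))
print(same_origin("https://getsieve.com/app", "https://getsieve.com/selector"))
```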
Btw, the landing page is a Stripe.com rip off.