Show HN: Sieve - filter webpages and watch changes

	Show HN: Sieve - filter webpages and watch changes (getsieve.com)
	44 points by ajitk 4819 days ago

9 comments

elehack 4819 days ago

Cool concept, but name collision: Sieve has been the name of a server-side mail filtering language for quite some time. https://en.wikipedia.org/wiki/Sieve_%28mail_filtering_langua...

X-Istence 4818 days ago

This is exactly where my thoughts went when seeing the name, as well as the title on this submission.

ajitk 4819 days ago

Glad that you liked the concept. Wasn't aware of the unfortunate name collision. I am open to suggestions. Thanks!

alariccole 4818 days ago

Siv. https://news.ycombinator.com/item?id=5739784

ajitk 4818 days ago

Liked this suggestion. Unfortunately the domain names getsiv.com and sivapp.com are taken.

Edit: siv.io could be nice.

jeremyswank 4818 days ago

if sifter is out, then something like sift could be winnow or winnower. might be a bit obscure for non-agrarians, but it has 'win' in it!

ljd 4818 days ago

Herodotus or shorten it to Herodo

GetHerodotus.com and GetHerodo.com are both open

tekacs 4818 days ago

DiffWeb?

Detect? (amusingly not taken within tech, seemingly)

jeremyswank 4819 days ago

well, a sieve sifts, how about sifter?

ajitk 4818 days ago

Thanks. I like sifter. Going to check if it is available.

twanlass 4818 days ago

Probably want to avoid that as well - https://sifterapp.com - #1 Google result for sifter...

mosburger 4818 days ago

strainer?

jstanley 4818 days ago

This is very awesome! The implementation is different to what I expected (or to what I would have done).

They seem to run a browser on the server and let the user interact with it to choose DOM elements to monitor for changes. I would have just taken a screenshot (perhaps with http://urlbox.io/ or perhaps with CutyCapt) and allowed the user to draw boxes over the screenshot. Then repeatedly screenshot the site and, whenever the contents of a box change, alert the user.

The Sieve method has the advantage that you are able to tell the user what the new text is. The screenshot method is significantly simpler.

EDIT: So, in startup terms, the screenshot method could be the MVP :). Provided, of course, that it is considered "viable" not to know the text. If I were a user I would consider it viable - it is still a vast improvement over having to check the site manually.
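
The screenshot approach described above can be sketched roughly as follows. This is only an illustration, not Sieve's implementation: it assumes each screenshot has already been decoded into rows of grayscale pixel values, and the names `crop`, `box_changed`, and `tolerance` are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # rows of grayscale pixel values (0-255), assumed pre-decoded
Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right and bottom are exclusive


def crop(image: Image, box: Box) -> Image:
    """Return only the pixels that fall inside the user-drawn box."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def box_changed(previous: Image, current: Image, box: Box, tolerance: int = 0) -> bool:
    """Return True if any pixel inside the box differs by more than `tolerance`."""
    old, new = crop(previous, box), crop(current, box)
    # Treat a size mismatch (e.g. the page was re-rendered at another size) as a change.
    if len(old) != len(new) or any(len(a) != len(b) for a, b in zip(old, new)):
        return True
    return any(
        abs(a - b) > tolerance
        for old_row, new_row in zip(old, new)
        for a, b in zip(old_row, new_row)
    )
```

A caller would keep the last screenshot per page, call `box_changed` for each watched box after every new capture, and alert when it returns True. A non-zero `tolerance` is one way to ignore small rendering noise.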

ajitk 4818 days ago

Thanks for the great feedback. :)

Getting started would certainly be much simpler when using a screenshot or the page HTML for comparison. Sieve could do that too! More work needs to be done to make startup faster.

On the other hand, having filtered text provides multiple advantages. The filtering becomes accurate. It can be used to compute rules to take conditional actions. Notification delivery via email and SMS would make more sense as well.

Running a browser on the server enables another cool function. A user could record a macro and run it when the text satisfies a precondition.
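
As a rough sketch of the "rules on filtered text" idea above: once the watched content is plain text, a rule can pair a condition with an action such as a notification or a macro run. Everything named here (`Rule`, `first_number`, `apply_rules`) is hypothetical and not taken from Sieve.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    condition: Callable[[str], bool]  # tested against the filtered text
    action: Callable[[str], None]     # e.g. send an email/SMS or start a recorded macro


def first_number(text: str) -> Optional[float]:
    """Pull the first number out of the text, ignoring thousands separators."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def apply_rules(text: str, rules: List[Rule]) -> List[str]:
    """Run the action of every rule whose condition holds; return the names that fired."""
    fired = []
    for rule in rules:
        if rule.condition(text):
            rule.action(text)
            fired.append(rule.name)
    return fired
```

For example, a "price drop" rule could use a condition that checks `first_number(text)` against a threshold and an action that queues a message. This kind of check is not possible when only screenshots are compared.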

Edit: typo

onassar 4818 days ago

I've been working on http://imnosy.com for a while, and thought about that approach, but ran into difficulties when slight variations to the screen were made. Right now, we're using a diff engine. Sieve looks pretty rad though. Will check that out. Fun space :)
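
For readers unfamiliar with the diff-engine approach mentioned above, here is a minimal sketch using Python's standard `difflib`. It is not imnosy's engine; `text_changes` is a hypothetical helper that compares two text snapshots of a page line by line.

```python
import difflib
from typing import List


def text_changes(old: str, new: str) -> List[str]:
    """Return only the removed ("- ") and added ("+ ") lines between two snapshots."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

An empty result means the two snapshots are identical line for line, so no alert is needed. Unlike a pixel comparison, the result also says what the new text is.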

jstanley 4818 days ago

This is what the selection boxes are useful for - you can make it ignore changes to irrelevant areas of the page (though obviously that doesn't apply so well to imnosy).

chch 4818 days ago

I'd been mulling over the same concept for quite a while, as a sort of an intelligent update to IE5 for Mac's Subscription manager[1]. It was a very useful tool in my toolbox, and I mourned losing it as that browser decayed.

The main issue with the Subscriptions was that they were global, and would not inform you what changed, just that there were changes. With the increased dynamicness of the web since good ol' Y2K (especially with ads), this model is much less feasible, whereas a DOM-based model is more robust, and allows further automatic data processing.
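
To illustrate why a DOM-based model copes better with ads and other page noise, here is a small sketch that extracts the text of one chosen element and ignores the rest of the page. It assumes the watched element can be identified by an `id` attribute; `ElementText` and `watched_text` are hypothetical names, and a real selector would need to be more flexible.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementText(HTMLParser):
    """Collect the text inside the first element with a given id."""

    def __init__(self, element_id: str):
        super().__init__()
        self.element_id = element_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # set once the target element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.element_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def watched_text(html: str, element_id: str) -> str:
    """Return the whitespace-normalized text of the watched element ("" if absent)."""
    parser = ElementText(element_id)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())
```

Comparing `watched_text` results between two fetches reports a change only when the selected element changes, so a rotating ad elsewhere on the page does not trigger a false alert.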

I never got past small prototypes, so I look forward to Sieve's release since it is basically someone doing my work for me! :)

[1] http://www.macoptions.com/tips/images/iesub2.gif

ajitk 4818 days ago

Ah IE5! It has been a very long wait. :) We are hard at work to launch Sieve. Optimistically we should be ready in a couple of months.

ajitk 4819 days ago

Hello HN! I am excited to show Sieve. It's in an alpha state and under active development.

Would appreciate your feedback. Check out the Visual Selector, which lets the user filter content from a webpage.

opminion 4818 days ago

If what you are doing is essentially a diff for news, then you might be onto something very interesting.

The way news is consumed is currently tiered by the temporal interval it covers: breaking news, daily news, weekly, monthly, annual summaries. A diff for news can help process the different tiers from a single reader, without the "breaking news" tier taking over.

ajitk 4818 days ago

At its core, the essential work is to detect changes important to the user. Adding summarization techniques to further weed out the noise would be a very important improvement.

Ecio78 4818 days ago

Maybe I'll give it a try. I use http://www.changedetection.com/ for some pages I like to monitor (typically job sites), but it works only if you can clearly address the page you want to monitor.

ajitk 4818 days ago

Using Sieve it is possible to monitor any kind of webpage. It should be possible to monitor websites that require you to log in to check updates too.

smickie 4818 days ago

Can I ask what sort of libraries you're using for the canvas-in-page-browser-selector? Is it server or client side?

The technology here is spectacular. Great job!

ajitk 4818 days ago

Thanks for your kind words. The stack is a mix of both. The browser runs on the server and sends updates to the client via WebSocket, where they are painted. On the client side, part of the noVNC JS library is used to capture input, which is then sent to the browser running on the server.
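
The two message directions described above can be sketched as follows. This is a guess at the general shape only: the JSON field names, the rectangle-update format, and the functions `encode_update`, `apply_update`, and `encode_click` are all assumptions, not Sieve's or noVNC's actual wire format, and the WebSocket transport itself is left out.

```python
import base64
import json
from typing import List


def encode_update(x: int, y: int, width: int, height: int, pixels: bytes) -> str:
    """Server side: describe one changed rectangle of the rendered page."""
    if len(pixels) != width * height:
        raise ValueError("expected one grayscale byte per pixel")
    return json.dumps({
        "type": "update", "x": x, "y": y, "w": width, "h": height,
        "data": base64.b64encode(pixels).decode("ascii"),
    })


def apply_update(framebuffer: List[bytearray], message: str) -> None:
    """Client side: paint the received rectangle into a local framebuffer."""
    msg = json.loads(message)
    pixels = base64.b64decode(msg["data"])
    for row in range(msg["h"]):
        start = row * msg["w"]
        target = framebuffer[msg["y"] + row]
        target[msg["x"]:msg["x"] + msg["w"]] = pixels[start:start + msg["w"]]


def encode_click(x: int, y: int) -> str:
    """Client side: forward a captured click to the browser on the server."""
    return json.dumps({"type": "click", "x": x, "y": y})
```

Sending only changed rectangles keeps the update messages small, which matters when the page is painted remotely on every change.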

hornbaker 4818 days ago

Pretty slick. Could be the seed of a lightweight Browserling competitor, if you could scale it economically.

ajitk 4818 days ago

That's an excellent suggestion. We do have plans to modularize the browser component to offer the browser as a web service.

alariccole 4818 days ago

Might I suggest, re the name collision, to simplify the name to Siv? Sieve is prone to misspellings, and "Siv" has more of a brand feel to it.

Jhsto 4818 days ago

One of my friends has a summer job keeping an eye on the heat rates of some electrical engines (on a web page, it seems), and he will surely love this.

borplk 4818 days ago

Is the selection of the relevant part of the page not possible on the client side? Through some iframe magic or something?

ajitk 4818 days ago

It is not possible to select content using iframes due to client-side security restrictions imposed by the browser.
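
The restriction referred to above is presumably the browser's same-origin policy: a script may only read the contents of an iframe whose origin (scheme, host, and port) matches its own. The sketch below shows the comparison; `origin` and `same_origin` are hypothetical helpers, and the URL paths in any usage are made up.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> Tuple[str, str, Optional[int]]:
    """Return the (scheme, host, port) triple that browsers compare."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return scheme, host, port


def same_origin(a: str, b: str) -> bool:
    """True only if both URLs share scheme, host, and port."""
    return origin(a) == origin(b)
```

A selector page served from getsieve.com that embeds an arbitrary third-party page in an iframe fails this check, so its scripts cannot inspect the embedded DOM. That is one reason to render the page in a server-side browser instead.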

aatifh 4818 days ago

Btw, the landing page is a Stripe.com rip off.
