Hacker News
by adambyrtek 5750 days ago
> Although launched with a zero funding, the business model for the Mini-WWW is highly scalable: The more you invest into it, the more reviewers can be hired.

I don't think this is what we usually mean by scalability. If you find a good algorithm to measure the "simplicity" of a page, it could be useful, but relying on a manual review process is a recipe for failure.


Relying on a manual review process would be a failure only if you wanted your search engine to cover the whole Web. In contrast, the proposed niche is a small Sub-Web, where slow but high-quality manual reviewing would be quite sufficient: http://mini-www.com/blog/next_big_thing-be_small/

Use Google to access the Web, and Mini-WWW for the "Mini Web".

The problem is that such a slow reviewing process means the results are likely to be bad, because only manually submitted pages are indexed. For example:

http://mini-www.com/search/?find=c%2B%2B

The results are currently terrible, and with an indexing fee of $1 per page (or a required backlink to mini-www.com), I expect the results will stay bad for a long time, perhaps forever.

Who said the actual implementation for the Mini-WWW search would be the same next week? And WHO SAID THE MINIMAL INTRODUCTION PRICE OF $1 IS FOREVER?!!
Don't shout, please.
OK, I will not :-) I'm somewhat confused why some of my answers, which I think are very important for this topic, are being down-voted ...
Manual reviewers are people. You'll still need guidelines to determine what's minimal. People are points of failure... It will be hard to determine what really is minimal.
It's not too hard to measure what might be considered a "minimal" page, if he means simple, uncluttered web pages.

Here is what you do: number of CSS blocks + number of images + number of JavaScript dependencies + some magic function involving unique HTML elements in the page = minimal#

The lower the minimal# the better.

You can also throw in some weights to skew the results and penalize specific components (say flash=1000, gif=300, java-applet=1000, real-media=2000, etc.)
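
A rough sketch of that scoring idea in Python, using only the standard library. The penalty weights are the ones suggested above. Everything else is an assumption made for illustration: the names PageStats and minimal_score are hypothetical, the "magic function" is stood in for by a plain count of distinct tag names, and components are detected by tag name and file extension only. This is not the MiniRank formula.

```python
from html.parser import HTMLParser

# Penalty weights from the comment above. How each component is detected
# below (tag name plus file extension) is an assumption of this sketch.
PENALTIES = {"flash": 1000, "gif": 300, "java-applet": 1000, "real-media": 2000}


class PageStats(HTMLParser):
    """Collects the raw counts that feed the score (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.css_blocks = 0   # <style> blocks and linked stylesheets
        self.images = 0       # <img> elements
        self.js_deps = 0      # external scripts (<script src=...>)
        self.tags = set()     # distinct tag names seen in the page
        self.penalty = 0      # sum of weighted penalties

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. <script async>), so normalize.
        a = {k: (v or "") for k, v in attrs}
        self.tags.add(tag)
        if tag == "style" or (tag == "link" and "stylesheet" in a.get("rel", "").lower()):
            self.css_blocks += 1
        elif tag == "img":
            self.images += 1
            if a.get("src", "").lower().endswith(".gif"):
                self.penalty += PENALTIES["gif"]
        elif tag == "script" and a.get("src"):
            self.js_deps += 1
        elif tag == "applet":
            self.penalty += PENALTIES["java-applet"]
        elif tag in ("object", "embed"):
            src = (a.get("data", "") or a.get("src", "")).lower()
            if src.endswith(".swf"):
                self.penalty += PENALTIES["flash"]
            elif src.endswith((".rm", ".ram")):
                self.penalty += PENALTIES["real-media"]


def minimal_score(html):
    """Return the minimal# for a page: lower means more minimal."""
    stats = PageStats()
    stats.feed(html)
    stats.close()
    # Stand-in for the unspecified "magic function": count distinct tags.
    return (stats.css_blocks + stats.images + stats.js_deps
            + len(stats.tags) + stats.penalty)


if __name__ == "__main__":
    plain = "<html><body><p>hi</p></body></html>"
    busy = ('<html><head><style>p{}</style><script src="a.js"></script></head>'
            '<body><img src="x.gif"><embed src="m.swf"></body></html>')
    print(minimal_score(plain), minimal_score(busy))
```

The plain page scores far lower than the busy one, so sorting candidate pages by this number in ascending order would put the simplest pages first. A real version would need better detection than file extensions.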

Yahoo! Directory is new again.
The MiniRank algorithm is described here: http://mini-www.com/blog/mini_rank_formula/
I described your algorithm elsewhere in this thread without having seen your spec. OK, if you have a history of delivering software, you should be able to pull this off. You will need some massive crawling work, but you can narrow down an initial target by using "good" bookmarks as a starting point.
That's exactly what I will try to do next! However, the software will do no more than provide hints for pre-selection. The final stage should be manual anyway.