Hacker News
by tomhallett 626 days ago
While I don’t disagree with any of your points, it seems like they are using a “platform/UGC/crowd” model to change the economics of the business model.

Whereas TV networks find, vet, and pay for the supply of shows and take on the risk per show, YouTube (at its core) doesn’t do any of that. The content creators do those things in the hope that their content will take off and earn them a share of the ad revenue, while YouTube’s risks are related to the opex cost of the incoming supply/demand.

Instead of Cloudflare paying per examiner, they give a non-guaranteed slice to a bigger group of people.


Gene Quinn (in 2015) estimated that a patent search with an attorney's opinion on patentability for software costs around $2500 to $3000 [1]. Obviously the cost is going to be higher now. Compare that alone against the $1000 ("at least") per winner that Cloudflare is offering.

But Cloudflare isn't asking for an opinion on a particular invention. A patent searcher could come back and say there is no prior art that reads on the invention in that case and still be paid. Instead, Cloudflare's asking for invalidating prior art, which I think sets the bar even higher and should increase the payout to account for the fact that much of the time there won't be invalidating prior art and thus won't be a payout.

If the platform is not taking on as much risk, the payouts should be higher.

[1] https://ipwatchdog.com/2015/04/04/the-cost-of-obtaining-a-pa...
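
To put rough numbers on that argument, here is a minimal Python sketch of the break-even bounty. The $2500 search cost is the low end of Quinn's 2015 estimate and the $1000 bounty is Cloudflare's stated minimum. The 25% chance of finding invalidating prior art is purely an assumed figure for illustration, and both function names are made up here.

```python
def break_even_bounty(search_cost: float, p_success: float) -> float:
    """Bounty at which a searcher's expected payout equals the search cost."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return search_cost / p_success


def expected_net(bounty: float, search_cost: float, p_success: float) -> float:
    """Expected profit (or loss) per search when only successes are paid."""
    return p_success * bounty - search_cost


# Assumed inputs: only SEARCH_COST and BOUNTY come from the thread.
SEARCH_COST = 2500.0   # low end of the 2015 estimate for a paid search
BOUNTY = 1000.0        # Cloudflare's stated minimum per winner
P_SUCCESS = 0.25       # hypothetical hit rate, not a measured figure

needed = break_even_bounty(SEARCH_COST, P_SUCCESS)
net = expected_net(BOUNTY, SEARCH_COST, P_SUCCESS)
```

A different success rate changes the numbers but not the shape of the argument: the lower the chance of a payout, the larger the bounty has to be for a professional searcher to break even.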

I doubt the program’s aimed at patent lawyers. They’re probably casting a wide net hoping to reach people who happen to be close to invalidating prior art to begin with, skipping the search. Or maybe people who’ve been sued by the same patent troll, in which case the program serves to pool findings. If I can write up something I already know in less than an hour and possibly win $1k, why not?
At Google we did a comparison of many, many "patent search" firms: giving them all the same task. Unfortunately I couldn't tell you the results even if I remembered them (which I don't). Most were garbage but a couple were spot-on.

It's more than $3,000; I can tell you that.

Secondly, it's detective work; you might get the answer right away, or you might spend days searching fruitlessly. Making a claim chart is what takes the time: you have to hit every single element.

I really don't understand these posts on "it should be higher". Dozens and dozens of people contributed so the payout was either fair or irrelevant.
Thousands, technically.
But is there any potential disproportionate upside for any of the group of people who are searching? The sued company avoids paying $100 million in damages, and my upside as a searcher is $1000? Correct? Like, I don't have a potential super high upside like a YouTube content creator.
FOSS has almost unlimited upside and is based on contributors who are barely paid anything.
Agreed. I do think, however, that FOSS contributors may get some "social capital" from contributing (approval from the cool crowd, putting it on their resume and walking employers through what they did) that I doubt would go to some dude who spent 100 hours researching and finding an old patent or publication that kills a patent. Though I may be wrong.
You are misguided, for the same reason, about both FOSS and this patent thing.

You just cannot see that for many people it's their genuine interest.

I know plenty of open source contributors, and most of them do not give two damns about social capital or their resume (some don't even work in software, but contribute to open source). They just like solving problems with code.

Strangely, this sounds like a great use case for LLMs? To just grind through entire datasets attempting to surface prior art.

Edit: Found this with a search, so it can be done: https://xlscout.ai/novelty-checker-llm/

(also, thanks Cloudflare! Keep on grinding patent trolls!)

After I quit the USPTO, I tried using ChatGPT 3.5 for some basic patent examining activity out of curiosity, and I can say that it did an absolutely horrendous job. This wasn't prior art search, just analyzing the text to do a rejection based on the text alone (35 USC 112).

And the AI search technologies I used tended to not be particularly good. They typically find "background" documents that are related but can't be used in a rejection.

I don't anticipate LLMs being able to examine patents in general well. Many times a detailed understanding of things not in the text is necessary to examine. For the technologies I examined, often search was basically flipping through drawings. I'd love to see an AI search technology focus specifically on patent drawings. This can be quite difficult. Often I'd have to understand the topology of a circuit (electrical or flow) and find a specific combination of elements. Of course, each drawing could be laid out differently but be topologically equivalent... this surely can be handled with computers in some way, but it's going to require a big effort right now.
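
To illustrate what "topologically equivalent" means here, a minimal Python sketch, assuming each drawing has already been reduced to a netlist (a list of component types plus the nets each one touches). That extraction step is the hard part and is not shown. The representation and function names are hypothetical, and the brute-force search over net renamings only scales to toy circuits.

```python
from collections import Counter
from itertools import permutations


def signature(components, rename):
    """Multiset of (component type, set of nets) after renaming the nets."""
    return Counter(
        (kind, frozenset(rename[n] for n in nets)) for kind, nets in components
    )


def topologically_equivalent(a, b):
    """True if some renaming of the nets in `a` turns it into `b`."""
    nets_a = sorted({n for _, nets in a for n in nets})
    nets_b = sorted({n for _, nets in b for n in nets})
    if len(a) != len(b) or len(nets_a) != len(nets_b):
        return False
    target = signature(b, {n: n for n in nets_b})
    # Brute force: try every one-to-one mapping of nets in `a` onto nets in `b`.
    return any(
        signature(a, dict(zip(nets_a, perm))) == target
        for perm in permutations(nets_b)
    )


# Two drawings of the same R-C-L loop, with different net names and ordering.
loop_a = [("R", ("n1", "n2")), ("C", ("n2", "n3")), ("L", ("n3", "n1"))]
loop_b = [("L", ("x", "z")), ("R", ("y", "x")), ("C", ("z", "y"))]
# Same parts, different wiring: R and C in parallel, L hanging off one node.
other = [("R", ("p", "q")), ("C", ("p", "q")), ("L", ("q", "r"))]

same = topologically_equivalent(loop_a, loop_b)
different = topologically_equivalent(loop_a, other)
```

The comparison itself is easy once a netlist exists. Getting a reliable netlist out of a scanned patent drawing is where the big effort would go.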

The patent office is also horrendous at evaluating novelty, so I suppose ChatGPT has already reached human level performance on this task!
Software developers are similarly terrible at delivering quality software on time and on budget, so I suppose ChatGPT has already reached human level performance on this task!
ChatGPT is a mirror where we don't look too good ...
My point was more that just because humans are terrible at something doesn't mean ChatGPT can't be much worse.
As others have said, ChatGPT is great for writing fluff content that has no right or wrong answer. But it is still weak when a correct answer is needed, like in legal analysis. It can write a great 10 page summary of the history of the use of strawberries. But when it comes to telling how many r's are in the word strawberry, it's not very trustworthy.
I wonder if most people realize that your observation is a fundamental problem with LLMs. LLMs simply have no means to evaluate factuality. Keep asking ChatGPT "Are you sure?" and it will break eventually.

The inability to answer basic factual questions should be a dealbreaker.

Then you need to go over each item with just as much care as you would any probably-irrelevant item pulled from a keyword search, because the LLM is incapable of evaluating it in any way other than correlation.

Also, you don't necessarily have a real dataset to begin with: prior art doesn't need to be patented, it just needs to be published/public/invented sufficiently before the patent. Searching the existing patent database is insufficient.

> Also, you don't necessarily have a real dataset to begin with: prior art doesn't need to be patented, it just needs to be published/public/invented sufficiently before the patent. Searching the existing patent database is insufficient.

I would caution against making assumptions with regard to dataset access and size. I agree that the effectiveness of the effort I mention would be a function not only of gen AI engineering, but also of dataset size and scope.

Going over a better curated list is a significant upgrade and time saver.

Let’s not pretend that “correlation” isn’t very powerful.

There are, in fact, startups working on using AI for legal matters. I know one of the principals in one personally.

I don't know if they're tackling this issue, though.