|
The premise of this blog post is a little off base. (Though I think Open Law Library is doing good work.) The difficulty in building a high-quality legal search engine is not in parsing the links between the documents. High-quality links matter, but they only get you about 25% of the way there. The more important thing is to have a highly accurate and structured understanding of the law. (Think of Google's Knowledge Graph, or the maps they use for their driverless cars.)
Disclaimer: I worked on Google Scholar and am the CEO of Judicata.
A recent evaluation of various legal search engines [1] found: "The oldest database providers, Westlaw and Lexis, had the highest percentages of relevant results, at 67% and 57%, respectively. The newer legal database providers, Fastcase, Google Scholar, Casetext, and Ravel, were also clustered together at a lower relevance rate, returning approximately 40% relevant results."
Westlaw, Lexis and Google Scholar all have high-quality citation parsing (i.e., links). And Scholar relies very heavily on PageRank (as [1] demonstrates). But it is Westlaw and Lexis that are the better search engines. That's because they have invested more in going beyond just links; they've invested a lot in understanding what is happening with the law.
At Judicata our own findings are that the average legal search query is significantly more complex than the average Google query -- having more terms and more concepts. Moreover, whereas only 15% of Google queries are unique, the inverse is true in legal research: more than 85% of queries are unique.
What that means is that in order to return a good result, you need to understand a lot more about the query and the documents you've indexed. You can't rely on links between documents and past searches and clicks to power a quality search engine (the way that Google.com can).
As has been mentioned in other comments here, the real challenge for legal research is extracting structure out of the law (Shepardization, Procedural Postures, Causes of Action, Dispositions, Legal Principles, Arguments, Facts, etc.). That is what will get legal search engines closer to where Google really shines -- results that are powered by the Google Knowledge Graph.
[1] https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2859720 |