by the_real_sparky
1263 days ago
I think the actual interface with OpenAI's platform is the easy part. Everybody and their dog will have a version of this. Just look at the comments so far: many of us have already been playing with it. If you want a real moat, figure out how to parse existing PDF documentation that is really badly formatted. Think diagrams and tables with text floating in various places, etc. Documentation of this style is very common in industries where physical things are being built in the real world. The standards documentation (IEEE, ANSI, NFPA, etc.) doesn't usually parse cleanly, much less the messier internal documentation within businesses. Grobid is the best example of such a documentation parser, but it is so laser-focused on academic papers that it fails to properly process industry-style standards and SOP documentation. What the world needs right now is a Grobid that works for other kinds of messy documentation.

One thing that all these models will lack is the ability to handle diagrams (on both the input and output sides). Working out a clever way to do that would be very cool.
At the moment there are some difficulties with the GPT interface, the trickiest being the limit on the length of the input prompt. I'm not sure how much fine-tuning helps with this.
But my assumption is that OpenAI will improve this, so there isn't much room to differentiate here.
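
For what it's worth, the usual workaround for the prompt length limit is to split a long document into overlapping pieces and send them one at a time. A minimal sketch of that, assuming whitespace-separated words are a good enough stand-in for tokens (they are not the same thing, so leave headroom); `chunk_words`, `max_words`, and `overlap` are names made up for this example, not part of any OpenAI API:

```python
def chunk_words(text, max_words=1500, overlap=100):
    """Split text into overlapping windows of at most max_words words.

    Word counts are only a rough proxy for model tokens (an assumption
    of this sketch), so pick max_words well below the real limit.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window has reached the end of the document.
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap is there so a sentence cut at a boundary still shows up whole in one of the two neighbouring chunks. It does nothing for diagrams or tables, which is the harder problem.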