Once it's possible for users to try out, please do post it here. I'm sure many will be very interested. And email us a link at hn@ycombinator.com so we can make sure it doesn't get flagged.
OP, I'm going to go out on a limb and assume you are Taichi Kato? If not, I'm assuming he, Arya, and Khush will read this discussion.
The three of you should be very proud of yourselves. This is a really cool, novel idea! The fact that you all are sophomores in high school is very impressive too. Continue to follow your passions and you will surely be very successful.
Until we address the stupidity of regurgitation of facts as a means of learning, if this helps someone generate flash cards to prep for exams, it's a positive thing.
This is fantastic. Congrats to the team! Education is an area that is still behind on using tech to dramatically improve the learning process so I'm very excited to see more work in this area.
[Tangent]
I was looking into doing something similar for a side project and got discouraged by my lack of a strong ML background. This is both great work and inspirational. Time to go dust off that repo...
This looks like an amazing tool. As someone who often uses courses that have premade notes but highly limited exercise and quiz material, this looks like a serious blessing (assuming it works). Thanks!
Wow! I love this idea. Congratulations! One question I have is what are you doing with the textbook pictures? Also, does the user categorize where the pictures came from so other people studying the same text can have additional questions?
Are you legally allowed to copy the contents of these textbooks? If I recall correctly, the first page of a textbook often carries a long legal notice that prohibits copying of any kind without written consent from the authors.
What our algorithm does is convolute the text from the textbook to create questions such as "How did Hitler die?" from sentences like "In April 1945, Hitler killed himself." So the result usually ends up being significantly different from the original text in the textbook.
Copyright laws typically prohibit both identical copies, which users appear to make here during image processing / OCR even if only transiently, and derivatives of the copy. So the question then becomes: is this "fair use"?
If the use case for the app is personal use by the possessors of the textbook, very likely no problem. For all other uses you should consult a lawyer to make sure you're in the clear.
Wouldn't this imply that it's illegal for me to make quizzes on my own from information I found in a textbook? If this app is implemented correctly, what it does should be indistinguishable from what a human does. It makes no sense to change the infringement status of a work based on how the work was produced.
Also, generally hard facts (like when Hitler died or how Hitler killed himself) are not copyrightable.
I don't know how you are doing this, but with NNs there's still the chance that the result contains significant parts of the textbooks.
I am not a lawyer, but because you are only interested in syntactic equality (the semantics being the same), I think a simple algorithm, like something based on edit distance, may be able to exclude such cases.
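To make the edit-distance idea concrete, here is a minimal sketch in Python. Everything in it is my own assumption rather than anything the team has described: the function names (`levenshtein`, `similarity`, `too_close`), the character-level comparison, and the 0.8 threshold are all hypothetical and would need tuning on real data.

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical after lowercasing."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def too_close(generated: str, source_sentences: Iterable[str],
              threshold: float = 0.8) -> bool:
    """Flag a generated question that nearly reproduces any source sentence.

    The threshold is an assumed value, not a legally meaningful one.
    """
    return any(similarity(generated, s) >= threshold for s in source_sentences)


source = ["In 1945, Hitler killed himself."]
print(too_close("How did Hitler die?", source))              # reworded question
print(too_close("In 1945, Hitler killed himself?", source))  # near-verbatim copy
```

A filter like this only catches near-verbatim output. It says nothing about whether a heavily reworded question is a derivative work, so it is a sanity check rather than a legal safeguard.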
IIRC (and IANAL), copyright laws are about reproducing text, e.g. taking this text and republishing it without the consent of the copyright holder. In this case that does not apply; while the app does copy the text (and possibly sends it to their servers), this text is not republished.
Copyright law applies, not the legal-looking text on the first page. They can make any claims they want to on the first page, but that has no legal standing in court. This quiz almost undoubtedly falls under fair use.
Of course if you need legal advice you need to consult a lawyer.