Hacker News
by echelon 3508 days ago
I took three semesters of orgo in undergrad, and if it taught me anything, it's that there are exceptions to nearly every rule. There are so many complicated molecular orbital interactions, requiring years of study. And even then there are always things that break these rules in unexpected ways or produce several products.

How do you overcome this? Can you predict yield percentages of each product? What about chirality?

Can your system design synthesis pathways? Can it optimize for final product yield? How does it handle the thermodynamics and kinetics of reactions?

In any case, cool project. It's a very difficult domain.

2 comments

When I started taking organic chemistry, I didn't have enough background knowledge to figure out why many reactions happened the way they did, so I just had to memorize things.

(Accounting is the same way, though for very different reasons. You could justify recording some transactions in about five different ways, but FASB says only one particular way is OK.)

At the grad level, students get really good at rationalizing and predicting reactions. We did a bimonthly exercise called mechanism club, where someone would pick some chemical reactions from the literature and volunteers would come up to the chalkboard and push electrons until the reaction was complete.

Thanks for the kind words! I would agree that the function we're trying to learn is very complicated, with many exceptions. But that only improves our comparative advantage, at least compared to novice chemists. We might also be able to help our system take advantage of the knowledge of physics that expert chemists use to predict reaction outcomes by giving our network access to the output of a physical reaction simulator.

Extending the system to (try to) predict yield percentages or chirality is straightforward. The hard part, in my mind, is that there isn't a fixed number of reaction types. As molecules get bigger, we'll have to move away from predicting one of a fixed set of reaction types and toward directly predicting products, but this is a much harder problem.

Our system probably isn't good enough to design synthesis pathways yet, but that is the eventual goal. A system that also predicted yield would of course help with that, and that would be another straightforward extension.