by delton137 2534 days ago
We published something similar in spirit recently (although it ended up as a conference paper and not in Nature)... Notably, we did our study with far less data: instead of millions of patents, we had the text of a few thousand patents and the text of a few hundred conference papers. We had a specific focus: texts about energetic materials (explosives and propellants).

We showed how chemical-application and chemical-property relations are captured by word2vec and GloVe. For instance, we found that rocket fuels were the chemicals appearing closest to “rocket”, while materials used in air bags appeared closest to “air bag”. We filtered the vocabulary down to chemical names using ChemDataExtractor, and then further to likely energetic chemicals by obtaining SMILES strings from PubChem and using a classifier to label each one as a likely energetic or not.
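
To make the "closest chemicals to an application word" idea concrete, here is a rough sketch of the lookup step. It is not the code from the paper: the 3-d vectors are made up by hand to stand in for trained word2vec/GloVe embeddings, the `CHEMICAL_NAMES` set is a hypothetical stand-in for the output of a chemical-name filter, and `nearest_chemicals` is a name invented for this example. It only illustrates ranking chemical names by cosine similarity to a query word.

```python
import math

# Toy 3-d vectors, invented for illustration only. Real word2vec/GloVe
# embeddings would be learned from the corpus and have far more dimensions.
EMBEDDINGS = {
    "rocket": [0.9, 0.1, 0.0],
    "air_bag": [0.1, 0.9, 0.0],
    "launch": [0.95, 0.05, 0.0],
    "hydrazine": [0.8, 0.2, 0.1],
    "sodium_azide": [0.2, 0.8, 0.1],
    "water": [0.1, 0.1, 0.9],
}

# Hypothetical result of a chemical-name filtering step (assumed, not real output).
CHEMICAL_NAMES = {"hydrazine", "sodium_azide", "water"}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_chemicals(query, embeddings, chemical_names, top_n=2):
    """Rank chemical names by cosine similarity to the query word's vector."""
    query_vec = embeddings[query]
    scored = [
        (cosine(query_vec, embeddings[name]), name)
        for name in chemical_names
        if name != query and name in embeddings
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_n]]


if __name__ == "__main__":
    print(nearest_chemicals("rocket", EMBEDDINGS, CHEMICAL_NAMES))
    print(nearest_chemicals("air_bag", EMBEDDINGS, CHEMICAL_NAMES))
```

Restricting the candidates to chemical names before ranking is what keeps ordinary neighbors like "launch" out of the results; the energetic/non-energetic classifier would be a second filter applied to the same candidate set.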

You can find our work here: https://arxiv.org/pdf/1903.00415.pdf