by simne 1541 days ago
I'm not sure it's the biggest, but it's one of the biggest for sure.

Two problems.

1. DNA sequencing is now a two-part process (I'm simplifying a lot, sure). In the first stage, the target DNA is divided into small pieces, and those pieces are sequenced into digital codes and saved to a database. Then, in the second part, a computer cluster works for a few months to connect those pieces into one chain. This is a very expensive process, even considering just the few gigabytes of storage needed to keep all this data for a few months.

The problem is that the algorithms currently in use are very naive; some projects even use a Perl implementation rather than C or some other fast language. So there is an opportunity to create a new specialized algorithm or software, or maybe also some hardware support, that is at least 5 times faster. If the cost drops under $100 for one human's DNA, this will lead to a very new type of medicine: genetic checks of everything.
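To make the second stage concrete, here is a minimal sketch of the "connect the pieces into one chain" step as a greedy overlap merge. It is a toy under stated assumptions, not how any particular assembler works: the function names `overlap` and `greedy_assemble` and the `min_len` threshold are made up for illustration, reads are assumed error-free, and the pairwise search is deliberately naive (quadratic per merge), which is exactly the kind of cost that stops scaling on real data.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of reads with the largest overlap.

    Returns the remaining contigs; a single contig means everything joined.
    Hypothetical toy: assumes error-free reads from one strand.
    """
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing overlaps any more; leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACG", "TACGTTA"]))
```

Every merge rescans all pairs, so the work grows very quickly with the number of reads; production assemblers avoid this with indexed overlap graphs or de Bruijn graphs.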

2. Connected to the 1st problem: find protein folding structures from DNA, and how they interact with other proteins, with drugs, and with chemical molecules. As far as I know, this is now solved in a similar way to the 1st problem, but partially in 3D space: programs try to calculate positions in which parts of the structures are attracted to each other, and calculate the strength of the attraction. This is how, for example, the probability is calculated that a drug will bind to the spike protein of Covid and neutralize it. The effects of new drugs, like interactions with proteins in the human body, are calculated this way too.
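A rough sketch of the "try positions and calculate the strength of attraction" idea: score each candidate placement of one set of points against another with a pairwise Lennard-Jones 12-6 potential, and keep the lowest-energy placement. This is only an illustration under heavy assumptions. The names `lj_energy`, `pose_energy` and `best_pose` are hypothetical, atoms are bare points with `epsilon = sigma = 1` in arbitrary units, only translations are searched, and real docking tools use much richer scoring functions and search strategies.

```python
import math

Point = tuple[float, float, float]


def lj_energy(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """Lennard-Jones 12-6 energy at distance r (negative means attraction)."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def pose_energy(receptor: list[Point], ligand: list[Point], offset: Point) -> float:
    """Sum pairwise energies with the ligand translated by `offset`."""
    total = 0.0
    for r_atom in receptor:
        for l_atom in ligand:
            moved = tuple(c + o for c, o in zip(l_atom, offset))
            # Clamp the distance so overlapping points do not divide by zero.
            total += lj_energy(max(math.dist(r_atom, moved), 1e-6))
    return total


def best_pose(receptor: list[Point], ligand: list[Point],
              offsets: list[Point]) -> tuple[Point, float]:
    """Return the candidate offset with the lowest (most attractive) energy."""
    scored = [(pose_energy(receptor, ligand, off), off) for off in offsets]
    energy, offset = min(scored)
    return offset, energy


if __name__ == "__main__":
    candidates = [(0.5, 0.0, 0.0), (1.0, 0.0, 0.0), (1.12, 0.0, 0.0), (2.0, 0.0, 0.0)]
    print(best_pose([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)], candidates))
```

The cost is the number of receptor atoms times ligand atoms times candidate poses, which is why searching positions exhaustively gets expensive quickly.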

In an ideal world, it's possible that, for example for cancer, the 1st will quickly produce the DNA sequence of some person's cancer cells, and the 2nd will create the DNA code of an artificial protein, which will be printed on a DNA printer and inhibit the cancer without affecting normal cells, and all this will be done in just a few weeks.

2 comments

There is an interesting related opportunity: given an existing desired protein, find its sequence, which is needed so that you can determine the corresponding genetic sequence so that your custom organism can synthesize the protein.
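Once the amino acid sequence is known, getting *a* corresponding genetic sequence is the easy part. A minimal sketch, assuming the standard genetic code and one arbitrarily chosen codon per amino acid: the table `ONE_CODON` and the function `reverse_translate` are names made up for this example, and because the code is degenerate, many other DNA sequences encode the same protein (in practice the codon choice would be tuned to the host organism).

```python
# One valid codon per amino acid from the standard genetic code.
# The particular choice among synonymous codons here is arbitrary.
ONE_CODON = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
}
STOP_CODON = "TAA"


def reverse_translate(protein: str, add_stop: bool = True) -> str:
    """Return one DNA coding sequence for a one-letter protein string."""
    codons = []
    for residue in protein.upper():
        if residue not in ONE_CODON:
            raise ValueError(f"unknown amino acid: {residue!r}")
        codons.append(ONE_CODON[residue])
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


if __name__ == "__main__":
    print(reverse_translate("MKW"))
```

The hard part the comment is pointing at is the step before this one: reading the amino acid sequence off a physical protein in the first place.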

Protein sequencing is not so easy... very basically you can do it one amino acid at a time based on the fact that amino acids are "sided", meaning side A of each can only connect to side B of the next one. So you attach the end of the protein to some kind of substrate (which can only attach "A" sides), then cut off each amino acid one at a time.

There are some companies in this space:

https://www.quantum-si.com/

You don't need to make a sequencer. They exist now, some are even affordable to buy for personal use, and there is an industry that drives the hardware progress.

But their software is not good, because they are just making their first steps into the gigabytes world; they have not worked with such large chunks of data before, and this is a really stellar opportunity.

1) I'm afraid you're about 15 years out of date with your perception of how genome sequencing and assembly work. Nobody is assembling PacBio, MinION or other long-read data (or short-read Illumina data, for that matter) with slow aligners in Perl.

2) see AlphaFold2 and other related methods -- the protein folding problem is still a big challenge but the field is moving in leaps and bounds

1. You should read carefully, and not fixate on one part of the text while missing all the rest.

2. The author asks about the biggest opportunities, and I gave an example. I suggest you learn how to be less toxic in conversation, as this will make your life better.