by 331c8c71 1 day ago
The post is written almost as if there is no prior art on (germline) variant interpretation. In fact, it is an established niche field with multiple commercial vendors that have existed for years (and diagnostics for critically ill infants is the best-known use case; google Stephen Kingsmore or Rady Children's Hospital, for one). I'd be surprised if the approach is really something novel at this point.

It is definitely the case that parents of babies and kids with life-threatening conditions are often among the most motivated people you will see on Earth, and what they accomplish is sometimes truly incredible. My heart goes out to them, including the OP - I can only imagine how hard it must be...


It's funny you mention Dr. Kingsmore and Rady.

While I am truly grateful to him and the team for their contributions to neonatal genetics (and for hosting me in San Diego for a few days to show me how I could help), Rady was actually the unnamed lab that failed to diagnose my son.

And this happens all the time. The WGS NICU diagnostic rate is only ~30%, depending on who you ask. Just because people have been working on this for a decade and products exist doesn't mean it's a solved problem.

I don't know if you read until the end of my post, but I did run a small experiment in collaboration with an academic geneticist and outperformed the first-line clinical labs across the board. My approach, which is essentially Claude Code for genetics, is novel and fundamentally different from how this work is done today, and it seems to perform much better in early experiments. Time will tell if this generalizes to all clinical work.

I'm planning on publishing evals and benchmarks in the next few weeks, but out-of-the-box systems actually don't do very well for a variety of reasons.

Thanks for the reply. I have read your post, but I obviously haven't seen the preprint, and without knowing the details I remain skeptical.

> The WGS NICU diagnostic rate is only ~30%, depending on who you ask.

Agreed. It does not automatically mean, however, that it can be significantly improved with better variant interpretation or better analysis of the same WGS data in a general sense.

> I'm planning on publishing evals and benchmarks in the next few weeks, but out-of-the-box systems actually don't do very well for a variety of reasons.

Happy to see it. I wish you all the luck and will be the first one praising your solution if I see convincing results.

> Agreed. It does not automatically mean, however, that it can be significantly improved with better variant interpretation or better analysis of the same WGS data in a general sense.

I wouldn't say anything is automatic or taken for granted, but it is actually relatively common for more thorough reanalysis to uncover something that the first pass missed. I hinted at this in the post, but the reason that this doesn't happen today is human bandwidth.

A core part of my thesis is that this highly specialized human bandwidth can be scaled with AI.

It may work. It may not work. But I would feel bad if I didn't give it a try.

> Happy to see it. I wish you all the luck and will be the first one praising your solution if I see convincing results.

Appreciate that! Hopefully, they will come.

> I'd be surprised if the approach is really something novel at this point.

this is a very common reaction to people doing things with llms and i think the effects of it can be somewhat insidious. you constantly see people out there vibing their way to something that has already been discovered somewhere else, but they didn't know that and in many cases wouldn't have known how to find that thing even if they did.

the framing of "the llm tricked you into thinking you discovered something", while technically true in many cases, very strangely casts the positive outcome of a person being linked in a very engaged manner to something they wouldn't otherwise know or have found out as something to be looked down on, and sort of just discourages people from trying stuff themselves that wouldn't be possible for them without something like an agent. it's okay if someone else already found the thing. for areas like science and research, it's actually a good thing if something you did repeated the work of someone else. it validates the original piece of work, and it tells you the things you were trying were on the right track to begin with.

Interesting, I hadn't thought about it this way, but it's indeed a kind of IKEA effect, assuming that P-creativity is H-creativity (as defined by Boden), which is totally aligned with the incentives of using models.

They have to be useful, otherwise nobody comes back, and used, not just a starting point that can be bypassed after doing it a couple of times. Instead of pointing to what exists, which is basically what a search engine does, it "helps" the user by building. It also gives an amazing sense of agency and power: you "do" get something that seems to come out of nothing, conveniently removing provenance and thus making the user feel quite good about the process.

This is especially poignant to me given this anecdote from a friend I shared just days ago https://news.ycombinator.com/item?id=48457842 showcasing how he wrote a Wacom driver, on his own, without being a developer, thanks to Claude, and how he even potentially helped others by sharing back what he "built", only for someone to suggest an already existing project https://news.ycombinator.com/item?id=48459366 .

Related professional ramblings from a few days ago https://news.ycombinator.com/item?id=48353270 (TL;DR: more code isn't better, doesn't matter who or what wrote it)

There can be plenty of prior art; however, AI can democratize that knowledge. There are many things it's helped me accomplish which are trivial to people who are far more knowledgeable than me.

I am not sure what specifically you mean by "AI" here, but it's a bit naive (no offence) to think the field is so dumb that it hasn't been looking at "AI" for a few years already. See https://www.biorxiv.org/content/10.1101/2023.10.02.560464v1 for instance.

nostos/limbus, genoox, engenome, congenica are a few companies/products that I have heard about and that have been around for years (the last one is defunct, from what I last heard, however).

Disclaimer: not affiliated with any of these.

I thought that the author had used AI to vibe code the software to help him look through the genomic data:

"Like all of the others, this WGS came back non-diagnostic. Unlike the others, my heartbreak had passed the point of hopelessness and I was ready to do something. I contacted every lab we had worked with and requested access to all genomics files. I was going to figure this out myself.

"After a few days, my initial results shocked me. The prototype I built not only accomplished my original goal of confirming that Warren seemed healthy (spoiler: everything is fine), but it found the genetic mutation that took our first son Owen’s life. How could something I built so quickly outperform the top sequencing lab in the country?"