| Every now and then, I'll actually sort of believe an article like this. Then I go and test the current models on things like semantic search.

For instance - The Hough transform detects patterns with certain structure in images, e.g. circles or lines. So I'm looking for academic research papers which apply the Hough transform to audio spectra, to recognize the harmonic structure of tonal audio and thus determine the fundamental pitch. (i.e. the Hough space would be a 1D space over fundamental frequency).

I've spent probably 90 minutes, over the several times I've read an optimistic post like this, asking various LLMs (mostly GPT-4o, though my early tests predate GPT-4o, and I've also tried Gemini and Claude), prompts along the lines of

> The Hough transform detects patterns with certain structure in images, e.g. circles or lines.
> I'm looking for academic research papers (please link them or provide a DOI.org link at least) which apply the Hough transform to audio spectra, to identify the harmonic structure of audio and thus determine the fundamental pitch.
> Make sure to provide only papers that actually exist. If you can't find anything particularly relevant, say so as a disclaimer & just provide the most relevant papers you can.

This is a reliable "fake paper generator", unfortunately - it'll just make up plausible garbage like

> Here are some academic papers related to applying the Hough transform to audio spectra for identifying harmonic structures and fundamental pitch:
> "An Audio Pitch Detection Algorithm Based on the Hough Transform"
> Authors: Mark W. and John D.
> Published In: IEEE Transactions on Audio, Speech, and Language Processing
> DOI: 10.1109/TASL.2008.2000773
> Abstract: This paper proposes an audio pitch detection algorithm that utilizes the Hough transform to analyze the harmonic structure of audio spectra and determine the fundamental pitch.

This paper does not exist. Complete waste of my time. And again, this behavior persists over the >1 year period I've been trying this query.

And it's not just search-like tasks. I've tried asking for code and gotten stuff that's outright dangerous (try asking for code to do safe overflow-checked addition on int64_t in C - you have about an 80% chance of getting code that triggers UB in one way or another). I've asked for floating-point calling conventions on RISC-V for 32-bit vs 64-bit (would have been faster than going through the extension docs), and been told that RV64 has 64 floating-point registers (hey, it's got a 64 in the name!). I've asked if Satya Nadella ever had COVID-19 and been told - after GPT-4o "searched the web" - that he got it in March of 2023.

As far as I can tell, LLMs might conceivably be useful when all of the following conditions are true:

1. You don't really need the output to be good or correct, and
2. You don't have confidentiality concerns (sending data off to a cloud service), and,
3. You don't, yourself, want to learn anything or get hands-on - you want it done for you, and
4. You don't need the output to be in "your voice" (this is mostly for prose writing, for code this doesn't really matter); you're okay with the "LLM dialect" (it's crucial to delve!), and
5. The concerns about environmental impact and the ethics of the training set aren't a blocker for you.

For me, pretty much everything I do professionally fails conditions 1 and 2, and anything I do for fun fails number 3. And so, despite a fair bit of effort on my part trying to make these tools work for me, they just haven't found a place in my toolset - before I even get to 4 or 5. Local LLMs, if you're able to get a beefy enough GPU to run them at usable speed, solve 2 but make 1 even worse... |
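For readers wondering what the overflow-checked int64_t addition mentioned in the comment above can look like when it avoids UB: below is a minimal sketch, not anyone's canonical implementation. The names checked_add_i64 and checked_add_i64_selfcheck are hypothetical, made up for illustration. The point is that the range check has to happen before the addition; the common broken pattern computes a + b first and inspects the sign afterwards, by which time the signed overflow (undefined behavior in C) has already been evaluated.

```c
#include <assert.h>   /* only needed for the self-check below */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: stores a + b in *out and returns true if the sum
 * fits in int64_t; otherwise returns false and leaves *out untouched.
 * Both comparisons are evaluated BEFORE the addition, and neither
 * INT64_MAX - b (for b > 0) nor INT64_MIN - b (for b < 0) can overflow,
 * so no signed overflow is ever evaluated. */
static bool checked_add_i64(int64_t a, int64_t b, int64_t *out)
{
    if (b > 0 && a > INT64_MAX - b) {
        return false;               /* sum would exceed INT64_MAX */
    }
    if (b < 0 && a < INT64_MIN - b) {
        return false;               /* sum would fall below INT64_MIN */
    }
    *out = a + b;                   /* proven to be in range */
    return true;
}

/* Hypothetical self-check exercising the boundary cases. */
static void checked_add_i64_selfcheck(void)
{
    int64_t r = 0;
    assert(checked_add_i64(1, 2, &r) && r == 3);
    assert(!checked_add_i64(INT64_MAX, 1, &r));
    assert(!checked_add_i64(INT64_MIN, -1, &r));
    assert(checked_add_i64(INT64_MAX, INT64_MIN, &r) && r == -1);
}
```

If portability to other compilers is not a concern, GCC and Clang also provide __builtin_add_overflow, which does the same job without hand-written comparisons; the version above only assumes standard C99 headers.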
2 ResearchGate papers ("Overlapping sound event recognition using local spectrogram features with the Generalised Hough Transform", Pattern Recognition Letters, July 2013)
and one IEEE publication ("Generalized Hough Transform for Speech Pattern Classification", in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 11, pp. 1963-1972, Nov. 2015)
When I am looking for real web results, ChatGPT is not very good, but Perplexity very often shines for me.
For Python programming, have a look at withpretzel.com, which does the job for me.
just my 2 ct