But I don't want to be able to parse only highly restricted languages. I want to be able to parse anything, including natural language or even non-languages like raw audio.
Yes, humans never misunderstand each other, instructions are always clear to everyone and totally unambiguous, and luckily no language distinguishes meaning purely by intonation, and, and... and...
No, they cannot be reliably parsed. There is no unambiguously correct parsing for many (or, arguably, any) strings. Two people could say the same thing in the same context and mean different things by it. You can't even definitively say whether what they said/wrote is valid English. Sure, there are strings most would agree are and strings most would agree aren't, but even taking consensus opinion as the source of truth, most isn't all, and there's no universally agreed upon threshold for acceptance.
That's okay - that just means your parser needs to model what the speaker was thinking when they said it. That's extra information that's required to decode the message. It is not necessary for the same text to always mean the same thing.
If you need to already know what the speaker meant in order to understand them, then there is no point in communication.
Human language has a pretty clear distinction between syntax and semantics. This is how we recognize that "colorless green ideas sleep furiously" is perfectly well formed, if meaningless. In contrast, "I is happy" is meaningful and unambiguous, but grammatically incorrect.
In terms of syntax, English (like most, if not all, languages) is literally ambiguous.
Consider the sentence structure:
Subject Verb Object Prepositional-Phrase.
This can be either:
(Subject Verb (Object Prepositional-Phrase))
Or
(Subject Verb Object) Prepositional-Phrase.
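For instance, consider the sentence "I saw a man with binoculars": either the man has the binoculars (the first structure) or I used binoculars to see him (the second).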
In any sense of the word, this example is structurally ambiguous.
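To make that concrete, here is a minimal sketch in Python that enumerates every parse tree a toy grammar allows for this sentence. The grammar, the lexicon, and the `parses` helper are all made up for illustration (they cover this one sentence and are nowhere near a real grammar of English); the point is only that a purely syntactic parser hands back more than one tree and has no basis for picking between them.

```python
from functools import lru_cache

# Toy grammar: an assumption for illustration, not a real English grammar.
# Each nonterminal maps to its binary productions.
BINARY = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP")],   # "a man", "a man [with binoculars]"
    "VP": [("V", "NP"), ("VP", "PP")],    # "saw X", "[saw X] [with binoculars]"
    "PP": [("P", "NP")],
}

# Hypothetical lexicon: the categories each word may take.
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "a": ["Det"],
    "man": ["N"],
    "with": ["P"],
    "binoculars": ["NP"],
}


def parses(tokens, symbol="S"):
    """Return every bracketed tree deriving `tokens` from `symbol`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def build(sym, i, j):
        trees = []
        # A single word can be covered directly by a lexical category.
        if j - i == 1 and sym in LEXICON.get(tokens[i], []):
            trees.append(f"({sym} {tokens[i]})")
        # Otherwise try every binary rule at every split point of the span.
        for left, right in BINARY.get(sym, []):
            for k in range(i + 1, j):
                for lt in build(left, i, k):
                    for rt in build(right, k, j):
                        trees.append(f"({sym} {lt} {rt})")
        return tuple(trees)

    return list(build(symbol, 0, len(tokens)))


for tree in parses("I saw a man with binoculars".split()):
    print(tree)
```

With this grammar the loop prints two trees: one where "with binoculars" attaches inside the object noun phrase, and one where it attaches to the verb phrase. Nothing in the string itself says which one the speaker meant.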
Fine, a parser that is a perfect oracle for authorial intent can reliably parse English. But no real parser can. And anyways, that effectively extends the English grammar to include the entire world state, which isn't really what people mean when they talk about English as a language or parsing strings—a fact which perhaps helps to illustrate the problem.