When you're classifying letters in the abstract, Y doesn't get included in the vowels.
There is a school of thought that says you should also be able to classify letters as they appear in specific words.† That's where the idea comes from that "Y is sometimes a vowel".
But people who claim to follow that school are generally unwilling to say that M is sometimes a vowel, even though the second syllable of "rhythm" is spelled with nothing but "m". That is hard to reconcile with the usual justification for Y sometimes being a vowel: that every syllable must contain a vowel.
As far as the phonetics go, they support the idea that the "m" /m/ in "rhythm" is not a vowel (it would be called a "syllabic consonant"), while the "y" /i/ in "homily" is. Obviously, this requires tossing out the idea that all syllables must contain vowels. Phonetics assigns an intermediate status to the "y" /j/ in "yell": it is a vowel as far as the mechanics of producing it go, but it clearly behaves like a consonant, so it is called a "glide".
As far as the etymology goes, Y represents a foreign sound, a vowel in ancient Greek, so on that view it's always a vowel. Interestingly enough, that ancient Greek vowel is sometimes a consonant (what you'd think of as V) in modern Greek.
† This doesn't really work; any particular word will have a definite spelling and a definite pronunciation, but that doesn't mean that it's possible to consistently map the letters of the spelling to the sounds of the pronunciation.
Maybe it's an accent thing (mine is typical British) but the 'm' in 'rhythm' doesn't act like a vowel in my mind, not the way 'y' does in 'sky'.
The only kind-of vowel sound in the second syllable of 'rhythm' is what you get from saying 'th', but actually it just sounds like three consonants together without a vowel.
People say that 'y' is sometimes a vowel because it sometimes sounds like one, not because it sometimes fits in a syllable.
Y /j/ always sounds like a vowel. As I mentioned, phonetically it is one. It's just that a shortened version can act as a consonant for phonological purposes. The Y in "yell" and the Y in "pretty" are the same sound; it's just shorter in "yell".