Today's software beats the hell out of the audio tapes I used in the 90s, but the ideas are the same: familiarization, recall, and recognition.
First you familiarize yourself with an interval by playing it all over your instrument, listening to it, and singing it.
Then you test your recall: Play a note N, and try to sing N+I (where I is the interval).
Once you know more than one interval, you can also test recognition: Someone else plays an interval (N,N+I) and you try to identify I.
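If you'd rather script the recall and recognition drills than rely on a partner, here's a rough sketch. It assumes notes are MIDI note numbers (so N+I is plain addition in semitones) and equal temperament with A4 = 440 Hz. The function names and the C3 to C5 range are made up for illustration, and actually playing the sound is left out.

```python
import random

# Semitone counts for the twelve intervals up to the octave (standard names).
INTERVALS = {
    1: "minor second", 2: "major second", 3: "minor third", 4: "major third",
    5: "perfect fourth", 6: "tritone", 7: "perfect fifth", 8: "minor sixth",
    9: "major sixth", 10: "minor seventh", 11: "major seventh", 12: "octave",
}

def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

def recall_target(n, i):
    """Recall drill: given note N and interval I, return the note to sing (N+I)."""
    return n + i

def recognition_question(known, rng=random):
    """Recognition drill: pick a random root and one of the intervals you know.

    Returns the pair (N, N+I) to be played and the interval name to guess.
    """
    n = rng.randint(48, 72)        # hypothetical range, C3 to C5
    i = rng.choice(sorted(known))  # only quiz intervals already familiarized
    return (n, n + i), INTERVALS[i]
```

For example, `recall_target(60, 7)` gives 67, a perfect fifth above middle C, and `recognition_question({4, 7})` quizzes you on major thirds and perfect fifths only.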
Once you've got the twelve intervals (you can go past the octave, but that's pretty easy, so basically there are only 12 to learn) you can do the same for chords. Chords are easy compared to intervals, though, because they are built up from intervals: even before you can recognize them all at once, you'll already be able to pick them apart note by note to figure out what they are.
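The "pick them apart note by note" idea can be sketched in code too: take the notes you heard, measure each one's interval from the lowest note, and look up the resulting stack. This is only an illustration. It assumes MIDI note numbers and root-position triads, and `identify_chord` is a hypothetical helper name.

```python
# Triad qualities as semitone offsets from the root (standard definitions).
CHORDS = {
    (0, 4, 7): "major triad",       # major third + perfect fifth
    (0, 3, 7): "minor triad",       # minor third + perfect fifth
    (0, 3, 6): "diminished triad",  # minor third + tritone
    (0, 4, 8): "augmented triad",   # major third + minor sixth
}

def identify_chord(notes):
    """Name a root-position chord from the notes picked out by ear."""
    notes = sorted(notes)
    # Each offset is an interval above the lowest note, in semitones.
    offsets = tuple(n - notes[0] for n in notes)
    return CHORDS.get(offsets, "unknown")
```

So hearing C, E, and G (60, 64, 67) as a major third plus a perfect fifth above the root is enough to call it a major triad.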
Seriously though, you can think of melody etc. as positions relative to a scale as well as intervals between notes. Ear training using scales is simplest and eventually starts to apply to more complicated patterns. Eventually you recognize the intervals and scales even if you are not consciously thinking about descriptors like 'oh, this is a major third'.