Same basic idea. He's using variable-length input strings and mapping them to unique symbols rather than the other way around, but it seems like he's using entropy measurements to build an optimal trie.
No, it's not. (Although "unicode" is ambiguous, because there are several unicode encodings.) Even if the binary sequence can be decoded as unicode code points, certain combinations of code points aren't allowed (surrogate pairs have to be matched).