Great read, thanks for sharing. Would love to see the natural language + code mixed in there :)
I've been interested in contrastive learning for a while, mainly as a means to train semantic code search models. OpenAI released a great paper on this topic called Text and Code Embeddings by Contrastive Pre-Training[1] that outlines the approach. I've used it as a base to build https://codesearch.ai [2] with pretty good results.