| In my (limited) experience it seems to perform even better for typed languages (for example Kotlin/Java/Swift) than for Python. The Python code it provided often had subtle type issues when working with dates, while the Kotlin date-related code it provided was more accurate and correct in terms of types. That makes sense, since the additional type information likely leads to a much better "internal model of how Kotlin works". What surprised me was the level of "understanding" it seems to show when I provide it with some of my own sample code. It can analyze the code, explain how it works and what it does, use libraries, suggest improvements, and apply those improvements. Have a look at this conversation: https://imgur.com/a/ZtViC3d While the end result isn't perfect, it's still highly impressive, and while I was an AI skeptic before, I now see the possible benefits of AI assistants for programming. Some other prompts with very impressive results: * "Write an implementation for the following Kotlin repository interface: <insert-interface-with-full-type-signatures>." * (followup) "Add save/load methods that store the backing map in a JSON file" * (followup) "Replace Gson with Jackson for JSON serialization" * "Write an Android layout xml for a login form with username/password/loginbutton" * (followup) "Provide the Kotlin activity code for this layout" * "Write a Kotlin function that parses a semver input string into a data class" |
In my (limited) experience it seems to perform even better for typed languages (for example Kotlin/Java/Swift) than for Python. The Python code it provided often had subtle type issues when working with dates, while the Kotlin date-related code it provided was more accurate and correct in terms of types. That makes sense, since the additional type information likely leads to a much better "internal model of how Kotlin works".
I think another possibility here is that they used an execution environment to check whether the code the model came up with actually compiles, and fed that result back in as additional input during training. Some sort of execution environment also seems to me a plausible explanation for how they got the model to emulate a terminal so well.
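
To make the idea a bit more concrete, here is a minimal sketch of what a "does it compile?" check could look like. This is purely my guess at the general shape, and nothing here is confirmed about how the model was actually trained. The names `compiles` and `compile_reward` are hypothetical, and Python's built-in `compile()` stands in for a real compiler; for Kotlin you would presumably invoke the Kotlin compiler instead.

```python
def compiles(source: str) -> bool:
    """Return True if `source` byte-compiles as Python.

    compile() only parses and compiles the text; it does not run it,
    so this catches syntax errors but not runtime or type errors.
    """
    try:
        compile(source, "<candidate>", "exec")
        return True
    except (SyntaxError, ValueError):
        return False


def compile_reward(candidates: list[str]) -> list[float]:
    """Hypothetical training signal: 1.0 if a candidate compiles, else 0.0."""
    return [1.0 if compiles(c) else 0.0 for c in candidates]


# Two candidate completions for the same prompt (made-up examples).
samples = [
    "def add(a, b):\n    return a + b\n",
    "def add(a, b)\n    return a + b\n",  # missing colon after the signature
]
rewards = compile_reward(samples)
```

A check like this gives more signal for a statically typed language, since the compiler also rejects type errors. That would fit the observation that the Kotlin output was more accurate than the Python output.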