by ptx 3157 days ago
Isn't the biggest difference between Python 2 and 3 the handling of Unicode? That's what motivated the break in compatibility in the first place.

But in Skulpt the strings don't quite work like any of the Python versions.

Python 2 has a combined str/bytes type and a separate Unicode string type:

  >>> type("hello") is type(u"你好")
  False
  >>> len(u"你好")
  2
  >>> len("你好")
  6
...whereas in Python 3 the str type is Unicode and the "bytes" type is a completely separate thing:

  >>> type("hello") is type(u"你好")
  True
  >>> len(u"你好")
  2
  >>> len("你好")
  2
Skulpt actually seems to work more like Python 3, except that 1) there is no way at all to work with bytes (that I can find), e.g. no encode/decode methods, and 2) it requires the "u" prefix if a literal contains non-ASCII characters, even though the type of the resulting string is the same as without the prefix:

  >>> type("hello") is type(u"你好")
  True
  >>> len(u"你好")
  2
  >>> len("你好")
  SyntaxError: invalid string (possibly contains a unicode character) on line 1
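
For comparison, here is a minimal sketch of the str/bytes round trip that Python 3 offers through encode/decode, the part that appears to be missing in Skulpt. It assumes UTF-8 as the encoding, which is presumably also where the 6 in the Python 2 example comes from (a UTF-8 terminal); the variable names are only illustrative.

```python
# Python 3 sketch: str holds Unicode code points, bytes holds encoded data.
# Assumption: UTF-8 is the encoding in use.
text = "你好"
encoded = text.encode("utf-8")         # str -> bytes

text_len = len(text)                   # counts code points
encoded_len = len(encoded)             # counts UTF-8 bytes
round_trip = encoded.decode("utf-8")   # bytes -> str

print(type(text).__name__, text_len)
print(type(encoded).__name__, encoded_len)
print(round_trip == text)
```

Each of the two characters takes three bytes in UTF-8, so the byte length is 6 while the string length is 2.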
2 comments

Skulpt strings are JavaScript strings internally; whether or not you add u in front of a string doesn't actually change its internal representation. We always strive to be as close to CPython as we can, but in this instance we chose to use JavaScript strings internally, very likely with the mentality that we would come back to this if and when it became a requirement for one of us. :) Which it hasn't been, I think.
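
One possible consequence of a JavaScript-string representation, sketched in Python 3 rather than tested against Skulpt: JavaScript's String.length counts UTF-16 code units, while Python 3's len counts code points. The helper name utf16_code_units is hypothetical, and whether Skulpt's len actually exposes the code-unit count is an assumption that would need checking.

```python
# Python 3 sketch comparing code points with UTF-16 code units.
# utf16_code_units is a hypothetical helper, not part of Skulpt or CPython.
def utf16_code_units(s):
    """Count UTF-16 code units, the quantity JavaScript's String.length reports."""
    return len(s.encode("utf-16-le")) // 2

bmp = "你好"              # both characters fit in one UTF-16 code unit each
astral = "\U0001F600"    # outside the Basic Multilingual Plane, needs a surrogate pair

print(len(bmp), utf16_code_units(bmp))
print(len(astral), utf16_code_units(astral))
```

For the example in this thread the two counts agree; they only diverge for characters outside the Basic Multilingual Plane.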
Thanks for this succinct explanation! I've opened an issue on the skulpt repo. https://github.com/skulpt/skulpt/issues/731