Hacker News new | ask | show | jobs
by mont_tag 488 days ago
Didn't Python's compact dictionary implementation do this a decade ago?

"The dict type now uses a “compact” representation based on a proposal by Raymond Hettinger which was first implemented by PyPy. The memory usage of the new dict() is between 20% and 25% smaller compared to Python 3.5." -- https://docs.python.org/3.6/whatsnew/3.6.html#whatsnew36-com...

"Note, the sizeof(index) can be as small as a single byte for small dicts, two bytes for bigger dicts and up to sizeof(Py_ssize_t) for huge dict." -- https://mail.python.org/pipermail/python-dev/2012-December/1...

The "tiny pointers" are in the _make_index method in the proof of concept code. -- https://code.activestate.com/recipes/578375-proof-of-concept...

      @staticmethod
      def _make_index(n):
          'New sequence of indices using the smallest possible datatype'
          if n <= 2**7: return array.array('b', [FREE]) * n       # signed char
          if n <= 2**15: return array.array('h', [FREE]) * n      # signed short
          if n <= 2**31: return array.array('l', [FREE]) * n      # signed long
          return [FREE] * n
The logic is still present today in CPython. -- https://raw.githubusercontent.com/python/cpython/3e222e3a159...

  dk_indices is actual hashtable.  It holds index in entries, or DKIX_EMPTY(-1)
  or DKIX_DUMMY(-2).
  Size of indices is dk_size.  Type of each index in indices varies with dk_size:

  * int8  for          dk_size <= 128
  * int16 for 256   <= dk_size <= 2**15
  * int32 for 2**16 <= dk_size <= 2**31
  * int64 for 2**32 <= dk_size
3 comments

The "compact dictionary" representation you're talking about not always using 64-bit numbers, but rather using log(n)-bit values when you have n entries (i.e. using 8-bit values when there's only <= 128 entries). The paper linked here talks about pointers which uses less than log(n)-bits for n entries.
These are the "traditional log n-bit pointers" that tiny pointers improve on.
No, although it's not completely unrelated.