|
> I strongly doubt your compiler will transform all the literal alists in your program into hash tables, Time for experiment.
SBCL doesn't, as far as I know. But you seem to imply that it is always better to use hash-tables, and I don't think so (even though hash-table can be implemented in a smart way, like in Lua). Under roughly 10 elements, alist result in shorter code. (defun my-fun (x)
(declare (optimize (speed 3) (debug 3) (safety 0))
(type (member a b c) x))
(let ((a '((a . 10) (b . 21) (c . 31))))
(cdr (assoc x a :test #'eq))))
(disassemble #'my-fun)
; disassembly for MY-FUN
; Size: 53 bytes
; 04A7FF2F: 483B1592FFFFFF CMP RDX, [RIP-110] ; 'A
; no-arg-parsing entry point
; 36: 7511 JNE L1
; 38: 488B0D91FFFFFF MOV RCX, [RIP-111] ; '(A . 10)
; 3F: L0: 488B5101 MOV RDX, [RCX+1]
; 43: 488BE5 MOV RSP, RBP
; 46: F8 CLC
; 47: 5D POP RBP
; 48: C3 RET
; 49: L1: 488B1D98FFFFFF MOV RBX, [RIP-104] ; '(B . 21)
; 50: 488B0D89FFFFFF MOV RCX, [RIP-119] ; '(C . 31)
; 57: 483B157AFFFFFF CMP RDX, [RIP-134] ; 'B
; 5E: 480F44CB CMOVEQ RCX, RBX
; 62: EBDB JMP L0
This is roughly equivalent to a "case" construct (not shown here). Then, this is what I have with a hash-table: (let ((table (make-hash-table :test #'eq)))
(declare (optimize (speed 3) (debug 3) (safety 0)))
(setf (gethash 'a table) 10)
(setf (gethash 'b table) 21)
(setf (gethash 'c table) 31)
(defun my-fun-hash (x)
(declare (optimize (speed 3) (debug 3) (safety 0))
(type (member a b c) x))
(gethash x table)))
(disassemble #'my-fun-hash)
; disassembly for MY-FUN-HASH
; Size: 115 bytes
; 05467EB0: .ENTRY MY-FUN-HASH(X) ; (FUNCTION (#)
; (VALUES T # ..))
; EE8: 8F4508 POP QWORD PTR [RBP+8]
; EEB: 488D65E8 LEA RSP, [RBP-24]
; EEF: 488B7805 MOV RDI, [RAX+5]
; EF3: 4C8BC7 MOV R8, RDI
; EF6: 498BF8 MOV RDI, R8
; EF9: BE17001020 MOV ESI, 537919511
; EFE: 488B05FBFDFFFF MOV RAX, [RIP-517] ; #<FDEFINITION object for SB-IMPL::GETHASH3>
; F05: B906000000 MOV ECX, 6
; F0A: FF7508 PUSH QWORD PTR [RBP+8]
; F0D: FF6009 JMP QWORD PTR [RAX+9]
; F10: 6A20 PUSH 32
; F12: B9604F4200 MOV ECX, 4345696 ; alloc_tramp
; F17: FFD1 CALL RCX
; F19: 59 POP RCX
; F1A: 488D490B LEA RCX, [RCX+11]
; F1E: E917FFFFFF JMP #x1005467E3A
The function with an hash-table has a constant size relatively to the number of elements in the table, which is fully known at compile-time. The size of the code with a case/alist grows linearly with the number of elements.Now, let's compare speed. In the following versions, I have added more elements in the alist, just to be sure the code with an hash-table is the shortest (from 'a to 'l). (time
(dotimes (i 10000000)
(dolist (x '(a b c d e f g h i j k l))
(my-fun-hash x))))
(time
(dotimes (i 10000000)
(dolist (x '(a b c d e f g h i j k l))
(my-fun-case x))))
(time
(dotimes (i 10000000)
(dolist (x '(a b c d e f g h i j k l))
(my-fun x))))
Results: HASH:
2.491 seconds of real time
33,072 bytes consed
ALIST:
0.852 seconds of real time
0 bytes consed
CASE:
0.778 seconds of real time
0 bytes consed
Even though in terms of footprint, the code tend to be larger with alist quite rapidly (10 elements), the resulting code is faster and does not allocate memory.Also, just to clarify, alist and property lists have a different behavior than hash-tables, namely that the sequential access allows you to shadow values from another list: if you write (cons (cons 'a b) older-alist), you have a new list where the value for key 'a is b, and where values for other keys are those found in older-list (even though older-list also contains key 'a). I don't quite remember what is my point anyway ;-) but it was fun to test the different behaviors. |
> Also, just to clarify, alist and property lists have a different behavior than hash-tables, namely that the sequential access allows you to shadow values from another list: if you write (cons (cons 'a b) older-alist), you have a new list where the value for key 'a is b, and where values for other keys are those found in older-list (even though older-list also contains key 'a).
Oh yes, doesn't emacs make good use of this trick for its configuration variables? You can get the same effect with Lua's metatables, however, with a little elbow grease:
Not quite as easy as "cons", but very flexible; __index can be another table to searched if the lookup on the child fails (which in turn can have its own parent and so on), but it can also be a "metamethod" that is called whenever a lookup fails and can then do arbitrary things. A neat example off the top of my head: OpenGL has the peculiarity that you do not know the address of any of its functions until runtime, requiring a program to call a lookup function for each function to get a usable pointer. Declaring FFI function prototypes for many OpenGL functions is not such a big deal, but looking up hundreds of functions that you will never use can add significantly to startup time. So someone (possibly an HN user?) wrote an OpenGL FFI library that uses the __index metamethod in a clever way; the first time a function like "GL.CreateShader" is called, the lookup fails, and the __index metamethod mangles the index name a bit and in turn calls (on Windows) the C function wglGetProcAddress to look up its address, which it then stores in the original table. Using this library, you can write code that uses GL functions willy-nilly, and their addresses will automatically be looked up at runtime the first time they are used.Sure, you can do the same thing in any language with macros or a preprocessor, but is that cool or what?