by taliesinb 4365 days ago

Associations aren't just a data structure; they've been designed to fit sensibly into the rest of the language, via a principle I call the "central dogma". This means they work in a predictable way with a huge number of existing functions (though we still have more to do).

For example, Associations interact naturally with the hierarchical part-specification language used by Part (http://reference.wolfram.com/language/ref/Part.html):

   In[1]:= people = { 
      <|"name" -> "bob", "age" -> 20, "sex" -> "M"|>, 
      <|"name" -> "sue", "age" -> 25, "sex" -> "F"|>, 
      <|"name" -> "ann", "age" -> 18, "sex" -> "F"|>
   }; 
   
   In[2]:= people[[ All, "age" ]] (* extract list of ages *)
   Out[2]= {20, 25, 18}

   In[3]:= people[[ All, "sex" ]] (* extract list of sexes *)
   Out[3]= {"M", "F", "F"}

   In[4]:= people[[ 2, "age" ]] (* extract age of 2nd person *)
   Out[4]= 25

   In[5]:= people[[ 2, {"age","sex"} ]] (* extract age and sex *)
   Out[5]= <|"age" -> 25, "sex" -> "F"|>

This naturally generalizes to 'indexed tables', in which the outermost list becomes an association, because associations serve double-duty as "structs" and "hash-maps", just like lists are used for both "vectors" and "tuples":

   In[6]:= people = <|
      236234 -> <|"name" -> "bob", "age" -> 20, "sex" -> "M"|>, 
      253456 -> <|"name" -> "sue", "age" -> 25, "sex" -> "F"|>, 
      323442 -> <|"name" -> "ann", "age" -> 18, "sex" -> "F"|>
   |>; 
   
   In[7]:= people[[ All, "age" ]] (* extract association between ID and age *)
   Out[7]= <| 236234 -> 20, 253456 -> 25, 323442 -> 18|>

   In[8]:= people[[ All, "sex" ]] (* extract association between ID and sex *)
   Out[8]= <| 236234 -> "M", 253456 -> "F", 323442 -> "F"|>

   In[9]:= people[[ Key[323442], "age" ]] (* extract age of person with ID 323442 *)
   Out[9]= 18

   (* extract age and sex of person with ID 323442 *)
   In[10]:= people[[ Key[323442], {"age","sex"} ]] 
   Out[10]= <|"age" -> 18, "sex" -> "F"|>

The uniform addressing scheme behind Part (and Extract, Position, etc.) is tremendously useful in day-to-day code, because it makes it much easier to write programs as functions that transform potentially complex, hierarchical data in a series of steps.
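
As a rough sketch of that uniformity (assuming people is still the ID-indexed association from In[6], and with pos being just an illustrative variable name), Position reports where a value lives using the same Key-based paths that Extract accepts:

   (* find where the value 18 occurs; the result is a list of Key paths *)
   pos = Position[people, 18]
   (* {{Key[323442], Key["age"]}} *)

   (* feed the first path back into Extract to recover the value *)
   Extract[people, First[pos]]
   (* 18 *)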

This is similar in some ways to the ideas behind Haskell's lens library, Clojure's assoc-in and friends, and even the schemes used in jQuery and XPath. But it's core to WL.

The semantics of Part are also extended to become a full-fledged query language, as used by Dataset (http://reference.wolfram.com/language/ref/Dataset.html):

   (* load a dataset of passengers of the Titanic *)
   titanic = ExampleData[{"Dataset", "Titanic"}]

   (* produce a histogram of passenger ages *) 
   titanic[Histogram, "age"] 

   (* produce a histogram for each class: 1st, 2nd, etc. *)
   titanic[GroupBy[Key["class"]], Histogram[#, {0,80,4}]&, "age"]
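
The same query syntax should also work on a hand-built table, not just the built-in example data. A small sketch, assuming people is the ID-indexed association from In[6] (the name ds is only for illustration):

   (* wrap the indexed table in a Dataset *)
   ds = Dataset[people];

   (* keep people older than 18, then pull out their names, keyed by ID *)
   ds[Select[#age > 18 &], "name"]

   (* Normal drops back to an ordinary association *)
   Normal[ds[All, "age"]]
   (* <|236234 -> 20, 253456 -> 25, 323442 -> 18|> *)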

There are also some really nice functions to work with associations, like the map-reduce-like GroupBy (http://reference.wolfram.com/language/ref/GroupBy.html):

   (* split sentence into list of words *)
   In[16]:= words = StringSplit["it was the best of times it was the worst of times"] 
   Out[16]= {"it", "was", "the", "best", "of", "times", "it", "was", 
      "the", "worst", "of", "times"}

   (* group words that have the same length *) 
   In[17]:= GroupBy[words, StringLength] 
   Out[17]= <|
      2 -> {"it", "of", "it", "of"}, 
      3 -> {"was", "the", "was", "the"}, 
      4 -> {"best"}, 
      5 -> {"times", "worst", "times"}
   |>
   
   (* reduce each group into an association of counts *)
   In[18]:= GroupBy[words, StringLength, Counts] 
   Out[18]= <|
      2 -> <|"it" -> 2, "of" -> 2|>, 
      3 -> <|"was" -> 2, "the" -> 2|>, 
      4 -> <|"best" -> 1|>, 
      5 -> <|"times" -> 2, "worst" -> 1|>
   |>

And Counts and CountsBy (http://reference.wolfram.com/language/ref/CountsBy.html):

   In[21]:= CountsBy[words, StringLength]
   Out[21]= <|2 -> 4, 3 -> 4, 4 -> 1, 5 -> 3|>
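
Counts itself takes no grouping function and just tallies each distinct element, in order of first appearance. For the same words list that would be:

   Counts[words]
   (* <|"it" -> 2, "was" -> 2, "the" -> 2, "best" -> 1,
        "of" -> 2, "times" -> 2, "worst" -> 1|> *)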

And AssociationMap (http://reference.wolfram.com/language/ref/AssociationMap.htm...):

   In[23]:= AssociationMap[WordData[#, "PartsOfSpeech"]&, words]
   Out[23]= <|
      "it" -> {"Pronoun"}, "was" -> {"Verb"}, 
      "the" -> {"Determiner"}, 
      "best" -> {"Noun", "Adjective", "Verb", "Adverb"}, 
      "of" -> {"Preposition"}, "times" -> {"Noun"}, 
      "worst" -> {"Noun", "Adjective", "Verb", "Adverb"}
   |>

Here's some more info about associations: http://reference.wolfram.com/language/guide/Associations.htm...