# FAQ overflow

#### QUESTION

What is the most space and time efficient way to implement a Trie in Mathematica?

Will it be practically faster than what is natively available in appropriate cases?

A combination of rules and recursion is able to produce rather powerful solutions. Here is my take on it:

ClearAll[makeTree];
makeTree[wrds : {__String}] := makeTree[Characters[wrds]];
makeTree[wrds_ /; MemberQ[wrds, {}]] :=
Prepend[makeTree[DeleteCases[wrds, {}]], {} -> {}];
makeTree[wrds_] :=
Reap[If[# =!= {}, Sow[Rest[#], First@#]] & /@
wrds, _, #1 -> makeTree[#2] &][]

ClearAll[getSubTree];
getSubTree[word_String, tree_] := Fold[#2 /. #1 &, tree, Characters[word]]

ClearAll[inTreeQ];
inTreeQ[word_String, tree_] :=  MemberQ[getSubTree[word, tree], {} -> {}]

ClearAll[getWords];
getWords[start_String, tree_] :=
Module[{wordStack = {}, charStack = {}, words},
words[{} -> {}] :=
wordStack = {wordStack, StringJoin[charStack]};
words[sl_ -> ll_List] :=
Module[{},
charStack = {charStack, sl};
words /@ ll;
charStack = First@charStack;
];
words[First@Fold[{#2 -> #1} &, getSubTree[start, tree],
Reverse@Characters[start]]
];
ClearAll[words];
Flatten@wordStack];



The last function serves to collect the words from a tree, by performing a depth-first tree traversal and maintaining the stack of accumulated characters and words.

Here is a short example:

In:= words = DictionaryLookup["absc*"]
Out= {abscess,abscessed,abscesses,abscessing,abscissa,abscissae,abscissas,
abscission,abscond,absconded,absconder,absconders,absconding,absconds}

In:= tree = makeTree[words]
Out= {a->{b->{s->{c->{e->{s->{s->{{}->{},e->{d->{{}->{}},s->{{}->{}}},
i->{n->{g->{{}->{}}}}}}},i->{s->{s->{a->{{}->{},e->{{}->{}},s->{{}->{}}},
i->{o->{n->{{}->{}}}}}}},o->{n->{d->{{}->{},e->{d->{{}->{}},r->{{}->{},s->{{}->{}}}},
i->{n->{g->{{}->{}}}},s->{{}->{}}}}}}}}}}

In:= inTreeQ[#,tree]&/@words
Out= {True,True,True,True,True,True,True,True,True,True,True,True,True,True}

In:= inTreeQ["absd",tree]
Out= False

In:= getWords["absce", tree]
Out= {"abscess", "abscessed", "abscesses", "abscessing"}



I only constructed here a bare-bones tree, so you can only test whether or not the word is there, but not keep any other info. Here is a larger example:

In:= allWords =  DictionaryLookup["*"];

In:= (allTree = makeTree[allWords]);//Timing
Out= {5.375,Null}

In:= And@@Map[inTreeQ[#,allTree]&,allWords]//Timing
Out= {1.735,True}

In:= getWords["pro",allTree]//Short//Timing
Out= {0.015,{pro,proactive,proactively,probabilist,
<<741>>,proximate,proximately,proximity,proxy}}

In:= DictionaryLookup["pro*"]//Short//Timing
Out= {0.032,{pro,proactive,proactively,probabilist,<<741>>,
proximate,proximately,proximity,proxy}}



I don't know which approach has been used for the built-in functionality, but the above implementation seems to be generally in the same calss for performance. The slowest part is due to the top-level tree-traversing code in getWords. It is slow because the top-level code is slow. One could speed it up considerably by hashing words to integers - then it can be Compiled. This is how I'd do that, if I were really concerned with speed.

EDIT

For a really nice application of a Trie data structure, where it allows us to achieve major speed-up (w.r.t. using DictionaryLookup, for example), see this post, where it was used it to implement an efficient Boggle solver.