Dictionary Examples

Word Count

Counting how often each word appears in a text is a classic dictionary application. The function below lowercases the text, splits it on whitespace, strips surrounding punctuation from each word, and increments that word's count. Run it on a short phrase to see the counts accumulate for repeated words, then edit the text:

>>> def word_count(text):
...     counts = {}
...     for word in text.lower().split():
...         word = word.strip(".,!?;:\"'")   # strip leading/trailing punctuation
...         if word:
...             counts[word] = counts.get(word, 0) + 1
...     return counts
...
>>> text = "four score and seven years ago our fathers four score"
>>> counts = word_count(text)
>>> for word in sorted(counts):
...     print(f"{word:10s}  {counts[word]}")
...
ago         1
and         1
fathers     1
four        2
our         1
score       2
seven       1
years       1

counts.get(word, 0) returns the current count, or 0 if the word has not been seen yet. This is more concise than first testing whether word is already a key in counts.
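For comparison, here is a sketch of the same count written with an explicit membership test instead of get. The function name word_count_explicit and the sample sentence are made up for illustration; the punctuation set is the one used above.

```python
def word_count_explicit(text):
    """Count words using an explicit membership test instead of dict.get."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,!?;:\"'")  # same punctuation stripping as word_count
        if not word:
            continue
        if word in counts:
            counts[word] += 1   # word already seen: increment
        else:
            counts[word] = 1    # first occurrence: start at 1
    return counts


# Illustrative input (not from the text above)
result = word_count_explicit("Four score, and four more!")
print(result)  # {'four': 2, 'score': 1, 'and': 1, 'more': 1}
```

The if/else branch does the same job as the single counts.get(word, 0) + 1 expression, which is why the get form is usually preferred.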

Inverting a Dictionary

To build a reverse lookup (value → key) from a one-to-one dictionary, swap each key and value with a dict comprehension:

>>> e2sp = {"one": "uno", "two": "dos", "three": "tres"}
>>> sp2e = {v: k for k, v in e2sp.items()}
>>> sp2e
{'uno': 'one', 'dos': 'two', 'tres': 'three'}

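This dict comprehension is the dictionary analogue of a list comprehension. It relies on the values being unique and hashable: if two keys shared a value, the later key would silently overwrite the earlier one in the inverted dictionary.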
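When the dictionary is not one-to-one, one possible approach is to collect all keys that share a value into a list. The sketch below uses a hypothetical kinds dictionary (not from the example above) to contrast the plain comprehension with the list-collecting version.

```python
# Hypothetical data: two keys share the value "fruit", so the mapping is not one-to-one.
kinds = {"apple": "fruit", "pear": "fruit", "kale": "vegetable"}

# A plain comprehension keeps only the last key seen for each value.
lossy = {v: k for k, v in kinds.items()}

# Collecting keys into lists preserves every key.
grouped = {}
for key, value in kinds.items():
    if value not in grouped:
        grouped[value] = []
    grouped[value].append(key)

print(lossy)    # {'fruit': 'pear', 'vegetable': 'kale'}
print(grouped)  # {'fruit': ['apple', 'pear'], 'vegetable': ['kale']}
```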

Using collections.Counter

Python’s collections.Counter is a specialised dict subclass that counts hashable objects. Passing it an iterable replaces the hand-written accumulation loop:

>>> from collections import Counter
>>> text = "four score and seven years ago our fathers four score"
>>> counts = Counter(text.lower().split())
>>> counts.most_common(3)
[('four', 2), ('score', 2), ('and', 1)]

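most_common(n) returns the n most frequent elements as a list of (element, count) tuples, most frequent first. Elements with equal counts appear in the order they were first encountered, which is why 'and' takes the third slot here.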
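The Counter example above counts raw whitespace-separated tokens, so it skips the punctuation stripping that word_count performs. A minimal sketch combining the two, assuming the same punctuation set and an illustrative sentence:

```python
from collections import Counter

# Illustrative text with punctuation (not from the example above)
text = "Four score, and seven years ago; four score!"

# Strip the same punctuation characters as word_count before counting.
words = (w.strip(".,!?;:\"'") for w in text.lower().split())
counts = Counter(w for w in words if w)  # drop tokens that were only punctuation

print(counts.most_common(2))  # [('four', 2), ('score', 2)]
print(counts["fathers"])      # 0: a missing key reads as zero and is not inserted
```

Unlike a plain dict, a Counter returns 0 for a key it has never seen instead of raising KeyError, so no get call is needed when reading counts.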

Grouping Data

A dictionary whose values are lists is a natural way to group items. The loop below groups words by their first letter; run it, then edit the word list:

>>> words = ["apple", "ant", "bear", "bee", "cat"]
>>> by_letter = {}
>>> for word in words:
...     letter = word[0]
...     if letter not in by_letter:
...         by_letter[letter] = []
...     by_letter[letter].append(word)
...
>>> for letter in sorted(by_letter):
...     print(f"{letter}: {by_letter[letter]}")
...
a: ['apple', 'ant']
b: ['bear', 'bee']
c: ['cat']

dict.setdefault(key, default) returns the value stored under key, first inserting default if key is missing, so a single call can replace the inner if:

by_letter.setdefault(letter, []).append(word)
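As a sketch, here is the full grouping loop rewritten with setdefault, followed by an alternative that uses collections.defaultdict from the standard library. The variable name by_letter_dd is chosen here for illustration.

```python
from collections import defaultdict

words = ["apple", "ant", "bear", "bee", "cat"]

# Version 1: setdefault inserts an empty list the first time a letter is seen.
by_letter = {}
for word in words:
    by_letter.setdefault(word[0], []).append(word)

# Version 2: defaultdict(list) creates the empty list automatically on first access.
by_letter_dd = defaultdict(list)
for word in words:
    by_letter_dd[word[0]].append(word)

for letter in sorted(by_letter):
    print(f"{letter}: {by_letter[letter]}")
# a: ['apple', 'ant']
# b: ['bear', 'bee']
# c: ['cat']
```

Both versions build the same groups. One difference to keep in mind is that merely reading a missing key from a defaultdict inserts it, whereas a plain dict raises KeyError.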