Merging Python dictionaries: A functional take

python
functools
functional-programming
python-standard-library
An elegant way to merge many dictionaries in Python by leveraging some functional tools.
Author

Fabrizio Damicelli

Published

October 10, 2021

Merging dictionaries

Say we have these two dictionaries that we would like to merge:

d1 = {"a": 1, "b": 2}
d2 = {"a": 2, "c": 3, "d": 4}

A fairly canonical way to do it would be this:

d3 = {}
for d in [d1, d2]:
    for k, v in d.items():
        d3[k] = v

d3
{'a': 2, 'b': 2, 'c': 3, 'd': 4}
Warning

Notice that we are updating the items, so keys that appear later will overwrite the values stored under existing keys.

That works. But it’s arguably not so nice. Here’s an alternative:

d3 = {k: v for d in [d1, d2] for k, v in d.items()}

d3
{'a': 2, 'b': 2, 'c': 3, 'd': 4}

That’s compact and kind of nice because of the dictionary comprehension. But, as Michael Kennedy puts it in this video, it’s a bit of a “too clever” alternative that might not be so easy to read.

What many people consider to be a “more pythonic” way is the following:

d3 = {**d1, **d2}
d3
{'a': 2, 'b': 2, 'c': 3, 'd': 4}
Warning

This will only work from Python 3.5 on.

Beautiful.

That’s where most tutorials on merging dictionaries in Python end. Let’s go beyond that. What if we have more dictionaries to merge, say, three:

d1 = {"a": 1, "b": 2}
d2 = {"a": 2, "c": 3, "d": 4}
d3 = {"c": 3, "f": 6, "g": 9}


d4 = {**d1, **d2, **d3}
d4
{'a': 2, 'b': 2, 'c': 3, 'd': 4, 'f': 6, 'g': 9}

Question: And how about having 10 thousand dictionaries? Or not even knowing how many you have?

Answer: Let’s get functional! :)

We can easily extend the logic of what we’ve been doing so far with one functional concept: reduce (aka “fold” in other languages).

Detour: What is reduce

If you’re already familiar with this concept, jump to the next subsection.

You can check the details for yourself if you’re not yet familiar with the concept, e.g. in this nice Real Python tutorial. In essence, reduce is a higher-order function that successively applies a combining operation to the elements of an iterable, from left to right, carrying the intermediate result along. That’s a mouthful, so let’s look at a couple of quick examples:

from functools import reduce
from operator import add, pow
add??
Signature: add(a, b, /)
Docstring: Same as a + b.
Type:      builtin_function_or_method
pow??
Signature: pow(a, b, /)
Docstring: Same as a ** b.
Type:      builtin_function_or_method

Both add and pow take two arguments and return one, so they “reduce” (or “fold”) the two inputs into one.

reduce(add, (1,2,3,4))
10
1+2+3+4
10
reduce(pow, (2, 3, 4, 5))
1152921504606846976
((2 ** 3) ** 4) ** 5   # notice the successive (recursive) nature of the operation
1152921504606846976
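
To make the mechanics concrete, here is a minimal sketch of what reduce does under the hood. The name my_reduce is hypothetical and only for illustration; it is a simplified stand-in that ignores the optional initial value accepted by functools.reduce.

```python
from functools import reduce
from operator import add, pow


def my_reduce(func, iterable):
    """Simplified stand-in for functools.reduce (no initial value)."""
    it = iter(iterable)
    try:
        acc = next(it)  # the first element seeds the accumulator
    except StopIteration:
        raise TypeError("my_reduce() of empty iterable") from None
    for item in it:
        acc = func(acc, item)  # fold the next element into the running result
    return acc


print(my_reduce(add, (1, 2, 3, 4)))  # 10
print(my_reduce(pow, (2, 3, 4, 5)) == reduce(pow, (2, 3, 4, 5)))  # True
```

The accumulator always sits on the left of the operation, which is why the pow example groups as ((2 ** 3) ** 4) ** 5.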

Back to dictionaries

Let’s apply that to dictionary merging:

def merge(d1, d2):
    """Return a new dictionary which results from merging d1 and d2"""
    return {**d1, **d2}

So far nothing new. But notice that we now have an operation that takes two arguments and returns one, in other words a “reducing” or “folding” operation, so we can now use that!

d1 = {"a": 1, "b": 2}
d2 = {"a": 2, "c": 3, "d": 4}
d3 = {"c": 4, "f": 6, "g": 9}
reduce(merge, (d1, d2, d3))
{'a': 2, 'b': 2, 'c': 4, 'd': 4, 'f': 6, 'g': 9}
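
If you want to see the intermediate results that reduce throws away, itertools.accumulate applies the same two-argument function but yields every step. This is just an illustrative sketch, reusing the merge function and the three dictionaries from above:

```python
from itertools import accumulate


def merge(d1, d2):
    """Return a new dictionary which results from merging d1 and d2"""
    return {**d1, **d2}


d1 = {"a": 1, "b": 2}
d2 = {"a": 2, "c": 3, "d": 4}
d3 = {"c": 4, "f": 6, "g": 9}

# Each element is the running merge of all dictionaries seen so far.
steps = list(accumulate((d1, d2, d3), merge))
for step in steps:
    print(step)
# {'a': 1, 'b': 2}
# {'a': 2, 'b': 2, 'c': 3, 'd': 4}
# {'a': 2, 'b': 2, 'c': 4, 'd': 4, 'f': 6, 'g': 9}
```

The last step is exactly what reduce returns.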

Some nice non-trivial properties even come for free, e.g. it does the right thing when passed only one dictionary:

reduce(merge, (d1,))   # notice that d1 has to be in an iterable
{'a': 1, 'b': 2}

I think that is nice. But it can get nicer: by putting all that together and exploiting arbitrary positional arguments (aka *args), we can make it more general:

def merge_dicts(*dicts):
    """Merge one or more dictionaries; later ones win on key clashes."""
    return reduce(lambda d1, d2: {**d1, **d2}, dicts)

Now we can use the very same function to merge as many dictionaries as we’d like, just passing them as positional arguments:

merge_dicts(d1)
{'a': 1, 'b': 2}
merge_dicts(d1, d2)
{'a': 2, 'b': 2, 'c': 3, 'd': 4}
merge_dicts(d1, d2, d3, d1)
{'a': 1, 'b': 2, 'c': 4, 'd': 4, 'f': 6, 'g': 9}

How cool is that? :)
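
One edge case is worth a look: called with no dictionaries at all, reduce has nothing to start from and raises a TypeError. A possible way around it, sketched below, is to pass an empty dictionary as the optional third argument (the initial value) of reduce. The name merge_dicts_safe is hypothetical and not part of the original tutorial.

```python
from functools import reduce


def merge_dicts_safe(*dicts):
    """Hypothetical variant of merge_dicts that also accepts zero arguments."""
    # The third argument to reduce is the initial accumulator value.
    return reduce(lambda d1, d2: {**d1, **d2}, dicts, {})


d1 = {"a": 1, "b": 2}
d2 = {"a": 2, "c": 3, "d": 4}

print(merge_dicts_safe())          # {}
print(merge_dicts_safe(d1, d2))    # {'a': 2, 'b': 2, 'c': 3, 'd': 4}
print(merge_dicts_safe(d1) is d1)  # False
```

The last line shows a side benefit: without an initial value, reduce hands back a single input unchanged, i.e. the very same object, whereas starting from {} always produces a fresh dictionary that can be modified without touching the inputs.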

Here’s the video version of this tutorial

Edit:

If you are using Python >= 3.9, there are a couple of better alternatives:
The first, more compact:

import operator

def merge_dicts_or(*dicts):
    # operator.or_(a, b) is the function form of a | b (same as operator.__or__)
    return reduce(operator.or_, dicts)

merge_dicts_or(d1,d2,d3)
{'a': 2, 'b': 2, 'c': 4, 'd': 4, 'f': 6, 'g': 9}

The second, more readable:

def merge_dicts_or(*dicts):
    ret = {}
    for d in dicts:
        ret |= d  # in-place union: keys from d overwrite those already in ret
    return ret

merge_dicts_or(d1,d2,d3)
{'a': 2, 'b': 2, 'c': 4, 'd': 4, 'f': 6, 'g': 9}

Both were pointed out by Anthony Sottile in this tweet. Thanks!
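
The two Python >= 3.9 alternatives above rely on the dictionary union operators introduced in that version: | returns a new dictionary, while |= updates the left-hand one in place. Here is a small sketch of both on two dictionaries, assuming Python 3.9 or newer:

```python
import operator

d1 = {"a": 1, "b": 2}
d2 = {"a": 2, "c": 3, "d": 4}

merged = d1 | d2  # new dictionary; d1 and d2 are left untouched
print(merged)     # {'a': 2, 'b': 2, 'c': 3, 'd': 4}
print(operator.or_(d1, d2) == merged)  # True

acc = {}
acc |= d1  # in-place update of acc
acc |= d2
print(acc == merged)  # True
print(d1)             # {'a': 1, 'b': 2}
```

As with the other approaches, the right-hand operand wins when both dictionaries share a key.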

References:
- Python functools documentation
- Michael Kennedy’s tutorial
- Real Python’s article on reduce

/Fin

Any bugs, questions, comments, suggestions? Ping me on twitter or drop me an e-mail (fabridamicelli at gmail).