How to remove accents (normalize) in a Python unicode string?

Sometimes, we want to remove accents (normalize) in a Python unicode string.

In this article, we’ll look at how to remove accents (normalize) in a Python unicode string.

How to remove accents (normalize) in a Python unicode string?

To remove accents (normalize) in a Python unicode string, we can decompose the string with the unicodedata.normalize function and then drop the combining accent marks.

For instance, we write:

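import unicodedata


def strip_accents(s):
    # NFD splits each accented character into a base letter plus combining marks
    return ''.join(c for c in unicodedata.normalize('NFD', s)
                   if unicodedata.category(c) != 'Mn')  # 'Mn' = nonspacing mark


no_accent = strip_accents(u"A \u00c0 \u0394 \u038E")
print(no_accent)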

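We call unicodedata.normalize with 'NFD' on the s string, which decomposes each accented character into its base letter followed by one or more combining marks. Then we join the characters that pass the filter back into a single string with join.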

We filter out the nonspacing marks (Unicode category 'Mn', which includes the combining accents) from the normalized string with if unicodedata.category(c) != 'Mn'.
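
To see why this filter works, it can help to look at what NFD produces for a single accented character. The following is a minimal sketch for illustration only; the helper name describe is hypothetical and not part of unicodedata.

```python
import unicodedata


def describe(s):
    """Return (code point, category) pairs for the NFD form of s.

    Hypothetical helper, used here only to inspect the decomposition.
    """
    decomposed = unicodedata.normalize('NFD', s)
    return [(f'U+{ord(c):04X}', unicodedata.category(c)) for c in decomposed]


# U+00C0 (À) decomposes into a base letter plus a combining grave accent.
print(describe('\u00c0'))  # [('U+0041', 'Lu'), ('U+0300', 'Mn')]
```

The base letter A has category 'Lu' (uppercase letter), so it is kept, while the combining grave accent U+0300 has category 'Mn', so the filter drops it.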

Therefore, no_accent is 'A A Δ Υ'.
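
As a quick sanity check, the same function can be tried on a few more inputs. This is only a sketch, and the sample strings are our own. One thing to keep in mind is that letters such as ø have no canonical decomposition in Unicode, so this approach leaves them unchanged.

```python
import unicodedata


def strip_accents(s):
    # Same approach as above: decompose, then drop nonspacing marks.
    return ''.join(c for c in unicodedata.normalize('NFD', s)
                   if unicodedata.category(c) != 'Mn')


print(strip_accents('café naïve'))  # cafe naive
print(strip_accents('søster'))      # søster (ø is not base letter + accent)
```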

Conclusion

To remove accents (normalize) in a Python unicode string, we can decompose it with unicodedata.normalize('NFD', s) and drop every character whose unicodedata.category is 'Mn'.