How to convert MP3 ID3 tag encodings on Linux

Question: When I load MP3 files on Rhythmbox music player, song titles and artist names appear as gibberish and unreadable characters. I suspect this is an ID3 tag's character encoding problem. How can I fix this problem?

If your music player shows garbled symbols for some MP3 tracks, the most likely cause is that their ID3 tags are stored in a legacy character encoding that the player does not recognize. This problem is common when song titles or artist names are written in Cyrillic, Greek, Chinese, Japanese or Korean characters.
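
To see how this kind of garbling arises, the following sketch reproduces it with iconv alone, with no MP3 file involved. It assumes that the tag text was stored as CP1251 bytes and that the player reads those bytes as ISO-8859-1; the sample title is made up for illustration.

```shell
# Hypothetical song title, used only for illustration.
title='Привет'

# Store the title as CP1251 bytes, then misread those bytes as ISO-8859-1,
# which is what happens when a player assumes the wrong encoding.
garbled=$(printf '%s' "$title" | iconv -f UTF-8 -t CP1251 | iconv -f ISO-8859-1 -t UTF-8)
printf '%s\n' "$garbled"

# Undo the misreading, then decode the bytes with the correct source encoding.
repaired=$(printf '%s' "$garbled" | iconv -f UTF-8 -t ISO-8859-1 | iconv -f CP1251 -t UTF-8)
printf '%s\n' "$repaired"
```

The second step is, in essence, what a tag converter has to do: reinterpret the stored bytes using the encoding they were really written in.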

To solve this problem, you need to convert the character encoding of the ID3 tags to Unicode. The mid3iconv command-line tool is designed for exactly this conversion. Using mid3iconv, you can convert ID3 tags from a source encoding of your choice to Unicode.

Install Mid3iconv on Linux

The mid3iconv tool is part of the python-mutagen package, which is available in the standard repositories of major Linux distributions. On newer releases the package may be named python3-mutagen instead.

Debian, Ubuntu or Linux Mint:

$ sudo apt-get install python-mutagen

CentOS, Fedora or RHEL:

$ sudo yum install python-mutagen

Convert ID3 Character Encodings with Mid3iconv

The typical command-line usage of mid3iconv is as follows.

$ mid3iconv -e <source-encoding> -d input.mp3

The above command converts the character encoding of the input MP3 file's ID3 tags from <source-encoding> to Unicode. The "-d" option prints the updated tags. If you want to "dry-run" the conversion without actually modifying an MP3 file, you can add the "-p" option.

For example, to convert the ID3 character encoding from CP1251 to Unicode:

$ mid3iconv -e CP1251 -d input.mp3

Without the "-p" option, the converted ID3 tags are written back to the input MP3 file, replacing the original tags.
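
Because the conversion modifies the file in place, you may want to keep a copy of the original first. The following is a minimal sketch of that idea; the function name convert_with_backup and the ".bak" suffix are hypothetical choices, not part of mid3iconv.

```shell
# convert_with_backup ENCODING FILE
# Copies FILE to FILE.bak (a hypothetical naming choice), then converts FILE in place.
convert_with_backup() {
    enc=$1
    file=$2
    # Stop if the backup cannot be created, so the original is never touched without one.
    cp -p -- "$file" "$file.bak" || return 1
    mid3iconv -e "$enc" -d "$file"
}
```

With the function defined in your shell, "convert_with_backup CP1251 input.mp3" leaves the untouched original in input.mp3.bak, which you can copy back if the chosen encoding turns out to be wrong.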

Fix Broken Characters in MP3 Song Titles and Artist Names

Now let's try to fix broken characters in MP3 ID3 tags with mid3iconv.

The first thing you need to do is to identify the character encoding used by your MP3 files. You may be able to guess the encoding if you know where the files come from. For example, if the MP3 files are Japanese songs, the encoding may be Shift-JIS or EUC-JP, while Chinese songs may be encoded with GB2312 or GBK. If you are not sure, you can use mid3iconv's "dry-run" option to find out.

That is, choose any encoding from:

$ iconv --list

and apply conversion (with "dry-run" option):

$ mid3iconv -p -d -e <source-encoding> input.mp3

If the output is readable, that means you guessed the source encoding correctly.
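
If you have several candidate encodings, you can repeat the dry run for each of them and compare the output by eye. The following is a rough sketch of such a loop; the function name try_encodings is hypothetical, and the candidate list is whatever you pass in.

```shell
# try_encodings FILE ENCODING...
# Runs a mid3iconv dry run on FILE once per candidate encoding.
try_encodings() {
    file=$1
    shift
    for enc in "$@"; do
        printf '=== %s ===\n' "$enc"
        # -p keeps this a dry run; if mid3iconv exits with an error,
        # report it and continue with the next candidate.
        mid3iconv -p -d -e "$enc" "$file" || printf 'conversion failed for %s\n' "$enc"
    done
}
```

For Cyrillic tags, for instance, you could run "try_encodings input.mp3 CP1251 KOI8-R" and pick the encoding whose output is readable.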

Once you know what ID3 encoding your MP3 files are using, use the following command to batch-convert ID3 character encoding for all your MP3 files.

$ find . -name "*.mp3" -print0 | xargs -0 mid3iconv -e <source-encoding> -d

For example, if the ID3 tags of your MP3 songs are encoded with EUC-KR (Korean characters), use the following command to fix broken characters.

$ find . -name "*.mp3" -print0 | xargs -0 mid3iconv -e EUC-KR -d
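
Before converting a whole collection, it can be worth previewing the result for every file with the dry-run option. The following is a sketch under a few assumptions: the function name preview_batch is hypothetical, and your find is assumed to support "-iname" (GNU and BSD find do).

```shell
# preview_batch ENCODING [DIRECTORY]
# Dry-runs the conversion for every MP3 file under DIRECTORY (default: current directory).
preview_batch() {
    enc=$1
    dir=${2:-.}
    # -type f skips directories; -iname also matches upper-case extensions such as .MP3.
    find "$dir" -type f -iname '*.mp3' -print0 | xargs -0 mid3iconv -p -d -e "$enc"
}
```

Running "preview_batch EUC-KR ~/Music" prints the converted tags without modifying any file, so you can check the whole collection before running the real batch conversion.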

Finally, reload the MP3 files in your music player and verify that song titles and artist names are displayed correctly.
