How to Fix CSV Encoding Issues
Repair garbled characters in a CSV by getting the encoding right on both ends.
Garbled text in a CSV, such as accented letters showing up as strange two-character sequences or an odd symbol before the first header, almost always comes from an encoding mismatch: the file was written in one encoding and read in another. The bytes are intact; they're just being decoded with the wrong map.
The fix is to identify the file's actual encoding and re-read or re-save it as UTF-8, the modern standard. Getting both the writer and the reader on UTF-8 makes the problem disappear and keeps it from coming back.
- 1
Identify the symptom
Accented characters shown as two-character garbage (for example, é appearing as Ã©) usually mean a UTF-8 file is being read as Latin-1 or Windows-1252. A stray symbol before the first header (often ï»¿) is a UTF-8 byte-order mark (BOM) being shown literally.
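Both symptoms are easy to reproduce, which helps confirm the diagnosis. The following is a minimal Python sketch; the sample strings and the helper name has_utf8_bom are illustrative assumptions, not part of any particular tool.

```python
import codecs

# Symptom 1: UTF-8 bytes decoded with a single-byte encoding.
utf8_bytes = "café".encode("utf-8")      # é becomes the two bytes C3 A9
misread = utf8_bytes.decode("latin-1")   # each byte is shown as its own character

# Symptom 2: a UTF-8 BOM (bytes EF BB BF) shown literally.
with_bom = codecs.BOM_UTF8 + b"name,city\n"
bom_as_cp1252 = with_bom.decode("cp1252")  # BOM appears as three visible characters
bom_as_utf8 = with_bom.decode("utf-8")     # BOM survives as the invisible U+FEFF


def has_utf8_bom(raw: bytes) -> bool:
    """Return True if the raw file content starts with a UTF-8 BOM."""
    return raw.startswith(codecs.BOM_UTF8)


print(repr(misread))
print(repr(bom_as_cp1252[:3]))
print(has_utf8_bom(with_bom))
```

Checking the first three bytes of the file is a reliable way to detect a BOM, whereas the accented-character symptom can only be confirmed by comparing against what the text should say.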
- 2
Re-read with the correct encoding
Open the file and tell the tool its real encoding (often UTF-8 or Windows-1252). In Excel, use Data > From Text/CSV and pick the encoding from the File Origin list in the preview dialog.
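When the real encoding is unknown, one common approach is to try strict UTF-8 first and fall back to Windows-1252. The sketch below assumes those two candidates cover the file; the function name decode_csv_bytes and the candidate order are assumptions for illustration, and this is a heuristic rather than a guaranteed detector.

```python
from typing import Sequence, Tuple


def decode_csv_bytes(
    raw: bytes,
    candidates: Sequence[str] = ("utf-8-sig", "cp1252"),
) -> Tuple[str, str]:
    """Decode raw CSV bytes, returning (text, encoding_used).

    "utf-8-sig" decodes UTF-8 and strips a leading BOM if present.
    Non-UTF-8 text with accented characters rarely passes strict UTF-8
    decoding, so a success there is a strong signal.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Latin-1 maps every byte to a character, so it never fails.
    # Treat this result as a last resort that may still be wrong.
    return raw.decode("latin-1"), "latin-1"


text, used = decode_csv_bytes("name\ncafé\n".encode("cp1252"))
print(used, repr(text))
```

Trying UTF-8 before the single-byte encodings matters: Windows-1252 accepts almost any byte sequence, so putting it first would silently reproduce the garbled output.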
- 3
Re-save as UTF-8 without a BOM
Save the corrected file as UTF-8. Drop the BOM unless a specific Windows tool needs it, since a stray BOM is itself a common cause of first-column problems.
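Outside a spreadsheet application, the re-save can be scripted. This is a minimal sketch that assumes the source encoding has already been identified; the function name resave_as_utf8 and the example file names are hypothetical.

```python
from pathlib import Path


def resave_as_utf8(src: Path, dst: Path, source_encoding: str) -> None:
    """Copy src to dst, transcoding from source_encoding to UTF-8 without a BOM.

    Pass "utf-8-sig" as source_encoding to strip an existing BOM.
    newline="" leaves line endings exactly as they were in the source.
    """
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)


# Example call with hypothetical file names:
# resave_as_utf8(Path("export.csv"), Path("export-utf8.csv"), "cp1252")
```

Python's "utf-8" codec never writes a BOM, so the output is BOM-free regardless of the input. Writing to a separate destination file keeps the original available if the chosen source encoding turns out to be wrong.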
- 4
Fix the export at the source
If the file keeps coming in wrong, set the exporting system to output UTF-8 so you're not repairing it every time.
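If the exporting system is a script you control, the fix is usually to state the encoding explicitly when the file is written. The sketch below uses Python's standard csv module; the function name export_rows and the sample data are assumptions for illustration.

```python
import csv
from typing import Iterable, Sequence


def export_rows(path: str, header: Sequence[str], rows: Iterable[Sequence[str]]) -> None:
    """Write a CSV as UTF-8 without a BOM.

    Without an explicit encoding, open() may fall back to the platform's
    locale encoding, which on Windows is often Windows-1252.
    """
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


# Example call with a hypothetical output path:
# export_rows("people.csv", ["name", "city"], [["Zoë", "Montréal"]])
```

Passing newline="" is what the csv module documentation recommends, so that the writer controls line endings itself.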
Frequently asked questions
What does a stray symbol at the start of my file mean?
It's usually a UTF-8 byte-order mark (BOM) being displayed as text because the reader didn't strip it. Save as UTF-8 without a BOM, or use a parser that handles it.
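To see what "a parser that handles it" means in practice, here is a small Python sketch comparing the first header name with and without BOM handling. The sample data and the helper name first_header are illustrative assumptions.

```python
import codecs
import csv
import io

# A small UTF-8 CSV that starts with a BOM.
raw = codecs.BOM_UTF8 + "name,city\nZoë,Montréal\n".encode("utf-8")


def first_header(raw: bytes, encoding: str) -> str:
    """Decode raw CSV bytes and return the first column name."""
    text = io.StringIO(raw.decode(encoding), newline="")
    return next(csv.reader(text))[0]


print(repr(first_header(raw, "utf-8")))      # BOM stays attached to the header
print(repr(first_header(raw, "utf-8-sig")))  # BOM is stripped before parsing
```

With plain "utf-8" the first column name carries an invisible U+FEFF prefix, which is why lookups by header name can fail even though the header looks correct on screen.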
Why do accented characters show as garbage?
Most likely the file is UTF-8 but is being read as Latin-1 or Windows-1252. Re-read it as UTF-8 and the characters render correctly.
Which encoding should I standardize on?
UTF-8, without a BOM. It can encode every Unicode character and is the default expected by modern data tools.