Glossary

Data cleansing

The process of fixing or removing incorrect, inconsistent, or malformed records so a dataset is accurate and ready to use.

Data cleansing (also called data cleaning) covers everything you do to make a dataset trustworthy: trimming stray whitespace, standardizing capitalization and date formats, fixing encoding problems, removing blank or duplicate rows, and reconciling values that mean the same thing but are written differently (for example, NY versus New York).
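As a rough illustration of these steps, here is a minimal Python sketch using only the standard library. The three-column row layout, the `STATE_ALIASES` table, the list of accepted date formats, and the function names are all assumptions made for this example, not part of any particular tool.

```python
from datetime import datetime

# Hypothetical alias table: maps variant spellings to one canonical value.
STATE_ALIASES = {"ny": "New York", "new york": "New York"}

# Date layouts this sketch assumes may appear in the input.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_date(value):
    """Return the date as YYYY-MM-DD, or the input unchanged if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return value


def clean_rows(rows):
    """Trim, standardize, and de-duplicate (name, state, signup_date) rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Trim stray whitespace from every cell.
        name, state, signup = (cell.strip() for cell in row)
        if not (name or state or signup):
            continue  # drop blank rows
        # Collapse inner whitespace and standardize capitalization.
        name = " ".join(name.split()).title()
        # Reconcile values that mean the same thing (NY versus New York).
        state = STATE_ALIASES.get(state.lower(), state)
        signup = normalize_date(signup)
        record = (name, state, signup)
        if record in seen:
            continue  # drop rows that are duplicates once cleaned
        seen.add(record)
        cleaned.append(record)
    return cleaned


raw = [
    ("  ada LOVELACE ", "NY", "03/15/2024"),
    ("Ada Lovelace", "New York", "2024-03-15"),
    ("", "  ", ""),
    ("grace  hopper", "ny", "15.03.2024"),
]
print(clean_rows(raw))
# [('Ada Lovelace', 'New York', '2024-03-15'), ('Grace Hopper', 'New York', '2024-03-15')]
```

Note that the first two rows only become recognizable as duplicates after cleaning. `str.title()` is a blunt tool for names (it turns "mcdonald" into "Mcdonald"), so real data usually needs a more careful capitalization rule.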

It matters most right before data is loaded, merged, or compared, because dirty data produces false differences. Two exports of the same customer list can look completely different when one has trailing spaces or uses a different date format. Normalizing those surface differences first (trimming whitespace, ignoring case, unifying number formats) is what lets a comparison surface the changes that actually matter.
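The effect of false differences can be sketched in a few lines of Python. This is an illustrative example only: the `normalize` and `changed_cells` helpers and the sample rows are hypothetical, and the number handling assumes a comma is a thousands separator and a period is the decimal point.

```python
def normalize(cell):
    """Strip surrounding whitespace, fold case, and unify simple number formats."""
    text = cell.strip().casefold()
    try:
        # Assumption: "," is a thousands separator and "." the decimal point.
        return str(float(text.replace(",", "")))
    except ValueError:
        return text


def changed_cells(old_rows, new_rows, key=lambda cell: cell):
    """Return (row, column) indexes where two same-shaped tables differ."""
    return [
        (r, c)
        for r, (old, new) in enumerate(zip(old_rows, new_rows))
        for c, (a, b) in enumerate(zip(old, new))
        if key(a) != key(b)
    ]


old = [["Ada Lovelace", "1,200.50", "Active"], ["Grace Hopper", "300", "active"]]
new = [["ada lovelace ", "1200.5", "active"], ["Grace Hopper", "350", "Active"]]

print(changed_cells(old, new))
# [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2)]
print(changed_cells(old, new, key=normalize))
# [(1, 1)]
```

Compared raw, five of the six cells look changed. After normalization only the one real change remains: the amount in the second row went from 300 to 350.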
