---
title: "Cleaning Data with OpenRefine"
original_url: "https://tds.s-anand.net/#/cleaning-data-with-openrefine?id=cleaning-data-with-openrefine"
downloaded_at: "2025-06-08T23:26:48.911609"
---

[Cleaning Data with OpenRefine](#/cleaning-data-with-openrefine?id=cleaning-data-with-openrefine)
-------------------------------------------------------------------------------------------------

[![Cleaning data with OpenRefine](https://i.ytimg.com/vi_webp/zxEtfHseE84/sddefault.webp)](https://youtu.be/zxEtfHseE84)

This session covers data cleaning with OpenRefine, focusing on resolving entity discrepancies:

* **Data Upload and Project Creation**: Import data into OpenRefine and create a new project for analysis.
* **Faceting Data**: Use text facets to group entries by value and see how often each address crumb occurs.
* **Clustering Methodology**: Apply clustering algorithms to find and merge entries that differ only in minor ways, such as punctuation.
* **Manual and Automated Clustering**: Merge clusters one at a time after reviewing them, or all in one go if you trust the system’s clustering accuracy.
* **Entity Resolution**: Clean and save the data by resolving multiple versions of the same entity using OpenRefine.

Here are the links used in the video:

* [OpenRefine software](https://openrefine.org)
* [Dataset for OpenRefine](https://drive.google.com/file/d/1ccu0Xxk8UJUa2Dz4lihmvzhLjvPy42Ai/view)
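
The faceting and clustering steps covered in this session can be approximated in plain Python to see what they do. The sketch below is not OpenRefine's implementation: it assumes a fingerprint-style key (lowercase, strip punctuation and accents, sort and de-duplicate words), which is similar in spirit to key-collision clustering. The function names and the sample addresses are hypothetical and are not taken from the video's dataset.

```python
import string
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key that ignores case, punctuation, accents, and word order."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accent marks
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(text.split()))  # unique words in a stable order
    return " ".join(tokens)


def text_facet(values):
    """Count how often each distinct value occurs, like a text facet."""
    return Counter(values)


def cluster(values):
    """Group distinct values whose fingerprints collide; keep groups with 2+ variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


def merge_clusters(values):
    """Replace each value with the most frequent variant in its cluster.

    Ties are resolved in favour of the variant seen first.
    """
    counts = text_facet(values)
    best = {}
    for value in counts:
        key = fingerprint(value)
        if key not in best or counts[value] > counts[best[key]]:
            best[key] = value
    return [best[fingerprint(value)] for value in values]


# Hypothetical sample data for illustration only.
addresses = [
    "MG Road, Bangalore",
    "MG Road Bangalore",
    "M.G. Road, Bangalore",
    "Anna Salai, Chennai",
    "anna salai chennai",
    "MG Road, Bangalore",
]

print(text_facet(addresses).most_common())  # frequency of each raw value
print(cluster(addresses))                   # candidate groups to review
print(merge_clusters(addresses))            # values after merging all clusters
```

Reviewing the output of `cluster` before calling `merge_clusters` mirrors the choice between merging clusters manually and merging them all in one go. A key that ignores word order can also merge entries that are genuinely different, so a manual review is the safer default on unfamiliar data.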