---
title: "Cleaning Data with OpenRefine"
original_url: "https://tds.s-anand.net/#/cleaning-data-with-openrefine?id=cleaning-data-with-openrefine"
downloaded_at: "2025-06-08T23:26:48.911609"
---
[Cleaning Data with OpenRefine](#/cleaning-data-with-openrefine?id=cleaning-data-with-openrefine)
-------------------------------------------------------------------------------------------------
[![Cleaning data with OpenRefine](https://i.ytimg.com/vi_webp/zxEtfHseE84/sddefault.webp)](https://youtu.be/zxEtfHseE84)
This session covers the use of OpenRefine for data cleaning, focusing on resolving entity discrepancies:
* **Data Upload and Project Creation**: Import data into OpenRefine and create a new project for analysis.
* **Faceting Data**: Use text facets to group identical entries and see how often each address crumb occurs.
* **Clustering Methodology**: Apply clustering algorithms to merge similar entries with minor differences, such as punctuation.
* **Manual and Automated Clustering**: Learn to merge clusters manually or in one go, trusting the system’s clustering accuracy.
* **Entity Resolution**: Clean and save the data by resolving multiple versions of the same entity using OpenRefine.
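
The faceting and clustering steps above can be sketched outside OpenRefine in a few lines of Python. This is only an illustration, not OpenRefine's own code: `fingerprint` approximates the idea behind key-collision clustering (lowercase, strip punctuation, sort the tokens), and the function names and the `sample` addresses are made up for this example rather than taken from the video's dataset.

```python
import string
import unicodedata
from collections import Counter, defaultdict

# Hypothetical sample values; the video's dataset is not reproduced here.
sample = [
    "12 Main St.", "12 Main St", "12 main st", "12 Main St.",
    "45 Park Road", "45 Park Road,", "45 Park Road",
    "9 Lake View",
]


def text_facet(values):
    """Count how often each exact value occurs, like a text facet."""
    return Counter(values)


def fingerprint(value):
    """Build a normalized key so near-identical spellings collide."""
    # Fold accented characters to plain ASCII, then trim and lowercase.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii").strip().lower()
    # Drop punctuation so "St." and "St" produce the same key.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Sort and de-duplicate tokens so word order does not matter.
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values by fingerprint; keep groups with several spellings."""
    groups = defaultdict(Counter)
    for value in values:
        groups[fingerprint(value)][value] += 1
    return [counts for counts in groups.values() if len(counts) > 1]


def merge_all(values):
    """Replace every variant with its cluster's most frequent spelling."""
    mapping = {}
    for counts in cluster(values):
        canonical = counts.most_common(1)[0][0]
        for variant in counts:
            mapping[variant] = canonical
    return [mapping.get(value, value) for value in values]


if __name__ == "__main__":
    print(text_facet(sample))
    print(cluster(sample))
    print(merge_all(sample))
```

Choosing the most frequent spelling as the merged value is one reasonable default. Reviewing each cluster before merging corresponds to the manual route, while calling `merge_all` directly corresponds to merging everything in one go.
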
Here are the links used in the video:
* [OpenRefine software](https://openrefine.org)
* [Dataset for OpenRefine](https://drive.google.com/file/d/1ccu0Xxk8UJUa2Dz4lihmvzhLjvPy42Ai/view)