metadata
title: Cleaning Data with OpenRefine
original_url: >-
  https://tds.s-anand.net/#/cleaning-data-with-openrefine?id=cleaning-data-with-openrefine
downloaded_at: '2025-06-08T23:26:48.911609'

Cleaning Data with OpenRefine


This session covers data cleaning with OpenRefine, focusing on resolving entries that refer to the same entity but are written differently:

  • Data Upload and Project Creation: Import data into OpenRefine and create a new project for analysis.
  • Faceting Data: Use text facets to group entries by value and see how frequently each address crumb occurs.
  • Clustering Methodology: Apply clustering algorithms to merge similar entries with minor differences, such as punctuation.
  • Manual and Automated Clustering: Merge clusters one at a time after reviewing them, or all in one go if you trust the system’s clustering accuracy.
  • Entity Resolution: Clean and save the data by resolving multiple versions of the same entity using OpenRefine.
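
To make the faceting and clustering steps above concrete, here is a minimal Python sketch of the same idea outside OpenRefine. It is an illustration, not OpenRefine's actual code: `fingerprint` is a simplified take on key-collision clustering (lowercase, drop accents and punctuation, sort the unique tokens), and the function names and sample addresses are hypothetical.

```python
import re
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Simplified clustering key (an assumption, not OpenRefine's exact algorithm)."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accents
    text = re.sub(r"[^\w\s]", "", text)                    # drop punctuation
    return " ".join(sorted(set(text.split())))             # sorted unique tokens


def text_facet(values):
    """Count how often each exact value occurs, like a text facet."""
    return Counter(values)


def cluster(values):
    """Group distinct values that share a fingerprint; keep groups of 2 or more."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def merge_clusters(values):
    """Replace every value with the most frequent variant in its cluster."""
    counts = Counter(values)
    by_key = defaultdict(list)
    for value in counts:
        by_key[fingerprint(value)].append(value)
    canonical = {}
    for variants in by_key.values():
        # Ties are broken alphabetically so the result is deterministic.
        best = max(variants, key=lambda v: (counts[v], v))
        for value in variants:
            canonical[value] = best
    return [canonical[value] for value in values]


# Hypothetical sample data: three spellings of one address plus one other address.
sample = [
    "Anna Salai, Chennai",
    "Anna Salai Chennai",
    "chennai, anna salai",
    "Anna Salai, Chennai",
    "MG Road, Bengaluru",
]

if __name__ == "__main__":
    print(text_facet(sample))      # frequency of each exact value
    print(cluster(sample))         # candidate groups to review
    print(merge_clusters(sample))  # values after merging every cluster
```

Calling `cluster` and reviewing each group corresponds to merging manually, while `merge_clusters` corresponds to accepting every cluster in one go. The second option is only safe when the key is strict enough that unrelated values do not collide.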

Here are links used in the video:
