Introduction to OpenRefine
- OpenRefine is a free, open-source tool for cleaning, organizing, and exploring messy data.
- You can easily import, filter, sort, and analyze your data, even without technical experience.
- OpenRefine supports many data formats and can be extended with add-ons and custom scripts for even more possibilities.
- Using OpenRefine helps you prepare your data for analysis, making your research more accurate, efficient, and enjoyable.
Importing Data and Getting to Know the OpenRefine User Interface
- You can import data in many different formats into OpenRefine.
- Adjust import settings to ensure your data is read correctly, and preview the results before creating the project.
- Functions for working with your data are available from the arrow buttons next to each column header.
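
Why import settings matter can be illustrated outside OpenRefine. The following is a minimal Python sketch, not OpenRefine's own importer: the sample data and the `preview` helper are hypothetical. It shows how the same text yields one column or two depending on the separator chosen, which is the kind of problem the import preview lets you catch early.

```python
import csv
import io

# Hypothetical sample: a semicolon-separated file, invented for illustration.
RAW = "name;city\nAda Lovelace;London\nEmmy Noether;Erlangen\n"

def preview(text, delimiter, rows=3):
    """Return the first few parsed rows for a given separator setting."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [row for _, row in zip(range(rows), reader)]

# Wrong separator: every row collapses into a single column.
wrong = preview(RAW, ",")
# Right separator: rows split into the intended two columns.
right = preview(RAW, ";")

print(wrong[0])
print(right[0])
```

Comparing the two previews side by side makes the misread obvious before any cleaning work starts.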
Exploring Data
Custom Facets and GREL
Transforming Data
Reconciling Data with External Data Sources
- Reconciliation links text strings to unique identifiers in external databases.
- This makes your dataset more precise, reusable, and comparable across projects.
- OpenRefine provides a structured workflow for reconciliation: propose → review → confirm → enrich.
- The human researcher stays in control: machines suggest, but you decide.
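
The propose → review → confirm → enrich workflow can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how OpenRefine or any reconciliation service is implemented: the `AUTHORITY` table, its identifiers, and the function names are all made up, and simple string similarity stands in for a real matching service.

```python
from difflib import SequenceMatcher

# Hypothetical miniature "external database": identifier -> preferred label.
# The identifiers are invented for this sketch.
AUTHORITY = {
    "ID-001": "Johann Wolfgang von Goethe",
    "ID-002": "Friedrich Schiller",
    "ID-003": "Heinrich Heine",
}

def propose(text, limit=3):
    """Propose: rank candidate identifiers by string similarity, best first."""
    scored = [
        (SequenceMatcher(None, text.lower(), label.lower()).ratio(), ident, label)
        for ident, label in AUTHORITY.items()
    ]
    scored.sort(reverse=True)
    return scored[:limit]

def confirm(text, candidates, chosen_id):
    """Confirm: record a human decision; only proposed identifiers are accepted."""
    if chosen_id not in {ident for _, ident, _ in candidates}:
        raise ValueError("chosen identifier was not among the proposals")
    return {"text": text, "id": chosen_id}

def enrich(match):
    """Enrich: pull extra data (here, the preferred label) from the database."""
    return {**match, "preferred_label": AUTHORITY[match["id"]]}

# A misspelled name from a messy dataset.
candidates = propose("Friedrich Schiler")

# Review: a person inspects the ranked candidates before anything is stored.
for score, ident, label in candidates:
    print(f"{ident}  {label}  (similarity {score:.2f})")

# Here the top suggestion stands in for the reviewer's choice.
match = confirm("Friedrich Schiler", candidates, candidates[0][1])
record = enrich(match)
print(record)
```

The point of the design is that `propose` never writes anything: a match only enters the data through `confirm`, which mirrors the principle that the machine suggests and the researcher decides.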
Undo, Redo, and Exporting Workflows
- OpenRefine records every transformation you make.
- The Undo/Redo panel lets you move backward and forward through your cleaning process.
- Workflows can be exported as JSON and reapplied to other projects, ensuring transparency and reproducibility.
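
An exported workflow is a JSON list of operations, so it can be read and documented with ordinary tools. The Python sketch below assumes a small history modeled on a typical export; the column names are hypothetical, and the exact fields may differ between OpenRefine versions.

```python
import json

# Assumed example of an exported operation history (column names are invented).
EXPORTED = """
[
  {
    "op": "core/text-transform",
    "engineConfig": {"facets": [], "mode": "row-based"},
    "columnName": "title",
    "expression": "value.trim()",
    "onError": "keep-original",
    "repeat": false,
    "repeatCount": 10,
    "description": "Text transform on cells in column title using expression value.trim()"
  },
  {
    "op": "core/column-rename",
    "oldColumnName": "title",
    "newColumnName": "work_title",
    "description": "Rename column title to work_title"
  }
]
"""

def summarize(history_json):
    """Return one readable line per recorded step, in the order it was applied."""
    operations = json.loads(history_json)
    return [
        f"{number}. {op['op']}: {op.get('description', '(no description)')}"
        for number, op in enumerate(operations, start=1)
    ]

steps = summarize(EXPORTED)
for line in steps:
    print(line)
```

A summary like this can be kept alongside the dataset as a plain-language record of the cleaning steps, while the JSON itself remains the version that can be reapplied to another project.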