Horse Sport Ireland Case Study

Case Study – Horse Sport Ireland Data Cleansing Project

Challenge

Horse Sport Ireland (HSI), the national governing body for equestrian sports in Ireland, was facing significant challenges with their legacy system for managing passport registrations, endorsements, importation, and ownership applications.

The outdated, 15-year-old custom-built platform relied on manual data input with limited internal controls. This led to serious data integrity issues, such as duplicate records for animals and owners and inaccurate progeny information, which compromised the quality and reliability of the data.

These issues presented a critical roadblock as HSI prepared to migrate to a new, modern database platform, which required a clean and accurate dataset for a successful transition.

Solution

The team applied advanced machine learning techniques to identify duplicate records in both the animal portfolio and the accounts table. This initial analysis revealed numerous inconsistencies, including mismatched progeny data and incorrect parentage information, which were flagged for further review.
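
The case study does not name the techniques used, so the following is only a minimal sketch of what flagging candidate duplicates can look like. It swaps in plain string similarity from Python's standard library (`difflib`) as a stand-in for the machine learning approach, and the field names (`id`, `name`, `birth_year`), the sample records, and the 0.9 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(name: str) -> str:
    """Lower-case the name and collapse repeated whitespace."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def candidate_duplicates(records, threshold=0.9):
    """Return (id, id, score) for pairs that look like the same animal.

    Only records sharing a birth year are compared (a simple blocking
    step); the threshold is an illustrative assumption.
    """
    pairs = []
    for left, right in combinations(records, 2):
        if left["birth_year"] != right["birth_year"]:
            continue
        score = similarity(left["name"], right["name"])
        if score >= threshold:
            pairs.append((left["id"], right["id"], round(score, 2)))
    return pairs


# Hypothetical sample records, not real HSI data.
animals = [
    {"id": 1, "name": "Example Star", "birth_year": 2009},
    {"id": 2, "name": "example  star", "birth_year": 2009},
    {"id": 3, "name": "Meadow Hill Lad", "birth_year": 2009},
    {"id": 4, "name": "Example Star", "birth_year": 2015},
]

candidates = candidate_duplicates(animals)
```

Here records 1 and 2 are flagged because their names match after normalisation, while record 4 is left alone because its birth year differs. Pairs found this way are candidates only and still need review before anything is removed.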

A process was developed to isolate and validate candidate duplicate records, ensuring no critical information would be lost during the clean-up. The candidate duplicates were then presented to HSI for review so that they could be isolated and removed from the data table.
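
One way to check that nothing is lost before a duplicate is removed is to compare the two records field by field. The sketch below is an assumption about how such a check could work, not the process the team built; `review_pair` and the field names are hypothetical.

```python
def review_pair(keep: dict, drop: dict) -> dict:
    """Compare a record to keep with its candidate duplicate.

    Reports values that exist only on the duplicate (and would be lost
    on deletion) and values that disagree (and need a human decision).
    """
    fill = {}
    conflicts = {}
    for field, value in drop.items():
        if field == "id" or value in (None, ""):
            continue
        kept = keep.get(field)
        if kept in (None, ""):
            fill[field] = value                # only the duplicate has it
        elif kept != value:
            conflicts[field] = (kept, value)   # the two records disagree
    return {
        "keep_id": keep["id"],
        "drop_id": drop["id"],
        "fill_from_duplicate": fill,
        "conflicts": conflicts,
        "safe_to_remove": not fill and not conflicts,
    }


# Hypothetical records for illustration only.
keep = {"id": 1, "name": "Example Star", "microchip": "", "owner_id": 10}
drop = {"id": 2, "name": "Example Star", "microchip": "HYPOTHETICAL-123", "owner_id": 11}

report = review_pair(keep, drop)
```

In this example the duplicate carries a microchip value the surviving record lacks and a different owner, so the pair is marked as not safe to remove until someone has reviewed it.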

To prevent linking errors, the team cross-referenced progeny data with verified parent records, re-mapping incorrect relationships and making sure that all associated records were updated accurately.
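
The re-mapping step can be sketched as follows, assuming each progeny record stores `sire_id` and `dam_id` and that the clean-up produced a direct mapping from each removed duplicate to its surviving record. The function and field names are hypothetical and chosen only for illustration.

```python
def remap_parents(progeny, merged_into, verified_ids):
    """Point parent references at surviving records.

    merged_into maps a removed duplicate id to the id that replaced it
    (assumed to be a direct, one-step mapping). References that still do
    not resolve to a verified record are returned for manual review.
    """
    updated = []
    unresolved = []
    for row in progeny:
        new_row = dict(row)  # leave the input rows untouched
        for field in ("sire_id", "dam_id"):
            parent = row.get(field)
            if parent is None:
                continue
            parent = merged_into.get(parent, parent)
            new_row[field] = parent
            if parent not in verified_ids:
                unresolved.append((row["id"], field, parent))
        updated.append(new_row)
    return updated, unresolved


# Hypothetical data: record 2 was a duplicate of record 1.
progeny = [
    {"id": 100, "sire_id": 2, "dam_id": 5},
    {"id": 101, "sire_id": 1, "dam_id": 99},
]
updated, unresolved = remap_parents(progeny, merged_into={2: 1}, verified_ids={1, 5})
```

Record 100 now points at sire 1 instead of the removed duplicate, and the dam reference on record 101 is reported as unresolved because no verified record has id 99.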

Validation rules (business rules) were implemented at a holistic level to highlight anomalies that may indicate incorrect links between records, for example where progeny are older than their parents or where a male horse is recorded as the dam. This ensured that HSI had a clean data set to migrate to their new platform.
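
The two example rules above can be expressed as a small check. This is a minimal sketch under assumed field names (`sex`, `born`, `sire_id`, `dam_id`) and sex codes (`"M"`, `"F"`); the rules HSI actually deployed are not described in detail here.

```python
from datetime import date


def validate_parentage(animal, animals_by_id):
    """Return a list of rule violations for one animal's parent links."""
    issues = []
    for field, expected_sex in (("sire_id", "M"), ("dam_id", "F")):
        parent = animals_by_id.get(animal.get(field))
        if parent is None:
            continue  # a missing parent is not checked by these two rules
        if parent["sex"] != expected_sex:
            issues.append(
                f"{field} {parent['id']} has sex {parent['sex']}, expected {expected_sex}"
            )
        if parent["born"] >= animal["born"]:
            issues.append(
                f"{field} {parent['id']} is not older than progeny {animal['id']}"
            )
    return issues


# Hypothetical records for illustration only.
animals_by_id = {
    1: {"id": 1, "sex": "M", "born": date(2005, 4, 1)},
    2: {"id": 2, "sex": "F", "born": date(2006, 5, 1)},
    3: {"id": 3, "sex": "M", "born": date(2015, 3, 1)},
    10: {"id": 10, "sex": "F", "born": date(2012, 6, 1), "sire_id": 1, "dam_id": 3},
    11: {"id": 11, "sex": "M", "born": date(2013, 6, 1), "sire_id": 1, "dam_id": 2},
}

issues_10 = validate_parentage(animals_by_id[10], animals_by_id)
issues_11 = validate_parentage(animals_by_id[11], animals_by_id)
```

Animal 10 fails both rules because its recorded dam is male and was born after it, while animal 11 passes. Running checks like these before data is saved is one way such rules can keep new errors out.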

Impact

20% of the population was identified as duplicate data and successfully removed
Validation rules implemented to prevent future data errors

Through the data cleansing project, the team identified that 20% of the population was duplicate data. Once these records were removed, HSI had a smaller data set to review, which made the project more efficient for their team.

The project also implemented validation rules that not only guard against erroneous data going forward but also laid the foundations of clean data standards for migration to the new system.

What was once seen as a cumbersome project was greatly reduced in scope once the team took control of the data set.

“We were initially overwhelmed by the number of duplicates being identified which highlighted the true extent of the challenge we faced. Their solution not only cleaned up thousands of duplicate entries but also put essential data validation rules in place, ensuring a smooth migration to the new system and preventing future errors.”