The Challenge

Product catalogs often fall out of sync with the data in the ERP. This is frequently the result of missing or incorrect Manufacturer Part Numbers (MPNs), which lead to duplicate SKUs and missing values. Most retailers and distributors face this situation. Their challenge is to find an efficient way to detect duplicates, combine SKUs where possible, and identify the correct MPN for each SKU.

The Solution

dataX algorithms solve this problem in two steps. First, we take the base product catalog and use the Content Enrichment module to clean and complete the data as far as possible. We do this by adding the right attributes to the catalog and filling in any missing MPN values, which we obtain from supplier websites or other available sources.
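To make the MPN-filling step concrete, here is a minimal Python sketch. It is not the Content Enrichment module itself: the function name, the "sku" and "mpn" field names, and the idea of a pre-built supplier lookup keyed by SKU are all assumptions made for illustration.

```python
def fill_missing_mpns(catalog, supplier_mpns):
    """Fill blank MPNs from a supplier lookup (hypothetical helper).

    catalog       -- list of dicts with assumed keys "sku" and "mpn"
    supplier_mpns -- dict mapping SKU -> MPN, assumed to be collected
                     beforehand from supplier sites or other sources
    Returns (enriched_catalog, unresolved_skus).
    """
    enriched = []
    unresolved = []
    for item in catalog:
        # Treat None, empty, and whitespace-only values as missing.
        mpn = (item.get("mpn") or "").strip()
        if not mpn:
            mpn = supplier_mpns.get(item["sku"], "")
        if not mpn:
            # Still no value: flag the SKU for manual review.
            unresolved.append(item["sku"])
        # Copy the record so the input catalog is left unchanged.
        enriched.append({**item, "mpn": mpn})
    return enriched, unresolved
```

Returning the unresolved SKUs separately keeps the gaps visible, so that records without any known MPN are not silently passed on to the matching step.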

Once the catalog is in order, we use the Product Matching algorithm to scan the entire catalog and identify SKUs that have the same attributes, manufacturer, and part number. These are considered duplicates and are combined into a single product. Next, we look at SKUs with the same attributes but different manufacturer information, and compute a similarity score based on other parameters, such as the images and text associated with each product. If the similarity score is above a set threshold, the SKUs are grouped together. We can then de-duplicate these SKUs by merging them and updating the MPN value. Finally, the ERP is updated with the cleaned catalog information.
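The two matching passes described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the Product Matching algorithm: the record fields ("sku", "attributes", "manufacturer", "mpn", "description"), the function names, and the 0.85 threshold are hypothetical, and the similarity score uses only text (via the standard library's difflib), whereas the approach described here also draws on images.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(value or "").lower().split())


def attribute_key(item):
    """Order-independent key built from the item's attribute dict."""
    return tuple(sorted((k, normalize(v)) for k, v in item["attributes"].items()))


def find_exact_duplicates(catalog):
    """Pass 1: group SKUs sharing attributes, manufacturer, and MPN."""
    groups = defaultdict(list)
    for item in catalog:
        key = (attribute_key(item),
               normalize(item["manufacturer"]),
               normalize(item["mpn"]))
        groups[key].append(item["sku"])
    return [skus for skus in groups.values() if len(skus) > 1]


def find_probable_duplicates(catalog, threshold=0.85):
    """Pass 2: same attributes, different manufacturer, similar text.

    The threshold is an assumed value; in practice it would be tuned.
    """
    by_attrs = defaultdict(list)
    for item in catalog:
        by_attrs[attribute_key(item)].append(item)

    pairs = []
    for items in by_attrs.values():
        for i, a in enumerate(items):
            for b in items[i + 1:]:
                if normalize(a["manufacturer"]) == normalize(b["manufacturer"]):
                    continue  # same manufacturer: handled by pass 1
                score = SequenceMatcher(
                    None,
                    normalize(a["description"]),
                    normalize(b["description"]),
                ).ratio()
                if score >= threshold:
                    pairs.append((a["sku"], b["sku"], round(score, 2)))
    return pairs
```

Grouping by attributes before comparing descriptions limits the pairwise scoring to SKUs that could plausibly match, which matters because comparing every SKU against every other grows quadratically with catalog size.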

The Advantage

At the end of this automated exercise, retailers and distributors have a clean product catalog that is in sync with the ERP, free of duplicates, and carries the right MPN value for every SKU.