From 1 million SKUs to 200 million in 18 months



The client: a retail giant, one of the top 3 in the US, with a growing multinational presence.


Our client, already leading the world market in brick-and-mortar retail, wanted to up their e-commerce game by scaling their online presence from 1 million to 200 million SKUs in 18 months.

Automation was the obvious answer. The client fully understood the limitations of their earlier manual process, which required a 1,000-strong team to key row after row of product data into their system. At the new scale, continuing the manual process would have meant hundreds of thousands more people, and timelines that would have left them far behind in a competitive space.

Our client had a competent in-house team of data scientists who knew, better than anyone, that it was not just automation but intelligent automation that would let them operate at the scale they were targeting.

The challenge, therefore, was not just getting 200 million SKUs on board, and not just getting it done quickly; it was getting it done accurately and efficiently. It meant scaling the catalog without scaling costs, and achieving 100% automation without compromising accuracy.

This is where dataX stepped in.


With access to over 25,000 data scientists around the world, we crowdsourced the onboarding algorithms, which let us develop hundreds of them in parallel and made the timelines look achievable. Our crowdsourcing model gave us another crucial advantage: with multiple community members competing for the best precision, we were able to meet the client’s expectation of 94-95% accuracy, which was better than human accuracy.

Soon we were meeting our targets of deploying 30-40 models per week at the same consistent accuracy. We were processing 2 million API calls per day. During the holiday season that figure rose to 20-30 million, and our system was robust enough to handle it.

At the end of 18 months, all 200 million SKUs were auto-classified into their taxonomy, with enriched attributes, optimized titles and descriptions, specification tables, and relevant digital assets in place. All the targets in terms of speed, cost and accuracy were well met.


The impact was significant. Our client became the direct and only competitor to the world leader in e-commerce. The quality of the data (particularly the well-populated attribute values) directly impacted sales, and strong sales led to better market value, reflected in a manifold jump in the stock price. We are not exaggerating when we say that such success starts with something as innocuous as a customer searching for a red striped shirt and actually getting exactly that!
