The Client
A top industrial distributors’ buying group in the USA.
The Challenge
Our client came to us for an efficient and accurate automated onboarding solution, but ended up staying for a lot more. Over the years we have taken on multiple challenges, ranging from catalog data enrichment, quality audit and taxonomy management, to creating supplier versions of modules that would give our client’s suppliers a way to upload and view their data in enriched form.
The challenges therefore can be categorized into:
- Replacing a vendor whose manual onboarding of product data was slow and expensive, auditing the quality of all existing SKUs, and planning how new SKUs would get into the system. All attributes also had to be validated and enriched against a set of stringent business rules and checks.
- Auto-classification into complex taxonomies at various high-level categories, including impact analysis for changes in taxonomy.
- Supplier versions of dataX modules that would enable our client’s suppliers to upload and review their data.
The Solutions:
Accurate and speedy onboarding and enrichment:
We created a proof-of-concept to demonstrate the efficiency and quality of AI automation over the manual process. While our speed was impressive, it was the way dataX handled complexity that was the clincher. Our algorithms auto-classified SKUs into complex taxonomies with better-than-human accuracy. We enriched product data by adding global attributes, product titles, specifications and other digital assets, aggregating to more than 50 attributes per SKU.
For new SKUs, we created an automated pipeline designed for smooth and speedy onboarding, with accuracy levels that required little or no human intervention. At the time of this writing, it is set to handle 600,000 new uploads and counting.
Quality Audit:
We performed a thorough check of the entire catalog to confirm that all business rules were adhered to, all industry standards were followed, and the formatting was right, down to the last comma. We also audited the accuracy of classification, the quality of attributes, and the correctness of image formats and other digital assets associated with e-commerce data. Our modules automatically updated the SKUs that needed only minor fixes, while those that needed a complete overhaul were brought to the client’s attention through detailed reports.
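The split between automatic fixes and reported issues can be illustrated with a minimal sketch. It is not the dataX implementation: the function name `audit_sku`, the required fields, and the list of allowed image extensions are all assumptions made for the example.

```python
# Hypothetical audit rules; the real business rules are not given in this case study.
REQUIRED_FIELDS = ("title", "category", "image_url")
ALLOWED_IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def audit_sku(record):
    """Return (fixed_record, auto_fixes, report_items) for one SKU record."""
    fixed = dict(record)
    auto_fixes = []    # minor problems corrected on the spot
    report_items = []  # problems that need a person to look at them

    # Minor fix: stray whitespace around string values.
    for key, value in record.items():
        if isinstance(value, str) and value != value.strip():
            fixed[key] = value.strip()
            auto_fixes.append(f"trimmed whitespace in {key}")

    # Needs review: required fields that are missing or empty.
    for key in REQUIRED_FIELDS:
        if not fixed.get(key):
            report_items.append(f"missing {key}")

    # Needs review: image present but in an unsupported format.
    image = fixed.get("image_url") or ""
    if image and not image.lower().endswith(ALLOWED_IMAGE_EXTENSIONS):
        report_items.append("unsupported image format")

    return fixed, auto_fixes, report_items


if __name__ == "__main__":
    sample = {"title": " Hex Bolt ", "category": "Fasteners", "image_url": "bolt.bmp"}
    print(audit_sku(sample))
```

In this sketch a record with an empty report list could be updated automatically, while anything else would go into the detailed report for the client.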
Complex taxonomy management:
We built a new feature in dataX to manage our client’s complex taxonomy structures. There were separate structures defined at each high-level category, which meant that each of these taxonomies would be maintained independently, and different versions at each of these hierarchies would also have to be maintained independently.
Any change to the taxonomy could potentially impact the catalog. We built an impact analysis tool in dataX that worked at two levels: (a) an online, immediate assessment that returned an impact score indicating how severely the catalog would be affected; and (b) a detailed analysis that showed the client the impact of the change on every individual SKU, down to the attribute level.
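As a rough illustration of the two levels of impact analysis, the sketch below computes a quick score and a per-SKU breakdown. The scoring formula (share of SKUs in a changed category), the `Sku` class, and the shape of the `changed_categories` argument are assumptions for the example; the case study does not say how dataX calculates its score.

```python
from dataclasses import dataclass, field


@dataclass
class Sku:
    sku_id: str
    category: str
    attributes: dict = field(default_factory=dict)


def impact_score(catalog, changed_categories):
    """Level (a): fraction of SKUs (0.0 to 1.0) that sit in a changed category."""
    if not catalog:
        return 0.0
    affected = sum(1 for sku in catalog if sku.category in changed_categories)
    return affected / len(catalog)


def detailed_impact(catalog, changed_categories):
    """Level (b): for each affected SKU, the attributes it would lose.

    changed_categories maps a category name to the set of attribute names
    that the proposed taxonomy version still allows for that category.
    """
    report = {}
    for sku in catalog:
        allowed = changed_categories.get(sku.category)
        if allowed is None:
            continue  # category untouched by the proposed change
        report[sku.sku_id] = sorted(set(sku.attributes) - allowed)
    return report


if __name__ == "__main__":
    catalog = [
        Sku("A1", "Fasteners", {"material": "steel", "thread_pitch": "1.5"}),
        Sku("A2", "Fasteners", {"material": "brass"}),
        Sku("B1", "Abrasives", {"grit": "120"}),
        Sku("B2", "Abrasives", {"grit": "80"}),
    ]
    change = {"Fasteners": {"material", "length"}}
    print(impact_score(catalog, change))
    print(detailed_impact(catalog, change))
```

The score is cheap enough to return immediately, while the per-SKU report is the kind of output that could be generated offline for larger catalogs.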
Supplier views:
Our dataX supplier version was designed to let suppliers upload their data for basic sanity checks and review before submitting to our client. It also let the supplier view their own data after enrichment on our client’s website.
The Result
All these solutions came together in the most satisfying way. Our client, primarily engaged in the buying business, was able to sign on more members, with larger catalogs, because the automated pipeline did all the work of onboarding and enrichment and kept the data at industry standards. Their clients were happy with the e-commerce services they had access to, which were completely enabled by dataX and tuned to optimal performance by our enrichment, auditing and taxonomy management tools.
Resources

Supplier Monitoring
One Pager

Competitor Monitoring
One Pager

Customer Experience Optimization
One Pager

Taxonomy Management
One Pager

Content Enrichment
One Pager
The Client
The world’s largest marketplace for sampling interior design materials.
The Challenge
When you are in possession of the world’s largest selection of interior design materials, and when your business depends on making every single one of those available to architects and designers, but your catalog only has 30,000 SKUs – yes, there’s a problem.
Our client had an in-house team working on getting all their SKUs into the system, but the process was slow. With their sights set on onboarding millions of SKUs, and the speed of the current process making this look impossible, they turned to dataX for a quick and efficient onboarding solution.
The Solution
Within 12 weeks we had completely revamped the onboarding process. Using AI automation, our modules were getting 30,000 SKUs into the system every month. Contrast this with the point just before dataX entered the picture, when there were 30,000 SKUs in all, and you will get an idea of how rapid the progress was.
With growing data, the business also grew. Our client made several acquisitions, which meant there were more SKUs to process, more catalogs to merge. We successfully standardized data across acquisitions, even multilingual international data. (At the time of this writing, we are implementing this module in Japanese, allowing the client to standardize their SKUs across the two languages, and we are set to add many more languages for them soon.)
“dataX modules performed a quick and accurate onboarding of product data, ensuring that our client’s samples always remain accessible to customers”
The Solution Plus
Having put in place an efficient onboarding process, which would also take care of standardizing any new catalog data, we realized there was still a gap.
Our client had 800 brands from various suppliers on their catalog. Suppliers and manufacturers frequently make changes to product data, and our client had to either depend on the supplier to pass on this information, or manually check the supplier websites and update their catalog. Surely, there must be a better way to stay current?
With dataX, there is.
“dataX constantly monitors supplier websites to help retailers present the most up-to-date product information to their customers”
Our modules periodically scour supplier websites and generate reports of changes in prices, descriptions, product attributes, digital assets and every other kind of product data that would require the attention of our client. Of particular interest are those attributes that lead to conversions on the website: changes to such attributes have the most impact and must never go unnoticed.
We also monitor supplier websites for the active/inactive status of SKUs, and for gaps in the digital assets (such as install images, manuals, documents). Every status change, every gap in data is made known to our client, and by extension, to their customers. Our client is therefore able to confidently present the most current and up-to-date product information, eliminating chances of putting up products that are no longer available, or worse, not even showcasing those that are!
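The change reports described above can be pictured as a comparison of two snapshots of supplier data. The sketch below is only an illustration under stated assumptions: the snapshot layout, the field names (`price`, `active`, `assets`), the `REQUIRED_ASSETS` set, and both function names are hypothetical, and the crawling of supplier websites itself is out of scope.

```python
# Hypothetical list of digital assets every SKU is expected to have.
REQUIRED_ASSETS = {"install_image", "manual"}


def diff_snapshots(previous, current):
    """List (sku_id, field, old_value, new_value) for every detected change."""
    changes = []
    for sku_id, new in current.items():
        old = previous.get(sku_id)
        if old is None:
            changes.append((sku_id, "sku", None, "added"))
            continue
        for name in sorted(set(old) | set(new)):
            if old.get(name) != new.get(name):
                changes.append((sku_id, name, old.get(name), new.get(name)))
    # SKUs that disappeared from the supplier site entirely.
    for sku_id in sorted(previous.keys() - current.keys()):
        changes.append((sku_id, "sku", "present", "removed"))
    return changes


def asset_gaps(snapshot):
    """Map each SKU to the required digital assets it is missing."""
    gaps = {}
    for sku_id, record in snapshot.items():
        missing = REQUIRED_ASSETS - set(record.get("assets", []))
        if missing:
            gaps[sku_id] = sorted(missing)
    return gaps


if __name__ == "__main__":
    previous = {
        "T-100": {"price": 12.5, "active": True, "assets": ["manual", "install_image"]},
        "T-200": {"price": 8.0, "active": True, "assets": ["manual"]},
    }
    current = {
        "T-100": {"price": 13.0, "active": True, "assets": ["manual", "install_image"]},
        "T-200": {"price": 8.0, "active": False, "assets": ["manual"]},
    }
    print(diff_snapshots(previous, current))
    print(asset_gaps(current))
```

A price change and an active-to-inactive status change both surface as rows in the same report, which makes it easy to filter for the attributes that matter most for conversions.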
The Result
The impact was significant. Our client became the direct and only competitor to the world leader in e-commerce. The quality of data (particularly the well-populated attribute values) directly impacted sales, and great sales led to better market value as reflected in a manifold jump in the stock price. We are not exaggerating when we say that such success is caused by something as innocuous as a customer searching for a red striped shirt, and actually getting that!
The Client
A leading home improvement retailer in the USA.
The Challenge
Our client had a complex classification system with 6000 product types and 4 levels of hierarchy in their taxonomy. The process of categorizing SKUs was largely manual. This was not just a problem for them, but for their suppliers as well, who had to perform this exercise for every single product they onboarded. It was tedious and confusing, and oftentimes, suppliers would end up picking the wrong product type or just throwing everything into a “miscellaneous” bucket. For a while, our client managed by manually correcting the misclassified product data. At one point, their catalog grew to 2 million SKUs, and what was difficult became impossible.
They turned to dataX for an automated classification and onboarding system.
The Solution
We built an auto-classifier to correct and reclassify all their existing product data into the specified taxonomy. This was up and running within a short period of 8-12 weeks. What about new data? And data from suppliers? We created a pipeline through which all new data, including supplier data, would be processed and classified. Any supplier wishing to onboard their data would simply use this pipeline.
We didn’t stop there. We performed a deep enrichment of the catalog data by extracting attributes from PDF data sheets, and we gave our client’s content team a dashboard to edit, validate and upload enriched data into their PIM.
“Good data is not about just well-classified data. Good data is all about plugging in the right attributes at the right places. About enriching the catalog in every way. dataX does just that.”
So the first part of our solution was to streamline the backend taxonomy and set it up for any kind of scaling and future processing.
The Solution Plus
The future was not far off! Looking at the highly organized nature of the backend taxonomy, our client asked us to map the same to the customer-facing displays as well. Essentially, this meant that an entirely new display taxonomy would have to be defined and created. Why entirely new? Because the way products are displayed to the customer is slightly different from the way they are categorized in the database. For example, a customer looking for recyclable products may key in search words like “green” or “environment friendly”, and our system needs to be smart enough to figure out that they are not talking about colors and plants. dataX was more than up to the task of creating this one-to-many mapping.
“While backend taxonomies are all about business rules, display taxonomies are guided by customer behavior – a nuance that dataX understands well.”
Based on customer search terms and Google AdWords, and taking into account variations within product categories, we created a finely tuned display taxonomy. Using this, we optimized product display pages that included left-hand navigation panels, auto-generated product titles, descriptions and breadcrumbs, all of which leveraged robust backend product data to provide an enhanced e-commerce shopping experience to the end customer.
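The one-to-many relationship between a display category and backend product types can be sketched with two small lookup tables. Everything here is hypothetical: the category names, the search phrases, and the function `backend_types_for` are invented for illustration, and the hand-written phrase table stands in for whatever dataX actually derived from search terms and AdWords data.

```python
# One display category fans out to several backend product types (one-to-many).
DISPLAY_TO_BACKEND = {
    "Eco-Friendly Products": ["Recycled Paper Towels", "LED Bulbs", "Low-Flow Faucets"],
    "Green Paint": ["Interior Paint"],
}

# Customer phrasing mapped to a display category (hand-written for this sketch).
SEARCH_PHRASES = {
    "environment friendly": "Eco-Friendly Products",
    "recyclable": "Eco-Friendly Products",
    "green paint": "Green Paint",
    "green": "Eco-Friendly Products",
}


def backend_types_for(query):
    """Return the backend product types behind the best-matching search phrase."""
    normalized = " ".join(query.lower().split())
    # Try longer phrases first so "green paint" wins over the bare word "green".
    for phrase in sorted(SEARCH_PHRASES, key=len, reverse=True):
        # Pad with spaces so only whole words match ("evergreen" is not "green").
        if f" {phrase} " in f" {normalized} ":
            return DISPLAY_TO_BACKEND[SEARCH_PHRASES[phrase]]
    return []


if __name__ == "__main__":
    print(backend_types_for("green cleaning supplies"))
    print(backend_types_for("Green paint for walls"))
```

Matching the longest phrase first is one simple way to keep a color query such as "green paint" from being read as an eco-friendly query; a production system would likely need much richer signals than a static table.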
The Result
The impact was significant. Our client became the direct and only competitor to the world leader in e-commerce. The quality of data (particularly the well-populated attribute values) directly impacted sales, and great sales led to better market value as reflected in a manifold jump in the stock price. We are not exaggerating when we say that such success is caused by something as innocuous as a customer searching for a red striped shirt, and actually getting that!
The Client
Retail giant, one of the top 3 in the US, with growing multinational presence.
The Challenge
Our client, already leading the world market in brick-and-mortar retail, wanted to up their e-commerce game by scaling their online presence from 1 million to 200 million SKUs in 18 months.
It was obvious that automation was the answer. The client fully understood the limitations of their earlier manual process, which required a 1,000-member human team to key rows and rows of product data into their system. At this scale, continuing the manual process would have meant hundreds of thousands more people, and timelines that would have put them nowhere in the competitive space.
Our client had a competent in-house team of data scientists who knew, better than anyone, that it was not just automation, but intelligent automation, that was going to help them operate at the scale they were looking at.
The challenge therefore was not just about getting 200 million SKUs on board; it wasn’t just about getting this done quickly – it was really about getting it done accurately and efficiently. It was about scaling their catalog without having to scale costs; it was about achieving 100% automation without compromising accuracy.
This is where dataX stepped in.
The Solution
With access to over 25,000 data scientists spread around the world, we crowdsourced the onboarding algorithms. With this we were able to develop hundreds of algorithms in parallel. The timelines definitely looked achievable. Our crowdsourcing model gave us another crucial advantage – with multiple community members competing with each other for best precision, we were able to meet the client’s expectation of 94-95% accuracy. This was better than human accuracy.
“dataX uses a crowdsourcing approach to get the best algorithms at enviable rates of accuracy and minimum bias”
Soon we were meeting targets of deploying 30-40 models per week with the same consistent accuracy. We were processing 2 million API calls per hour. During the holiday season, this would go up to 20-30 million per hour, and our system was robust enough to handle that.
At the end of 18 months, all 200 million SKUs were auto-classified into their taxonomy, with enriched attributes, optimized titles and descriptions, specification tables, and relevant digital assets in place. All the targets in terms of speed, cost and accuracy were well met.
The Result
The impact was significant. Our client became the direct and only competitor to the world leader in e-commerce. The quality of data (particularly the well-populated attribute values) directly impacted sales, and great sales led to better market value as reflected in a manifold jump in the stock price. We are not exaggerating when we say that such success is caused by something as innocuous as a customer searching for a red striped shirt, and actually getting that!