Andrei M. · Data Quality · 13 min read

Case Study: How a Baby Products Brand Standardized 25,000 Attributes After Multi-Supplier Onboarding

A baby products brand onboarded 8 new suppliers in 6 months. The result: 25,000 product attributes with no consistent naming, no value standardization, and broken storefront filters. Here is how they fixed it.

A baby products brand selling strollers, feeding bottles, clothing, toys, and nursery furniture had onboarded 8 new suppliers over a 6-month period as part of a catalog expansion strategy. Each supplier sent product data in their own format, using their own attribute naming conventions and value vocabularies. After onboarding, the brand’s catalog contained 25,000 product attributes across 1,200 products — with no consistent naming, no standardized values, and a storefront filter system that had become actively misleading to shoppers. The product attribute cleanup was not optional.


The Challenge

The symptom that forced the issue was customer-visible: the storefront’s faceted filters had become unusable for the baby clothing category. A shopper filtering by color was seeing 14 distinct “color” filter options on the left-hand navigation panel: Color, Colour, color_name, primary_color, main_colour, Farbe, Product Color, Item Color, and six variations of those names with different capitalization or punctuation.

Each variation was a separate attribute that different suppliers had used to represent the same data. Because the catalog system treated differently-named attributes as distinct fields, each supplier’s color attribute created a separate filter facet. A shopper trying to find blue items would need to check 14 separate filter boxes — and even then, suppliers who had used hex codes rather than color names would not appear in any of those filter results at all.

The same problem existed across every filterable attribute category. Age suitability appeared as “Age Range,” “age_group,” “Suitable Age,” “Min Age Months,” “Age (months),” and 9 other variants. Material was represented across 11 different attribute names. Safety certifications — a critical purchase consideration for baby products — appeared under 8 different attribute names.
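The variants that differ only in capitalization or punctuation are the easiest ones to surface mechanically. The sketch below is a minimal illustration in plain Python, not part of any tool described here; the function names and the sample list are assumptions. It groups names that collapse to the same key once case and punctuation are removed. Names such as “Colour” or “Farbe” will not be caught this way and still need a human decision.

```python
import re
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Lowercase the name and drop everything except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def group_trivial_variants(names):
    """Group attribute names that differ only by case, spacing, or punctuation."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    # Only groups with more than one member are candidate duplicates.
    return {key: sorted(members) for key, members in groups.items() if len(members) > 1}


# Illustrative sample, loosely based on the names quoted above.
names = ["Color", "color", "COLOR:", "Colour", "color_name", "Color Name", "Farbe"]
variants = group_trivial_variants(names)
```

Here `variants` contains two groups: one for the plain “color” spellings and one for “color_name” and “Color Name”. “Colour” and “Farbe” stay ungrouped, which is the expected limit of a purely textual check.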

Beyond the filter visibility problem, the attribute fragmentation was creating operational problems in catalog management. Because the same data existed under different names across different supplier imports, there was no way to run a catalog-wide query like “show me all products certified for CE safety” without manually querying 8 different attribute names and deduplicating the results. Catalog managers trying to apply bulk updates to products with a specific attribute value could not do so efficiently because the target attribute had 8 to 14 equivalent names depending on which supplier’s data was in scope.
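To make the cost of that workaround concrete, here is a rough sketch of what a “CE certified” query looks like when the field is spread across several names. The product structure, the three variant names, and the comma-separated value format are all assumptions made for illustration; the real catalog had 8 certification variants.

```python
# Hypothetical variant names for one logical field (safety certification).
CERT_VARIANTS = ["Safety Certification", "safety_cert", "Certifications"]


def products_with_value(products, attribute_names, wanted):
    """Return SKUs where any of the variant attributes lists the wanted value."""
    matches = set()  # a set deduplicates products that match under several names
    for product in products:
        for name in attribute_names:
            value = product["attributes"].get(name)
            if value is None:
                continue
            # Assumes multi-valued fields are comma-separated strings.
            tokens = {token.strip().lower() for token in str(value).split(",")}
            if wanted.lower() in tokens:
                matches.add(product["sku"])
                break
    return sorted(matches)


products = [
    {"sku": "BOT-001", "attributes": {"Safety Certification": "CE"}},
    {"sku": "STR-002", "attributes": {"safety_cert": "CE, EN 1888"}},
    {"sku": "TOY-003", "attributes": {"Certifications": "ASTM F963"}},
    {"sku": "BOT-004", "attributes": {"Safety Certification": "CE", "safety_cert": "ce"}},
]
ce_products = products_with_value(products, CERT_VARIANTS, "CE")
```

Every new supplier naming convention means another entry in the variant list, which is why this approach does not hold up as the catalog grows.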

The total attribute count across the 1,200 products was approximately 25,000 individual attribute instances. Of those, an analysis identified roughly 340 unique attribute names in active use — representing, at best, 60 to 80 meaningfully distinct data fields that had been fragmented into 340 variants by inconsistent supplier naming.


What They Tried First

The first response was a manual rename project. Two catalog managers were assigned to work through the attribute list and rename duplicates to a standard name. They worked in the catalog management system’s attribute view, identifying groups of attributes that appeared to represent the same field and renaming them one by one.

After two weeks, they had standardized approximately 80 attribute names — roughly 24% of the 340 in active use. The work was slow because identifying which attributes were true duplicates required sampling the values across multiple products for each suspect pair, not just comparing the names. Two attributes named differently might represent genuinely different fields; two attributes named similarly might have been used for different data by different suppliers.

The rename work also created a secondary problem: when an attribute was renamed, products using the old attribute name lost their filter assignments until the new name was indexed. On a live storefront, this created transient periods where products temporarily disappeared from filter results. The team was not comfortable running this process at scale on a live catalog.

The second attempt was a supplier data re-import project. The idea was to ask each of the 8 suppliers to re-export their product data using a standardized attribute template the brand would provide. Four suppliers agreed and re-exported on the template. The other four either could not export to a custom format from their systems or did not respond in a reasonable timeframe. Re-importing the 4 compliant supplier exports added 4 new standardized attribute names to the mix, but the existing non-standard attributes from those same suppliers were still in the catalog from the original imports and needed to be removed. The catalog now had both the old and new versions of those suppliers’ attributes in simultaneous use across different products.

Neither approach was producing a clean, governed attribute schema at the pace the business needed.


The Solution

The team used MicroPIM’s attribute mapping and bulk normalization tools to execute a structured product attribute cleanup across the full 25,000 attribute instances.

Step 1: Audit and Classify All Attribute Names

The starting point was a complete audit of all attribute names in use. MicroPIM’s attribute manager exports a complete attribute list with usage counts — how many products have a value for each attribute name. The export produced a spreadsheet with 340 rows, each representing a unique attribute name and the count of products using it.
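For readers without such an export, the same audit can be approximated from any product dump. The following is a minimal sketch, assuming each product is a dictionary with an `attributes` mapping; that structure and the sample data are hypothetical and do not reflect MicroPIM's export format.

```python
from collections import Counter


def attribute_usage(products):
    """Count how many products carry a non-empty value for each attribute name."""
    usage = Counter()
    for product in products:
        for name, value in product["attributes"].items():
            if value not in (None, ""):
                usage[name] += 1
    # One (name, product_count) row per attribute name, most used first.
    return usage.most_common()


products = [
    {"sku": "A", "attributes": {"Color": "Blue", "Age Range": "0-6 months"}},
    {"sku": "B", "attributes": {"Colour": "Red", "Age Range": ""}},
    {"sku": "C", "attributes": {"Color": "Green", "primary_color": None}},
]
rows = attribute_usage(products)
```

Sorting by usage count helps with prioritization: names used by many products are usually good candidates for the canonical name, while names used by a handful of products are often variants or import artifacts.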

The catalog team worked through this spreadsheet offline to classify each attribute into one of three groups:

  • Keep: The attribute name is the standard and will be the canonical name going forward.
  • Map to canonical: This attribute name is a variant of a canonical attribute — it should be merged into the canonical name.
  • Retire: This attribute name represents data that should not be in the catalog (supplier-internal codes, temporary fields, data artifacts from imports).

The classification work took two days for two people. It required domain knowledge of which attributes were equivalent: for example, recognizing that “Min Age Months” and “Suitable Age” both represented age suitability and should be mapped to a canonical “Age Suitability” attribute, even though the two source names differ from each other and from the canonical name. The output was a mapping document: 340 source attribute names mapped to approximately 72 canonical attribute names, with 40 attributes designated for retirement.
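A mapping document of this kind is worth checking for internal consistency before it drives a bulk job. The sketch below assumes a simple row layout (`source`, `action`, `canonical`) with the three classification groups expressed as `keep`, `map`, and `retire`. The layout and the sample rows are illustrative assumptions, not the team's actual spreadsheet.

```python
VALID_ACTIONS = {"keep", "map", "retire"}


def validate_mapping(rows, canonical_names):
    """Return a list of human-readable problems found in the mapping rows."""
    problems = []
    seen = set()
    for row in rows:
        source = row["source"]
        action = row["action"]
        target = row.get("canonical", "")
        if source in seen:
            problems.append(f"{source}: listed more than once")
        seen.add(source)
        if action not in VALID_ACTIONS:
            problems.append(f"{source}: unknown action {action!r}")
        elif action == "retire":
            if target:
                problems.append(f"{source}: retired attributes take no target")
        elif target not in canonical_names:
            problems.append(f"{source}: target {target!r} is not in the canonical schema")
        elif action == "keep" and target != source:
            problems.append(f"{source}: 'keep' rows must target themselves")
    return problems


canonical = {"Color", "Age Suitability"}
rows = [
    {"source": "Color", "action": "keep", "canonical": "Color"},
    {"source": "primary_color", "action": "map", "canonical": "Color"},
    {"source": "Min Age Months", "action": "map", "canonical": "Age Suitability"},
    {"source": "supplier_internal_code", "action": "retire", "canonical": ""},
    {"source": "Farbe", "action": "map", "canonical": "Colour"},  # typo in target
]
problems = validate_mapping(rows, canonical)
```

In this sample only the “Farbe” row is reported, because its target is misspelled and does not exist in the canonical set. Catching that in the spreadsheet is cheaper than discovering a stray new attribute after the bulk run.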

Step 2: Define the Canonical Attribute Schema

Before executing the mapping, the team formalized the canonical attribute schema in MicroPIM — creating 72 structured attribute definitions with their data types, allowed values (controlled vocabulary for select-type attributes), and validation rules.

For attributes with controlled vocabularies — color, age suitability, material type, safety certification, size range — the canonical values were defined explicitly. For color, the standard value list included 28 color names that covered the full range of values appearing across supplier data, with a mapping table from supplier-specific color terms to the canonical values (for example, “sky blue,” “powder blue,” “light blue,” and three hex codes all mapped to the canonical value “Light Blue”).
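A value mapping table like this can be pictured as a lookup from normalized supplier terms to canonical values. The sketch below is illustrative only: the canonical list is truncated to three entries, and the hex code is a made-up example because the article does not name the three codes that were mapped.

```python
# Truncated, illustrative vocabulary; the case study's real list had 28 colors.
CANONICAL_COLORS = ["Light Blue", "Navy", "White"]

# Supplier terms mapped to canonical values. The hex code is a hypothetical example.
SUPPLIER_COLOR_TERMS = {
    "sky blue": "Light Blue",
    "powder blue": "Light Blue",
    "light blue": "Light Blue",
    "#add8e6": "Light Blue",
}

# Canonical names map to themselves so already-clean values pass through.
_LOOKUP = {**{name.lower(): name for name in CANONICAL_COLORS}, **SUPPLIER_COLOR_TERMS}


def canonical_color(raw: str):
    """Return the canonical color for a supplier value, or None if unrecognized."""
    return _LOOKUP.get(raw.strip().lower())


examples = [canonical_color(v) for v in ["  Sky Blue ", "#ADD8E6", "NAVY", "teal-ish"]]
```

Returning `None` for unrecognized values, rather than guessing, leaves the decision with a person: either the vocabulary grows or the term is mapped to an existing value.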

[SCREENSHOT: MicroPIM attribute schema editor showing the canonical “Color” attribute definition with a controlled vocabulary of 28 values, and the value mapping table showing 47 non-standard supplier color terms mapped to canonical values]

This step took approximately 8 hours, including the value mapping tables for the 12 attributes with controlled vocabularies.

Step 3: Run Attribute Mapping in Bulk

With the canonical schema defined and the source-to-canonical mapping document complete, the team used MicroPIM’s bulk attribute mapping function to execute the normalization. The mapping configuration imported the 340-row mapping spreadsheet and applied the transformations across the full product catalog.

For each product, the operation:

  1. Read the value from the source attribute (e.g., “primary_color”: “sky blue”)
  2. Applied the value mapping (sky blue → Light Blue)
  3. Wrote the value to the canonical attribute (Color: Light Blue)
  4. Flagged the source attribute for retirement once the migration was validated
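The four steps above can be sketched for a single product as follows. This is a simplified illustration in plain Python under assumed data structures; it is not MicroPIM's implementation, and the function and parameter names are hypothetical.

```python
def map_product(attributes, name_map, value_maps):
    """Apply name and value mappings to one product's attributes.

    attributes: {source attribute name: raw value}
    name_map:   {source attribute name: canonical attribute name}
    value_maps: {canonical attribute name: {normalized raw value: canonical value}}
    """
    canonical, flagged, to_retire = {}, [], []
    for source_name, raw in attributes.items():
        target = name_map.get(source_name)
        if target is None:
            continue  # not covered by the mapping document; left untouched
        table = value_maps.get(target)
        if table is None:
            value = raw  # no controlled vocabulary for this attribute
        else:
            value = table.get(str(raw).strip().lower())  # step 2: value mapping
            if value is None:
                flagged.append((source_name, raw))  # unrecognized value
                continue
        if target in canonical and canonical[target] != value:
            flagged.append((source_name, raw))  # two sources disagree
            continue
        canonical[target] = value  # step 3: write to the canonical attribute
        if source_name != target:
            to_retire.append(source_name)  # step 4: retire after validation
    return canonical, flagged, to_retire


name_map = {"primary_color": "Color", "main_colour": "Color", "Min Age Months": "Age Suitability"}
value_maps = {"Color": {"sky blue": "Light Blue", "light blue": "Light Blue"}}
attributes = {"primary_color": "sky blue", "Min Age Months": "6", "main_colour": "azure"}
canonical, flagged, to_retire = map_product(attributes, name_map, value_maps)
```

The source attributes are only listed for retirement here, not deleted, which mirrors the order described above: migrate, validate, then retire.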

The bulk mapping ran in approximately 35 minutes for all 1,200 products and 25,000 attribute instances.

[SCREENSHOT: MicroPIM bulk attribute mapping job results screen showing 25,000 attribute instances processed, 18,400 successfully mapped to canonical attributes, 4,200 flagged for manual review due to unrecognized values, and 2,400 retired attribute instances]

Step 4: Resolve Manual Review Flags and Retire Deprecated Attributes

The mapping job flagged 4,200 attribute instances where the source value did not match any entry in the value mapping table — meaning the supplier had used a value the team had not anticipated when building the mapping tables. The catalog managers worked through the flagged values in batches, deciding whether each value should be added to the canonical vocabulary or mapped to an existing canonical value.

The resolution work took 3 days for one catalog manager. Approximately 60% of the flagged values were straightforward additions or mappings that had been missed in the initial mapping table. The remaining 40% required checking with the supplier or the product record itself to determine the correct canonical value.
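Working “in batches” is easier when identical flagged values are collapsed first, so that one decision resolves every instance of a value. A minimal sketch, assuming the flagged items are available as pairs of canonical attribute and raw value (a hypothetical export shape):

```python
from collections import Counter


def review_queue(flagged):
    """Collapse flagged instances into distinct (attribute, value) pairs with counts."""
    counts = Counter(
        (attribute, str(value).strip().lower()) for attribute, value in flagged
    )
    # Most frequent first: one decision on a common value clears many instances.
    return counts.most_common()


flagged = [
    ("Color", "Azure"),
    ("Color", "azure "),
    ("Material", "bamboo viscose"),
    ("Color", "azure"),
]
queue = review_queue(flagged)
```

In this sample four flagged instances reduce to two decisions, and the most frequent value comes first in the queue.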

After resolution, the deprecated source attribute names were retired from the active schema. MicroPIM’s attribute retirement function removes the attribute from the schema definition while preserving a historical record — the old attribute values remain queryable for audit purposes but are no longer treated as active catalog fields.

[SCREENSHOT: MicroPIM storefront filter preview showing the “Color” filter with 28 clean, distinct values replacing the previous 14 fragmented color attribute facets]


The Results

Filter usability: The storefront’s faceted filter system was reduced from 340 fragmented attribute facets to 72 canonical attributes. The color filter dropped from 14 variants to a single “Color” filter with 28 distinct values. Shopper engagement with the filter system (measured by filter click events per session) increased 41% in the 30 days following the cleanup, indicating shoppers were finding the filters useful again.

Catalog management efficiency: Queries and bulk operations that previously required checking 8-14 attribute names could now be executed against a single canonical attribute. The time to generate a “show all products with CE certification” report dropped from 25 minutes (requiring manual queries against 8 attribute variants and deduplication) to under 2 minutes.

Supplier onboarding: The canonical attribute schema became the mandatory data specification for new supplier onboarding. Suppliers now receive a data template with the 72 canonical attribute names and controlled vocabulary value lists. New supplier imports are mapped to the canonical schema at import time, and MicroPIM’s import validation flags non-conforming values before they enter the catalog. The attribute fragmentation problem has not recurred in the 9 months since the cleanup.

Search indexing: The storefront’s search system, which indexes product attributes for faceted search, reduced its attribute index from 340 entries to 72. Search index build time dropped from 4.2 hours to 38 minutes, which meant the nightly search index update that had been running past working hours was completing well before midnight.

Returns attributed to incorrect specification data: Product returns citing “not as described” for specification attributes (age suitability, material, size) dropped from 4.1% of orders to 1.8% in the 90 days following the cleanup. The return reduction was attributed to shoppers filtering by correct standardized values and finding products that genuinely matched their requirements.


Key Takeaways

  • Attribute fragmentation from multi-supplier onboarding is predictable when there is no canonical schema enforced at import time. The more suppliers onboarded without an attribute governance process, the worse the fragmentation becomes.
  • The classification work — mapping 340 source attribute names to 72 canonical names — is the hardest part of the cleanup and cannot be fully automated. It requires domain knowledge of what attributes actually mean and catalog familiarity to distinguish legitimate duplicates from genuinely distinct fields.
  • Controlled vocabularies for select-type attributes (color, age suitability, material) are the most important governance mechanism to implement. Without them, value fragmentation recurs even after attribute name standardization.
  • Executing the mapping in bulk rather than product-by-product is the difference between a multi-month project and a multi-week project. The bulk mapping in MicroPIM processed 25,000 attribute instances in 35 minutes; the equivalent manual work would have taken months.
  • The operational value of a clean attribute schema — efficient catalog queries, reliable filter results, faster search indexing — compounds over time. It is not just a cosmetic improvement.

Multi-supplier catalogs without attribute governance accumulate data debt with every new import. The cleanup cost grows with the catalog. If your storefront has duplicate filter facets, inconsistent attribute names, or multiple fields representing the same data, a product attribute cleanup is overdue. MicroPIM’s attribute mapping and bulk normalization tools are built for exactly this scenario. Create your account at app.micropim.net/register and run an attribute audit on your first import.



Frequently Asked Questions

How do you identify which attributes are true duplicates versus genuinely distinct fields with similar names?

The most reliable method is value sampling: for two suspect attributes, pull the values in use across a sample of 20-30 products and compare them. If “color_name” and “primary_color” both contain values like “red,” “blue,” “green,” they are almost certainly the same field under different names. If “item_color” contains standardized color codes and “color_description” contains free-text phrases like “deep forest green with silver accents,” they represent different data and should not be merged. In MicroPIM’s attribute manager, you can view the distinct values in use for any attribute, which makes this sampling quick without needing to open individual product records.
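If you have the sampled values in hand, the comparison can be roughed out as a set overlap. The sketch below uses Jaccard similarity on normalized values; the choice of measure and the sample data are assumptions for illustration, and the score is a hint for a reviewer rather than an automatic merge rule.

```python
def value_overlap(values_a, values_b):
    """Jaccard similarity of the distinct, normalized values of two attributes."""
    a = {str(v).strip().lower() for v in values_a}
    b = {str(v).strip().lower() for v in values_b}
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Illustrative samples, as if pulled from 20-30 products per attribute.
color_name = ["Red", "blue", "Green", "blue"]
primary_color = ["red", "Blue", "green", "yellow"]
item_color = ["C01", "C02"]
color_description = ["deep forest green with silver accents"]

same_field_score = value_overlap(color_name, primary_color)
different_field_score = value_overlap(item_color, color_description)
```

A high score suggests the same field under two names. A low score does not prove the fields are different: a supplier using hex codes and another using color names would score zero while describing the same thing, so low-scoring pairs with similar names still deserve a manual look.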

Can attribute mapping be run incrementally as new supplier imports arrive, or does it require a full catalog pass each time?

Attribute mapping can be applied at import time for new supplier data, so that incoming products are normalized to the canonical schema immediately rather than entering the catalog with supplier-specific attribute names. MicroPIM’s import templates support a field mapping configuration that translates supplier column names to canonical attribute names and applies value mapping tables during the import. This means the cleanup project is a one-time event for existing data, and ongoing governance prevents the problem from recurring for new imports.

What happens to historical product data for retired attributes after the cleanup?

Retired attribute data is preserved in MicroPIM’s data history for audit and reference purposes. Products that previously had values for a retired attribute retain a historical record of those values, accessible via the product’s change log. The retired attributes no longer appear in the active schema, filter configuration, or search index — they are invisible to the storefront and to catalog management workflows — but the underlying data is not deleted. If you later realize a retired attribute contained data that should have been mapped to a canonical field rather than retired, the historical values are still available to work with.

What is a realistic timeline for a product attribute cleanup project at this scale?

For a catalog of 1,000-2,000 products with 200-400 unique attribute names, allow 2-3 weeks of part-time effort for the classification and mapping work (done offline as a spreadsheet analysis), plus 1-2 days to execute the bulk mapping and review the flagged exceptions in MicroPIM. The cleanup itself (running the bulk mapping job) takes minutes to hours depending on catalog size. The majority of the time is the upfront classification work, which is primarily a human judgment task. Having a team member with strong catalog domain knowledge — who understands what each attribute means in context — is more important than the technical tooling for this phase.

Written by

Andrei M.

Founder MicroPIM

Entrepreneur and founder of MicroPIM, passionate about helping e-commerce businesses scale through smarter product data management.

"Your most unhappy customers are your greatest source of learning." — Bill Gates
