Deduplicate recommendations
With Recommend deduplication, you can refine your recommendations by removing item variants from your AI model training. The feature offers several advantages:
- Improves recommendation accuracy and speed by removing item variants before the training process.
- Improves recommendation quality by using events from all item variants.
- Separates deduplication from search so it doesn’t affect the search experience.
Recommend deduplication versus search deduplication
Algolia offers two types of deduplication:
- Search deduplication removes duplicate items from search results. This can help improve the relevance of the results and make them more user-friendly.
- Recommend deduplication removes duplicate items from Algolia AI recommendations. This can help improve the diversity of the recommendations and make them more personalized.
With Recommend deduplication, your AI models are trained with all variants, and the API then filters the recommendations to remove variants before returning the results.
How Recommend deduplication works
Recommend deduplication adds two processes to your AI model training:
- Pre-training process: generates a training dataset with only one variant per item. It also merges all events from variants of the same item.
- Post-training process: adds all the variants dropped during the pre-training process back into the final set of recommendations.
Item variants share the same recommendations.
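The two processes can be sketched in plain Python. This is a minimal illustration, not Algolia's implementation: the record fields (objectID, color), the event-count shape, and the helper names pre_training and post_training are hypothetical, and picking the first record seen as the representative is an assumption.

```python
from collections import defaultdict


def pre_training(records, events, distinct_attr):
    """Keep one representative per item and merge variant events onto it.

    records: list of dicts with an "objectID" and the distinct attribute.
    events: dict mapping objectID -> event count (hypothetical shape).
    """
    representative = {}            # attribute value -> representative objectID
    variants = defaultdict(list)   # attribute value -> all objectIDs
    merged_events = defaultdict(int)

    for record in records:
        key = record[distinct_attr]
        # Assumption: the first record seen represents the item.
        representative.setdefault(key, record["objectID"])
        variants[key].append(record["objectID"])
        merged_events[representative[key]] += events.get(record["objectID"], 0)

    return representative, dict(variants), dict(merged_events)


def post_training(recommendations, representative, variants):
    """Give every variant the recommendations of its representative."""
    expanded = {}
    for key, rep_id in representative.items():
        for object_id in variants[key]:
            expanded[object_id] = list(recommendations.get(rep_id, []))
    return expanded


records = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "green-s", "color": "green"},
    {"objectID": "green-m", "color": "green"},
]
events = {"red-xs": 4, "green-s": 2, "green-m": 3}

representative, variants, merged = pre_training(records, events, "color")
# Stand-in for the trained model's output, keyed by representative.
trained = {"red-xs": ["green-s"], "green-s": ["red-xs"]}
shared = post_training(trained, representative, variants)
```

In this sketch, the events of green-s and green-m are summed onto the single green representative before training, and both green variants end up with the same recommendation list afterwards.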
Set up the deduplication for a model
To deduplicate your recommendations, you must first declare an attribute for distinguishing variants, then turn on deduplication when configuring a Recommend model. After that, verify that the recommendations have been deduplicated.
Configure an attribute for distinguishing variants
First, choose which attribute defines records as variants:
- Go to the Algolia dashboard and select your Algolia application.
- On the left sidebar, select Search.
- Select your Algolia index.
- On the Configuration tab, go to the Deduplication and Grouping page.
- In the Attribute for Distinct box, select or enter the attribute name you want to use to define variants.
Only use the distinct option if you also want to deduplicate search results.
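For reference, the dashboard steps above correspond to the attributeForDistinct index setting. The sketch below only builds the settings payload as a Python dictionary and doesn't call the Algolia API. The helper build_variant_settings and the color attribute are hypothetical, and how you send the payload depends on your API client.

```python
def build_variant_settings(attribute, deduplicate_search=False):
    """Build index settings that declare the attribute defining variants."""
    settings = {"attributeForDistinct": attribute}
    if deduplicate_search:
        # Only set `distinct` if search results should be deduplicated too.
        settings["distinct"] = True
    return settings


# Recommend deduplication only: search results stay untouched.
recommend_only = build_variant_settings("color")
# Recommend and search deduplication together.
search_too = build_variant_settings("color", deduplicate_search=True)
```

Leaving distinct out of the payload keeps the search experience unchanged while still telling Recommend which attribute groups variants.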
Enable Recommend deduplication on your model
- Go to the Algolia dashboard and select your Algolia application.
- On the left sidebar, select Recommend.
- Create a new Recommend model or edit an existing one for the index you used when setting the attributeForDistinct.
- In the section Distinct and deduplication of recommendations, select the Deduplicate recommendations option. The attribute you selected for defining variants is shown.
- Continue to configure your Recommend model and click Save.
Verify the recommendations
To check that the deduplication is working for your recommendations, revisit the model configuration once the training process has finished:
- Go to the Preview section.
- Use the Search for a record box to search for an item that should have variants.
This displays the list of recommendations for the selected item. They shouldn’t contain any variants.
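If you export or fetch the recommendations, the same check can be scripted. The following is a minimal sketch under stated assumptions: each recommendation is a dictionary that carries the attribute used for distinguishing variants, and the helper find_variant_duplicates is hypothetical. It treats a value as a problem if it repeats among the recommendations or matches the base item's own value.

```python
from collections import Counter


def find_variant_duplicates(base_item, recommendations, distinct_attr):
    """Return attribute values that break deduplication.

    A value is reported if it appears more than once among the
    recommendations or if it matches the base item's own value.
    """
    counts = Counter(rec[distinct_attr] for rec in recommendations)
    problems = {value for value, count in counts.items() if count > 1}
    if base_item[distinct_attr] in counts:
        problems.add(base_item[distinct_attr])
    return sorted(problems)


base = {"objectID": "green-s", "color": "green"}
deduplicated = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "blue-l", "color": "blue"},
]
not_deduplicated = [
    {"objectID": "green-m", "color": "green"},
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "blue-l", "color": "blue"},
]

clean = find_variant_duplicates(base, deduplicated, "color")
dirty = find_variant_duplicates(base, not_deduplicated, "color")
```

An empty result for an item suggests deduplication is working for it; a non-empty result lists the attribute values to investigate.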
Examples
The following examples illustrate how Recommend deduplication works. The index has records for T-shirts in different colors and sizes:
- One red T-shirt in one size (XS)
- Two green T-shirts in two sizes (S, M)
- Three blue T-shirts in three sizes (L, XL, XXL)
This example uses the Related Products model to recommend the top 3 similar items with and without deduplication (with the color attribute configured as attributeForDistinct).
Without deduplication
Without deduplication, recommendations include variants, such as blue (XXL) or green (M), except for the red T-shirt which doesn’t have any.
Base item | Recommendation 1 | Recommendation 2 | Recommendation 3 |
---|---|---|---|
red (XS) | green (S) | blue (L) | blue (XL) |
green (S) | green (M) | red (XS) | blue (L) |
blue (L) | blue (XL) | blue (XXL) | green (M) |
With deduplication
With deduplication, the recommendations don’t include any variants. However, since this example dataset only includes three distinct items (one per color), it can only generate two recommendations for each base item. If you introduce a new item, such as an orange T-shirt, it’s added as a third recommendation.
Base item | Recommendation 1 | Recommendation 2 | Recommendation 3 |
---|---|---|---|
red (XS) | green (S) | blue (L) | Doesn’t apply |
green (S) | red (XS) | blue (L) | Doesn’t apply |
blue (L) | red (XS) | green (M) | Doesn’t apply |
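The number of recommendations available in this example follows from counting distinct items. The sketch below is an illustration under the assumption that each group of variants counts as one item and that a base item's own group is never recommended to it, as in the table above. The helper max_deduplicated_recommendations and the objectID values are hypothetical.

```python
def max_deduplicated_recommendations(records, distinct_attr, requested):
    """Upper bound on recommendations for one base item after deduplication."""
    distinct_items = {record[distinct_attr] for record in records}
    # The base item's own group of variants can't be recommended to it.
    return min(requested, max(len(distinct_items) - 1, 0))


tshirts = [
    {"objectID": "red-xs", "color": "red"},
    {"objectID": "green-s", "color": "green"},
    {"objectID": "green-m", "color": "green"},
    {"objectID": "blue-l", "color": "blue"},
    {"objectID": "blue-xl", "color": "blue"},
    {"objectID": "blue-xxl", "color": "blue"},
]

limit = max_deduplicated_recommendations(tshirts, "color", requested=3)

# Adding an orange T-shirt introduces a fourth distinct item.
with_orange = tshirts + [{"objectID": "orange-m", "color": "orange"}]
limit_with_orange = max_deduplicated_recommendations(
    with_orange, "color", requested=3
)
```

With six records in three colors, the bound is two recommendations; adding the orange T-shirt raises it to the requested three.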