Guides / Sending and managing data / Prepare your records for indexing

To ensure good performance, Algolia has a maximum limit on record size of 10 KB to 100 KB—based on your plan (10 KB maximum for the Build plan). If you exceed this limit, Algolia returns the error message Record is too big.

Instead of upgrading your plan, make your records smaller before sending them to Algolia. Do this by:

Remove unused attributes

You might not need to index every attribute from your data sources. For example, if you’re developing a website to display tweets on specific topics, Twitter may send you a lot of data, but only some will help when searching. For example, an excerpt of the Twitter API response might look like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[
  {
    "text": "Good morning #LaraconEU ! Swing by the Algolia booth for some vintage video games, awesome stickers, and lightning-fast search 😎",
    "truncated": false,
    "in_reply_to_user_id": null,
    "in_reply_to_status_id": null,
    "pictures": true,
    "pictures_urls": [
      "https://pbs.twimg.com/media/Dl1UftbX4AAF1Fj.jpg",
      "https://pbs.twimg.com/media/Dl1Um6OXsAAa1JS.jpg"
    ],
    "source": "<a href=\"http://twitter.com/\" rel=\"nofollow\">Twitter for Web</a>",
    "in_reply_to_screen_name": null,
    "in_reply_to_status_id_str": null,
    "retweeted": true,
    "retweet_count": 10,
    "like_count": 24,
    "place": null,
    "geolocated": false
  }
]

You can reduce this record’s size by:

  • Only indexing attributes that help build your search experience
  • Reusing an attribute for different purposes. For example, retweeted=false is the same as retweet_count=0
  • Adding attributes that rely on others. For example, if you want to use the pictures attribute to display a gallery (in the pictures_url array), only include it when the pictures_urls array isn’t empty.

After removing attributes

Choose only the information you need for searching, displaying, and ranking. Leave everything else out.

For example, after transforming the previous Twitter data, your record might look like this:

1
2
3
4
5
6
7
8
9
10
11
[
  {
    "text": "Good morning #LaraconEU ! Swing by the Algolia booth for some vintage video games, awesome stickers, and lightning-fast search 😎",
    "pictures_urls": [
      "https://pbs.twimg.com/media/Dl1UftbX4AAF1Fj.jpg",
      "https://pbs.twimg.com/media/Dl1Um6OXsAAa1JS.jpg"
    ],
    "retweet_count": 10,
    "like_count": 24
  }
]

Selecting only the necessary data reduces the number of attributes and record size without hurting search quality.

Did you find this page helpful?