Understanding Indexing in MongoDB

TLDR: Indexing is just a way of grouping documents, dividing collections into groups to speed up performance


The idea of indexing in MongoDB is similar to index of any book, They increase the speed of finding a page. The index in MongoDB increase the speed of finding documents

How do indexes work?

collectionName.createIndex({field:value}) //for creating indexcollectionName.dropIndex({field:value}) //for removing index

Here field is the “fieldName” which will be indexed. “Value” could be -1 or 1 or “text”.

It defines the type of index, 1 or -1 increase find() query performance whereas “text” is used for searching.

1 and -1 give the order of the index. Ascending = -1 & Descending =1

Now, How indexes work under the hood?

Imagine a collection of users, each document containing various information, one of them being score.

Let’s say we want all users to score 23.

When no index exists, MongoDB goes through each document to find the queried document, This is called Collection scan, MongoDB has a shorthand for this COLLSCAN (This is called Table scan in Relational databases)

How can we optimize this?

To optimize this we can create a table with one column for score and another column for references which will contain IDs of documents with that particular score. Now we only need to scan that table rather than scanning the whole database. This is much faster. This is exactly what an Index is.

Indexes help MongoDB narrow the dataset it will have to scan. This is called Index Scan, MongoDB has a shorthand for this too IXNSCAN

Here is a visual representation of a score Index and its mapping.

Performance improvement by Index is only visible when the number of documents crosses 100K or so.

You can compare it yourself by comparing two queries one with an indexed field and one without an Index

db.<collection name>.find(query).explain()

An Object will be returned

object.winingPlan.stage will tell the type of scan COLLSCAN or IXNSCAN

but it will not tell you the time taken in the execution

use explain(‘executionStat’) method before the query method like find

db.<collection name>explain('executionStat').find(query)

executionTimeMillis will tell you the time taken in the execution

Downsides of Index


this Video best explains B Tree


Finding Magic.