Cosmos DB API for MongoDB vs Cosmos DB API for NoSQL vs MongoDB Atlas


In January 2023, as part of a project, I conducted several tests to compare three NoSQL database offerings. The purpose was to assess their performance and determine the most suitable option for future use.

The databases included in the test were:

| Database name | Configuration |
|---|---|
| MongoDB Atlas M30 | Dedicated, M30 (32 GB storage, 8 GB RAM, 2 vCPUs); Azure (westeurope) |
| Cosmos DB API for MongoDB | Provisioned, 10K RU/s autoscaling container; Azure (westeurope) |
| Cosmos DB API for NoSQL | Provisioned, 10K RU/s autoscaling container; Azure (westeurope) |

The application I worked on deals with various JSON documents and has two primary requirements:

  • Consumers can submit multiple documents at once, with a maximum of 1K documents per request. The expectation is for these documents to be processed as quickly as possible.
  • Consumers can read documents while applying various filtering and sorting rules. The service should respond promptly. On average, each “bucket” (partition) contains around 10K documents.

The main motivation behind conducting these tests was the disappointing performance of Cosmos DB API for MongoDB, which prompted the need to explore alternative options.

Cosmos DB limitations

One of the limitations of Cosmos DB, when compared to MongoDB, is the requirement for a mandatory index on a field that needs to be sorted. This limitation posed a challenge for us, as one of our requirements was the ability to sort by any field. To address this limitation, we had to resort to using a “Wildcard” index, which unfortunately has a negative impact on performance.

Another limitation is the concept of Request Units (RUs). In Cosmos DB, 1 RU is the cost of a point read of a 1 KB item, and writes cost more than reads. As a rough estimate we counted about 1 RU per KB written, so with an average document size of 4.5 KB each write consumes at least 4 RUs, in addition to the RUs consumed by index updates. It’s important to note that a single physical partition can serve at most 10K RU/s. Therefore, in our case, where a large number of documents need to be inserted into a single partition, the maximum throughput we can achieve is 10K RU/s. Provisioning a higher throughput would not make any difference in this scenario.
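To make the partition ceiling concrete, here is a minimal back-of-the-envelope sketch. It assumes the rough rule used above (about 1 RU per KB written) and a hypothetical flat index overhead per write, which we did not measure. It is an estimate, not the official RU pricing model.

```java
// Rough RU arithmetic for writes into a single partition.
// Assumption: ~1 RU per KB written, as estimated in the text; real charges are higher.
final class RuEstimate {

    /** Estimated RUs for one write: size-based cost plus a hypothetical index overhead. */
    static double ruPerWrite(double docSizeKb, double indexOverheadRu) {
        return docSizeKb + indexOverheadRu;
    }

    /** Lower bound, in seconds, for writing docCount documents under an RU/s cap. */
    static double minSeconds(int docCount, double ruPerWrite, double partitionRuPerSecond) {
        return docCount * ruPerWrite / partitionRuPerSecond;
    }

    public static void main(String[] args) {
        // 4.5 KB documents; index overhead is unknown here, so it is set to 0.
        double perWrite = ruPerWrite(4.5, 0.0);
        // 10K documents into one partition capped at 10K RU/s.
        double seconds = minSeconds(10_000, perWrite, 10_000.0);
        System.out.println("Estimated RU per write: " + perWrite);
        System.out.println("Minimum seconds for 10K writes: " + seconds);
    }
}
```

Whatever the real per-write charge is, the shape of the calculation stays the same: once all writes land in one partition, the per-partition cap bounds the insert time, not the provisioned total.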

However, despite these limitations, we still wanted to compare the Cosmos DB APIs before exploring alternative solutions outside of Azure. We were hopeful that a different API might better serve our needs, and we aimed to leverage as much as possible from Azure without venturing into external options.

Testing approach

The testing approach was the following:

  1. No frameworks, only drivers in use.
  2. Generate 10K documents in memory so they would be the same across databases.
  3. Average size of a document is 4.5KB.
  4. Do insertions and read the documents from all databases listed above in one test scenario.
  5. Measure execution time around driver insert and read commands.
  6. Tests are executed one after another, with a pause between them to let the request limits reset.
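Steps 5 and 6 can be sketched roughly as follows. This is not the original tool: the class and method names are hypothetical, and the workload is a stand-in. With the async clients, the `Runnable` would wrap the driver call and wait for it to complete.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical timing helper: measures wall-clock time around one operation
// and averages several runs with a pause in between.
final class TimedRuns {

    /** Runs the operation once and returns the elapsed time in milliseconds. */
    static long timeMillis(Runnable operation) {
        long start = System.nanoTime();
        operation.run(); // a real test would block here until the driver call finishes
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Repeats the operation, pausing between runs, and returns the average time in ms. */
    static double averageMillis(Runnable operation, int runs, long pauseMillis) {
        List<Long> samples = new ArrayList<>();
        for (int i = 0; i < runs; i++) {
            samples.add(timeMillis(operation));
            if (i < runs - 1) {
                try {
                    Thread.sleep(pauseMillis); // give request limits time to reset
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while pausing", e);
                }
            }
        }
        return samples.stream().mapToLong(Long::longValue).average().orElse(0.0);
    }

    public static void main(String[] args) {
        // Stand-in workload instead of a database call.
        Runnable standIn = () -> new StringBuilder("x".repeat(1_000_000)).reverse();
        System.out.println("average ms: " + averageMillis(standIn, 3, 100));
    }
}
```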

Building blocks

The tool that ran the tests had the following building blocks:

  • Kotlin 1.7.21
  • Java 17
  • com.azure:azure-cosmos:4.39.0 (Async Client)
  • org.mongodb:mongodb-driver-reactivestreams:4.8.1 (Async Client)

The code is not published for now; it may be added later. The important bit is which operations were used:

  • MongoDB — insertMany; find with sort, limit, skip and allowDiskUse(true).
  • Cosmos DB — executeBulkOperations; queryItems with SELECT, ORDER BY, OFFSET and LIMIT.
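Both read paths page through the data with an offset and a limit. The sketch below shows only that paging arithmetic and the shape of the Cosmos DB query text. The field name `createdAt` and the helper names are hypothetical, and no driver is called. On the MongoDB side the same offsets would feed skip and limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical paging helper: computes page offsets and builds the Cosmos DB SQL text.
final class PagedQueries {

    /** Offsets of every page needed to read totalDocs documents, pageSize at a time. */
    static List<Integer> pageOffsets(int totalDocs, int pageSize) {
        List<Integer> offsets = new ArrayList<>();
        for (int offset = 0; offset < totalDocs; offset += pageSize) {
            offsets.add(offset);
        }
        return offsets;
    }

    /** Cosmos DB SQL for one page, sorted descending on the given field path. */
    static String cosmosPageQuery(String sortField, int offset, int limit) {
        return "SELECT * FROM c ORDER BY c." + sortField
                + " DESC OFFSET " + offset + " LIMIT " + limit;
    }

    public static void main(String[] args) {
        List<Integer> offsets = pageOffsets(10_000, 200);
        System.out.println("pages: " + offsets.size());
        // "createdAt" is an assumed name for the created date-time field.
        System.out.println(cosmosPageQuery("createdAt", offsets.get(1), 200));
    }
}
```

With 10K documents, a page size of 200 means 50 sequential round trips, while a page size of 1K means 10.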

Environment

The code was executed from an AKS cluster located in Azure (westeurope), the same region as the three contestants. No memory or CPU limits were set, and the VM type was Standard_B4ms. The code ran as a Job in Kubernetes.

Rules for all tests

The following rules, which reflect how our application actually works, applied to all tests:

  • IDs for the documents are client generated.
  • By default, documents are sorted in descending order by the created date-time field.
  • Writes are unordered.
  • 10K generated documents are assigned to the same partition.
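A minimal sketch of test data that follows these rules is shown below. The field names (`id`, `partitionKey`, `createdAt`) and the document shape are assumptions for illustration, and the payload that brought the real documents to about 4.5 KB is omitted.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical generator: client-generated IDs, one shared partition,
// a created date-time field and a deeply nested sortable field a.b.c.d.e.f.
final class TestDocuments {

    static List<Map<String, Object>> generate(int count, String partitionKey) {
        List<Map<String, Object>> docs = new ArrayList<>(count);
        Instant base = Instant.parse("2023-01-01T00:00:00Z");
        for (int i = 0; i < count; i++) {
            Map<String, Object> doc = new LinkedHashMap<>();
            doc.put("id", UUID.randomUUID().toString());   // ID generated on the client
            doc.put("partitionKey", partitionKey);          // same partition for every document
            doc.put("createdAt", base.plusSeconds(i).toString());
            doc.put("a", nested(i));                        // yields a.b.c.d.e.f = i
            docs.add(doc);
        }
        return docs;
    }

    /** Builds the map {b: {c: {d: {e: {f: value}}}}}. */
    static Map<String, Object> nested(int value) {
        Object leaf = value;
        Map<String, Object> current = Map.of("f", leaf);
        for (String key : List.of("e", "d", "c", "b")) {
            current = Map.of(key, current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = generate(10_000, "bucket-1");
        System.out.println("generated: " + docs.size());
        System.out.println("first: " + docs.get(0));
    }
}
```

Generating the list once in memory and reusing it for every database keeps the input identical across the three contestants.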

Results

Let’s take a look at the numbers. The “Index” column indicates which index rules were set:

  • “No” — no indexes (only default for _id is present).
  • “Only needed” — index on created date time and a.b.c.d.e.f field.
  • “Wildcard” — index on all document properties.

Test Group 1, Insert

Inputs:

  • 10K documents total.
  • One insert has 1K documents.
  • Insertions are done sequentially.

| DB | Time, average (ms) | Index |
|---|---|---|
| MongoDB Atlas M30 | 4,127 | No |
| MongoDB Atlas M30 | 4,232 | Only needed |
| MongoDB Atlas M30 | 10,872 | Wildcard |
| Cosmos DB API for MongoDB | 12,015 | Only needed |
| Cosmos DB API for NoSQL | 15,605 | Only needed |
| Cosmos DB API for NoSQL | 16,304 | Wildcard |
| Cosmos DB API for MongoDB | 198,320 | Wildcard |

Test Group 2, Read

Inputs:

  • 10K documents read.
  • Read in pages of size 200.
  • Reads are done sequentially.

| DB | Time, average (ms) | Index |
|---|---|---|
| MongoDB Atlas M30 | 10,600 | Only needed |
| MongoDB Atlas M30 | 11,840 | No |
| Cosmos DB API for NoSQL | 14,618 | Only needed |
| Cosmos DB API for NoSQL | 21,407 | Wildcard |
| MongoDB Atlas M30 | 23,859 | Wildcard |
| Cosmos DB API for MongoDB | 78,674 | Wildcard |
| Cosmos DB API for MongoDB | 81,251 | Only needed |

Test Group 3, Read

Inputs:

  • 10K documents read.
  • Read in pages of size 1K.
  • Reads are done sequentially.

| DB | Time, average (ms) | Index |
|---|---|---|
| Cosmos DB API for NoSQL | 5,806 | Only needed |
| MongoDB Atlas M30 | 7,371 | Only needed |
| MongoDB Atlas M30 | 7,689 | No |
| Cosmos DB API for NoSQL | 8,581 | Wildcard |
| MongoDB Atlas M30 | 17,811 | Wildcard |
| Cosmos DB API for MongoDB | 71,197 | Wildcard |
| Cosmos DB API for MongoDB | 77,275 | Only needed |

Test Group 4, Read

Inputs:

  • 10K documents read.
  • Read in pages of size 200.
  • Sort by deep field (e.g. a.b.c.d.e.f).
  • Reads are done sequentially.

| DB | Time, average (ms) | Index |
|---|---|---|
| MongoDB Atlas M30 | 8,258 | Only needed |
| MongoDB Atlas M30 | 9,638 | No |
| Cosmos DB API for NoSQL | 11,286 | Only needed |
| Cosmos DB API for NoSQL | 16,278 | Wildcard |
| MongoDB Atlas M30 | 21,361 | Wildcard |
| Cosmos DB API for MongoDB | 78,993 | Wildcard |
| Cosmos DB API for MongoDB | 82,552 | Only needed |

Test Group 5, Read

Inputs:

  • 10K documents read.
  • Read in pages of size 1K.
  • Sort by deep field (e.g. a.b.c.d.e.f).
  • Reads are done sequentially.

| DB | Time, average (ms) | Index |
|---|---|---|
| Cosmos DB API for NoSQL | 5,560 | Only needed |
| MongoDB Atlas M30 | 6,500 | Only needed |
| MongoDB Atlas M30 | 7,025 | No |
| Cosmos DB API for NoSQL | 8,196 | Wildcard |
| MongoDB Atlas M30 | 18,075 | Wildcard |
| Cosmos DB API for MongoDB | 73,205 | Wildcard |
| Cosmos DB API for MongoDB | 79,765 | Only needed |

Summary

Based on the numbers collected, here is a summary and interpretation of the findings:

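  • Cosmos DB API for NoSQL is faster than Cosmos DB API for MongoDB in every read test, by a factor of roughly 4 to 14, and in inserts with a wildcard index. The one exception is inserts with only the needed indexes, where the MongoDB API was faster (12,015 ms vs 15,605 ms).
  • In the insertion tests, MongoDB Atlas M30 outperforms both Cosmos DB APIs for every index configuration tested. Even with a wildcard index (10,872 ms) it is faster than either Cosmos DB API with only the needed indexes.
  • In read tests, Cosmos DB API for NoSQL and MongoDB Atlas M30 show similar performance: with only the needed indexes, Atlas is faster with pages of 200 and Cosmos DB API for NoSQL is faster with pages of 1K.
  • MongoDB Atlas M30 without indexes performs on par with or even better than Cosmos DB API for NoSQL with indexes.
  • Cosmos DB API for MongoDB is the slowest in every read test, and with a wildcard index it is by far the slowest for inserts.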
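The relative differences behind these points are plain ratios of the averages reported in the tables. A small sketch of that arithmetic, using a few of the measured values (the helper name is hypothetical):

```java
// Ratio arithmetic over the averages from the result tables (all values in ms).
final class ResultRatios {

    /** How many times slower the first measurement is than the second. */
    static double slowdown(long slowerMs, long fasterMs) {
        return (double) slowerMs / fasterMs;
    }

    public static void main(String[] args) {
        // Test Group 1, wildcard index: Cosmos DB API for MongoDB vs MongoDB Atlas M30.
        System.out.println("insert, wildcard: " + slowdown(198_320, 10_872));
        // Test Group 2, only needed indexes: Cosmos DB API for MongoDB vs Cosmos DB API for NoSQL.
        System.out.println("read, pages of 200: " + slowdown(81_251, 14_618));
        // Test Group 5, only needed indexes: Cosmos DB API for MongoDB vs Cosmos DB API for NoSQL.
        System.out.println("read, deep sort, pages of 1K: " + slowdown(79_765, 5_560));
    }
}
```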

In essence, the only similarities between Cosmos DB API for MongoDB and MongoDB itself are the name and some shared APIs. For small applications with low loads and simple functionality, the differences between the two may not be significant. However, for applications with higher demands, such as ours, Cosmos DB may not be the optimal choice.
