Short explanation on how to add or update fields in documents referencing the values of other already existing fields by leveraging the aggregation pipeline framework.
Table of Contents
Initial document situation
Let’s assume you have the following documents with hourly temperature values in a collection called temperatures
…
[
{
_id: ObjectId("62508bd0742bfb98b29dbe71"),
date: ISODate("2022-04-08T08:00:00.000Z"),
tempC: 7.3
},
{
_id: ObjectId("62508bf0742bfb98b29dbe8c"),
date: ISODate("2022-04-08T09:00:00.000Z"),
tempC: 7.8
},
{
_id: ObjectId("62508c02742bfb98b29dbe93"),
date: ISODate("2022-04-08T10:00:00.000Z"),
tempC: 8.5
}
]
The given temperature in field tempC
is in degrees Celsius but you may also need the temperature in degrees Fahrenheit. For various reasons it’ll make sense to have the Fahrenheit values persisted in MongoDB instead of calculating them always on-the-fly in your application.
So you want to add a field tempF
to every document which holds the temperature in Fahrenheit. The calculation formula for that would be easy: tempF = tempC * 1.8 + 32
. But how to achieve that in MongoDB?
Caveats with common update operators
Trying to solve this simple appearing task with the basic MongoDB functions, you will quickly face the following problems:
- The $set operator used to add new fields or update existing ones cannot be used together with expressions. It only takes plain values, so it is not possible to reference other fields.
- The traditional MongoDB update operators like
$mul
and$inc
which would be needed here to calculatetempF
are not sufficient. This is because they are only made to change an existing field in-line together with a fix value, e.g. “add 10 to a field” or “multiply a field by 2”.
So what is the way to go for with MongoDB to add a new field or update it when the resulting value references another existing field?
Updating documents using aggregation pipelines
Starting with MongoDB 4.2, it is possible to use the powerful aggregation pipeline with updates. This enables the usage of aggregation pipeline operators in normal update statements. These operators are more flexible than the traditional ones and allow expressions referencing other fields by using the ‘$…’ field path notation.
Having this in mind, we can now add the needed field using a $set
aggregation pipeline stage using $add
and $multiply
as following…
db.temperatures.updateMany(
{},
[{ $set: { tempF: { $add: [ { $multiply: ['$tempC', 1.8] }, 32] } } }]
);
The new field tempF
is now added to every document based on the already existing tempC
field…
[
{
_id: ObjectId("62508bd0742bfb98b29dbe71"),
date: ISODate("2022-04-08T08:00:00.000Z"),
tempC: 7.3,
tempF: 45.14
},
{
_id: ObjectId("62508bf0742bfb98b29dbe8c"),
date: ISODate("2022-04-08T09:00:00.000Z"),
tempC: 7.8,
tempF: 46.04
},
{
_id: ObjectId("62508c02742bfb98b29dbe93"),
date: ISODate("2022-04-08T10:00:00.000Z"),
tempC: 8.5,
tempF: 47.3
}
]
Note: As the $set aggregation operator would overwrite the value of the specified field if it already exists in the document, this approach also works perfectly for updating fields.
At the end of the day, that was easy – right? 😉