As a follow-on to my recent post about the upcoming features in MongoDB Version 3.2, I wanted to mention a specific feature that hasn't made headlines, but is nonetheless quite interesting to developers: Bitwise Queries.

MongoDB has supported bitwise update operations for some time. But if you wanted to find, or atomically find-and-modify, documents with fields matching a certain bit pattern, you were out of luck. Now, as of Developer Preview 3.1.6, bitwise query support is built in.
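As a rough sketch of what such a query might look like, the snippet below builds filter documents with the `$bitsAllSet` and `$bitsAnyClear` query operators that ship with this feature. The field name `flags` and the helper functions are hypothetical, and the filters are built as plain Python dictionaries so the example runs without a server. With a driver such as PyMongo, they would be passed to `find()`.

```python
def mask_from_positions(positions):
    """Combine zero-based bit positions into a single integer bitmask."""
    mask = 0
    for pos in positions:
        mask |= 1 << pos
    return mask


def all_set_filter(field, positions):
    """Filter matching documents where every listed bit of `field` is 1."""
    return {field: {"$bitsAllSet": mask_from_positions(positions)}}


def any_clear_filter(field, positions):
    """Filter matching documents where at least one listed bit of `field` is 0."""
    return {field: {"$bitsAnyClear": mask_from_positions(positions)}}


# Hypothetical field name "flags": match documents with bits 1 and 5 both set.
query = all_set_filter("flags", [1, 5])
print(query)
```

The same operators also accept a list of bit positions in place of a numeric mask, so the mask helper here is a convenience rather than a requirement.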

Example

For data collection systems with many multi-select lists and checkboxes to deal with, this could be a particularly useful feature. You may be storing these lists as an array that uses at least 4 bytes (an int32) for each selection the user makes. With a multi-select offering, say, 30 options or checkboxes, a user who checks all of them takes up 120 bytes.

If you instead store this information bitwise in a single 32-bit value, for example, the element occupies 4 bytes regardless of which checkboxes the user checks.
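Here is a minimal sketch of that packing, assuming each checkbox is identified by a zero-based index. The names `NUM_OPTIONS`, `encode_selections`, and `decode_selections` are made up for illustration.

```python
NUM_OPTIONS = 30  # matches the 30-checkbox example above


def encode_selections(selected):
    """Pack zero-based option indexes into one integer bitmask."""
    value = 0
    for index in selected:
        if not 0 <= index < NUM_OPTIONS:
            raise ValueError(f"option index out of range: {index}")
        value |= 1 << index
    return value


def decode_selections(value):
    """Return the sorted option indexes whose bits are set in `value`."""
    return [i for i in range(NUM_OPTIONS) if value & (1 << i)]


# Even with every box checked, the value still fits in a signed int32.
all_checked = encode_selections(range(NUM_OPTIONS))
assert all_checked < 2**31

# Round trip: the stored integer recovers the original selections.
stored = encode_selections([0, 3, 29])
assert decode_selections(stored) == [0, 3, 29]
```

Capping the list at 30 options keeps the sign bit of a signed 32-bit integer untouched, which avoids surprises with negative values.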

Yes, a savings of 116 bytes seems negligible. But apply it to 20 such fields per document, 500 documents per user per quarter, and 1000 users, and it adds up! In this example, it adds up to a savings of about 1GB per quarter, per database, by my calculations. If you had 25 such databases, that's more than 100GB in space savings per year.
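The arithmetic behind that estimate can be checked in a few lines. This sketch assumes the figures multiply together (each of the 1000 users producing 500 documents per quarter) and, like the estimate itself, counts only 4 bytes per array entry, ignoring any per-element storage overhead. The variable names are illustrative.

```python
BYTES_PER_INT32 = 4
OPTIONS = 30               # checkboxes per multi-select field
FIELDS_PER_DOCUMENT = 20
DOCUMENTS_PER_USER = 500   # per quarter (assumed to be per user)
USERS = 1000
DATABASES = 25
QUARTERS_PER_YEAR = 4

# Worst case for the array approach: every box checked.
array_bytes = OPTIONS * BYTES_PER_INT32          # 120 bytes
bitmask_bytes = BYTES_PER_INT32                  # 4 bytes, always
saved_per_field = array_bytes - bitmask_bytes    # 116 bytes

saved_per_quarter = (
    saved_per_field * FIELDS_PER_DOCUMENT * DOCUMENTS_PER_USER * USERS
)
saved_per_year = saved_per_quarter * QUARTERS_PER_YEAR * DATABASES

print(f"per quarter, per database: {saved_per_quarter / 1e9:.2f} GB")
print(f"per year, {DATABASES} databases: {saved_per_year / 1e9:.0f} GB")
```

Under these assumptions the savings work out to 1.16 GB per quarter per database and 116 GB per year across 25 databases, in line with the rounded figures above.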

Bitwise Storage Can Improve Overall Performance

To quote a MongoDB blog post regarding data compression:

Size is one factor, and there are others. Disk I/O latency is dominated by seek time on rotational storage. By decreasing the size of the data, fewer disk seeks will be necessary to retrieve a given quantity of data, and disk I/O throughput will improve. In terms of RAM, some compressed formats can be used without decompressing the data in memory. In these cases more data can fit in RAM, which improves performance.

So there's more to be gained by implementing bitwise data storage schemes than simply the cost savings on storage media.

Another subtle implication of this bitwise approach is that updates to these types of elements in a given document can occur "in-place" without expanding the size of the document.
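To illustrate, here is a hedged sketch of update documents built with MongoDB's existing `$bit` update operator, which toggles individual checkboxes without changing the width of the stored integer. The field name `choices`, the 30-option limit, and the helper functions are assumptions for this example, and `apply_bit_update` is only a local stand-in for what the server would do.

```python
NUM_OPTIONS = 30  # assumed limit so every mask fits in a signed int32


def _mask(positions):
    """Build a bitmask from zero-based option indexes."""
    return sum(1 << p for p in set(positions))


def set_bits_update(field, positions):
    """Update document that checks the given options (bitwise OR)."""
    return {"$bit": {field: {"or": _mask(positions)}}}


def clear_bits_update(field, positions):
    """Update document that unchecks the given options (bitwise AND)."""
    keep = ~_mask(positions) & ((1 << NUM_OPTIONS) - 1)
    return {"$bit": {field: {"and": keep}}}


def apply_bit_update(value, update, field):
    """Locally mimic the effect of a $bit update on an integer value."""
    for op, operand in update["$bit"][field].items():
        if op == "or":
            value |= operand
        elif op == "and":
            value &= operand
        elif op == "xor":
            value ^= operand
    return value


value = 0
value = apply_bit_update(value, set_bits_update("choices", [0, 3]), "choices")
value = apply_bit_update(value, clear_bits_update("choices", [0]), "choices")
assert value == 8          # only option 3 remains checked
assert value < 2**31       # same 4-byte integer before and after
```

Because every mask stays below 2^30, the stored value never needs to grow beyond a 32-bit integer, whichever boxes are checked or unchecked.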