Thoughts on MongoDB Document Encryption on Windows Azure

My company was asked to deploy a system managed by an Indonesian government department to Windows Azure infrastructure. The biggest concern enterprises and governments have about deploying internal information systems to third-party cloud infrastructure is security. There should be a way to prevent access to the data by any party other than the owner of the system, and that includes the cloud infrastructure owner.

In my case, the system uses Node.js as its development platform and MongoDB as its database. MongoDB document encryption needs to be implemented in addition to any built-in security measures.

Please note that this post is a record of my research so far, not a guideline. As the project has not started yet, I cannot know which of the alternatives described here actually work. I will update later with a more practical guide.

Encryption

Securing data means applying encryption at two primary levels:

  • “Data-in-motion” refers to data moving over the wire, which can be readily protected with SSL/TLS.
  • “Data-at-rest” refers to data kept on storage. This level is trickier.

As stated, the data are stored and managed in MongoDB, and MongoDB doesn’t come with a built-in encryption mechanism for “data-at-rest”. This means that anyone who gains access to the machine hosting the MongoDB data files can see the contained data just by loading those files into a mongod process. Moreover, the MongoDB data files are partly human-readable and can be opened in any text editor. That leaves sensitive data exposed to untrusted eyes.

MongoDB documents can be encrypted by encrypting each sensitive field before it is stored, hence the name field-based encryption. If you already have an existing application, it needs to be changed so that it encrypts data before writing to MongoDB and decrypts the data after reading from MongoDB, so that the data remain usable within the application. That can require major changes in the app, depending on how many fields need to be encrypted and how many parts of the application touch those fields.
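
As a rough illustration of what field-based encryption could look like in Node.js, here is a minimal sketch using the built-in crypto module with AES-256-GCM. The function names encryptField and decryptField, the storage layout (IV, authentication tag and ciphertext concatenated, then base64-encoded) and the randomly generated key are all assumptions made for the example. A real application would load the key from somewhere outside the database host.

```javascript
const crypto = require('crypto');

// Assumption: a 32-byte key that would normally be loaded from outside the
// database host. It is generated here only so the sketch runs on its own.
const key = crypto.randomBytes(32);

// Encrypt a single field value. Returns base64(iv | authTag | ciphertext).
function encryptField(plaintext, key) {
  const iv = crypto.randomBytes(12); // fresh random nonce for every value
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(String(plaintext), 'utf8'),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag(); // 16 bytes by default
  return Buffer.concat([iv, tag, ciphertext]).toString('base64');
}

// Reverse of encryptField. Throws if the value was tampered with
// or if the wrong key is used.
function decryptField(encoded, key) {
  const raw = Buffer.from(encoded, 'base64');
  const iv = raw.subarray(0, 12);
  const tag = raw.subarray(12, 28);
  const ciphertext = raw.subarray(28);
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([
    decipher.update(ciphertext),
    decipher.final(),
  ]).toString('utf8');
}

// The string in `stored` is what would be written to MongoDB
// in place of the plain value.
const stored = encryptField('3174012345678901', key);
console.log(decryptField(stored, key));
```

Every place in the application that reads or writes such a field would have to go through functions like these, which is where the cost of this approach comes from.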

Another alternative is to encrypt the file system where the MongoDB data files are stored. That may seem easy, but the MongoDB process (mongod) cannot read an encrypted file system by default. It requires either a modification to the mongod binary or an intermediary between mongod and the encrypted file system.

Gazzang

To deal with an encrypted file system, the MongoDB-recommended option is a third-party Linux package called Gazzang zNcrypt. It has built-in support for MongoDB that allows the mongod process to read data from and store data to an encrypted file system. The encryption key can be stored outside the MongoDB host OS, either on premises or on the Gazzang Key Store Service cloud server. More information:

  • http://www.gazzang.com/products/zncrypt/mongodb
  • http://www.mongodb.com/presentations/securing-data-mongodb-gazzang

Unfortunately, although it seems to be the best solution available now, Gazzang’s pricing took us by surprise. It’s way too expensive for our project. Maybe it’s not that expensive for you. If you have the budget, you should really consider this solution. As for me, I have to consider other approaches as alternatives to Gazzang.

Other Alternatives

Other alternatives my team has come up with so far:

1. TrueCrypt
TrueCrypt allows on-the-fly encryption and decryption of disk volumes, partitions, or folders. The idea is to encrypt the MongoDB data folder with TrueCrypt. The challenge is how to set things up so that the mongod process can still read the data. One approach is to put the mongod binary inside the encrypted folder, along with the MongoDB data. This needs more exploration.

2. Partial field-based encryption
I’ve explained that field-based encryption requires changing the application and takes time, but maybe we can deal only with the fields that contain sensitive data instead of all fields in all documents. This still requires application changes, but the time needed can be reduced.
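
To make the idea concrete, here is a minimal sketch that encrypts only a chosen list of fields and leaves the rest of the document untouched. It uses Node.js's built-in crypto module with AES-256-GCM. The names SENSITIVE_FIELDS, seal, unseal, protectDocument and revealDocument, the sample field names, and the randomly generated key are all hypothetical and only there for illustration.

```javascript
const crypto = require('crypto');

// Hypothetical field names; which fields are sensitive depends on the app.
const SENSITIVE_FIELDS = ['nationalId', 'address'];

// Assumption: the real key would come from outside the database host.
const key = crypto.randomBytes(32);

// Encrypt one value (serialized as JSON) with AES-256-GCM.
// Output is base64(iv | authTag | ciphertext).
function seal(value, key) {
  const iv = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', key, iv);
  const body = Buffer.concat([
    cipher.update(JSON.stringify(value), 'utf8'),
    cipher.final(),
  ]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString('base64');
}

// Reverse of seal: returns the original value.
function unseal(encoded, key) {
  const raw = Buffer.from(encoded, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  const text = Buffer.concat([
    decipher.update(raw.subarray(28)),
    decipher.final(),
  ]).toString('utf8');
  return JSON.parse(text);
}

// Return a copy of the document with only the listed fields encrypted.
function protectDocument(doc, key, fields = SENSITIVE_FIELDS) {
  const out = { ...doc };
  for (const name of fields) {
    if (out[name] !== undefined) out[name] = seal(out[name], key);
  }
  return out;
}

// Return a copy of the document with the listed fields decrypted.
function revealDocument(doc, key, fields = SENSITIVE_FIELDS) {
  const out = { ...doc };
  for (const name of fields) {
    if (out[name] !== undefined) out[name] = unseal(out[name], key);
  }
  return out;
}

const citizen = { name: 'Budi', nationalId: '3174012345678901', address: { city: 'Jakarta' } };
const stored = protectDocument(citizen, key); // what would be saved to MongoDB
console.log(stored.name);
console.log(revealDocument(stored, key).address.city);
```

One trade-off to keep in mind: because each value is encrypted with a fresh random IV, the same plaintext produces different ciphertext every time, so the encrypted fields can no longer be searched or sorted on the MongoDB side. Fields used in queries would need a different treatment.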

3. Trend Micro SecureCloud
Another alternative is based on the recent announcement from Trend Micro that its SecureCloud product supports Windows Azure. It basically encrypts and decrypts virtual volumes at rest in real time without affecting how applications and services work. For control over data security, cipher keys can be managed remotely. While it seems very feasible for our use case, we need to explore this alternative further as it’s very new, especially on Azure, and documentation and resources are quite scarce for now.

So we contacted the Indonesian representative of Trend Micro to ask for pricing. For the number of MongoDB instances we plan to run, the pricing seems OK, but this needs further discussion and exploration.


That’s it for now. Hopefully I can follow up later with more practical guidelines, after the project is approved and finished :)