even a relatively simple mobile application requires more than one database to function properly.
The good news? There are great open-source alternatives to MongoDB, each with its own strengths, weaknesses and optimal use cases. When it comes to NoSQL databases, there is no silver bullet; the real difficulty is knowing which database is right for your specific problems. While a number of options are available, these are the top five MongoDB alternatives that developers and data engineers should know.
MongoDB excels if you need significant scalability and caching for real-time analytics, or if you have no clear schema definition.
- Pros - MongoDB documents can store unstructured data in a way that is relatively straightforward to update, which helps when you are de-normalizing your database schema. It also offers a high insert rate, which is useful when the write load is high.
- Cons - It is not built for transactional data (such as accounting systems). There are no stored procedures or functions in which to bind business logic. As with most NoSQL solutions, it is not strongly ACID-compliant. While MongoDB does not provide durability by default, it does let you tune the configuration of a replica set if you are willing to sacrifice some performance.
- Best for - Mobile apps, content management, real-time analytics, and applications involving IoT.
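To make the "no clear schema definition" point concrete, here is a minimal sketch in plain Python of what a schema-less, de-normalized collection looks like. It does not use a MongoDB driver. The collection name, the field names and the `insert` and `find` helpers are all hypothetical and only mimic the shape of a document store.

```python
import json

# A "collection" is modeled as a plain list of dicts; this is NOT the MongoDB API.
sensor_readings = []

def insert(collection, document):
    """Append a document as-is; no schema is enforced."""
    collection.append(document)

def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

# Documents in the same collection can carry different fields.
insert(sensor_readings, {"device": "thermostat-1", "temp_c": 21.5})
insert(sensor_readings, {"device": "door-7", "open": True, "battery": {"pct": 83}})
# De-normalized: the owner is embedded instead of referenced through a join.
insert(sensor_readings, {"device": "thermostat-1", "temp_c": 22.0,
                         "owner": {"name": "Ana", "plan": "pro"}})

matches = find(sensor_readings, device="thermostat-1")
print(json.dumps(matches, indent=2))
```

Embedding the owner inside the reading is the trade-off the Pros bullet describes: reads need no join, but an owner change has to be written to every document that embeds it.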
Apache Cassandra is a massively scalable open-source non-relational database. Its strengths include continuous availability, linear scale performance, operational simplicity and easy data distribution across multiple data centers and cloud availability zones. Cassandra was originally developed at Facebook and designed for massive scalability. It is written in Java and can be deployed on BSD, Linux, OS X, and Windows.
- Pros – Cassandra scales while remaining reliable. It is easy to deploy across multiple servers, in part because Cassandra handles replication with minimal configuration. Lightweight Transactions (LWT), introduced in Cassandra 2.0, bring ACID capabilities to your data model when needed.
- Cons – No ad-hoc queries, meaning that you must model your data around the queries you want to serve, rather than around the structure of the data itself. If ad-hoc analysis is a requirement for your application, Cassandra may not be for you. While newer versions of Cassandra have limited support for aggregations within a single partition, aggregations are resource intensive even when possible.
- Best for - Systems with a very heavy, scalable write load that also need a responsive reporting layer built on top of the stored data, e.g. real-time analytics.
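The "model your data around the queries" constraint can be sketched in plain Python. This is not a Cassandra client. The table name `events_by_user` and every helper below are hypothetical; they only imitate a table with a partition key (the user) and a clustering key (the timestamp).

```python
from collections import defaultdict

# partition key -> list of (clustering key, row), kept sorted by clustering key.
# This imitates the shape of a query-first table; it is NOT a Cassandra client.
events_by_user = defaultdict(list)

def write_event(user_id, timestamp, action):
    """Locate the partition, add the row, keep clustering order."""
    partition = events_by_user[user_id]
    partition.append((timestamp, {"action": action}))
    partition.sort(key=lambda item: item[0])

def recent_events(user_id, limit):
    """The one query this table was designed for: newest events for one user."""
    return [row for _, row in reversed(events_by_user[user_id][-limit:])]

def events_by_action(action):
    """An ad-hoc query the model was not built for: it scans every partition."""
    return [
        (user_id, ts)
        for user_id, partition in events_by_user.items()
        for ts, row in partition
        if row["action"] == action
    ]

write_event("u1", 100, "login")
write_event("u2", 105, "login")
write_event("u1", 110, "purchase")
write_event("u1", 120, "logout")

print(recent_events("u1", 2))
print(events_by_action("login"))
```

`recent_events` touches a single partition, while `events_by_action` has to walk all of them. In a real cluster that second pattern is the one you would avoid, typically by writing the same data into a second table keyed by action.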
If you need a high performance, high-speed database, Redis is hard to beat (but there are some important caveats). It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. You can run atomic operations on these types. Redis’ speedy performance is based on its working with an in-memory dataset.
And there’s the rub. It works best with clearly defined, limited datasets. It is typically not a standalone database (Craigslist deploys Redis alongside a primary database). It is best used when you have rapidly changing data with a foreseeable database size that fits, or mostly fits, in memory. Examples include real-time stock prices, analytics, communication and leaderboards.
- Pros - A high performance, exquisitely fast database. Redis can be a good way to increase the speed of an existing application, including those in which quick access to specific keys is a priority.
- Cons - You have to decide in advance how you want to store and organize your data, so Redis is not well suited for prototyping. Because it works with in-memory datasets, capacity is bounded by available memory and it does not scale well to large datasets. There is no official support for Windows builds, although Microsoft develops and maintains a Win-64 port of Redis.
- Best for - Redis is the way to go if you need a highly scalable data store shared by multiple processes, multiple applications, or multiple servers.
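The leaderboard use case maps naturally onto Redis sorted sets. The sketch below is a plain-Python stand-in, not the Redis client API: the `Leaderboard` class and its method names are hypothetical, and the comments name the Redis commands (ZADD, ZINCRBY, ZREVRANGE) they loosely correspond to. Breaking ties by member name is an assumption made for this sketch.

```python
class Leaderboard:
    """In-memory stand-in for a sorted set; a sketch, not the Redis client API."""

    def __init__(self):
        self._scores = {}  # member -> score

    def add(self, member, score):
        """Set a member's score, similar in spirit to ZADD."""
        self._scores[member] = score

    def increment(self, member, amount):
        """Add to a member's score, similar in spirit to ZINCRBY."""
        self._scores[member] = self._scores.get(member, 0) + amount
        return self._scores[member]

    def top(self, count):
        """Highest scores first, similar in spirit to ZREVRANGE ... WITHSCORES."""
        ranked = sorted(self._scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:count]

board = Leaderboard()
board.add("alice", 120)
board.add("bob", 95)
board.add("carol", 110)
board.increment("bob", 30)  # bob now has 125

print(board.top(2))
```

Everything here lives in one process's memory, which is the same property that makes Redis fast and that ties its capacity to available RAM.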
Written in Erlang, CouchDB is a document-oriented database that can be deployed as a standalone application framework or with high-volume, distributed applications. It is especially useful for web applications that handle huge amounts of loosely structured data, thanks to its simple model for storing, processing, and accessing data. It is especially suited to CRM and CMS systems. Its capacity for master-master replication allows for easy multi-site deployments.
- Pros - Database consistency that is based on bi-directional replication. Also, CouchDB functions as a web server, meaning, in the words of one of its creators, J. Chris Anderson, “you can serve applications directly to the browser without any middle tier.” In essence, you can start up a CouchDB server and start serving web pages right away. It speaks the languages of the web (HTTP, JSON, HTML), so you can start writing "CouchApps" without another layer on top of the database.
- Cons – Needs occasional compacting.
- Best for - Web applications that handle huge amounts of loosely structured data, thanks to its simple model for storing, processing, and accessing data.
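Because CouchDB speaks HTTP and JSON, storing a document is just a web request. The sketch below uses only the Python standard library and builds the request without sending it. The server address, the database name `articles`, the document ID and the `build_put_document` helper are hypothetical, and authentication is left out.

```python
import json
import urllib.request

def build_put_document(base_url, database, doc_id, document):
    """Build (but do not send) an HTTP PUT that would store a JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{database}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical server address, database name and document.
request = build_put_document(
    "http://localhost:5984", "articles", "mongodb-alternatives",
    {"title": "MongoDB alternatives", "tags": ["nosql", "couchdb"]},
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To send it against a running server: urllib.request.urlopen(request)
```

No driver or middle tier is involved, which is the point of the quote above: any tool that can issue an HTTP request can talk to the database.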
Written in Java for hosting very large tables, Apache HBase provides random, real-time access to data in Hadoop. It is especially well suited for storing multi-structured or sparse data. HBase is a great option “for storing semi-structured data like log data and then providing that data very quickly to users or applications integrated with HBase.” It is especially useful for any use case in which scanning huge, two-dimensional, join-less tables is a requirement.
- Pros - Users are able to query HBase for a particular point in time, making “flashback” queries possible.
- Cons - Not optimized for classic transactional applications or even relational analytics.
- Best for - Storing multi-structured or sparse data.
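The "flashback" queries mentioned in the Pros rest on keeping several timestamped versions of each cell. A minimal sketch of that idea in plain Python follows. It is not the HBase client API: the `put` and `get_as_of` helpers, the row key and the column name are hypothetical.

```python
import bisect
from collections import defaultdict

# (row key, column) -> parallel sorted lists of timestamps and values.
# This mimics timestamped cell versions; it is NOT the HBase client API.
_timestamps = defaultdict(list)
_values = defaultdict(list)

def put(row, column, timestamp, value):
    """Store a new version of a cell, keeping versions ordered by timestamp."""
    cell = (row, column)
    index = bisect.bisect_right(_timestamps[cell], timestamp)
    _timestamps[cell].insert(index, timestamp)
    _values[cell].insert(index, value)

def get_as_of(row, column, timestamp):
    """Return the newest value written at or before `timestamp`, else None."""
    cell = (row, column)
    index = bisect.bisect_right(_timestamps[cell], timestamp)
    return _values[cell][index - 1] if index else None

put("host-42", "log:status", 1000, "starting")
put("host-42", "log:status", 2000, "healthy")
put("host-42", "log:status", 3000, "degraded")

print(get_as_of("host-42", "log:status", 2500))
print(get_as_of("host-42", "log:status", 500))
```

A read "as of" time 2500 sees the value written at 2000 and ignores the later write, which is what makes point-in-time queries possible.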
MySQL has established itself as “the leading database choice for web-based applications, used by high profile web properties including Facebook, Twitter, YouTube, and all five of the top five websites. Additionally, it is an extremely popular choice as embedded database, distributed by thousands of ISVs and OEMs.” MySQL is a central component of the LAMP open-source web application software stack (and other "AMP" stacks; LAMP being an acronym for "Linux, Apache, MySQL, Perl/PHP/Python").
- Pros – Optimized for web applications, it has become the “go to” platform for web developers and the default database for web-based applications. It is well suited to small and medium-sized websites and is commonly deployed “for php and java based web applications that require a DB storage backend.” It remains open source, although with proprietary features that can be purchased from Oracle. It has a large and loyal following among developers who sing its praises.
- Cons - Some issues with stability and clustering. With the regular version, it can be challenging to set up a consistent database cluster. Whether MySQL supports transactions depends on the storage engine: InnoDB does, MyISAM does not.
- Best suited for - OLTP workloads. If it is architected properly, MySQL can scale to thousands of queries per second.
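To show what an OLTP-style transaction looks like in practice, here is a short sketch. It uses Python's built-in `sqlite3` module purely as a runnable stand-in, so it is not MySQL, and the `accounts` table and the `transfer` and `balances` helpers are hypothetical. With MySQL the same all-or-nothing pattern would assume a transactional storage engine.

```python
import sqlite3

# sqlite3 is used only as a runnable stand-in; this is not MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

def transfer(connection, sender, receiver, amount):
    """Move funds atomically: both updates commit together or neither does."""
    try:
        with connection:  # commits on success, rolls back on exception
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender))
            (balance,) = connection.execute(
                "SELECT balance FROM accounts WHERE name = ?", (sender,)).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver))
        return True
    except ValueError:
        return False

def balances(connection):
    """Return the current balance of every account as a dict."""
    return dict(connection.execute("SELECT name, balance FROM accounts"))

print(transfer(conn, "alice", "bob", 30), balances(conn))
print(transfer(conn, "alice", "bob", 500), balances(conn))
```

The second transfer fails the balance check, so the debit that was already applied is rolled back and neither account changes.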
The range of NoSQL databases is significant, and each has its own optimal use cases. Therefore, when making a choice, it is crucial to know “where in your data — and where in its connections — the business value lies.” Whatever you choose, Panoply gives you a single data management solution that will connect Redis, MongoDB, Cassandra, CouchDB, MySQL and more without having to write a single line of code.