ifacethoughts

Scaling By Eliminating The Database

Frank Sommers discusses an idea presented by Robert McIntosh about building a high-volume application without an RDBMS. The idea itself is not new; I have seen references to scalability problems converging on the database. But the detail with which Robert McIntosh tries to achieve some advantages of the database, like indexing, through other means is engaging.

Again, SQL is wonderful for queries, especially dynamic queries. This could be countered with using either XPath style queries or a search engine like Lucene, or even a combination of the two. Yes, yes, nothing really compares the power of SQL for query capabilities and I know that. The point is that it would really depend on the kind of queries needed. If you are not doing analytical queries and really only need “get object where property is like X” style queries, then other methods can provide that.
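To make the “get object where property is like X” case concrete, here is a minimal sketch in Python. It assumes a small in-memory store with an inverted index over the words of each property. The `PropertyIndex` class and its method names are hypothetical, invented for illustration. This is neither Robert McIntosh's implementation nor Lucene's API, only a rough picture of the kind of lookup a search index provides without SQL.

```python
from collections import defaultdict


class PropertyIndex:
    """Toy in-memory object store with an inverted index over property words."""

    def __init__(self):
        self._objects = {}              # object id -> dict of properties
        self._index = defaultdict(set)  # (property name, word) -> object ids

    def add(self, obj_id, properties):
        """Store the object and index every word of every property value."""
        self._objects[obj_id] = properties
        for name, value in properties.items():
            for word in str(value).lower().split():
                self._index[(name, word)].add(obj_id)

    def find(self, name, word):
        """Return objects whose property `name` contains `word` (case-insensitive)."""
        ids = self._index.get((name, word.lower()), set())
        return [self._objects[i] for i in sorted(ids)]


store = PropertyIndex()
store.add(1, {"title": "Scaling without a database"})
store.add(2, {"title": "Database indexing basics"})
store.add(3, {"title": "Print publishing"})

matches = store.find("title", "database")
print([m["title"] for m in matches])
```

A lookup like this answers simple property queries quickly, but it offers nothing for joins or ad hoc analytical queries, which is where the quoted caveat about "the kind of queries needed" applies.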

In spite of too many keywords there, the words that caught my attention were “it would really depend on the kind of queries needed”. As usual, I am trying to see why we should eliminate the database. There have been thoughts on how database scalability can be achieved.

I have gone down this path earlier, especially when thinking about single-source publishing. Can we use the output of print publishing tools directly for online rendering? I believe there can be a significant advantage, because this would eliminate a lot of administration and process. Of course, there are also companion tools nowadays that export the markup, but that is not efficient in the context of many CMSs. This line of thought has always ended in a hybrid model, where the actual content is stored in files and the pointers to them in the database.
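The hybrid model can be sketched roughly as follows, assuming SQLite for the pointers and plain files for the content. The table layout, the `store_article` and `load_article` helpers and the `.html` extension are all hypothetical choices made for this example, not a description of any particular CMS.

```python
import sqlite3
import tempfile
from pathlib import Path


def store_article(db, content_dir, slug, title, body):
    """Write the body to a file and record only its path in the database."""
    path = Path(content_dir) / f"{slug}.html"
    path.write_text(body, encoding="utf-8")
    db.execute(
        "INSERT INTO articles (slug, title, path) VALUES (?, ?, ?)",
        (slug, title, str(path)),
    )
    db.commit()


def load_article(db, slug):
    """Look up the pointer in the database, then read the content from disk."""
    row = db.execute(
        "SELECT title, path FROM articles WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        return None
    title, path = row
    return title, Path(path).read_text(encoding="utf-8")


# Assumed setup: an in-memory database and a temporary content directory.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE articles (slug TEXT PRIMARY KEY, title TEXT, path TEXT)")
content_dir = tempfile.mkdtemp()

store_article(db, content_dir, "scaling", "Scaling", "<p>Exported markup</p>")
print(load_article(db, "scaling"))
```

The database stays small because it holds only metadata and paths, while the exported markup lives where the publishing tools put it. The cost is that the file and its pointer can drift apart, since nothing ties the file write to the database commit.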

I consider a database necessary today for two reasons: along with the main content we serve a lot of related content, and we need data integrity. The related content includes everything from bits and pieces from the past and contextual information to contextual advertisements. For example, even for a simple activity like blogging, related posts are considered helpful. Data integrity is enforced through various mechanisms, from transactions to primary and foreign key constraints.

Robert McIntosh has already presented alternatives like Lucene to create indexes that speed up retrieval. Indexing is one of the biggest advantages of a database today. However, I am not able to find replacements for relating content and for data integrity. Of course, some of this can be moved into the software by building a domain model, but not all of it.
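As a rough illustration of what moving integrity into a domain model might look like, the sketch below keeps related posts consistent in application code. The `Blog` class and `IntegrityError` exception are hypothetical names made up for this example. It covers only uniqueness and referential checks, roughly what primary and foreign keys give; it does nothing for transactions or durability, which is the part that is harder to move out of the database.

```python
class IntegrityError(Exception):
    """Raised when an operation would leave the model inconsistent."""


class Blog:
    """Toy domain model that enforces key-style constraints in software."""

    def __init__(self):
        self._posts = {}    # post id -> title (ids are unique, like a primary key)
        self._related = {}  # post id -> set of related post ids

    def add_post(self, post_id, title):
        if post_id in self._posts:
            raise IntegrityError(f"duplicate post id: {post_id}")
        self._posts[post_id] = title
        self._related[post_id] = set()

    def relate(self, a, b):
        """Link two existing posts in both directions (like a foreign key check)."""
        if a == b:
            raise IntegrityError("a post cannot be related to itself")
        for post_id in (a, b):
            if post_id not in self._posts:
                raise IntegrityError(f"unknown post id: {post_id}")
        self._related[a].add(b)
        self._related[b].add(a)

    def remove_post(self, post_id):
        """Delete a post and every link pointing at it, so no reference dangles."""
        if post_id not in self._posts:
            raise IntegrityError(f"unknown post id: {post_id}")
        for other in self._related.pop(post_id):
            self._related[other].discard(post_id)
        del self._posts[post_id]

    def related_titles(self, post_id):
        return sorted(self._posts[r] for r in self._related[post_id])


blog = Blog()
blog.add_post(1, "Scaling by eliminating the database")
blog.add_post(2, "Single-source publishing")
blog.relate(1, 2)
print(blog.related_titles(1))
```

Every rule here has to be written, tested and kept in sync by hand, whereas a database declares the same constraints once in the schema.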

However, a database does create problems, in scalability and sometimes in modeling the content. Because of some of its advantages it has become indispensable. Being able to create scalable applications without a database would open up a lot of options in some scenarios.


This is the weblog of Abhijit Nadgouda, where he writes down his thoughts on software development and related topics.