Yesterday we posted my NoSQL Tech Brief on GigaOM Pro. It’s an exciting time (both in the database market and for me as this is my first report for GigaOM Pro). Sweeping change is taking place across the database landscape. Read on to find out how and why – and what it means to you.
Carlo Strozzi coined the term NoSQL (“not only SQL”) in 1998, referring to a lightweight database that did not expose a SQL interface. In 2009, Eric Evans of Rackspace revived the term, whose meaning is still being debated by its originator, while organizing an event with Johan Oskarsson of Last.fm to discuss the growing number of non-relational distributed data stores.
No one likes this term. Attempting to describe something by what it isn’t typically doesn’t work—and this is about data store relationships and not about SQL at all. Yet NoSQL databases have significant advantages, including:

  • Seemingly infinite scalability
  • Extraordinary fault tolerance
  • High availability
  • A design-friendly lack of schema
  • Integration of both RESTful and cloud computing technologies

Disadvantages revolve around a basic fact: These are not relational databases built to rapidly process transactions, perform error checking, and maintain data integrity.
Today’s large-scale databases are designed to dynamically repair node failures by partitioning and replicating data across clusters. Partitioning the data not only minimizes the impact of any single hardware failure but also distributes the load of database operations. Nonrelational databases typically have the ability to maintain multiple hot copies of data. Nodes can fail or be added, and replicas can be rebuilt and moved on the fly. Some NoSQL databases are flexible enough to allow for control over which objects are stored on which replicas to improve performance and scalability.
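To make the partitioning-and-replication idea concrete, here is a minimal sketch in Python. It is not taken from any particular NoSQL product: the `Ring` class, the node names, and the choice of a simple consistent-hash ring are all assumptions made for illustration. Each key is assigned to a fixed number of distinct nodes, and when a node disappears the surviving copies keep serving the key while a new node takes over the missing replica.

```python
import hashlib
from bisect import bisect_right


class Ring:
    """Toy consistent-hash ring (hypothetical, for illustration only).

    Each key is stored on up to `replicas` distinct nodes.
    """

    def __init__(self, nodes, replicas=3):
        self.replicas = replicas
        self._ring = []  # sorted list of (hash, node_name) pairs
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        # MD5 is used only to spread names evenly, not for security.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        entry = (self._hash(node), node)
        if entry not in self._ring:
            self._ring.append(entry)
            self._ring.sort()

    def remove_node(self, node):
        # Simulates a node failure or decommission.
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def owners(self, key):
        """Return the nodes holding copies of `key`, primary first."""
        if not self._ring:
            return []
        hashes = [h for h, _ in self._ring]
        # First node clockwise from the key's position on the ring.
        start = bisect_right(hashes, self._hash(key)) % len(self._ring)
        count = min(self.replicas, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"], replicas=3)
    before = ring.owners("user:42")
    ring.remove_node(before[0])  # the primary copy's node fails
    after = ring.owners("user:42")
    print("before failure:", before)
    print("after failure: ", after)
```

In this sketch the two surviving replicas stay where they were after the failure, so only one new copy has to be built. Real systems layer much more on top (virtual nodes, rack awareness, read and write quorums), so treat this as a picture of the idea rather than a design.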
SQL RDBMS transaction processing is not going to disappear. Traditional database design principles still hold true when transactional integrity and immediate consistency are required. However, where horizontal scaling to millions of concurrent users is a requirement, nonrelational or NoSQL databases warrant serious consideration.
Has this posting piqued your curiosity? If so, read the full report on GigaOM Pro. The report includes a history of the database market and how we got to NoSQL, an overview of the current NoSQL landscape, evaluations of specific NoSQL software, and case studies demonstrating the utility of NoSQL for real businesses.

We had a lively discussion at Structure 2010 about Scaling Databases in the Cloud.

You can view an archived stream of the panel discussion at the Structure website, but for now here are some of the cool things we discussed:

  • Is your project appropriate for NoSQL?  Start by looking at the data model and the workload, and you’ll know what works best for your needs.
  • NoSQL is not necessarily any better for the cloud than SQL.  That’s not a compelling reason to use it.
  • Advantages of NoSQL include being designed from the ground up for parallel distributed processing and automated clustering, low latency, in-memory caching and concurrent processing
  • A major advantage of NoSQL databases is their ability to scale horizontally on commodity hardware
  • Another is the ability to recover transparently from hardware failure
  • Developers don’t like SQL to begin with so we’re not going to have to fire them all to have them learn NoSQL solutions
  • Neo4J says 70-80% of their implementations are in the enterprise and Terracotta says theirs are 50-60% in the enterprise

Stay tuned for more NoSQL and Structure 2010 info.

UPDATE: Watch the panel discussion, below.

Today is the big day, the culmination of three months of research and writing on NoSQL (not-only SQL). It’s the panel discussion at 8am at Structure 2010.

Scaling the Database in the Cloud

As we move our legacy applications to the cloud we are discovering that not all elements scale equally — in particular, legacy databases. In this panel we investigate the differing approaches taken by hot new technology startups and the options customers have when it comes to choosing between scaling legacy systems or transferring to new database platforms.

I’m moderating and speakers include:
Roger Bodamer, SVP Product and Engineering, 10gen
Emil Eifrem, CEO, Neo Technology
Mike Hoskins, CTO and GM, Integration Products, Pervasive Software
Paul Mikesell, Founder and CEO, Clustrix
Amit Pandey, CEO, Terracotta
James Phillips, Co-Founder and CSO, NorthScale
