So I jumped on the bandwagon and joined all the cool kids in scorning relational databases and playing with CouchDB (for a sort of long and excellent wrap up on this versus that, read this). I installed it, read through a lot of docs, thought I understood it pretty well and immediately started searching the tubes for what Ruby framework/library/gizmo would best allow me to get kickstarted with using it on a new project.
Turns out that was a bit premature, as my brain couldn’t really handle trying to model the domain of my intended application onto this totally new way of thinking so abruptly. It’s like when you’re a native Portuguese speaker and you’re drunk, trying to make yourself pass as speaking Spanish and you keep spilling out the German you’ve been learning for the past two years. I’m sure everyone can relate to that.
Time to take a step back.
The best way I found to get more familiar with this new type of database was to get rid of all the mental cruft I had around it. So I forgot about my app, its data model and the web framework and went on to play with the database alone.
I had seen this blog post about how to store hierarchical data in CouchDB and decided to play with the example data and views the guy provided (big thanks to Paul Bonser!). [Note: If you can follow that, you don't need to read this, as it's basically an expanded version of one of his examples.]
This is the data:
{
"docs": [
{"_id":"Food", "path":["Food"]},
{"_id":"Fruit", "path":["Food","Fruit"]},
{"_id":"Red", "path":["Food","Fruit","Red"]},
{"_id":"Cherry", "path":["Food","Fruit","Red","Cherry"]},
{"_id":"Tomato", "path":["Food","Fruit","Red","Tomato"]},
{"_id":"Yellow", "path":["Food","Fruit","Yellow"]},
{"_id":"Banana", "path":["Food","Fruit","Yellow","Banana"]},
{"_id":"Meat", "path":["Food","Meat"]},
{"_id":"Beef", "path":["Food","Meat","Beef"]},
{"_id":"Pork", "path":["Food","Meat","Pork"]}
]
}
Which corresponds to this tree:

Food
  Fruit
    Red
      Cherry
      Tomato
    Yellow
      Banana
  Meat
    Beef
    Pork
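If you'd rather not take my word for it, here's a rough sketch that rebuilds the hierarchy from the path arrays in plain JavaScript (run it with Node, no CouchDB involved). The helper names buildTree and renderTree are made up for this illustration.

```javascript
// The paths of the ten documents from the data snippet.
const paths = [
  ["Food"],
  ["Food", "Fruit"],
  ["Food", "Fruit", "Red"],
  ["Food", "Fruit", "Red", "Cherry"],
  ["Food", "Fruit", "Red", "Tomato"],
  ["Food", "Fruit", "Yellow"],
  ["Food", "Fruit", "Yellow", "Banana"],
  ["Food", "Meat"],
  ["Food", "Meat", "Beef"],
  ["Food", "Meat", "Pork"]
];

// buildTree (hypothetical helper): walk each path from the root and
// create one nested object per node the first time it is seen.
function buildTree(paths) {
  const root = {};
  for (const path of paths) {
    let node = root;
    for (const name of path) {
      if (!(name in node)) node[name] = {};
      node = node[name];
    }
  }
  return root;
}

// renderTree (hypothetical helper): return the tree as indented lines,
// two spaces per level.
function renderTree(node, depth = 0) {
  let lines = [];
  for (const name of Object.keys(node)) {
    lines.push("  ".repeat(depth) + name);
    lines = lines.concat(renderTree(node[name], depth + 1));
  }
  return lines;
}

const tree = buildTree(paths);
console.log(renderTree(tree).join("\n"));
```

Each document only knows its own path, yet that is enough to recover the whole tree.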
To create a database and import this data, save that snippet as a file somewhere (I’ll use /tmp/data.json). CouchDB talks to the world in HTTP. We’re gonna use curl for that so you really see what’s going on. Web browsers are for wimps.
Usually CouchDB runs locally on 127.0.0.1, port 5984. To create a new database all you need to do is PUT to that address with the name you want your DB to have in the URL and no payload. We’ll save that URL in a variable because we’re gonna use it a lot.
DB="http://127.0.0.1:5984/hierarchical_data"
curl -v -X PUT $DB
Here -v means verbose, and -X lets you choose the HTTP method.
We have our database, now we import the data using the bulk document API. We specify the payload (data) with -d and read it from the file by prefixing its path with a @ (that’s one curl trick I didn’t know about!). Recent CouchDB versions also want to be told the payload is JSON, hence the -H header:
curl -v -H 'Content-Type: application/json' -d @/tmp/data.json -X POST $DB/_bulk_docs
And this is the view I was interested in (it lists all descendants of a node, including itself). JSON strings can’t contain raw line breaks, so the map function sits on a single line:
{
  "language": "javascript",
  "views": {
    "descendants": {
      "map": "function(doc) { for (var i in doc.path) { emit(doc.path[i], doc); } }"
    }
  }
}
This thing about views going through all objects in your database took a little time to sink in with me. Initially I thought the query took place in the view, that I would somehow pass the node from which I wanted the descendants as the doc argument to that function. That’s not how it works. The query actually takes place in the view parameters, and the view function itself only flattens everything out into a convenient array so you can query it better.
This view I just mentioned, for example, doesn’t actually give you the elements in a sub-tree. It goes through each object (document) in the database and adds it to the array of results once for each element in its path, that is, once for itself and once for each of its ancestors.
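To convince yourself, you can fake the map phase outside the database. This is just a sketch in plain JavaScript: runMap and the emit it hands to the function are stand-ins I made up (inside CouchDB, emit is simply available to the map function), and I emit '' instead of doc to keep the rows short.

```javascript
// Rebuild the ten documents from their paths; _id is the last path element.
const paths = [
  ["Food"], ["Food", "Fruit"], ["Food", "Fruit", "Red"],
  ["Food", "Fruit", "Red", "Cherry"], ["Food", "Fruit", "Red", "Tomato"],
  ["Food", "Fruit", "Yellow"], ["Food", "Fruit", "Yellow", "Banana"],
  ["Food", "Meat"], ["Food", "Meat", "Beef"], ["Food", "Meat", "Pork"]
];
const docs = paths.map((path) => ({ _id: path[path.length - 1], path: path }));

// runMap (hypothetical): call the map function once per document and
// record one row per emit call, tagged with the emitting document's id.
function runMap(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    const emit = (key, value) => rows.push({ id: doc._id, key: key, value: value });
    mapFn(doc, emit);
  }
  return rows;
}

// Same logic as the view: one emit per element of the document's path.
const rows = runMap(docs, function (doc, emit) {
  for (let i = 0; i < doc.path.length; i++) {
    emit(doc.path[i], "");
  }
});

console.log(rows.length); // 29
console.log(rows.filter((row) => row.id === "Cherry").map((row) => row.key));
// [ 'Food', 'Fruit', 'Red', 'Cherry' ]
```

Notice that nothing here asks a question yet. The function just spreads every document over the keys it should be findable under.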
To see it for yourself, save it in a file somewhere (/tmp/view.json in my case) and add it to the database. We do that by creating a special design document:
curl -v -H 'Content-Type: application/json' -d @/tmp/view.json -X PUT $DB/_design/tree
Now, to run it, just execute:
curl -v -X GET $DB/_design/tree/_view/descendants
Or see it in the browser: http://localhost:5984/hierarchical_data/_design/tree/_view/descendants
This is what you get:
{"total_rows":29,"offset":0,"rows":[
{"id":"Banana","key":"Banana","value":""},
{"id":"Beef","key":"Beef","value":""},
{"id":"Cherry","key":"Cherry","value":""},
{"id":"Banana","key":"Food","value":""},
{"id":"Beef","key":"Food","value":""},
{"id":"Cherry","key":"Food","value":""},
{"id":"Food","key":"Food","value":""},
{"id":"Fruit","key":"Food","value":""},
{"id":"Meat","key":"Food","value":""},
{"id":"Pork","key":"Food","value":""},
{"id":"Red","key":"Food","value":""},
{"id":"Tomato","key":"Food","value":""},
{"id":"Yellow","key":"Food","value":""},
{"id":"Banana","key":"Fruit","value":""},
{"id":"Cherry","key":"Fruit","value":""},
{"id":"Fruit","key":"Fruit","value":""},
{"id":"Red","key":"Fruit","value":""},
{"id":"Tomato","key":"Fruit","value":""},
{"id":"Yellow","key":"Fruit","value":""},
{"id":"Beef","key":"Meat","value":""},
{"id":"Meat","key":"Meat","value":""},
{"id":"Pork","key":"Meat","value":""},
{"id":"Pork","key":"Pork","value":""},
{"id":"Cherry","key":"Red","value":""},
{"id":"Red","key":"Red","value":""},
{"id":"Tomato","key":"Red","value":""},
{"id":"Tomato","key":"Tomato","value":""},
{"id":"Banana","key":"Yellow","value":""},
{"id":"Yellow","key":"Yellow","value":""}
]}
As you can see, there was no view parameter in that call, and this looks nothing like a list of descendants. Each emit call is responsible for one line of the output, which contains the id of the doc object, a key and a value. [I replaced doc as the value in the original emit call with '' to make it more readable.] Note that lines do not appear in the order they were emitted (otherwise you’d see lines with the same id grouped together). CouchDB automatically sorts them by key. Another thing you’ll notice is that, for any given key, the ids of the lines with that key are exactly that element and all of its descendants. Convenient, huh?
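Here's a small sketch of that sorting and grouping in plain JavaScript. The compare function is my own simplification: CouchDB has its own collation rules, but for these plain capitalized keys an ordinary string comparison should give the same order. Rows that share a key are ordered by document id.

```javascript
const paths = [
  ["Food"], ["Food", "Fruit"], ["Food", "Fruit", "Red"],
  ["Food", "Fruit", "Red", "Cherry"], ["Food", "Fruit", "Red", "Tomato"],
  ["Food", "Fruit", "Yellow"], ["Food", "Fruit", "Yellow", "Banana"],
  ["Food", "Meat"], ["Food", "Meat", "Beef"], ["Food", "Meat", "Pork"]
];

// One row per (document, path element), in emit order.
const rows = [];
for (const path of paths) {
  const id = path[path.length - 1];
  for (const key of path) rows.push({ id: id, key: key });
}

// Simplified stand-in for CouchDB's collation: plain string comparison.
function compare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Sort by key first, then by document id within the same key.
rows.sort((a, b) => compare(a.key, b.key) || compare(a.id, b.id));

// Group the sorted ids under their key.
const idsByKey = {};
for (const row of rows) {
  if (!idsByKey[row.key]) idsByKey[row.key] = [];
  idsByKey[row.key].push(row.id);
}

console.log(rows.slice(0, 4));
console.log(idsByKey["Red"]); // [ 'Cherry', 'Red', 'Tomato' ]
```

Each entry in idsByKey is a node plus everything below it, which is exactly what makes the key lookup work.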
Now, to get the descendants of one particular node, just query the view with that node’s name in the key:
curl -v -X GET 'http://localhost:5984/hierarchical_data/_design/tree/_view/descendants?key="Fruit"'
Or, again, use the browser: http://localhost:5984/hierarchical_data/_design/tree/_view/descendants?key=%22Fruit%22
And bingo!
{"total_rows":29,"offset":13,"rows":[
{"id":"Banana","key":"Fruit","value":""},
{"id":"Cherry","key":"Fruit","value":""},
{"id":"Fruit","key":"Fruit","value":""},
{"id":"Red","key":"Fruit","value":""},
{"id":"Tomato","key":"Fruit","value":""},
{"id":"Yellow","key":"Fruit","value":""}
]}
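To tie it together, here's a rough imitation of what the key parameter does, again in plain JavaScript with no CouchDB involved. queryByKey is a made-up helper; the field names just mirror the response above, and I'm assuming offset counts the rows that sort before the first match, which is what the numbers in that response suggest.

```javascript
const paths = [
  ["Food"], ["Food", "Fruit"], ["Food", "Fruit", "Red"],
  ["Food", "Fruit", "Red", "Cherry"], ["Food", "Fruit", "Red", "Tomato"],
  ["Food", "Fruit", "Yellow"], ["Food", "Fruit", "Yellow", "Banana"],
  ["Food", "Meat"], ["Food", "Meat", "Beef"], ["Food", "Meat", "Pork"]
];

// One row per (document, path element), as the view would emit them.
const allRows = [];
for (const path of paths) {
  const id = path[path.length - 1];
  for (const key of path) allRows.push({ id: id, key: key, value: "" });
}

// queryByKey (hypothetical): keep the rows whose key matches exactly,
// ordered by id, and count how many rows have a smaller key.
function queryByKey(rows, key) {
  const matches = rows
    .filter((row) => row.key === key)
    .sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  const offset = rows.filter((row) => row.key < key).length;
  return { total_rows: rows.length, offset: offset, rows: matches };
}

const result = queryByKey(allRows, "Fruit");
console.log(result.total_rows, result.offset); // 29 13
console.log(result.rows.map((row) => row.id));
// [ 'Banana', 'Cherry', 'Fruit', 'Red', 'Tomato', 'Yellow' ]
```

The map function never saw the word "Fruit" as a question. The lookup happens entirely on the keys it emitted.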

