Well hello there, welcome to Dieren Theater!
Right now you are probably asking yourself a legitimate question: what is Dieren Theater? Dieren Theater is a project that offers everyone structured, open data extracted (in fact, scraped) from the website of the Belgian parliament (also known as La Chambre or De Kamer).
You might also be asking yourself "Why? What is the point of that?". Well, open data is one of the first steps for us citizens to take ownership of political information (political as in « the life of the city ») and of the way it is treated and displayed, to move from passive consumers to active creators, and to get involved in the democratic process.
Sounds too theoretical? Let me show you some examples:
None of those would have been possible without data, and none of this has happened yet (to my knowledge) in Belgium. So Dieren Theater is an attempt to build the first step by providing data to everyone, thereby allowing everyone to build similar tools or analyses.
It is interesting to note that (except for Baptiste Coulmont's work) all of these websites are Free software (as in speech).
For the moment this project can be considered an alpha, not because it is unstable but because it is very young: part of the data is missing (not scraped yet) and the available data (listed below) will probably need polishing.
In the long term it is entirely possible to imagine parsing websites similar to (lachambre|dekamer).be (like the senate website).
Info: these links point to a web interface that gives you an idea of the available data and its format. To get the actual data, use the API below.
Warning: small differences between the data displayed on the web interface and the data returned by the API are possible. If you notice one, please report a bug.
Some other data, generated by me or given to me. Like the rest of the data, everything is licensed under the ODbL.
As requested by several people, here are dumps of each collection of the database. All the data is in JSON since it comes from MongoDB. To save bandwidth, I compress them using xz:
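If you would rather not decompress a dump by hand, here is a minimal Python sketch for reading one directly. It assumes the dump holds one JSON document per line, which is a guess on my part about the export format and not something stated here; `read_dump` and the file path you pass to it are hypothetical names used only for illustration.

```python
import json
import lzma


def read_dump(path):
    """Yield one decoded document per non-empty line of an xz-compressed dump.

    Assumption: the dump contains one JSON document per line.
    """
    with lzma.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

If a dump turns out to be a single JSON array instead, `json.load` on the opened handle would be the thing to try.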
Okay, let's get to the fun part: how to get the data!
Everything starts here:
And in a more convenient way (in a shell, if you are using a Unix-based operating system):
curl "http://www.dierentheater.be/api/v1/?format=json" | python -m json.tool
The API is self-describing, so I won't write much here about the data in it. If you want an idea of what is available, just browse it via the links in the previous section.
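Since the API describes itself, a script can discover the available resources instead of hard-coding them. The sketch below assumes the API root answers with the usual tastypie index, that is a JSON object mapping each resource name to an entry containing a "list_endpoint" key; I have not checked that against this particular deployment. `list_endpoints` and `fetch_index` are hypothetical helper names.

```python
import json
from urllib.request import urlopen

API_ROOT = "http://www.dierentheater.be/api/v1/?format=json"


def list_endpoints(index):
    """Map each resource name to its list endpoint, given the decoded API root.

    Assumption: entries look like {"list_endpoint": "...", "schema": "..."}.
    Anything that does not match that shape is skipped.
    """
    return {
        name: entry["list_endpoint"]
        for name, entry in index.items()
        if isinstance(entry, dict) and "list_endpoint" in entry
    }


def fetch_index(url=API_ROOT):
    """Download and decode the API root (this performs a network request)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `list_endpoints(fetch_index())` should then give you a dictionary of resource names and the paths to query them.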
An interesting feature of the API is that you can add the GET argument "lachambre_id" to the URL to query one exact item, like this:
curl "http://www.dierentheater.be/api/v1/document/?format=json&lachambre_id=2005" | python -m json.tool
Whenever possible, query an individual document this way if you want to build something that is supposed to last: the id field (coming from MongoDB) isn't reliable at all for the moment.
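The same lookup can be done from Python. This is only a sketch using the standard library: `build_url` and `fetch` are hypothetical helper names, and the only things taken from this page are the API root, the "format" argument and the "lachambre_id" filter on the document resource.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "http://www.dierentheater.be/api/v1/"


def build_url(resource="", **filters):
    """Build an API URL for a resource, always asking for JSON."""
    params = {"format": "json"}
    params.update(filters)
    path = resource.strip("/")
    base = API_ROOT + (path + "/" if path else "")
    return base + "?" + urlencode(params)


def fetch(resource="", **filters):
    """Fetch the URL built by build_url and decode the JSON body (network request)."""
    with urlopen(build_url(resource, **filters)) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `fetch("document", lachambre_id=2005)` requests the same URL as the curl command above, so anything you store on your side can be keyed on lachambre_id instead of the id field.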
You can find the tastypie API documentation here.
I hope you'll have fun playing with it!
This website is built in Python, using django-nonrel, a fork of Django that uses MongoDB instead of a SQL database. Parsing is done using the very awesome BeautifulSoup, with long sessions under IPython. The API is done with a patched version of tastypie-nonrel. And I use Bootstrap to make you believe that I know how to make a pretty website.
So you want to help? Well, thanks a lot for your offer!
So, how can you do this?
All data provided by this website is licensed under the Open Data Commons Open Database License (ODbL), also known as the "CC-BY-SA of databases".