My name is Simon McManus. I work as a web developer. After recently attending a UKGovBarCamp I noticed that it was difficult to reuse Parliament's publications. I made a comment on a parliamentary blog post, which resulted in Richard contacting me via e-mail. The fact that I was able to comment in the first place has made it possible for me to speak to you now. Thank you for this opportunity.
The essential dissatisfaction I have with the parliament website is that the information is not being published for re-use. In this paper I will explain what I mean by this, why I believe it and offer some alternative solutions. I would be more than happy to come and discuss this with you further and would appreciate any feedback that you might have.
Websites like Wikipedia demonstrate how conversations can take place around information. For each article there is a discussions tab which allows readers and authors to discuss the articles. If you would like the same thing to occur around your meeting transcripts and legislation you need to change the format to make the data referenceable, commentable and easily queried by a programming language.
I believe there are five steps to opening up Parliamentary data:
Each of these steps is discussed below.
Publishing data online holds little value if the data has not been licensed for reuse. I suggest all parliamentary publications be made available under a Creative Commons licence, so that anyone can republish them, commercially or otherwise.
It is important that the raw data dumps which make any application possible are also available to the public. The data should be available with no style information and no scripting: nothing but pure, unadulterated data. While there is value in Parliament building websites and applications, it is far more important that developers have equal access to the original data, so that they can build other applications unaffected by the preconceptions of parliament.uk developers.
Currently most parliamentary publications are in PDF form. This causes a number of problems:
TheyWorkForYou.com has put together a basic template of how Parliament can improve the semantics of parliamentary publications. More details of their suggestions can be found at the following address:
http://www.theyworkforyou.com/freeourbills/techy
I fully endorse these suggestions. If followed, they would make the data more meaningful, and so make it a great deal easier for developers like me to build new and richer interfaces.
When writing this paper it was particularly difficult to find the references in the transcripts of your Committee's meetings. The reference was sent to me in the following form:
"The transcript of the meetings the Committee has had as part of its inquiry are available here:
http://www.publications.parliament.uk/pa/ld/lduncorr.htm#info
(See in particular questions 78, 85 and 86 of the 1 April meeting)."
Finding the information required the following steps to be taken:
I would like to see an implementation where clicking the following three URLs would take you straight to view each question, allow you to read its answer and comment against either.
If the information is published in HTML files which are indexed by Google, the bills will be findable in Google, extending your outreach to every single user of Google.
The simplest way to expose data on the web is to break it down into small individually addressable sections each of which has a unique URL. These URLs can then be sent round in emails, added to a user's favorites or programmatically interrogated.
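As a rough illustration of what individually addressable sections could look like, the sketch below builds one URL per question in a committee transcript. Everything in it is an assumption made for the example: the host data.parliament.example, the path layout, the committee slug "example-committee" and the function name question_url are all hypothetical and do not describe any existing parliament.uk service.

```python
from urllib.parse import quote

# Hypothetical base address, used only for illustration.
BASE_URL = "http://data.parliament.example"


def question_url(house, committee, meeting, question_number):
    """Build a stable URL for one question in a committee transcript.

    The path layout (house/committees/committee/meeting/questions/n)
    is an assumed scheme, not an existing one.
    """
    parts = [house, "committees", committee, meeting,
             "questions", str(question_number)]
    # Percent-encode each part so that spaces or slashes cannot break the path.
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


# One address per question, using the question numbers quoted earlier.
urls = [question_url("lords", "example-committee", "1-april", n)
        for n in (78, 85, 86)]

for url in urls:
    print(url)
```

With a scheme along these lines, a reference such as "questions 78, 85 and 86 of the 1 April meeting" could be sent as three links rather than as instructions for searching a long page.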
An Application Programming Interface (API) gives developers an interface for interacting with a data set easily. By making it possible to programmatically search legislation and comment against a particular section from a remote site, it becomes much easier for people to build new interfaces for the available data. A good API would make it really easy to build new ways of browsing, searching and commenting on legislation.
Data should be exposed so that it can be presented in ways never expected by those collating the data. It is through this approach that you help people to view and, most importantly, interact with both Houses regarding proposed legislation.
A good API will make data available in a number of different formats. HTML, XML and JSON are a good starting point. From the earliest possible opportunity any code being used to expose data should be open sourced so that developers can extend the existing code base without needing to start from scratch. Not only does this allow people to build things more quickly, it allows developers to extend functionality and form a community of developers working together to improve the nation's data infrastructure.
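To make the idea of several formats concrete, here is a minimal sketch of one record being rendered as JSON, XML and HTML using only the Python standard library. The record fields (id, title, text), the element names and the function name render are assumptions chosen for the example, and the sample record is placeholder text rather than real parliamentary data.

```python
import json
from html import escape
from xml.etree import ElementTree as ET


def render(section, fmt):
    """Render one section of a document as JSON, XML or HTML.

    `section` is a dict with the assumed keys "id", "title" and "text".
    """
    if fmt == "json":
        return json.dumps(section, sort_keys=True)
    if fmt == "xml":
        root = ET.Element("section", id=section["id"])
        ET.SubElement(root, "title").text = section["title"]
        ET.SubElement(root, "text").text = section["text"]
        return ET.tostring(root, encoding="unicode")
    if fmt == "html":
        # Escape every value so the text cannot inject markup.
        return '<div id="{}"><h2>{}</h2><p>{}</p></div>'.format(
            escape(section["id"], quote=True),
            escape(section["title"]),
            escape(section["text"]),
        )
    raise ValueError("unsupported format: " + fmt)


# Placeholder record, invented for the example.
sample = {"id": "clause-1", "title": "Example clause",
          "text": "Placeholder text about rights & duties."}

for fmt in ("json", "xml", "html"):
    print(render(sample, fmt))
```

The point of the design is that the same underlying record feeds every format, so adding a further format later means adding one more branch rather than republishing the data.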
If Parliament wants to engage with people, it will find this a great deal easier on sites they already visit than on the parliament.uk site. You cannot expect to engage the majority of the electorate at parliament.uk. It needs to be made particularly easy to integrate the goings-on of both Houses into any website so that useful (relevant) data can be pulled in about a given subject.
A site about digital rights and copyright should be able to make a call to the API which looks for any recent mentions of "Digital Rights" and "Copyright" and can then embed the results in its own site. I also suspect that providing functionality to comment against the results would massively increase the potential of both Houses to engage with the electorate.
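The sketch below shows roughly how such a site might do this. It is only an illustration under stated assumptions: the search address data.parliament.example/search, the query parameters q and format, and the record fields url, title and text are all hypothetical, and the sample records are placeholders rather than real parliamentary material. To keep the example runnable without a network connection, the matching is done against an in-memory list that stands in for the API's response.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical search address, used only for illustration.
SEARCH_ENDPOINT = "http://data.parliament.example/search"


def search_url(phrases, fmt="json"):
    """Build the request a remote site might send to an assumed search API."""
    params = [("q", phrase) for phrase in phrases] + [("format", fmt)]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def find_mentions(records, phrases):
    """Return the records whose text mentions any phrase, ignoring case."""
    wanted = [phrase.lower() for phrase in phrases]
    return [r for r in records
            if any(phrase in r["text"].lower() for phrase in wanted)]


def embed_html(results):
    """Turn matching records into a list of links for embedding in a page."""
    items = "".join(
        '<li><a href="{}">{}</a></li>'.format(
            escape(r["url"], quote=True), escape(r["title"]))
        for r in results)
    return "<ul>" + items + "</ul>"


# Placeholder records standing in for what the API might return.
records = [
    {"url": "http://data.parliament.example/item/1",
     "title": "Placeholder item A", "text": "A passage about copyright."},
    {"url": "http://data.parliament.example/item/2",
     "title": "Placeholder item B", "text": "A passage about fisheries."},
    {"url": "http://data.parliament.example/item/3",
     "title": "Placeholder item C", "text": "A passage on digital rights."},
]

phrases = ["Digital Rights", "Copyright"]
print(search_url(phrases))
print(embed_html(find_mentions(records, phrases)))
```

Because every result carries its own URL, a comment form on the remote site could be attached to each link and posted back against that exact item.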
Below I have put together some general criteria for exposing data on the web:
The following criteria are not essential but I suspect could have a major effect on Parliament's ability to interact with people online:
Please note that all of the above should be possible for very little cost. All the software required is available for free under open source software licences. The primary cost should be for one or two developers who work with the community to expose data based on user and developer feedback.