Showing posts with the label Share PSI. Show all posts

Tuesday, July 5, 2011

About the Digital Agenda Assembly and Open Data licenses

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "La Asamblea de la Agenda Digital Europea y las licencias Open Data".

On June 16th I was lucky to be one of the 1300 participants in Brussels at the 1st Digital Agenda Assembly, the first high-level event organized by the European Commission about the Digital Agenda for Europe. The main objective was to evaluate progress on the objectives and actions of Europe's Digital Agenda, which as you know is one of the 7 strategic actions of the Europe 2020 strategy, aimed at putting Europe on the path of smart, sustainable and inclusive growth. No small thing.
Photo by Jose M. Alonso

I was also honoured to be a speaker at the seminar held in the plenary room, "Open data and re-use of public sector information", where I talked about Euroalert and our experience building a business powered by open data that provides information services about public procurement for European SMEs. At Euroalert's blog you can find more information about the presentation, which was live streamed, and the pictures taken by Jose M. Alonso.

The two days were pretty intense and lots of discussions took place, many as a follow-up to the May Share PSI seminar, either in person or via Twitter (see hashtag #daa11psi and stats). You can find a great summary at the Open Data@CTIC blog. I'm going to focus on two important details and one announcement that I'd like to share with you.

The first one is the speech about the State of the Digital Union, as Vice-President Neelie Kroes called her address at the first plenary session. I recommend you have a look at it because, in these times of politicians who are illiterate when it comes to technology, this remarkable woman is an inspiration. I never imagined I would recommend a politician's speech here. I hope she achieves all these ambitious goals.

The second one is a tweet from Michele Barbera, quoting Federico Morando, which did not have a big impact, though it touches on a topic I've been discussing with members of the open data community and which I find extremely important for the development of a pan-European market for data re-use.


Many of us believe that if the future revision of the PSI directive endorses a simple license, applicable by default to datasets released by governments, a critical roadblock would be removed, especially for companies that operate with a pan-European vision. From the point of view of a company like Euroalert, which creates value from data aggregated from multiple sources and countries, a single EU license would bring legal certainty to operations in the single digital market.

It seems to me that the pursuit of interoperability among the growing number of data licenses is turning into a quest for a holy grail, and that it threatens to become one of the greatest barriers to data re-use. The idea implemented in the draft of the Spanish PSI Royal Decree, which includes as an Annex a very simple license to be applied by default, would in my opinion be ideal to copy into the new directive. Maybe with an EU logo or seal recognizable to all operators... but I am not qualified to judge which license is the best one to be included and endorsed by the directive.

Moreover, thanks to the long networking sessions of the DAA (a great success), I was finally able to spend some time with Chris Taggart figuring out how Euroalert and OpenCorporates can exchange data and information for the benefit of our users and the Open Data community at large. Soon we will be releasing more details of what we hope will be a small contribution to the European single market.

The truth is that I came back very happy to belong to such an active and motivated community, which is attracting more and more members every day and which, step by step, is becoming mainstream. Too bad that in this June full of events I missed the great OKCon2011 in Berlin. You cannot be everywhere.

Thursday, May 19, 2011

The role of Adobe and PDF in the Open Data context

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "El papel de Adobe y los PDF dentro del panorama Open Data".

PDF
Source: Wikipedia
Last week in Brussels a lot was discussed about Open Data at the Share PSI Workshop, but I also had a number of interesting conversations in the time between sessions. As I said, one issue that caught my attention was the attendance of large technology companies at an event about Open Data. Some, like IBM and Orange, gave presentations based on the position papers they had sent, and others, like Adobe, attended with relevant representatives. So far I had only identified the work done by Microsoft to enter the scene with its OGDI initiative, aimed at positioning Azure as an open-data-friendly technology, and the occasional attendance at relevant Open Data events of, for example, Telefonica or Google, but with a very low profile.

As I wrote, this step forward from large technology companies, which I think has a lot to do with the work of ETSI and W3C, will be a good thing if we manage to steer their interests in the right direction, that is to say, Open Data done well. And that's what this post is about. One of the points that I stressed in my presentation about Euroalert, as did others like Chris Taggart and François Bancilhon, was that releasing public sector data in Adobe PDF format is not adequate for reuse. It is clearly the best format for distributing information (reports, documents, presentations, etc.) but, like HTML itself, it is not suited to publishing datasets or machine-readable information.

Mr Marc Straat spoke from the audience to tell us how Adobe is working on PDF technology so that it can evolve into a more useful format within the Open Data context. I must admit that I did not know about the potential of PDF as a container for other types of information, and after reading the article My PDF Hammer, which Marc told me about in a very pleasant conversation over lunch, I think I have a clear idea of what he meant.

I find the idea that a PDF container may associate the usual PDF file with its editable original version very interesting, whatever format it comes from: a Microsoft Word document (.doc), an OpenOffice or LibreOffice document (.odt), or anything else. If Adobe works to ensure that all the tools that convert documents to PDF embed the source file, and helps to publicize and encourage the use of the feature, I think it would be a great step forward. And it would be excellent news if governments made it common practice to distribute their reports in PDF with the original file and the datasets embedded in the PDF file as a container.

However, after thoroughly reading the article, the idea of using PDF as a container for open data files seems to me an even worse idea than I first thought. I really see no advantage in using a PDF container instead of a simple ZIP file to distribute XML datasets along with their XSD schemas and their documentation or manuals, the latter in PDF of course.

On the other hand, I do see a major drawback. No programming language has native support for processing PDF files, while there are many well-known options for dealing with ZIP and, of course, XML, XSD or plain text. This means that an almost trivial data processing task, for which many well-known open source tools exist, could be turned into a problem that requires licenses and very specific knowledge, with no additional benefit for developers in exchange.
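To give an idea of how little is needed on the ZIP side, here is a minimal sketch in Python that uses only the standard library. The file names (dataset.xml, dataset.xsd, README.txt), the tender records and the helper functions are all invented for this illustration and do not come from any real Euroalert dataset. Note also that the standard library parses XML but does not validate against XSD, so the schema here simply travels in the bundle as documentation.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical dataset, invented purely for this illustration.
DATASET_XML = """<tenders>
  <tender id="T-001"><country>ES</country><title>Road maintenance</title></tender>
  <tender id="T-002"><country>BE</country><title>School catering</title></tender>
</tenders>
"""

# Placeholder (empty) schema; a real bundle would ship the full XSD.
SCHEMA_XSD = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'


def build_bundle():
    """Pack the dataset, its schema and a note into an in-memory ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("dataset.xml", DATASET_XML)
        archive.writestr("dataset.xsd", SCHEMA_XSD)
        archive.writestr("README.txt", "Hypothetical example bundle.")
    return buffer.getvalue()


def read_tenders(bundle):
    """Open the ZIP bundle and return the tenders as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
        root = ET.fromstring(archive.read("dataset.xml"))
    return [
        {
            "id": tender.get("id"),
            "country": tender.findtext("country"),
            "title": tender.findtext("title"),
        }
        for tender in root.iter("tender")
    ]


if __name__ == "__main__":
    for row in read_tenders(build_bundle()):
        print(row["id"], row["country"], row["title"])
```

A re-user who receives such a bundle only needs the read_tenders part, with no license and no third-party library involved.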

In conclusion, I do not believe that solutions based on PDF as a container for open data should be promoted. Considering existing tools, it is much more practical for re-users to deal with information distributed in ZIP containers. On the other hand, it does seem a great idea to start encouraging the practice of embedding the original files, and even XML datasets, within PDF reports or documents to facilitate reuse.

By the way, as a Linux user, I am still waiting for a version of Adobe Acrobat Reader for my platform (x86_64). At present I cannot open most of the files published by public authorities that use advanced PDF features such as forms.