Wednesday, 15 February 2012

The important thing about the EU Open Data License is that it exists, not which license will be selected


Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Lo importante de la Licencia Open Data europea es que exista, no cuál será la licencia elegida".

I have often written about how the open data community in the UK is working to become a leading global force. The Prime Minister, David Cameron, is leading an ambitious Open Data agenda aimed at boosting the digital economy in Britain. I have also been very critical of Spanish political leadership on open government issues and of the lack of ambition of the open data initiatives already launched in Spain.

However, something seems to be changing. Today, I have to say that I am very proud of the Spanish Open Data Community because of its leadership of and support for the EU Open Data license campaign. As you know, the European Council is now negotiating the revision of the PSI Directive (RISP, in its Spanish acronym), and a few days ago Andres Nin launched a campaign on his blog to request a single licensing model for open data in the European Union (#1OdataLicenseEU). To date, over 330 supporters have signed the campaign, some as prominent as Patxi Lopez, president of the Basque Government. And surely many more will join in the coming days.


As you know, I am supporting the campaign because I believe that a single EU license is very important for the development of Open Data companies such as Euroalert.

However, during these past days, while I have been following and supporting the campaign, some prominent people and organizations in the European open data community have told me why they are not actively supporting it. Their main reasons concern which license would be selected, or whether it would be better to include an Open Data Definition rather than just a license.

In my humble opinion, at this point it is not important to agree on which license is the most appropriate, as there are a number of licenses that would fit perfectly.
"What is truly important is that we have a single Open Data license for all European Union countries to strengthen the single market"
And I am very concerned that this discussion may be weakening the campaign. It would be really sad if disagreements over the choice of license made us miss this opportunity. So let's support the inclusion of a single Open Data license in the PSI Directive, and then let's work so that the license can be as simple as the one proposed by Alberto Ortiz on his blog. I wish it could be that easy.



Friday, 3 February 2012

A single Open Data licence is very important for companies

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Una licencia open data única es muy importante para las empresas".

As you all know, at Euroalert we are working on the exciting challenge of building a pan-European platform to aggregate tendering and procurement data from all public authorities in the European Union Member States. A few months ago, following my presentation at the First Digital Agenda Assembly, I wrote about the importance of a single open data licence for the development of a pan-European data market.

The European Commission is currently reviewing the Directive on the Reuse of Public Sector Information, and a new draft was published in December. There is therefore a great opportunity to establish a single licensing model for open data in the European Union. However, the ideal would be a single global open data licence.

I will share a real example to elaborate on the subject. Euroalert aggregates data from many diverse sources under very heterogeneous licenses, inspired by the laws of different countries and not always compatible with each other. We have sometimes been asked, especially by NGOs, to release aggregated databases of procurement data for studies or other projects. Although we would have been glad to donate these datasets, we could not do so because of the restrictions in the licenses. As you know, the licences of some datasets forbid mixing their data with other databases, others limit commercial re-use, and in some cases any treatment other than republishing the data as received is restricted.

Just studying the legal implications of the redistribution of our aggregated raw databases is something that we could not afford. Our project that will publish a Linked Open Data node for procurement data is facing a similar problem that could be easily solved with a single EU license.

Andres Nin yesterday launched a petition "Say to @neeliekroesEU we want a single #opendata licence in the #EU" to raise awareness of the matter. This is truly a key issue in the development of companies that aim to create wealth through pan-European initiatives for the reuse of public data. And one more opportunity to help the development of a European single market in which companies powered by open data, like Euroalert, operate. I encourage you to sign the petition to the European Commissioner Neelie Kroes and to help us in its promotion in order to make the voice of the Open Data Community heard in the European Institutions.

Tuesday, 5 July 2011

About the Digital Agenda Assembly and Open Data licenses

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "La Asamblea de la Agenda Digital Europea y las licencias Open Data".

On June 16th I was lucky to be one of the 1300 participants in Brussels at the 1st Digital Agenda Assembly, the first high-level event organized by the European Commission about the Digital Agenda for Europe. The main objective was to evaluate progress on the objectives and actions of Europe's Digital Agenda, which as you know is one of the 7 strategic actions of the Europe 2020 strategy, aimed at putting Europe on the path of smart, sustainable and inclusive growth. No small thing.
Photo by Jose M. Alonso

I was also honoured to be a speaker at the seminar held in the plenary room, "Open data and re-use of public sector information", where I talked about Euroalert and our experience building a business powered by open data that provides information services about public procurement for European SMEs. On Euroalert's blog you can find more information about the presentation, which was live streamed, and the pictures taken by Jose M. Alonso.

The two days were pretty intense and lots of discussions took place, many as a follow-up to the May Share PSI seminar, either in person or via twitter (see hashtag #daa11psi and stats). You can find a great summary at the Open Data@CTIC blog. I'm going to focus on two important details and one announcement that I'd like to share with you.

The first one is the speech about the State of the Digital Union, as Vice-President Neelie Kroes called her address at the first plenary session. I recommend you have a look at it because, in these times of politicians who are illiterate when it comes to technology, this remarkable woman is an inspiration. I never imagined I would recommend a politician's speech here. I hope she will achieve all these ambitious goals.

The second one is a tweet from Michele Barbera, quoting Federico Morando, which did not have a big impact, though it raises an important topic that I have been discussing with members of the open data community and which I find extremely important for the development of a pan-European market for data re-use.


Many of us believe that if the future revision of the PSI Directive endorses a simple license, applicable by default to datasets released by governments, a critical roadblock would be removed, especially for companies that operate with a pan-European vision. From the point of view of a company like Euroalert, which creates value from data aggregated from multiple sources and countries, a single EU license would bring legal certainty to operations in the single digital market.

It seems to me that the pursuit of interoperability among the growing number of data licenses is becoming a quest for the Holy Grail, one that threatens to become one of the greatest barriers to data re-use. The idea implemented in the draft of the Spanish PSI Royal Decree, which includes as an Annex a very simple license applicable by default, would in my opinion be ideal to copy into the new directive. Maybe with an EU logo or seal recognizable to all operators... but I am not qualified to judge which license is the best one to be included and endorsed by the directive.

Moreover, thanks to the long networking sessions of the DAA (a great success), I was finally able to spend some time with Chris Taggart figuring out how Euroalert and Open Corporates can exchange data and information for the benefit of our users and the Open Data community at large. Soon we will be releasing more details of what we hope will be a small contribution to the European single market.

The truth is that I came back very happy to belong to such an active and motivated community, which is attracting more and more members every day and is step by step becoming mainstream. Too bad that in this June full of events I missed the great OKCon2011 in Berlin. You cannot be everywhere.

Thursday, 19 May 2011

Adobe and the role of PDF in the Open Data context

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "El papel de Adobe y los PDF dentro del panorama Open Data".

PDF
Source: Wikipedia
Last week in Brussels a lot was discussed about Open Data at the Share PSI Workshop, but I also had a number of interesting conversations between sessions. As I said, one issue that caught my attention was the attendance of large technology companies at an event about Open Data. Some, like IBM and Orange, gave presentations based on the position papers they had sent, and others, like Adobe, attended with senior representatives. Until now I had only noticed the work done by Microsoft to enter the scene with its OGDI initiative, aimed at positioning Azure as an open-data-friendly technology, and the occasional, very low-profile attendance at relevant Open Data events of companies such as Telefonica or Google.

As I wrote, this step forward from large technology companies, which I think has a lot to do with the work of ETSI and W3C, will be a good thing if we manage to focus their interests in the right direction, that is to say, Open Data, but done well. And that is what this post is about. One of the points that I stressed in my presentation about Euroalert, as did others like Chris Taggart and François Bancilhon, was that releasing public sector data in Adobe PDF format is not adequate for reuse. PDF is clearly the best format for distributing information (reports, documents, presentations, etc.) but, like HTML itself, it is not suitable for publishing datasets or machine-readable information.

Mr Marc Straat spoke from the audience to tell us how Adobe is working on PDF technology so that it can evolve into a more useful format within the Open Data context. I must admit that I did not know about the potential of PDF as a container for other types of information, and after reading the article My PDF Hammer, which Marc told me about in a very pleasant conversation over lunch, I think I have a clear idea of what he meant.

I find the idea that a PDF container may associate the usual PDF file with its editable original version very interesting, whatever format it comes from: a Microsoft Word document (.doc), an OpenOffice or LibreOffice document (.odt), or anything else. If Adobe works to ensure that all the tools that convert documents to PDF embed the source file, and helps to publicize and encourage the use of the feature, I think it would be a great step forward. And it would be excellent news if governments made it common practice to distribute their reports in PDF with the original file and datasets embedded in the PDF file as a container.

However, after reading the article thoroughly, the idea of using PDF as a container for open data files seems to me an even worse idea than I first thought. I really see no advantage in using a PDF container instead of a simple ZIP file to distribute XML datasets along with their XSD schemas and their documentation or manuals (in PDF, of course).

On the other hand, I do see a major drawback. No programming language has native support for processing PDF files, while there are many well-known options for dealing with ZIP and, of course, XML, XSD or plain text. This means that an almost trivial data processing task, for which many well-known open source tools exist, could be turned into a problem that requires licenses and very specific knowledge, with no additional benefit for developers in exchange.
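To illustrate how little tooling a ZIP container demands, here is a minimal sketch in Python that uses only the standard library (zipfile and xml.etree.ElementTree). The archive contents, the file name notices.xml and the element names are hypothetical, invented for this example; they do not describe any real dataset.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_zip() -> bytes:
    """Create a small in-memory ZIP holding one hypothetical XML dataset."""
    xml_doc = (
        "<notices>"
        "<notice id='1'><country>ES</country><title>Road works</title></notice>"
        "<notice id='2'><country>UK</country><title>School meals</title></notice>"
        "</notices>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("notices.xml", xml_doc)
        archive.writestr("README.txt", "Hypothetical sample dataset.")
    return buffer.getvalue()


def read_notices(zip_bytes: bytes) -> list[dict[str, str]]:
    """Parse every .xml member of the archive and return its notices as dicts."""
    rows = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".xml"):
                continue  # skip documentation and other non-XML members
            root = ET.fromstring(archive.read(name))
            for notice in root.iter("notice"):
                rows.append({
                    "id": notice.get("id", ""),
                    "country": notice.findtext("country", default=""),
                    "title": notice.findtext("title", default=""),
                })
    return rows


if __name__ == "__main__":
    for row in read_notices(build_sample_zip()):
        print(row)
```

A re-user would replace build_sample_zip() with the bytes of a downloaded archive; the reading side stays the same and needs no third-party library or license.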

In conclusion, I do not believe that solutions based on PDF as a container for open data should be promoted. Considering the existing tools, it is much more practical for re-users to deal with information distributed in ZIP containers. On the other hand, it does seem a great idea to start encouraging the practice of embedding the original files, and even XML datasets, within PDF reports or documents to facilitate reuse.

By the way, as a Linux user, I keep waiting for a version of Adobe Acrobat Reader for my platform (x86_64). At present I am not able to open most of the files that make use of advanced PDF features such as forms, published by public authorities.

Monday, 16 May 2011

Road Blocks to a Pan European Market for PSI Reuse, a long summary

Note: This article is a translation with a few add-ons of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Obstáculos para el desarrollo de un mercado pan-europeo de reutilización, un largo resumen".

SharePSI #daa1psi
Source: ePSIplatform
On Tuesday 10th and Wednesday 11th I participated in Brussels in the workshop "Removing Road Blocks to a Pan European Market for PSI Reuse" held by the Share PSI initiative. It was superbly organized by W3C and ETSI for the European Commission, and gathered a good number of members of the European open data community: governments, businesses and civil society organizations.

The European Commission will use the output of the SharePSI workshop at the 1st Digital Agenda Assembly event: "Beyond raw data: public sector information, done well". Ultimately the contributions, like those obtained through the public consultation on Open Data held at the end of 2010, will help to make the reform of the PSI directive richer and more effective.

In my opinion, compared to other seminars, the level of the discussion was very high in most of the sessions, though a few topics are recurrent in this type of event (pricing, licensing, return on investment and privacy). Clearly, this shows that we have not yet been able to resolve these issues satisfactorily. On a few occasions it was also clear that not everyone is at the same level of discussion, but this is entirely normal because the open data community is growing at a rapid pace and many new people are joining the discussion.

I think the seminar was very intense and productive and this was largely due to the excellent work done by the program committee and especially by Margot Dor (ETSI) and Thomas Roessler (W3C) to create the workshop programme from the large number of position papers sent from all over Europe.

As you know, I usually attend this type of event, and this time one of the things that caught my attention was the presence of representatives of large companies in the discussion. Until now it was rare for Adobe, IBM or Orange to show interest in the Open Data movement. And I strongly believe that this is a good thing, because their software and their position in government IT services can provide solutions that will drive the development of more effective Open Data.

I guess that their presence has much to do with W3C and ETSI. I hope they are here to stay and will contribute much to the debate and the solutions, though for now they are still far behind the more advanced group. However, I also believe that it is the responsibility of those of us who have long been part of the debate to bring them round to the vision of the main objective of the Big Idea: Open Data, but done well.

I will also highlight the number of national government representatives that I could identify in the room (at least from Spain, Denmark, the Netherlands and Finland). New Zealand was represented by Laurence Millar, who described to us the situation in his country, which is enviable in many respects, such as its very active community of developers.

I found the discussion on the pricing of meteorological datasets very interesting, along with the apparently long-running dispute that has now been brought into the open data ring by the Association of Private Meteorological Services (PRIMET). I think it is a good thing that this happened and that these discussions enrich the open data debate. Several new use cases also caught my attention, like the very interesting FearSquare, presented by Andrew Garbett, and the impressive Arcticweb that Erin Lynch showed us.

On the other hand, it was a pleasure to hear entrepreneurs like François Bancilhon speaking about his work at Data Publica, or Chris Taggart on his excellent Open Corporates, which I have been following for a while. The risks that people like them are taking contribute greatly to pushing the boundaries of what can be done, although they will surely have to face problems because they are disrupting the established order. They have my most sincere admiration, respect and support to keep going.

For my part, I presented the work that Euroalert is doing to develop our 10ders Information Services platform, which aggregates data on procurement notices across the EU. You can find the slides and a summary of the talk on the Euroalert Blog. I also moderated the second half of the session on Use Cases, where we heard the complaints of the Federation of European Publishers about the difficulties they face in competing with what they still call the culture of free. I was surprised by their approach in the context of Open Data, which I believe is, once again, completely mistaken. I hope they will take a more positive position in the future. I was also lucky to hear one of the best quotes of the event, from Hervé Rannou of ITEMS International, who presented the lessons learned in the Open Data project of the City of Marseille: "The use of the data is infrastructure, like roads".

On June 16th we will see at the 1st Europe Digital Agenda Assembly the most interesting outcomes and conclusions that the European Commission has harvested from this Workshop. I hope it will be useful to take firm steps forward to enable a more favourable environment for market growth based on the development of new information services. In short, for companies powered by opendata as I like to call them. I also hope that among all of them Euroalert will be a remarkable Open Data company, both because of the success of its value proposition and for our contribution to the development of this environment.

To finish this long post (the occasion deserved it), I will leave some resources that you will find useful for diving into what was said at the workshop. I have used them to review what was said in the last two sessions, which I could not attend. I highly recommend reading the excellent collaborative notes, which faithfully reflect the discussions. You can also check out the tweet archive created by the University of Lincoln, the slides used by the speakers, the list of Twitter accounts of attendees, the position papers submitted and the snaps of the event.

Friday, 24 December 2010

Data Driven Journalism in Spain with Pro Bono Público

Note: This article is a translation with a few add-ons of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Data driven journalism en España de la mano de Por Bono Publico".

Source: http://www.slideshare.net/mirkolorenz/data-driven-adam
One of the most interesting things that happened to me in London during the crash course in government transparency I attended in November was the relationships I was able to strengthen, and not only with Open Data celebrities. One of these relationships was with the people from Pro Bono Publico, the civil society association you may know for organizing the Abre Datos challenge, the first contest held in Spain to promote the reuse of public sector information.

Since I met Jacobo Elosúa, David Cabo and Álvaro Ortiz at the Open Government Data Camp 2010, where we spent a good while discussing the state of open data in Spain, I have followed their activities a bit more closely. For example, last week Pro Bono Publico released an excellent contribution to the Spanish public consultation on the draft Royal Decree on the reuse of public sector information. Fundación CTIC has also published comments, and the Spanish Ministry of Industry has released a document of over 100 pages with all 26 contributions received. Of course, one of those contributions is the one that Euroalert has published on its official blog.

Another initiative that the Pro Bono people are working on, and that I consider very valuable, is the organization of a workshop on data driven journalism, aimed primarily at journalists and other information professionals. The conference is tentatively scheduled for February 15th 2011, and for the moment we are testing the real level of interest and gathering feedback. I say "we" because a few weeks ago I spoke to journalism students at the University of Valladolid about the challenges of open data, and I am trying to contribute what I learned from that experience. Of course, we ask for your comments and for your help in spreading the word about the event. As we get more details we will release them.

The words of Sir Tim Berners-Lee seem to have been an inspiration to all of us who were there:
"The responsibility [to analyze datasets] needs to be with the press"
... especially for Álvaro, when Sir Tim took a seat beside him.

Friday, 26 November 2010

Lessons about Open Data at the Open Government Data Camp in London

Note: This article is a translation with a few add-ons of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Grandes lecciones sobre open data en el Open Government Data Camp en menos de 24 horas".


After rearranging my schedule, last week I was finally able to attend the second day of the Open Government Data Camp held in London. In addition, at the invitation of the Open Knowledge Foundation, I was privileged to be present at the press conference announcing the release, as raw data, of the database containing all UK Government spending over 25,000 pounds (about 29,000 euros).

The announcement was presented and explained by Minister Francis Maude, accompanied by Professor Nigel Shadbolt, Tom Steinberg, Rufus Pollock ... and Tim Berners-Lee, inventor of the web and advisor to the British Government. I highly recommend watching the video of David Cameron as a lesson on Open Government. This video opened the event, so we must assume that the Prime Minister "wanted to be" there.

Despite being "only" an official announcement, there were a few things that caught my attention and I'd like to share, just in case we can be inspired in Spain, and perhaps in other countries:
  • It was not a typical press conference, at least as we know them in Spain. I mean, it was not a long speech by the minister followed by questions from reporters. Instead, it was organized as a series of very short presentations on many important points around the subject of the announcement: political, operational and even technological.
  • There were presentations from all the organizations that had been involved in the project, not just from the government. Representatives from mySociety, the Transparency Board and the OKF made their points. But there was also an important space reserved for independent software developers such as Chris Taggart, who had been working on a demo of the possibilities of the data released.
  • The broad technological literacy shown by the members of the government, including the Prime Minister. Comparisons are odious, so I will not make any. This is an exercise for you.
  • The commitment of the UK Government to transparency and open data is remarkable. "We know this will be a very uncomfortable process within the government departments", said Maude. In fact, uncomfortable news began to circulate during the event, such as the payments to Nick Clegg's wife's law firm or the rents paid to Prince Charles by the Ministry of Justice. But the commitment is as firm as the words of Francis Maude suggest: "It is our ambition to make the UK the most transparent and accountable Government in the world"
I noticed that in the UK the work done for years by activists of all kinds (journalists, associations, developers, companies and even civil servants and politicians) has already achieved a great success: the message of openness and transparency is firmly established at the highest political levels, across the political spectrum. I think a good summary is provided by these words from Rufus Pollock, founder of the OKF, said half jokingly, half seriously:
"It's very encouraging to see that UK government is becoming more radical than me in terms of open data and transparency"
Thanks to the OKFN for the great work organizing the Open Government Data Camp, and in particular to Jonathan and Rufus himself for insisting that I attend the event. I hope that all the networking and interaction, and the envy and inspiration stirred by the speed at which things are developing there, will allow us to COPY it in Spain and create a true Open Data community. I believe we have the foundations, but we need a little more action, less complacency ... and some more reusable raw data.

As for the OGDCamp sessions, I'll write another post, as this one is becoming a bit longer than I'd like :) I recommend you check out the hashtags #openuk and #ogdcamp and the conclusions of the working groups.

PS: On my way back to Valladolid, I read at the airport that the Government of Spain had released the draft Royal Decree implementing Law 37/2007 of 16 November, on reuse of public sector information and launched a public consultation on it. Personally I prefer the approach of less regulation and more publishing of open data as raw data, but still congratulations on the move. As soon as I can read it in detail I will try to make my contribution.

Sunday, 7 November 2010

Quick guide to Opendata EC Public Consultation, the on-line survey on the PSI Directive

The Digital Agenda for Europe lists the revision of Directive 2003/98/EC on the re-use of public sector information (PSI Directive) among its first key actions. It is worth recalling the key role of the Spanish Presidency in including PSI reuse in the Granada Declaration, showing a commitment to the promotion of open data that we all hope will be followed by some action.

In September the European Commission opened a public consultation with the purpose of gathering views from as many sources as possible on the review of the PSI Directive. The consultation will feed into the debate on possible policy options that should be considered for the review, and will contribute to the impact assessment that will be carried out subsequently, associated with proposals for possible legislative or other measures.

As you all know, a public consultation is a regulatory tool used to seek the opinions of interested and affected groups on a certain matter. By gathering their views, opinions and contributions over the Internet, both Member State administrations and EU institutions can better understand the needs of citizens and enterprises.

The PSI consultation document is published only in English, but responses are acceptable in all EU languages, as nothing to the contrary is stated in the consultation documents themselves. So please do not think that not being fluent in English is a barrier to participating.

Whether you are a government, a public sector content holder, a commercial or non-commercial re-user or another interested party, your contribution is really important because your views will feed into the review of the PSI Directive.

The consultation includes questions divided into several blocks:

  1. the PSI re-use context and possible action to consider,
  2. substantive issues regulated by the PSI Directive,
  3. practical measures,
  4. changes that have taken place and barriers that still exist,
  5. other issues to comment regarding the review of the PSI Directive.
It will take you some time, perhaps over 30 minutes, to make a good contribution, although it is stated that the survey takes only 15 minutes to complete. But it is worth the effort. You can answer the online survey on the PSI Directive, but I would recommend having a look at the PDF version of the consultation document before answering, in order to get a complete view.

It is also worth noting the commitment that the Commission Vice-President for the Digital Agenda, Neelie Kroes, is showing to the open data community. As she pointed out, much of Europe's PSI is insufficiently exploited, or sometimes not exploited at all, which means missing out on a great opportunity to generate innovation.

The replies to this consultation will be published on the Commission's PSI web site. The consultation will run until 30 November 2010. Three weeks before it closes, the EC has gathered about 350 responses, which is not many considering the importance of the matter. So please contribute your views to build a more innovative Europe.