Showing posts with the label open data. Show all posts

Wednesday, 31 August 2016

What if we could calculate our own real-time customized “official indicators”?

Note: This article is a translation of what I wrote in Spanish for the International Open Data Conference 2016 blog. You can read the original post in Spanish: ¿Y si todos pudiésemos calcular nuestros propios “indicadores oficiales” personalizados en tiempo real? and in English: What if we could calculate our own real-time customized “official indicators”?
Almost all governments worldwide and multilateral institutions such as the OECD, the UN, the World Bank or the European Commission began their open data policies with the release of the statistical datasets they produce. Because of that, we have a large number of indicators we can work with, in reasonably accessible formats, to study almost any issue, whether environmental, social, economic or a combination of all these aspects.

Besides providing us with the datasets, in some cases they have created tools to access the data easily (APIs), and even applications that help us work with the indicators (visualizations).
These indicators follow periodic cycles, which can be monthly, yearly or even multiannual due to the high cost of their production. In general, the methodologies used to calculate the indicators are not available to citizens. In the best-case scenario, they are documented in a very superficial way on their fact sheets.
Photo by William Iven
Now let’s imagine for a moment that the national social security systems, the company registers, the customs registers, the environmental agencies, etc., release the data they hold as open datasets in real time. One of the effects we could easily imagine is that a lot of indicators that are currently released periodically could be known and, even better, explored in real time.

Besides, this would remove the possibility for anyone to get privileged information, considering that we would all have the same ability to analyze the evolution of the indicators and make our own decisions. What is more, we could customize calculations according to our own particular situation by working on the methodologies.
The fact is that in many cases, the production cycle of some indicators could be shortened until we get closer to ‘real time’, and the cost of production could be reduced greatly as well thanks to open government data.
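To make the idea of customizing a calculation a bit more concrete, here is a minimal sketch in Python. It assumes a hypothetical open dataset of price records per spending category and computes an index as a weighted average of price relatives. The record fields, the categories, the figures and both sets of weights are invented for illustration; they do not correspond to any real official methodology.

```python
from dataclasses import dataclass


@dataclass
class PriceRecord:
    """One hypothetical record from an open price dataset."""
    category: str
    base_price: float     # price in the reference period
    current_price: float  # latest published price


def weighted_index(records, weights):
    """Return 100 * weighted average of current/base price ratios."""
    total_weight = sum(weights[r.category] for r in records)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(
        weights[r.category] * r.current_price / r.base_price
        for r in records
    )
    return 100 * weighted_sum / total_weight


# Invented sample data, standing in for a dataset released in real time.
records = [
    PriceRecord("housing", 100.0, 110.0),
    PriceRecord("food", 50.0, 51.0),
    PriceRecord("transport", 20.0, 19.0),
]

# Hypothetical "official" weights versus the weights of one household
# that spends most of its budget on housing.
official_weights = {"housing": 0.3, "food": 0.4, "transport": 0.3}
my_weights = {"housing": 0.6, "food": 0.3, "transport": 0.1}

official_value = weighted_index(records, official_weights)
my_value = weighted_index(records, my_weights)
```

With these invented figures the same data gives roughly 102.3 under the "official" weights and roughly 106.1 under the personal ones, which is the kind of difference that only becomes visible when both the data and the methodology are open.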

Even though this is a big step forward, I don’t think we should settle for having the indicators as open data; we should aspire to examine the open datasets and methodologies used to calculate these indicators and even customize them, because, if suitably anonymized, there is no reason for them not to be released as open data.

Saturday, 20 August 2016

Some very simple practices to help with the reuse of open datasets

Note: This article is a translation of what I wrote in Spanish for the International Open Data Conference 2016 blog. You can read the original post in Spanish: Algunas prácticas muy sencillas que facilitan la reutilización de conjuntos de datos abiertos and in English: "Some very simple practices to help with the reuse of open datasets".
In the past few years, a significant number of materials have been published to help data holders with the release of open government data. In his article “The good, the bad…and the best practices”, Martin Alvarez gathered a total of 19 documents, including guides, manuals, good practices, toolkits, cookbooks, etc. Different kinds of authors have invested resources in developing these materials: national governments, regional governments, foundations, standards bodies, the European Commission, etc.; hence the number of different perspectives.
On the other hand, a large amount of effort is still being made in the development of standards for the release of open government datasets, either for general purposes or specific domains.
Photo by: Barn Images
Too often, however, very simple rules that facilitate the sustainable reuse of open datasets are forgotten when datasets are published. Here are some of the obstacles we often find when we explore a new dataset and assess whether it is worth incorporating into our service:
  1. Records do not include a field with a unique identifier, which makes it very difficult to monitor changes when the dataset is updated.
  2. Records do not contain a field with the date when they were last updated, which also complicates monitoring which records have changed from one published version to the next.
  3. Records do not contain a field with the date of creation, which makes it difficult to know when each one was added to the dataset.
  4. Fields do not use commonly agreed standards for the type of data they contain. This often occurs in fields with dates and times or economic values, but it is also common in other fields.
  5. Inconsistencies between the content of the dataset and its equivalent published on HTML web pages. Inconsistencies can be of many types, from records published on the website and not exported to the dataset to differences in fields that are published in one format or the other.
  6. The record is published on the dataset much later than on the website. This can make a dataset useless for reuse if the service requires immediacy.
  7. Service Level Agreements on the publication of datasets are not stated explicitly. What matters is not so much whether those agreements are good or bad but that they are known, as it is very hard to plan data reuse ahead when you do not know what to expect.
  8. These elements are not provided: a simple description about the content of the fields and structure of the dataset, as well as the relevant criteria used to analyze that content (lists of elements for factor variables, update criteria, meaning of different states, etc.).
As you can see, these practices are not necessarily specific to open data work; rather, they come from experience in software development projects, or simply from common sense.
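To show why the first two points in the list matter so much to a reuser, here is a minimal sketch in Python of how two published versions of a dataset could be compared when every record carries a unique identifier and a last-updated date in ISO 8601 format. The field names `id` and `updated_at` and the sample records are assumptions made for the example, not taken from any real dataset.

```python
from datetime import datetime


def diff_versions(old_rows, new_rows, id_field="id", updated_field="updated_at"):
    """Compare two versions of a dataset given as lists of dicts.

    Returns the identifiers of records that were added, removed, or
    updated (newer last-updated date) between the two versions.
    """
    old_by_id = {row[id_field]: row for row in old_rows}
    new_by_id = {row[id_field]: row for row in new_rows}

    added = sorted(new_by_id.keys() - old_by_id.keys())
    removed = sorted(old_by_id.keys() - new_by_id.keys())
    changed = sorted(
        key
        for key in new_by_id.keys() & old_by_id.keys()
        if datetime.fromisoformat(new_by_id[key][updated_field])
        > datetime.fromisoformat(old_by_id[key][updated_field])
    )
    return {"added": added, "removed": removed, "changed": changed}


# Invented sample data for two consecutive publications of a dataset.
version_1 = [
    {"id": "A1", "updated_at": "2016-08-01T10:00:00"},
    {"id": "B2", "updated_at": "2016-08-01T10:00:00"},
    {"id": "C3", "updated_at": "2016-08-01T10:00:00"},
]
version_2 = [
    {"id": "A1", "updated_at": "2016-08-01T10:00:00"},
    {"id": "B2", "updated_at": "2016-08-15T09:30:00"},
    {"id": "D4", "updated_at": "2016-08-15T09:30:00"},
]

changes = diff_versions(version_1, version_2)
```

Without those two fields, the same comparison means matching records field by field and guessing which ones are "the same" record, which is exactly the kind of effort that pushes reusers away from a dataset.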

Even though most of them are very easy to implement, they are of great importance in convincing somebody to invest their time in an open dataset. As you may know, web scraping can be more convenient than reusing open datasets, and these few simple practices make the difference.

Saturday, 6 August 2016

How far should a public administration go with regard to the provision of value-added services based on open data?

Note: This article is a translation of what I wrote in Spanish for the International Open Data Conference 2016 blog. You can read the original post in Spanish: ¿Hasta dónde debe llegar la administración en la prestación de servicios sobre datos abiertos? and in English: How far should a public administration go with regard to the provision of value-added services based on open data?
Last Monday I took part in the panel “Reuse of EU open data: challenges and opportunities” during the Reuse of EU legal data Workshop, organized by the Publications Office of the European Union. One of the interesting issues that came up during the panel (you can watch it here) focused on the well-known question: How far should a public administration go with regard to the provision of services based on open government datasets?

The discussion, in the context of fostering open government data reuse, arises from the difficulty of finding a balance between the services that every public administration must provide to citizens for free and the space that should be left for private initiatives to create value from open government datasets. In many cases, that unstable balance creates certain tensions that do not contribute to innovation.

In the past few years, I have heard numerous arguments from both the supply and the demand side. These arguments range from one extreme, “public administrations should only provide raw data and no services”, to the opposite, “the public administration should go forward in the value chain as much as possible when providing services to citizens.”

My position on this matter, which I had the chance to defend during the debate, is that it is not useful to work on drawing a red line between what should be considered a basic/core service and a premium/value-added service. On the contrary, we should work on defining the minimum incentives needed for open-data-driven innovation to flourish and deliver wealth creation.
Photo by: Rodion Kutsaev

For that reason, I used the panel to make the following statement, which could be a starting point to clearly define the minimum conditions that a reuser needs to create value-added services:

“Open government datasets should be released in such a condition that a reuser can build the same services that are provided for free by the data holder.”

This is basically because, in many cases, value creation starts from that baseline; that is, from just improving a service that already exists. If an existing public service cannot be reproduced, for example due to a delay of a few hours in the release of the underlying dataset or because of the limited quality of the released data, then it will not be possible to innovate by launching an improved product or service to the market.

In my opinion, this approach to the issue can help us make some progress in this debate. I hope this first statement can be improved and better worded by contributions from the community, or otherwise proved wrong by evidence or better arguments than my own.

Tuesday, 15 March 2016

Let’s open more datasets, because what could go wrong?

Note: This article is a translation of what I wrote in Spanish for the International Open Data Conference 2016 blog. You can read the original post in Spanish: Abramos más conjuntos de datos, ¿qué puede salir mal? and in English: "Let’s open more datasets. What could go wrong?".
In conversations between members of the open data community, especially those responsible for providing data, one often overhears statements such as “it’s necessary to stimulate the demand for open data,” “we can’t reach the reusers,” “it would be interesting if data providers and reusers talked more.” I am sure that you have heard such statements on many occasions.
Most probably, this uneasiness is not unknown to the IODC organizers, who need to be aware that previous editions of the event have mostly focused on what is usually called the “supply side,” that is, the public organizations in charge of holding and providing open datasets. What is true is that in Spain, possibly because it is the Ministry of Industry that promotes open data policies, reuse companies have always been encouraged to have a strong presence at events about open data. And this will surely be noticeable in the program of the 4th IODC next October.


However, I would like to tell you a secret that could help explain why, apparently, the long-awaited demand for open data is not there: it turns out that for reuse companies, it is often more productive to obtain data from the web than to use open data portals. Unfortunately, technologies for extracting data from documents have advanced in recent years much faster than the datasets available in portals.

Even though it is quite inefficient and we may not like it, currently it is the only possible way in many sectors for companies to generate value from data. In other sectors, where no data is published, either in documents or in datasets, there is no demand to stimulate. Companies, especially small companies, survive on the value that they can create and sell today, not on future promises.

If you were a company, where would you put resources? On an open source library to improve a data-extraction algorithm for PDFs or taking part in circular arguments about the best way of opening data?
In my opinion, as I am on the “demand side,” I would like IODC 2016 to be a turning point, not so much to define more standards, indexes, policies and laws, but to reach an agreement to publish more useful datasets.

If we actually aim to encourage innovation and creation of value from open data, I suggest we flood portals with useful datasets. What could go wrong? Actually, much of this data is already inside documents published on the web, and much effort is being put into extracting and cleaning it when it could instead be put into creating value from data.

Wednesday, 15 February 2012

The important thing about the EU Open Data License is not which License will be selected.


Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Lo importante de la Licencia Open Data europea es que exista, no cuál será la licencia elegida".

I have often written about how the open data community in the UK is working to become a leading global force. The Prime Minister, David Cameron, is leading an ambitious Open Data agenda aimed at boosting the digital economy in Britain. I have also been very critical of Spanish political leadership on open government issues and of the lack of ambition of the open data initiatives already launched in Spain.

However, something seems to be changing. Today, I have to say that I am very proud of the Spanish Open Data Community because of its leadership and support for the EU Open Data license campaign. As you know, the European Council is now carrying out the negotiations for the revision of the RISP Directive (the Directive on the re-use of public sector information, PSI), and a few days ago Andres Nin launched a campaign on his blog to request a single licensing model for open data in the European Union (#1OdataLicenseEU). To date, over 330 supporters have signed up to the campaign, some as prominent as Patxi Lopez, president of the Basque Government. And surely many more will join in the coming days.


As you know, I am supporting the campaign because I believe that a single EU license is very important for the development of Open Data companies such as Euroalert.

However, during these past days, while I have been following and supporting the campaign, some relevant people and organizations in the European open data community have told me why they are not actively supporting it. The main reasons relate to discussions about which license would be selected, or whether it would be better to include an Open Data Definition rather than just a license.

In my humble opinion, at this point, it is not important to agree on which is the most appropriate license, as there are a number of licenses that would fit perfectly.
"What is truly important is that we have a single Open Data license for all European Union countries to strengthen the single market"
And I am very concerned that this discussion may be reducing the strength of the campaign. It would be really sad if disagreements over the selection of the license made us miss this opportunity. So, let's support the inclusion of a single Open Data license in the RISP Directive and then let's work so the license can be as simple as the one proposed by Alberto Ortiz on his blog. I wish it could be that easy.



Friday, 3 February 2012

A single Open Data licence is very important for companies

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Una licencia open data única es muy importante para las empresas".

As you all know, at Euroalert we are working on the exciting challenge of building a pan-European platform to aggregate tendering and procurement data from all public authorities in the European Union Member States. A few months ago, following my presentation at the First Digital Agenda Assembly, I wrote about the importance of a single open data licence for the development of a pan-European data market.

The European Commission is currently reviewing the Directive on the Reuse of Public Sector Information, and a new draft was published in December. There is therefore a great opportunity to establish a single licensing model for open data in the European Union. However, the ideal would be a single global open data licence.

I will share a true example to elaborate on the subject. Euroalert aggregates data from many diverse sources with the most heterogeneous licenses, inspired by the laws of different countries and not always compatible with each other. Sometimes we've been asked, especially by NGOs, to release aggregated databases of procurement data for studies or other projects. Although we would have been glad to donate these datasets, we could not do it because of the restrictions of the licenses. As you know, the licences of some datasets forbid mixing their data with other databases, others set limitations on commercial re-use, and in some cases any treatment other than publishing the data as we get it is restricted.

Just studying the legal implications of the redistribution of our aggregated raw databases is something that we could not afford. Our project that will publish a Linked Open Data node for procurement data is facing a similar problem that could be easily solved with a single EU license.

Andres Nin yesterday launched a petition "Say to @neeliekroesEU we want a single #opendata licence in the #EU" to raise awareness of the matter. This is truly a key issue in the development of companies that aim to create wealth through pan-European initiatives for the reuse of public data. And one more opportunity to help the development of a European single market in which companies powered by open data, like Euroalert, operate. I encourage you to sign the petition to the European Commissioner Neelie Kroes and to help us in its promotion in order to make the voice of the Open Data Community heard in the European Institutions.

Tuesday, 5 July 2011

About the Digital Agenda Assembly and Open Data licenses

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "La Asamblea de la Agenda Digital Europea y las licencias Open Data".

On June 16th I was lucky to be one of the 1300 participants in Brussels at the 1st Digital Agenda Assembly, the first high-level event organized by the European Commission about the Digital Agenda for Europe. The main objective was to evaluate progress on the objectives and actions of Europe's Digital Agenda, which as you know is one of the 7 strategic actions of the Europe 2020 strategy, aimed at putting Europe on the path of smart, sustainable and inclusive growth. No small thing.
Photo by Jose M. Alonso

I was also honoured to be a speaker at the seminar held in the plenary room: "Open data and re-use of public sector information", where I talked about Euroalert and our experience building a business powered by open data that provides information services about public procurement for European SMEs. At Euroalert's blog you can find more information about the presentation, which was live streamed, and the pictures taken by Jose M. Alonso.

The two days were pretty intense and lots of discussions took place, many as a follow-up to the May Share PSI seminar, either in person or via twitter (see hashtag #daa11psi and stats). You can find a great summary at the Open Data@CTIC blog. I'm going to focus on two important details and one announcement that I'd like to share with you.

The first one is the speech on the State of the Digital Union, as Vice-President Neelie Kroes called her address at the first plenary session. I recommend having a look at it, because in these times of politicians who are illiterate when it comes to technology, this remarkable woman is an inspiration. I never imagined I would recommend here the speech of a politician. I hope she will achieve all these ambitious goals.

The second one is a tweet from Michele Barbera, quoting Federico Morando, which did not have a big impact, though it represents an important topic I've been discussing with members of the open data community and which I find extremely important for the development of a pan-European market for data re-use.


Many of us believe that if the future revision of the PSI directive endorses a simple license, applicable by default to datasets released by governments, a critical roadblock would be removed, especially for companies that operate with a pan-European vision. From the point of view of a company like Euroalert, which creates value from data aggregated from multiple sources and countries, a single EU license would bring legal certainty to operations in the single digital market.

It seems to me that the pursuit of interoperability for the growing number of data licenses is becoming a holy grail that threatens to become one of the greatest barriers to data re-use. The idea implemented in the draft of the Spanish PSI Royal Decree, which includes as an Annex a very simple license to be applicable by default, would in my opinion be ideal to copy into the new directive. Maybe with an EU logo or seal recognizable to all operators... but I am not qualified to judge which is the best license to be included and endorsed by the directive.

Moreover, thanks to the long networking sessions (great success) of the DAA, I was finally able to spend some time with Chris Taggart figuring out how Euroalert and Open Corporates can exchange data and information for the benefit of our users and the Open Data community at large. Soon we will be releasing more details of what we hope will be a small contribution to the European single market.

The truth is that I came back very happy to belong to such an active and motivated community, which is attracting more and more members every day and that step by step is becoming mainstream. Too bad that in this June full of events I missed the great OKCon2011 in Berlin. You cannot be everywhere.

Thursday, 19 May 2011

Adobe and the role of PDF in the Open Data context

Note: This article is a translation of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "El papel de Adobe y los PDF dentro del panorama Open Data".

PDF
Source: Wikipedia
Last week in Brussels a lot was discussed about Open Data at the Share PSI Workshop, but I also had a number of interesting conversations in the time between sessions. As I said, one issue that caught my attention was the attendance of large technology companies at an event about Open Data. Some, like IBM and Orange, made their presentations based on the position papers they had sent, and others, like Adobe, attended with relevant representatives. So far I had only identified the work done by Microsoft to enter the scene with its OGDI initiative, aimed at positioning Azure as an open-data-friendly technology, and the occasional attendance at relevant Open Data events of, for example, Telefonica or Google, but with very low-profile activity.

As I wrote, this step forward from large technology companies, which I think has very much to do with the work of ETSI and W3C, will be a good thing if we manage to focus their interests in the right direction, that is to say Open Data, but well done. And that's what this post is about. One of the points that I, in my presentation about Euroalert, and others like Chris Taggart or François Bancilhon stressed was that the release of public sector data in Adobe PDF format is not adequate for reuse. It is clearly the best format for distributing information (reports, documents, presentations, etc.) but, like HTML itself, not for publishing datasets or machine-readable information.

Mr Marc Straat spoke from the audience to tell us how Adobe is working on PDF technology so it can evolve to be a more useful format within the Open Data context. I must admit that I did not know about the potential of PDF as a container for other types of information, and after reading the article My PDF Hammer that Marc told me about in a very pleasant conversation over lunch, I think I have a clear idea of what he meant.

I find very interesting the idea that a PDF container may associate the usual PDF file with its editable original version, whatever format it comes from: a Microsoft Word document (.doc), an OpenOffice or LibreOffice document (.odt), or whatever. If Adobe works to ensure that all the tools that convert documents to PDF embed the source file, and helps to disseminate and encourage the use of the feature, I think it would be a great step forward. And it would be excellent news if governments adopted as common practice the distribution of their reports in PDF along with the original file and datasets, using the PDF file as a container.

However, after thoroughly reading the article, the idea of using PDF as a container for open data files seems to me an even worse idea than I first thought. I really see no advantage in using a PDF container instead of a simple ZIP file to distribute XML datasets along with XSD schemas and their documentation or manuals (in PDF, of course).

On the other hand, I do see a major drawback. No programming language has native support for processing PDF files, while there are many well-known options for dealing with ZIP and, of course, XML, XSD or plain text. This means that an almost trivial data processing task, for which many well-known open source tools exist, could be turned into a problem that requires licenses and very specific knowledge, with no additional benefit for developers in exchange.
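To give an idea of how trivial the ZIP route is, here is a minimal sketch in Python that uses only the standard library (`zipfile` and `xml.etree.ElementTree`) to read an XML dataset out of a ZIP container. The archive contents, the file name `dataset.xml` and the `record` element are invented for the example; a real dataset would of course have its own structure.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_xml_records(zip_bytes, member="dataset.xml", record_tag="record"):
    """Extract one XML file from a ZIP archive and return its records as dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as xml_file:
            root = ET.parse(xml_file).getroot()
    return [
        {child.tag: child.text for child in record}
        for record in root.iter(record_tag)
    ]


def build_sample_zip():
    """Build a small in-memory ZIP standing in for a published dataset."""
    xml = (
        b"<dataset>"
        b"<record><id>1</id><title>Road works</title></record>"
        b"<record><id>2</id><title>School meals</title></record>"
        b"</dataset>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("dataset.xml", xml)
        archive.writestr("README.txt", "Hypothetical documentation file")
    return buffer.getvalue()


records = read_xml_records(build_sample_zip())
```

Validating the XML against its XSD schema is not covered by the Python standard library and would need a third-party package, but even so the whole task stays within freely available, widely known tools.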

As a conclusion, I will say that I do not believe that solutions based on PDF as a container for open data should be promoted. Considering existing tools, it is much more practical for re-users to deal with information distributed in ZIP containers. Instead, it seems a great idea to start encouraging the practice of embedding the original files and even XML datasets within PDF reports or documents to facilitate reuse.

By the way, as a Linux user, I keep waiting for a version of Adobe Acrobat Reader for my platform (x86_64). At present I am not able to open most of the files that make use of advanced PDF features such as forms, published by public authorities.

Monday, 16 May 2011

Road Blocks to a Pan European Market for PSI Reuse, a long summary

Note: This article is a translation with a few add-ons of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Obstáculos para el desarrollo de un mercado pan-europeo de reutilización, un largo resumen".

SharePSI #daa1psi
Source: ePSIplatform
On Tuesday 10th and Wednesday 11th I participated in Brussels in the workshop "Removing Road Blocks to a Pan European Market for PSI Reuse" held by the Share PSI initiative. It was superbly organized by W3C and ETSI for the European Commission, and gathered a good number of members of the European open data community: governments, businesses and civil society organizations.

The European Commission will use the output of the SharePSI workshop at the 1st Digital Agenda Assembly event: "Beyond raw data: public sector information, done well". Ultimately the contributions, like those obtained through the public consultation on Open Data held at the end of 2010, will help to make the reform of the PSI directive richer and more effective.

In my opinion, compared to other seminars, the level of the discussion was very high in most of the sessions, though there are a few topics that are recurrent in this type of event (pricing, licensing, return on investment and privacy). Clearly, this shows that we have not been able to resolve these issues satisfactorily. On a few occasions it was also clear that not everyone is at the same level of discussion, but that is entirely normal because the open data community is growing at a rapid pace and many new people are joining the discussion.

I think the seminar was very intense and productive and this was largely due to the excellent work done by the program committee and especially by Margot Dor (ETSI) and Thomas Roessler (W3C) to create the workshop programme from the large number of position papers sent from all over Europe.

As you know, I usually attend this type of event, and this time one of the things that caught my attention was the presence of representatives of large companies in the discussion. Until now it was rare for Adobe, IBM or Orange to show interest in the Open Data movement. And I strongly believe that this is a good thing, because their software and their position in IT services for governments can provide solutions that will drive the development of more effective Open Data.

I guess that their presence has much to do with W3C and ETSI. I hope they are here to stay and contribute much to the debate and the solutions, though for now they are still far from the more advanced group. However, I also believe that it is the responsibility of those of us who have long been in the debate to bring them round to the vision of what the main objective of the Big Idea is: Open Data, but well done.

I will also highlight the number of national government representatives that I could identify in the room (at least from Spain, Denmark, the Netherlands and Finland), as well as the representation of New Zealand in the person of Laurence Millar, who described to us the situation in his country, which is enviable in many respects, such as the very active community of developers they have.

I found very interesting the discussion on the pricing of meteorological datasets and the apparently long-running dispute which has now been brought into the open data ring by the Association of Private Meteorological Services (PRIMET). I think it is a good thing that this happened and that these discussions come to enrich the open data debate. There were also several new use cases that caught my attention, like the very interesting FearSquare, presented by Andrew Garbett, or the impressive Arcticweb that Erin Lynch showed us.

On the other hand, it was a pleasure to hear entrepreneurs like François Bancilhon speaking about his work at Data Publica or like Chris Taggart on his excellent Open Corporates, which I have been following for a while. The risks that people like them are taking contribute greatly to push the boundaries of what can be done, although for sure they may have to face problems, because they are disrupting the established situation. My most sincere admiration, respect and support to go ahead.

On my side, I presented the work that Euroalert is doing to develop our 10ders Information Services platform, which aggregates data on procurement notices across the EU. You can find the slides and the summary of the intervention at Euroalert Blog. I was also the moderator of the second half of the session on Use Cases, where we heard the complaints of the Federation of European Publishers about the difficulties they face in competing with what they still call the culture of free. I was surprised by their approach in the context of Open Data, which I believe is completely misleading again. I hope they will take a more positive position in the future. I was also lucky to have one of the best quotes of the event, made by Hervé Rannou, from ITEMS International, who presented the lessons learned in the Open Data project of the City of Marseille: "The use of the data is infrastructure, like roads".

On June 16th we will see at the 1st Europe Digital Agenda Assembly the most interesting outcomes and conclusions that the European Commission has harvested from this Workshop. I hope it will be useful to take firm steps forward to enable a more favourable environment for market growth based on the development of new information services. In short, for companies powered by opendata as I like to call them. I also hope that among all of them Euroalert will be a remarkable Open Data company, both because of the success of its value proposition and for our contribution to the development of this environment.

To finish this long post, though the occasion deserved it, I will leave some resources that you will find useful to dive into what was said in the workshop. I have used them to review what was said in the last two sessions, which I could not attend. I highly recommend reading the excellent collaborative notes, which faithfully reflect the discussions. You can also check out the tweet archive created by the University of Lincoln, the slides used by speakers, the list of Twitter accounts of attendees, the position papers submitted or the snaps of the event.

viernes, 24 de diciembre de 2010

Data Driven Journalism in Spain with Pro Bono Público

Note: This article is a translation with a few add-ons of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Data driven journalism en España de la mano de Por Bono Publico".

Source: http://www.slideshare.net/mirkolorenz/data-driven-adam

One of the most interesting things that happened to me in London, during the condensed course on government transparency I attended in November, was the relationships I was able to strengthen, and not only with Open Data celebrities. One of these relationships was with the people from Pro Bono Publico, the civil society association you may know for organizing the Abre Datos challenge. This contest was the first to be held in Spain to promote the reuse of public sector information.

Since I met Jacobo Elosúa, David Cabo and Álvaro Ortiz at the Open Government Data Camp 2010, where we had a good time discussing the state of open data in Spain, I have followed their activities a bit more closely. For example, last week Pro Bono Publico released an excellent contribution to the Spanish public consultation on the draft Royal Decree on the reuse of public sector information. Along with them, Fundación CTIC has also published comments and the Spanish Ministry of Industry has released a document of over 100 pages with all 26 contributions received. Of course, one of those contributions is the one that Euroalert has published on its official blog.

Another initiative the Pro Bono people are working on, and one that I consider very valuable, is the organization of a workshop on data driven journalism, aimed primarily at journalists and other information professionals. The conference is tentatively scheduled for February 15th 2011 and for the moment we are in the process of testing the real interest and gathering some feedback. I say we because, having spoken a few weeks ago to journalism students at the University of Valladolid about the challenges of open data, I'm trying to contribute what I learned from that experience. Of course, we ask for your comments and for your help to spread the word about the event. As we get more details we will release them.

The words of Sir Tim Berners-Lee seem to have been an inspiration to all of us who were attending there:
"The responsibility [to analyze datasets] needs to be with the press"
... especially for Álvaro when Sir Tim took a seat beside him.

viernes, 26 de noviembre de 2010

Lessons about Open Data at the Open Government Data Camp in London

Note: This article is a translation with a few add-ons of what I wrote in Spanish for my personal blog. You can see the original post in Spanish: "Grandes lecciones sobre open data en el Open Government Data Camp en menos de 24 horas".


After changing my agenda, last week I was finally able to attend the second day of the Open Government Data Camp held in London. In addition, invited by the Open Knowledge Foundation, I was privileged to be present at the press conference which announced the release as raw data of the database containing all UK Government spending over 25,000 pounds (about 29,000 euros).

The announcement was presented and explained by Minister Francis Maude, accompanied by Professor Nigel Shadbolt, Tom Steinberg, Rufus Pollock ... and Tim Berners-Lee, inventor of the web and advisor to the British Government. I highly recommend watching the video of David Cameron as a lesson on Open Government. This video opened the event, so we must assume that the Prime Minister "wanted to be" there.

Despite being "only" an official announcement, there were a few things that caught my attention and I'd like to share, just in case we can be inspired in Spain, and perhaps in other countries:
  • It was not a typical press conference, at least as we know them in Spain. I mean, it was not a long speech of the minister, followed by questions from reporters. Instead, it was organized as very short presentations about many important points around the matter of the announcement, both political and operational and even technological.
  • There were presentations from all the organizations that had been involved in the project and not just from the government. Representatives from mySociety, the Transparency Board and the OKF made their points. But there was also an important space reserved for independent software developers such as Chris Taggart, who had been working on a demo of the possibilities of the data released.
  • The broad technological culture shown by the members of the government, including the Prime Minister. Comparisons are odious, so I will not make any. This is an exercise for you.
  • The commitment of the UK Government to transparency and open data is remarkable. "We know this will be a very uncomfortable process within the government departments", said Maude. In fact, uncomfortable news began to circulate during the event, such as the payments to Nick Clegg's wife's law firm or the rents paid to Prince Charles by the Ministry of Justice. But the commitment is as firm as the words of Francis Maude suggest: "It is our ambition to make the UK the most transparent and accountable Government in the world"
I noticed that in the UK the work done for years by activists of all kinds (journalists, associations, developers, companies and even civil servants and politicians) has already achieved a great success: the message of openness and transparency is firmly installed at the highest political levels, across all parties. I think a good summary is these words from Rufus Pollock, founder of OKF, said half jokingly, half seriously:
"It's very encouraging to see that UK government is becoming more radical than me in terms of open data and transparency"
Thanks to the OKFN for the great work organizing the Open Government Data Camp and in particular to Jonathan and Rufus himself for encouraging me to attend the event. I hope that all the networking and interaction, together with the envy and inspiration that come from seeing the speed at which things are developing there, will allow us to COPY it in Spain and create a true Open Data community. I believe we have the grounds, but we need a little more action, less complacency ... and some more reusable raw data.

As for the OGDCamp sessions, I'll write another post, as this one is becoming a bit longer than I'd like :) I recommend you check out the hashtags #openuk and #ogdcamp and the conclusions of the working groups.

PS: On my way back to Valladolid, I read at the airport that the Government of Spain had released the draft Royal Decree implementing Law 37/2007 of 16 November, on reuse of public sector information and launched a public consultation on it. Personally I prefer the approach of less regulation and more publishing of open data as raw data, but still congratulations on the move. As soon as I can read it in detail I will try to make my contribution.

domingo, 7 de noviembre de 2010

Quick guide to Opendata EC Public Consultation, the on-line survey on the PSI Directive

The Digital Agenda for Europe lists the revision of the Directive 2003/98/EC on the re-use of public sector information (PSI Directive) among its first key actions. It is worth recalling the key role of the Spanish Presidency in including PSI reuse in the Granada Declaration, showing a commitment to the promotion of open data that we all hope will be followed by some action.

In September the European Commission opened a public consultation with the purpose of gathering views on the review of the PSI Directive from as many sources as possible. The consultation will feed into the debate on possible policy options that should be considered for the review, and will contribute to the impact assessment that will be carried out subsequently, associated with proposals for possible legislative or other measures.

As you all know a public consultation is a regulatory tool employed to seek the opinions of interested and affected groups in a certain matter. Gathering their views, opinions and contributions using the Internet both Member State administrations and EU institutions can understand the needs of citizens and enterprises better.

The PSI consultation document is published only in English but responses are acceptable in all EU languages, as it is not stated otherwise on the consultation documents themselves. So please do not think that not being fluent in English is a barrier to participating.

Whether you are a government, a public sector content holder, a commercial or non-commercial re-user or another interested party, your contribution is really important because your view will feed into the review of the PSI Directive.

The consultation includes questions divided into several blocks:

  1. the PSI re-use context and possible action to consider,
  2. substantive issues regulated by the PSI Directive,
  3. practical measures,
  4. changes that have taken place and barriers that still exist,
  5. other issues to comment regarding the review of the PSI Directive.
It will take you some time, perhaps over 30 minutes, to make a good contribution, although it is stated that the survey takes only 15 minutes to complete. But it is worth the effort. You can answer the online survey on the PSI Directive, but I recommend having a look at the pdf version of the consultation document first, in order to have a complete view before answering.

It is also worth noting the commitment that the Commission Vice-President for the Digital Agenda, Neelie Kroes, is showing to the open data community. As she pointed out, much of Europe's PSI is insufficiently exploited or sometimes not exploited at all, which means missing out on a great opportunity to generate innovation.

The replies to this consultation will be published on the Commission's PSI web site. The consultation will run until 30 November 2010 and, 3 weeks before closing, the EC has gathered about 350 responses, which is not much compared with the importance of the matter. So please contribute your views to build a more innovative Europe.

jueves, 10 de junio de 2010

The Web of Data is here to stay... and will be full of Public Sector Information

Yesterday I was speaking at the PSI Meeting 2010, which in my opinion was a great success for the whole PSI community, a very important part of the open data movement. Thanks to excellent organization by the Spanish Proyecto Aporta and the European Commission's project ePSIplatform, many people interested in the reuse of PSI from around the European Union, and even from Hong Kong, met in Madrid.

It was a very intense day, with 19 thoughtful speeches made by people coming from public administrations (including the European Commission) and platforms, companies that reuse information and trade associations representing their interests, and civil society organizations. I was especially impressed by Catherine Lippert's honest description of how some civil servants face the unlocking of their own data, by Daniel Dietrich's didactic description of the open data picture and its issues, and by the inspiring examples and replies given by Chris Taggart.

But I am also worried about the strength that the business case trend is gaining. Many voices, including the European Commission, are asking to measure something that cannot be measured for the moment. I would say that if a company is willing to make an investment to develop a business model based on the reuse of a certain PSI, that should be enough proof of usefulness. If it doesn't work, the only one who will lose is the company that is taking the risk. The information is being produced anyway and we should not ask the public bodies to over-engineer the formats or to build APIs that are easy to use but expensive to run. At least for the moment and at least during the economic downturn. That can be part of the investments the companies can make. So please, RAW DATA NOW!

With these conditions, the business case will build itself in a few years for very little money. Because what we know for sure is what Marc de Vries said about the direct relation between access and reuse: "No access, no reuse". And information industries need the raw material that raw data represents in order to innovate, especially as a stimulus for the recovery. It is not that difficult and I really think it is not that expensive. It is more a matter of will on the part of the information holders.

There were many take-aways from all the presentations and round tables, so I recommend reading the tweet stream of the PSI Meeting 2010, where you will find many of them. You will discover that many of the speakers are also twitterers and you can follow their activities on open data. And that in the public sector, even in Spain, there are many people like Alberto Ruiz de Zárate or Amalia Velasco who can make things happen.

For me the best part was to see that there is a great group of brilliant people who are really committed to the objective of making Public Sector Information open and reusable for the benefit of citizens and businesses. And as usual in big causes, the best of them are volunteers. For the first time since I started following the open data community I really have the feeling that we are not very far from a big domino effect (as Catherine described it) that will make great things happen... despite the business-casers.

jueves, 3 de junio de 2010

I presented my book on the same stage where Vinton Cerf and Bert Bos have spoken... Wow!

Yesterday afternoon in Oviedo I presented my book "Web 2.0: Una descripción muy sencilla de los cambios que estamos viviendo". As I told you a few weeks ago, the School of Computer Science of the University of Oviedo, through its director Jose Emilio Labra, gave me the chance to hold the event within the framework of one of the most prestigious Master's programmes in Web Engineering in Spain. As you know, the region of Asturias has built an innovation hub around web technologies, centred on the W3C Office in Spain (hosted at Fundación CTIC) and the University of Oviedo (especially the EUITIO).

The afternoon began with a very interesting chat with Chus Neira, a journalist at the newspaper La Nueva España, who published this interview, which captures very well the essence of a conversation in which we covered in great depth the lights and shadows of the web (intellectual property, new generations, censorship, privacy, crime, etc.). When the person in front of you knows as much as or more than you about the subject, it is easy for the interview to go well.

For the presentation, having such a specialized audience was both a challenge and an added source of pride since, to give you an idea, I had to overcome the stage fright of speaking in the same room where presentations have been given by Vinton Cerf, one of the fathers of the Internet, and Bert Bos, the creator of CSS, the language used to style web pages.

As always, here are the slides I used to illustrate the presentation, in which I gave a short tour of the main ideas I develop in the book:
Web 2.0 Una Descripcion Sencilla de los cambios que estamos viviendo

Like all the titles in the Pocket Innova collection, the book can be bought online on the BibloWorld platform or, of course, on Amazon. As part of its commitment to bringing its content to all platforms and in all formats, the publisher Netbiblo also sells it as an e-book, which you can buy whole or by chapter.

After the presentation, and as an appetizer for next week's PSI Meeting 2010, where I will take part in panel 2, I gave a talk on Open Data and its business models. It was aimed mainly at the Master's students, so I tried to show them the vision and the real use that a company like Euroalert makes of technologies such as Linked Data, in which they are specialists. Here is the presentation, which already includes a correction suggested by one of the attendees (thanks to Carlos Tejo):
Euroalert Open Data Business Models

After both the book presentation and the Open Data talk I had a lively exchange of views with the attendees, from which I, at least, learned a great deal. We talked about the future of free software, the risks of losing privacy, and the need to educate web users better.

In short, it was a real privilege, for which I have to thank the publisher Netbiblo and above all the EUITIO for offering me such a pleasant setting to present the book. And of course many thanks to all the attendees, among whom I was also able to meet again an old friend, Jose Manuel González Corral.

lunes, 24 de mayo de 2010

Presentation of the book “Web 2.0: Una descripción muy sencilla de los cambios que estamos viviendo”

My first book is now available on Amazon. It is titled "Web 2.0: Una descripción muy sencilla de los cambios que estamos viviendo" and has been published in the Pocket Innova collection by the publisher Netbiblo. Used as I am to writing proposals, projects, blog articles, tweets, talks or courses, the experience of communicating through a book has been quite a challenge. Personally it has been very enriching, above all because it deals with a subject that, as you know, I am passionate about, and which is none other than the most successful invention in the history of humankind, the Web. Writing this book has allowed me to spend some time reflecting on the way many aspects of our lives are changing. That is precisely the purpose I hope it serves for readers: to better understand the changes into which we are being swept at dizzying speed by the incredible equalizing instrument that is the Web.

Although the title carries the hackneyed "2.0", I have tried to go over the last 15 years of the Web looking for the origin of the radical transformations that are forcing us to rethink several concepts that had been firmly established in our culture for centuries. It has not been at all easy to analyze with some perspective events that at best happened only a few years ago, or that were even happening while I was researching the chapters, but I hope the effort has been worthwhile.

The format of the Pocket Innova collection, for which, as you already know, I also act as editorial coordinator together with Juan Vicente García, aims to present innovation-related topics simply and rigorously, mainly to executives, academics, students and company technical staff. I have therefore occasionally allowed myself some licence, such as using the terms free software and open source software interchangeably, in order to keep the language clear and direct.

I have also had the honour of a foreword by Jose Emilio Labra, director of the EUITIO, with whom I had already had the opportunity to co-write a chapter for the book "Web 2.0: The business model" (Springer).

The book presentation will take place next Wednesday, June 2nd, at 18:00 in the assembly hall of the School of Computer Engineering of the University of Oviedo, C/ Valdés Salas, S/N, and of course all of you who are able to come are invited.

Alongside the book presentation I will give the talk “Open Data: modelos de negocio de la reutilización de información del sector público” (Open Data: business models for the reuse of public sector information), aimed mainly at students of the Master's and Doctorate in Web Engineering at the University of Oviedo, although anyone interested in the emerging Web of Data is welcome to attend.

The talk will have some points in common with the one I will give in Madrid on June 9th at the PSI Meeting 2010, whose provisional title is: Is data being released to foster innovative business models or ... is it just for us to play with on online maps? RAW DATA NOW!!! However, it will be much more formal in tone, as befits a forum such as the University of Oviedo, and in it I will review the different concepts involved in opening up public sector information, with particular attention to how to build innovative business models and thereby generate wealth for our battered economy. See you in Oviedo.

viernes, 21 de mayo de 2010

Weirdest reason against open data in Spain

While I was working on my presentation for the PSI Meeting 2010, where I will be on 8th June representing Euroalert, I remembered the weirdest argument I've ever heard against Open Data. I'd like to share it with you because it is a really good (though disappointing) example of what many public organizations might be thinking about their data.

I was in a meeting with a public authority, trying to reach an agreement that would allow Euroalert to re-use public information they manage and that is not open yet. Well, they really think it is open data because it is published on their website. So, again, I had to explain that webpages or PDFs are not true machine-readable formats. But this misunderstanding is quite usual and surely you have heard it many times. What really knocked me down was what came after several other evasive and poorly founded arguments. It was something like this:

"If we unlock the data and you (and other companies) develop a service that improves the features we are providing for free, that will harm small companies because they might not be able to buy it"
I cannot be sure if they really believe that by keeping data locked they are helping anyone, but I had the feeling that they were against the idea of companies making money with public data. Has anyone had this type of discussion? I would be very interested in sharing views on the subject.

By the way, in my presentation at the roundtable "Turning the reuse of public sector information into new business models and innovative services", I will be talking about the difficulties the open data movement is facing when it comes to fostering new and innovative business models. (More insights on the matter from the Chamber of Commerce of Stockholm at ePSIplatform)

My point is that on the one hand we have the Linked Data dreamers and on the other hand we have to deal with PDFs and HTML with partial data and civil servants who fight against innovation. And we will lose most of the potential of PSI reuse if more action is not taken. Otherwise Open Data will be just a fancy playground with old datasets running over online maps. Beautiful but not very useful for companies.

But this discussion is not today's objective. I will publish the slides in Open Economy and the Euroalert team will probably cover the event in detail on the blog. In the meantime say with me: RAW DATA NOW!!! (I really recommend Sir Tim's TED talk on open data and the next web)

sábado, 3 de abril de 2010

Open Data Movement: Free our Data

For several months I have wanted to talk to you about the Open Data Movement, especially after my participation in November in FICOD 09 representing Euroalert.net, in the roundtable about creating value through the reuse of public sector information.

During the past few months we have been living through a kind of Open Data Rush that started with Obama's promise of government transparency, which in terms of data openness led, among other things, to the pioneering Data.gov initiative. Since then several countries, mainly English-speaking ones, have launched their own initiatives. You can check the New Zealand Open Data Catalogue or the Australia Catalogue, and also individual cities like the Open Toronto or New York City open data websites. There is a good repository of governments that are opening up their data vaults around the world at The Guardian Open Data Platform or at the Fundación CTIC website.

In the European Union we have had, since 2003, the Directive 2003/98/EC on the Re-use of Public Sector Information (PSI), which has already been transposed into the national laws of all the members. That means, in simple words, that the EU27 Member States are required by law to promote open data, although to date only the United Kingdom seems to be making a significant effort. Prime Minister Gordon Brown, who is being advised by Sir Tim Berners-Lee and Professor Nigel Shadbolt, has talked in several speeches about his aim of turning the UK into a world leader in making government data more accessible to the public. This commitment to the Open Data Community, considered an important element of Building Britain’s Future, has led to the Data.gov.uk website for the promotion of the reuse of UK public data and to the announcement of the creation of "The Institute of Web Science", initially funded with £30M.

In Spain, we are quite far from leading anything on the web, although the effort made by Proyecto Aporta, a small initiative at the Ministry of Industry with very scarce resources that I introduced to you a few months ago, is remarkable. For example, in March 2010 Proyecto Aporta launched a beta version of a catalogue of public information in Spain, which sadly contains very little raw data and is therefore of little use for reuse by companies or individuals.

There is no strong political support from Spanish authorities, and Spain is going to lose another opportunity to improve its performance in digital innovation and to participate in the major changes in economy and society we are living through. I cannot understand why an initiative that would have such a huge positive impact on innovation is not being pushed firmly in our country. Open Data is a very cheap investment for a Government and the main reasons that are pushing forward these initiatives around the world are purely economic. The European Commission estimated in 2006 that the overall market size for the reuse of Public Sector Information in the EU is 27,000 million euros (0.25% of the total aggregated GDP for the EU).

In order to be up to date, I recommend the EPSIPlatform, Europe's One-Stop Shop on Public Sector Information (PSI) Re-use, where you will find, among other useful resources, the best tracking I know of all news, announcements and moves in the Open Data World. You can also follow EPSIplatform on Twitter. If you want to get involved I suggest following The Open Knowledge Foundation (OKFN) projects around any kind of information that can be freely used, reused, and redistributed.

Open Data might be a drop in the ocean of the economy, but it is a really cheap move, it is easy for public authorities, there are no major drawbacks, it does not hurt any industry and there is a lot of political return on top of the economic one, as transparency is a hot topic for citizens. So please get moving! Companies need raw data for developing new and innovative products! Free our data!

domingo, 21 de febrero de 2010

In Dublin speaking about the web as a way to improve market intelligence in the Open Economy

Tomorrow I will be flying to Dublin, where I will be speaking at the "SMART CONSTRUCTION - DOING BUSINESS IN THE NETHERLANDS" seminar, organized by Enterprise Ireland, the Government's agency for the development and promotion of the Irish business sector.

The seminar will take place at Enterprise Ireland HQ in Dublin, next Tuesday February 23rd 2010, from 08:45, when registration starts, to 15:30, when the pre-booked one-to-one meetings are expected to end. This event is one of the activities organized for the Ireland Construct community, created with the objective of helping Irish construction companies achieve strong positions in the global market, and in this case it will focus on the Dutch construction industry.

I have been kindly invited by EI in my position as CEO of Gateway SCS, which as you know owns the internet property Euroalert.net, a well-known brand in Europe in the field of public procurement information services.

I will be speaking around noon and I will share the morning panel with several experts in the Netherlands building environment sector:

I will try to contribute Euroalert.net's expertise in helping individual SMEs across the European Union to compete with the biggest companies, by providing more accurate and faster information on tenders in Europe, or with market intelligence tools on public procurement opportunities like 10ders Observatory.

In my presentation "Digging business leads out of the internet", I will mix the concepts of public procurement as an affordable way for SMEs to export, and the levelling factor of the web, which brings more cost-effective tools to market intelligence on public procurement. The web is reshaping every aspect of our lives and businesses have the opportunity to perform better in this Open Economy by using improved tools to do old tasks. I hope Euroalert's modest contribution will help with useful tips in this at-the-gate-of-recovery scenario. Please feel free to review and comment on the full presentation:
Digging business leads out of the internet

If you are in Dublin these days, please contact me, so perhaps we can share a pint of Guinness in the evening ;-)