Usuario:Gabi/GSoC2012/Mejora backend nodos CNML

From Guifi.net - Wiki Hispano

Introduction

My proposal aims to improve the backend information model of community-based telecommunications networks. This would provide better ways to organize, share and use the information, and would align their infrastructure with the principles of open data.

Community-based telecommunications networks such as Freifunk, WlanSlovenija, AWMN and Guifi.net are growing steadily. Their underlying ideas are strongly rooted in open source philosophy. They share common values and practices, such as reducing the digital divide and lowering the barriers to accessing information.


People participating in such networks have many tasks to do, such as finding potential nodes to connect to, searching for available services in the network, monitoring network status, and many others. It is important that the network's information is available to tackle these tasks. Furthermore, it is better if the information is well organized in accessible formats that can be used to build applications on top of it.

At the bottom layer, open routing and network protocols ensure interoperability among networks, making it possible for machines to communicate with each other, but standards for sharing information at higher layers are more recent.


From the information perspective we find two problems.

Each network stores its own information using its own data model, describing the different aspects of the network that it considers important.

Fortunately, they joined efforts to develop a standard called CNML (Community Network Markup Language). This standard has been implemented in some networks, such as Guifi.net (Spanish), and others like Freifunk are working on it [3].


Each of them stores information in its own database, which sometimes makes it difficult to access that information directly, so each network ends up looking like a 'data silo'.

This is the main part of my proposal. It involves migrating the current XML-based Community Network Markup Language (CNML) to an ontology-based model built on RDF/OWL and a triplestore such as Virtuoso or Sesame.


Project goals

  • Set up a Protégé server for collaborative ontology definition, so other people could also take part and use it in the future. Possibly on Guifi servers.
  • Design an ontology based on CNML and generate an output in one or more of these formats: RDF/OWL/Turtle.
  • Populate the ontology with Guifi.net CNML information.
  • Propose and install an infrastructure to run the semantic backend.
  • Create queries so it would be possible to retrieve data from the triplestore.
  • Document the process so people at other community networks could see the solution and follow similar steps.
  • Learn more about RDF/OWL/SPARQL!

Future developments

  • Set up web services to access data.


Implementation

To set up the server for collaborative ontology definition, I would install a Protégé server with the collaboration plugin, so it could be used by people in different locations.

I have experience setting up a Protégé server with the collaboration plugin.

Once the ontology is defined, the output of this first step would be an RDF/OWL/Turtle file.

I would then create the necessary scripts to populate the ontology with Guifi.net information. This would probably be done in Python.
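A minimal sketch of what such a population script could look like, using only the Python standard library. It reads a simplified CNML-like fragment and emits N-Triples lines. The namespace `http://example.org/cnml#`, the class and property names (Zone, Node, Service, inZone, hostedBy) and the exact XML element and attribute names are assumptions for illustration only; the real ones would come from the CNML specification and the ontology defined in the first step.

```python
import xml.etree.ElementTree as ET

# Hypothetical ontology namespace; the real one would be agreed on with the community.
CN = "http://example.org/cnml#"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

# Simplified CNML-like sample; element and attribute names are assumptions.
SAMPLE_CNML = """<cnml>
  <network>
    <zone id="1" title="Madrid">
      <node id="101" title="MedialabPrado" lat="40.4106" lon="-3.6936">
        <service id="7" type="ftp" title="Example FTP"/>
      </node>
    </zone>
  </network>
</cnml>
"""


def literal(value):
    """Escape a string as a plain N-Triples literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe(uri, class_name, element, attributes):
    """Yield one rdf:type triple plus one literal triple per attribute present."""
    yield f"{uri} {RDF_TYPE} <{CN}{class_name}> ."
    for name in attributes:
        value = element.get(name)
        if value is not None:
            yield f"{uri} <{CN}{name}> {literal(value)} ."


def cnml_to_ntriples(xml_text):
    """Return N-Triples lines for every zone, node and service in the XML."""
    root = ET.fromstring(xml_text)
    lines = []
    for zone in root.iter("zone"):
        zone_uri = f"<{CN}zone/{zone.get('id')}>"
        lines.extend(describe(zone_uri, "Zone", zone, ["title"]))
        # Only nodes that are direct children of this zone.
        for node in zone.findall("node"):
            node_uri = f"<{CN}node/{node.get('id')}>"
            lines.extend(describe(node_uri, "Node", node, ["title", "lat", "lon"]))
            lines.append(f"{node_uri} <{CN}inZone> {zone_uri} .")
            # Services may sit at any depth below the node.
            for service in node.iter("service"):
                service_uri = f"<{CN}service/{service.get('id')}>"
                lines.extend(describe(service_uri, "Service", service, ["title", "type"]))
                lines.append(f"{service_uri} <{CN}hostedBy> {node_uri} .")
    return lines


if __name__ == "__main__":
    print("\n".join(cnml_to_ntriples(SAMPLE_CNML)))
```

N-Triples is used here only because it is easy to generate line by line and most triplestores can bulk-load it; a library such as rdflib could produce Turtle or RDF/XML instead.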

Possible triplestores to consider include OpenLink Virtuoso and Sesame, which has good Python integration.

Queries would use SPARQL. As an example, a query could ask for all the nodes in a certain region, or for specific services such as FTP or VoIP servers.
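The two example queries mentioned above could be sketched roughly as follows. This only builds the SPARQL text in Python; sending it to a triplestore endpoint is left out. The `cn:` prefix, its namespace and all class and property names (Zone, Node, Service, title, inZone, hostedBy, type, lat, lon) are hypothetical placeholders, since the ontology has not been defined yet.

```python
# Hypothetical namespace; the real ontology IRI is not defined yet.
PREFIX = "PREFIX cn: <http://example.org/cnml#>"


def sparql_string(value):
    """Quote a Python string as a SPARQL string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def nodes_in_zone_query(zone_title):
    """Build a query listing all nodes that belong to the named zone."""
    return f"""{PREFIX}
SELECT ?node ?title ?lat ?lon
WHERE {{
  ?zone a cn:Zone ;
        cn:title {sparql_string(zone_title)} .
  ?node a cn:Node ;
        cn:inZone ?zone ;
        cn:title ?title ;
        cn:lat ?lat ;
        cn:lon ?lon .
}}
ORDER BY ?title"""


def services_of_type_query(service_type):
    """Build a query listing all services of one type and the node hosting each."""
    return f"""{PREFIX}
SELECT ?service ?title ?node
WHERE {{
  ?service a cn:Service ;
           cn:type {sparql_string(service_type)} ;
           cn:title ?title ;
           cn:hostedBy ?node .
}}
ORDER BY ?title"""


if __name__ == "__main__":
    print(nodes_in_zone_query("Madrid"))
    print()
    print(services_of_type_query("ftp"))
```

Keeping the queries as small parameterized functions would let an application ask for any region or service type without hard-coding a file URL.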

For the documentation, it would be good to ask the other people involved where the best place to write it would be, probably the interop wiki.


Project/proposal schedule

24-29 April:

Hello World! Identify the people involved in the CNML specification and the Guifi.net backend and tell them about the project.
(I have already started.)
Start documenting the project on the Guifi.net wiki and open a blog to post about the day-to-day process.

30 April-13 May

Learn more about ontology definition, RDF and OWL. Start studying the CNML specifications.

14-21 May

Read about triplestores to find the one that fits best.

22-27 May

Choose and install the triplestore and the other libraries that will be used.
Set up a platform for building collaborative ontologies. Install Protégé and its collaborative extensions.

28 May-8 July

Create the ontology based on the CNML standard. Ask the mentor and other partners for feedback, using collaborative Protégé.

10-22 July

Search for methods to transform CNML information to RDF/OWL.
Create the scripts, possibly in Python, for the migration.

23-29 July

Load the data from Guifi.net nodes into the triplestore. Start creating some example SPARQL queries.

30 July-5 August

Continue with the SPARQL queries.

6-12 August

Take a second look at the documentation and maybe at other nice features, such as visualizations with Processing or
KML export (open week for more fun and creativity :)

13-19 August

Rest or finish work if necessary.

About me

My name is Gabriel Lucas. I study computer science at UC3M in Madrid and TUM in Munich. I am also currently working at Medialab-Prado.

It is a very exciting place where I do different things, such as leading projects for the digital façade and researching topics such as video streaming, open hardware, digital art, cultural center archives, licenses, and more philosophical themes such as those around the concept of the commons.

I have also been involved in a project to create a semantic archive for a medialab.

Alejandro Martín and I started the group of Guifi.net Madrid.

I also started another group about open video called Videoframesh.


Availability

How many hours per week can you spend working on this?

I think I can work on the project around 15-20 hours a week.


What other obligations do you have this summer?

I will also have to work at Medialab-Prado. It's a full-time job but pretty flexible. If I get selected, I can arrange to have time for this project.


How do you plan to continue with your project/proposal and within the wlan slovenija community after GSoC?

Once the project is over, we could propose that wlan slovenija also migrate their information in the same way. That would be very nice, in fact, because we would then be able to query both networks.


Are you interested in doing some research in this field?

My final degree project is about linked data and network visualization for the metro public transportation system of Madrid. I am working on it right now.

I am interested in explaining to people how Internet works.


Benefits to the Free Software Community, who would gain from your project?

There are many benefits. The clearest one is that information quality would improve, providing better ways to use the information to build applications on top of it.

No more screen scraping would be needed. Let me give an example.


There is a Guifi.net Android application to search for nodes. It overlays the nodes on the camera image so you know how to align the antennas.

The problem with this app is that it takes the information from a GML file. The main developer is from Barcelona, so the default GML file is the one for Barcelona.

If you live anywhere else, you have to search for the URL of the file for the place you live in. Supporting SPARQL queries would solve this, since the user could simply choose where he/she lives.

By the way, fortunately Guifi.net exports GML and CNML files; otherwise screen scraping would be needed.
