My friend and future colleague Pieter alerted me to an interesting challenge hosted on Belgian Television: a Wiki Wiki Challenge. The assignment is simple: find the link between two pages on the Dutch Wikipedia. How difficult can it be? Well, pretty difficult if you look at the assignments:
- Nicole Kidman -> TO -> Hugo Claus
- De ring van Brussel -> TO -> Thuis
- Okselhaar -> TO -> Postmodernisme
- Henk Rijckaert -> TO -> Chuck Norris
Importing nl.wikipedia.org into Neo4j
To calculate the paths between these pages, we first had to import the Wikipedia articles into Neo4j. We had done this once before using Graphipedia, a very easy-to-use piece of software that converts the downloaded Wikipedia archive into a Neo4j database. It generates a 2.0 store format, but that's an easy upgrade to the 2.2M04 that I have running on my machine. The import took less than 30 minutes on this machine. So then we could start playing around...
Just quickly browsing through the "Pages" (a label in this database) and the "Links" (a relationship type in this database) gave us a feel for the assignment within Neo4j.
All we needed to do was grab the two nodes in each Wiki Wiki Challenge assignment and run a shortest path algorithm between them, which is conveniently part of the Cypher query language as the shortestPath function. That would be it, so let's try it out.
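To give a feel for what a shortest path search does, here is a minimal Python sketch of a breadth-first search over a tiny link graph. It is only an illustration of the idea, not how Neo4j implements it internally, and it is not the Cypher we ran. The `shortest_path` function, the `toy_links` dictionary and its single-letter page titles are all made up for this example, and the sketch assumes links are followed only in the direction they point.

```python
from collections import deque


def shortest_path(links, start, goal):
    """Return one shortest path from start to goal as a list of titles, or None.

    `links` maps a page title to the titles it links to (hypothetical structure).
    """
    if start == goal:
        return [start]
    # parent[x] is the page from which x was first reached
    parent = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for neighbor in links.get(page, ()):
            if neighbor in parent:
                continue  # already reached via a path that is at least as short
            parent[neighbor] = page
            if neighbor == goal:
                # Walk back through the parents to rebuild the path
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None  # goal cannot be reached from start


# Toy link graph with placeholder titles (not real Wikipedia data)
toy_links = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

print(shortest_path(toy_links, "A", "F"))
```

Because the search expands pages level by level, the first time it reaches the target it has found a path with the fewest possible hops. In this toy graph there are several equally short routes from "A" to "F", and the sketch simply returns the first one it finds.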
Answering the Challenge Questions
Let's go through the four questions:
1. Nicole Kidman -> TO -> Hugo Claus
The query for this is pretty easy:
which gives you this result. So apparently there are quite a few paths possible:
If we just limit it to one result
you quickly figure out that we are going from
We can do the same with the next assignment:
2. De ring van Brussel -> TO -> Thuis
Same kind of query
and the result runs through the city of Dilbeek, where there is a horse riding stable called "Hof ter Smissen" that is often featured in the TV show Thuis.
Again: job done! Onto the third (bizarre) assignment!
3. Okselhaar -> TO -> Postmodernisme
What is the link between armpit hair and postmodernism? We always wanted to know. A quick query later, we see the answer:
This answer is obviously a lot less trivial:
Here you can see that result in the Neo4j browser:
And another one bites the dust! One more to go!
4. Henk Rijckaert -> TO -> Chuck Norris
Probably the most telling and funniest link is the one between Henk Rijckaert and Chuck Norris. We only need three hops:
Look at it over here:
That was it. Easy peasy, and very cute to do, even though pathfinding over a graph is also a fantastic fit for Neo4j in far more serious settings. People like TomTom, the Belgian Railroad, UGent, and many others use this capability for very serious use cases, and it is so, so powerful.
If you want to learn a bit more about similar use cases on Wikipedia, in English, then please take a look at WikiDistrict, developed by Kernix using Neo4j.
Hope you found this as interesting as we did.
Rik & Pieter