So, this is a copy of my node on RDF on Everything2. I'm not sure how accurate it all is - and I welcome corrections in comments - but it is what I've picked up over time. There's a pretty significant lack of anything related to RDF on E2, so most of the links in this entry are as yet incomplete - they don't actually link anywhere with information - although I hope to flesh at least a few of them out. Since E2 doesn't allow external links, there's no quoting of sources. Sorry for those of you who think that this is a bad thing.
This entry is geeky. It is filed under the "Technical->FOAF" category in my blog. This is the technology that powers the bot described in an earlier post.
RDF is a TLA for Resource Description Framework. It is designed to allow description of resources, specifically those available on the World Wide Web. The technical specifications for RDF are controlled by the W3C, a standards body which is using RDF to further the goal of creating a semantic web, in which computer agents can understand the meaning of the content stored in webpages, rather than just being able to display the content for human consumption.
RDF is, first and foremost, a data model: a way to describe information so that computers can understand it. All data in RDF takes the basic form of triples: each statement contains an object which is related to a subject by a predicate.
- Subject: the item being described.
- Predicate: a URI naming the relationship between the subject and the object.
- Object: the item or value describing the subject.
A combination of these statements creates a graph of data, which can be interconnected or not. This method of modeling makes creating descriptions simple.
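To make the triple model above a little more concrete, here is a minimal Python sketch that stores statements as tuples in a set. It is only an illustration: the example.org URIs are made up, the FOAF predicate URIs are the only real vocabulary terms, and a real toolkit would distinguish URIs from literal values rather than using plain strings for both.

```python
from collections import namedtuple

# One RDF statement. In this sketch every term is a plain string.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical resource URIs, invented for this example.
JIM = "http://example.org/people/jim"
JOE = "http://example.org/people/joe"
# Predicates borrowed from the FOAF vocabulary.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# A graph is just a set of statements, so a repeated statement collapses.
graph = {
    Triple(JIM, FOAF_NAME, "Jim"),
    Triple(JIM, FOAF_KNOWS, JOE),   # this statement links two resources
    Triple(JOE, FOAF_NAME, "Joe"),
    Triple(JIM, FOAF_NAME, "Jim"),  # duplicate of the first statement
}


def describe(graph, subject):
    """Return every (predicate, object) pair stated about one subject."""
    return sorted((t.predicate, t.object) for t in graph if t.subject == subject)


for predicate, obj in describe(graph, JIM):
    print(predicate, "->", obj)
```

The "knows" statement is what makes this a graph rather than two separate records: its object is itself the subject of another statement.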
RDF data can be serialized in a number of ways. The most common serialization is RDF/XML. However, several others are in use: Turtle, Notation3, and N-Triples. The serialization does not influence the content, only the manner in which it is written down.
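As a rough illustration of that last point, the sketch below writes the same single statement in two ways: once in the style of N-Triples (every URI spelled out in full) and once in the style of Turtle (URIs shortened with a prefix). The `Literal` class and both writer functions are hypothetical helpers written for this entry, not part of any toolkit, and they only handle the simple cases shown here.

```python
class Literal(str):
    """Marks a plain string value so it is not mistaken for a URI."""


def term(value):
    """Write one term in N-Triples form: "quoted" literal or <URI>."""
    if isinstance(value, Literal):
        escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                        .replace("\n", "\\n").replace("\r", "\\r"))
        return '"' + escaped + '"'
    return "<" + value + ">"


def to_ntriples(triples):
    """One fully spelled-out statement per line."""
    return "\n".join(
        "%s %s %s ." % (term(s), term(p), term(o)) for s, p, o in sorted(triples)
    )


def to_turtle(triples, prefixes):
    """The same statements, with URIs shortened by declared prefixes."""
    def short(value):
        if not isinstance(value, Literal):
            for name, uri in prefixes.items():
                if value.startswith(uri):
                    return name + ":" + value[len(uri):]
        return term(value)

    lines = ["@prefix %s: <%s> ." % (n, u) for n, u in sorted(prefixes.items())]
    lines += ["%s %s %s ." % (short(s), short(p), short(o))
              for s, p, o in sorted(triples)]
    return "\n".join(lines)


# Hypothetical subject URI; the predicate is FOAF's "name" property.
triples = {
    ("http://example.org/people/jim",
     "http://xmlns.com/foaf/0.1/name",
     Literal("Jim")),
}

print(to_ntriples(triples))
print(to_turtle(triples, {"foaf": "http://xmlns.com/foaf/0.1/"}))
```

Both outputs describe exactly the same triple; only the spelling differs.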
RDF uses namespaces to allow expansion of the facts which can be described in RDF. Anyone can, using terms from RDF Schema and OWL, create their own RDF namespace to describe a new topic. The terms in these namespaces can then be used as classes and properties. In addition, OWL allows information about these properties to be included, such as whether the property is an InverseFunctionalProperty.
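A small sketch of how this might look in practice, again in Python with plain tuples. The `ex` namespace and its terms (`Blog`, `author`, `feedAddress`) are invented for this example; the rdf, rdfs, owl and foaf namespace URIs are the real ones. A prefixed name is simply shorthand for a namespace URI with a local name appended.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/vocab#",  # hypothetical namespace for a new topic
}


def expand(qname):
    """Turn a prefixed name such as 'foaf:name' into a full URI."""
    prefix, _, local = qname.partition(":")
    return PREFIXES[prefix] + local


# Declaring a vocabulary is itself done with triples.
vocabulary = {
    (expand("ex:Blog"), expand("rdf:type"), expand("owl:Class")),
    (expand("ex:author"), expand("rdf:type"), expand("owl:ObjectProperty")),
    # An inverse functional property identifies its subject uniquely.
    (expand("ex:feedAddress"), expand("rdf:type"),
     expand("owl:InverseFunctionalProperty")),
}


def is_inverse_functional(vocab, prop):
    """Check whether the vocabulary declares prop as inverse functional."""
    return (prop, expand("rdf:type"),
            expand("owl:InverseFunctionalProperty")) in vocab


print(expand("ex:feedAddress"))
print(is_inverse_functional(vocabulary, expand("ex:feedAddress")))
```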
RDF can be manipulated by a number of tools. There are RDF query languages, RDF toolkits, and more, developed both by the W3C and by external organizations. There are tools available for parsing, storing, and manipulating RDF data. One such toolkit is Dave Beckett's Redland. These tools are designed to make it simple to work with RDF.
Because RDF's data model is a graph, merging datasets is simple. Combine the two sets of data, remove duplicates, and then work at creating consistency with the remaining data. For example, if I have a data set that says "Jim has a gender of male" and Joe has a data set that says "Jim is a cool guy", there is no way to determine that those two Jims are the same person. If both data sets list a matching email address for Jim, however, it becomes obvious that the two Jims are the same, and the statements describing each Jim can be combined to describe a single, merged Jim.
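The Jim example can be sketched in a few lines of Python. This is a simplified illustration, not how any particular toolkit does it: the node labels `_:jim1` and `_:jim2`, the email address, and the `isCool` property are made up, and the function only does a single pass, so it does not follow chains of matches. The gender and mbox properties are real FOAF terms.

```python
MBOX = "http://xmlns.com/foaf/0.1/mbox"
GENDER = "http://xmlns.com/foaf/0.1/gender"
IS_COOL = "http://example.org/vocab#isCool"  # hypothetical property

# My data set and Joe's data set, each with its own label for Jim.
mine = {
    ("_:jim1", GENDER, "male"),
    ("_:jim1", MBOX, "mailto:jim@example.org"),
}
joes = {
    ("_:jim2", IS_COOL, "true"),
    ("_:jim2", MBOX, "mailto:jim@example.org"),
}


def merge(a, b, identifying=(MBOX,)):
    """Union two graphs, then unify subjects sharing an identifying value."""
    combined = a | b  # set union removes exact duplicates
    canonical = {}    # (predicate, value) -> first subject seen with it
    rename = {}       # later subject -> the subject it should become
    for s, p, o in sorted(combined):
        if p in identifying:
            keeper = canonical.setdefault((p, o), s)
            if keeper != s:
                rename[s] = keeper
    # Rewrite every statement to use the surviving label.
    return {(rename.get(s, s), p, rename.get(o, o)) for s, p, o in combined}


merged = merge(mine, joes)
for statement in sorted(merged):
    print(statement)
```

After the merge, all three facts (gender, coolness, email address) hang off one node. Without the shared email address the function would leave the two Jims separate, which is the safe default.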
RDF is a powerful technology, although in many cases it is still incomplete and not ready for public use. It can be serialized in a variety of ways, and manipulated with a variety of tools. Unlike other data modeling solutions, RDF can describe a number of things without relying on a tree format for information. This helps when merging distinct data sets, among other things.