According to Wikipedia (page: Linked Data), Tim Berners-Lee coined the term "linked data" in 2006 in a document about the Semantic Web project. Nonetheless, Wikipedia goes on to cite Bizer, Heath and Berners-Lee's 2009 paper, "Linked Data: The Story So Far", as the source for its opening definition of Linked Data:
"a method of publishing structured data so that it can be interlinked and become more useful through semantic queries. It builds upon standard Web technologies such as HTTP, RDF and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried."
Why is Linked Data relevant to the PoMS project? First of all, I believe Linked and Open Data principles are particularly relevant to PoMS because PoMS is a prosopography and, generally speaking, published prosopography is an almost ideal kind of research to express as linked data. There are two senses in which prosopography connects with linked data's central principles. First, because a prosopography aims to establish the identity of its historical persons across multiple historical sources, these identified historical people act, by their very nature, as a kind of interlinking between those sources. Second, a prosopography is, at least potentially, a global object: something used by other researchers throughout the world as a source of identities for historical people. The people-as-entities in a prosopography ideally have a global reach and can thus play a part in the Global Graph that web folk, and those in the Semantic Web and Linked Data communities in particular, talk about. For these reasons, it seems to me that a prosopography forms the basis for a particularly rich and interesting kind of Linked Data publication.
Furthermore, the People of Medieval Scotland (PoMS), like DDH/CCH's many other prosopographical projects, represents its materials in the form of highly structured data. Indeed, like DDH/CCH's other structured prosopographies, PoMS is built on top of that quintessential highly structured paradigm, the relational database, and as a result PoMS's historical research work has already been expressed in terms of entities, attributes and relationships as they are understood in the relational data model. Since the Linked Data model is also based on the idea of representing materials as highly structured data that is accessible globally, PoMS's highly structured database would appear to fit well with it.
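The fit between the two models can be illustrated with a small sketch. It is not the PoMS conversion itself: the table layout, column names, and the `example.org` base URIs below are all hypothetical, chosen only to show how a relational row (an entity with attributes) might be restated as subject-predicate-object statements of the kind RDF uses.

```python
import sqlite3

# Hypothetical base URIs; the real PoMS RDF identifiers and vocabulary may differ.
ENTITY_BASE = "https://example.org/poms/entity/person/"
VOCAB_BASE = "https://example.org/poms/vocab#"


def rows_to_triples(conn):
    """Restate each row of a toy `person` table as (subject, predicate, object) triples."""
    triples = []
    cursor = conn.execute("SELECT id, name, floruit FROM person ORDER BY id")
    for person_id, name, floruit in cursor:
        # The primary key becomes part of a globally usable identifier.
        subject = f"{ENTITY_BASE}{person_id}"
        # Each non-key column becomes one statement about that subject.
        triples.append((subject, f"{VOCAB_BASE}name", name))
        triples.append((subject, f"{VOCAB_BASE}floruit", floruit))
    return triples


# A toy, in-memory stand-in for a relational prosopography (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, floruit TEXT)")
conn.execute(
    "INSERT INTO person VALUES (?, ?, ?)",
    (749, "Abraham, Bishop of Dunblane", "1210×14-1220×25"),
)

triples = rows_to_triples(conn)
for triple in triples:
    print(triple)
```

In a sketch like this, a relationship between two entities (a foreign key in relational terms) would simply become a triple whose object is another entity's URI rather than a literal value.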
Finally, unlike DPRR (http://romanrepublic.ac.uk/rdf), the other structured-data prosopography that DDH/KDL has expressed in the Semantic Web's RDF technology, PoMS is in fact created using the factoid prosopography paradigm (Bradley 2017) as one of its fundamental semantic principles. Indeed, its own ontology connects it explicitly to the Factoid Prosopography Ontology described in Bradley 2017.
We have already said that PoMS is built upon the relational database, and called this the "quintessential structured data paradigm". Here, however, we are talking about a linked data or semantic web representation of PoMS's materials, and although both linked data/semantic web technologies and relational database technologies are built upon a shared basic conception of highly structured data, they are not the same. What, then, is necessary to turn PoMS's already existing database-like structured materials into a publication that fits the similar-but-different Linked Data model? In order to think about this most usefully, we need to understand the fundamental principles of Linked Data.
Tim Berners-Lee gave a presentation on linked data at the TED 2009 conference. In it, he restated the linked data principles as three "extremely simple" rules:
More formally, Bizer, Heath and Berners-Lee's 2009 paper, mentioned earlier, specifies four criteria that Berners-Lee had described as a "set of 'rules' for publishing data on the Web" in such a way that all published data becomes part of a single global data space. These four principles are presented succinctly in Wikipedia's "Linked Data" entry:
To at least some extent, rules one and two of these four seem to be met already by PoMS's existing browser-oriented web application at https://www.poms.ac.uk/. Take rule one: the browser-oriented web application provides one publishable URL for each person in PoMS, and it is a RESTful one (for a definition of RESTful URLs, see Wikipedia). An example from PoMS is the URL for Abraham, Bishop of Dunblane (fl.1210×14-1220×25): https://www.poms.ac.uk/record/person/749/. This URL, then, could be interpreted as rule 1's "URI", acting as a kind of name for its person. Furthermore, if the existing web application is presented directly with this person's URL, it returns an HTML page containing the information PoMS has about that person; one can see this happening by clicking on the Bishop's URL shown above. Thus, as rule 2 requires, anyone with WWW access can use this HTTP URI to look up that PoMS person. The generated PoMS page also provides, as rule 3 says, "useful information" about the entity it refers to, although, of course, the material is presented as an HTML page in a form suitable for display by a web browser and is not delivered using the Semantic Web standard of RDF. Finally (rule 4), these generated web pages do in fact contain links to other URIs within PoMS.
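The idea of a URL acting as a name can be made concrete with a short sketch. The URL pattern below is inferred from the single example given above (person 749) and is an assumption, not a documented PoMS interface; the function names are hypothetical.

```python
from urllib.parse import urlparse

POMS_HOST = "www.poms.ac.uk"


def person_uri(person_id: int) -> str:
    """Build the browser-oriented URL for a person (pattern assumed from one example)."""
    return f"https://{POMS_HOST}/record/person/{person_id}/"


def parse_person_uri(uri: str) -> int:
    """Recover the numeric person id from such a URL, or raise ValueError."""
    parsed = urlparse(uri)
    # Drop empty segments produced by the leading and trailing slashes.
    segments = [s for s in parsed.path.split("/") if s]
    if (
        parsed.netloc != POMS_HOST
        or len(segments) != 3
        or segments[:2] != ["record", "person"]
        or not segments[2].isdigit()
    ):
        raise ValueError(f"not a PoMS person URI: {uri!r}")
    return int(segments[2])


abraham = person_uri(749)
print(abraham)
print(parse_person_uri(abraham))
```

Because the identifier round-trips in this way, the URL can serve both as something to look up over HTTP and as a stable name that other datasets could cite for the same person.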
So, what is missing from the existing PoMS web application that is needed to make it more fully a Linked Data application? The key issue can be found in the second half of Wikipedia's definition of linked data (items 3 and 4). As Bizer, Heath and Berners-Lee say, to operate as Linked Data, the material has to be presented "in a way that can be read automatically by computers." They then go on to say that this enables data from different sources to be "appropriately connected and queried." With the current browser-oriented web application at www.poms.ac.uk, the material is presented as an HTML web page suitable for reading by a human user, rather than in a form that explicitly expresses the formal structured data. Of course, one can apply techniques called "screen scraping" to extract the data from the presented web pages, but screen scraping is broadly understood by its practitioners to be awkward and prone to error. Thus, when presented as a set of HTML web pages, PoMS's data cannot readily be processed, as data, by computers, and cannot readily be used as a source to be connected, as data, with other sources. This is why Bizer, Heath and Berners-Lee attach an explicit reference to RDF in their rule 3. RDF is described in its own documentation as a representation of a world-wide "graph-based data model" (section 1.1, https://www.w3.org/TR/rdf11-concepts/). By presenting the PoMS data in RDF, a language specifically designed for interlinking data, potentially world-wide, in a form that remains available for further computer processing, one can present PoMS's research materials as a more satisfactory Linked Data source.
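The contrast between screen scraping and explicitly structured data can be sketched as follows. The HTML fragment is a made-up stand-in and does not reproduce the real PoMS page markup; the vocabulary URI is a hypothetical placeholder, and the triples are plain Python tuples standing in for RDF statements rather than actual RDF syntax.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the real PoMS markup is not reproduced here.
HTML_PAGE = """
<html><body>
  <h1>Abraham, Bishop of Dunblane</h1>
  <p>fl. 1210×14-1220×25</p>
</body></html>
"""

# The same fact as an explicit statement. The predicate URI is a placeholder.
PERSON = "https://www.poms.ac.uk/record/person/749/"
NAME = "https://example.org/vocab#name"
TRIPLES = [(PERSON, NAME, "Abraham, Bishop of Dunblane")]


class HeadingScraper(HTMLParser):
    """Scraper that guesses the person's name is the text of the first <h1>."""

    def __init__(self):
        super().__init__()
        self._in_h1 = False
        self.name = None

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        # Keep only the first heading text encountered.
        if self._in_h1 and self.name is None:
            self.name = data.strip()


def scraped_name(html):
    """Extract the name by relying on the page layout."""
    scraper = HeadingScraper()
    scraper.feed(html)
    return scraper.name


def structured_name(triples, subject):
    """Extract the name by asking for an explicitly labelled statement."""
    for s, p, o in triples:
        if s == subject and p == NAME:
            return o
    return None


# A purely cosmetic redesign: the heading level changes from h1 to h2.
REDESIGNED_PAGE = HTML_PAGE.replace("h1", "h2")

print(scraped_name(HTML_PAGE))
print(scraped_name(REDESIGNED_PAGE))
print(structured_name(TRIPLES, PERSON))
```

The scraper depends on a guess about page layout and silently returns nothing once the layout changes, whereas the structured lookup depends only on the meaning attached to the predicate. This is the sense in which data presented only as HTML cannot readily be processed as data.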
The work described and referenced in these pages does exactly this: it turns PoMS's relational database, which holds most of the intellectual work embodied in PoMS, into RDF, and then uses pieces of RDF-related technology to deliver that RDF over the internet to anyone who wants to use it. However, the work done here goes further than just this. In addition:
So, the work described here resulted in several products:
The rest of this PoMS RDF server documentation site talks more about this work, and has three parts: