Last week I brought up the subject of semantic webs for legal documents. This week I want to expand on that subject by discussing technologies I have encountered recently that point the way to a semantic web. Of course, there are the usual general-purpose semantic web technologies like RDF, OWL, and SPARQL. Try as I might, I have been unable to stir up much practical interest in these technologies. Part of the reason is that the abstraction they demand is still beyond most people's grasp. In academic circles these topics are easy to discuss, but step into the "real world" and interest evaporates immediately.
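For a flavor of that abstraction, here is a minimal RDF sketch – a single statement asserting that one section of an act cites another. The namespace and resource URIs are invented for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:leg="http://example.org/legislation#">
  <!-- "Section 5 of the 2009 act cites section 12 of the 2001 act" -->
  <rdf:Description rdf:about="http://example.org/acts/2009/12/section/5">
    <leg:cites rdf:resource="http://example.org/acts/2001/3/section/12"/>
  </rdf:Description>
</rdf:RDF>
```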
Rather than pressing ahead with those technologies, I have chosen in recent years to step back and focus on less abstract, more direct information modeling approaches. As I mentioned last week, I see two key areas of information modeling – the documents and the relationships between them. In some respects there are three areas, if you distinguish the metadata about the documents from the documents themselves. I typically lump the documents together with their metadata because much of the metadata is woven into the document text, blurring the distinction and calling for a uniform, integrated model.
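A contrived example makes that blurring visible. In the sketch below (all element names are hypothetical, not taken from SLIM), the enactment date lives both in the metadata block and inline within the text it governs:

```xml
<bill id="hb-1024">
  <meta>
    <title>An Act relating to water rights</title>
    <!-- The same date also appears, marked up, inside the body text -->
    <enactedDate>2009-07-01</enactedDate>
  </meta>
  <body>
    <section num="1">
      <text>This act takes effect on
        <date value="2009-07-01">July 1, 2009</date>.</text>
    </section>
  </body>
</bill>
```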
The projects I have worked on over the past decade have produced several legislative information models. Each project has taught me something, and that learning has culminated in the SLIM model found today at the Legix.info demonstration website. Over time, a few key aspects have emerged as the most important:
- First and foremost is the need for simplicity. It is very easy to get caught up in the information model, discovering every variation out there and finding a clever solution for each one. But it is just as easy to end up with a model so large and complex that you cannot teach it to anyone who does not share your passion for and experience in information modeling. Trying to satisfy everyone produces a model that, in its complexity, satisfies no one.
- Secondly, you need to build familiarity into your information model. While many terms are used consistently in legislation, traditions around the world do vary, and very similar words can mean quite different things to different organizations. Trying to change long-standing traditions in favor of more consistent or abstract terminology always seems to be an uphill battle.
- Thirdly, you have to consider the usage model. Is the model intended for downstream reporting and analysis, or does it need to work in an editing environment? An editing model can be quite different from one intended only for downstream processing, because the way the model interacts with the editor matters. Two aspects stand out. First, the model must be robust yet flexible enough to accommodate all the intermediate states a document passes through while being edited. Second, change tracking is essential during the amendment process, and how that function will be implemented in the document editor must be thought through (see the sketch after this list).
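To make the change-tracking point concrete, here is a minimal sketch of the kind of markup an editing model might carry – insertions and deletions recorded directly in the document so the editor can render them during the amendment process. The element names are hypothetical, not taken from SLIM or any other schema:

```xml
<!-- Hypothetical change-tracking markup: the amendment that made each
     change is recorded on the <del> and <ins> elements themselves. -->
<section num="3">
  <text>The permit fee shall be
    <del by="amendment-2">fifty</del>
    <ins by="amendment-2">seventy-five</ins>
    dollars.</text>
</section>
```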
While I have developed SLIM and its associated reference scheme over the past few years, in the last year I have started experimenting with a few alternate models in the hope of finding a better solution to the problem of legislative information modeling. Most recently I have been experimenting with Akoma Ntoso, developed by Fabio Vitali and Monica Palmirani at the University of Bologna and supported by Africa i-Parliaments, a project sponsored by the United Nations Department of Economic and Social Affairs. I very much like this model, as it follows many of the same ideals of good information modeling that I try to adhere to. In fact, it is quite similar to SLIM in many respects. The legix.info site has many examples of Akoma Ntoso documents, created by translating SLIM into Akoma Ntoso via an XSL transform.
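To give a sense of how that translation works, here is a greatly simplified sketch of such a transform. The SLIM element names and the Akoma Ntoso namespace URI below are placeholders, not the production mapping used on legix.info:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:an="http://www.akomantoso.org/2.0">

  <!-- Map a hypothetical SLIM <section> onto an Akoma Ntoso-style section -->
  <xsl:template match="section">
    <an:section>
      <an:num><xsl:value-of select="@num"/></an:num>
      <an:heading><xsl:value-of select="heading"/></an:heading>
      <xsl:apply-templates select="text"/>
    </an:section>
  </xsl:template>

  <!-- Wrap hypothetical SLIM text content in Akoma Ntoso-style containers -->
  <xsl:template match="text">
    <an:content><an:p><xsl:apply-templates/></an:p></an:content>
  </xsl:template>
</xsl:stylesheet>
```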
While I very much like Akoma Ntoso, I have yet to master it. It is a far more ambitious effort than SLIM, has many more tags, and covers a broader range of document types. Like SLIM, it covers both the metadata and the document text in a uniform model. I am not yet convinced of its viability as an editing schema, though – adapting it to work with the editors I have used in the past is a project I simply haven't had time for yet.
The other important aspect of a semantic web, as I wrote last week, is the referencing scheme. Akoma Ntoso implements referencing with a notation based on coded URLs. It is partly based on the conceptually similar URN:LEX model, built around URNs and developed by Enrico Francesconi and Pierluigi Spinosa at ITTIG/CNR in Florence, Italy. Both schemes build upon the Functional Requirements for Bibliographic Records (FRBR) model. I have tried adopting both, but have run into snags: the models either do not cover enough types of relationships, scare people away with too many special characters carrying encoded meaning, or result in too complex a location resolution model for my needs. At this point I have cherry-picked the best features of both to arrive at a compromise that works for my cases. Hopefully I will be able to evolve towards a more consistent implementation as those efforts mature.
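To illustrate the two styles, here are two schematic references to the same hypothetical act – the first in the Akoma Ntoso URL style and the second in the URN:LEX style; both identify the work in FRBR terms. The exact syntax of each convention differs in detail, and these values are invented for illustration:

```xml
<ref href="/us/act/2009-07-01/12">Act 12 of 2009</ref>
<ref href="urn:lex:us:congress:act:2009-07-01;12">Act 12 of 2009</ref>
```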
My next effort is to take a closer look at MetaLex, an open XML-based interchange format for legislation. Developed in Europe, it defines a set of conventions for metadata, naming, cross references, and compound documents. Many projects in Europe, including Akoma Ntoso, comply with the MetaLex framework. It will be interesting to see how easily I can adapt SLIM to MetaLex. Hopefully the changes required will amount mostly to deriving from the MetaLex schema and adopting its attribute names. We shall see…
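In the meantime, here is a rough sketch of what that derivation might look like: a SLIM element declared as substitutable for a generic MetaLex element. The MetaLex namespace, element, and type names below are assumptions on my part, not checked against the actual specification:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ml="http://www.metalex.eu/metalex/1.0"
           xmlns:slim="http://legix.info/slim"
           targetNamespace="http://legix.info/slim">

  <!-- Pull in the (assumed) MetaLex schema -->
  <xs:import namespace="http://www.metalex.eu/metalex/1.0"
             schemaLocation="metalex.xsd"/>

  <!-- Present a SLIM <section> as a kind of MetaLex hierarchical container -->
  <xs:element name="section"
              substitutionGroup="ml:hcontainer"
              type="ml:hcontainerType"/>
</xs:schema>
```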