A semantic user model for serendipitous discovery

I’ve been spending time recently on the design of SerenA’s user model. The user model is central to SerenA’s activities: it lets SerenA explore the connections that are most significant to each user as an individual, and so follow an individual tour of the semantic web.

The user model:
• contains information about the user’s interests, expertise and goals, so that SerenA can discover relevant connections to present to the user;
• contains information about the user’s activities, which may include, for example, articles that the user has published or the user’s current location;
• tells SerenA where it can find more information about the user;
• contains information about how the user would like to interact with SerenA.
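
As a rough illustration, the four kinds of information listed above could be held in a structure like the one below. This is a minimal sketch in Python, not SerenA’s actual implementation: the class name `UserModel`, its field names and the example values are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class UserModel:
    """Hypothetical container for the four kinds of information listed above."""

    # Interests, expertise and goals, stored as concept identifiers (URIs).
    interests: set[str] = field(default_factory=set)
    expertise: set[str] = field(default_factory=set)
    goals: list[str] = field(default_factory=list)
    # Activities, e.g. published articles or the current location.
    activities: dict[str, str] = field(default_factory=dict)
    # Links to external sources that describe the user.
    external_sources: list[str] = field(default_factory=list)
    # How the user would like to interact with the system.
    interaction_preferences: dict[str, str] = field(default_factory=dict)


model = UserModel()
model.interests.add("http://dbpedia.org/resource/Reptile")
model.activities["current_location"] = "Edinburgh"  # illustrative value only
model.interaction_preferences["notifications"] = "daily digest"  # illustrative
```

Every field starts empty, which fits the idea that the user decides what to supply.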

The SerenA user model uses concepts and vocabularies that are grounded in the semantic web. A user may say they are interested in “python”, but “python” has multiple meanings (e.g. the snake and the programming language), and we need a vocabulary to distinguish between them. Instead of storing that a user is interested simply in “python”, we record their interest in an identifier that corresponds to the snake or to the programming language.

We also need a vocabulary with semantic content so that SerenA can reason about the relationships between different concepts. The identifiers and relationships need to have meanings that are shared beyond SerenA itself, so that we can use these vocabularies to find connections in Linked Open Data. One of the vocabularies we use is that of DBpedia, a semantic web vocabulary which represents and connects the entries in Wikipedia. We can capture some of the different meanings of the word “python” by mapping it onto the different DBpedia identifiers:



:kingdom http://dbpedia.org/resource/Animal
:class http://dbpedia.org/resource/Reptile


:type http://dbpedia.org/resource/programming_language
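
The mapping from an ambiguous word to distinct identifiers can be sketched as a small lookup. This is only an illustration: the two `example.org` identifiers are placeholders standing in for the real concept identifiers, and the names `LABEL_INDEX`, `candidates` and `pick_by_fact` are hypothetical. Only the DBpedia resources quoted above are taken from the text.

```python
from __future__ import annotations

# Placeholder identifiers (assumptions): they stand in for the real
# concept identifiers of the snake and the programming language.
PYTHON_SNAKE = "http://example.org/concept/python-snake"
PYTHON_LANGUAGE = "http://example.org/concept/python-language"

# Facts about each concept, using the DBpedia resources quoted above.
CONCEPT_FACTS = {
    PYTHON_SNAKE: {
        "kingdom": "http://dbpedia.org/resource/Animal",
        "class": "http://dbpedia.org/resource/Reptile",
    },
    PYTHON_LANGUAGE: {
        "type": "http://dbpedia.org/resource/programming_language",
    },
}

# One ambiguous label maps to several candidate identifiers.
LABEL_INDEX = {"python": [PYTHON_SNAKE, PYTHON_LANGUAGE]}


def candidates(label: str) -> list[str]:
    """Return every identifier that the (case-insensitive) label may refer to."""
    return LABEL_INDEX.get(label.strip().lower(), [])


def pick_by_fact(label: str, prop: str, value: str) -> list[str]:
    """Keep only the candidates whose fact `prop` has the given value."""
    return [c for c in candidates(label) if CONCEPT_FACTS[c].get(prop) == value]


language_only = pick_by_fact(
    "Python", "type", "http://dbpedia.org/resource/programming_language"
)
```

Storing the chosen identifier rather than the word itself is what lets later reasoning treat the snake and the language as unrelated concepts.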

Whereas DBpedia represents general knowledge, other vocabularies represent more specialist knowledge, e.g. knowledge about publication and authorship, or about particular technical or academic subjects.

As well as giving information to SerenA directly, the user can supply links to other sources of information, and these links are stored in the user model. For example, a user who has published academic papers in computer science can supply a link to their identifier within DBLP, a semantic web bibliography for computer science. This link can be followed to reach other information, such as co-authors. Likewise, keywords from academic articles that the user has published can be used to extend the model of the user’s expertise and interests.

It is also a requirement that information in the user model is under the user’s control. Different users will want to supply different kinds of information, and the user is not obliged to supply any particular kind of information to SerenA, although this will affect which connections SerenA is able to make for the user. For example, not all users will choose to share information about their current location with SerenA. If they do, SerenA can present up-to-date information about events in that location, perhaps a relevant workshop happening today in a city the user is visiting. Without the user’s location information, such immediate connections will not be available. The model is required to support privacy, so that the information in the user model is only shared with the user’s agreement, and openness, so that the user can see what information SerenA is holding about them and can correct any mistakes or out-of-date information.
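
One way to picture the privacy and openness requirements is a store in which every entry is private until the user agrees to share it, and in which the user can always inspect, correct or withdraw what is held. The sketch below is an assumption-laden illustration, not SerenA’s design: `Entry`, `ControlledProfile` and their method names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Entry:
    value: str
    shared: bool = False  # private by default: sharing needs the user's agreement


class ControlledProfile:
    """Hypothetical store in which the user decides what is held and shared."""

    def __init__(self) -> None:
        self._entries: dict[str, Entry] = {}

    def supply(self, key: str, value: str, shared: bool = False) -> None:
        """Add a piece of information, or correct an existing one."""
        self._entries[key] = Entry(value, shared)

    def withdraw(self, key: str) -> None:
        """Remove a piece of information the user no longer wants held."""
        self._entries.pop(key, None)

    def view_all(self) -> dict[str, str]:
        """Openness: the user can inspect everything that is held."""
        return {k: e.value for k, e in self._entries.items()}

    def shareable(self) -> dict[str, str]:
        """Privacy: only entries the user agreed to share are exposed."""
        return {k: e.value for k, e in self._entries.items() if e.shared}


profile = ControlledProfile()
profile.supply("interest", "http://dbpedia.org/resource/Reptile", shared=True)
profile.supply("current_location", "Edinburgh")  # held, but not shared
```

If the user withdraws `current_location`, anything that depends on it, such as suggestions about events happening nearby today, simply has nothing to work from.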

This semantic model of the user serves as the starting point for an exploration of Linked Open Data (LOD). By exploring different paths through LOD, we hope to find information that is related to the user in unexpected ways.
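
To make the idea of exploring paths concrete, here is a minimal sketch that walks a tiny in-memory graph breadth-first, starting from one of the user’s interests. Everything in it is a toy assumption: the `ex:` identifiers, the relation names and the `explore` function are made up, and a real exploration would run against live linked data rather than a Python dictionary.

```python
from __future__ import annotations

from collections import deque

# Toy graph: node -> list of (relation, neighbour). All identifiers are made up.
GRAPH = {
    "ex:user_interest": [("relatedTo", "ex:topic_a")],
    "ex:topic_a": [("subjectOf", "ex:workshop_1"), ("relatedTo", "ex:topic_b")],
    "ex:topic_b": [("subjectOf", "ex:article_7")],
}


def explore(start: str, max_hops: int) -> list[list[str]]:
    """Breadth-first search: every path of at most `max_hops` edges from `start`.

    A path is a list that alternates node, relation, node, ...
    """
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        hops = len(path) // 2  # number of edges already in this path
        if hops >= max_hops:
            continue
        for relation, neighbour in GRAPH.get(path[-1], []):
            if neighbour in path[::2]:  # skip nodes already on this path
                continue
            new_path = path + [relation, neighbour]
            paths.append(new_path)
            queue.append(new_path)
    return paths


two_hop_paths = explore("ex:user_interest", max_hops=2)
```

Longer paths reach items that are less obviously related to the starting interest, which is where unexpected connections are more likely to turn up, at the cost of many more candidates to sift.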


Diana Bental
MACS, Heriot-Watt University
