A Supplement to Chapter 4

A.2 Definitions 2: Attributes

In our graph model, nodes are objects composed of attributes that hold the metadata of the node. We write $n.a$ for an attribute $a$ of a node $n$. The most important metadata kept for a node are $n.name$ and $n.type$, where name is the natural-language label of the node. The attribute type can take only a limited set of values: $n.type \in \{category, categoryValue, codeGroup, document, documentGroup\}$.
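As an illustration only, a node with these two attributes could be represented as follows. This is a minimal sketch: the class name `Node`, the constant `NODE_TYPES`, and the example label are assumptions made for this example and do not come from the original model.

```python
from dataclasses import dataclass

# The five values that n.type may take, as listed in the definition.
NODE_TYPES = {"category", "categoryValue", "codeGroup", "document", "documentGroup"}


@dataclass(frozen=True)
class Node:
    """A graph node carrying the attributes n.name and n.type (hypothetical class)."""
    name: str  # natural-language label of the node
    type: str  # must be one of NODE_TYPES

    def __post_init__(self):
        # Reject any type outside the permitted set.
        if self.type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.type!r}")


# Hypothetical example node; the label is invented for illustration.
n = Node(name="Interviews", type="documentGroup")
print(n.name, n.type)  # prints: Interviews documentGroup
```

Validating the type at construction time keeps the restriction on $n.type$ in one place rather than repeating the check wherever nodes are used.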

A.3 Definitions 3: Graph drawing

A drawing of a graph $G=(N,L)$ is a collection of points in a two-dimensional space. Each point $p_i$ with coordinates $(x_i, y_i)$ is the position of the node $n_i$ in the layout. Whenever there exists a link $l_{ij} \in L$ between nodes $n_i$ and $n_j$, a line is drawn between the points $p_i$ and $p_j$. The task of the layout algorithm is to find a positioning of the points that meets specific criteria as well as possible. Examples of commonly used criteria are: nodes should not overlap, neighbouring nodes should be grouped together, and the number of crossing links should be minimised. Each algorithm and set of criteria has its own benefits and drawbacks.
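To make one of these criteria concrete, the sketch below counts crossing links for a given positioning. It is not a layout algorithm, only a way to score a drawing; the function names, the node identifiers, and the coordinates are assumptions chosen for illustration. Only proper crossings are counted, so links that merely touch or are collinear are ignored.

```python
from itertools import combinations


def _orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): 1, 0 or -1."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def count_crossings(positions, links):
    """Count pairs of links whose drawn line segments properly cross."""
    crossings = 0
    for (a, b), (c, d) in combinations(links, 2):
        if len({a, b, c, d}) < 4:
            continue  # links sharing a node meet at that node; not a crossing
        pa, pb, pc, pd = (positions[k] for k in (a, b, c, d))
        # Two segments cross when each one separates the endpoints of the other.
        if (_orient(pa, pb, pc) * _orient(pa, pb, pd) < 0
                and _orient(pc, pd, pa) * _orient(pc, pd, pb) < 0):
            crossings += 1
    return crossings


# Hypothetical graph with two links and two alternative drawings of it.
links = [("n1", "n3"), ("n2", "n4")]
square = {"n1": (0, 0), "n2": (1, 0), "n3": (1, 1), "n4": (0, 1)}
swapped = {"n1": (0, 0), "n2": (1, 0), "n3": (0, 1), "n4": (1, 1)}
print(count_crossings(square, links), count_crossings(swapped, links))  # prints: 1 0
```

The same graph yields one crossing in the first drawing and none in the second, which is the kind of difference a layout algorithm tries to exploit.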

A.4 Definitions 4: Degree & neighbourhood

For a node $n_i \in N$ the degree is defined as the number of links the node has: $\deg(n_i)=|\{n_j : l_{ij} \in L\}|$. The set of linked nodes is called the neighbourhood of a node. The neighbourhood $H_i$ of a node $n_i$ is defined as: $H_i=\{n_j : l_{ij} \in L \lor l_{ji} \in L\}$.
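Both definitions translate directly into set comprehensions. The sketch below assumes that the link set is stored as a set of ordered pairs `(i, j)` standing for $l_{ij}$; this representation and the node names are assumptions for illustration.

```python
def degree(node, links):
    """deg(n_i) = |{n_j : l_ij in L}|, following the definition literally."""
    return len({j for (i, j) in links if i == node})


def neighbourhood(node, links):
    """H_i = {n_j : l_ij in L or l_ji in L}."""
    outgoing = {j for (i, j) in links if i == node}
    incoming = {i for (i, j) in links if j == node}
    return outgoing | incoming


# Hypothetical link set: pairs (i, j) stand for l_ij.
L = {("a", "b"), ("a", "c"), ("d", "a")}
print(degree("a", L))                 # prints: 2
print(sorted(neighbourhood("a", L)))  # prints: ['b', 'c', 'd']
```

Note that, read literally, the degree counts only links stored as $l_{ij}$, whereas the neighbourhood also includes $l_{ji}$. The two agree in size when every link is stored in both orientations.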

A.5 Definitions 5: Betweenness centrality

The betweenness centrality of a node $n$ is defined as $bc(n)=\sum_{s \neq n \neq t} \frac{\sigma_{st}(n)}{\sigma_{st}}$, where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node $t$ and $\sigma_{st}(n)$ is the number of those paths that pass through $n$. A path is a sequence of nodes in which each pair of consecutive nodes is linked. A shortest path between two nodes $s$ and $t$ is a path that traverses the smallest number of nodes. The equation for betweenness centrality takes into account that there may be several shortest paths from $s$ to $t$, with only some passing through $n$.
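The definition can be evaluated directly with breadth-first search. The sketch below is a straightforward, unoptimised reading of the formula, not a claim about how the values in this work were computed. It assumes an unweighted graph given as an adjacency dictionary, and it sums over ordered pairs $(s, t)$ with $s \neq t$; all names are chosen for illustration. It uses the fact that $n$ lies on a shortest path from $s$ to $t$ exactly when $d(s,n)+d(n,t)=d(s,t)$, in which case $\sigma_{st}(n)=\sigma_{sn}\,\sigma_{nt}$.

```python
from collections import deque


def shortest_path_counts(adj, source):
    """BFS from source. Returns (dist, sigma), where sigma[v] is the
    number of shortest paths from source to v."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u precedes v on a shortest path
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj, n):
    """bc(n): sum over ordered pairs s != n != t of sigma_st(n) / sigma_st."""
    info = {v: shortest_path_counts(adj, v) for v in adj}
    dist_n, sigma_n = info[n]
    total = 0.0
    for s in adj:
        if s == n:
            continue
        dist_s, sigma_s = info[s]
        if n not in dist_s:
            continue  # n is unreachable from s
        for t in adj:
            if t in (s, n) or t not in dist_s or t not in dist_n:
                continue
            # n is on a shortest s-t path iff the distances add up.
            if dist_s[n] + dist_n[t] == dist_s[t]:
                total += sigma_s[n] * sigma_n[t] / sigma_s[t]
    return total


# Hypothetical diamond graph: two shortest paths a-b-d and a-c-d.
diamond = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(betweenness(diamond, "b"))  # prints: 1.0
```

In the diamond, only one of the two shortest paths between a and d passes through b, so each of the ordered pairs (a, d) and (d, a) contributes 0.5. A convention that counts unordered pairs would give half the value for an undirected graph.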

A.6 Definitions 6: Presence

The presence of all categories in a document group node $n_x$ is the set of all category nodes linked to it: $Categories(x)=\{n_y \in N : (l_{xy} \in L \lor l_{yx} \in L) \land n_y.type = category\}$. The presence of a category $n_x$ in the document groups is the set of nodes of type documentGroup for which there exists a link between this category and the document group. It is formally defined as: $Presence(x, documentGroup)=\{n_y \in N : l_{xy} \in L \land n_y.type = documentGroup\}$.
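The two sets can be sketched as follows, assuming links are stored as ordered pairs `(x, y)` for $l_{xy}$ and node types are kept in a dictionary. The node names and the data are invented for illustration.

```python
def categories(x, node_type, links):
    """Categories(x): category nodes linked to x in either direction."""
    linked = ({j for (i, j) in links if i == x}
              | {i for (i, j) in links if j == x})
    return {y for y in linked if node_type[y] == "category"}


def presence(x, node_type, links):
    """Presence(x, documentGroup): document group nodes n_y with l_xy in L."""
    return {j for (i, j) in links
            if i == x and node_type[j] == "documentGroup"}


# Hypothetical data: two categories, two document groups.
node_type = {
    "trust": "category",
    "risk": "category",
    "groupA": "documentGroup",
    "groupB": "documentGroup",
}
links = {("trust", "groupA"), ("trust", "groupB"), ("risk", "groupB")}

print(sorted(categories("groupB", node_type, links)))  # prints: ['risk', 'trust']
print(sorted(presence("trust", node_type, links)))     # prints: ['groupA', 'groupB']
```

As in the formulas, `categories` accepts a link in either direction, while `presence` follows only links stored as $l_{xy}$, so this example stores each link from the category to the document group.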

A.7 Definitions 7: Intersection and difference

The absence of categories between docGroup1 and docGroup2 is the set of categories present in the second document group minus the set of categories present in the first. In our notation: $Absence(docGroup_1, docGroup_2)=Categories(docGroup_2) \setminus Categories(docGroup_1)$. The categories that are common to those same two document groups are determined using the intersection of the sets of categories that are present in each: $CommonCodes(docGroup_1, docGroup_2)=Categories(docGroup_1) \cap Categories(docGroup_2)$. This operation is not limited to two sets. The intersection of more sets can be written as $\bigcap_{i=1}^{n} Presence(docGroup_i)$.
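With the category sets in hand, both operations reduce to built-in set operations. The sketch below assumes the set of categories of each document group has already been computed; the function names and the example data are assumptions for illustration.

```python
def absence(categories_1, categories_2):
    """Categories present in the second document group but not in the first."""
    return categories_2 - categories_1


def common_codes(*category_sets):
    """Categories present in every one of the given document groups."""
    return set.intersection(*category_sets)


# Hypothetical category sets for three document groups.
group1 = {"trust", "risk"}
group2 = {"risk", "cost", "time"}
group3 = {"risk", "time"}

print(sorted(absence(group1, group2)))               # prints: ['cost', 'time']
print(sorted(common_codes(group1, group2)))          # prints: ['risk']
print(sorted(common_codes(group1, group2, group3)))  # prints: ['risk']
```

The absence is not symmetric: swapping the two arguments gives the categories that the first group has and the second lacks.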

A.8 Table presence of code groups for authorities