July 31, 2017
by Deb Verhoeven and Toby Burrows
Over the past few years humanities researchers in Australia have established a large interconnected database of cultural information and used it to create detailed networks of relationships between the people, places and objects it describes. This aggregated knowledge base is called the Humanities Network Infrastructure (HuNI, pronounced “honey”). HuNI takes records from many different research, museum and archive collections, locates them in one place and lets researchers propose the connections between these records in their own words.
This week we released a new iteration of HuNI that enables visitors to take advantage of the platform’s graph searching capabilities. Network graphs, those familiar spiderweb images of interconnection, are a powerful way of uncovering the meaning and significance of the knowledge embedded in cultural collections.
This article describes how some of these features work in HuNI. But before looking at the new functionality in detail, it’s worth outlining why this development is significant to humanities researchers.
Why: The graphic significance of HuNI
Relationships matter. They especially matter to historians, social scientists, literary scholars, media analysts and other humanities researchers. For humanities researchers, figuring out, following and filling in trails of relationships is fundamental to discovery. It’s how we “make connections”, in the sense of creating meaning.
These relationship trails, like the best detective work, are often painstaking to establish. The greater the nuance and complexity of relationships the more meaningful they are to humanities researchers. In an era of digital research methods, this preference for understanding and interpreting the rich and messy relationships between things or people or concepts has not necessarily been well supported.
The advent of the World Wide Web, with its capacity for hyperlinking, offers a type of connectivity that has proven convenient for researchers. But it has limitations. Web links carry no evident description: when we lurch at warp speed from one page to the next, we can’t see how or why the two pages are linked to each other. Hyperlinks are also singular, a pithy description of a one-to-one relationship. And they are uni-directional: we can’t see how the connection between websites or pages might differ if the direction of the hyperlink were reversed. The World Wide Web does not come even close to representing the intricately detailed, overlapping, interconnected world that typifies our lived experience.
Digital databases manage relationships with more fluency than the web. But despite their name, “relational databases” aren’t the best format for storing relationships between digital items or records. This becomes apparent when a researcher is looking to follow and understand “many-to-many” relationships between multiple points of data. Relational databases struggle to perform this kind of complex search, especially as the volume of data increases.
Because they are designed specifically to capture the connections between records, “graph databases” have emerged as a technology that can provide powerful support for the investigative work of humanities research. Graph databases are purpose-built to store, query and analyse the relationships between things. And they let you define and see these connections in different ways: as a list perhaps, or more impressively as a network visualization of the constellations and trails of association.
What: Explaining the HuNI graph capability
Let’s say you want to explore the influence of Australian landscape art on early Australian cinema. In the past this would have been a difficult undertaking involving many searches across many datasets and catalogues only some of which might be available online. HuNI is designed to streamline this research exercise without compromising the complexity of possible interpretations.
Using the HuNI graph we can see how individual artists are connected to filmmakers and follow (and edit or add to) these connections.
Items of information in HuNI are broadly categorised as people, places, events, organisations, works or concepts. In a graph database these items are called “nodes”. Relationships connect nodes. All relationships must have a name and a direction.
In HuNI “nodes” can be bi-directionally connected with each other. A link between two people might be described as “is the sibling of”. This would work in both directions. But “is the parent of” only works in one direction. In reverse it would need to be “is the child of” for example. HuNI supports these relationship reversals.
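The idea of named, directed relationships that can be read in reverse can be sketched in a few lines of Python. This is only an illustration, not HuNI's actual data model: the `Node` and `Relationship` classes, the `inverse_label` field and the people "Alice" and "Bob" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """An item of information (hypothetical stand-in for a HuNI record)."""
    name: str
    category: str  # e.g. person, place, event, organisation, work, concept


@dataclass(frozen=True)
class Relationship:
    """A named, directed link between two nodes."""
    source: Node
    label: str          # how the link reads from source to target
    target: Node
    inverse_label: str  # how the link reads from target to source (assumed field)

    def reversed(self) -> "Relationship":
        # Swap the endpoints and the two labels to read the link backwards.
        return Relationship(self.target, self.inverse_label, self.source, self.label)

    def describe(self) -> str:
        return f"{self.source.name} {self.label} {self.target.name}"


alice = Node("Alice", "person")
bob = Node("Bob", "person")

# "is the parent of" needs a different label when read in reverse.
parent = Relationship(alice, "is the parent of", bob, "is the child of")
print(parent.describe())             # Alice is the parent of Bob
print(parent.reversed().describe())  # Bob is the child of Alice

# "is the sibling of" reads the same in both directions.
sibling = Relationship(alice, "is the sibling of", bob, "is the sibling of")
print(sibling.reversed().describe())  # Bob is the sibling of Alice
```

Storing both labels on one relationship is just one possible design. A system could equally store two separate one-way links.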
Connections between nodes are called “edges”. In HuNI there are two different types of edge: standardized, system-generated links (shown as black lines) and user-created links (shown as blue lines).
System generated links have been ingested into HuNI along with their accompanying data and conform to a “controlled vocabulary”, a limited set of agreed terms for describing relationships.
User-generated links are written in everyday language rather than drawing on a pre-formulated blueprint of the most important relationships. They are as creative and complex and nuanced and contrary as the multitude of HuNI users themselves. Instead of the standard “is the sibling of”, HuNI allows you to describe and/or evaluate a relationship in more detail: such as, “is mostly the evil sibling of”. HuNI also makes it possible for users to contest these links and provide alternative interpretations of how records are connected. In fact, we encourage it.
User-generated and system-generated edges in HuNI
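The distinction between the two kinds of edge can be sketched as follows. This is a simplified illustration under stated assumptions: the `Edge` class, the `make_edge` and `line_colour` helpers, the three-term vocabulary and the placeholder records "Person A" and "Person B" are all hypothetical, and HuNI's real controlled vocabulary is not reproduced here.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabulary, for illustration only.
CONTROLLED_VOCABULARY = {"is the sibling of", "is the parent of", "is the child of"}


@dataclass(frozen=True)
class Edge:
    source: str
    label: str
    target: str
    origin: str  # "system" (ingested with the data) or "user" (proposed by a researcher)


def make_edge(source: str, label: str, target: str, origin: str) -> Edge:
    """Create an edge; only system edges are checked against the vocabulary."""
    if origin not in ("system", "user"):
        raise ValueError(f"unknown origin: {origin!r}")
    if origin == "system" and label not in CONTROLLED_VOCABULARY:
        raise ValueError(f"{label!r} is not in the controlled vocabulary")
    return Edge(source, label, target, origin)


def line_colour(edge: Edge) -> str:
    """Black for system-generated links, blue for user-created ones."""
    return "black" if edge.origin == "system" else "blue"


# Several descriptions of the same pair of records can coexist, so one
# user's interpretation can sit alongside (and contest) another's.
edges = [
    make_edge("Person A", "is the sibling of", "Person B", "system"),
    make_edge("Person A", "is mostly the evil sibling of", "Person B", "user"),
    make_edge("Person A", "is the misunderstood sibling of", "Person B", "user"),
]
for edge in edges:
    print(line_colour(edge), "|", edge.source, edge.label, edge.target)
```

In this sketch a free-text label is accepted for a user edge but rejected for a system edge, which mirrors the contrast between everyday language and agreed terms described above.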
If you click on the line of an edge you will see a pop-up window describing the link in detail:
A pop-up link description in HuNI
How: Using HuNI to discover connections
Queries of the HuNI data can be performed via keyword or through the graph search or a combination of both.
Once you’ve found a node of interest (in this case a filmmaker, Ken G. Hall), you can click on the magnifying glass and then the “Find reachable nodes” button, to find all the records in HuNI that can be arrived at by traversing the network graph from Ken G. Hall’s record. This can be viewed as a list of all the connections (534 nodes can potentially be reached from this record):
A list of all the connected nodes that can be reached from Ken G. Hall’s record
For each of these records, HuNI calculates the “Bacon Distance” from the original starting-point: the number of links that must be followed to get there from the original node. A “Bacon Distance” of 1 means there is just one step between Ken G. Hall and that record. You can see a visual representation of these links by hitting “view path”.
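For readers curious about how a list of reachable nodes and their distances might be computed, here is a minimal sketch using breadth-first search. It is not HuNI's implementation. The `bacon_distances` function and the toy graph with nodes "A" to "E" are hypothetical, and the sketch assumes every link can be followed in both directions.

```python
from collections import deque


def bacon_distances(graph: dict, start: str) -> dict:
    """Return {node: number of links from start} for every reachable node.

    `graph` maps each node to the list of nodes it is directly linked to.
    Breadth-first search visits nodes in order of distance, so the first
    time a node is seen is also the shortest way to reach it.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    del distances[start]  # the starting record is not listed as a result
    return distances


# Toy graph: "E" has no links, so it cannot be reached from "A".
toy_graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B"],
    "E": [],
}
print(bacon_distances(toy_graph, "A"))  # {'B': 1, 'C': 1, 'D': 2}
```

Here "B" and "C" are one step from "A", while "D" is two steps away because it can only be reached through "B".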
You can also see how the original node, Ken G. Hall, is connected to another node, by clicking the magnifying glass icon associated with the second node. Or if you don’t want to scan all the way down the list you can type directly into the keyword search box and then hit the magnifying glass to quickly find out if other records are related to Ken G. Hall.
So for example, if I search on the Australian artist Walter Withers from Ken G. Hall’s page, HuNI will attempt to find how these two are connected. The result is a network graph showing the intervening nodes and edges between the artist and the filmmaker.
A visual representation of Walter Withers’ connections to Ken G. Hall
In terms of our initial quest — to understand the influence of Australian artists on an emergent Australian cinema — these connections are inspiring:
A production still from The Squatter’s Daughter (Ken G. Hall, 1933). Source: NFSA
The Drover (Walter Withers, 1912). Source: Bendigo Art Gallery
Viewing search results as a network visualisation is also intended to aid speculative exploration. Clicking on any node in the visualization opens up all the connections associated with that node — in other words, you can see the web of links which stretch outwards from this starting-point and then from the next node and so on. In this way HuNI encourages researchers to meander along the byways of connection.
Sometimes however, there are so many connections and constellations that it’s hard to see how two nodes are related. To help in these circumstances HuNI allows you to find the shortest path between two records.
Here is a graph representation of just some of the diffuse relationships extending from the trail between Walter Withers and Ken G. Hall:
To simplify this visualization (without affecting its underlying detail) it is possible to use the graph search. Simply hit the magnifying glass and then click on the node in the visualization you are seeking the shortest path to. Here is the streamlined, “shortest path” version of the networks extending from Walter Withers to Ken G. Hall:
Visualisation of the shortest path between Walter Withers and Ken G. Hall
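One common way to find a shortest path in a graph where every link counts as one step is breadth-first search with a record of how each node was reached. The sketch below illustrates that general technique and makes no claim about how HuNI does it. The `shortest_path` function and the toy node names ("artist", "mentor", "gallery" and so on) are hypothetical.

```python
from collections import deque


def shortest_path(graph: dict, start: str, goal: str):
    """Return one shortest list of nodes from start to goal, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # for each visited node, the node we came from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour in previous:
                continue  # already reached by a path at least as short
            previous[neighbour] = current
            if neighbour == goal:
                # Walk backwards from the goal to the start, then flip.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None  # the two nodes are not connected


# Toy graph with two routes from "artist" to "filmmaker":
# a short one via "mentor" and a longer one via "gallery" and "exhibition".
toy_graph = {
    "artist": ["gallery", "mentor"],
    "gallery": ["artist", "exhibition"],
    "exhibition": ["gallery", "filmmaker"],
    "mentor": ["artist", "filmmaker"],
    "filmmaker": ["exhibition", "mentor"],
    "island": [],
}
print(shortest_path(toy_graph, "artist", "filmmaker"))
# ['artist', 'mentor', 'filmmaker']
```

The longer route through "gallery" and "exhibition" still exists in the data. The search simply reports the most direct trail, which is the sense in which the visualisation is simplified without affecting the underlying detail.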
The figure of Hilda Rix Nicholas looms large in this trail of connection. Further investigation, prompted by this insight, reveals that her work closely accords with Ken G. Hall’s depiction of women in pastoral Australian settings.
Production still, The Squatter’s Daughter (Ken G. Hall, 1933). Source: NFSA
The Fair Musterer (Hilda Rix Nicholas, 1935). Source: Queensland Art Gallery
Conclusion: The semantic web (and more)
In 1999, in his book Weaving the Web, Tim Berners-Lee outlined his vision for the future of the Web:
“I have a dream for the Web . . . and it has two parts.
In the first part, the Web becomes a much more powerful means for collaboration between people. I have always imagined the information space as something to which everyone has immediate and intuitive access, and not just to browse, but to create. […] Furthermore, the dream of people-to-people communication through shared knowledge must be possible for groups of all sizes, interacting electronically with as much ease as they do now in person.
In the second part of the dream, collaborations extend to computers. Machines become capable of analyzing all the data on the Web — the content, links, and transactions between people and computers. A “Semantic Web,” which should make this possible, has yet to emerge…”
HuNI goes some way to putting this two-part vision into practice. It enables researchers to create, browse and discover knowledge through a combination of human- and system-generated “connections”.
To this vision HuNI also contributes the additive nature of graph databases in which new kinds of relationships can be proposed and included without disrupting the overall functionality of the knowledge base.
Everyone is welcome to add their insights and knowledge to HuNI. All you need in order to log in is a social media account.
In fact, the more that researchers use and add to HuNI, the sweeter it is for everyone.