1 Introduction

Email has been referred to as ‘the backbone of the internet’, a ‘virtual working environment’ and the ‘main means for distributed collaboration’ ([1]). An email collection is an organically formed record that documents both important and everyday moments in an individual’s life and work. The extent of information that can be extracted from such a dataset makes email collections a rich source for investigating patterns of human behaviour, relationships, and communications (cf. [2,3,4,5,6,7]). However, there are caveats to the valuable nature of this data, most notably the enduring ethical concerns provoked by facilitating access to such personal content. Email research often thrives on the details of individual lives and connections with others, information that can be deeply private, sensitive, and/or confidential in nature. The challenge has hitherto encouraged a caution-driven practice of closing or severely restricting access to collections.

Scholars and custodians of data alike have explored and implemented a great range of methods for accessing and exploring email collections (e.g. [2, 3, 8,9,10]), and yet the impact of these on privacy preservation is neither widely discussed nor, seemingly, understood ([11]). This partly reflects the complexity of thought surrounding privacy itself (cf. [12,13,14]). Regardless, this disconnect amplifies continued uncertainty, resulting in a ‘risk-adverse attitude’ (cf. [15]) amongst custodians of data. Consequently, a great swathe of potential research data remains locked within closed or ‘dark’ archives ([3, 4, 15,16,17]). Whilst preventing access might be ‘[t]he most intuitive way to preserve privacy’ ([18]), it also, in many ways, defeats the purpose of maintaining the records, particularly in instances where the relevance of the data might be time sensitive ([11, 15]). This is the second of three key challenges that Lise Jaillant identifies as facing the archival sector along ‘the path from the appraisal of records to their analysis’ ([16]).

Even in cases where an email archive is not ‘dark’, discoverability remains an issue, given the heavy reliance on search infrastructure and accurate metadata and cataloguing ([16]), as well as the demand on the end user to have ‘a rough idea of the information they are trying to retrieve’ ([19]). In response, data visualisation has been used in many facets of email research to support the holistic, creative, and perhaps even ‘playful’ (cf. [20]) exploration of email datasets. Visualisations have been shown to reveal patterns and insights that may otherwise be obscure to researchers (cf. [21,22,23,24,25]). The exploratory and browsing behaviour encouraged by visualisations (cf. [26, 27]) is of particular use for high-volume data. Visualisations ‘capitalise on the characteristics of digital sources’ ([28]), facilitating a malleable perspective on a collection. In short, visualisations represent a method that may support both researchers and practitioners to engage usefully with email collections, irrespective of pre-existing data analysis skills.

Although many have noted the value of visualisations as a research-enabling interface to email collections, how the design of a visualisation interacts with, protects, or compromises privacy is understudied. Without an understanding of the impact of the visualisation on privacy, it is possible that the method of mediation might negatively impact upon access or open the data to reveal ‘previously unknown patterns and relationships’ ([11]) that might, contrary to intention, compromise privacy.

It is within this knowledge gap that the research presented within this paper sits. It presents findings from an empirical investigation on the potential for visualisations to facilitate access for users and provide a degree of protection for any personal or sensitive data contained within a dataset. Through these findings, this paper intends to promote a greater understanding of the relationship between privacy management strategies and the impact that these might have on the perceived usefulness of visualisations to users. This, in turn, might support both researchers and archivists ‘to capitalise on the information available to them at the appropriate scale of privacy’ ( [11]), therefore mitigating the need to close email archives to adhere to the legal and ethical requirements of engaging with sensitive data. Should such an approach prove fruitful, it would fall within the calls for archivists and other custodians of knowledge to ‘consider very different types of access’ that more closely reflect user needs ( [29]).

In the next section, we start by setting the scene to explain our approach to selecting and implementing visualisations in our user case study. This is followed by a detailed methodology of the user study (Sect. 3). Section 4 sets out the findings from the study which, in turn, is followed by a reflective discussion in Sect. 5, that considers the results of the study, their implications, and future work that might be conducted in this area.

2 Background

Our approach to the current study is developed through three steps. First, previous email research is reviewed, especially where data visualisation techniques have been employed and/or evaluated (Sect. 2.1). Second, ethical concerns for digital archives are also discussed, with a special focus on concerns associated with privacy and email collections (Sect. 2.2). Finally, in Sect. 2.3, we explain how we bridge these areas, to formulate our research questions and to select and generate our visualisations for our user case study.

2.1 Email visualisation

Research related to emails often poses questions concerned with understanding how people use email for communication and what this can reveal about them, their environment ( [30,31,32,33]), and their social/professional network ( [34,35,36,37]). Building indirectly on this understanding of email usage are studies aimed towards improving the efficiency and efficacy of communication workflows ( [1, 38, 39]), and the filtering out of unwanted communication ( [40,41,42]). Additionally, in the humanities, email data research naturally aligns with that of older forms of correspondence such as letters (cf. [43]), for example, the close reading of selected passages for qualitative analysis in the context of other events and achievements in their lives. Features such as the metadata found in email headers (e.g. time stamps, subject, who is sending and receiving) help broaden this context, to open up the researchers’ gaze to a wider array of analysis than its technological predecessors might have allowed.

A systematic classification of email research ( [11]) reveals two strands of thought (cf. Fig. 1)—one with the focus of enquiry on people (e.g. the patterns of relationships and social network analysis), and one which concentrates on the emails themselves and their usage (e.g. topic identification, content analysis, patterns of behaviour).

Fig. 1

A visual representation of the dyadic categorisation of email analysis and the more nuanced categories that sit within this. Source: [11]

The use of specific types of visualisation has, on the whole, been agnostic of these branches of research ([11]), although there are exceptions to this with, for instance, studies focused on social network analysis prioritising network graphs (cf. [24, 44,45,46,47,48,49,50,51,52,53]). The great variety and adaptability of visualisations ensure that many common designs (e.g. bar charts [24, 49, 54,55,56], line graphs [2, 24,25,26, 49, 54, 55, 57, 58], scatter/bubble plots [46, 54, 57,58,59], pie charts [60]) can be adapted to diverse research objectives. Several more specialised types of visualisation have also been employed across the spectrum of research interests, such as timelines ([25, 57, 59, 61,62,63]), heatmaps ([64]) and iconographic representations ([65,66,67]), and some studies even creatively combine visualisations in a hybrid approach (e.g. [22, 25, 45, 48, 53, 63, 65, 66, 68]).

The literature shows that network graphs of various types (e.g. random, force directed, tree) are a mainstay of social network analysis research (cf. [24, 44,45,46,47,48,49,50,51,52,53]), with all 18 items reviewed in this area using this visualisation. Research into patterns of relationships employs a more varied selection with no particular preference, ranging from widely popular visualisations such as scatter and/or bubble plots ([54, 59]) to newer visualisations such as mountain graphs ([57]). Bar charts are most regularly used in studies investigating patterns of behaviour, although, as a mainstay of visualisation creation, they also appear in studies focused on other branches of investigation (cf. [24, 49, 54,55,56]). Email content analysis ‘aims to help users navigate a collection and withdraw meaningful data whether as a search or summary mechanism’ [11] and, as with many forms of textual analysis, the forms of visualisation used are quite broad (cf. [25, 52, 57, 60,61,62,63, 69, 70]). The word tree (cf. Fig. 9) is one of these, a type of visualisation that has proved useful for the early stages of textual exploration (cf. [71,72,73,74]) and, as such, is employed in our study.

In exploring the ‘state-of-the-art’ approaches to visualisation design, [75] highlights several criteria that encompass successful visualisation. They indicate that data visualisations should be ‘familiar’, ‘able to convert abstract information’ in a way that ‘preserves its underlying meaning but also provides insights to the user’. In each of the studies above, it is argued, if indirectly, that the method of visualisation utilised fulfils these criteria, therefore creating a useful interface for the potential users (cf. [2, 25, 26]). These studies, however, centre their focus on the particular features of the visualisation under investigation, rather than exploring the broader applications or benefits of the design outside of the stated purpose. Therefore, whilst the visualisations might be well suited to the task at hand and they might also fulfil the key criteria of good design established by [75] and other scholars, it is not possible to extrapolate meaningfully from these studies as to what might benefit the sector as a whole, particularly with reference to user needs.

2.2 Email collection ethics

Within the context of email collections, it is necessary to advance discussions of email visualisation beyond the immediate needs of the researcher to also address questions of ethical needs. Emails, in their raw form, not only contain information that can identify people by name, email address, and/or affiliation, but also contain detailed information about locations, events, and relationships between people. Metadata alone can be used to infer identities as well as sensitive and/or confidential information. For example, ‘e-mail headers reveal who is central to your professional, social and romantic life’ ([76]). Access to such collections creates opportunities for ‘private information within these collections to be disseminated widely and without consent’ ([77]), even where it also creates opportunities for much needed research ([78]).

Emails also often include information beyond that which is written or intended to be received by the email account owner, or worse, those who access it later. For example, emails have attachments which could, if distributed further, entail copyright infringement or communication of privileged, proprietary, or confidential information. In established archival practice, materials of long-deceased individuals are usually considered to carry less risk of disclosure. However, even when the primary owner of the email account is deceased, the content is directly associated with others who may still be living, potentially causing distress or issues of privacy. This challenge is compounded by the potential for the emails of others to get copied in as a thread and, sometimes, even sent to individuals who were not intended to have access. Effectively, when you archive emails in one person’s personal archive, you are archiving other people’s emails as well (Footnote 1). It has to be recognised that, when it comes to digital forms of communication, it is not always possible for creators to be aware of how the information will be used in later contexts, and that such later use can interfere with an individual’s right to forget ([79]). In addition, some laws and/or regulations stipulate that the control of the data needs to take into account cultural needs (Footnote 2).

Privacy management is an especially thorny and shifting concept ([12]), and its intersection with email research and visualisation has, thus far, remained largely unexplored. The majority of the studies identified in [11] made no mention of privacy, and nearly a quarter of the studies were tested on participants’ own email collections or, in a slightly smaller share, on the popular open source email dataset, the Enron dataset (Footnote 3). It is highlighted that only two of the 39 reviewed papers engaged with a personal email archive (cf. [25, 56]) and, of these, one involved the owner of the archive as a co-author. This is a distinct gap within the literature pertaining to email visualisation research, one which has arisen, at least in part, due to the difficulties involved in defining privacy.

Privacy has been debated in scholarship at least since the philosophies of Aristotle ([80]), yet little has been agreed about its definition other than that it is a multifaceted concept encompassing legal, ethical, cultural and personal dimensions. There have been ‘many attempts to create a synthesis of existing literature’ ([81]), but the default approach to protecting privacy for many institutions, archives included, has necessarily been to rely on the more concrete legal definitions of, for example, personal and sensitive data (Footnote 4), as well as on the ethical mandate to limit harm.

This situation is not one that will improve with time. In 2012, it was noted that approximately ‘75% of the email accounts belong to individual users, with only 25% belonging to organisations’ ([19]). This statistic is more than a decade old at the point of writing and, therefore, does not necessarily reflect the proportions of email data that are destined to be archived in coming years. Whilst the ‘risk adverse attitude’ (cf. [15]) of present custodians of data is quite logical given the potential ramifications of mismanaged email data, the great swathes of incoming, culturally significant data necessitate the inclusion of alternative approaches in order to facilitate effective user-driven access and, therefore, research.

There is not, at present, a nuanced and consistent approach for managing privacy with respect to email collections, although [11] presents the first steps towards this. The paper explores existing literature pertaining to the visualisation of emails and the impact of different design choices on the level of privacy consciousness. The five privacy consciousness (PrivCon) levels discussed in the paper ( [11]) represent a scale of privacy management strategies that might be applied to the data that forms the basis for different visualisations. These strategies range from full disclosure (PrivCon 0) through to closed to public access (PrivCon 4) with each level representing a category of privacy management as opposed to a specific method. The description of the levels is reproduced in contracted form below:

  • PrivCon0: the open end of the scale, with no accounting for privacy. These are visualisations that contain the full range of the data, as would be utilised in, for example, a forensic examination of the data or an archivist’s appraisal when a full collection has been donated.

  • PrivCon1: the introduction of redaction that ‘includes situations whereby the data have been altered or removed in order to obscure the identity of individuals contained within’.

  • PrivCon2: ‘the grouping or amalgamation of data to the point that individuals become ‘lost in the crowd’, minimising the risk that details might be identified’.

  • PrivCon3: the introduction of noise which ‘involves shifting the data through the use of an algorithm, statistical model or encryption, in a way that maintains the statistical characteristics of the data set, but the detail does not consistently reflect the original’.

  • PrivCon4: a closed collection which has been fully redacted and contains only a descriptive representation of the collection, with only a cursory indication of contents. This presentation of the data represents what might be found in an online catalogue for an archival collection that only permits on-site access, or for a fully embargoed collection.
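The scale above can be read as a set of transformations applied to the same underlying records. The following is a minimal Python sketch under stated assumptions: the `PrivCon` enum, the `apply_privcon` function, and the placeholder labels are hypothetical illustrations rather than the implementation used in [11] or in this study, and PrivCon 3 is left unimplemented because the choice of noise model is not fixed by the scale itself.

```python
from collections import Counter
from enum import IntEnum


class PrivCon(IntEnum):
    """Hypothetical encoding of the privacy consciousness levels."""
    FULL = 0        # full disclosure
    REDACTED = 1    # identities altered or removed
    AGGREGATED = 2  # individuals 'lost in the crowd'
    NOISED = 3      # noise introduced (not sketched here)
    CLOSED = 4      # descriptive summary only


def domain_of(address):
    """Return the part of an email address after the final '@'."""
    return address.rsplit("@", 1)[-1]


def apply_privcon(pairs, level):
    """Transform (sender, recipient) address pairs for a given level."""
    if level == PrivCon.FULL:
        return list(pairs)
    if level == PrivCon.REDACTED:
        # Replace each address with a stable placeholder label.
        labels = {}

        def label(address):
            return labels.setdefault(address, f"contact-{len(labels) + 1}")

        return [(label(s), label(r)) for s, r in pairs]
    if level == PrivCon.AGGREGATED:
        # Group by email domain and keep only counts per domain pair.
        return Counter((domain_of(s), domain_of(r)) for s, r in pairs)
    if level == PrivCon.CLOSED:
        # Only a cursory description of the contents remains.
        return {"emails": len(pairs)}
    raise NotImplementedError("PrivCon 3 depends on a chosen noise model")
```

Treating each level as a category of transformation, rather than a single method, mirrors the way the scale is described: any one level could be realised by several different functions.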

The manner in which the PrivCon levels (0-3) might be applied to a dataset is displayed in Fig. 2. The paper ( [11]) reveals a skewed distribution of approaches in the thirty-nine papers reviewed, with a clear tendency leaning towards anonymised/pseudonymised content (PrivCon 1). This is summarised in Table 1 along with pros and cons of each type of strategy.

Fig. 2

A representation of different PrivCon levels. PrivCon 0 is a direct representation of the data. PrivCon1 has had the sensitive data removed. PrivCon2 has been amalgamated, in this instance by domain type. PrivCon3 has had noise introduced, at random, to obscure the relationship between certain data points

Table 1 Distribution of PrivCon levels adopted in the literature review of [11], collection type, and pros and cons of each level

2.3 Research questions and approach

The present paper sits at the intersection of the knowledge gaps identified above, seeking to explore how email visualisations research, privacy, and useful user-driven access might interact.

To investigate the interplay between usefulness and privacy with regards to visualisations, we generated visualisations utilising data drawn from a filmmaker’s personal digital archive. We then filtered these through the privacy-aware strategies that reflect the PrivCon levels discussed in Sect. 2.2. The visualisations were presented to researchers and archivists to explore the extent to which each type of visualisation and each level of privacy was perceived to be useful to their respective workflows.

To advance this research towards a practical solution for privacy management in the archive, this paper engages with arts and humanities scholars as well as archive practitioners to explore how perceived usefulness of email visualisations changes for these stakeholders as data are curated to respect different levels of privacy. It approaches this in response to the following research questions:

  • \(RQ_1\): What is the relationship between the extent of privacy-awareness applied to visualisations of email collections and the usefulness of these visualisations to researchers/practitioners?

  • \(RQ_2\): What design features of the privacy-aware visualisations are the most/least useful for researchers and practitioners as an interface for the email collection?

Given the great variety of visualisation designs in the literature (Sect. 2.1), it was necessary to select a subset to feature within this study. Visualisations were selected to include at least one used in the branches of research identified in [11]. Of the five branches of research (cf. Fig. 1), the area of topic identification was excluded from this study. Topic identification was most often keyed towards the removal of spam, or the automatic categorisation of content (e.g. [64, 82]). Within the bounds of the current study, and given the scope of the available data, it was not deemed viable to train a model for the automatic detection of content. In fact, within archives, the use of AI and machine learning is a relatively new, but growing field ([83]). Whilst it may be possible, at a later date, to incorporate such methods as standard within an archival setting, the sector has not yet arrived at this point.

In Sect. 2.1, we noted the prominence of network graphs for social network analysis. For this reason, we included directed network graphs as one of our visualisations for the study. We further noted that other categories of research were not inclined towards any particular visualisation. As such, we selected established standards for patterns of relationships and behaviour: scatter plots and bar graphs, respectively. We further included two newer forms of visualisation, mountain graphs and word trees, observed in the literature to be useful for analysing patterns of relationships and for content analysis, respectively. Examples of all these visualisations are shown in Sect. 4.3.

We describe below how each type of visualisation was implemented to incorporate privacy management strategies reflecting PrivCon levels (Sect. 2.2).

2.4 Implementation of network graphs

For the study, the participants were presented with four network graphs (Footnote 5), one at each level of the PrivCon scale (0-3). At PrivCon0, the visualisation contained full email addresses. These were removed for PrivCon1, leaving only the shape of the connections for interpretation. PrivCon2 involved aggregating the dataset, grouping data points by email domain name. The most stringent level of privacy, PrivCon3, was implemented using noise, introduced so as to reflect the general distribution while avoiding the issues of identity reconstruction noted with network graphs ([84,85,86,87]).
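As an illustration of what a PrivCon3-style treatment of a network graph might look like, the sketch below randomly rewires a share of edges. This is an assumption made for illustration only: the noise mechanism used in the study is not specified here, and the function name `rewire_edges`, the `fraction` parameter, and the seed are all hypothetical.

```python
import random


def rewire_edges(edges, fraction=0.3, seed=0):
    """Return a copy of `edges` with a share of recipients swapped at random.

    The number of edges and every sender's out-degree are preserved, but
    individual sender-recipient links no longer reliably match the source.
    """
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    noisy = []
    for sender, recipient in edges:
        # Candidate replacements exclude the sender to avoid self-loops.
        candidates = [n for n in nodes if n != sender]
        if candidates and rng.random() < fraction:
            recipient = rng.choice(candidates)
        noisy.append((sender, recipient))
    return noisy
```

Preserving out-degree is one way of keeping a general property of the distribution intact while making individual connections unreliable; other noise models would preserve different properties.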

2.5 Implementation of mountain graphs

The mountain graphs (Footnote 6), otherwise known as stacked line graphs, have proved quite popular in the literature for portraying the ebb and flow of relationships over time. For this study, each layer in the graph represents a unique contact and the area within the layer demonstrates the extent of email communication (to, from, CC and BCC) on the given date denoted by the x-axis. Visualisations were created for PrivCon levels 0, 1 and 2. PrivCon0 included a key with full email addresses for each contact. For PrivCon1, the key was removed as a form of anonymisation. For PrivCon2, as a form of aggregation, the emails were grouped utilising the personal, professional, shopping, and practical categorisations (discussed in Sect. 3.1). These categories were then used as the different layers of the graph.
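The layering described above can be sketched as a small data preparation step. The function below is a hypothetical illustration, not the code used in the study: it assumes events arrive as (date, layer) pairs with ISO-formatted date strings, where the layer is an email address at PrivCon0 or a category name at PrivCon2.

```python
from collections import defaultdict


def stacked_layers(events):
    """Build stacked layer boundaries from (date, layer) events.

    Returns the sorted dates, the sorted layer names, and, for each layer,
    the cumulative upper boundary of its band on every date.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for date, layer in events:
        counts[layer][date] += 1
    dates = sorted({date for date, _ in events})
    layers = sorted(counts)
    tops = {}
    running = [0] * len(dates)
    for layer in layers:
        # Each band sits on top of the bands already stacked beneath it.
        running = [base + counts[layer][d] for base, d in zip(running, dates)]
        tops[layer] = running
    return dates, layers, tops
```

The same function serves both privacy levels; only the meaning of the layer key changes, which is what makes the aggregated version straightforward to produce from the same source data.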

2.6 Implementation of scatter plots

The second type of visualisation selected, scatter plots (Footnote 7), is in line with studies such as [54, 58, 59]. These were created to show the points of contact for each email throughout the date range of the dataset. The email contacts were arranged in order of frequency, from the highest number of contacts to the lowest. Colour was used to denote whether the contact appeared in the To, From, CC, or BCC field. These visualisations were included only as PrivCon0 and PrivCon1. The former included the email addresses listed on the x-axis and the latter had these redacted.
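A minimal sketch of the ordering and redaction steps follows, assuming records of the form (address, date, role). The function name `scatter_points` and the alphabetical tie-breaking rule are assumptions introduced for illustration, not details taken from the study.

```python
from collections import Counter


def scatter_points(contacts, redact=False):
    """Turn (address, date, role) records into plottable points.

    Contacts are ranked by frequency (most frequent first); ties are broken
    alphabetically so that the ordering is deterministic.
    """
    freq = Counter(address for address, _, _ in contacts)
    ranked = sorted(freq, key=lambda a: (-freq[a], a))
    position = {address: i for i, address in enumerate(ranked)}
    # With redaction, the axis labels carry no address (PrivCon1-style).
    labels = ["" if redact else address for address in ranked]
    points = [(position[a], date, role) for a, date, role in contacts]
    return labels, points
```

Note that redacting the labels leaves the frequency ordering intact, so the most active contact still occupies the first position on the axis.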

2.7 Implementation of bar graphs

For this study, bar graphs (Footnote 8) are used to show the number of contact points in the email collection on any given day as well as the type of contact (To, From, CC, BCC). For clarity, the dataset has been limited to show only those contacts with more than one connection edge. The PrivCon0 graph depicts the frequency and type of contact (To, From, CC, BCC) for each of the higher frequency email addresses. PrivCon1 is similar to this, but with the email addresses redacted. The PrivCon2 graph aggregates the activity and presents it by date rather than by individual email address. The PrivCon3 graph uses a similar presentation but, for that graph, noise has been introduced to limit the potential for reconstructing identities.
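The counting behind these bar graphs might be sketched as follows. This is a hypothetical illustration: it assumes records of the form (address, date, role) and interprets "more than one connection edge" as an address appearing in at least two records, which is an assumption rather than a definition given in the study.

```python
from collections import Counter


def bar_counts(contacts, by="address", min_contacts=2):
    """Count contact points per (key, role) for a grouped bar chart.

    `by="address"` mirrors a per-contact view; `by="date"` mirrors an
    aggregated view in which addresses no longer appear.
    """
    totals = Counter(address for address, _, _ in contacts)
    # Drop low-frequency contacts before counting.
    kept = [c for c in contacts if totals[c[0]] >= min_contacts]
    index = 0 if by == "address" else 1
    return Counter((record[index], record[2]) for record in kept)
```

Switching the grouping key from address to date is the only change needed to move from the per-contact view to the aggregated view.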

2.8 Implementation of word tree

For this study, the word trees (Footnote 9) are presented as PrivCon0-2. For PrivCon0, the visualisation includes the word tree with a reading panel on the right-hand side that allows the participant to see the content portrayed in the visualisation in the context of the collection as a whole. At PrivCon1, the sensitive information (names, email addresses etc.) has been redacted in both the visualisation and the reading panel. As a form of aggregation, for PrivCon2, the contextualising reading panel is removed.
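The structure underlying a word tree can be sketched as a nested dictionary of the words that follow a chosen root word. The code below is an illustrative assumption, not the tool used in the study: the `word_tree` function, the `EMAIL` pattern, and the `[REDACTED]` placeholder are hypothetical, and the redaction covers email addresses only, since redacting personal names would require a separate list or a named-entity step.

```python
import re

# Simplified pattern for email addresses; not a full RFC 5322 matcher.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def word_tree(sentences, root, depth=3, redact=False):
    """Build a nested dict of the words that follow `root`."""
    tree = {}
    for sentence in sentences:
        if redact:
            sentence = EMAIL.sub("[REDACTED]", sentence)
        words = sentence.split()
        for i, word in enumerate(words):
            if word.lower() == root.lower():
                node = tree
                # Walk down the tree, creating branches as needed.
                for following in words[i + 1 : i + 1 + depth]:
                    node = node.setdefault(following, {})
    return tree
```

Because redaction is applied before the tree is built, redacted tokens from different emails collapse into a single branch, which is itself a mild form of aggregation.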

3 Methodology

3.1 The data

The dataset used for this study is a personal email archive of a filmmaker who used their email account for both personal and professional purposes. Their professional activities revolved around the conception and creation of avant garde films. They attended a range of conferences and film festivals and were in contact with film institutes and artist support networks with the intent of archiving their life’s work. They worked with students on their own projects and they engaged in a variety of other artistic pursuits. They maintained a strong network of personal and professional relationships, with many individuals sitting within both categories. Furthermore, as a disabled artist, the filmmaker engaged with a variety of individuals and organisations to support them day to day. Each of these activities, alongside things like internet shopping, has left their trail within the email collection, making it a rich source for exploring the artist’s life and professional activities.

The email collection was recovered from the legacy internal hard drives of a Mac OS X desktop used by the late filmmaker. The recovered data comprise 5095 emails spanning the time period 2006-2012 (the year of the artist’s death), although only a very small number date back to 2006 and there is nothing after that until 2009. For our study, in addition to the email content, the following associated metadata were extracted: email address of sender; email address of recipient; email addresses of people copied into the email; and the date on which it was sent. Although they were not used for the current study, broader metadata included information on whether the email is junk, has been read, or has a high or low priority. This is an additional layer of complexity for an already complex data type ([88]), left for further exploration in future studies.
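The four metadata fields listed above can be read from standard email headers. The sketch below uses Python's standard `email` package and assumes each message is available as raw RFC 5322 text; the function name `extract_metadata` and the shape of the returned dictionary are hypothetical, and this is not a description of the recovery tooling used for the collection.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime


def extract_metadata(raw_message):
    """Extract sender, recipients, copied addresses, and sent date."""
    msg = message_from_string(raw_message)

    def addresses(header):
        # getaddresses returns (display name, address) pairs.
        return [addr for _, addr in getaddresses(msg.get_all(header, []))]

    date_header = msg.get("Date")
    return {
        "from": addresses("From"),
        "to": addresses("To"),
        "cc": addresses("Cc"),
        "date": parsedate_to_datetime(date_header) if date_header else None,
    }
```

Display names are discarded here and only the addresses are kept, which matches the fields listed above; retaining names would add a further identifier to manage at the redaction stage.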

To offer more digestible visualisations for the participants of this study, the data source was sampled to include two months of the filmmaker’s life. We targeted the date range from December 2010 to January 2011 (351 emails), chosen to offer a good range of personal and professional emails and to represent a number of known major events: a major holiday period and the period of production for what would be their last major film. To ensure that the dataset was focused on emails with a high level of interest to the archive and researchers, the data were manually coded into one of five categories (personal, professional, shopping, practical, advertising) based on a review of the content of the email and the presence of, for example, personal anecdotes, receipts, or unsolicited or periodical content from institutions. This was done by one of the researchers in the research project (and an author of this paper) who had been responsible for exploring the email collection and for advising the archive on its content. Those in the advertising category (e.g. spam, circulars) were excluded from the visualisation as they were judged to be of little interest or value regarding the artist’s life and work. This resulted in a dataset of 218 emails out of the two-month sample, or 5.4% of the collection as a whole.
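The sampling and exclusion steps can be expressed as two filters. The following sketch is illustrative only: the `Email` record, the `EXCLUDED` set, and the `sample` function are hypothetical names, and the category labels are assumed to have been assigned manually beforehand, as described above.

```python
from dataclasses import dataclass
from datetime import date

# Categories judged to be of little interest and left out of the dataset.
EXCLUDED = {"advertising"}


@dataclass
class Email:
    sent: date
    category: str  # personal, professional, shopping, practical, advertising


def sample(emails, start, end):
    """Return the emails within [start, end] and the retained subset."""
    in_range = [e for e in emails if start <= e.sent <= end]
    retained = [e for e in in_range if e.category not in EXCLUDED]
    return in_range, retained
```

For the window described above, a call might look like `sample(emails, date(2010, 12, 1), date(2011, 1, 31))`, with the two returned lists corresponding to the two-month sample and the final dataset respectively.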

3.2 The participants

Given the sensitive nature of the material under investigation and the restricted status of the featured archive for public release (Footnote 10), the participants for this study included only those individuals who had been granted privileged access to the collection as a part of the associated research project. The benefits of this arrangement were twofold:

  • It ensured that participants were familiar with the central subject of the email collection, creating a facsimile of the natural process of discovery experienced by researchers or archival practitioners.

  • The arrangement allowed for the testing of visualisations at all levels of the privacy scale without risking the release of sensitive data.

The participants were selected due to their range of expertise and a shared research and/or professional interest in the dataset. A more detailed breakdown of the participants’ professional profiles is given in Sect. 4.2 as the findings for Stage One of this research. The participant pool included four individuals, two who were trained archivists at different stages of their career and with different day to day responsibilities as well as two who were Arts & Humanities researchers, each from different disciplines, their work characterised by quite disparate methodologies. All participants had worked with the archive in question for a number of years prior to the commencement of this study, and so they were intimately familiar with an array of contents from within the wider collection. None, however, had engaged with the email collection beyond abstract conversations and reports at team meetings.

The number of participants involved in the study may seem small, yet it is comparable to similar studies. For instance, MUSE and the professional counterpart ePADD, used for email collections in cultural memory institutions [2, 25, 26], exploit a range of data analysis techniques to promote the exploration of email collections. In [25], the usefulness of the tool was explored through an experiment involving six participants (two archivists, a historian and three working professionals). Working with their own email collections, the participants rated the tool on a five point scale, supplementing this with qualitative comments to contextualise the responses. This is a familiar model of investigation, repeated in many of the studies to determine the extent to which a selected visualisation design supported the participants’ needs. The present study is organised as an in-depth three-stage exploration of a complex problem from multiple angles (see details in Sect. 3.3) to compensate for the limited availability of participants.

3.3 The study

The research underlying this paper adopted a Delphi study model, a research method designed to ‘obtain the most reliable consensus of a group of experts’ [89]. It involves ‘a series of questionnaires interspersed with controlled opinion feedback’ [89]. Specifically, the participants took part in three rounds of questionnaires, with Stages Two and Three incorporating the chance to review aggregated feedback from the previous stage.

3.3.1 Stage One

The focus of Stage One was to establish a baseline for how participants might engage with email data. The questions asked included background information, such as research discipline, interests and common methodologies used. This contextualises the findings from later stages and helps to understand how the researchers might engage with email collections as they are usually presented. The key question of this stage was: “What kind of research can you envision yourself conducting with email data?”.

3.3.2 Stage Two

In Stage Two of the research, participants were first presented with a summary of the Stage One responses regarding their present research, allowing them to consider and augment their previous response in light of the shared ideas. Participants were then shown a series of visualisations, grouped by type but with visualisation type and level of privacy ordered randomly. Varying the order of presentation minimised the potential for bias brought about by increased knowledge of the dataset gained throughout the survey, as well as that brought about by varying levels of interest over the course of the experiment (cf. Appendix A, Table 9). Also to minimise the potential for bias, the visualisations were given a consistent colour palette, scale, and, as far as possible, presentation (font, title/key placement, background, surroundings). Each visualisation was accompanied by a brief description to aid the participant in comprehending its scope and context and to support them in the process of interpretation.
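One way to realise an ordering that is grouped by type but random in both type and privacy level is a two-stage shuffle. The sketch below is an assumption about how such an ordering could be generated, not a record of the procedure used in the study; the function name `presentation_order` and the seed parameter are hypothetical.

```python
import random


def presentation_order(levels_by_type, seed=None):
    """Shuffle visualisation types, then privacy levels within each type."""
    rng = random.Random(seed)
    types = list(levels_by_type)
    rng.shuffle(types)
    order = []
    for vis_type in types:
        levels = list(levels_by_type[vis_type])
        rng.shuffle(levels)
        # Keep all levels of one type together in the final sequence.
        order.extend((vis_type, level) for level in levels)
    return order
```

Using the levels described in Sects. 2.4 to 2.8, the input would map network and bar graphs to levels 0-3, mountain graphs and word trees to levels 0-2, and scatter plots to levels 0-1, giving sixteen visualisations in total.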

For each visualisation, participants were asked:

  1. What kinds of information can you gather from this visualisation?
  2. Does this type of visualisation support your approaches to research?
  3. In what ways might visualisations like this help you to address your key questions/themes and/or envisioned outcomes?
  4. In what ways could the visualisation be lacking in helping you address your key questions/themes and/or envisioned outcomes?

3.3.3 Stage Three

The final stage of the survey was intended to consolidate the participants’ understanding of the varied visualisations and to facilitate their ranking in terms of usefulness compared to levels of privacy protection. This was done by collating the responses from Stage Two for each visualisation and presenting these to the participants for review as part of the survey. The participants were then asked to reconsider the visualisations and give each a score for its usefulness, together with the reasoning for this score. More precisely, for each visualisation they were asked to consider:

  1. Is there anything you would like to add or change in relation to your initial assessment of this visualisation?
  2. How useful is this visualisation for your research or practice? 1 (not useful)–7 (very useful)
  3. Why have you given this rating?

4 Findings

4.1 Overview

The detailed findings related to each of the three stages of the research are presented in Sects. 4.2, 4.3 and 4.4. Here we present a holistic overview, with numbers drawn from the usefulness ratings from Stage Three of the study. While the results pertain to Stage Three of the process, it is assumed that participants made their assessment informed by their experience throughout all three stages of the study.

The average responses on the usefulness scale (cf. Table 2) revealed that the five PrivCon 0 visualisations were the most highly rated, with an average of 5.0 across all visualisations in this category. These are followed by the four PrivCon 2 visualisations (4.69) and then the five PrivCon 1 visualisations (3.8). Those perceived as the least useful were the two PrivCon 3 visualisations (2.88).

Table 2 Average usefulness responses, on a scale of 1–7, for each PrivCon level

It will be shown in the subsequent sections that the preference for the fully disclosed PrivCon 0 reported in Table 2 can actually be variable across different visualisations, suggesting that too much data can cloud the information contained within.

The average rating across all the visualisations was 4.28, and the average for each visualisation type (cf. Table 3) was also above the midpoint. This could indicate that participants are able to envisage uses within their own work for each type of visualisation. Most highly rated were the mountain graphs, a representation of the Patterns of Relationships area of research (5.0). Following this, and closely rated with 4.58, 4.19 and 4.13, respectively, are the word trees, the bar charts and the scatter graphs. Lowest rated (3.69) are the directed network graphs.
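As a brief check on how these figures relate, the overall average of 4.28 can be recovered from the per-level averages reported above, weighted by the number of visualisations at each PrivCon level. The sketch below assumes that every visualisation received the same number of ratings, so that weighting by the count of visualisations is appropriate; the variable names are illustrative.

```python
# (number of visualisations, average usefulness) per PrivCon level,
# taken from the figures reported in the text.
levels = {
    "PrivCon 0": (5, 5.0),
    "PrivCon 1": (5, 3.8),
    "PrivCon 2": (4, 4.69),
    "PrivCon 3": (2, 2.88),
}

# Total number of visualisations across all levels.
total = sum(count for count, _ in levels.values())

# Mean of the level averages, weighted by visualisations per level.
overall = sum(count * avg for count, avg in levels.values()) / total

print(total, round(overall, 2))  # 16 4.28
```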

Table 3 Average usefulness responses, on a scale of 1–7, for each visualisation type

For a more detailed perspective on the range of responses given by participants, Fig. 3 presents the range, inter-quartile range and average usefulness scores given for each PrivCon level. This figure also demonstrates that the PrivCon 0 visualisations were consistently the highest rated, receiving only scores between 4 and 6, and with an inter-quartile range across this interval. PrivCon 2 received the next highest scores, with a range between 3 and 6 but with the majority of responses clustered between 4 and 5. Both PrivCon 1 and 3 received responses across the spectrum of the scale, although PrivCon 1 trended towards the upper end of the scale and PrivCon 3 the lower end. This result is contrary to the expected relationship between privacy awareness and usefulness.

Fig. 3 A box and whiskers diagram displaying the distribution of usefulness scores for each of the PrivCon levels

Fig. 4 A box and whiskers diagram displaying the distribution of usefulness scores for each of the participants

Another important factor to consider is that, whilst participants were given a scale upon which to rate the visualisations, there is a degree of subjectivity in the interpretation of this scale (see Footnote 11). As such, and to provide a point of reference, Fig. 4 displays the range, inter-quartile range and average usefulness scores given by each participant. Participant 1 gave the broadest range of responses and was the only participant to give a 1 to any of the visualisations. Participant 2 trended towards the positive, although they assigned a few lower scores. Participant 3 usually responded in the mid to upper range of 3–5 and never gave above a 5. Finally, Participant 4 gave the most compact responses, usually rating visualisations between a 4 and 5, and never responded below a 3.

4.2 Stage One

The first stage of the study helped to build a profile of each participant’s current work with and interest in email data, as well as an idea of overarching research interests and methodologies. A summary of the participants’ responses to this stage of the research is provided in Appendix A, Table 10, to supplement the findings presented here.

The responses represent four individuals with quite diverse interests and focal points for their work. There are two archival practitioners and two Arts & Humanities researchers. The archivists have different roles within their respective institutions with one, Participant 2, focused on the processing of large volumes of digital data within an archive and the other, Participant 3, on an array of legal, theoretical and practical factors relevant to archive management. The two researchers both have diverse research interests with one, Participant 4, centred on interdisciplinary theoretical work and the other, Participant 1, interested in fine art with a focus on film. In terms of envisioned work with email collections, the two archivists were both interested in quite practical aspects of email examination, in particular understanding the dimensions, content and risks associated with the emails as well as the provision of access to content for researchers. As noted in Sect. 2, these are all aspects that have often been the driving force behind email visualisation research. The archivists could therefore, in theory, be both users of the visualisations to aid in their archival workflow and suppliers of access to content through the inclusion of visualisations in, for example, a catalogue. This means that they provide a valuable perspective for the second and third stages of the study. The two researchers included in the study, conversely, are subject matter experts. Their focus on extracting historical and/or theoretical analysis of an artist’s life, work and network reflects the concerns of other researchers and potential users of email visualisations. This distribution of responses establishes a baseline from which to explore the participants’ responses to different visualisations in Stage Two of the study, highlighting both the similarities and disparities between their approaches to an archival collection.
Of particular interest are the threads of similarity between the envisioned work that Participants 1, 2, and 4 might conduct with email collections. Participant 2 describes their work as ‘very practical’, but the processes involved in appraising, describing, reviewing and providing access to content align, at least in terms of the mechanics, quite closely with the work of the researchers: they each seek to comprehend the collection as a whole, the context within which it sits and the relationships between named individuals/institutions. Whilst not stated as explicitly, this may also align with the ‘collections development’ described by Participant 3.

4.3 Stage Two

For this stage of the analysis, each type of visualisation is addressed individually with a table of collated responses from participants. Where relevant, direct quotes are taken from the surveys to provide additional context and a deeper analysis. Similarly, specific participants are noted if their circumstances impact on the results. Tables 11, 12, 13, 14, 15 provide the collated responses to this stage of the survey for reference.

4.3.1 Directed network graph

Fig. 5 A sample directed network graph representing PrivCon1, with email addresses redacted

Responses regarding the directed network graphs (see Fig. 5 for an example) were, on the whole, dependent on the level of detail present in the graph (cf. Table 11). PrivCon 0 and 2 were viewed most favourably regarding potential for research. Both were described as giving participants insights into the wider network, the individuals involved, power dynamics or spheres of influence. Participants also noted an indication of geographical location and professional affiliation from the email domain names. Particularly intriguing for the archivist participants was the potential for using this visualisation to link up with other resources within the archive to aid in or develop wider protocols for content or sensitivity review, an ongoing dilemma for large-scale digital collections (cf. [83, 91,92,93,94]). For the researchers, the interest was in the details that might be gleaned from the visualisation, allowing for high-level examination of the email collection and the artists’ personal and professional milieu.

PrivCon 1 and 3, those less favourably reviewed by participants, were perceived as giving an idea of overarching patterns in the dataset, but were considered too abstract. For PrivCon 1, one of the researchers found the form of the visualisation itself interesting but noted the difficulty of distinguishing key contacts from those which represented, for instance, food delivery services. This issue was exacerbated for PrivCon 3, with all participants noting the lack of information and a difficulty interpreting the data without understanding the impact of the noise introduction.

Some other issues arising with this visualisation included the density of the data points making interpretation quite difficult. In addition, it was noted at all levels that the inclusion of additional data, referencing for example the number of emails each line represented, the date of communication, or the subject of the email would enhance the usability of the visualisation. Given that the visualisation is already information dense, it would be necessary to integrate any additional data using hover over or similarly interactive functions. Making the visualisation interactive may also mitigate the issues of density, supporting a more malleable approach to exploring the data, for instance, by reshaping the network around specific data points.

4.3.2 Mountain graph

Exploring mountain graphs as depicted in Fig. 6, the participants viewed PrivCon 0 and 2 equally well, but were less engaged with PrivCon 1. PrivCon 0 supported the identification of patterns of communication for all participants, with the archivists postulating additional usages in linking this visualisation to other archive items. Participant 1 found it particularly useful for identifying individual contacts within the dataset. It should be noted, however, that Participant 3, an archivist, highlighted that the inclusion of email addresses gave rise to sensitivity concerns. The PrivCon 2 visualisation was useful to the participants in a different way, helping to identify workflow and workload over time. The inclusion of categorisation allayed some of the participants’ concerns over understanding the content of the email collection, particularly for the archivists who saw scope for ‘confirming aspects of context and... content’. Despite this, Participant 1 raised the concern that the categorisation applied to emails, not to contacts, so could not account for instances where there was ‘crossover... between professional and personal’. However, this is a data structure and markup issue, rather than an issue with the visualisation itself. Least well received of the mountain graphs was PrivCon 1, for a familiar reason: the lack of detail. Comments such as ‘needs names’ and ‘Content - as always!’ highlight the extent to which this is a priority for researchers in particular, although Participant 2 also concurred that the lack of information was a hindrance.

As a general comment, Participant 4 noted an important point about accessibility for this visualisation. As it relies on colour and, more particularly, differentiation between layers of colour, there would be issues for any user who was colour blind. Similarly, it was noted that the visualisation was ‘quite tiring on the eyes’, something echoed by Participant 2. This suggests that prolonged usage may not be tenable, particularly for larger or more complex data sets. This is more of a design issue than one relating specifically to the usage and usefulness of the visualisation, but it is an important consideration nonetheless.

4.3.3 Scatter plots

Fig. 6 A sample mountain graph representing PrivCon1, with email addresses redacted

Fig. 7 A sample scatter plot representing PrivCon1, with email addresses redacted

Of the two scatter plots (see Fig. 7 for an example), PrivCon 0 was judged by three participants to support research and practice, although Participant 2 did not agree, as there was insufficient detail to support the archival work they do. Participant 3, conversely, saw some potential in the high-level perspective on the email collection ‘to link to cataloguing data... and to aid in sensitivity review’. Despite this, they were again cautious of the privacy risks associated with the inclusion of email addresses. The researchers both agreed that the visualisation could support their work, highlighting patterns in the relationships as well as periods of high activity. Participant 4 nevertheless noted that the precise design of the visualisation ran against expectation, in that they believed the date should be placed on the x-axis, something which added a barrier to comprehension. The more privacy-aware visualisation, PrivCon 1, was again less well received by participants, with the primary issue being the level of detail.

4.3.4 Bar charts

Fig. 8 A sample bar chart representing PrivCon2, with email contacts aggregated

Out of all the visualisations, the bar charts as shown in Fig. 8 seemed to be the least supportive of the participants’ work. Only the PrivCon 0 visualisation was judged useful, although Participant 4 would require the data to be contextualised to a specific event. Most participants again wanted additional detail, such as the subject or theme of specific emails. Participant 4, for instance, notes that ‘this visualisation can respond to “who” questions... [b]ut it can’t tell me “what” the content of those conversations were’. PrivCon 1 received the least favourable review out of the whole collection, with no participant seeing potential for it to support their research/practice. Attempts were made by Participants 2 and 4 to interpret the data, but these interpretations were quite broad. Participant 3 also compares its efficacy unfavourably with that of the mountain and scatter graphs. A possible reason for this is that, within this study, both the mountain and scatter graphs were keyed towards the relationships between people, whereas the bar chart was aimed towards their behaviour. The difference is quite nuanced, but, based on the participants’ responses, important. The PrivCon 2 bar chart had slightly more potential for the participants. Whilst Participants 1 and 2 could not find a use for the visualisation, Participants 3 and 4 thought that it may support their work. Participant 3, for instance, considered that an archivist may be able to use it to contextualise a collection, particularly if used in conjunction with other resources. Participant 4 was able to gain ‘greater insight into the filmmaker’s behaviours’ as well as contextualising these in terms of the dates. Even Participant 1 considered that the visualisation could be used to compare ‘who contacted him more than he them’, but this is qualified by some doubt when they note that ‘maybe I’ve misunderstood the information’.
This is one of the few times that one of the participants highlighted an issue interpreting the visualisation. For PrivCon 3, only Participant 4 considered that the visualisation might support their work, but this is qualified when they note that they felt ‘increasingly uncertain... as to what data I am looking at’, once more highlighting the need for additional clarification regarding the introduction of noise.

4.3.5 Word tree

Fig. 9 A sample word tree representing PrivCon2, with identifiable data redacted

The word tree visualisation as shown in Fig. 9 was quite divisive amongst participants and was the only type of visualisation to elicit responses across the spectrum with regard to usefulness. That being said, it also appears to have been one of the most liked visualisations, allowing participants to delve, in reasonable detail, into the content of the email collection. PrivCon 0, for instance, was useful to two participants, whereas one was not certain and one did not feel that it would support their research/practice. Participant 1 consistently did not feel as though the visualisation supported their work, although this is tempered by the note that the visualisation would be improved ‘if there were key words’, suggesting that the central word was the issue. For Participant 4, PrivCon 1 was the most useful for their research, highlighting the frequent structures and themes present within the collection. They also indicate that, whilst not all the information present in the visualisation is relevant to their research, it would likely have wide-reaching applicability, encompassing items of import for many kinds of researcher. At the higher PrivCon levels, they found that the lack of identifiable information was an impediment to their research. However, in direct contrast, they noted a level of caution about the extent of detail present in PrivCon 0 and how that would position them with GDPR, particularly in relation to evidencing research in later work. Whilst not explicitly stated, this is also an issue with PrivCon 1, where they indicate that they could ‘hazard a guess’ as to whether the contents originated from the filmmaker or another individual, suggesting potential for identity reconstruction, even with the redaction of sensitive content.

The archivists agreed at all levels that the visualisation could support their practice. At PrivCon 0, Participant 2 notes that the visualisation would be ‘great for identifying emails with sensitive content’ and, at the other levels, they highlight that, whilst the pre-redacted content is less useful for the archivist, the visualisations could be a ‘potential access tool for a user’. Participant 3 largely agrees, noting in particular that it was a ‘clever way of looking deeply at content from across the dataset’. Despite this, they do indicate that the usefulness of the visualisation would be dependent on the search strategy, citing the idiom ‘garbage in, garbage out’.

The distinction between usefulness at different levels of privacy awareness was the least pronounced for this set of visualisations, with most participants engaging equally at all levels. A potential reason for this is that, even with the sensitive content redacted, or the contextualising window removed, the visualisation still provides targeted insight into the contents of the email, rather than focusing on high-level metadata. It is clear from both the literature review and the participants’ responses to the other visualisations that the content of emails is something that is considered key to both research and practice. This is perhaps due, at least in part, to familiar methodologies utilised throughout the Arts & Humanities and archival sciences, whereby researchers/practitioners will manually search through and engage with archival contents.

4.4 Stage Three

The final stage served to solidify and clarify the observations prevalent in Stage Two. The participants’ responses to the questions are given in Tables 16, 17, 18, 19, 20 in Appendix A. For clarity, and to avoid repetition, the questions are presented as Q1, Q2 and Q3 rather than written out in full. For reference, these are:

  Q1. Is there anything you would like to add or change in relation to your initial assessment of this visualisation?
  Q2. How useful is this visualisation for your research or practice? 1 (not useful)–7 (very useful)
  Q3. Why have you given this rating?

4.4.1 Directed network graph

The responses relating to directed network graphs (depicted in Table 16) are reasonably coherent between the participants at each of the PrivCon levels. To address first the review of responses from Stage Two, Participants 2 and 3 had nothing to add to their thoughts after considering the collated opinions. Participant 1 supplemented their thoughts on PrivCon 1 and 3, noting that PrivCon 1 had a potential knowledge gap that prohibited understanding, and that the level of detail in PrivCon 3, particularly regarding names, rendered it of little use. Participant 4 reflected on each of the visualisations, noting the high level of sensitive data in PrivCon 0 and suggesting that the redacted nature of PrivCon 1 helped them to ‘rethink how data analysis might support their research’ beyond their usual approaches. For PrivCon 2, they indicate that the visualisation might support ‘useful conclusions about the filmmaker’s creative activity’ and for PrivCon 3 they express an interest in the processes involved in the creation of the visualisation, highlighting a potential knowledge gap that could be acting as a roadblock to understanding.

To turn to the participants’ ratings, as summarised in Table 4, the PrivCon 2 visualisation was rated, on average, most highly, closely followed by PrivCon 0. The scores from every participant for both of these visualisations were towards the positive end of the scale. Participant 3 gave the lowest score with a 4, noting that PrivCon 0 was ‘quite hard to follow’ and that, whilst they could envision a usage for PrivCon 2, it was only ‘potentially some use’. PrivCon 1 received an average score of 2.75, so below the midpoint on the scale, although Participant 4 does give the visualisation a 4, noting that ‘it does look useful’ but highlighting that this usefulness is not immediately apparent and requires additional thought. Participant 1 describes the visualisation as a ‘snapshot’, suggesting something without much depth, a view echoed by Participants 2 and 3, who indicate that the redacted information and the lack of contextual information reduce the extent to which the visualisation can be useful. Interestingly, the standard deviations indicate that participants were also more in agreement over the usefulness of PrivCon 0 and 2 (standard deviations of 0.433 and 0.707, respectively) than they were over PrivCon levels 1 and 3 (0.829 and 1.090).

Table 4 A summary of the usefulness scores given in Stage Three of the study for directed network graphs

4.4.2 Mountain graph

The responses relating to mountain graphs (depicted in Table 17) are also coherent across the PrivCon levels and are also more positive than for the directed network graphs. With reference to any additional thoughts the participants had regarding these visualisations, only Participants 1 and 4 noted a shift in their thinking. For PrivCon 0, Participant 4 notes that the potential for diachronic analysis is helpful. Participant 1 agrees with the assessment given in Stage Two that this visualisation is problematic for those with colour blindness, but adds that it is easier to read than the network graph.

The scales of usefulness applied to this type of visualisation (Table 5) indicate that PrivCon 0 is regarded as the least useful and PrivCon 2 the most useful, although all levels are rated as above the midpoint of the scale. Participant 1’s scores for PrivCon 0 and 2 are not explained, but, confusingly, they rate PrivCon 1 as a 6, noting that it would be ‘very useful’ with names or dates. This response points to a potential misunderstanding of PrivCon 0, which does contain names, but has been given a lower score. All PrivCon levels within this visualisation also contain dates, again suggesting a lack of clarity or a knowledge gap in the provision of the visualisation. Participant 2, across all privacy levels, highlights the visually appealing nature of the design, but again notes the potential for accessibility issues. Despite this, they give the visualisation a 5 across the board. They indicate that there is potential for these to display frequencies more clearly than, specifically, the network graphs and the word trees, a view that is perhaps indicative that visualisations focusing on patterns of relationships are of more use than those keyed towards networks or email content. Of specific PrivCon levels, Participant 2 notes the potential to mediate sensitivity when presenting the data to a wider audience. Participant 3 sees the possibility for integrating this type of visualisation with events/trends in activity. PrivCon 1 is rated as less useful, with a score of 4 compared to 5 for the other two levels, as Participant 3 notes that the visualisation works better with context. They also make an excellent point about PrivCon 2, one that might be more widely applicable across all areas of category creation: that the categories could be subjective. This is something that could be mitigated through the inclusion of additional individuals in the tagging process.
For PrivCon 1, Participant 4 returns once more to the idea that the visualisations which lack textual elements provoke or inspire them to think differently about digital communication. For PrivCon 2, the score given reflects again on the type of categorisation, much as it seemed to for Participant 3. Participant 4 also suggests that a combination of visualisations, particularly this and the word trees, might provide a more useful perspective, especially if they were cross-referenced with regard to content categorisation.

As well as being, on the whole, more highly rated than the directed network graphs, the mountain graph visualisations also elicited a more cohesive response from participants, with the standard deviation sitting between 0.433 and 0.707.

Table 5 A summary of the usefulness scores given in Stage Three of the study for mountain graphs

4.4.3 Scatter plot

The responses relating to scatter plots (Table 18) are harder to compare with the other types of visualisations, as there are only two PrivCon levels represented. Only Participant 1 augmented their original assessment upon reviewing the collated responses, noting of PrivCon 0 that the timelines supported the mapping of activity and individuals. They also suggested that this feature could be cohesive with the paper side of the hybrid archive.

Looking at the usefulness assigned to each visualisation (cf. Table 6), the scatter plots more closely reflect the responses given to the directed network graphs, whereby PrivCon 0 is rated as much more useful than PrivCon 1, receiving an average score of 5 and 3.25, respectively. That being said, PrivCon 0 was quite divisive between the participants, with the researchers both rating the visualisation with a 6 and the archivists a 4, leading to an overall standard deviation of 1.000. The reasons the researchers give for rating this visualisation so highly (in fact the highest score given to any visualisation, given that none were rated with a 7) include its usefulness for understanding patterns of behaviour and homing in on ‘key frequency/volumes of contacts’. Sensitivity is again an issue for Participant 4, but they suggest that the visualisation might work best as an ‘internal team tool’. This perspective is highlighted by their response to PrivCon 1, where they indicate that the pattern alone is unhelpful, except in the case where it was ‘tied to a specific output e.g. a film or a specific person or contact’. If this more focused approach could be taken, then a redacted version of the visualisation could be published alongside the results, therefore reducing the risk posed by the personal data included in PrivCon 0. Participant 2 explains their score of a 4 by noting that scatter plots can be difficult to interpret due to the quantity of information they contain. They do suggest, however, that this may support a more detailed analysis rather than, for instance, the overview provided by the mountain graph. This comparison is interesting given that both forms of visualisation are intended to display patterns to be found in the relationships evidenced by the archive. This is perhaps something Participant 2 was able to engage with, suggesting a certain cohesion between the visualisation types.
Participant 3 admits that they are uncertain how they might use the visualisation, although they seem to see potential within it, perhaps for users or alternative job roles, given the score that they assign.

PrivCon 1 elicited a lower, if less diverse, response from participants, with Participants 1, 3, and 4 assigning a 3 and Participant 2 a 4. As with previous visualisations, Participant 1 indicates that the redaction of names is problematic for their interpretation and use of the visualisation. This view is shared by Participant 3, who notes that this drawback could potentially be mitigated by comparison with additional datasets. Participant 2 once again ascribes the same score to each visualisation within this set, noting the redaction as a useful possibility rather than a hindrance. When contrasted with their perspective on the directed network graphs, it is possible that the focus of the visualisation on patterns in relationships is indeed the difference in the levels of usefulness that they perceive.
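The averages and spread quoted for the scatter plots can be reproduced from the individual scores described in the text. The following minimal Python sketch assumes that the reported standard deviations are population standard deviations (dividing by the number of participants rather than by one fewer). This is an inference from the fact that this form yields the reported value of 1.000 for PrivCon 0; it is not something the study states explicitly.

```python
from statistics import mean, pstdev

# Stage Three usefulness scores for the scatter plots, as described in the
# text (order: Participants 1, 2, 3, 4).
scores = {
    "PrivCon 0": [6, 4, 4, 6],  # researchers gave 6, archivists gave 4
    "PrivCon 1": [3, 4, 3, 3],  # Participant 2 gave 4, the others gave 3
}

# Mean and population standard deviation (assumed form) for each level.
summary = {
    level: (mean(values), round(pstdev(values), 3))
    for level, values in scores.items()
}

for level, (average, spread) in summary.items():
    print(level, average, spread)
```

Under this assumption the sketch returns an average of 5 with a standard deviation of 1.0 for PrivCon 0, and an average of 3.25 with a standard deviation of 0.433 for PrivCon 1.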

Table 6 A summary of the usefulness scores given in Stage Three of the study for scatter plots

4.4.4 Bar chart

The responses relating to bar charts (depicted in Table 19) are perhaps the most disparate of any within the dataset, eliciting both the highest and lowest scores for usefulness, even in relation to the same PrivCon level. All participants other than Participant 2 augmented their original thoughts upon reviewing. Participant 1, for instance, notes that PrivCon 2 allowed for conception of contact initiated and reciprocated. Participant 3 agreed and liked that the PrivCon 0 visualisation could supplement the catalogue, and that PrivCon 2 was primarily useful in relation to specific research questions. Participant 1 once more notes that the introduction of noise acts as a barrier to using PrivCon 3, something echoed by Participant 4 in an inability to identify the types of data presented in PrivCon 3 compared to PrivCon 2.

As can be seen in Table 7, this series of visualisations follows the pattern seen in relation to directed network graphs and, to a lesser extent, the scatter plots. The PrivCon level rated as the most useful was PrivCon 0, followed by PrivCon 2 and then, on an even level, PrivCon 1 and 3. PrivCon 0 and 2 are also the most cohesive in the participants’ opinions, although the deviations are still quite high with scores of 0.029 and 1.118, respectively. The PrivCon 0 visualisation receives the highest average score, an accolade shared with the PrivCon 2 mountain graph and the PrivCon 0 word tree. This contrasts with initial considerations identified in Stage Two of the study, in which PrivCon 2 received two ‘maybe’s and two ‘no’s compared to PrivCon 0, which received three ‘yes’s and one ‘maybe’. The reasoning offered by participants for their scoring of PrivCon 0 includes that the visualisation allows the user to understand the frequency with which the filmmaker contacted people and that there are other practical applications, especially if cross-referenced with other visualisations. Participant 2 highlighted that the form of the visualisation, a bar chart, is fairly familiar and so would have a reduced learning curve, making it useful for the majority of users, something that this participant echoes across the spectrum of this type of visualisation, no matter the PrivCon level applied. They add, again, that the option for redaction is useful and that the aggregated visualisation presented for PrivCon 2 ‘can provide a lot of detail’. Other participants were less optimistic about the PrivCon 2 visualisation, noting that it ‘gives some information’ but that ‘other visualisations probably do this job better’.

Participants supplied identical responses for PrivCon 1 and 3, with the lowest score attributed by Participant 1 and the highest by Participant 2. The distinction between these two scores is quite extreme and is in part responsible for the standard deviation of 1.803 recorded for both levels. Both Participants 2 and 4 judged these visualisations to be as useful as that supplied for PrivCon 0, with Participant 2 again citing the familiarity of the design and Participant 4 pinning this level of usefulness on the potential to, at some point, identify the highest frequency contacts in PrivCon 1, and for it to be possible to identify ‘any time critical period in the filmmaker’s life’ for PrivCon 3. Participant 3, conversely, judged PrivCon 1 and 3 to be of equal usefulness to PrivCon 2, indicating that PrivCon 1 lacked contextual information and acknowledging that they could not determine how PrivCon 3 would be helpful. Participant 1 was most critical of these two PrivCon levels, indicating that PrivCon 1 contained no useful information, whereas PrivCon 3 was obscured by the noise, making it hard to read. Both of these sentiments echo this participant’s feelings about other visualisations of similar PrivCon levels.

Table 7 A summary of scores for usefulness given in Stage Three of the study for bar charts

4.4.5 Word tree

The responses relating to word trees (depicted in Table 20) follow a pattern not seen in any of the other visualisation types, where the usefulness score decreases as the PrivCon level increases. The participants also offered more and lengthier considerations on the visualisations, based on review of the collated responses from Stage Two. For PrivCon 0, they reiterate the importance of the search term and note that connecting it to a ‘more advanced search interface’ and other datasets would be beneficial. Participant 1 goes even further, indicating that they would like to see more of the email contents for each search term. Participant 4 focuses on the content of the visualisation, noting the potential for research into the emotion of the email collection, one of many related and valuable branches of research within the humanities. For PrivCon 1, Participant 1 again notes that redaction reduces the usefulness of the visualisation, something that is advanced by Participant 4, who indicates that, in particular, it could cause issues in tracing threads of conversations. Opinions on PrivCon 2 are slightly divergent with regard to the collated material. Participant 3 notes that the ‘absence of the reading panel makes this much less usable,’ but Participant 4 notes that it is ‘slightly easier to read’, making it more accessible as an interface.

As can be seen in Table 8, the responses to this visualisation are quite cohesive, with the highest standard deviation found for PrivCon 1. As noted in relation to the bar charts (Sect. 4.4.4), the PrivCon 0 word tree has one of the highest average ratings of all the visualisations. Participants highlight that the reasons for this include increasing the discoverability of the information, particularly with the reading panel for quick access. They also highlight that it limits the need for manual search of emails, although there is a caveat to both of these depending on the search term utilised. Participant 2 believed that this visualisation would be ‘easily understood by the majority of out remote users’. Participant 4, once again, returns to the idea of personal and sensitive data, considering that this visualisation might be ‘too revealing’. The feedback from participants relating to PrivCon 1 is similar in nature, again highlighting the possibility for more efficient navigation and discoverability of content. Participant 2 once more notes the usefulness of being able to redact content and Participant 4 considers the role this type of visualisation could play in relation to ‘questions of attribution’, but once more highlights that this would need to be ‘in concert with other tools’. Interestingly, Participant 1 rates the PrivCon 2 visualisation as being as useful as PrivCon 0, noting that the usefulness depends on the search term. Every other participant rates this visualisation as the lowest of this set, with Participant 2 noting that the lack of a reading panel might be ‘frustrating’, something that is reflected by Participant 3, who notes that the removal of the panel doesn’t bring any advantage and is, in fact, illogical when there is the ‘opportunity to link a finding aid to the content’. Participant 4 reflects that the visualisation would be useful, but may require them to adapt ‘the nature of that research in engaging with the visualisation’.

Table 8 A summary of scores for usefulness given in Stage Three of the study for word trees

5 Discussions and implications

5.1 Discussions

On the whole, participants were able to engage creatively and productively with the majority of the visualisations. The responses given by both researchers and archivists indicate that participants were able to envision how visualisations might support their work should they engage with email collections. At times, this usage was concomitant with existing practice, for instance, supporting existing activities or research questions. At others, the visualisations prompted participants to consider new perspectives on how they might engage with the data. Participant 4 regularly reflected on new areas of thought prompted by surveying the email collections through the use of visualisations, or, in some cases, by the creation of the visualisation itself. Similarly, Participants 2 and 3 noted several possibilities for integrating visualisations into the archival workflow, supplementing the catalogue, or providing a point of access for users. These findings support established thought (footnote 12) that visualisations support holistic, exploratory engagement with data, encouraging a user to engage with existing modes of thought but also helping them to gain new insights and therefore, potentially, prompting new approaches and questions in individual subject areas, cross-disciplinary research or professional practice.

In terms of the impact of different levels of privacy awareness on usefulness, the findings demonstrated that although each of the PrivCon levels achieved at least one score of 6, the distribution of the other scores varied quite dramatically and yielded unexpected results. It was postulated in [11] that ‘when considering email data from the perspective of humanities researchers, whose standard methodologies involve the close and usually manual examination of data, the scale of privacy may well be considered inversely related to the degree of useful access’. However, this empirical investigation demonstrated that the situation is more nuanced than that, with usefulness dependent on the underlying focus of the data and associated analysis as much as on the restrictions introduced by the privacy management strategy.

As revealed in [11], PrivCon 1 (particularly anonymisation, pseudonymisation and redaction) represents the most popular privacy management strategy employed by those conducting research into email collections through the use of visualisations. Most participants, however, viewed this redaction as removing key information (e.g. names) that was essential to their work. The sense, for most, was that simply viewing the overarching pattern made by individual data points was insufficient for detailed analysis within an arts & humanities and archival workflow context. To a degree, this might be mitigated by the use of a different technique, such as pseudonymisation, whereby participants would still be able to follow the threads of specific individuals even if that individual was not explicitly named. This option, however, is riskier in terms of the potential for re-identification (cf. [84,85,86,87]). Conversely, Participant 2 acknowledged that the opportunity to redact content was beneficial to allow the wider release of email data. In line with this, Participant 4 revealed a level of anxiety regarding the amount of information available at the lower PrivCon levels, especially as it pertained to disseminating their research. This indicated that the higher PrivCon levels might have specific purposes for the public-facing side of research or practice, after the data have been surveyed and analysed without the use of a filter. In fact, this follows the pattern found in many of the studies identified as associated with PrivCon 0 datasets in [11]. These papers would facilitate open access to the data for the researchers involved (often utilising the participants’ own email collections) and then anonymise, pseudonymise and/or redact content to allow for publication of examples. In terms of active research or practice, however, not only do these approaches provide a lower level of privacy for the data subjects, but they also provide little usability for follow-on work.

The results for PrivCon 2 were most strikingly contrary to the expectation of the relationship between privacy awareness and usefulness. Whilst, on the whole, not viewed as being quite as useful as PrivCon 0, visualisations in this category are well regarded by the participants. In one notable instance, the directed network graphs, the PrivCon 2 visualisation received a slightly higher score than PrivCon 0. The participants’ responses and proposed usages for this privacy awareness level suggest that this higher level of protection concurrently offers a greater range of opportunities for researchers and practitioners to engage with email collections. By grouping data points so that the individual is hidden in a crowd, this type of visualisation offers a summary or intermediary form of analysis that can inform and inspire the user in their work. Such holistic perspectives are increasingly proving valuable within the humanities with the advent of data-driven studies such as those associated with, to name a few areas, distant reading (cf. [95,96,97]), digital humanities (cf. [98,99,100,101]) or machine learning and AI (cf. [102,103,104]). In addition, an email collection results in a large, potentially untenable number of data points. The dataset utilised for this study, for example, was a small sample of the complete email collection (approximately 5.4%) and this, in turn, was a relatively small email collection compared to those that exist in more recent archival datasets ([3, 105]). Even at the scale presented in this study, participants raised concerns about the level of detail present in some of the visualisations, the network graphs in particular, suggesting that they might become unsustainable if expanded to larger experiments. The introduction of interactive elements (e.g. the ability to zoom, re-centre, include hover-over information) is one solution to mitigate these issues, but these demand a higher level of technical skill on the part of the creator of the visualisation, as well as greater hardware and software requirements. The amalgamated nature of PrivCon 2 style visualisations is another possibility, and one with both a high level of usability and privacy awareness.
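The idea of hiding the individual in a crowd can be illustrated with a minimal sketch. It is not the procedure used to build the study's PrivCon 2 visualisations: the function name `aggregate_by_group`, the group labels, and the threshold `min_group_size` are all assumptions introduced here for illustration. The sketch counts emails per group rather than per named person and folds any group with too few distinct members into a catch-all category.

```python
from collections import Counter, defaultdict

def aggregate_by_group(messages, min_group_size=3):
    """Count emails per group instead of per named individual.

    messages: iterable of (recipient, group) pairs, one pair per email.
    Groups with fewer than min_group_size distinct recipients are merged
    into "Other" so that a bar never describes only one or two people.
    (Hypothetical helper; names and threshold are illustrative only.)
    """
    message_counts = Counter()
    members = defaultdict(set)
    for recipient, group in messages:
        message_counts[group] += 1
        members[group].add(recipient)

    aggregated = Counter()
    for group, count in message_counts.items():
        label = group if len(members[group]) >= min_group_size else "Other"
        aggregated[label] += count
    return dict(aggregated)

# Invented example data: six emails sent to five recipients.
messages = [
    ("alice", "Production"), ("bob", "Production"), ("carol", "Production"),
    ("alice", "Production"), ("dave", "Festival"), ("erin", "Press"),
]
print(aggregate_by_group(messages))
```

The merged "Other" category may itself contain very few people, so a fuller treatment would need to check it against the same threshold or suppress it.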

The final PrivCon level explored in this paper, PrivCon 3, was regularly judged to be the least useful to the participants’ work. The reasoning behind this appears to be, in the first instance, a knowledge gap. There were a number of instances throughout the study where participants were uncertain about engaging with the visualisations. In fact, the majority of issues arose from the level of detail and context (or lack thereof) for the visualisations. The only issue where the participants consistently exhibited anxiety about their ability to comprehend the visualisation, both at Stage Two and Stage Three, was for PrivCon 3. Here participants expressed the need to more completely understand the processes underlying the generation of noise and how this might impact upon their analysis of the data.
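To make the notion of noise generation more concrete, the following is a minimal sketch of one widely used approach, the Laplace mechanism from differential privacy, applied to the counts behind a bar chart. It is an assumption that this resembles the noise used for the PrivCon 3 visualisations; the function names, the parameter `epsilon`, the default `sensitivity` of 1 (each email contributes to exactly one bar), and the monthly counts are all illustrative.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential draws with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_counts(counts, epsilon, rng=None, sensitivity=1.0):
    """Add Laplace noise to each count (hypothetical helper).

    A smaller epsilon means more noise and stronger protection;
    a larger epsilon means values closer to the true counts.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    return {key: value + laplace_noise(scale, rng) for key, value in counts.items()}

# Invented monthly email counts, perturbed at two privacy settings.
monthly = {"Jan": 42, "Feb": 17, "Mar": 3}
print(noisy_counts(monthly, epsilon=0.1, rng=random.Random(0)))
print(noisy_counts(monthly, epsilon=5.0, rng=random.Random(0)))
```

Running the sketch at different `epsilon` values shows why small bars become unreliable first: the noise has the same spread for every bar, so it can swamp a count of 3 while barely changing a count of 42.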

Within these overarching patterns, there were some possible influencing factors or points requiring further investigation. There was one instance where PrivCon 1 was rated more highly than PrivCon 2 and that was in relation to the Word Trees. This disparity from the overarching pattern is perhaps best accounted for by the removal of the reading panel for PrivCon 2. Similarly, there was evidence of anomalous results for the Mountain Graphs. Each PrivCon level for this set received very similar usefulness results from the participants. The distinction came from Participant 1, who gave a rating of 4 for PrivCon 0 and a 6 for levels 1 and 2. In principle, this is a truly intriguing result; however, when exploring their reasoning behind the score, there appears to have been some confusion, given that under PrivCon 1 it is noted that ‘if it had the names/dates it would be very useful’. Each of these graphs does have the date included and the PrivCon 0 graph also has the names, but was given a lower usefulness score. Unfortunately, there is no reason given for the PrivCon 0 score. Additional investigation would be required to facilitate a more concrete analysis.

5.2 Implications

This work has implications primarily for the archival sector, but also for any researcher who might engage with email research data. Archives are increasingly faced with the daunting challenge of handling the ingest, review, and management of data from large scale digital ([16, 83]). Of especial concern is how archives might process increasing archival digital collections and conduct sensitivity reviews on a timescale that allows researchers timely access to the data contained within. The visualisations presented in this paper offer a viable but adaptive level of privacy protection to the individuals named within an email collection in a rapid, relatively resource-light manner. More than this, these visualisations enable useful access on the part of researchers, supporting them to engage with a collection that, in its present condition, would otherwise remain closed to their work. More specifically, this work has revealed that commonly utilised methods for protecting privacy whilst facilitating access, notably those that fall under PrivCon 1, may not actually be the most advantageous methods to utilise. Instead, the research suggests that researchers and archivists both favour meaningfully amalgamated perspectives on the data, such as those contained within PrivCon 2.

6 Conclusion

The research presented in this paper represents a strong step towards the integration of privacy-aware visualisations with an archival email collection to facilitate access to content that might otherwise need to be closed due to privacy considerations. The experiment was conducted with a small specialised group of researchers and archivists, each of whom had privileged access to a filmmaker’s email collection. This allowed for the examination of email visualisations from across the spectrum of privacy levels identified in [11], without the need to modify or review the collection in a way which might interfere with the results. The participants offered valuable insights into how the visualisations might support their work, helping to identify the relationship between privacy-aware strategies and their perceived usefulness in relation to their research and/or practice. Their comments and justifications helped to identify email data analysis, visualisation and communication features of note that can facilitate their work.

There are several valuable avenues available for advancing the work presented in this paper. The most immediate would be to replicate the study with a larger participant pool that encompasses a wider range of research and professional expertise. There are certain impediments to this with regard to disclosing potentially personal and sensitive data to a wider user group; however, with proper ethical considerations and a carefully selected dataset, such an experiment could provide essential validation for the results of this study. Incorporating additional datasets would also allow elucidation on the impact of scale that is raised in Sect. 5.1. Beyond this, there is an established consideration that must be investigated with regard to ratings. It can be difficult to accurately compare ratings between individuals (cf. [90, 106]), particularly in situations where the data may be sparse (cf. [107]). Increasing the dataset and applying more rigorous methods of averaging out the responses, for instance similarity metrics such as Pearson’s Correlation or Jaccard similarity (cf. [108,109,110]), would be a step forward in understanding these effects. In addition, it would allow for a greater diversity of responses, incorporating not only additional research and practitioner interests, but also participant backgrounds to allow for a more democratic review of the visualisations.
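As an illustration of the two similarity metrics named above, the following sketch computes Pearson's correlation between two raters' scores and the Jaccard similarity between the sets of visualisations each rater found useful. The ratings, the visualisation labels, and the helper names are invented for the example and are not taken from the study's data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation between two equal-length lists of ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two ratings")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a rater gives constant scores")
    return covariance / (spread_x * spread_y)

def jaccard(a, b):
    """Jaccard similarity: size of the overlap divided by size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)

# Hypothetical ratings from two raters for the same four visualisations.
# Rater B scores everything one point lower but ranks them identically.
rater_a = [6, 2, 5, 3]
rater_b = [5, 1, 4, 2]
print(pearson(rater_a, rater_b))

# Hypothetical sets of visualisations each rater marked as useful.
useful_a = {"bar chart", "word tree", "network graph"}
useful_b = {"word tree", "network graph", "scatter plot"}
print(jaccard(useful_a, useful_b))
```

Pearson's correlation is insensitive to one rater being consistently harsher than another, which is one reason it is often preferred to comparing raw scores; Jaccard similarity discards the scores altogether and compares only which items were selected.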

Similarly, the breadth and depth of each PrivCon level and each branch of research have, by necessity, been addressed on a surface level in this study to allow for an overarching perspective on the situation as a whole. The PrivCon levels identified in [11] encompass many more privacy management strategies that could be reflected in an extended study. Equally, the branches of research interests could be deconstructed to explore further facets and a variety of other visualisation designs. Additional investigations, perhaps focusing on the nuances of each area of research or each PrivCon level, would allow for a more nuanced analysis of usefulness. Equally, this more focused investigation would permit opportunities for exploring more detailed privacy management strategies, such as ascertaining the impact of choosing between direct redaction, anonymisation, and pseudonymisation or examining the more advanced techniques involved in PrivCon 3 (e.g. differential privacy).

Another area of interest would be to touch more deeply on the potential of the visualisations to facilitate and even provoke new areas of investigation, or new research questions within established fields. A recent AHRC funding opportunity entitled ‘Embed digital skills in arts and humanities research’ (footnote 13) is one indicator amongst many that humanities research is increasingly leveraging the flexibility of the digital format to encourage new areas of thinking. Visualisations sit well within this remit as an approachable and flexible interface for data-driven research. The archivist participants in our study were, in fact, able to see the potential in these to act as a mediator for their collections, enhancing discoverability and accessibility. This is something that has also been touched upon in, for example, [111] and [112], as well as in the increasing presence of heritage datasets such as Digital Bodleian (footnote 14) and the National Library of Scotland’s Data Foundry (footnote 15). Future work could contribute to the wider research community’s understanding of privacy and privacy management by engaging more closely with trained archivists to assess how the levels of privacy managed by visualisations compare with those of a thorough sensitivity review. This is likely to augment the viability of incorporating the visualisations into an interface for an email collection, something that could be of tangible benefit to both archive practitioners and users alike.