Introduction

Water utilities are estimated to use ~10.2 EJ/year of primary energy, representing ~1.7–2.7% of the global total1. Global provision of water to end users is estimated to generate annual emissions of 0.3–2.8 GtCO2, accounting for 0.2–2.6% of the total greenhouse gas (GHG) emissions worldwide2. Altogether, the water sector has notable potential for saving energy and reducing GHG emissions. Quantifiable opportunities related to saving energy include between 0.5 and 1.1 GtCO2/year by saving water at the end-use level and between 0.2 and 0.7 GtCO2/year by tapping the energy potential of wastewater2. The potential for energy and emission savings achievable by other operational strategies (e.g., decentralization, nature-based solutions) remains unquantified.

While water utilities contribute to global GHG emissions, the impacts of climate change, in turn, affect water utilities’ core operations. In water-scarce areas, stress on water abstractions increases as surface water runoff may decline by up to 30%3,4,5. Additionally, urbanization is expected to raise water demand in cities by 80% within the next 30 years, leaving an estimated yearly deficit of available water equal to 1386–6764 million m3 and affecting ~440–673 million people worldwide6,7. In other areas, the risk of flooding and the resulting combined sewer overflow (CSO) is expected to increase by up to 450%, primarily due to the combined effect of more intense and frequent precipitation extremes, number of storm days, and an increase in impervious areas due to urban development8,9,10.

Future climate and changing water demand will thus confront utilities with challenges that are unmanageable with current equipment for monitoring and maintenance of their infrastructure assets and management of their operations7,11. Digital technologies have proven effective in increasing both the resilience and efficiency of water utility core operations. For instance, data gathered by advanced metering infrastructure (AMI) from smart meters at high spatial and temporal resolutions12,13 generate valuable insight into consumer behavior, enabling water demand pattern analysis14 and demand management15. The resulting consumer feedback helps foster water conservation behavior, with potential for long-term persistence when provided consistently16. Efficiency measures aimed at reducing water demand could also contribute substantially to savings in water-related energy, given that water heating at the end-use level may account for as much as 95% of regional water-related energy use17.

Additional potential for saving water and energy remains in reducing leakages both in the water distribution network (WDN)18 and at the post-meter level19, and optimizing water distribution and sewage operations by automatic control schemes (e.g., Model Predictive Control (MPC)). The implementation of MPC schemes in WDNs may result in potential energy savings of up to 10%20,21. Similarly, sewage operations may benefit tremendously from the application of MPC, reducing CSO by up to 98.4% of the potential reducible volume22.

Recent policy proposals addressing climate change and water security in the water utility sector promote the exploitation and widespread deployment of digital technologies (i.e., digital transformation23) and outline strategies for their adoption as effective solutions to enhance water utilities’ operational efficiency and resilience24,25,26. In practice, digital transformation requires extensive deployment of sensors, advanced Information and Communication Technology (ICT) infrastructure, and automated system control with smart actuators. Throughout the water utility sector, uptake of such devices is still quite slow, currently resulting, e.g., in relatively low sensor coverage and data availability in most water infrastructure when compared to electrical grids27. For effective mitigation, current analog and stand-alone devices need replacement with digital and ICT-enabled equipment, while the coverage of data collection needs to be increased to unlock the potential of digital technologies in ways that are cost-effective, and thus viable in practice by water utilities.

The technology adoption process is possibly hindered, on the one hand, by long replacement time scales and high replacement costs in the water utility sector, along with the absence of effective common regulation that fosters the installation of digital devices27. On the other hand, technological guidelines with regard to overall lifetime, component replacement, and maintenance intervals are often individually provided by hardware manufacturers. Moreover, current replacement strategies may not even consider the deployment of digital-ready devices, due to a lack of regulation but also a limited affinity for innovation in a sector of critical infrastructure where safety is of utmost priority and internal best practices often remain unchanged.

Learning from utilities that have already initiated the digital transformation journey, collecting best practices, and improving our knowledge of the current progress of this transformative process are fundamental to overcoming the above potential limitations and enabling all involved stakeholders to embrace digital technologies.

However, the status quo of digital transformation and technology uptake remains largely understudied worldwide. Most investigative works focus on the implementation and use of a single digital technology within the context of a case study16,28 or propose high-level policy strategies and frameworks for the general adoption process25,29,30. Comprehensive studies on the general process are rare and limited to specific subsections of the urban water cycle31. To the best of our knowledge, a study that investigates the overall uptake of digital technologies in the water utility sector is unavailable to date.

In this explorative study, we present the results of a globally conducted online survey (Smart Water Survey32, see Methods) designed as a structured interview and involving 64 utilities from 28 countries. We investigate the following research questions: How is digital transformation impacting the water sector? What are the drivers for such transformation? What are the key-enabling technologies?

Results

A survey of water utilities’ digital transformation

The Smart Water Survey (SWS) maps out a water utility’s operating network and company structure as divided into the following five subdivisions displayed in Fig. 1: (1) water supply & drinking water treatment (WS), (2) water distribution network & operating systems (WD), (3) wastewater & rainwater management (WW), (4) customers & demand management (CD), and (5) data warehouse & IT systems (IT) (see Methods for survey design and definitions). Interactive labels on the schematic in Fig. 1 with examples of digital technologies and practices were provided to the survey participants to contextualize how the broad concept of digitalization can be realized in practice in each subdivision and avoid ambiguity. We investigate three main aspects of digital transformation in each subdivision, including the status of deployment of relevant digital technologies, the driving factors for technology adoption, and future key challenges. Additional questions target descriptive information on the utility’s general structure, size, age, and organization, and the specific use and deployment of particularly interesting digital technologies such as smart meters and smart control elements. Independent sections and questions in the survey are organized such that the potential information bias due to the utility’s perception of digitalization is minimized. A detailed list of all questions is provided in the Supplementary Notes 1.

Fig. 1: Cyber-physical network of a digitalized water utility—simplified schematic.

Each numerical label represents one of the following five subdivisions: (1) water supply & drinking water treatment (WS), (2) water distribution network & operation systems (WD), (3) wastewater & rainwater management (WW), (4) customers & demand management (CD), and (5) data warehouse & IT systems (IT). Binary sequences are illustrative only.

A total of 64 utilities from 28 countries submitted their survey answers in a complete form, and all entries have been individually validated and cross-referenced. Figure 2 shows the geographical distribution of all participating utilities. They are fairly spread out across the world, with a denser representation in Europe, but with an overall coverage of all continents. Survey respondents are primarily public utilities, serving predominantly domestic end users, and in operation for longer than 20 years (Supplementary Fig. 1).

Fig. 2: Geographical distribution of water utilities that responded to the Smart Water Survey.

Colored circles represent the location of the 64 water utilities that provided complete responses to the survey (after data cleaning). Each circle is placed in the geographical center of a country, with the color bar indicating the number of respondents per country. In total, respondent utilities were from 28 countries worldwide.

Penetration of digital technologies in water utilities

A central aspect of digital transformation is the deployment of digital technologies and their adoption by each utility. We introduce the utility digitalization score (UDS) as an indicator of a utility’s overall digitalization progress ranging from zero (lowest value) to three (highest value), representing stages of the technology innovation process33 (see Methods). Higher UDS represents greater penetration of digital technologies within the respective utility.

Results in Fig. 3 show that all utilities in our sample have commenced the process of digital transformation in at least one subdivision and have adopted/planned digital technologies in their operations. This is reflected by an overall high UDS (median UDS above 1.8 in all subdivisions) and minimum values greater than 0.6 in the WS, WD, and IT subdivisions, indicating that at least one digital technology has been selected for implementation or is already operational.

Fig. 3: Penetration of digital technologies in water utilities indicated by the utility digitalization score.

The UDS (see Methods) is evaluated for all water utility respondents and is reported for each water utility subdivision: a WS; b WD; c WW; d CD; and e IT. Values of UDS equal to 0 indicate low digitalization progress and values of 3 indicate high digitalization progress. The central white dot in each violin plot represents the median; the lower and upper limits of the thick black marker correspond to the 25th and 75th percentiles, and the upper (lower) whiskers extend to 1.5 (−1.5) times the interquartile range or max. (min.) values, respectively.

Comparative analysis and statistical significance assessment (see Methods) show that utilities have invested the most effort in digitalizing their WD subdivisions, which exhibit the greatest penetration of digital technologies with a median UDS of 2.20 and lower (Q25) and upper (Q75) quantiles of 1.85 and 2.80 (see Supplementary Tables 1 and 2). In contrast to WW operations (Q50 = 1.83, Q25 = 1.17, Q75 = 2.33), WDNs convey potable water directly to customers entailing a special focus on water quality and supply service reliability. While strict regulation and risk-aversion may prevent the testing of engineering solutions in WDNs, our results suggest that digital and data-driven approaches represent rather low-risk alternatives for utilities. Conversely, similar risks do not apply to sewer systems to the same extent, possibly delaying the development of digital solutions in the WW subdivision. The lowest UDS are observed for the CD subdivision, with a median value of 1.83 (Q25 = 1.17 and Q75 = 2.04). This may be a consequence of only recent advancements in water demand and consumer behavior studies and less established business models for smart metering technologies15. However, the overall small spread of UDS median values suggests that digitalization is tackled in all subdivisions.

Finally, we did not find any significant correlation between the UDS and a utility’s descriptive characteristics or its country’s socio-economic context. While utility size (WDN pipe length, number of customers, and relative population served over the country’s total) and utility experience often emerged as important predictors to model UDS, a rather poor model fit (R² lower than 0.5) prevents us from claiming that significant relations to the UDS exist (see “Feature selection to identify UDS predictors” in the Methods and Supplementary Figs. 2 and 3).

Drivers of digital transformation

Advancing digital transformation in the water sector is only possible when the utilities’ motives for action are understood. While the motivations for the uptake of digital technologies may depend greatly on case-specific conditions and personal mindsets, we chose to categorize the potential drivers of digital transformation examined in this study into three groups: hydroclimatic factors (HCL), primarily including floods and droughts; economic factors (ECO), including cost benefits and competitive advantages; and factors attributed to government regulation (GOV), including restrictive regulation as well as incentives & subsidies. Figure 4 reports the distribution of these groups for each utility subdivision as stated by the surveyed water utilities in relation to the UDS. Additionally, the relative frequency of each driving element is annotated in the colored boxes of each subplot.

Fig. 4: Drivers of digital transformation reported by the Smart Water Survey respondent water utilities and their relation to the utility digitalization score.

Drivers stated by surveyed utilities are represented by three different groups (hydroclimatic factors, HCL; economic factors, ECO; government regulation, GOV). The relative occurrence of each driver group is annotated in the colored boxes. All distributions are estimated using kernel density estimation (KDE) for each driver group and subdivision: a WS; b WD; c WW; d CD; and e IT. The relative frequencies of driving factors do not sum up to 100% in c due to rounding effects and in d due to the low representation of HCL factors, which account for only 2% in the CD subdivision.

Overall, economic factors were found to have the largest influence on digital transformation (66% of occurrences across all subdivisions). Government regulation and hydroclimatic factors are driving elements in 26 and 8% of the cases, respectively. The development in the IT subdivision (Fig. 4e) is predominantly driven by economic factors (85%). This is also the case for the WD (70%, Fig. 4b) and the CD subdivisions (67%, Fig. 4d). A less distinct picture is presented in the WS and WW subdivisions (Fig. 4a, c), where the influence of both government regulation and hydroclimatic factors on digital transformation increases. The major drivers among hydroclimatic factors are droughts in the WS subdivision (18%) and floods in the WW subdivision (21%).

However, while mainly economic factors motivate utilities to tackle their digital transformation and start such a journey, we did not find empirical evidence supporting the claim that these factors also influence the successful implementation of their digitalization strategy. Figure 4 already suggests that there is no clearly emerging pattern linking the distribution of HCL, ECO, and GOV drivers with the UDS achieved by a utility. This is confirmed more thoroughly by the analysis of variances (ANOVA, see Methods and Supplementary Table 3) which suggests that no particular driving element is significantly dominating a utility’s progress of digital transformation, reflected by its UDS, when considering a confidence level of 95%. The gap between motivations and actual technology adoption may depend on various factors that are not captured in the three types of determinants considered in this study, including internal processes, company structure, or even individual vision, understanding, and capability of utility managers.
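
To illustrate the type of test referred to above, the following Python sketch computes the F statistic of a one-way ANOVA that compares UDS values grouped by stated driver. It is a minimal sketch that assumes a simple one-way design; the exact test configuration is the one described in the Methods. The function name and the UDS values are invented placeholders, not survey data.

```python
def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA.

    `groups` maps a group label (e.g., a driver group) to a list of values.
    """
    all_values = [v for values in groups.values() for v in values]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    ss_between = 0.0  # variation of group means around the grand mean
    ss_within = 0.0   # variation of observations around their group mean
    for values in groups.values():
        group_mean = sum(values) / len(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in values)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Placeholder UDS values per driver group (illustrative only, not survey data)
example_uds = {
    "HCL": [1.0, 2.0, 3.0],
    "ECO": [2.0, 3.0, 4.0],
    "GOV": [3.0, 4.0, 5.0],
}
f_stat, df_b, df_w = one_way_anova_f(example_uds)
```

The F statistic would then be compared against the critical value of the F distribution with (df_between, df_within) degrees of freedom at the 5% significance level. An F value below that threshold is consistent with the finding that no driver group significantly dominates the UDS.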

Key-enabling technologies for digital transformation

To identify the key-enabling technologies for the digital transformation of water utilities, we analyzed the penetration of individual digital technologies across all water utilities in the sample of respondents by computing the technology availability score (TAS; see Methods). The TAS computes the unweighted average of the digitalization score for a single digital technology across all utilities, hence, indicating its availability among the entire set of utilities. Unlike the UDS, the TAS focuses on the deployment of individual technologies rather than a utility’s overall digitalization progress. The resulting scores are displayed in Fig. 5. Widely adopted digital technologies with higher TAS may function as entry points to digital transformation. Conversely, the adoption of digital technology with lower TAS could pose greater hurdles to utilities and, therefore, require more experience and maturity in digital transformation.
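
As a rough illustration of the score described above, the following Python sketch computes a TAS as the unweighted mean of per-utility digitalization scores (0 to 3) for each technology and ranks technologies by the result. The function name, utility labels, and score values are hypothetical and chosen for illustration only. Skipping utilities without a score for a given technology is an assumption of this sketch, not a rule stated in the survey design.

```python
from statistics import mean


def technology_availability_scores(score_table):
    """Unweighted mean digitalization score per technology across utilities.

    `score_table` maps utility -> {technology: score in 0..3}.
    Assumption: utilities lacking a score for a technology are skipped.
    """
    technologies = {tech for scores in score_table.values() for tech in scores}
    return {
        tech: mean(
            scores[tech] for scores in score_table.values() if tech in scores
        )
        for tech in technologies
    }


# Invented scores for three placeholder utilities (not survey data)
toy_scores = {
    "utility_A": {"GIS": 3, "drones": 1},
    "utility_B": {"GIS": 2, "drones": 0},
    "utility_C": {"GIS": 3, "drones": 2},
}
tas = technology_availability_scores(toy_scores)

# Technologies ordered from highest to lowest TAS
ranking = sorted(tas, key=tas.get, reverse=True)
```

In this toy example, the technology at the top of `ranking` would correspond to a potential entry point to digital transformation, while the one at the bottom would indicate a technology that is adopted later.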

Fig. 5: Penetration of digital technologies in different water utility subdivisions.

The technology availability score (TAS) is computed for individual technologies and refers to their application to individual subdivisions in the entire water utility sector: a WS; b WD; c WW; d CD; and e IT. TAS values of 0 indicate low penetration and availability of a given technology, and values of 3 indicate high penetration and availability (see Methods). For more details on the 95% confidence intervals for the TAS, see Supplementary Table 4.

Figure 5 confirms our previous findings on the overall progress in the digitalization of utilities across their subdivisions. The WD subdivision appears the most digitalized, and related technologies seem already deployed and operational throughout most utilities. These include automated controls (TAS = 2.50), leakage detection (TAS = 2.50), and device failure detection (TAS = 2.47) algorithms, which are among the most widespread digital technologies within the sample of utilities. Conversely, digital technologies in the CD subdivision remain primarily in the planning stage or are only starting to be deployed. Digital portals (TAS = 2.11) and smart meters (TAS = 1.84) in combination with automated meter reading (AMR) (TAS = 2.03) appear to be in the testing and initial deployment phase. The implementation of end-use disaggregation (TAS = 1.02), in particular, is planned at a later stage or was not considered at the time of the survey. Unlike in other subdivisions, the range of TAS values in the WS subdivision is rather large. Geographic Information Systems (GIS) (TAS = 2.73) and on-site offline (TAS = 2.34) and online (TAS = 2.63) sensing technologies are in operation at almost all utilities, while remote sensing technologies with drones (TAS = 1.06) and satellite imagery (TAS = 0.92) are instead planned at later stages. Overall differences and TAS trends across subdivisions suggest that digital innovation is being embraced gradually in the water sector rather than in a disruptive manner.

Discussion

Impact of digital transformation in the water utility sector

Our survey and subsequent analyses provide three key insights into the current state of digital transformation of the water utility sector. Firstly, all utilities in our sample have started digitalizing their urban water cycle and digital transformation has already impacted utilities in all geographical regions regardless of specific circumstances and challenges they are facing. Secondly, we provide a data-based outline of key digital technologies that have been implemented, which considers the degree of penetration, enabling utilities to make informed decisions about their strategy and help them prioritize the implementation of these technologies. The introduction of digital technologies to water supply and distribution systems is the entry point that leads to the further adoption of key-enabling technologies in the entire urban water system. Thirdly, our data indicate that digitalization efforts in all subdivisions are driven mainly by prospective economic benefits rather than government regulation or hydroclimatic factors, while local differences not grasped by our survey might exist. While a utility’s motivation to engage in digital transformation is influenced by these drivers, no empirical evidence was found that the same factors also determine a utility’s progress of digital transformation in practice and deployment of specific technologies. Disentangling stated from revealed preferences may be subject to future monitoring of the actual uptake of digital technologies in the water utility sector.

Study limitations

All utilities surveyed in our sample regarded digital technologies as potential solutions valuable for current and future technical and hydroclimatic challenges. For instance, digital technologies are expected to have great potential for improving efficiency, especially in the CD subdivision, e.g., through intensified consumer outreach and feedback resulting in improved demand management, despite its low current penetration rate. This is also true for drinking water monitoring and surveillance technologies, wastewater treatment, and infrastructure and operating systems.

However, we are aware that utilities which responded to our survey invitation may be more familiar with digitalization and further advanced in their digital transformation than others. Conversely, many utilities that have not commenced their digital transformation process or face restricted financial capacity to start the process might have refrained from sharing their perspective on digitalization. This type of bias, referred to as the non-response or selection bias34, was likely not avoidable within the framework of this survey that relied on voluntary subject participation. While being potentially accessible to all utilities worldwide (the Smart Water Survey was accessible via the World Wide Web and translated into multiple languages), we cannot claim global coverage and statistical representation of the global water utility sector. With an overall final sample size of 64 utilities that responded out of the several hundreds of thousands of utilities existing worldwide, our study was unable to achieve greater coverage of water utilities worldwide, despite multiple targeted outreach campaigns (see Methods). As a result, we neither claim that our survey is representative of all types of water utilities (utility size, ownership, etc.), nor that our results have global representation, as the absolute values of our numerical results might be affected by the above-mentioned voluntary response bias and the number of represented countries and utilities. While further research needs to be done to disentangle this potential bias effect, this explorative work provides useful insight into the process of digital transformation in water utilities, based on a quantitative analysis of survey responses. Each hypothesis and claim based on the survey responses was formally tested to ensure that, within the sample of 64 interviewed utilities, resulting trends and findings are proven with statistical significance.

Furthermore, within any corporation, the adoption of new digital technologies is subject to a complicated decision-making process. Our investigation focused on an initial understanding of driving elements of digital transformation within three hierarchical categories. While our resulting data reveal general preferences, they do not allow for a more detailed analysis regarding either the causality between stated driving elements and technological uptake or the specificity of the decision-making process. For instance, if a utility in our sample indicated both that droughts were driving their efforts to digitalize and that they installed smart meters, then this correlation may be observed, but it does not imply causality. The decision-making process in this case regarding the installation of smart meters may also be subdivided into more specific aspects such as the reason for the replacement (e.g., end-of-lifetime or targeted replacement campaign) and its original initiation (e.g., by the utility or by the hardware supplier). Further disentanglement of the driving elements of digital transformation in the water utility sector is needed to generate more specific knowledge and a better understanding of their causal relations.

Future research

Digital transformation of water utilities is a crucial stepping-stone toward a more efficient and sustainable, and thus, climate-resilient urban water cycle. In this explorative work, we shed light on this process that is so far only briefly investigated in the literature. Below, we discuss possible topics to follow up on our research, to further enhance the knowledge of digital transformation in the water utility sector.

Altogether, the availability and implementation of digital technologies require broad acceptance and support from all involved parties, including utility management and personnel, regulators, and consumers. While our investigation focuses on the utility side, we find that current regulation is not a predominant factor in utilities’ decision-making process regarding digital transformation. On the contrary, an unadjusted regulatory policy may rather hinder the progress of digital transformation31. In the absence of common regulation on the installation of digital technologies, water utilities may simply refrain from upgrading to digital technologies, while the adoption of new digital technologies is currently stalling due to long replacement times27 and a lack of individual affinity for innovation. However, the present literature does not provide further detailed insight into current equipment replacement strategies. Further research may better uncover the link between general replacement strategies and digital transformation in the water utility sector.

Furthermore, such common regulation on the installation of digital technologies would certainly encourage the uptake of digital technologies and reassure industry decision-makers. Future research may investigate, from a policy and motivational angle, how to bring regulators and utilities to the table to discuss digital solutions and the vetting of new technologies in the context of specific problems. To ensure that future technology development may be targeted and impact-oriented, we further encourage the inclusion of tech providers in this discussion.

Additionally, the replacement of analog with digital technologies introduces new risks to water utilities’ operations regarding cyber security. While ongoing research is investigating threats and risks based on observed incidents, along with technological solutions for stress testing and security35,36,37, it is also crucial to understand how these risks are perceived by water utility personnel and which reservations may be encountered regarding the uptake of digital technologies. Ultimately, possible security concerns could be analyzed in order to develop problem-specific remedial strategies, consisting for example of either further technological upgrades in terms of application-specific cyber security or targeted capacity-building measures.

Moreover, in an additional follow-up conversation with a selected set of highly digitalized utilities in our sample of respondents, it was indicated that their company’s corporate and leadership mindset significantly influenced and accelerated digital transformation. While fostering innovation at top-level management certainly drives innovation, relying only on a unilateral hierarchical approach to digital transformation may obstruct innovation altogether. Future research may further uncover the role of individual leadership mindset within the context of technology uptake in the water utility sector while also considering the role of both top-down and bottom-up engagement for successful digital transformation.

Further research may also be conducted on understanding how consumer-centric solutions can contribute to the digital transformation journey. Ultimately, consumer demand drives a water utility’s business and, thus, also their decision-making. However, current utility business models focus mainly on quantity-driven revenue and, hence, are rendering conservation undesirable38. Future studies may investigate alternative business models for utilities that empower consumers to strive for water conservation.

Finally, existing studies have already proven that the widespread deployment of smart meters enables better communication between utilities and their customers, encouraging long-term conservation behavior16. Unfortunately, however, consumer decision-making is driven by more than just rational considerations. To achieve conservation goals where needed in the water sector, social norms need to be adjusted long-term and sustainably for conservation to become a socially desirable behavior39,40. Further work may place an increased focus on the human aspect of decision-making regarding water conservation.

Methods

Smart Water Survey and data processing

The Smart Water Survey described here was designed by the authors as an online survey accessible to all water utilities. After a preliminary test run with a limited number of water utilities and minor adjustments based on received feedback, the survey was published online at http://smartwatersurvey.com and was continuously available for completion by water utilities during the period January 2020–December 2020.

The Smart Water Survey consists of 46 questions about the water utility and its digital transformation progress and is further divided into five subsections focusing on water utility subdivisions: (1) water supply & drinking water treatment (WS), (2) water distribution network & operating systems (WD), (3) wastewater & rainwater management (WW), (4) customers & demand management (CD), and (5) data warehouse & IT systems (IT). The 46 main questions that were asked to all participants are reported in the Supplementary Notes 1. Additional specific sub-questions were displayed depending on the specific answers to the main questions.

Multiple advertisement campaigns were organized in 2020 by the authors and the sponsors alike (International Water Association (IWA), Water Europe, and the Smart Water Networks Forum (SWAN)) who helped distribute the survey and increase outreach. Information campaigns were performed primarily via email communication to known water utility contacts, professional networks, water utility associations/national authorities, social media posts on Twitter and LinkedIn, and public information sharing during academic and industry-oriented events.

When the survey was closed at the end of 2020, complete answers from 68 respondents were received. The answers provided by four respondents were excluded during data pre-processing: one because of inconsistencies and gaps in the provided information, three because the answers were provided by non-utility actors (i.e., NGOs, consultancy companies, other). Thus, the final set of analyzed answers consists of survey entries provided by 64 water utilities.

Definition of water utility subdivisions within the SWS

At the core of every water utility’s operation is the management of the urban water cycle. This includes resource management, water treatment, water distribution, wastewater collection, and rainwater/stormwater management41,42. For the purposes of clarity and comprehensibility, we combined resource management and water treatment into the subdivision of water supply & drinking water treatment (WS), extended water distribution systems to include both the piping network and its operating systems into the subdivision of water distribution network & operating systems (WD), and further combined wastewater collection and rainwater/stormwater management into the subdivision wastewater & rainwater management (WW) due to the existence of many combined sewage networks.

Although managing the urban water cycle is its core task, a utility remains a provider of water services to its customers. In recent years, utilities have increasingly been encouraged to involve their customers, e.g., through water demand management programs13. Therefore, we further included the subdivision of customers & demand management (CD) in the Smart Water Survey. Lastly, as corporations, water utilities function through organizational processes and need data management tools and skills, especially to fully embrace digital technologies. Hence, we included the subdivision of data warehouse & IT systems (IT).

Technology availability and utility digitalization scores

For all technology-related questions in the Smart Water Survey targeting technology penetration or availability, participants were provided with a choice of four categorical answers, listed here in order of decreasing penetration/availability of a given technology k for a water utility u: (i) in operation, (ii) implementation ongoing, (iii) planned within five years, or (iv) not planned at the current time. These four options correspond to stages in the technology innovation process33.

A utility u is part of the set of surveyed utilities U and a technology k is part of the set of investigated technologies K. During the analysis of the survey data, we assigned a numerical value to each category and defined the digitalization score (DSu,k) for each combination of water utility and technology. We assigned the following values to the above categorical answers to enable aggregation and quantitative comparison: (i) DSu,k = 3, (ii) DSu,k = 2, (iii) DSu,k = 1, and (iv) DSu,k = 0. The resulting output contains tabular numerical data with utilities represented on one axis and individual technologies on the other axis.
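The mapping from categorical answers to digitalization scores can be illustrated with a minimal sketch. The answer labels, utility names, and technology names below are hypothetical placeholders, not the identifiers used in the survey data.

```python
# Score assigned to each categorical answer (labels are illustrative;
# the values follow the 3/2/1/0 scheme described above).
SCORE_BY_ANSWER = {
    "in operation": 3,
    "implementation ongoing": 2,
    "planned within five years": 1,
    "not planned at the current time": 0,
}


def digitalization_scores(answers):
    """Convert {utility: {technology: answer}} into {utility: {technology: DS}}."""
    return {
        utility: {tech: SCORE_BY_ANSWER[answer] for tech, answer in techs.items()}
        for utility, techs in answers.items()
    }


# Made-up answers for two utilities and two technologies (illustration only).
answers = {
    "utility_A": {
        "smart_meters": "in operation",
        "digital_twin": "planned within five years",
    },
    "utility_B": {
        "smart_meters": "implementation ongoing",
        "digital_twin": "not planned at the current time",
    },
}

ds = digitalization_scores(answers)
for utility, scores in ds.items():
    print(utility, scores)
```

The nested dictionary plays the role of the tabular output, with utilities on one axis and technologies on the other.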

We analyzed the penetration of a single digital technology in the complete set of utilities by aggregating the digitalization scores across all utilities. The resulting technology availability score (TASk) is formulated as follows:

$$\mathrm{TAS}_{k}=\sum_{u\in U}\mathrm{DS}_{u,k}$$
(1)

We analyzed each utility’s digitalization progress by aggregating the digitalization scores across all its technologies. The resulting utility digitalization score (UDSu) is formulated as follows:

$$\mathrm{UDS}_{u}=\sum_{k\in K}\mathrm{DS}_{u,k}$$
(2)
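As a small worked example of Eqs. (1) and (2), the sketch below sums a made-up score table along each axis. The utility names, technology names, and score values are hypothetical and serve only to show the two aggregations.

```python
def technology_availability_scores(ds):
    """TAS_k: sum of DS_{u,k} over all utilities u (Eq. 1)."""
    tas = {}
    for scores in ds.values():
        for tech, score in scores.items():
            tas[tech] = tas.get(tech, 0) + score
    return tas


def utility_digitalization_scores(ds):
    """UDS_u: sum of DS_{u,k} over all technologies k (Eq. 2)."""
    return {utility: sum(scores.values()) for utility, scores in ds.items()}


# Made-up digitalization scores (0 to 3) for three utilities and two technologies.
ds = {
    "utility_A": {"smart_meters": 3, "digital_twin": 1},
    "utility_B": {"smart_meters": 2, "digital_twin": 0},
    "utility_C": {"smart_meters": 0, "digital_twin": 2},
}

tas = technology_availability_scores(ds)
uds = utility_digitalization_scores(ds)
print("TAS:", tas)
print("UDS:", uds)
```

Both aggregations sum the same table, so the TAS values and the UDS values add up to the same grand total.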

Testing for statistical significance

Testing for normality and homoscedasticity

We analyzed all sets of samples for normality and homoscedasticity in order to select the appropriate test for significance43. Normality was tested with the D’Agostino-Pearson test, and homoscedasticity with the Levene test. The Python code developed for this analysis relies on the SciPy library44.
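A minimal sketch of these two checks with SciPy is given below. The group labels, the synthetic samples, and the significance threshold of 0.05 are assumptions made for illustration and are not taken from the survey analysis.

```python
import numpy as np
from scipy import stats


def check_assumptions(groups, alpha=0.05):
    """Run the D'Agostino-Pearson test per group and the Levene test across groups.

    `groups` maps a label to a 1-D sample; `alpha` is an assumed threshold.
    """
    normality_p = {
        name: float(stats.normaltest(values).pvalue)
        for name, values in groups.items()
    }
    levene_p = float(stats.levene(*groups.values()).pvalue)
    return {
        "normality_p": normality_p,
        "normal": {name: p > alpha for name, p in normality_p.items()},
        "levene_p": levene_p,
        "homoscedastic": levene_p > alpha,
    }


# Synthetic samples with deliberately different spreads (illustration only).
rng = np.random.default_rng(0)
groups = {
    "WS": rng.normal(10, 2, size=64),
    "WD": rng.normal(12, 6, size=64),
}

report = check_assumptions(groups)
print(report)
```

When the Levene test rejects equal variances, as it should for samples constructed with such different spreads, a test that does not assume homoscedasticity is the safer choice.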

Welch-ANOVA and post hoc Games-Howell test

We used Welch-ANOVA43 to test (i) for significant differences between the mean UDS of the utility subdivisions and (ii) for possible influences of the drivers indicated by utilities on their UDS. In both cases, Welch-ANOVA was selected because the Levene and D’Agostino-Pearson test results suggested that the assumption of normality was less critical than the assumption of homoscedasticity (see Supplementary Tables 1 and 3). In case (i), the Welch-ANOVA results suggest that the null hypothesis of equal means can be rejected with a confidence level >99.9% (see Supplementary Table 1). Therefore, we conducted a post hoc analysis using a pairwise Games-Howell test45 (see Supplementary Table 2). Results for case (ii) are reported in Supplementary Table 3. The Python code developed for this analysis relies on the Pingouin library46.
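To show what the Welch-ANOVA statistic computes, the sketch below implements the standard textbook formula from scratch with NumPy and SciPy. It is not the Pingouin routine used in the analysis, and the three synthetic groups are assumptions for illustration.

```python
import numpy as np
from scipy import stats


def welch_anova(*groups):
    """Return (F, df1, df2, p) for Welch's one-way ANOVA on two or more groups."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    variances = np.array([np.var(g, ddof=1) for g in groups])

    # Each group is weighted by the precision of its mean, n_i / s_i^2.
    w = n / variances
    grand_mean = np.sum(w * means) / np.sum(w)
    lam = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))

    numerator = np.sum(w * (means - grand_mean) ** 2) / (k - 1)
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    f_stat = numerator / denominator

    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)
    p_value = stats.f.sf(f_stat, df1, df2)
    return f_stat, df1, df2, p_value


# Synthetic groups with unequal variances (illustration only).
rng = np.random.default_rng(1)
ws = rng.normal(10, 2, size=64)
wd = rng.normal(12, 6, size=64)
ww = rng.normal(9, 4, size=64)

f_stat, df1, df2, p_value = welch_anova(ws, wd, ww)
print(f"F({df1}, {df2:.1f}) = {f_stat:.2f}, p = {p_value:.4g}")
```

With only two groups, this statistic reduces to the square of Welch's t statistic, which offers a convenient check against `scipy.stats.ttest_ind(..., equal_var=False)`.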

Feature selection to identify UDS predictors

To identify possible correlations between the utility digitalization progress, measured by UDS, and characteristics of the utilities and the socio-economic context of the countries where the respondent utilities operate, we developed a model-based feature selection analysis. Given the vector UDS containing the utility digitalization scores of all utilities in our sample (UDSu) and a set of potential UDS predictors X, we trained and tested random forest (RF) regressors47 to identify the most relevant subset of predictors X’ ⊆ X to create the best-fit model UDS = f(X’). Here, the set of candidate predictors X contained information on 62 continuous and categorical descriptive variables of utilities’ characteristics (e.g., pipe network length, number of customers, utility age) and socio-economic context (i.e., population density, GDP, and relative country population served by the utility).

First, we processed the categorical variables with one-hot encoding before model development and excluded variables with more than 20% data gaps. Second, we split the dataset into a training set (80%) and a test set (20%). Third, we trained and tested the RF regressors in multiple runs over 10 different random seeds. In each run, we tuned four hyperparameters of the RF regressor (i.e., number of trees, maximum depth of trees, minimum number of samples required for a node split, and minimum number of samples required for a leaf node) by exhaustive search over 300 parameter combinations with k-fold cross-validation (k = 3). Fourth, we evaluated the coefficient of determination (R2) on the test data to assess the goodness of fit of the RF regressors. Finally, we quantified the feature importance of the predictors used in the best RF regressor of each run by calculating the impurity-based Gini feature importance48 and visualized it for final analysis.

The above procedure for model building, training and validation, and feature importance calculation was applied both to the aggregate UDS across all utility subdivisions and to the UDS of each of the five water utility subdivisions (WS, WD, WW, CD, IT). However, given the poor goodness of fit of the subdivision-level models (best R2 coefficients on the test dataset lower than 0.3), only the results for the model of aggregate UDS values are reported (Supplementary Figs. 2 and 3).

The Python code developed for this feature selection analysis relies on the Scikit-learn library49.