1 Introduction

The ongoing advancement of the Internet of Things (IoT) enables medical institutions to offer top-notch, practical, and all-encompassing healthcare services. The patient can have a collection of small wireless sensor nodes implanted in them to track their health and gather important physiological data that can be used to diagnose chronic diseases and make emergency medical choices [1]. The elderly can utilize wearable embedded medical sensors to access cutting-edge healthcare services whenever and wherever they choose to enhance their quality of life.

Once the medical IoT sensor network gathers physiological data, the data is converted into an electronic health record (EHR) and sent to the healthcare organization for further analysis and remote monitoring. The EHR must be encrypted before transmission to preserve patients’ privacy and prevent third parties from gaining unauthorized access to it. Additionally, with the exponential growth in the amount of data generated daily, data storage and secure access to stored data are becoming widespread concerns [2]. Physical storage media are infeasible because of their lack of security and added cost [3]. Using a cloud service provider (CSP) separates the data from its owner and also comes with an added cost. To address these issues, research on decentralized data storage has gained momentum. To protect the privacy and security of the patient’s data, the file is first encrypted and then uploaded to a distributed storage medium such as the InterPlanetary File System (IPFS). The patient applies an access policy to the protected data that specifies the authorized attributes and their relationships. Only users with the appropriate attribute secret keys, such as a doctor, nurse, or member of the patient’s family, are allowed to decrypt the ciphertext. This encryption method is referred to as attribute-based encryption (ABE) [4].

Fortunately, with the emergence of blockchain, its underlying technology, i.e., smart contracts, can be coupled with distributed storage mechanisms such as IPFS to develop an effective distributed storage system. One widely used platform for this purpose is Ethereum, a public blockchain platform on which auto-triggered functions called smart contracts can be created in a decentralized network [5]. Adopting a decentralized storage medium, where data is stored on multiple nodes, leads to greater security and continuous availability. Additionally, a flexible attribute-based scheme coupled with blockchain offers a smart way to share data in wireless medical sensor networks. Sensitive medical data is protected by enabling dynamic access control based on particular attributes of the user. By tracking data access and modifications, the traceability function can be used to hold users accountable. This approach improves patient care and healthcare delivery by enabling secure, private, and transparent data transmission among authorized parties in wireless medical sensor networks.

1.1 Research motivation

Most existing work on data storage and data exchange relies on a cloud service provider for system configuration and key distribution. Furthermore, files retrieved via HTTP can become unavailable for a variety of unforeseen reasons. The proposed system therefore takes a user-centric approach in which the data owner, rather than a third party, retains complete control over the data that is shared and available throughout the network. A distributed data storage network keeps the data continuously available.

1.2 Contribution

  1. This paper proposes a user-centric approach for data sharing by combining IPFS and smart contracts. The data owner is given complete ownership of the data and the authority to formulate access rights for it. A modified CP-ABE scheme is used to encrypt the file, as reflected in Fig. 1.

  2. The proposed framework also allows a file to be searched by a relevant keyword, which facilitates data retrieval. To keep the model secure, a user revocation feature is added to prevent illegal access and to update access rights when required.

  3. During the decryption phase, most of the work is done on the device of the hospital node (HN), which makes the approach lightweight for the user. All heavyweight computation, such as encryption of the file, its decryption, and the implementation of CP-ABE, is done off-chain in JavaScript to keep the proposed model lightweight and efficient. A portion of the ciphertext is sent to the user for complete decryption once the user’s identity has been confirmed and the user’s attributes meet the requirements of the access policy.

  4. The proposed framework is extensively compared with similar existing schemes. The simulation results and the comparisons in the last section show that the proposed model incurs less storage overhead and computation time.

The rest of the paper is organized as follows. Section 2 discusses relevant work on data storage and transmission in the IoT environment. Section 3 reviews the common techniques used in this paper. Section 4 illustrates the steps involved in the proposed framework with a detailed flowchart and the algorithms involved. Sections 5 and 6 discuss the security parameters of the proposed model, followed by a comparative study of the proposed model and the existing ones with respect to computation cost and storage overhead.

Fig. 1 System model adopted by CP-ABE and the proposed model

2 Related work

This segment presents some of the details of the existing work that focuses on the encryption and storage of data generated through sensor devices used in various IoT applications.

Shaikh [6] explored the current domains of IoT devices in the fields of agriculture, smart cities, healthcare, and industry. The paper also compared various existing architectures for data storage that integrate IoT and blockchain, such as cloud-based storage, decentralized storage, and hybrid modes. Diro et al. [7] proposed an end-to-end security scheme using fog nodes, which incur less computation cost and occupy less network bandwidth. Most researchers [8,9,10,11,12] relied on a third party to provide effective security for their proposed models. A centralized entity, however, is a single point of failure if it comes under attack, a risk aggravated by the heterogeneity of the data sources in the IoMT environment and the volume of data collected regularly.

To replace a centralized entity, Xu et al. [13] introduced a blockchain-based model known as HealthChain, which manages large-scale data generated in a healthcare environment and achieves non-repudiation and tamper resistance for IoT data and the resulting diagnoses. Given the rapid growth of the data generated regularly in an e-healthcare environment and the limited storage capacity of nodes in a blockchain network, the authors relied on IPFS instead of cloud service providers. Access control is added to the model to prevent confidentiality breaches and to allow only authorized personnel to access the shared data. Han et al. [14] proposed an inner product encryption (IPE)-based scheme that provides fine-grained access control and hides the access policy. In addition, access records, ciphertext addresses, and hash values are stored on the blockchain, which makes these records immutable and protects the privacy of the data. Naz et al. [15] also proposed a digital sharing platform using Ethereum and IPFS to eliminate the use of a trusted third party. In this scheme, users are encouraged to leave reviews for the data they paid for; these reviews and ratings reflect the authenticity and quality of the data, and the Watson analyzer is used to tackle the issue of fake reviews. Similarly, Kumar and Tripathi [16] proposed using IPFS to avoid a single point of failure by introducing an extra data storage layer, made of a cluster of IPFS nodes, that transmits the stored data to a consortium blockchain. The IPFS cluster layer is responsible for providing trusted storage and for authenticating the medical IoT nodes and the patients. ECDSA is used to create private and public keys for the IoMT device. The model’s effectiveness is assessed using the Ropsten network and MetaMask, and the results are analyzed based on execution time and storage used. Ren et al. [17] highlighted the constraints of blockchain network storage and suggested an identity-based proxy aggregate signature (IBPAS) approach, thereby lowering the communication bandwidth, compressing storage space, and enhancing signature verification efficiency.

Encryption based on the user’s attributes rather than their identity offers a secure way to communicate and store data while supporting precise access control. Multiple researchers have used ABE schemes, either key-policy attribute-based encryption (KP-ABE) or ciphertext-policy attribute-based encryption (CP-ABE), to grant granular access rights to stored data. Oliveira et al. [18] proposed the use of break-glass technology to decrypt part of a medical report available on the cloud in case of an emergency. This access is valid only for a fixed period and makes use of CP-ABE and time-constrained tokens. Choksy et al. [19] proposed another attribute-based access control (ABAC) model, which supports multi-level access delegation with customizable and programmable delegation features as well as a mechanism for attribute revocation. Majdoubi et al. [20] proposed SmartMedChain, a blockchain-based framework for data sharing in an e-healthcare environment. This framework allows patients to share the medical data received from their sensor nodes. Doctors can generate a response, create a health report on the basis of this data, and upload it for further use. A privacy agreement scheme is also provided to enforce even stricter privacy between the patient and the healthcare provider. Yang et al. [4] proposed a self-adaptive secure storage model for data generated in an IoT environment, with two-fold access control. Their scheme also supports deduplication, thereby reducing redundancy among the data stored on the public cloud.

The authors of [21] proposed a public key encryption with keyword search (PEKS) model based on KP-ABE that resists chosen-ciphertext attacks. Sun et al. [22] proposed a modified vector transformation technique to change the format of the access policy and the attribute sets used in ABE. However, their scheme is not as lightweight as the proposed model. Zhang, Zheng, and Deng [23] also proposed a CP-ABE-based scheme called PASH for smart healthcare, which is designed to be well suited to large attribute sets. Similarly, Ogundoyin and Kamil [24] proposed PAASH, a privacy-preserving authentication scheme that supports certificateless aggregate signatures for fine-grained access control.

Nevertheless, the existing approaches have several inefficiencies, such as prohibitive ciphertext size and heavy operations for encryption and decryption. The challenges of maintaining the privacy of the attributes, enabling efficient and fine-grained access control, and keeping encryption and decryption overhead low remain unresolved in the research work discussed above. Although several researchers have concentrated on a decentralized approach for the transmission and storage of medical health records, improvements in security and latency, together with lower computation cost and storage overhead, remain essential for an IoT architecture. To achieve lightweight and secure fine-grained access control, this paper develops a method that uses modified CP-ABE and a large user attribute set. This approach aims to address the aforementioned shortcomings.

3 Preliminaries

This section presents the background knowledge of some of the common terms used in the proposed architecture.

3.1 Attribute-based encryption

ABE is an access control mechanism that encrypts data based on the attributes of a user rather than relying solely on the user’s identity. Thus, any person holding a set of attributes that satisfies the policy is eligible to decrypt the data, without the need for a one-to-one key for each user. ABE schemes broadly fall into two categories, KP-ABE and CP-ABE. In KP-ABE, the access policy is tied to the user’s private key, and the attribute set is tied to the ciphertext. In contrast, in CP-ABE, the access policy is tied directly to the ciphertext, and the attribute set is tied to the user’s private key, giving the data owner complete control over who can and cannot decrypt their data. CP-ABE comes with four algorithms:

\(Setup(\lambda )\rightarrow (PP,MSK)\). This algorithm inputs the system’s security parameters and produces a public parameter and a master secret key.

\(Encrypt(PP,\rho ,data\_file)\rightarrow CT\). This algorithm accepts the public parameters, the access policy \(\rho \), and the data_file as input to produce the encrypted ciphertext.

\(KeyGen(PP,MSK,S)\rightarrow SK\). This algorithm accepts the public parameters, the master secret key created in the first algorithm, and the attribute set S as input to produce the private key for the user.

\(Decrypt(PP,CT,SK)\rightarrow data\_file\). This algorithm accepts the public parameter, the ciphertext, and the secret key as input to decrypt the ciphertext and produces the data_file.
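The condition under which Decrypt succeeds, namely that the attribute set S tied to the key satisfies the policy \(\rho \) tied to the ciphertext, can be illustrated without any cryptography. The following is a minimal sketch, assuming the policy is written as a nested AND/OR tree; the tuple encoding, the `satisfies` function, and the attribute names are hypothetical and are not part of the scheme itself.

```python
from typing import Union

# A policy is either an attribute name or a tuple ("AND" | "OR", sub-policies...).
# This encoding is an assumption made for illustration only.
Policy = Union[str, tuple]


def satisfies(policy: Policy, attributes: set) -> bool:
    """Return True if the attribute set satisfies the policy tree."""
    if isinstance(policy, str):
        return policy in attributes
    gate, *children = policy
    results = [satisfies(child, attributes) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical policy: a cardiology doctor, or a member of the patient's family.
policy = ("OR", ("AND", "doctor", "cardiology"), "family")

doctor_ok = satisfies(policy, {"doctor", "cardiology"})
nurse_ok = satisfies(policy, {"nurse", "cardiology"})
```

In an actual CP-ABE scheme this check is not evaluated in the clear; it is enforced implicitly, because decryption only yields the plaintext when the key’s attributes satisfy the policy.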

3.2 Bilinear groups

\(G_1\), \(G_2\), and \(G_T\) represent three cyclic groups of prime order p. Let g be a generator of \(G_1\) and h be a generator of \(G_2\). The bilinear map \(e: G_{1} \times G_{2} \rightarrow G_{T}\) has the following properties:

  1. Bilinearity: for \(g \in G_1\), \(h \in G_2\), and \(i, j \in Z_{p}\), we have \(e(g^{i},h^{j})=e(g,h)^{ij}\).

  2. Non-degeneracy: \(e(g,h) \ne 1\).

  3. Computability: \(e(u, v)\) can be computed efficiently for all \(u\in G_1\) and \(v \in G_2\).
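The properties above can be checked numerically on a toy map. The sketch below is an illustration only: it assumes \(G_1 = G_2 = (Z_{11}, +)\), takes \(G_T\) to be the order-11 subgroup of \(Z_{23}^*\) generated by 2, and defines \(e(u,v) = 2^{uv}\). These parameters and function names are hypothetical, and the construction offers no security, since discrete logarithms in it are trivial.

```python
# Toy parameters (assumptions for illustration only; far too small to be secure).
ORDER = 11      # prime order of G1, G2 and GT
MODULUS = 23    # modulus for GT; ORDER divides MODULUS - 1
GT_GEN = 2      # element of order 11 in Z_23^*


def exp_g(base: int, k: int) -> int:
    """'Exponentiation' in the additive group (Z_ORDER, +): base^k means k * base."""
    return (k * base) % ORDER


def pairing(u: int, v: int) -> int:
    """Toy bilinear map e(u, v) = GT_GEN^(u * v) in the order-ORDER subgroup."""
    return pow(GT_GEN, (u * v) % ORDER, MODULUS)


g, h = 1, 1  # generators of G1 and G2

# Bilinearity: e(g^i, h^j) == e(g, h)^(i*j) for every pair of exponents.
bilinear = all(
    pairing(exp_g(g, i), exp_g(h, j)) == pow(pairing(g, h), i * j, MODULUS)
    for i in range(ORDER)
    for j in range(ORDER)
)

# Non-degeneracy: the pairing of the generators is not the identity of GT.
non_degenerate = pairing(g, h) != 1
```

Practical schemes instantiate the map with pairings on elliptic curves, where the groups are large and the discrete logarithm is believed to be hard.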

3.3 Assumptions made

  • Assumption 1: Decisional Bilinear Diffie-Hellman (DBDH): Let G denote a bilinear group of prime order p with generator g, and let \(i,j,k \in Z^{*}_{p}\) be chosen at random. Given \((g,g^{i},g^{j},g^{k})\), it is extremely hard for an adversary \(\mathcal {A}\) to distinguish \(e(g,g)^{ijk} \in G_{T}\) from a random element of \(G_T\).

  • Assumption 2: q-Strong Diffie-Hellman (q-SDH): For randomly chosen \(x\in Z_p\), \(g_1\in G_1\), and \(g_2\in G_2\), the q-SDH problem is, given \((g_{1},g_{1}^{x},g_{1}^{x^2},\ldots ,g_{1}^{x^q},g_{2},g_{2}^{x}) \in G_1^{(q+1)}\times G_{2}^{2}\), to output a pair \((c,g_{1}^{1/(x+c)}) \in Z_p \times G_1\). It is extremely hard for an adversary to obtain such a pair for a random value of x.

Fig. 2 Flowchart representing the steps involved in the proposed model

Table 1 Notations used

4 Proposed system architecture

4.1 System architecture

As represented in Fig. 2, the proposed model consists of three major components: a hospital node, data users, and a decentralized storage medium. Table 1 lists some of the most frequently used abbreviations throughout the paper. The functionalities of these components are discussed below.

  • Hospital node (HN): This represents the legal owner of the data that is collected with the help of sensor nodes attached to the patient’s body. The HN is responsible for transmitting the collected data to a mobile device and transforming it into an EHR [25]. The HN is also responsible for selecting an appropriate keyword representing the EHR for later retrieval of the file. This keyword and the EHR are then encrypted with respect to a specific access policy. Additionally, the HN is in charge of creating the PP and MSK and distributing the keys to the registered DUs.

  • Data user (DU): A data user can be any healthcare worker who needs to access the encrypted EHR for timely diagnosis. Attached to each DU is a group of attributes that are used to access and decrypt the EHR. A DU makes use of the trapdoor algorithm, described in later sections, to retrieve the encrypted file through their mobile device.

  • Storage medium: In our proposed model, both the Ethereum blockchain and IPFS are used as essential mediums for data storage and data sharing. IPFS, though not directly linked to Ethereum, can be used along with it for data storage in a decentralized environment. It makes use of a distributed hash table (DHT) for data storage and access. Once a file is uploaded to IPFS, IPFS returns a content identifier (CID), which acts as the fingerprint of the file stored on the network. This CID can later be used to access the file.
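The idea that a CID acts as a fingerprint of the stored content can be illustrated with a plain hash. The sketch below is a simplified stand-in and not the IPFS API: the `ContentStore` class is hypothetical, and it uses a bare SHA-256 hex digest as the identifier, whereas real CIDs carry additional encoding and hash-type information.

```python
import hashlib


class ContentStore:
    """In-memory stand-in for a content-addressed store (not the IPFS API)."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        # The identifier is derived from the content itself.
        cid = hashlib.sha256(data).hexdigest()
        self._blocks[cid] = data
        return cid

    def get(self, cid: str) -> bytes:
        data = self._blocks[cid]
        # Re-hash on retrieval so that tampered content is detected.
        if hashlib.sha256(data).hexdigest() != cid:
            raise ValueError("content does not match its identifier")
        return data


store = ContentStore()
cid = store.add(b"encrypted EHR bytes")
same_cid = store.add(b"encrypted EHR bytes")
other_cid = store.add(b"a different file")
```

Because the identifier depends only on the bytes, uploading identical content twice yields the same identifier, and any change to the content yields a different one.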

4.2 System overview

This section describes the design of the proposed model. The steps given below illustrate its workflow.

  1. Sensor nodes connected to the patient’s body collect information on the patient’s vitals, which is used to form an EHR.

  2. To initialize the process, the hospital node (HN) creates three smart contracts, Upload_SC, Data_SC, and User_SC, for file upload, data sharing, and authentication, respectively. Upload_SC handles the encryption of the patient’s file and uploads it to the IPFS. We used JavaScript and the web3.js library for off-chain encryption of the file and then used Upload_SC to upload the encrypted file to the IPFS. In return, it transmits \(href_{location}\) to the Data_SC. Data_SC is used to streamline encryption, decryption, and data retrieval. It also handles all the steps needed to implement user revocation and tracking. User_SC handles user registration, the maintenance of user attributes, and user authentication. Steps 1 and 2 in the flowchart depicted in Fig. 2 indicate these steps.

  3. The HN creates the system’s master secret key and public parameter by initiating the Global_Setup algorithm and transmits the system parameters to Data_SC for further processing.

  4. For a medical professional, represented as a DU, to register within the network and access the file, they invoke the User_SC and input their Ethereum account id and the public communication key associated with it. Step 3 in the flowchart represents this step.

  5. Subsequently, the HN allocates a pair of public and private keys to each successfully registered user and stores the key in Data_SC for further communication, along with the corresponding user id, uid. Step 4 in the flowchart represents this step.

  6. The HN then selects an appropriate keyword related to the content of the file for later retrieval of the file. The \(href_{location}\), along with the keyword and the access policy set by the HN, is encrypted using a ciphertext-policy attribute-based encryption algorithm and added to the blockchain network using Data_SC. Step 5 in the flowchart represents this step.

  7. For a DU to access the EHR of any patient, they need to search for the file using the Trapdoor query. Data_SC invokes the User_SC to check whether the user is authenticated. Once authenticated, the DU can request the ciphertext using the keyword and obtain the partially decrypted file if their set of attributes satisfies the access policy set by the HN. Steps 6 and 7 in the flowchart represent this step.

  8. Upon receiving the components of the ciphertext, the DU makes use of their attributes to derive the \(href_{location}\) and then downloads the file from the IPFS using this CID. Step 8 in the flowchart represents this step.

  9. To keep track of which DU is accessing which file and when, or to check which DU accessed a particular file last, the HN can use the \(K_2\) component of the user’s secret key. The details of \(K_2\) are discussed in the following section, under the KeyGen algorithm.

  10. If a DU is found to be corrupted or to be misusing their key, the user revocation mechanism provided within the proposed model can be used to block the user and add their identity to the revocation list stored inside the smart contract.
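The interplay between User_SC and Data_SC in steps 4 to 7 can be sketched as plain in-memory objects. This is a simplified illustration rather than the deployed contracts: the class and method names are hypothetical, the access policy is reduced to "holds all required attributes", and the stored ciphertext is a placeholder string.

```python
class UserSC:
    """In-memory stand-in for User_SC: registration and authentication."""

    def __init__(self) -> None:
        self.users: dict[str, set] = {}

    def register(self, uid: str, attributes: set) -> None:
        self.users[uid] = set(attributes)

    def is_authenticated(self, uid: str) -> bool:
        return uid in self.users


class DataSC:
    """In-memory stand-in for Data_SC: keyword-indexed records with a policy."""

    def __init__(self, user_sc: UserSC) -> None:
        self.user_sc = user_sc
        self.records: dict[str, tuple] = {}

    def store(self, keyword: str, required: set, ciphertext: str) -> None:
        self.records[keyword] = (set(required), ciphertext)

    def request(self, uid: str, keyword: str):
        # Step 7: authenticate through User_SC, then look up the keyword.
        if not self.user_sc.is_authenticated(uid) or keyword not in self.records:
            return None
        required, ciphertext = self.records[keyword]
        # Simplification: the policy is "the user holds every required attribute".
        return ciphertext if required <= self.user_sc.users[uid] else None


user_sc = UserSC()
data_sc = DataSC(user_sc)
user_sc.register("0xD0C", {"doctor", "cardiology"})
user_sc.register("0xNUR", {"nurse"})
data_sc.store("ecg-2024", {"doctor", "cardiology"}, "CT(href_location)")

granted = data_sc.request("0xD0C", "ecg-2024")
denied = data_sc.request("0xNUR", "ecg-2024")
unknown = data_sc.request("0xBAD", "ecg-2024")
```

In the proposed model the keyword is matched through an encrypted token rather than in the clear, and the policy is enforced cryptographically, so this sketch only mirrors the control flow.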

4.3 Detailed design

This section is used to present the proposed model in detail with a thorough explanation of all the algorithms used in the model.

4.3.1 Initialization phase

The HN of the hospital takes \((1^\lambda )\) as input and deploys the smart contract to initiate the first algorithm, called Global_Setup. The public parameter (PP) along with the master secret key (MSK) is formulated using this algorithm. The public parameter is stored in \(Data\_SC\) for further use.

  • Global_Setup\((1^\lambda )\rightarrow (PP,MSK)\). This algorithm accepts the security parameter of the system \(1^\lambda \) as input and produces PP and MSK as output. The public parameter is open to access by anyone and is used alongside the master secret key for key generation of the registered users and also for the formation of the ciphertext. Algorithm 1 describes the process in detail. Here, g represents the generator of group G and \(\kappa \) represents the key space available.

Algorithm 1 Global setup algorithm.

4.3.2 Registration phase

The DU invokes the \(User\_SC\) for registration of the user within the network. Once \(User\_SC\) is invoked, the HN is responsible for creating a pair of public and private keys for each registered user by initiating the KeyGen algorithm, which takes the Ethereum account id and attribute set, S as input along with MSK and PP parameters created in the initialization phase.

  • KeyGen\((PP,MSK,uid,S) \rightarrow (PK_{uid,S},SK_{uid,S} ) \). As illustrated in the figure, the HN initiates the KeyGen algorithm by accepting the Ethereum account id of each user and the corresponding public communication key. For an attribute set S represented as \(\{\nu _1, \nu _2,\ldots ,\nu _k\}\subseteq Z^*_p\), the HN chooses parameters \(r, \varnothing , \varrho , v^{'},v^{''},v^{'''} \in Z^*_p\) and calculates \(\delta = SEnc_{k_1} (uid)\). The private and public key pairs are then calculated as \(K_1 = g^{\frac{\alpha -ar}{\lambda +\delta }}\), \(K_2 = \delta \), \(K_{3,i} = g^{{r(\nu _i + \iota )}^{-1}}\), \(K_4 = \varrho \), \(\aleph _1 = (K^{\varrho }_1)^{v^{'}}\), \(\aleph _2 = I^{v^{'''}}_0\), \(\aleph _{3,i} = (K^{\varrho }_{3,i})^{v^{''}}\),

    $$ SK_{uid,S} = (K_1, K_2, \{K_{3,i}\}_{i\in [\kappa ]},K_4,v^{'},v^{''},v^{'''} ), $$
    $$ PK_{uid,S} = (\aleph _1, \aleph _2, \{\aleph _{3,i}\}_{i\in [\kappa ]} ) $$
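The fraction in the exponent of \(K_1\) denotes division in \(Z_p\), that is, multiplication by a modular inverse. The sketch below illustrates this arithmetic only, assuming a toy order-11 subgroup of \(Z_{23}^*\) in place of the pairing group and small integers in place of the real parameters (in particular, \(\delta \) is treated here as an integer in \(Z_p\)); the function name `k1` and all numeric values are hypothetical.

```python
# Toy parameters (assumptions for illustration; not secure sizes).
ORDER = 11     # prime order p of the group
MODULUS = 23   # the order-11 subgroup of Z_23^* stands in for G
G = 2          # generator of that subgroup


def k1(alpha: int, a: int, r: int, lam: int, delta: int) -> int:
    """Compute g^((alpha - a*r) / (lam + delta)) with the division done in Z_p."""
    denominator = (lam + delta) % ORDER
    if denominator == 0:
        raise ValueError("lam + delta must be invertible modulo the group order")
    # pow(x, -1, m) returns the modular inverse (Python 3.8+).
    exponent = ((alpha - a * r) * pow(denominator, -1, ORDER)) % ORDER
    return pow(G, exponent, MODULUS)


alpha, a, r, lam, delta = 7, 3, 5, 4, 2
key_part = k1(alpha, a, r, lam, delta)

# Raising K_1 to (lam + delta) must give back g^(alpha - a*r).
consistent = pow(key_part, lam + delta, MODULUS) == pow(G, (alpha - a * r) % ORDER, MODULUS)
```

The same inverse-in-the-exponent pattern appears in \(K_{3,i}\), where \((\nu _i + \iota )^{-1}\) is likewise an inverse modulo the group order.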

Whenever a user registers an Ethereum account, a pair of private and public communication keys, \(sk_{com}\) and \(pk_{com}\), is also generated. Next, the HN calculates

$$\begin{aligned} SK_{uid} = AEnc_{pk_{com}} (SK_{uid,S},PK_{uid,S}) \end{aligned}$$

and invokes the User_SC to store \(SK_{uid}\) along with the associated uid for further use. Both keys are sent to the user after being encrypted with the user’s public communication key. Only a legitimate user with access to the corresponding private key can decrypt this message and obtain their pair of keys.

4.3.3 Encryption and uploading phase

Before the HN uploads and shares a file, the file must first be encrypted to protect its privacy on the IPFS. The file is encrypted with \(PK_{uid,S}\), which was shared with the user in the previous phase. The HN then selects a suitable keyword, KW, corresponding to the file, which will later be used to search for the same file. The HN also decides the access policy for the file, which makes the entire scheme user-centric. The response received from the IPFS and the KW are then encrypted under this access policy.

  • \(EncFile(PP,File,(M,\rho ),KW)\rightarrow CT\). The file is first uploaded to the IPFS. In response, the IPFS returns the location of the file, \(href_{location}\), which is passed back to the HN over a secure private channel for further processing. The algorithm then calculates \(CID= H_{1}(href_{location}) \) to reduce the size of the location returned from the IPFS. M denotes a matrix of dimensions \(l \times n\), and \(\rho \) maps the rows of M to attributes. To compute the final CT, the owner selects \(x\in Z^*_p\) and a vector \(\upsilon = (x,y_{2},\ldots ,y_{n})^{T} \in Z^n_p \), and calculates \(x_i =M_i \cdot \upsilon \), where \(M_i\) denotes the ith row of M. The HN further selects \(x^{'},x^{''} \in Z^*_p\) and calculates CT as follows:

    $$ E_{0} = href_{location} \cdot I^x, E_{1} = g^{x}, E_{2}=s^{x} $$
    $$ E_{3,i} = \frac{\rho (i)x_i}{[x^{'}H(KW)]}, E^{'}_{3,i} = \frac{x_i}{[x^{''}H(KW)]} $$
    $$ E_4= I_0^{H(KW)}I^{x/H(KW)}, E_5 = (g^a)^{x^{'}}, E_6 = (f^a) ^ {x^{''}}. $$

    In the end, the access policy \((M,\rho )\) and \(CT =(E_0,E_1,E_2,\{E_{3,i},E^{'}_{3,i}\}_{i\in [l]},E_4, E_5,E_6,CID)\) are sent to the Data_SC to be added to the blockchain network.
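The share computation \(x_i = M_i \cdot \upsilon \) can be illustrated on a small example. The following is a minimal sketch, assuming a toy prime modulus of 11 and a hypothetical \(2 \times 2\) matrix that encodes the policy "doctor AND cardiology"; the function name `make_shares` and the concrete values are illustrative only.

```python
import secrets

PRIME = 11  # toy prime modulus (assumption; real schemes use a large prime)


def make_shares(M, x, p=PRIME):
    """Return the shares x_i = M_i . upsilon mod p, with upsilon = (x, y_2, ..., y_n)."""
    n = len(M[0])
    # The first entry is the secret x; the remaining entries are random.
    upsilon = [x % p] + [secrets.randbelow(p) for _ in range(n - 1)]
    return [sum(m * u for m, u in zip(row, upsilon)) % p for row in M]


# Hypothetical policy "doctor AND cardiology" as a 2 x 2 share-generating matrix.
M = [[1, 1], [0, -1]]
x = 7
shares = make_shares(M, x)

# For this M, the rows sum to (1, 0), so adding both shares recovers x.
recovered = sum(shares) % PRIME
```

A single share on its own is masked by the random value \(y_2\), which is why a user holding only one of the two attributes learns nothing about x in this example.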

4.3.4 Keyword trapdoor generation

Using this algorithm, a keyword-based token is generated, \(token_{KW}\), corresponding to the keyword selected by the HN representing the file uploaded.

  • Trapdoor\((KW,SK_{uid})\rightarrow token_{KW}\). The HN uses this algorithm to generate the search token from the KW it selected and the key of the DU stored inside the \(Data\_SC\). Algorithm 2 describes this algorithm in detail.

Algorithm 2 Trapdoor algorithm.

4.3.5 Retrieval phase

This phase is used to search for the required document based on the keyword given by the DU. The DU chooses a keyword depending on the file he wants to retrieve. For this, he invokes the \(Data\_SC\) for data retrieval.

  • Trapdoor\((KW^{'},SK_{uid}^{'})\rightarrow token^{'}_{KW}\). When a DU wants to access a file, he selects a keyword corresponding to that file and sends the keyword to the HN. The HN invokes the trapdoor algorithm again to generate \(token^{'}_{KW}\). The stored record is searched in Data_SC, and if a corresponding \(token_{KW}\) is found, the CT of that file is retrieved from the blockchain network.

  • Search\((token^{'}_{KW}) \rightarrow CT/\perp \). This algorithm verifies whether the access policy \((M,\rho )\) of each stored CT returned for \(token^{'}_{KW}\) is satisfied by the attribute set S of the DU. If the policy is not satisfied, the algorithm returns \(\perp \). In the following algorithm, \(P\subseteq [l]\) denotes the set of row indices i for which \(\rho (i)\) belongs to S, and \(\{j_i \in Z_p\}_{i\in P}\) denotes a set of constants.

  • PartialDecryption\((CT,token_{KW},PK_{uid,S})\rightarrow CT_{part}\) \(/\perp \). Once the Search algorithm returns the CT, the HN computes \(CT_{part} = (E_0, \Gamma _2, \Lambda _2)\), where \(\Gamma _2= \Gamma ^{D_0}\) and \(\Lambda _2 = \Lambda ^{D^{'}_3}\). If the Search algorithm returns \(\perp \), this algorithm also returns \(\perp \).

Algorithm 3 Search algorithm
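The role of the index set P and the constants \(j_i\) in the Search algorithm can be sketched as follows. In matrix-based access structures of this kind, the attribute set satisfies the policy when the rows selected by P can be combined into \((1,0,\ldots ,0)\); the sketch assumes that convention, a toy prime modulus of 11, and a brute-force search that is only viable at this size. The function name `find_constants` and the example policy are hypothetical.

```python
from itertools import product

PRIME = 11  # toy prime modulus (assumption)


def find_constants(M, rho, S, p=PRIME):
    """Return {row index: j_i} with sum_i j_i * M_i = (1, 0, ..., 0) mod p, else None."""
    rows = [i for i in range(len(M)) if rho[i] in S]  # the index set P
    n = len(M[0])
    target = [1] + [0] * (n - 1)
    # Brute force over all constants; real code would use Gaussian elimination.
    for consts in product(range(p), repeat=len(rows)):
        combo = [
            sum(c * M[i][col] for c, i in zip(consts, rows)) % p
            for col in range(n)
        ]
        if combo == target:
            return dict(zip(rows, consts))
    return None  # plays the role of the "not satisfied" symbol


# Hypothetical policy "doctor AND cardiology".
M = [[1, 1], [0, -1]]
rho = ["doctor", "cardiology"]

ok = find_constants(M, rho, {"doctor", "cardiology"})
denied = find_constants(M, rho, {"doctor"})
```

With both attributes present, the constants found here combine the two rows into \((1,0)\); with only one attribute, no such combination exists and the function returns None.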

4.3.6 Decryption phase

For this final stage, DU uses CID to download the file from the IPFS and uses his private attribute key, \(sk_{uid,S}\), to decrypt the file.

  • \(Dec(CT_{part},SK_{uid}) \rightarrow File/ \perp \). The user uses this algorithm to retrieve the \(href_{location}\). It calculates

    $$\begin{aligned} href_{location}= \frac{E_0}{\Gamma _{2}/\Lambda _2^{1/uK_4}} \end{aligned}$$

    This \(href_{location}\) is given to the IPFS to get the file in return.

4.3.7 User and permission revocation phase

If a healthcare worker leaves an organization or the HN decides to revoke access to certain attributes, this phase is executed to provide fine-grained access control. A revocation list is maintained inside the \(User\_SC\) to store the uid of every revoked user, which simplifies the process and leaves the keys of all other DUs unaffected. The HN adds the parameter \(K_2\), which contains the encrypted uid of the registered user, to the revocation list. The Revoke algorithm is used for this purpose.

  • \(Revoke(PP,File,uid,S_{i},(M,\rho ),KW)\rightarrow CT^{'}\). This algorithm works similarly to the encryption algorithm, except that it additionally takes the uid of the DU and the attribute \(S_{i}\) to be revoked. It also adds the given uid to the revocation list, RL. As a result, whenever the HN uses the Trapdoor algorithm to search for a particular file and the \(K_2\) parameter is found in the RL, the Trapdoor algorithm returns \(\perp \), because \(token_{KW}^{'}\) contains the parameter \(D_2\), which is equal to the \(K_2\) present in the RL.

  • \(Track(MSK,SK_{uid,S})\rightarrow uid/\perp \). This algorithm can be used to keep track of the users who accessed a particular file. Because the secret key used to decrypt a file contains the parameter \(K_2\), which encodes the uid of the user, it can be used to recover the user’s identity. The uid is also embedded in the \(K_1\) parameter, which makes it impossible for the user to hide their identity, as \(K_1\) can only be formed with the help of \(\delta \) and \(\lambda \), which are part of the MSK and are known only to the HN. Additionally, the Key Verify algorithm given in Section 5 can be used to verify the authenticity of the key shared by the user. The Track feature thus promotes traceability and can be used to enhance user accountability and transparency during data sharing. The sequence diagram given in Fig. 3 illustrates the steps mentioned above in brief.
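The way \(K_2\) supports both revocation and tracking can be sketched with a toy reversible encoding of the uid. The code below is an illustration under strong assumptions: the XOR keystream built from SHA-256 is a stand-in for \(SEnc\) and is not a secure cipher, and the function names, key, and uids are hypothetical.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Derive a keystream by hashing key || counter (toy construction, not secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def senc(key: bytes, uid: str) -> bytes:
    """Toy stand-in for SEnc: XOR the uid with a key-derived keystream."""
    data = uid.encode()
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def track(key: bytes, k2: bytes) -> str:
    """Recover the uid from K_2; XOR with the same keystream undoes senc."""
    return bytes(a ^ b for a, b in zip(k2, _keystream(key, len(k2)))).decode()


def trapdoor_allowed(k2: bytes, revocation_list: set) -> bool:
    """Mimic the check that makes Trapdoor fail for revoked users."""
    return k2 not in revocation_list


hn_key = b"hypothetical-HN-key"       # plays the role of k_1, known only to the HN
k2_alice = senc(hn_key, "0xA11CE")
k2_bob = senc(hn_key, "0xB0B")
revocation_list = {k2_bob}            # Bob has been revoked
```

Only the holder of the key can map a \(K_2\) value back to a uid, while the revocation check itself needs nothing more than a comparison against the stored list.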

Fig. 3 Sequence diagram for the proposed scheme

5 Security analysis

This section is used to prove the security and traceability of the proposed model.

Theorem 1

The proposed model is IND-CPA secure if the Decisional Bilinear Diffie-Hellman (DBDH) assumption holds.

Proof

To prove the security of the proposed model, an adversary \(\mathcal {A}\) and a challenger \(\mathcal {C}\) are defined to interact with each other. \(\mathcal {A}\) aims to break the system in polynomial time, and \(\mathcal {C}\) will use the game defined below to solve the DBDH problem. The game is defined below:

  1. Init phase: \(\mathcal {C}\) takes an instance of \((g,g^{i},g^{j},g^{k})\) together with either \(e(g,g)^{ijk} \in G_{T}\) or a random element from \(G_T\) and constructs the system’s public parameter using these components.

  2. Query phase: \(\mathcal {C}\) responds to all the queries of \(\mathcal {A}\) relating to private key and trapdoor generation and returns the results to \(\mathcal {A}\).

  3. Challenge phase: In this phase, \(\mathcal {A}\) sends an access policy \((M,\rho )\), two messages of the same length \((m_0,m_1)\), and a pair of keywords \((KW_0,KW_1)\) to the challenger, \(\mathcal {C}\). Following this, \(\mathcal {C}\) picks \(r\in \{0,1\}\) at random, encrypts \(m_r\) and \(KW_r\), and sends the result back to \(\mathcal {A}\).

  4. Query phase: \(\mathcal {A}\) continues to query \(\mathcal {C}\).

  5. Guess phase: \(\mathcal {A}\) produces a guess \(r^{'}\in \{0,1\}\). If \(r^{'} =r\), \(\mathcal {C}\) outputs true; otherwise, it outputs false. If the output is true, \(\mathcal {C}\) generates a valid ciphertext at an advantage of

    $$\begin{aligned} \frac{1}{2} + \epsilon , \end{aligned}$$

    where \(\epsilon \) is the probability of \(\mathcal {A}\) successfully attacking the DBDH problem. On the contrary, if the output is false, \(\mathcal {A}\) loses the game.

As per Section 3.3, if Assumption 1 holds, the proposed model is secure and unbreakable in polynomial time.
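The challenge and guess phases of the game can be sketched as a small simulation harness. This is only an illustration under simplifying assumptions: the key and trapdoor queries, the access policy, and the keywords are omitted, and a one-time pad stands in for the encryption algorithm of the scheme. All function names are hypothetical.

```python
import secrets
from typing import Callable

Encrypt = Callable[[bytes], bytes]
Adversary = Callable[[bytes, bytes, bytes], int]


def one_time_pad(message: bytes) -> bytes:
    """Toy stand-in for Enc: XOR with a fresh random pad (not the paper's scheme)."""
    pad = secrets.token_bytes(len(message))
    return bytes(m ^ p for m, p in zip(message, pad))


def guessing_adversary(m0: bytes, m1: bytes, challenge: bytes) -> int:
    """An adversary with no strategy: it outputs r' uniformly at random."""
    return secrets.randbelow(2)


def play_round(encrypt: Encrypt, adversary: Adversary, m0: bytes, m1: bytes) -> bool:
    """Challenger picks r, encrypts m_r, and checks whether the guess r' equals r."""
    assert len(m0) == len(m1), "challenge messages must have the same length"
    r = secrets.randbelow(2)
    challenge = encrypt(m1 if r else m0)
    return adversary(m0, m1, challenge) == r


def estimate_advantage(encrypt: Encrypt, adversary: Adversary,
                       m0: bytes, m1: bytes, rounds: int = 2000) -> float:
    """Empirical success rate minus 1/2, i.e. an estimate of epsilon."""
    wins = sum(play_round(encrypt, adversary, m0, m1) for _ in range(rounds))
    return wins / rounds - 0.5
```

For the toy cipher and the guessing adversary, the estimate stays close to zero. An encryption function that leaks the plaintext would let a distinguishing adversary reach an estimate of 0.5.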

Theorem 2

The proposed model is traceable under the q-SDH assumption with \(t^{'}\ge t+t_{e}\,[O(\vert S\vert )\,count_{sk}]\), where \(count_{sk}\) represents the number of secret keys, \(t_{e}\) represents the running time of an exponentiation, and \(\vert S\vert \) represents the number of attributes in the set S.

Table 2 Comparison of the proposed model with existing schemes based on security parameters

Proof

If \(\mathcal {A}\) can break the traceability of the proposed model, then \(\mathcal {C}\) can use \(\mathcal {A}\) to solve the q-SDH problem. The game is defined below:

  1. Init phase: \(\mathcal {C}\) takes an instance \((g,g^{i},g^{j},g^{k})\) together with either \(e(g,g)^{ijk}\in G_{T}\) or a random element of \(G_T\), and uses these elements to construct the public parameter of the system.

  2. Query phase: \(\mathcal {C}\) answers all the secret key queries of \(\mathcal {A}\) and returns the results to \(\mathcal {A}\).

  3. Challenge phase: \(\mathcal {A}\) sends an altered secret key \(SK^*\) to the challenger \(\mathcal {C}\). If \(SK^*\) passes verification in Algorithm 4 given below, \(\mathcal {A}\) has broken the traceability of the proposed model, and \(\mathcal {C}\) can use the same key to solve the q-SDH problem.

Since the q-SDH problem cannot be solved in polynomial time, \(\mathcal {A}\) cannot produce an altered secret key \(SK^*\) that leads to a solution of the q-SDH problem with non-negligible probability, and hence cannot break the traceability of the proposed model.

Algorithm 4 Key Verify Algorithm

Table 3 Storage overhead acquired by the existing schemes

Table 2 compares the proposed scheme with similar existing schemes in terms of confidentiality, use of decentralized storage, keyword-based search for the required file, access control, user revocation support, and the ability to keep track of who can and cannot access the file. \(\perp \) denotes the absence of a feature, and \(\checkmark \) represents its presence in the discussed schemes. As seen in the table, our scheme fulfills all of these parameters.

  1. Data availability: The proposed scheme uses the IPFS network for EHR storage, thus resolving the data unavailability that is common in HTTPS-based systems. Additionally, using IPFS reduces redundancy and keeps the data reliable.

  2. Data-owner centric approach: Because the access policy is set by the data owner and is then used to encrypt the IPFS hash before sharing begins, the hospital node is in full control of who can and cannot access the file.

  3. Complexity: The partial decryption supported by the proposed model makes it lightweight and suitable for small sensor devices. Moreover, since only the encrypted CID is uploaded to the Ethereum network and not the entire file, the load on the data-sharing model decreases.

  4. Tamper-proof: The proposed architecture uses smart contracts and the blockchain to record all the relevant information related to data sharing, thus making the model resistant to any attempt to tamper with the data. All the keys, with the corresponding uid of the user, are accessed only through the developed smart contracts, thereby preventing any illegal access to them.

  5. Traceable identity: As discussed in Section 4.3.7, the Track algorithm can identify which user last decrypted or accessed a file. The proposed model binds the uid of the user to their secret key, thus making it impossible for the user to hide or change their identity. This ensures user accountability, since a user cannot deny accessing a particular file, and also provides transparency within the system. The log created in Section 4.3.7, known as the revocation list, currently stores the uid of revoked users only. In future work, a separate audit log can be used to store the uid of every user accessing a file. This audit log can be stored in \(Data\_SC\) over the blockchain network. Since all the transactions made on a blockchain network are timestamped and linked to the hash of the previous block, this will create an immutable, tamper-proof log for real-time monitoring of the shared files.
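The idea behind the audit log envisaged for future work, in which each entry is timestamped and linked to the hash of the previous one, can be sketched off-chain as follows. This is a minimal illustration and not the \(Data\_SC\) contract itself. The field names and the functions `append_access` and `verify_log` are assumptions introduced for the example.

```python
import hashlib
import json
import time
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first entry


def _entry_hash(entry: Dict) -> str:
    """Hash the fields of an entry, including the link to its predecessor."""
    payload = {k: entry[k] for k in ("uid", "file_id", "timestamp", "prev_hash")}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def append_access(log: List[Dict], uid: str, file_id: str) -> Dict:
    """Record that `uid` accessed `file_id`, chained to the previous entry."""
    entry = {
        "uid": uid,
        "file_id": file_id,
        "timestamp": time.time(),
        "prev_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    entry["hash"] = _entry_hash(entry)
    log.append(entry)
    return entry


def verify_log(log: List[Dict]) -> bool:
    """Recompute every hash and link; any edited entry breaks the chain."""
    prev = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev or entry["hash"] != _entry_hash(entry):
            return False
        prev = entry["hash"]
    return True
```

Changing the uid of any recorded entry causes `verify_log` to return `False`, which is the property that a blockchain-backed log would provide without relying on a single trusted party.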

Fig. 4 Storage overhead for the compared schemes

Table 4 Computation overhead acquired by the existing schemes

6 Performance analysis

This section compares the storage overhead of the proposed model with that of some existing models in terms of the most commonly used parameters, namely the public parameter, the user’s secret key, the generated ciphertext, and the trapdoor. A second comparison is based on computation time. The storage overhead is given in Table 3 as the total bytes used by each of the above-mentioned parameters. We first define the symbols used in the tables. \(\vert g \vert \), \(\vert g_{t}\vert \), and \(\vert z_{p}\vert \) represent the bytes occupied by elements of the groups G, \(G_T\), and \(Z_p\), respectively. \(T_1\), \(T_2\), and \(T_P\) represent the time taken for a modular exponentiation on elements of G, a modular exponentiation on elements of \(G_T\), and a bilinear pairing, respectively. Â, Û, and L denote the size of the user’s attribute set, the size of the universal attribute set, and the number of rows in the matrix M of the access structure, respectively.

Fig. 5 Computational overhead for Key Generation algorithm

Fig. 6 Computational overhead for Encryption algorithm

Fig. 7 Computational overhead for trapdoor generation

  • Public parameter: As seen in Table 2, only our proposed scheme, along with [4, 26] and [23], supports a large universal set with an unbounded number of attributes for the public parameter. This allows the proposed model to be extended to a larger number of communicating nodes, making it easily scalable and flexible. In contrast, the public parameter of [22] depends directly on the universal set and therefore grows as the attribute set grows, which makes that scheme impractical for real-time applications.

  • Secret key: The secret key of every model depends on the user’s attribute set, Â. The user’s private key in our proposed model occupies Â+1 elements of the group G and five elements of the group \(Z_{p}\). Our proposed model has a smaller secret key than [4, 24] and [23]; only the private key of [22] occupies less space than ours. A small key incurs less overhead on the storage available in the user’s device, making the proposed scheme well suited to resource-limited IoT devices.

  • Ciphertext: The ciphertext size of every scheme is directly proportional to the number of rows, L, of the matrix M. In our scheme, the ciphertext occupies four elements of group G, two elements of group \( G_{T} \), and twice as many elements of group \(Z_{p}\) as there are rows. As is clear from the graph, our scheme produces the smallest ciphertexts among all the compared schemes.

  • Trapdoor token: Among the compared schemes, only the proposed scheme supports a keyword trapdoor. Our trapdoor is small because its size depends only on \(\vert Z_{p} \vert \).

Figure 4 graphically represents the storage occupied by the discussed schemes and clearly shows that our scheme outperforms most of the discussed schemes.

For the bilinear pairing operations, we implemented the proposed scheme using the Python-based Charm framework and the Pairing-Based Cryptography (PBC) library, version 0.5.14. The experiment was run on a laptop with an Ubuntu 22.04-based Windows subsystem and an Intel Core i5 processor at 2.6 GHz. The elliptic curve “SS512”, given by \(y^{2} = x^{3} + x\) over the finite field \(F_p\) and supporting Type A pairing, was chosen for the implementation. An element of group \(Z_p\) occupies 20 bytes and an element of \(G_1\) occupies 128 bytes. Because the pairing is symmetric, elements of \(G_2\) and \(G_T\) also occupy 128 bytes each.
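The storage figures for the secret key and the ciphertext follow directly from these element sizes. The sketch below computes them from the compositions stated in the storage comparison, reading the \(Z_p\) part of the ciphertext as 2L elements. The function names and the sample values Â = 10 and L = 10 are assumptions made for illustration, not values reported in the experiments.

```python
G_BYTES = 128   # one element of G (G_1) on the SS512 curve
GT_BYTES = 128  # one element of G_T
ZP_BYTES = 20   # one element of Z_p


def secret_key_bytes(num_attributes: int) -> int:
    """(A + 1) elements of G plus five elements of Z_p."""
    return (num_attributes + 1) * G_BYTES + 5 * ZP_BYTES


def ciphertext_bytes(num_rows: int) -> int:
    """Four elements of G, two of G_T, and 2L elements of Z_p."""
    return 4 * G_BYTES + 2 * GT_BYTES + 2 * num_rows * ZP_BYTES


if __name__ == "__main__":
    # Hypothetical sizes: 10 attributes and a 10-row access matrix.
    print(secret_key_bytes(10))   # 1508
    print(ciphertext_bytes(10))   # 1168
```

The secret key grows by 128 bytes per attribute, whereas the ciphertext grows by only 40 bytes per row of the access matrix.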

Table 4 lists the computation time of the existing schemes along with ours. The notations \(T_1\), \(T_2\), and \(T_P\) represent the computation time for an exponentiation in group G, an exponentiation in \(G_T\), and a bilinear pairing, respectively. On a machine with 8 GB of RAM and the specifications given above, \(T_1=9.09\) ms, \(T_2=2.65\) ms, and \(T_P=18.03\) ms.

Table 5 Smart contract gas consumption record

  • KeyGen: For the HN to generate the user’s secret key \(SK_{uid}\), Â+1 operations are performed on group G. Our proposed scheme requires the least computational time for secret key generation. For an attribute size of 100, the execution time is 2754.27 ms, 963.54 ms, 4568.48 ms, 4568.48 ms, and 918.09 ms for [4, 22, 23, 26] and the proposed model, respectively.

  • Enc: To keep the encryption process lightweight and cost-effective, the proposed scheme performs no pairing-based computation in this phase. Our scheme uses four calculations on group G and three on group \(G_{T}\), giving a computation time of 44.31 ms.

  • Dec: Owing to the partial decryption supported by our proposed scheme, the decryption phase requires only one computation on group G. In both [4] and our scheme, the decryption phase takes only 9.09 ms irrespective of the attribute set size.

  • Trapdoor: Since none of the compared schemes supports a trapdoor facility, we have used the scheme of [8], discussed in Section 2, for the comparison. The proposed scheme requires only seven multiplicative and inversion operations on group \(Z_{p}\). No operation is performed on the groups \(G_{1}\), \(G_{2}\), and \(G_{T}\), thus making the computation time 0.0 ms. The scheme of [8] takes 7272.00 ms for 100 attributes. Figs. 5, 6, and 7 present a graphical comparison for different attribute sizes.
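The timings quoted above for the proposed scheme can be reproduced from the operation counts and the measured unit costs. The short sketch below does this arithmetic. The function names are hypothetical, and the cost model simply multiplies operation counts by \(T_1\) and \(T_2\), ignoring the cheaper \(Z_p\) operations.

```python
T1_MS = 9.09   # one exponentiation in G
T2_MS = 2.65   # one exponentiation in G_T
TP_MS = 18.03  # one bilinear pairing (not used by the phases below)


def keygen_ms(num_attributes: int) -> float:
    """KeyGen performs (A + 1) operations on group G."""
    return (num_attributes + 1) * T1_MS


def enc_ms() -> float:
    """Enc performs four operations on G and three on G_T, with no pairing."""
    return 4 * T1_MS + 3 * T2_MS


def dec_ms() -> float:
    """Dec (after partial decryption) performs one operation on G."""
    return T1_MS


if __name__ == "__main__":
    print(f"KeyGen, 100 attributes: {keygen_ms(100):.2f} ms")  # 918.09 ms
    print(f"Enc: {enc_ms():.2f} ms")                           # 44.31 ms
    print(f"Dec: {dec_ms():.2f} ms")                           # 9.09 ms
```

Only the key generation cost depends on the attribute count under this model. The encryption and decryption costs are constants.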

Fig. 8 Account address of Upload_SC

Fig. 9 Execution Cost(ETH) of Upload_SC

Fig. 10 Snippet of the code for ipfsupload.js

The comparison analysis illustrates the security and efficiency of our scheme. In terms of storage overhead, the proposed approach requires a total of 3444 bytes for the public parameter, secret key, and ciphertext combined. This stands in stark contrast to the 7680 bytes required by [4], 4632 bytes by [22], 12,032 bytes by [26], and 7936 bytes by [23]. Even when the extra storage needed for the trapdoor is included, the overall storage overhead of the proposed scheme is 3584 bytes, which still outperforms the compared schemes. Furthermore, the computational cost of deriving the secret key in our scheme is lower by 66%, 31%, and 79% compared with [4, 22], and [26], respectively. Similarly, the encryption time of our proposed scheme remains constant at 46.96 milliseconds, irrespective of the attribute set size, as it depends solely on the time taken for modular exponentiation on elements of groups G and \(G_{T}\). This marks a substantial improvement over other schemes, with our encryption process taking 95%, 89%, 40%, and 96% less time than [4, 22, 26] and [23], respectively.

6.1 Simulation environment

To assess the performance of the proposed model, a prototype was tested on a system with an Intel Core i5 processor and 8 GB of RAM. The Ethereum blockchain was used for simulation, with Remix IDE [27] as the platform for executing the smart contracts. Solidity [28] is used as the core programming language, along with JavaScript and web3.js [29] to deploy the smart contracts and perform off-chain computations, such as the implementation of CP-ABE, encryption, and decryption. Ganache [30] is used to set up a personal blockchain environment, with MetaMask [31] as an online wallet. The gas consumed by each smart contract and the corresponding cost are shown in Table 5.

Figures 8 and 9 reflect the test environment used to deploy the smart contracts, along with the address at which Upload_SC is deployed.

Figure 10 shows a snippet of the JavaScript code that connects to the smart contract and uploads the IPFS hash obtained through off-chain computation to the Upload_SC smart contract. ipfsupload.js receives the file from the HN, encrypts it using the public key of the user, \(pk_{uid}\), and uploads the encrypted file to IPFS. It then passes the IPFS hash returned by IPFS to Upload_SC.
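The upload flow can be mimicked with in-memory stand-ins to show why only a short identifier needs to reach the contract. This is a sketch under loud assumptions: it is not the code of ipfsupload.js, `ContentStore` is not an IPFS client, `UploadRegistry` is not Upload_SC, the method names are hypothetical, a plain SHA-256 hex digest replaces a real IPFS CID, and the encryption steps described above are omitted, so the input is taken to be already encrypted.

```python
import hashlib
from typing import Dict


class ContentStore:
    """In-memory stand-in for IPFS: blobs are keyed by the hash of their bytes."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        cid = hashlib.sha256(data).hexdigest()  # simplified content identifier
        self._blobs[cid] = data                 # identical content reuses one slot
        return cid

    def get(self, cid: str) -> bytes:
        return self._blobs[cid]

    def __len__(self) -> int:
        return len(self._blobs)


class UploadRegistry:
    """In-memory stand-in for the contract: it stores identifiers, never files."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def store_hash(self, file_id: str, cid: str) -> None:
        self._records[file_id] = cid

    def lookup(self, file_id: str) -> str:
        return self._records[file_id]


def upload(encrypted_file: bytes, file_id: str,
           store: ContentStore, registry: UploadRegistry) -> str:
    """Put the encrypted file in the store and record only its identifier."""
    cid = store.add(encrypted_file)
    registry.store_hash(file_id, cid)
    return cid
```

Because the identifier is derived from the content, uploading the same bytes twice yields the same identifier and a single stored copy, which is the deduplication property attributed to IPFS. The registry holds a fixed-length string per file regardless of the file size.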

7 Conclusion

In this research, we introduced a lightweight and effective blockchain and IPFS-based fine-grained access control system with a flexible revocation facility. To keep the scheme lightweight, we encrypted only the CID returned by IPFS after the file was uploaded and added it to the Ethereum network. The suggested storage solution has two key benefits. First, it ensures round-the-clock availability of the data, given the peer-to-peer nature of the IPFS network. Second, it supports deduplication, thus reducing redundancy among the files stored on the storage medium. Moreover, the proposed technique has a much lower computational cost for extracting the secret key. Similarly, the encryption time of the suggested scheme is constant at 46.96 ms regardless of the size of the attribute set, and the scheme also provides a significant decrease of 85% in decryption time over the compared techniques.

In future work, we aim to extend this research in two directions. One is hiding the access policy for even better security, and the other is automating the entire model under the supervision of smart contracts.