CN117951730A - A secure and searchable encryption method based on hash index in the cloud - Google Patents

Publication number
CN117951730A
Authority
CN
China
Prior art keywords
weight
encryption
identity information
file
keywords
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311732782.3A
Other languages
Chinese (zh)
Inventor
王昊
王慎卿
李明慧
殷常春
张佳乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Research Institute Of Nanjing University Of Aeronautics And Astronautics
Nanjing University of Aeronautics and Astronautics
Original Assignee
Shenzhen Research Institute Of Nanjing University Of Aeronautics And Astronautics
Nanjing University of Aeronautics and Astronautics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Research Institute Of Nanjing University Of Aeronautics And Astronautics and Nanjing University of Aeronautics and Astronautics
Priority to CN202311732782.3A
Publication of CN117951730A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Automation & Control Theory (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a cloud security searchable encryption method based on a hash index. The method involves a data owner and a cloud server, where the data owner holds a file set comprising a plurality of files and a plurality of keywords. The method comprises the following substeps: S1: setting a finite field and an elliptic curve group, and defining hash functions I, II and III on them; S2: selecting a secret key and generating a shielding key from it; S3: the data owner selects the weight of a keyword and obtains the corresponding encryption weight; S4: acquiring a keyword set, establishing an identity information set, and obtaining an identity information random number set; S5: calculating the weight of each piece of identity information, encrypting it to obtain the encryption weight, and establishing a searchable index; S6: the cloud server extracts the corresponding encryption weights for the files in the file set and builds a binary search tree to store the corresponding data; S7: the data user obtains the data of the files, recovers the weights, and calculates the maximum weight; S8: acquiring the file corresponding to the maximum weight.

Description

Cloud security searchable encryption method based on hash index
Technical Field
The invention relates to the technical field of network space security, in particular to a cloud security searchable encryption method based on hash indexes.
Background
Cloud computing provides data storage and computing as a service. Its flexible access gives data owners (DOs) the convenience of storing and accessing data without regard to storage location, capacity, or maintenance. Data sharing is important because it improves business or data quality through higher productivity and better decisions. However, cloud storage also increases the exposure of sensitive information such as health records, financial information, and government documents.
A data owner typically needs to authorize multiple users to securely access its data files. However, the curious nature of cloud servers inevitably puts the data at risk, because data owners and curious cloud servers belong to different trust domains. To control data exposure, sensitive data should be encrypted before it is outsourced to the cloud, and the cloud must serve only legitimate users without violating data privacy. Encryption, however, makes efficient data search and utilization harder. To keep data accessible, keyword-based search is a common approach that lets a user efficiently find the desired files without retrieving all files. Keyword privacy has therefore become a requirement of data privacy.
The security of a cloud service depends on not revealing information during or after a search operation. The search process and efficient data utilization can be improved by strict privacy-preserving methods such as Searchable Encryption (SE). SE allows searching over encrypted keywords while revealing as little information as possible to the server, which protects the privacy of user data. Many SE schemes based on Searchable Symmetric Encryption (SSE) and Public-key Encryption with Keyword Search (PEKS) have been proposed. SSE schemes allow only one user to perform keyword searches and access the encrypted data, so they are restricted to the single-user setting. Multi-user searchable encryption schemes were later developed using keyword search techniques; keyword search in the multi-user setting allows data owners to share their document collections with a set of authorized users. As expected, PEKS provides a better level of security, but it has higher computational overhead and incorrect security definitions, and is in practice unsuitable for large data sets. Most PEKS schemes do not consider security against insider attacks, such as those from an honest-but-curious server.
Despite advances in the field of searchable encryption, many problems remain unsolved. First, honest-but-curious cloud servers or other adversaries may still try to break privacy, and existing schemes are not sufficiently robust with respect to the similarity and relevance between keywords and files. Second, conventional public-key-based schemes are highly complex.
Wang et al. proposed a novel multi-keyword fuzzy search scheme that performs keyword searches directly on encrypted cloud data. However, it yields inaccurate search results, lacks a trade-off between effectiveness and security, and faces problems associated with Boolean expressions. Ziqing et al. proposed a method that uses a balanced binary tree and query vectors with file ranks to perform multi-keyword ranked search over encrypted cloud data in a multi-owner environment. With a binary tree, space must be allocated even when a keyword is not present in a particular file, which results in high memory consumption. Therefore, further research is needed to ensure the performance, accuracy and memory efficiency of searchable encryption in cloud environments. Specifically, the existing technical schemes are as follows:
1. Multi-keyword fuzzy search scheme: Wang et al. proposed a novel multi-keyword fuzzy search scheme that supports fuzzy search over multiple keywords without increasing the index size or search complexity, and that performs keyword searches directly on encrypted data in the cloud. Fuzzy matching is achieved through algorithm design rather than by expanding the index file, so multi-keyword fuzzy search over encrypted cloud data needs no predefined dictionary. The scheme implements multi-keyword (conjunctive keyword) fuzzy search by converting each keyword into a binary vector representation and capturing keyword similarity with the Euclidean distance.
2. Secure multi-keyword ranked search scheme: Ziqing et al. addressed the problem that previous secure searchable solutions only support searching data belonging to a single owner and cannot search multiple data sets outsourced by different data owners. They proposed key management by a trusted third party, index and query generation with a vector space model, and keyword weights from a newly designed KDO algorithm that considers both relevance and document quality. An asymmetric scalar-product-preserving encryption method encrypts the weighted indexes and queries to preserve privacy. The proposed grouped balanced binary tree index improves search efficiency through a greedy depth-first search algorithm.
Disadvantages of the prior art:
The first paper achieves fuzzy matching without extending the index file and supports multi-keyword fuzzy search without increasing index or search complexity, but this yields inaccurate search results, lacks a trade-off between effectiveness and security, and faces problems associated with Boolean expressions. The second paper, in order to search the data sets of multiple data owners, assumes that the data owners are honest and do not collude with one another; if the data owners collude, the security of the scheme may be compromised. It also relies on a trusted agent to create the encrypted indexes and trapdoors for data users, so the scheme is vulnerable if the trusted agent is compromised. Furthermore, with a binary tree, space must be allocated even when a keyword is not present in a particular file, resulting in high memory consumption. Therefore, further research is needed to ensure the performance, accuracy and memory efficiency of searchable encryption in cloud environments.
Disclosure of Invention
The invention aims to provide a cloud security searchable encryption method based on hash index, which overcomes the defects in the prior art.
In order to achieve the above purpose, the present invention provides the following technical solutions:
The application discloses a cloud security searchable encryption method based on hash index, which involves a data owner and a cloud server, wherein the data owner holds a file set, and the file set comprises a plurality of files and a plurality of keywords;
The method comprises the following substeps:
S1: setting a finite field and an elliptic curve group, and defining hash function I, hash function II and hash function III on them;
S2: selecting a key in the finite field, and generating a shielding key from the key;
S3: the data owner randomly selects the weight of a keyword from the file set, and obtains the encryption weight through hash function I and the shielding key;
S4: the data owner obtains a keyword set from the file set, establishes an identity information set according to the keyword set, and obtains an identity information random number set by mapping with hash function III;
S5: the data owner calculates the weight of the identity information, encrypts it to obtain the corresponding encryption weight, and establishes a searchable index;
S6: the cloud server extracts the corresponding encryption weights for each file in the file set through a query vector, and builds a binary search tree to store the identity information and the encryption weights of the corresponding files;
S7: the data user applies to the data owner, acquires the identity information and the encryption weight of the file, decrypts the encryption weight through the shielding key and hash function II to obtain the weight, and calculates the maximum weight;
S8: finding the file corresponding to the maximum weight.
Preferably, hash function I maps the finite field to the elliptic curve group; hash function II maps the elliptic curve group to the finite field; and hash function III maps the finite field to the finite field.
Preferably, the step S3 includes the following substeps:
S31: randomly selecting a weight;
S32: mapping the weight to one elliptic curve point in the elliptic curve group through a hash function I;
S33: and encrypting the elliptic curve points through the shielding key to obtain an encryption weight.
Preferably, the step S4 includes the following substeps:
S41: the data owner selects a plurality of keywords according to the file set and selects a plurality of pseudo keywords;
S42: the data owner adds the pseudo keywords to the keywords to form a keyword set;
S43: the data owner selects random identity information for each element in the keyword set to generate an identity information set, wherein the number of pieces of identity information equals the number of keywords plus the number of pseudo keywords;
S44: mapping each piece of identity information in the identity information set to an identity information random number through hash function III to generate an identity information random number set.
Preferably, the step S5 includes the following substeps:
S51: the data owner calculates the identity information weight of each identity information through the identity information set, and encrypts the identity information weight to obtain an identity information encryption weight;
S52: the data owner generates a searchable index from the identity information random number set, the identity information encryption weights associated with it, and the corresponding files.
Preferably, the step S6 includes the following substeps:
S61: the cloud server acquires a query vector containing identity information, and generates the hash values that map the keyword set to the identity information random number set;
S62: extracting, for each file, the encryption weight of the corresponding identity information random number through the query vector;
S63: a binary search tree is constructed to store the identity information and the encryption weight of the corresponding file.
Preferably, if a plurality of keywords appear in one file, their encryption weights are added by homomorphic addition. Specifically: for each file containing a keyword requested by the query vector, only one node is created in the binary search tree, and when the file contains several requested keywords, their encryption weights are accumulated; finally, all requested keywords are looked up in the hash table and the encryption weights are added in the binary search tree.
Preferably, the step S7 includes the following substeps:
S71: the data user generates a request for a plurality of keywords through the query vector and obtains the corresponding encryption weights;
S72: processing the encryption weight with the shielding key to obtain an elliptic curve point;
S73: applying hash function II to the elliptic curve point to obtain the weight before encryption;
S74: calculating the maximum weight.
Preferably, the step S74 includes the following substeps:
S741: initializing a maximum variable, and setting the maximum variable to 0;
S742: traversing each node of the binary search tree, and for each node, first acquiring the weight before encryption;
S743: checking the weight, and if the weight is greater than 1, assigning its value to the maximum variable; otherwise, continuing;
S744: after the traversal is completed, the value of the maximum variable is the maximum weight.
The application discloses a cloud security searchable encryption device based on a hash index, which comprises a memory and one or more processors, wherein executable code is stored in the memory, and the one or more processors implement the cloud security searchable encryption method based on the hash index when executing the executable code.
The application discloses a computer readable storage medium, which stores a program, and when the program is executed by a processor, the program realizes the cloud security searchable encryption method based on hash index.
The invention has the beneficial effects that:
(1) To address search complexity, the invention provides a symmetric encryption method. In the multi-keyword fuzzy search scheme, files must be searched linearly because both the storage format and the encryption algorithm are designed that way. In the present method, a hash function is used to reach the respective keyword directly, and the linked list is traversed to extract all files containing the keyword;
(2) To address high memory consumption, the invention proposes a stackable TF-IDF weight scheme that consumes 50% less memory than the secure multi-keyword ranked search scheme;
(3) An important module of the proposed method is the search module, which the data user invokes by generating a query vector. Experiments were performed on three data sets. The results show that the size of the data set has no effect on a search request, because the scores are pre-computed. A search operation does not need to traverse the entire file system; it only needs to find the hashed index entries and sum the scores for the corresponding files. Search time therefore depends only on locating the hash index entries and adding the scores. To analyze the proposed solution in more depth, two main keyword categories were formed: the most commonly used keywords and the rarely used keywords. Search requests were made for 2, 4, 8 and 10 commonly used keywords and for 2 and 4 rarely used keywords, in order to determine how the number of keywords in a request affects the number of traversal and addition operations. The results show that, although the data sets differ in size, the results are independent of data set size. The experimental results show that search time is directly related to how common the keywords are and to the number of requested keywords: more keywords lead to more addition operations, and common keywords trigger more data matching and addition operations than sets of rare keywords.
The features and advantages of the present invention will be described in detail by way of example with reference to the accompanying drawings.
Drawings
FIG. 1 is a schematic flow diagram of a cloud security searchable encryption method based on hash index according to the present invention;
FIG. 2 is a schematic diagram of the main content steps of the present invention;
FIG. 3 is a schematic view of the apparatus of the present invention.
Detailed Description
The present invention will be further described in detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present invention.
Referring to FIGS. 1-2, an embodiment of the present invention provides a cloud security searchable encryption method based on hash index. The embodiment builds on an in-depth analysis of cloud service security using a hash index and an existing elliptic-curve-based additively homomorphic encryption algorithm, and studies how a hash-index-based method can reduce the computational load in cloud services. The invention provides a cloud security searchable encryption scheme based on hash index to ensure the security of cloud data. First, compared with traditional public-key-based schemes, hash-based indexing reduces the computational load on the cloud server and the users. Second, compared with methods that use balanced binary trees and query vectors with file ranks, elliptic-curve-based additively homomorphic ElGamal encryption avoids the generation of complex trapdoors and binary query schemes and the associated resource-intensive processing. The search time, complexity and index storage complexity of the proposed framework are evaluated and compared with those of other state-of-the-art searchable encryption schemes on cloud servers. The cloud security searchable encryption method based on hash index provided by the invention is therefore better suited to cloud service scenarios with outsourced data;
A cloud security searchable encryption method based on hash index comprises a data owner and a cloud server;
wherein: the data owner is the contributor and administrator of the data set, and outsources the data files to a cloud server for use by authorized data users.
(1) First, the data owner encrypts the files with a symmetric encryption method to keep the data secret. The data owner also lists a set of keywords and distributes them to authenticated data users. To enable search, the data owner constructs an index of keywords and their associated weight scores. In this scheme, after the data owner selects the keywords, each keyword is assigned a unique random identifier. The data owner is responsible for calculating the weight scores and encrypting them with the EC-ElGamal method. Finally, the data owner outsources the encrypted data and the encrypted index to the cloud server, and shares the key s, the masking key and the ECC parameters with the authorized data users. In this scheme, the data owner also mixes some dummy (pseudo) keywords into the actual keyword set.
(2) The cloud server provides the data hosting facility: it stores the encrypted files and indexes outsourced by data owners and provides a search facility that matches the keyword requests of data users. In the proposed solution, as much processing and computation as possible is kept on the cloud server side so that data users bear a minimal computational burden. The proposed solution nevertheless avoids many operations on the cloud server, since the cloud server is considered honest but curious. The cloud server stores the encrypted data and the encrypted index, processes search queries, and returns the matching encrypted data to the requesting data user. It is assumed that the cloud server does not perform any other operations on the data set or the encrypted index;
During this process, the cloud server searches the hash table according to the query vector Q, extracts the encryption weights, constructs a binary search tree containing the file identifiers and encryption weights, and finally forms the query result required by the user.
(3) The data user receives the list of keywords shared by the data owner together with their unique identifiers, the symmetric key and the other ECC parameters. The data user issues a query in the form of a query vector and sends it to the cloud server, decodes the returned values, then requests the required file and decrypts it with the corresponding symmetric key.
The data owner holds a file set, wherein the file set comprises a plurality of files and a plurality of keywords;
The method comprises the following substeps:
S1: setting a finite field and an elliptic curve group, and defining hash function I, hash function II and hash function III on them;
S2: selecting a key in the finite field, and generating a shielding key from the key;
S3: the data owner randomly selects the weight of a keyword from the file set, and obtains the encryption weight through hash function I and the shielding key;
S4: the data owner obtains a keyword set from the file set, establishes an identity information set according to the keyword set, and obtains an identity information random number set by mapping with hash function III;
S5: the data owner calculates the weight of the identity information, encrypts it to obtain the corresponding encryption weight, and establishes a searchable index;
S6: the cloud server extracts the corresponding encryption weights for each file in the file set through a query vector, and builds a binary search tree to store the identity information and the encryption weights of the corresponding files;
S7: the data user applies to the data owner, acquires the identity information and the encryption weight of the file, decrypts the encryption weight through the shielding key and hash function II to obtain the weight, and calculates the maximum weight;
S8: finding the file corresponding to the maximum weight.
Hash function I maps the finite field to the elliptic curve group; hash function II maps the elliptic curve group to the finite field; and hash function III maps the finite field to the finite field.
The S3 comprises the following contents:
First, initialization is performed: the input is a security parameter $p$, and the output is an elliptic curve group $G$ over the finite field $F_p$ together with the hash functions $H_1: F_p \to G$, $H_2: G \to F_p$ and $H_3: F_p \to F_p$. Given $G$ with generator point $P$, the data owner selects a random number $s \in F_p$ and forms the masking key (the shielding key of step S2) $mk = sP$.
The data owner then randomly selects a weight $w \in [0, 999]$, where $w$ represents the normalized tf-idf weight of a keyword, and maps it to an elliptic curve point $P_M = H_1(M)$ using the hash function $H_1$, where the message $M$ is the weight $w$. $P_M$ is then encrypted with the EC-ElGamal encryption algorithm, giving the encrypted tf-idf weight $w' = P_M + mk$, where $mk$ is the masking key used to encrypt the weights.
The method comprises the following substeps:
S31: randomly selecting a weight;
S32: mapping the weight to one elliptic curve point in the elliptic curve group through a hash function I;
S33: and encrypting the elliptic curve points through the shielding key to obtain an encryption weight.
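The masking step $w' = P_M + mk$ and its inverse can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the toy curve $y^2 = x^3 + 2x + 2$ over $F_{17}$ (generator (5, 1), group order 19), the encoding $H_1(w) = wP$, and the lookup table standing in for $H_2$. Because of the tiny group, weights are limited to 0..18 here rather than 0..999, and the parameters offer no security.

```python
import secrets

# Toy parameters (assumption): y^2 = x^3 + 2x + 2 over F_17,
# generator (5, 1), group order 19. Far too small for real use.
P_MOD, A = 17, 2
GEN, ORDER = (5, 1), 19
INF = None  # point at infinity, the group identity


def ec_add(p1, p2):
    """Add two points on the toy curve."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # p2 is the negative of p1
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def ec_neg(pt):
    """Negate a point."""
    return INF if pt is INF else (pt[0], -pt[1] % P_MOD)


def ec_mul(k, pt):
    """Scalar multiplication by double-and-add."""
    result, addend = INF, pt
    while k:
        if k & 1:
            result = ec_add(result, addend)
        addend = ec_add(addend, addend)
        k >>= 1
    return result


def h1(w):
    """Stand-in for H_1 (assumption): encode the weight w as the point w*P."""
    return ec_mul(w % ORDER, GEN)


# Stand-in for H_2 (assumption): decode a point by table lookup.
DECODE = {h1(w): w for w in range(ORDER)}

s = secrets.randbelow(ORDER - 1) + 1  # secret key s
mk = ec_mul(s, GEN)                   # masking key mk = sP


def encrypt(w):
    """w' = P_M + mk."""
    return ec_add(h1(w), mk)


def decrypt(c):
    """P_M = w' - mk, then decode P_M back to the weight."""
    return DECODE[ec_add(c, ec_neg(mk))]


recovered = decrypt(encrypt(7))
```

The table-based decoding is only practical because the weight range is small and known in advance, which matches the normalized 0-999 range described above.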
The S4 comprises the following contents:
EC-ElGamal encryption allows homomorphic addition on the encrypted weights, so the cloud server can add multiple encrypted weights. Furthermore, the normalized weights allow efficient searching on a binary search tree. The data owner holds a set of files $F = \{f_1, f_2, \dots, f_n\}$ and selects $m$ keywords $K = \{k_1, k_2, \dots, k_m\}$ from $F$. In addition, $d$ pseudo keywords $K_d = \{k_{d1}, k_{d2}, \dots, k_{dd}\}$ are selected and added to $K$ to form $K_t = K + K_d$. The data owner selects a random identifier (identity information) $kid_j \in F_p$ for each $k_j \in K_t$, where $0 \le j \le m+d$, giving $KID_t = \{kid_1, kid_2, \dots, kid_{m+d}\}$, and uses the hash function $H_3$ to map each $kid_j \in KID_t$ to a random number in $F_p$, giving $HKID_t = \{H_3(kid_1), H_3(kid_2), \dots, H_3(kid_{m+d})\}$.
The method comprises the following substeps:
S41: the data owner selects a plurality of keywords according to the file set and selects a plurality of pseudo keywords;
S42: the data owner adds the pseudo keywords to the keywords to form a keyword set;
S43: the data owner selects random identity information for each element in the keyword set to generate an identity information set, wherein the number of pieces of identity information equals the number of keywords plus the number of pseudo keywords;
S44: mapping each piece of identity information in the identity information set to an identity information random number through hash function III to generate an identity information random number set.
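Substeps S41 to S44 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented construction: the prime `P = 2**61 - 1`, the use of SHA-256 reduced modulo `P` as the third hash function, and the names `h3` and `build_identifier_sets` are all hypothetical.

```python
import hashlib
import secrets

P = 2**61 - 1  # illustrative prime modulus for F_p (assumption)


def h3(kid: int) -> int:
    """Stand-in for H_3: F_p -> F_p, here SHA-256 reduced mod P (assumption)."""
    digest = hashlib.sha256(kid.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % P


def build_identifier_sets(keywords, pseudo_keywords):
    """Return (kid, hkid): a random identifier per keyword and its H_3 image."""
    k_t = list(keywords) + list(pseudo_keywords)  # K_t = K + K_d
    kid, used = {}, set()
    for k in k_t:
        candidate = secrets.randbelow(P)
        while candidate in used:  # keep identifiers unique
            candidate = secrets.randbelow(P)
        used.add(candidate)
        kid[k] = candidate
    hkid = {k: h3(v) for k, v in kid.items()}
    return kid, hkid


kid, hkid = build_identifier_sets(["cloud", "index"], ["zz-dummy-1"])
```

The real keywords and the pseudo keywords receive identifiers in exactly the same way, so the identifier sets alone do not reveal which entries are dummies.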
The S5 comprises the following contents:
The data owner calculates the tf-idf weight $w_j$ for each $kid_j$, and then encrypts $w_j$ to obtain $w'_j$.
Finally, the data owner generates the searchable index $I = kid_j \leftarrow \{(f_1, w'_{j1}), (f_2, w'_{j2}), \dots, (f_n, w'_{jn})\}$ using $HKID_t$, the associated encryption weights $w'_j$ and the files $f_i$, where $0 \le j \le m+d$. The encrypted tf-idf weight $w'$ of each keyword identifier is different for each file.
The method comprises the following substeps:
S51: the data owner calculates the identity information weight of each identity information through the identity information set, and encrypts the identity information weight to obtain an identity information encryption weight;
S52: the data owner generates a searchable index from the identity information random number set, the identity information encryption weights associated with it, and the corresponding files.
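A minimal sketch of substeps S51 and S52 is given below, under several assumptions that the text does not fix: the tf-idf variant (term frequency times $\ln(1 + n/df)$), the scaling of the largest raw score to 999, and a modular addition used as a stand-in for the elliptic-curve masking. The helper names `tfidf_weights` and `build_index` and the sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf_weights(files):
    """files: {file_id: [tokens]} -> {(keyword, file_id): weight in 0..999}."""
    n = len(files)
    df = Counter()
    for tokens in files.values():
        df.update(set(tokens))  # document frequency per keyword
    raw = {}
    for fid, tokens in files.items():
        for kw, count in Counter(tokens).items():
            # Assumed tf-idf variant: relative frequency times ln(1 + n/df).
            raw[(kw, fid)] = (count / len(tokens)) * math.log(1 + n / df[kw])
    top = max(raw.values())
    # Normalize so the largest score becomes 999.
    return {key: round(999 * v / top) for key, v in raw.items()}


def build_index(files, hkid, encrypt):
    """Hash index: hashed keyword identifier -> list of (file_id, encrypted weight)."""
    index = defaultdict(list)
    for (kw, fid), w in tfidf_weights(files).items():
        if kw in hkid:  # only selected keywords are indexed
            index[hkid[kw]].append((fid, encrypt(w)))
    return dict(index)


files = {"f1": ["cloud", "index", "cloud"], "f2": ["index", "hash"]}
hkid = {"cloud": 11, "index": 22}  # illustrative hashed identifiers
weights = tfidf_weights(files)
# Stand-in for the EC masking (assumption): add a constant modulo a prime.
index = build_index(files, hkid, encrypt=lambda w: (w + 123) % 1009)
```

Each index entry plays the role of a linked list of (file, encrypted weight) pairs, so a file appears under a keyword only if it actually contains that keyword.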
The S6 comprises the following contents:
The cloud server receives the query vector $Q$ containing the identifiers $kid_j$; a hash value $kid_j \leftarrow k_j$ is generated. The server finds $kid_j$ in the hash table and traverses the linked list, extracting the encryption weight $w'_{ji}$ of $kid_j$ for each file $f_i$. It constructs a binary search tree to hold the identifier of each extracted file $f_i$ and its encrypted weight $w'_{ji}$.
If multiple keyword identifiers occur in one file, the encrypted tf-idf weights $w'$ are added using homomorphic addition. Thus, for each file containing a requested keyword, only one node is created in the tree, and if the file contains multiple requested keywords, its $w'$ values are accumulated. Finally, once all requested keywords have been looked up in the hash table and their $w'$ added in the binary search tree, the result is ready for the user.
The method comprises the following substeps:
S61: the cloud server acquires a query vector, wherein the query vector comprises identity information, and generates a hash value of a keyword set mapped to an identity information random number set;
S62: extracting the encryption weight of the corresponding identity information random number for each file through the query vector;
S63: a binary search tree is constructed to store the identity information and the encryption weight of the corresponding file.
If a plurality of keywords appear in one file, the encryption weights are added through homomorphic addition. Specifically: for each file containing a keyword requested by the query vector, only one node is created in the binary search tree, and when the file contains a plurality of requested keywords, the encryption weights are accumulated; finally, all requested keywords are mapped against the hash table, and the encryption weights are added in the binary search tree.
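The one-node-per-file rule of step S6 can be sketched with a small binary search tree keyed by file identifier. This is an illustration under stated assumptions: addition modulo a toy constant `Q` stands in for homomorphic addition of encrypted weights, and `Node`, `bst_add`, `search` and `in_order` are hypothetical names.

```python
class Node:
    def __init__(self, file_id, enc_weight):
        self.file_id = file_id
        self.enc_weight = enc_weight
        self.left = None
        self.right = None

def bst_add(root, file_id, enc_weight, add):
    """Insert a file, or accumulate into its existing node via `add`."""
    if root is None:
        return Node(file_id, enc_weight)
    node = root
    while True:
        if file_id == node.file_id:
            # The file is already in the tree: accumulate, do not duplicate.
            node.enc_weight = add(node.enc_weight, enc_weight)
            return root
        side = "left" if file_id < node.file_id else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(file_id, enc_weight))
            return root
        node = child

def search(index, query_hkids, add):
    """Build the result tree for all requested hashed identifiers."""
    root = None
    for h in query_hkids:
        for file_id, enc_weight in index.get(h, []):
            root = bst_add(root, file_id, enc_weight, add)
    return root

def in_order(node):
    if node is None:
        return []
    return (in_order(node.left)
            + [(node.file_id, node.enc_weight)]
            + in_order(node.right))

Q = 101  # toy modulus standing in for the elliptic curve group (assumption)
index = {"h1": [("F1", 5), ("F2", 7)], "h2": [("F1", 11), ("F3", 2)]}
tree = search(index, ["h1", "h2", "missing"], add=lambda a, b: (a + b) % Q)
result = in_order(tree)
```

File F1 matches both requested identifiers, so it appears once in `result` with the accumulated value, while an identifier that is absent from the index contributes nothing.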
Step S7 comprises the following:
The data user generates a query vector that carries its request for multiple keywords, with index $0 \le j \le p$, where $p$ is the total number of keyword identifiers and may vary from query to query. The weights are then decrypted using EC-ElGamal: $P_M = w' - mk$ is calculated first, where $mk$ is the masking key, and then $w = H_2(P_M)$ is calculated. To find the maximum, a variable Max is first selected and initialized to 0. Each node in the BST is traversed and $w'_i$ is obtained for each file $f_i$. The point $P_M$ of $w_i$ is recovered from $w'_i$ with the aid of $mk$, and $w_i$ is finally recovered from $P_M$ using $H_2$. In the first iteration, Max is set to $w_i$ if $w_i$ is greater than 1. This process is repeated n times, so that the $w_i$ with the largest value is finally stored in Max. Finally, the data user requests from the server the file $f_i$ whose weight $w_i$ equals Max for further retrieval.
1. Decryption and calculation of weights: each node stored in the BST contains a file identifier $f_i$ and the corresponding encrypted weight $w'$. To decrypt these weights, the corresponding masking key $mk$ is required. Using $mk$, the elliptic curve point $P_M$ from before encryption is recovered from $w'$, i.e., $P_M = w' - mk$.
2. Restoring weights from elliptic curve points: the elliptic curve point $P_M$ obtained in step 1 is mapped back to the pre-encryption weight $w_i$ using the hash function $H_2$. Specifically, $H_2(P_M)$ is calculated to obtain the pre-encryption weight.
3. Calculating the maximum weight: a variable Max is initialized to 0. Each node in the BST is then traversed, and the following steps are performed for each node:
a. Take the pre-encryption weight $w_i$ obtained in step 2.
b. If $w_i$ is greater than 1 (because only weights greater than 1 are significant), Max is set to $w_i$.
c. Continue comparing the pre-encryption weight $w_i$ of the current node with Max, and update Max if $w_i$ is larger.
4. Finding the maximum-weight file: after the traversal is completed, the maximum of the pre-encryption weights of all files is stored in Max. Finally, the file identifier $f_i$ with the greatest pre-encryption weight is found; this is the file that the data user (DU) then requests from the cloud server (CS) for further retrieval.
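The mask removal $P_M = w' - mk$ and the recovery $w = H_2(P_M)$ can be sketched with a toy model. The sketch replaces the elliptic curve group with integers modulo a small prime and assumes integer weights. `Q`, `G`, `h1`, `h2`, `encrypt` and `decrypt` are hypothetical names, and the lookup inside `h2` is only one possible way to map a group element back to a small weight.

```python
Q = 1009  # toy prime modulus; stands in for the elliptic curve group
G = 5     # toy generator of the additive group modulo Q

def h1(weight):
    """Toy stand-in for hash function I: integer weight -> group element."""
    return (weight * G) % Q

def h2(point, max_weight=200):
    """Toy stand-in for hash function II: group element -> weight.

    Searches the small weight range for the value that maps to `point`.
    """
    for w in range(max_weight + 1):
        if h1(w) == point:
            return w
    raise ValueError("weight outside the supported range")

def encrypt(weight, mk):
    """Mask the group element of the weight with the masking key."""
    return (h1(weight) + mk) % Q

def decrypt(enc_weight, mk):
    p_m = (enc_weight - mk) % Q  # P_M = w' - mk
    return h2(p_m)               # w = H2(P_M)

mk = 777  # illustrative masking key
nodes = [("F1", encrypt(3, mk)), ("F2", encrypt(42, mk))]
weights = {file_id: decrypt(w, mk) for file_id, w in nodes}
```

In this toy model the masked values reveal nothing useful without `mk`, and each weight is recovered exactly because the weight range is small enough to search.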
The method comprises the following substeps:
S71: the data user generates requests for a plurality of keywords through the query vector, and corresponding encryption weights are obtained;
S72: calculating the encryption weight through a shielding key to obtain an elliptic curve point;
S73: calculating elliptic curve points through hash function II to obtain the weight before encryption;
S74: calculating and acquiring the maximum weight.
The step S74 includes the following sub-steps:
S741: initializing a maximum variable, and setting the maximum variable to 0;
S742: traversing each node of the binary search tree, and for each node, firstly acquiring the weight before encryption;
S743: judging the weight, and if the weight is greater than 1 and greater than the current value of the maximum variable, giving the value of the weight to the maximum variable; otherwise, continuing;
S744: after the traversal is completed, the value of the maximum variable is the maximum weight.
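Sub-steps S741 to S744 amount to a single pass over the decrypted weights. The sketch below assumes the weights have already been decrypted and that the traversal yields `(file_id, weight)` pairs; `max_weight_file` is a hypothetical name. The threshold of 1 is taken from the text, and the comparison against the running maximum follows the description of step S7.

```python
def max_weight_file(nodes):
    """nodes: iterable of (file_id, weight) with weights already decrypted."""
    max_w = 0    # S741: initialize the maximum variable to 0
    best = None
    for file_id, w in nodes:        # S742: visit each node of the tree
        if w > 1 and w > max_w:     # S743: threshold and running comparison
            max_w, best = w, file_id
    return best, max_w              # S744: max_w now holds the maximum weight

best, max_w = max_weight_file([("F1", 3), ("F2", 9), ("F3", 1)])
```

If no weight exceeds 1, the function returns `None` with a maximum of 0, which signals that no file qualifies.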
In the present invention, to address search complexity, the files are encrypted using a symmetric encryption method. The secure index is a hash of the keyword identifier (ID), which allows a user's search request to be matched more quickly and securely. The corresponding tf-idf weights are also included in the index. The tf-idf weights are encrypted with an EC-ElGamal-based encryption system that provides additive homomorphism and therefore supports multi-keyword requests. After receiving a search request, the server can add together, for each file, the encrypted tf-idf weights of all requested keywords and store the total weight of each file separately. This yields a privacy-preserving ranked multi-keyword search over encrypted data. The data user is free to choose files of whatever rank he needs. Many approaches claim to provide the top-k files, but in practice the data user does not always need the top-k files.
To address high memory consumption, a keyword identifier kid is obtained from the query vector, and its hash value is calculated using the hash function $H_3$. If the hash maps to an index of the dynamic array A, the linked list to which that index points is traversed. Each node, containing a file and its associated encryption weight, is added to the BST. When a further keyword identifier from the query vector is processed, three cases arise. In case 1, $hkid = H_3(kid)$ does not map to an index of A, which means that the corresponding keyword is not present in f. In case 2, if the new node to which hkid points contains a file f that does not yet exist in the BST, the node is added to the BST. In case 3, if hkid points to a node containing a file f that already exists in the BST, the encryption weights of the two nodes are added and the weight of f in the BST is updated. Thus, each file is added to the BST only once, and the more keyword identifiers map to the same file, the higher the combined weight of that file in the BST. Memory is therefore not wasted.
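The three cases can be sketched as follows, under several assumptions. SHA-256 reduced modulo the table size stands in for $H_3$, Python lists stand in for the linked lists of array A, and a dictionary stands in for the BST. Each bucket entry also stores its kid so that two identifiers sharing a bucket are not confused. `h3`, `build_table` and `process_kid` are hypothetical names.

```python
import hashlib

def h3(kid, size):
    """SHA-256 reduced modulo the table size; a stand-in for H3 (assumption)."""
    return int.from_bytes(hashlib.sha256(kid.encode()).digest(), "big") % size

def build_table(entries, size):
    """entries: iterable of (kid, file_id, enc_weight). Returns the array A."""
    table = [[] for _ in range(size)]
    for kid, file_id, w in entries:
        table[h3(kid, size)].append((kid, file_id, w))
    return table

def process_kid(table, kid, result, add):
    """Apply the three cases for one keyword identifier; return the cases hit."""
    bucket = table[h3(kid, len(table))]
    matches = [(f, w) for k, f, w in bucket if k == kid]
    if not matches:
        return ["case 1"]          # kid is not mapped: keyword absent
    cases = []
    for file_id, enc_weight in matches:
        if file_id not in result:  # case 2: new file, create its single node
            result[file_id] = enc_weight
            cases.append("case 2")
        else:                      # case 3: file known, accumulate the weight
            result[file_id] = add(result[file_id], enc_weight)
            cases.append("case 3")
    return cases

add = lambda a, b: (a + b) % 101   # toy stand-in for homomorphic addition
table = build_table([("k1", "F1", 5), ("k1", "F2", 7), ("k2", "F1", 11)], 8)
result = {}
log = [process_kid(table, k, result, add) for k in ("k1", "k2", "k9")]
```

Because case 3 updates an existing entry instead of creating a new one, the result structure never holds more than one entry per file.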
The embodiment of the hash-index-based cloud security searchable encryption apparatus of the present invention can be applied to any device with data processing capability, such as a computer. The apparatus embodiment may be implemented by software, by hardware, or by a combination of hardware and software. Taking software implementation as an example, the apparatus in a logical sense is formed when the processor of the device with data processing capability on which it resides reads the corresponding computer program instructions from nonvolatile memory into memory. In terms of hardware, fig. 3 shows a hardware structure diagram of the device with data processing capability on which the hash-index-based cloud security searchable encryption apparatus of the present invention resides. In addition to the processor, memory, network interface and nonvolatile memory shown in fig. 3, the device on which the apparatus resides may further include other hardware according to its actual function, which is not described here. For the implementation of the functions and roles of each unit in the above apparatus, see the implementation of the corresponding steps in the above method, which is not repeated here.
Since the apparatus embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
The embodiment of the invention also provides a computer readable storage medium on which a program is stored; when executed by a processor, the program implements the hash-index-based cloud security searchable encryption method of the above embodiment.
The computer readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any of the devices with data processing capability described in the previous embodiments. The computer readable storage medium may also be an external storage device of such a device, for example a plug-in hard disk, a Smart Media Card (SMC), an SD card or a flash card provided on the device. Further, the computer readable storage medium may include both an internal storage unit and an external storage device of the device. The computer readable storage medium is used for storing the computer program and the other programs and data required by the device, and may also be used for temporarily storing data that has been output or is to be output.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, or alternatives falling within the spirit and principles of the invention.

Claims (9)

1. A cloud security searchable encryption method based on hash index, characterized in that: the method involves a data owner and a cloud server, wherein the data owner holds a file set, and the file set comprises a plurality of files and a plurality of keywords;
The method comprises the following substeps:
S1: setting a finite field and an elliptic curve group, and accordingly setting hash function I, hash function II and hash function III;
S2: selecting a key in the finite field, and generating a shielding key from the key;
S3: the data owner randomly selects the weight of a keyword from the file set, and obtains the encryption weight through hash function I and the shielding key;
S4: the data owner obtains a keyword set from the file set, establishes an identity information set according to the keyword set, and obtains an identity information random number set through mapping by hash function III;
S5: the data owner calculates and acquires the weight of the identity information, obtains the corresponding encryption weight through encryption, and establishes a searchable index;
S6: the cloud server extracts corresponding encryption weights for each file in the file set through a query vector, and builds a binary search tree to store the identity information and the encryption weights of the corresponding files;
S7: the data user applies to the data owner, acquires the identity information and the encryption weight of the file, decrypts the encryption weight through the shielding key and hash function II to acquire the weight, and calculates and acquires the maximum weight;
S8: finding the file corresponding to the maximum weight through the maximum weight.
2. The cloud security searchable encryption method based on hash index as claimed in claim 1, wherein: the hash function I maps from the finite field to the elliptic curve group; the hash function II maps from the elliptic curve group to the finite field; the hash function III maps from the finite field to the finite field.
3. The cloud security searchable encryption method based on hash index as claimed in claim 1, wherein: the step S3 comprises the following substeps:
S31: randomly selecting a weight;
S32: mapping the weight to one elliptic curve point in the elliptic curve group through hash function I;
S33: encrypting the elliptic curve point through the shielding key to obtain an encryption weight.
4. The cloud security searchable encryption method based on hash index as claimed in claim 1, wherein: the step S4 comprises the following substeps:
S41: the data owner selects a plurality of keywords according to the file set and selects a plurality of pseudo keywords;
S42: the data owner adds the pseudo keywords to the keywords to form a keyword set;
S43: the data owner selects random identity information for each element in the keyword set to generate an identity information set, wherein the number of pieces of identity information equals the sum of the number of keywords and the number of pseudo keywords;
S44: an identity information random number is mapped for each piece of identity information in the identity information set through hash function III to generate an identity information random number set.
5. The cloud security searchable encryption method based on hash index as claimed in claim 1, wherein: the step S5 comprises the following substeps:
S51: the data owner calculates the identity information weight of each piece of identity information through the identity information set, and encrypts the identity information weight to obtain an identity information encryption weight;
S52: the data owner generates a searchable index from the identity information random number set, the identity information encryption weights associated with it, and the corresponding files.
6. The cloud security searchable encryption method based on hash index as claimed in claim 1, wherein: the step S6 comprises the following substeps:
S61: the cloud server acquires a query vector, wherein the query vector comprises identity information, and generates a hash value of a keyword set mapped to an identity information random number set;
S62: extracting the encryption weight of the corresponding identity information random number for each file through the query vector;
S63: a binary search tree is constructed to store the identity information and the encryption weight of the corresponding file.
7. The cloud security searchable encryption method based on hash index as claimed in claim 6, wherein: if a plurality of keywords appear in one file, the encryption weights are added through homomorphic addition; specifically: for each file containing a keyword requested by the query vector, only one node is created in the binary search tree, and when the file contains a plurality of requested keywords, the encryption weights are accumulated; finally, all requested keywords are mapped against the hash table, and the encryption weights are added in the binary search tree.
8. The cloud security searchable encryption method based on hash index as claimed in claim 1, wherein: the step S7 comprises the following substeps:
S71: the data user generates requests for a plurality of keywords through the query vector, and corresponding encryption weights are obtained;
S72: calculating the encryption weight through a shielding key to obtain an elliptic curve point;
S73: calculating elliptic curve points through hash function II to obtain the weight before encryption;
S74: calculating and acquiring the maximum weight.
9. The cloud security searchable encryption method based on hash index as claimed in claim 8, wherein: the step S74 includes the following sub-steps:
S741: initializing a maximum variable, and setting the maximum variable to 0;
S742: traversing each node of the binary search tree, and for each node, firstly acquiring the weight before encryption;
S743: judging the weight, and if the weight is greater than 1, giving the value of the weight to the maximum variable; otherwise, continuing;
S744: after the traversal is completed, the value of the maximum variable is the maximum weight.
CN202311732782.3A 2023-12-15 2023-12-15 A secure and searchable encryption method based on hash index in the cloud Pending CN117951730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311732782.3A CN117951730A (en) 2023-12-15 2023-12-15 A secure and searchable encryption method based on hash index in the cloud

Publications (1)

Publication Number Publication Date
CN117951730A true CN117951730A (en) 2024-04-30

Family

ID=90796988

Country Status (1)

Country Link
CN (1) CN117951730A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118264403A (en) * 2024-05-30 2024-06-28 山东渤聚通云计算有限公司 Data security processing method applied to edge computing intelligent gateway
CN119066684A (en) * 2024-11-01 2024-12-03 企飞大数据(山东)有限公司 A file data encryption method for enterprise office systems

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination