Today, efficient data processing is a fundamental need for almost every
scientific, academic, or business organization. As a result, organizations
end up installing and managing database management systems (DBMSs) to satisfy their
data processing needs. The traditional solution is to purchase the necessary hardware,
deploy database products, establish network connectivity, and hire professionals to run
the system, but this has become increasingly expensive and impractical as
database systems and the problems they address grow larger and more complicated.
Advances in networking technologies have triggered one of the key industry responses, the
"software as a service" initiative, also referred to as the application service provider (ASP) model.
The "database as a service" model addresses the problem stated above and inherits all the advantages of the
ASP model, arguably even more, given that a large number of organizations run their own DBMSs. The model
allows organizations to leverage hardware and software solutions offered by service providers
without having to develop them on their own, freeing them to concentrate on their core businesses.
The objective of this research is to explore the viability of the database-as-a-service (DAS) model. The first step towards that objective, we feel, is the implementation and provisioning of such a service.

The primary challenge posed by the DAS model is data privacy. In the database-service-provider model, the user's data resides on the premises of the provider. Both corporations and individuals view their data as a very valuable asset, so a service provider needs to implement sufficient security measures to guarantee data privacy. One key issue is how much privacy is enough. Clearly, protection from hackers who might break into the service provider's system is required. However, we argue that this is not enough, since clients will also desire privacy protection from the service providers themselves. Any data privacy solution will have to use encryption, which, as usual, comes at a cost. A fundamental question is whether encryption is so costly that it makes the database service provider model infeasible.

Our research investigates various aspects of this challenge by devising new techniques that ensure the privacy of the data. With the above general goals, we are conducting research along the following directions:
- Implementation and provisioning of the database-as-a-service model. To build a better understanding of the challenges introduced by the DAS model, an instantiation of the model should be implemented and deployed with real clients.
- Integration of data encryption with database systems to protect data against outside malicious attacks and to limit the liability of the service provider. Data encryption protects the privacy of the data in the databases against adversaries. However, encryption techniques have significant performance implications for query processing in databases. Alternatives for integrating encryption techniques with databases should be investigated and their trade-offs evaluated.
- Development of mathematical and statistical measures of data privacy for various privacy-preserving schemes.
- Development of techniques to protect the privacy of user data from the database service providers themselves. If the service providers themselves are not trusted, then protecting the privacy of users' data is a much more challenging issue. Research should be conducted to address this challenge in the following directions:
- Design of a new data storage model. An encrypted data storage model should be designed. The model should enable query processing directly over encrypted databases to ensure privacy from database providers.
- Development of new query processing techniques. Based on the encrypted data storage model, novel query processing techniques need to be developed. Those techniques allow query processing directly over encrypted data, thereby hiding the information from the service providers. Decryption is performed only by the client, who is the rightful owner of the data.
- Development of new techniques for special query types. Some query classes pose particular challenges in encrypted database environments. Typical examples are aggregation queries and text pattern matching queries. Widely used encryption techniques do not support such query types. If those query types are not handled specifically, they introduce very significant performance overheads, sometimes even jeopardizing the feasibility of the system. Therefore, first, encryption techniques need to be adapted or devised to support those types of queries, and second, query processing techniques that make use of them have to be developed.
- Query optimization in encrypted databases. New techniques change the way we process queries over encrypted databases. Thus, optimization of these reformulated queries has to be carefully studied. The optimization process should ensure that the users of the system, the clients, can take full advantage of the capabilities promised by the DAS model.
- Integrity of the data in encrypted databases. Once data encryption is employed as a solution to the data privacy problem, other issues arise in this context. One of the most important is ensuring the integrity of the users' data. The integrity of the data may be compromised by both malicious and non-malicious causes. When this happens, the client has no mechanism to verify the integrity of the original data. Therefore, new techniques have to be developed that give clients mechanisms to check the integrity of their data hosted at the service provider side.
- Key management issues in encrypted databases. Another issue to address in the context of encrypted databases is key management. All encryption techniques rely on secure and efficient key management architectures. The DAS model adds complexity to key management architectures. Generation, registration, storage, and update of encryption keys are essential functions that have to be handled efficiently in the DAS model.
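To make the idea of query processing over an encrypted storage model more concrete, the following is a minimal sketch, not the project's implementation. It assumes one possible design: each row is stored at the server as an opaque encrypted blob next to a coarse, keyed bucket label for one attribute. The server filters on labels only, and the client decrypts the candidates and discards false positives. All names (`BUCKET_WIDTH`, `encrypt_row`, `bucket_label`, `range_query`, the `salary` attribute) are hypothetical, and the SHA-256 keystream is a toy stand-in for a vetted cipher such as AES.

```python
import hashlib
import hmac
import json
import os

BUCKET_WIDTH = 10_000  # hypothetical bucket width for the "salary" attribute


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy SHA-256 counter-mode keystream. Illustration only: a real
    deployment would use a vetted cipher (for example AES) instead."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt_row(key: bytes, row: dict) -> bytes:
    """Client side: serialize a row and encrypt it into an opaque blob."""
    nonce = os.urandom(16)
    plain = json.dumps(row, sort_keys=True).encode()
    stream = _keystream(key, nonce, len(plain))
    return nonce + bytes(p ^ s for p, s in zip(plain, stream))


def decrypt_row(key: bytes, blob: bytes) -> dict:
    """Client side: recover the row from a blob produced by encrypt_row."""
    nonce, cipher = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(cipher))
    return json.loads(bytes(c ^ s for c, s in zip(cipher, stream)))


def bucket_label(key: bytes, value: int) -> str:
    """Client side: keyed label of the bucket that contains `value`.
    The server sees only this label, not the value or the bucket number."""
    bucket = value // BUCKET_WIDTH
    return hmac.new(key, str(bucket).encode(), hashlib.sha256).hexdigest()[:16]


def store(table: list, enc_key: bytes, idx_key: bytes, row: dict) -> None:
    """Client encrypts the row; the server table holds (blob, label) pairs."""
    table.append((encrypt_row(enc_key, row), bucket_label(idx_key, row["salary"])))


def server_select(table: list, labels: set) -> list:
    """Server side: filter on bucket labels only, without any key."""
    return [blob for blob, label in table if label in labels]


def range_query(table: list, enc_key: bytes, idx_key: bytes, low: int, high: int) -> list:
    """Client side: rewrite low <= salary <= high into bucket labels,
    let the server filter, then decrypt and drop false positives.
    Assumes non-negative integer salaries and low <= high."""
    labels = {
        bucket_label(idx_key, b * BUCKET_WIDTH)
        for b in range(low // BUCKET_WIDTH, high // BUCKET_WIDTH + 1)
    }
    rows = [decrypt_row(enc_key, blob) for blob in server_select(table, labels)]
    return [r for r in rows if low <= r["salary"] <= high]
```

In this sketch the bucket width controls the trade-off raised in the list above: wider buckets reveal less to the server but return more false positives that the client must decrypt and filter.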
- Hakan Hacigumus (IBM, Almaden)
- Bala Iyer (IBM, Silicon Valley Lab)
- Bijit Hore, Sharad Mehrotra, Gene Tsudik, "A Privacy-Preserving Index for Range Queries", International Conference on Very Large Databases (VLDB 2004), Toronto, Canada, 2004.
- Hakan Hacigumus, Sharad Mehrotra, "Performance-Conscious Key Management in Encrypted Databases", IFIP WG 11.3 Working Conference on Data and Application Security, Sitges, Spain, 2004.
- Ravi Jammalamadaka, Sharad Mehrotra, "Querying Encrypted XML Documents", UCI Technical Report TR-DB-04-03.
- Bala Iyer, Sharad Mehrotra, Einar Mykletun, Gene Tsudik, and Yonghua Wu, "A Framework for Efficient Storage Security in RDBMS", International Conference on Extending Database Technology (EDBT), Greece, 2004.
- H. Hacigumus, B. Iyer, and S. Mehrotra, "Efficient Execution of Aggregation Queries over Encrypted Databases", International Conference on Database Systems for Advanced Applications (DASFAA), Jeju, South Korea, 2004.
- Einar Mykletun, Maithili Narasimha, and Gene Tsudik, "Authentication and Integrity of Outsourced Databases", Network and Distributed System Security (NDSS 2004), San Diego, Feb 2004.
- Einar Mykletun, Maithili Narasimha, and Gene Tsudik, "Signature Bouquets: Immutability for Aggregated/Condensed Signatures", in submission.
- H. Hacigumus, B. Iyer, and S. Mehrotra, "Ensuring Integrity of Encrypted Databases in Database as a Service Model", IFIP Conference on Data and Applications Security, Estes Park, Colorado, 2003.
- H. Hacigumus, B. Iyer, and S. Mehrotra, "Encrypted Database Integrity in Database Service Provider Model", IFIP WCC, Workshop on Certification and Security in E-Services (CSES), Montreal, Canada, 2002.
- H. Hacigumus, B. Iyer, C. Li, and S. Mehrotra, "Executing SQL over Encrypted Data in Database Service Provider Model", ACM SIGMOD Conference on Management of Data, Madison, Wisconsin, 2002.
- H. Hacigumus, B. Iyer, and S. Mehrotra, "Providing Database as a Service", IEEE International Conference on Data Engineering (ICDE), San Jose, California, 2002.
- H. Hacigumus, B. Iyer, and S. Mehrotra, "NetDB2: Database Service Provision" (Demo/Poster), CASCON, Toronto, Canada, 2000.
Presentations and Talks
- "Privacy and Integrity in Outsourced Databases" (Distinguished Invited Talk at CERIAS, Purdue University), April 2004.
- "A Framework for Efficient Storage Security in RDBMS" (EDBT), March 2004.
- "Authentication and Integrity of Outsourced Databases" (NDSS 2004).
Funding is provided by the National Science
Foundation under Grant No. IIS-0220069.
2002-2003 Project Report (IIS-0086124) [pdf]