PrivGene: Differentially private model fitting using genetic algorithms

Jun Zhang, Xiaokui Xiao, Yin Yang, Zhenjie Zhang, Marianne Winslett

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

39 Citations (Scopus)


ε-differential privacy is rapidly emerging as the state-of-the-art scheme for protecting individuals' privacy in published analysis results over sensitive data. The main idea is to perform random perturbations on the analysis results, such that any individual's presence in the data has negligible impact on the randomized results. This paper focuses on analysis tasks that involve model fitting, i.e., finding the parameters of a statistical model that best fit the dataset. For such tasks, the quality of the differentially private results depends upon both the effectiveness of the model fitting algorithm and the amount of perturbation required to satisfy the privacy guarantees. Most previous studies start from a state-of-the-art, non-private model fitting algorithm, and develop a differentially private version. Unfortunately, many model fitting algorithms require intensive perturbations to satisfy ε-differential privacy, leading to poor overall result quality. Motivated by this, we propose PrivGene, a general-purpose differentially private model fitting solution based on genetic algorithms (GA). PrivGene needs significantly fewer perturbations than previous methods, and it achieves higher overall result quality, even for model fitting tasks where GA is not the first choice without privacy considerations. Further, PrivGene performs the random perturbations using a novel technique called the enhanced exponential mechanism, which improves over the exponential mechanism [26] by exploiting the special properties of model fitting tasks. As case studies, we apply PrivGene to three common analysis tasks involving model fitting: logistic regression, SVM classification, and k-means clustering. Extensive experiments using real data confirm the high result quality of PrivGene, and its superiority over existing methods.
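For readers unfamiliar with the exponential mechanism mentioned in the abstract, the following is a minimal sketch of the standard (not the enhanced) mechanism: it selects one candidate with probability proportional to exp(ε · score / (2 · Δ)), where Δ is the sensitivity of the score function. The function name `exponential_mechanism`, the toy data, the fitness function, and the sensitivity value are all illustrative assumptions, not taken from the paper. The abstract does not describe how the enhanced variant differs, so it is not shown here.

```python
import math
import random


def exponential_mechanism(candidates, score, epsilon, sensitivity, rng=random):
    """Standard exponential mechanism (illustrative sketch).

    Returns one candidate c with probability proportional to
    exp(epsilon * score(c) / (2 * sensitivity)).
    """
    scores = [score(c) for c in candidates]
    # Shift by the maximum score for numerical stability; the shift cancels
    # out when the weights are normalised, so probabilities are unchanged.
    top = max(scores)
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Hypothetical toy task: choose the candidate parameter that best fits
    # a small dataset under an absolute-error fitness score.
    data = [0.2, 0.4, 0.5, 0.7]
    candidates = [0.0, 0.25, 0.5, 0.75, 1.0]

    def fitness(c):
        return -sum(abs(x - c) for x in data)

    # Assumption: data and candidates lie in [0, 1], so adding or removing
    # one record changes the score by at most 1 (sensitivity = 1).
    print(exponential_mechanism(candidates, fitness, epsilon=1.0, sensitivity=1.0))
```

Smaller values of ε flatten the selection probabilities (more privacy, noisier choice), while larger values concentrate them on the highest-scoring candidate. In a genetic algorithm, a selection step of this kind could be used to choose which candidate parameter vectors survive to the next generation; that usage is a plausible reading of the abstract rather than a detail it states.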

Original language: English
Title of host publication: SIGMOD 2013 - International Conference on Management of Data
Number of pages: 12
Publication status: Published - 2013
Externally published: Yes
Event: 2013 ACM SIGMOD Conference on Management of Data, SIGMOD 2013 - New York, NY, United States
Duration: 22 Jun 2013 to 27 Jun 2013


Other: 2013 ACM SIGMOD Conference on Management of Data, SIGMOD 2013
Country: United States
City: New York, NY



Keywords

  • Differential privacy
  • Genetic algorithms
  • Model fitting

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Zhang, J., Xiao, X., Yang, Y., Zhang, Z., & Winslett, M. (2013). PrivGene: Differentially private model fitting using genetic algorithms. In SIGMOD 2013 - International Conference on Management of Data (pp. 665-676)