The quantum chemical search for novel materials and the issue of data processing

The InfoMol project

Hans P. Lüthi, Stefan Heinen, Gisbert Schneider, Andreas Glöss, Martin P. Brändle, Rollin A. King, Edward Pyzer-Knapp, Fahhad Alharbi, Sabre Kais

Research output: Contribution to journal (Article)

4 Citations (Scopus)

Abstract

Quantum chemical modeling and simulation have taken on an important role in the search for novel materials. Molecular properties are computed from first-principles methods and screened against pre-defined criteria. Alternatively, the results of these computations serve as source data to improve the predictions of data-centric models. Whichever modeling strategy is applied, the process involves data-intensive steps. One key bottleneck is that virtually no quantum chemistry code provides machine-readable output. The results of computations must be extracted manually or with scripts and parsers, instead of being written directly in a machine-readable format that can be imported into a database for archival, analysis and exchange. We present two solutions, implemented in the TURBOMOLE and PSI4 program packages. In addition to the standard output, both codes generate Extensible Markup Language (XML) output files, but in two different ways. Machine-readable output in a structured format is easy to implement and, as long as the data can be transformed, the choice of data format is secondary. The concept is illustrated with two use cases, from method benchmarking and drug design. A third illustration addresses the definition of a data processing and exchange protocol for screening libraries of light-harvesting compounds.
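The workflow the abstract describes (write results in a structured format, import them into a database, then screen against a criterion) can be illustrated with a minimal Python sketch. It rests on assumptions throughout: the element and attribute names (`calculation`, `property`, `name`, `unit`), the table layout, and the helper functions are hypothetical and do not reproduce the XML schemas that TURBOMOLE or PSI4 emit, and the numeric values are placeholders rather than computed results.

```python
import sqlite3
import xml.etree.ElementTree as ET


def write_results_xml(molecule, method, properties):
    """Serialize properties to an XML string (hypothetical schema, not TURBOMOLE/PSI4)."""
    root = ET.Element("calculation", {"molecule": molecule, "method": method})
    for name, (value, unit) in properties.items():
        prop = ET.SubElement(root, "property", {"name": name, "unit": unit})
        prop.text = repr(value)
    return ET.tostring(root, encoding="unicode")


def load_results_into_db(xml_text, conn):
    """Parse one XML record and insert one row per property; return the row count."""
    root = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(molecule TEXT, method TEXT, name TEXT, value REAL, unit TEXT)"
    )
    rows = [
        (root.get("molecule"), root.get("method"),
         p.get("name"), float(p.text), p.get("unit"))
        for p in root.findall("property")
    ]
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)


def screen(conn, name, threshold):
    """Return the molecules whose named property lies below the threshold."""
    cur = conn.execute(
        "SELECT molecule FROM results WHERE name = ? AND value < ? "
        "ORDER BY molecule",
        (name, threshold),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Placeholder values for illustration only, not results of any calculation.
    for mol, gap in [("mol_A", 3.0), ("mol_B", 1.5)]:
        xml_text = write_results_xml(mol, "demo-method", {"gap": (gap, "eV")})
        load_results_into_db(xml_text, conn)
    print(screen(conn, "gap", 2.0))
```

Because the record is structured, no output-file parser is needed between the calculation and the database; swapping XML for another transformable format would only change the two serialization helpers.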

Original language: English
Journal: Journal of Computational Science
DOI: 10.1016/j.jocs.2015.10.003
Publication status: Accepted/In press - 15 Jul 2015

Fingerprint

XML
Quantum chemistry
Output
Electronic data interchange
Benchmarking
Screening
Availability
Network protocols
Quantum Chemistry
Drug Design
Data Exchange
Harvesting
First-principles
Use Case
Data Model
Modeling and Simulation
Prediction
Modeling

Keywords

  • Data processing
  • Drug design
  • Materials design
  • Modeling and simulation
  • Quantum chemistry

ASJC Scopus subject areas

  • Computer Science (all)
  • Modelling and Simulation
  • Theoretical Computer Science

Cite this

The quantum chemical search for novel materials and the issue of data processing: The InfoMol project. / Lüthi, Hans P.; Heinen, Stefan; Schneider, Gisbert; Glöss, Andreas; Brändle, Martin P.; King, Rollin A.; Pyzer-Knapp, Edward; Alharbi, Fahhad; Kais, Sabre.

In: Journal of Computational Science, 15.07.2015.

@article{31abe08312cf4c0d989184a8006f7d4f,
title = "The quantum chemical search for novel materials and the issue of data processing: The InfoMol project",
keywords = "Data processing, Drug design, Materials design, Modeling and simulation, Quantum chemistry",
author = "L{\"u}thi, {Hans P.} and Stefan Heinen and Gisbert Schneider and Andreas Gl{\"o}ss and Br{\"a}ndle, {Martin P.} and King, {Rollin A.} and Edward Pyzer-Knapp and Fahhad Alharbi and Sabre Kais",
year = "2015",
month = "7",
day = "15",
doi = "10.1016/j.jocs.2015.10.003",
language = "English",
journal = "Journal of Computational Science",
issn = "1877-7503",
publisher = "Elsevier",

}

TY - JOUR

T1 - The quantum chemical search for novel materials and the issue of data processing

T2 - The InfoMol project

AU - Lüthi, Hans P.

AU - Heinen, Stefan

AU - Schneider, Gisbert

AU - Glöss, Andreas

AU - Brändle, Martin P.

AU - King, Rollin A.

AU - Pyzer-Knapp, Edward

AU - Alharbi, Fahhad

AU - Kais, Sabre

PY - 2015/7/15

Y1 - 2015/7/15

KW - Data processing

KW - Drug design

KW - Materials design

KW - Modeling and simulation

KW - Quantum chemistry

UR - http://www.scopus.com/inward/record.url?scp=84948800031&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84948800031&partnerID=8YFLogxK

U2 - 10.1016/j.jocs.2015.10.003

DO - 10.1016/j.jocs.2015.10.003

M3 - Article

JO - Journal of Computational Science

JF - Journal of Computational Science

SN - 1877-7503

ER -