Generating SQL queries using natural language syntactic dependencies and metadata

Alessandra Giordani, Alessandro Moschitti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

This research concerns translating natural language questions into SQL queries by exploiting the MySQL framework for both hypothesis construction and thesis verification in the task of question answering. We use linguistic dependencies and metadata to build sets of possible SELECT and WHERE clauses. We then exploit the metadata again to build FROM clauses enriched with meaningful joins. Finally, we combine all the clauses to obtain the set of all possible SQL queries, producing an answer to the question. Our algorithm can be applied recursively to deal with complex questions that require nested SELECT instructions. Additionally, it proposes a weighting scheme to order all the generated queries by probability of correctness. Our preliminary results are encouraging: the system generates the right SQL query among the first five in 92% of the cases. This result can be greatly improved by re-ranking the queries with machine learning methods.
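The clause-combination and ranking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the candidate clauses, their weights, the `joins` table (standing in for join information that could be read from database metadata), and the product-of-weights score are all assumptions made for illustration.

```python
import re
from itertools import product

# Hypothetical candidate clauses with illustrative weights (not from the paper).
SELECT_CANDIDATES = [("city.name", 0.9), ("state.name", 0.4)]
WHERE_CANDIDATES = [("state.name = 'texas'", 0.8), ("city.name = 'texas'", 0.3)]

# Hypothetical metadata: the join condition linking a pair of tables.
JOINS = {frozenset({"city", "state"}): "city.state_id = state.id"}


def tables_of(clause):
    """Collect table names from 'table.column' tokens in a clause."""
    return set(re.findall(r"\b([a-z_]+)\.[a-z_]+", clause))


def build_from(tables, joins):
    """Build a FROM clause for one or two tables, or None if no join is known."""
    names = sorted(tables)
    if len(names) == 1:
        return f"FROM {names[0]}"
    if len(names) == 2:
        condition = joins.get(frozenset(names))
        if condition is not None:
            return f"FROM {names[0]} JOIN {names[1]} ON {condition}"
    return None  # this sketch does not handle longer join paths


def generate_ranked_queries(selects, wheres, joins):
    """Combine every SELECT/WHERE pair and rank the results by weight product."""
    queries = []
    for (sel, sel_w), (whr, whr_w) in product(selects, wheres):
        from_clause = build_from(tables_of(sel) | tables_of(whr), joins)
        if from_clause is None:
            continue  # discard combinations with no meaningful join
        sql = f"SELECT {sel} {from_clause} WHERE {whr}"
        queries.append((sel_w * whr_w, sql))
    return sorted(queries, key=lambda q: q[0], reverse=True)


if __name__ == "__main__":
    for score, sql in generate_ranked_queries(SELECT_CANDIDATES, WHERE_CANDIDATES, JOINS):
        print(f"{score:.2f}  {sql}")
```

With these made-up weights, the highest-ranked candidate is the query that joins `city` and `state` and filters on the state name. A real system would derive the candidates and weights from the question's dependency parse and the database metadata rather than from hard-coded lists.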

Original language: English
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 164-170
Number of pages: 7
Volume: 7337 LNCS
DOIs: https://doi.org/10.1007/978-3-642-31178-9_16
Publication status: Published - 24 Jul 2012
Externally published: Yes
Event: 17th International Conference on Applications of Natural Language to Information Systems, NLDB 2012 - Groningen, Netherlands
Duration: 26 Jun 2012 → 28 Jun 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7337 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 17th International Conference on Applications of Natural Language to Information Systems, NLDB 2012
Country: Netherlands
City: Groningen
Period: 26/6/12 → 28/6/12

Fingerprint

Query Language
Syntactics
Metadata
Natural Language
Query
Linguistics
Learning systems
Question Answering
Join
Weighting
Correctness
Ranking
Machine Learning
Syntax

Keywords

  • Information Schema
  • Metadata
  • Natural Language Processing
  • Question Answering
  • SQL

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Giordani, A., & Moschitti, A. (2012). Generating SQL queries using natural language syntactic dependencies and metadata. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7337 LNCS, pp. 164-170). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7337 LNCS). https://doi.org/10.1007/978-3-642-31178-9_16

@inproceedings{e0ab514379834bb9ad18a75bc6ef0b56,
title = "Generating SQL queries using natural language syntactic dependencies and metadata",
abstract = "This research concerns translating natural language questions into SQL queries by exploiting the MySQL framework for both hypothesis construction and thesis verification in the task of question answering. We use linguistic dependencies and metadata to build sets of possible SELECT and WHERE clauses. We then exploit the metadata again to build FROM clauses enriched with meaningful joins. Finally, we combine all the clauses to obtain the set of all possible SQL queries, producing an answer to the question. Our algorithm can be applied recursively to deal with complex questions that require nested SELECT instructions. Additionally, it proposes a weighting scheme to order all the generated queries by probability of correctness. Our preliminary results are encouraging: the system generates the right SQL query among the first five in 92{\%} of the cases. This result can be greatly improved by re-ranking the queries with machine learning methods.",
keywords = "Information Schema, Metadata, Natural Language Processing, Question Answering, SQL",
author = "Alessandra Giordani and Alessandro Moschitti",
year = "2012",
month = "7",
day = "24",
doi = "10.1007/978-3-642-31178-9_16",
language = "English",
isbn = "9783642311772",
volume = "7337 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "164--170",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY - GEN

T1 - Generating SQL queries using natural language syntactic dependencies and metadata

AU - Giordani, Alessandra

AU - Moschitti, Alessandro

PY - 2012/7/24

Y1 - 2012/7/24

N2 - This research concerns translating natural language questions into SQL queries by exploiting the MySQL framework for both hypothesis construction and thesis verification in the task of question answering. We use linguistic dependencies and metadata to build sets of possible SELECT and WHERE clauses. We then exploit the metadata again to build FROM clauses enriched with meaningful joins. Finally, we combine all the clauses to obtain the set of all possible SQL queries, producing an answer to the question. Our algorithm can be applied recursively to deal with complex questions that require nested SELECT instructions. Additionally, it proposes a weighting scheme to order all the generated queries by probability of correctness. Our preliminary results are encouraging: the system generates the right SQL query among the first five in 92% of the cases. This result can be greatly improved by re-ranking the queries with machine learning methods.

KW - Information Schema

KW - Metadata

KW - Natural Language Processing

KW - Question Answering

KW - SQL

UR - http://www.scopus.com/inward/record.url?scp=84863995903&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84863995903&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-31178-9_16

DO - 10.1007/978-3-642-31178-9_16

M3 - Conference contribution

SN - 9783642311772

VL - 7337 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 164

EP - 170

BT - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

ER -