SQLCoder-7b-2 is live.
Faster, Smaller, The best Text-to-SQL LLM model.
Overview
SQLCoder is a Text-to-SQL LLM family from Defog.ai (YC W23): a human-level AI analyst for every enterprise user.
Text-to-SQL is a crucial task. This task can convert text (natural language) into SQL statements based on the table schema.
Text-to-SQL enables end users to retrieve data from a database instead of having to write SQL statements.
Benchmark
According to the Defog.ai team's benchmark, SQLCoder-7b-2 and SQLCoder-70b-alpha have accuracy that surpasses that of GPT-4.
They have proposed a Text-to-SQL benchmark named SQL-Eval.
Model card
Limitation
- Text-to-SQL is not the same as Text-to-Analytic. This model should be treated as a translation from text into SQL. Do not treat this model as an analyst or ask it questions like 'How can we improve sales?' or 'Tell me about the insights.'
- Read-access only. 'This model has not been trained to reject malicious requests from users with write access to databases,' according to the Defog.ai team.
- Only PostgreSQL syntax. This model was trained on PostgreSQL syntax, so it is not capable of handling other database syntaxes.
Playground
The playground contain 3 sections.
1.Table schema DDL and relation.
First is the DDL schema.
You could generate the schema from your database and paste it directly. The DDL schema helps the LLM understand the existing tables.
Second is the relation(s).
The relations help the LLM perform joins between tables. They represent the primary keys and foreign keys.
2.Question.
The question section could be the user's question. The user could ask anything that relates to the DDL schema and relations that were provided in the previous section.
3.SQL Statement.
The response from the model is the 'SQL statement'.
The SQL statement alone is not helpful for the end user. You need to execute this SQL statement using a query engine that connects to your database.
After you get the response from the database, you could directly show the result from the SQL query or refine the result and the question using another LLM.
How to use
The DDL schema and relations must be contained in the 'system' role.
The question must be contained in the 'user' role.
Like this example.
[{
'role': 'system',
'content': '{DDL}'
},
{
'role': 'user',
'content': ' What are our top 3 products by revenue in the New York ? '
}]
Provide a runnable code sample in this notebook.
Register : https://app.float16.cloud/register