TLSQL Documentation
TLSQL is a system designed to simplify machine learning workflows on structured tabular data. It translates SQL-like statements into standard SQL queries and structured learning task descriptions, enabling data scientists and engineers to focus on model development instead of writing complex SQL or manually managing datasets.
TLSQL works seamlessly with relational databases, data warehouses, and data lakes, enabling end-to-end table-based ML workflows.
Overview
TLSQL supports three statement types, each of which defines one data split in an ML workflow:
PREDICT VALUE: Specifies the prediction target and selects the test set.
TRAIN WITH: Selects the training set.
VALIDATE WITH: Selects the validation set.
Figure: The TLSQL Workflow
Quick Start
import tlsql
# Single statement: convert one TLSQL statement to SQL
result = tlsql.convert("PREDICT VALUE(users.Age, CLF) FROM users WHERE users.Gender='F'")
print(result.statement_type) # 'PREDICT'
print(result.target_column) # 'users.Age'
print(result.sql) # generated SQL
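Once a statement has been converted, the generated SQL can be run against whichever database holds the table. The following is a minimal sketch using Python's built-in sqlite3 module. The in-memory users table and its rows are made up for illustration, and the query string is a hand-written stand-in for result.sql, since the exact SQL that TLSQL emits may differ.

```python
import sqlite3

# Hypothetical toy data standing in for a real `users` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (UserID INTEGER, Gender TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "F", 25), (2, "M", 35), (3, "F", 45)],
)

# Assumption: a stand-in for result.sql; TLSQL's actual output may differ.
generated_sql = "SELECT * FROM users WHERE users.Gender = 'F'"

# Fetch the rows that would make up the test set.
test_rows = conn.execute(generated_sql).fetchall()
print(test_rows)
```

In practice the string passed to execute would be result.sql, and the connection would point at the database that TLSQL's query targets.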
Components
API:
convert - Convert a single TLSQL statement to SQL.
convert_workflow_queries - Convert a workflow (a query_list of PREDICT, TRAIN, and VALIDATE statements) to SQL.
Core Components:
Lexer - Tokenizes TLSQL text into tokens.
Parser - Parses tokens into an Abstract Syntax Tree (AST).
SQL Generator - Generates standard SQL from the AST.
AST Nodes - AST node definitions.
Tokens - Token type definitions.
Exceptions - Exception classes for error handling.
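The lexer, parser, and SQL generator stages listed above can be illustrated with a toy pipeline. This is a simplified sketch, not TLSQL's implementation: the names tokenize, PredictNode, parse_predict, and generate_sql are hypothetical, only the PREDICT VALUE form from the Quick Start is handled, and the SELECT it emits is an assumed shape for the generated SQL.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (kind, text)

# Hypothetical token pattern; TLSQL's real lexer may recognise more.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<STRING>'[^']*')|(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<IDENT>[A-Za-z_][\w.]*)|(?P<SYMBOL>[(),=<>*]))"
)

def tokenize(text: str) -> List[Token]:
    """Lexer stage: split statement text into (kind, text) tokens."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

@dataclass
class PredictNode:
    """AST node for: PREDICT VALUE(col, task) FROM table [WHERE ...]."""
    target_column: str
    task_type: str
    table: str
    where: Optional[str]

def parse_predict(tokens: List[Token]) -> PredictNode:
    """Parser stage: turn the token list into a PredictNode."""
    pos = 0

    def expect(text: str) -> None:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][1].upper() != text:
            raise ValueError(f"expected {text!r} at token {pos}")
        pos += 1

    def ident() -> str:
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != "IDENT":
            raise ValueError(f"expected identifier at token {pos}")
        pos += 1
        return tokens[pos - 1][1]

    expect("PREDICT"); expect("VALUE"); expect("(")
    target = ident()
    expect(",")
    task = ident()
    expect(")"); expect("FROM")
    table = ident()
    where = None
    if pos < len(tokens):
        expect("WHERE")
        # Keep the condition as text; a fuller parser would build a subtree.
        where = " ".join(text for _, text in tokens[pos:])
    return PredictNode(target, task, table, where)

def generate_sql(node: PredictNode) -> str:
    """SQL generator stage: emit a SELECT for the rows the node describes."""
    sql = f"SELECT * FROM {node.table}"
    if node.where:
        sql += f" WHERE {node.where}"
    return sql

stmt = "PREDICT VALUE(users.Age, CLF) FROM users WHERE users.Gender='F'"
node = parse_predict(tokenize(stmt))
print(node.target_column, node.task_type)  # users.Age CLF
print(generate_sql(node))  # SELECT * FROM users WHERE users.Gender = 'F'
```

Keeping the three stages separate means each can be tested on its own: the lexer against raw strings, the parser against token lists, and the generator against hand-built AST nodes.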