Skip to main content
Version: 0.2.0

Skardi Server

Define SQL queries in YAML and serve them as parameterized HTTP APIs. The server includes a built-in dashboard, automatic parameter inference, and health checks for all data sources.

cargo run --bin skardi-server -- \
--ctx ctx.yaml \
--pipeline pipelines/ \
--port 8080

For full server documentation — context files, pipeline files, access mode, caching, API endpoints, and response format — see docs/server.md.

Server Reference

skardi-server is an HTTP server that loads data sources from a context file, registers SQL pipelines, and serves them as REST endpoints.

Running the Server

cargo run --bin skardi-server -- \
--ctx <path-to-ctx.yaml> \
--pipeline <path-to-pipeline.yaml-or-directory> \
--port 8080
FlagDescription
--ctxPath to the context YAML file that defines data sources
--pipelinePath to a pipeline YAML file or a directory of pipeline files
--portPort to listen on (default: 8080)

Dashboard

Once the server is running, open http://localhost:8080 in your browser to access the pipeline dashboard.

The dashboard lists every registered pipeline as a card showing:

  • Endpoint URL — the POST path to call, with a one-click copy button
  • Parameters — inferred parameter names and types from the pipeline SQL
  • Example request — a ready-to-run curl command for the pipeline
  • Try It — an interactive panel where you can edit the JSON body and execute the pipeline directly from the browser

No configuration required — the dashboard is built into skardi-server and updates automatically when pipelines are loaded.

API Endpoints

EndpointMethodDescription
/GETPipeline dashboard UI
/healthGETService health check
/health/:nameGETPer-pipeline health check (includes data source status)
/pipelinesGETList all registered pipelines
/pipeline/:nameGETGet specific pipeline info
/data_sourceGETList all data sources
/:name/executePOSTExecute a pipeline by name

Context Files

A context file (ctx.yaml) defines the data sources available to your pipelines. Each data source is registered as a table in the query engine.

data_sources:
- name: "products" # Table name used in SQL queries
type: "csv" # Data source type
path: "data/products.csv" # File path or connection string
options: # Type-specific options
has_header: true
delimiter: ","
schema_infer_max_records: 1000
description: "Product catalog"

You can define multiple data sources of different types in a single context file:

data_sources:
- name: "users"
type: "postgres"
connection_string: "postgresql://localhost:5432/mydb?sslmode=disable"
options:
table: "users"
schema: "public"
user_env: "PG_USER"
pass_env: "PG_PASSWORD"

- name: "orders"
type: "csv"
path: "docs/sample_data/orders.csv"
options:
has_header: true
delimiter: ","

Access Mode

By default, all data sources are read-only — only SELECT queries are allowed. To enable write operations (INSERT, UPDATE, DELETE), set access_mode: read_write on the data source. Only postgres, mysql, sqlite, mongo, and redis sources support read_write mode; setting it on other types will produce an error at startup.

data_sources:
- name: "users"
type: "postgres"
connection_string: "postgresql://localhost:5432/mydb?sslmode=disable"
access_mode: read_write # Enable INSERT/UPDATE/DELETE
options:
table: "users"
user_env: "PG_USER"
pass_env: "PG_PASSWORD"

- name: "products"
type: "csv"
path: "data/products.csv"
# access_mode defaults to read_only (CSV doesn't support writes)

If a pipeline attempts a write operation on a read_only source, the server returns an error:

Write operation not allowed on data source 'products'. The data source is configured with 'read_only' access mode.

In-Memory Caching

For file-based sources (csv, parquet, iceberg), you can set enable_cache: true to load the entire dataset into memory at startup. This gives significantly faster query performance at the cost of memory usage.

data_sources:
- name: "products"
type: "csv"
path: "data/products.csv"
enable_cache: true # Load into memory at startup
options:
has_header: true

This is useful for datasets that are queried frequently and fit in memory. The cache is created once at startup and used for all subsequent queries.

Pipeline Files

A pipeline file defines a SQL query with parameter placeholders. Parameters are enclosed in {braces} and automatically extracted. Types and response schemas are inferred from the SQL and table schemas.

metadata:
name: product-search-demo
version: 1.0.0
description: "Product search and filtering"

query: |
SELECT
"Name" as product_name,
"Brand" as brand,
"Price" as price
FROM products
WHERE ({brand} IS NULL OR "Brand" = {brand})
AND ({max_price} IS NULL OR "Price" < {max_price})
ORDER BY "Price" ASC
LIMIT {limit}

Execute with:

curl -X POST http://localhost:8080/product-search-demo/execute \
-H "Content-Type: application/json" \
-d '{"brand": "Apple", "max_price": 500.0, "limit": 10}'

Use the {param} IS NULL OR ... pattern for optional filters — pass null to skip a filter.

Response Format

Success:

{
"success": true,
"data": [{"product_name": "Laptop", "price": 999.99}],
"rows": 1,
"execution_time_ms": 15,
"timestamp": "2025-01-15T12:00:00.000Z"
}

Error:

{
"success": false,
"error": "Missing required parameters: limit",
"error_type": "parameter_validation_error",
"details": {"missing_parameters": ["limit"]},
"timestamp": "2025-01-15T12:00:00.000Z"
}