Version: 0.3.0

Roadmap

We're building in public. [x] means shipped today, [ ] means open for contribution. To pick up or discuss any unchecked item, open an issue or join us on Discord.

1 Federated SQL engine

  • DataFusion single-node federation across CSV, Parquet, JSON, S3 / GCS / Azure, Postgres, MySQL, SQLite, MongoDB, Redis, Iceberg, Lance, SeekDB — all joinable in one query
  • Register by table, or load an entire DB (Postgres / MySQL / SQLite) as a DataFusion catalog — one config line either way

2 Retrieval primitives

  • Vector search — pg_knn (pgvector), sqlite_knn (sqlite-vec), Lance KNN, SeekDB HNSW
  • Full-text search — pg_fts, sqlite_fts, Lance BM25 inverted indexes, SeekDB FULLTEXT
  • Hybrid search — RRF merge of FTS + KNN in plain SQL
  • Inline embeddings — candle() UDF (GGUF / Candle / remote embed APIs) runs directly inside SQL; content + vector stay on the same row atomically
  • ONNX inference — onnx_predict UDF for inline model predictions in SQL
  • Memory primitive — hybrid access + TTL + provenance + consolidation collapsed into one declarative macro
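
To make the hybrid search item concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the merge step it names. It is written in Python purely for illustration and is not the project's SQL implementation. The function name `rrf_merge`, the document ids, and the constant `k=60` (a commonly used default) are assumptions, not values taken from this roadmap.

```python
from collections import defaultdict


def rrf_merge(rankings, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every id it contains,
    where rank starts at 1 for the best hit. k=60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists: one from full-text search, one from KNN.
fts_hits = ["doc_a", "doc_b", "doc_c"]
knn_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_merge([fts_hits, knn_hits])
for doc_id, score in fused:
    print(f"{doc_id}\t{score:.5f}")
```

Documents that appear in both lists (`doc_a`, `doc_b`) accumulate two contributions and therefore rank above documents found by only one retriever. RRF uses ranks rather than raw scores, so BM25 and vector-distance values never need to be put on a common scale.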

3 Online serving (pipelines)

  • Declarative YAML → parameterized REST endpoint with inferred request / response schema
  • Built-in pipeline dashboard
  • CLI pipeline binding + aliases — skardi run <pipeline> --param=… and user-defined verb aliases (#90)
  • CLI federated SQL — skardi query against files, object stores, datalake formats, and databases with no server required

4 Offline jobs

  • Async batch execution with submit / poll / cancel (#98)
  • Lance dataset destinations with atomic commit + crash recovery
  • SQL-DML destinations (Postgres / MySQL / SQLite)
  • SQLite-backed run ledger with submit-time schema diff
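
The roadmap does not say how the submit-time schema diff is computed. As a rough illustration of the idea, the sketch below compares the columns a job would produce against the columns the destination already has, before any rows are written. The function name `schema_diff`, the dict-based schema representation, and the type names are all hypothetical.

```python
def schema_diff(produced, destination):
    """Compare two {column_name: type_name} mappings.

    Returns columns the job produces that the destination lacks ("added"),
    columns the destination has that the job does not produce ("missing"),
    and columns present in both with different types ("changed").
    """
    added = {c: t for c, t in produced.items() if c not in destination}
    missing = {c: t for c, t in destination.items() if c not in produced}
    changed = {
        c: {"destination": destination[c], "produced": produced[c]}
        for c in produced
        if c in destination and produced[c] != destination[c]
    }
    return {"added": added, "missing": missing, "changed": changed}


# Hypothetical schemas for a job output and its destination table.
produced = {"id": "int64", "body": "utf8", "score": "float32"}
destination = {
    "id": "int64",
    "body": "utf8",
    "score": "float64",
    "created_at": "timestamp",
}

diff = schema_diff(produced, destination)
print(diff)
```

Running a check like this at submit time lets a mismatched job be rejected or flagged before it starts, rather than failing partway through a batch write.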

5 Agent-facing bindings

  • REST — every pipeline served as a parameterized HTTP endpoint
  • Shell — every pipeline runnable as a skardi command; works in Claude Code, Cursor, and any agent with a Bash tool
  • Skills generator — skardi skills generate --ctx <ctx.yaml> --out .claude/skills/ emits a skill Markdown per pipeline for Claude Code / Desktop auto-discovery
  • MCP binding — same pipeline YAML projected to MCP tools for non-Claude hosts

6 Governance & lineage

  • Catalog with semantics — NL description on catalog / table / column; an agent-callable describe pipeline
  • Lineage capture — agent_id, session_id, tool_call_id, timestamp on writes; queryable from metadata tables
  • Agent identity passthrough — any binding injects client identity into a SQL context var pipelines can read
  • Snapshot-as-branch / agent checkpoints — Iceberg / Lance-backed; git checkout-like semantics for destructive agent experiments
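
As an illustration of the lineage capture item, the sketch below records `agent_id`, `session_id`, `tool_call_id`, and `timestamp` alongside each write and then queries them back. It uses Python's standard `sqlite3` module with an in-memory database. The table names `notes` and `write_lineage` and the helper `write_with_lineage` are hypothetical and do not reflect the project's actual metadata tables.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
# Hypothetical metadata table holding one lineage record per write.
conn.execute(
    """
    CREATE TABLE write_lineage (
        table_name   TEXT,
        row_id       INTEGER,
        agent_id     TEXT,
        session_id   TEXT,
        tool_call_id TEXT,
        timestamp    TEXT
    )
    """
)


def write_with_lineage(conn, body, agent_id, session_id, tool_call_id):
    """Insert a row and its lineage record in a single transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
        row_id = cur.lastrowid
        conn.execute(
            "INSERT INTO write_lineage VALUES (?, ?, ?, ?, ?, ?)",
            (
                "notes",
                row_id,
                agent_id,
                session_id,
                tool_call_id,
                datetime.now(timezone.utc).isoformat(),
            ),
        )
    return row_id


write_with_lineage(conn, "first note", "agent-1", "sess-1", "call-1")
write_with_lineage(conn, "second note", "agent-1", "sess-1", "call-2")
write_with_lineage(conn, "other note", "agent-2", "sess-9", "call-7")

# Which rows did agent-1 write, and through which tool calls?
rows = conn.execute(
    "SELECT row_id, tool_call_id FROM write_lineage "
    "WHERE agent_id = ? ORDER BY row_id",
    ("agent-1",),
).fetchall()
print(rows)
```

Writing the data row and its lineage record in one transaction means a write can never exist without a record of which agent, session, and tool call produced it.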

7 Ops

  • Session auth — drop-in user auth via better-auth backed by SQLite
  • Observability — OpenTelemetry traces / metrics / logs with a pre-configured Grafana stack
  • Docker + pre-built binaries — Linux x86_64 / ARM64, macOS ARM64