Cube vs PlyDB

Cube (formerly Cube.js) is a universal semantic layer and agentic analytics platform. It sits between your data sources and every consumer of them, enforces consistent metric definitions through engineer-authored data models, and caches results for sub-second performance. PlyDB takes a different approach: agents connect to live sources directly and build semantic context themselves through OSI overlays that accumulate across sessions. Both have an Apache 2.0 open-source core, and both offer a native MCP server, although Cube's requires a Cube Cloud Premium or Enterprise plan while PlyDB's ships in the open-source binary. The core difference is the semantic model: engineer-defined and enforced upfront in Cube, or agent-built and compounding over time in PlyDB.


TL;DR

|  | Cube | PlyDB |
| --- | --- | --- |
| Semantic model approach | Engineer-authored YAML/JS: measures, dimensions, and joins defined upfront and enforced consistently across all consumers | Agent-authored OSI overlays: schema auto-discovered; agents accumulate business context across sessions that informs future queries |
| Pre-aggregation & query caching | Built-in relational caching engine that automatically builds condensed datasets for sub-second analytics at scale | No caching layer; queries execute live against source databases |
| Multi-protocol APIs | REST, GraphQL, SQL (Postgres wire protocol), MDX, and MCP: one semantic layer, any consumer | MCP and CLI, purpose-built for AI agent access |
| Built-in AI agents | AI Data Analyst and AI Data Engineer agents: natural language analytics and automated model authoring via Cube Agentic Analytics | Not applicable; PlyDB is the data layer and AI reasoning lives in your external agent |
| Agent integration (MCP) | Native MCP server, Premium and Enterprise plans only; exposes the semantic layer to MCP-compatible agents | Native MCP & CLI, available in the open-source binary, no paid plan required |
| Time to first query | Hours to days depending on data complexity; requires cube model authoring before agents or BI tools can query | Minutes; auto-discovers schema, agents start querying immediately and build context over time |
| Live operational DB access | Queries go through the semantic model; pre-aggregations add a refresh cycle between source changes and query results | Direct; agents query live data with no caching cycle in between |
| Cross-source queries | Supported via rollup joins across configured data sources, at the aggregated level, not arbitrary row-level JOINs | Arbitrary SQL JOIN across any connected source in one query |
| Semantic context for agents | Engineer-defined measures and dimensions surface to agents via the API: consistent, governed, requires upfront authoring | OSI overlays: agents auto-discover schema and write context that persists and compounds across sessions; advisory but substantive |
| Deployment complexity | Cube Core requires Kubernetes for production; Cube Cloud is fully managed SaaS | Single binary, one JSON config file; runs anywhere without cluster infrastructure |
| Open source | Apache 2.0 (Cube Core); Cube Cloud is proprietary SaaS | Apache 2.0 |
| Cost | Cube Core: free; Cube Cloud: free tier (1K queries/day), CCU-based paid plans; MCP on Premium+ | Open source (Apache 2.0) |

What each one does

Cube

Universal Semantic Layer & Agentic Analytics Platform

Cube is an open-source universal semantic layer that sits between your data sources and every consumer of them — BI tools, AI agents, embedded analytics, and custom applications. Data engineers define cubes, dimensions, measures, and joins in YAML or JavaScript; Cube enforces these definitions consistently across its REST, GraphQL, SQL (Postgres wire protocol), MDX, and MCP APIs. Its built-in relational caching engine materializes pre-aggregations for sub-second query performance at high concurrency. Cube Agentic Analytics (GA October 2025) adds AI Data Analyst and AI Data Engineer agents powered by Claude or a bring-your-own LLM. Cube Core is Apache 2.0; the MCP server requires a Cube Cloud Premium or Enterprise plan.
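
As a rough illustration of how a consumer might query the semantic layer over REST, the sketch below builds a request for Cube's `/cubejs-api/v1/load` endpoint. The host and port, the token, and the `orders` cube with its `total_revenue` measure and `status` dimension are hypothetical placeholders rather than names from any real deployment. The request is only constructed here, not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: replace with your own deployment, token, and model names.
CUBE_API_URL = "http://localhost:4000/cubejs-api/v1/load"
CUBE_API_TOKEN = "<jwt-token>"


def build_cube_request(measures, dimensions, limit=100):
    """Build (but do not send) a GET request carrying a Cube query as JSON."""
    query = {
        "measures": measures,      # e.g. ["orders.total_revenue"]
        "dimensions": dimensions,  # e.g. ["orders.status"]
        "limit": limit,
    }
    url = CUBE_API_URL + "?" + urlencode({"query": json.dumps(query)})
    return Request(url, headers={"Authorization": CUBE_API_TOKEN})


request = build_cube_request(["orders.total_revenue"], ["orders.status"])
print(request.full_url)
```

The point of the sketch is that the caller names a measure and a dimension instead of writing the aggregation SQL; what `total_revenue` actually computes is decided once, in the data model.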

PlyDB

AI Agent Database Gateway

PlyDB is an open-source gateway built from the ground up for AI agents. You declare your data sources (PostgreSQL, MySQL, SQLite, S3, files, Google Sheets) in a single JSON config file, and any AI agent connects immediately via native MCP or CLI, with no data modeling required before the first query. PlyDB auto-discovers schema and provides OSI-format overlays where agents record institutional knowledge, such as enum meanings, business rules, and domain context, that persists and compounds across sessions. PlyDB is read-only by design and ships as a single binary; the MCP server is included in the open-source release at no additional cost.
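
To make the idea of context that persists and compounds across sessions concrete, here is a minimal conceptual sketch. It is not PlyDB's implementation and does not use the actual OSI format: the `OverlayStore` class, the file layout, and the example table and column names are all hypothetical. It only shows the pattern, in which notes an agent writes in one session are stored on disk and are visible to the next.

```python
import json
import tempfile
from pathlib import Path


class OverlayStore:
    """Hypothetical file-backed store of agent-written notes about a schema."""

    def __init__(self, path):
        self.path = Path(path)
        # Load notes left by earlier sessions, if any.
        self.notes = json.loads(self.path.read_text()) if self.path.exists() else {}

    def annotate(self, target, note):
        """Record a note (enum meaning, business rule, ...) against a table or column."""
        self.notes.setdefault(target, [])
        if note not in self.notes[target]:
            self.notes[target].append(note)
        self.path.write_text(json.dumps(self.notes, indent=2))

    def context_for(self, target):
        """Return everything recorded so far about a table or column."""
        return list(self.notes.get(target, []))


overlay_path = Path(tempfile.mkdtemp()) / "overlay.json"

# Session 1: an agent learns what an enum value means and records it.
session_one = OverlayStore(overlay_path)
session_one.annotate("orders.status", "status = 3 means 'refunded'")

# Session 2: a fresh agent starts with that knowledge and adds more.
session_two = OverlayStore(overlay_path)
session_two.annotate("orders.status", "exclude refunded orders from revenue")
print(session_two.context_for("orders.status"))
```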


When to use each

Choose Cube when…

  • Metric consistency is non-negotiable — "Revenue" must mean the same thing whether a BI tool, an AI agent, or an embedded app is asking
  • Query performance at scale matters — pre-aggregations serve sub-second responses at high concurrency without hitting the warehouse on every request
  • You need to serve multiple consumer types from one semantic model — REST, GraphQL, SQL, MCP, and embedded analytics all from a single definition
  • You have data engineering resources to author and maintain the cube model — and want AI (Cube Copilot) to assist with that authoring
  • You want built-in AI agents for natural language analytics on top of a governed semantic layer
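
The pre-aggregation trade-off in the list above can be sketched with SQLite standing in for a warehouse. This is an analogy, not Cube's caching engine: the `orders` table, the `orders_rollup` table, and the helper functions are hypothetical. It shows a condensed rollup answering an aggregate query, and that the rollup reflects the source only as of its last refresh.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (status, amount) VALUES (?, ?)",
    [("paid", 100.0), ("paid", 50.0), ("refunded", 20.0)],
)


def refresh_rollup(conn):
    """Rebuild the condensed table: one row per status instead of one per order."""
    conn.execute("DROP TABLE IF EXISTS orders_rollup")
    conn.execute(
        "CREATE TABLE orders_rollup AS "
        "SELECT status, SUM(amount) AS revenue FROM orders GROUP BY status"
    )


def revenue_from_rollup(conn, status):
    """Answer from the rollup without touching the raw orders table."""
    row = conn.execute(
        "SELECT revenue FROM orders_rollup WHERE status = ?", (status,)
    ).fetchone()
    return row[0]


def revenue_live(conn, status):
    """Answer by aggregating the raw orders table at query time."""
    row = conn.execute(
        "SELECT SUM(amount) FROM orders WHERE status = ?", (status,)
    ).fetchone()
    return row[0]


refresh_rollup(conn)
conn.execute("INSERT INTO orders (status, amount) VALUES ('paid', 25.0)")

stale = revenue_from_rollup(conn, "paid")  # rollup was built before the insert
live = revenue_live(conn, "paid")          # includes the new order
refresh_rollup(conn)
fresh = revenue_from_rollup(conn, "paid")  # rollup rebuilt after the insert
print(stale, live, fresh)
```

The rollup is cheaper to read because it holds one row per status, but it lags the source until the next refresh; a live query is always current but does the aggregation work on every request.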

Choose PlyDB when…

  • You want agents querying live data immediately — no cube model to author, no pre-aggregation refresh cycle between source changes and query results
  • You want semantic context that builds from real agent sessions — OSI overlays accumulate institutional knowledge across conversations without requiring a data engineer to model it first
  • Arbitrary SQL JOINs across multiple sources are required — not just aggregated rollup joins between cubes
  • MCP should be available without a paid plan — Cube's MCP server requires Premium or Enterprise
  • You want a single binary with no cluster infrastructure — Cube Core production deployments require Kubernetes
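
To illustrate what a row-level JOIN across sources looks like, as opposed to joining pre-aggregated rollups, the sketch below uses SQLite's `ATTACH DATABASE` to treat two separate databases as two sources. This is only an analogy for the query shape. It says nothing about how PlyDB executes cross-source queries, and the `crm` alias and the `invoices` and `customers` tables are hypothetical.

```python
import sqlite3

# The main connection plays the role of one source (billing data)...
conn = sqlite3.connect(":memory:")
# ...and an attached in-memory database plays the role of a second source (CRM data).
conn.execute("ATTACH DATABASE ':memory:' AS crm")

conn.execute("CREATE TABLE main.invoices (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main.invoices VALUES (?, ?)", [(1, 40.0), (2, 15.0), (1, 5.0)]
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)", [(1, "Acme"), (2, "Globex")]
)

# One query, individual rows from both databases joined on a key.
rows = conn.execute(
    """
    SELECT c.name, i.amount
    FROM main.invoices AS i
    JOIN crm.customers AS c ON c.id = i.customer_id
    ORDER BY c.name, i.amount
    """
).fetchall()
print(rows)
```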

These tools can complement each other. Teams running Cube for governed BI and multi-consumer analytics can deploy PlyDB alongside it as the agent gateway to operational databases and ad-hoc sources that don't belong in a cube model — sources that still need to be reachable by agents without going through a modeling phase first.