DeepEye

Steerable Self-Driving
Data Agent System

A production-ready data agent system with a workflow-centric architecture. Ask one question, connect your data sources, and DeepEye orchestrates complex workflows to generate Data Videos, Dashboards, and Analytical Reports.

SIGMOD Companion '26 · Workflow-Centric Architecture · Production-Ready
DeepEye: One question, three professional outputs

Data is everywhere. Tools still treat it separately.

In real organizations, data lives in databases, Excel files, and internal systems — but most tools still treat these sources separately. People want AI that is not just powerful, but trustworthy, transparent, and auditable.

Gartner AI Spending Report
Market Scale · Gartner, Sep 2025

Worldwide AI Spending Will Total $1.5 Trillion in 2025

AI Application Software is forecast to grow from $83B to $270B by 2026 — a 3× expansion. The intelligent data analysis market is entering hypergrowth.

EU AI Act Article 13
Regulatory Mandate · EU AI Act

High-Risk AI Must Be Transparent & Auditable by Law

Article 13 requires that high-risk AI systems be "sufficiently transparent" for deployers to interpret and use their output. Black-box AI does not meet that bar; DeepEye's inspectable workflow engine is built to.

Harvard Business Review
Industry Pain Point · HBR, Sep 2025

AI Is Reinforcing Data Silos — Not Breaking Them

Current AI tools worsen data silos: each department deploys its own AI. DeepEye's unified orchestration breaks down these walls enterprise-wide.

Existing tools were not designed for real data work.

There are three big problems. Data is fragmented across formats. Complex tasks overwhelm single-agent systems. And many AI systems are still black boxes — users cannot clearly see or control the process.

The Heterogeneity Gap
Challenge 1

SQL, RAG, and spreadsheet tools exist separately. No single tool orchestrates all three — analysts must manually bridge the gaps.

The Context Explosion
Challenge 2

Single-agent systems overload one context window. As complexity grows, models hallucinate and forget constraints, producing unreliable results.

The Black Box Problem
Challenge 3

AI agents are opaque: no plan inspection, no step validation. Sequential execution is slow and untrustworthy. You can't understand why results are wrong.

One question. Three professional outputs.

See how one question becomes a complete workflow — and turns into three professional outputs automatically.

DeepEye Full Demo

Analytical Report

Executive summary, key findings, and trend analysis.
Not just formatted data — real written intelligence.

  • Executive summary and key findings auto-written and structured
  • Charts and insights connected to system conclusions
  • Draws on databases, Excel files, and JSON sources simultaneously
Auto-written · Multi-source · Export-ready

Interactive Dashboard

Filter, drill down, and explore the data directly.
No analyst needed — DeepEye generates it automatically.

  • Real-time filters update all charts instantly
  • Auto-generated trend lines, maps, and visualizations
  • Drill down from summary to individual data points instantly
Zero code · Live data · Real-time filters

Narrated Data Video

Data Video

Animated charts, narration, and subtitles — automatically.
One of our most distinctive features. Insights become easy to share.

  • Charts animate in real time with auto-generated narration
  • Subtitles synchronized frame-perfectly with the audio
  • Professional broadcast-quality visual design
Auto-narrated · Animated charts · Broadcast quality

Simple for the user. Structured underneath.

The user types a question and connects data sources. DeepEye plans, validates, and executes — transparently, step by step.

DeepEye Workflow Engine
1. Type your question
Bind any data source with @: databases, Excel files, or JSON data.
2. DeepEye plans the workflow
It decomposes the question into sub-tasks, builds a DAG, and schedules parallel execution automatically.
3. Validated before it runs
Every node is compiled, validated, and optimized, so errors are caught before they reach your data.
4. Three outputs delivered
Dashboard, report, and video are generated in parallel. Ready instantly.
5. You stay in control
Inspect, edit, and re-run any workflow. Drag a node to reorder: lines re-route and the output updates instantly.

Three innovations. One coherent system.

DeepEye introduces three key innovations from our SIGMOD 2026 paper, each addressing a fundamental limitation at the systems level.

The Heterogeneity Gap — data silos problem
Innovation I
Unified Multimodal Orchestration

A formal Unified Node Protocol N = ⟨D, I, O, C, Φ⟩ bridges heterogeneous components (AgentNodes/ToolNodes), enabling seamless joint analysis of databases, Excel, JSON, and Knowledge Bases.

Text-to-SQL · Node Protocol · Multi-source Join
Context Explosion — single agent overload
Innovation II
Hierarchical Reasoning with Context Isolation

A Memory-Augmented Planner with dual-memory architecture (Working Memory + Knowledge Base) decomposes intents into context-isolated sub-agents, with retrieval-augmented planning and runtime self-correction.

Context Isolation · Dual Memory · Self-Correction
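
To make context isolation concrete, here is a minimal Python sketch. It is not DeepEye's implementation: the `SubAgent` class, the `needs` list, and the `run_plan` helper are hypothetical names, and a plain function stands in for the LLM call. It only illustrates the idea that each sub-agent sees its own task plus explicitly declared inputs, and does not cover retrieval-augmented planning or self-correction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SubAgent:
    """Runs one sub-task with a private context (a stand-in for W_local)."""
    task: str
    needs: List[str]                    # keys it may read from working memory
    act: Callable[[List[str]], str]     # stand-in for an LLM call (assumption)
    local_context: List[str] = field(default_factory=list)

    def run(self, working_memory: Dict[str, str]) -> str:
        # Only the declared inputs are copied in; nothing else leaks across agents.
        self.local_context = [self.task] + [working_memory[k] for k in self.needs]
        return self.act(self.local_context)


def run_plan(agents: Dict[str, SubAgent]) -> Dict[str, str]:
    """Run sub-agents in order, sharing results only through working memory."""
    working_memory: Dict[str, str] = {}
    for name, agent in agents.items():
        working_memory[name] = agent.run(working_memory)
    return working_memory


def fake_llm(context: List[str]) -> str:
    # Placeholder behaviour: report how much context the agent received.
    return f"done({len(context)} items in context)"


agents = {
    "sql": SubAgent("Write SQL for monthly revenue", [], fake_llm),
    "excel": SubAgent("Summarize the budget sheet", [], fake_llm),
    "report": SubAgent("Compare revenue with budget", ["sql", "excel"], fake_llm),
}
memory = run_plan(agents)
print(agents["excel"].local_context)   # the Excel agent never sees the SQL task
```

The design point is that the context each sub-agent receives grows with its declared inputs, not with the size of the whole workflow.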
Black Box — unreliable and slow AI systems
Innovation III
Database-Inspired Workflow Engine

Inspired by DBMS query engines, the Workflow Engine processes DAGs through four phases: Compilation, Validation (static analysis), Optimization (Kahn's Algorithm for parallel scheduling), and Execution.

Compile + Validate · Kahn's Algorithm · Parallel Execution
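
The Optimization phase can be illustrated with a short Python sketch of Kahn's Algorithm that groups nodes into execution layers. The function name `execution_layers` and the example node names are assumptions made for illustration; the source only states that Kahn's Algorithm is used for parallel scheduling.

```python
from collections import defaultdict


def execution_layers(nodes, edges):
    """Group DAG nodes into layers whose members do not depend on each other.

    nodes: iterable of node ids; edges: iterable of (upstream, downstream) pairs.
    Raises ValueError if the graph contains a cycle.
    """
    indegree = {n: 0 for n in nodes}
    children = defaultdict(list)
    for src, dst in edges:
        children[src].append(dst)
        indegree[dst] += 1

    # Kahn's algorithm, processed one frontier (layer) at a time.
    frontier = sorted(n for n, d in indegree.items() if d == 0)
    layers, seen = [], 0
    while frontier:
        layers.append(frontier)
        seen += len(frontier)
        next_frontier = []
        for node in frontier:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_frontier.append(child)
        frontier = sorted(next_frontier)
    if seen != len(indegree):
        raise ValueError("workflow graph contains a cycle")
    return layers


# Hypothetical workflow: two sources feed one query, which feeds three outputs.
edges = [("load_sales", "query"), ("load_excel", "query"),
         ("query", "dashboard"), ("query", "report"), ("query", "video")]
nodes = {n for edge in edges for n in edge}
layers = execution_layers(nodes, edges)
print(layers)
# [['load_excel', 'load_sales'], ['query'], ['dashboard', 'report', 'video']]
```

Every node in a layer has all of its upstream dependencies satisfied by earlier layers, so a scheduler can run the members of a layer at the same time.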

Key technical contributions from our SIGMOD 2026 paper.

DeepEye adopts a workflow-centric architecture that bridges flexible LLM reasoning and rigid data engineering, structured into four vertically integrated layers.

Unified Node Protocol
N = ⟨D, I, O, C, Φ⟩
D: Node Semantic Description, used for semantic retrieval and tool selection
I/O: Typed Input/Output Ports with schema triplets ⟨Key, Type, Desc⟩
Φ: Execution logic, either deterministic (ToolNodes) or probabilistic (AgentNodes)
Two Node Types
ToolNodes
Deterministic operators. O = f(I, C). Idempotent, no reasoning context needed.
AgentNodes
LLM-driven sub-agents, each with a private context window W_local that keeps its context isolated.
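
As a rough illustration of the protocol, the tuple N = ⟨D, I, O, C, Φ⟩ could be modeled in Python as below. This is a sketch under assumptions, not the paper's code: the class and field names are invented, and C is assumed here to be static node configuration, which this page does not define. The example is a ToolNode, so its logic is a plain deterministic function of inputs and configuration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class Port:
    """One schema triplet <Key, Type, Desc> for an input or output port."""
    key: str
    type: type
    desc: str


@dataclass
class Node:
    """Sketch of N = <D, I, O, C, Phi>; all field names are illustrative."""
    description: str          # D: semantic description
    inputs: List[Port]        # I: typed input ports
    outputs: List[Port]       # O: typed output ports
    config: Dict[str, Any]    # C: assumed to be static configuration
    logic: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]]  # Phi

    def run(self, values: Dict[str, Any]) -> Dict[str, Any]:
        # Check inputs against the declared port types before executing.
        for port in self.inputs:
            if not isinstance(values.get(port.key), port.type):
                raise TypeError(f"input {port.key!r} must be {port.type.__name__}")
        result = self.logic(values, self.config)
        # Check outputs the same way, so downstream nodes can rely on them.
        for port in self.outputs:
            if not isinstance(result.get(port.key), port.type):
                raise TypeError(f"output {port.key!r} must be {port.type.__name__}")
        return result


def top_k(values, config):
    """Deterministic ToolNode logic: O = f(I, C)."""
    rows = sorted(values["rows"], key=lambda r: r[config["by"]], reverse=True)
    return {"top_rows": rows[: config["k"]]}


tool_node = Node(
    description="Return the k rows with the largest value in one column",
    inputs=[Port("rows", list, "Rows as dictionaries")],
    outputs=[Port("top_rows", list, "The k highest-ranked rows")],
    config={"by": "revenue", "k": 2},
    logic=top_k,
)
rows = [{"region": "N", "revenue": 5}, {"region": "S", "revenue": 9},
        {"region": "E", "revenue": 7}]
result = tool_node.run({"rows": rows})
print(result["top_rows"])
```

An AgentNode would fit the same shape, with `logic` wrapping an LLM call instead of a pure function.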
Workflow Engine (4 Phases)
1. Compilation
Parses the logical JSON plan into a structural DAG object, resolves variable references, and instantiates Node classes.
2. Validation (Static Analysis)
Runs cycle detection (DFS), schema consistency checks (type checking), and completeness verification.
3. Optimization (Runtime)
Kahn's Algorithm groups nodes into Execution Layers for automatic parallel execution.
4. Execution
An async scheduler (Celery/Redis) dispatches layers, handles retries, captures logs, and triggers self-correction.
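
The Validation phase described above can be sketched in a few lines of Python. The plan layout used here (a `nodes` map with typed ports and an `edges` list) is a hypothetical format chosen for the example, not DeepEye's actual JSON schema. The sketch covers the three named checks: DFS cycle detection, type checking across edges, and completeness of input wiring.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [successors]} has a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the DFS stack, finished
    colour = {n: WHITE for n in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:             # back edge: nxt is still on the stack
                return True
            if state == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in graph)


def validate(plan):
    """Return a list of error messages; an empty list means the plan passed."""
    nodes, errors = plan["nodes"], []
    graph = {name: [] for name in nodes}
    for edge in plan["edges"]:
        (src, out_key), (dst, in_key) = edge["from"], edge["to"]
        if src not in nodes or dst not in nodes:
            errors.append(f"edge references unknown node: {src} -> {dst}")
            continue
        graph[src].append(dst)
        out_type = nodes[src]["outputs"].get(out_key)
        in_type = nodes[dst]["inputs"].get(in_key)
        if out_type is None or in_type is None:
            errors.append(f"unknown port on {src}.{out_key} -> {dst}.{in_key}")
        elif out_type != in_type:         # schema consistency
            errors.append(f"type mismatch on {src}.{out_key} -> {dst}.{in_key}: "
                          f"{out_type} vs {in_type}")
    wired = {tuple(e["to"]) for e in plan["edges"]}
    for name, spec in nodes.items():      # completeness: every input is fed
        for in_key in spec["inputs"]:
            if (name, in_key) not in wired:
                errors.append(f"input {name}.{in_key} is not connected")
    if has_cycle(graph):
        errors.append("workflow contains a cycle")
    return errors


plan = {
    "nodes": {
        "query": {"inputs": {}, "outputs": {"rows": "table"}},
        "chart": {"inputs": {"data": "table"}, "outputs": {"figure": "image"}},
        "report": {"inputs": {"figure": "image", "summary": "text"}, "outputs": {}},
    },
    "edges": [
        {"from": ("query", "rows"), "to": ("chart", "data")},
        {"from": ("chart", "figure"), "to": ("report", "figure")},
        {"from": ("query", "rows"), "to": ("report", "summary")},  # table into text
    ],
}
errors = validate(plan)
print(errors)
```

Because all three checks run on the plan alone, a bad workflow is rejected before any node touches a data source.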
DeepEye-SQL: Text-to-SQL Engine
SIGMOD Companion 2026
DeepEye: A Steerable Self-driving Data Agent System
Boyan Li, Yiran Peng, Yupeng Xie, Sirong Lu, Yizhang Zhu, Xing Mu, Xinyu Liu, and Yuyu Luo
SIGMOD Companion '26, May 31–June 05, 2026, Bengaluru, India

DeepEye improves over time.

It stores schema knowledge and successful workflows — reusing experience, retrieving the right context, and becoming smarter with continued use.

Document Memory
📄
Data Memory
Table schemas, field definitions, and business metadata indexed and auto-retrieved for relevant questions.
Schema Context
📌
Table names, field descriptions, and business definitions stored. DeepEye always knows your data.
Reusable Templates
Successful workflows are saved as templates and can be re-run with one click.

Not just a demo — built for real deployment.

We are not only thinking about intelligence. We are also thinking about security, control, and enterprise use.

🐳
One-Click Docker Deployment
Deploy the entire stack with one Docker command — on-premise or any cloud.
🔒
Sandboxed Code Execution
All generated code runs in isolated containers — no unauthorized access to the host system or network.
👤
User Authentication & Access Control
Role-based access control: analysts see only authorized data sources, and every query is attributed to a user.
Parallel Execution Engine
Independent nodes run simultaneously via topological scheduling — turning sequential hours into parallel seconds.
🌐
Unified Multi-Source Integration
Connect PostgreSQL, MySQL, Excel, JSON, and REST APIs in one session — no custom connectors needed.
📋
Fully Auditable Workflow History
Every query, decision, and output is logged with full provenance — every result traceable to its source.
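
To show how parallel execution and an auditable history can fit together, here is a small Python sketch. It uses the standard-library `asyncio` as a stand-in for the Celery/Redis scheduler the system is described as using, and every name in it (`run_node`, `run_layers`, the node names and output file names) is hypothetical. Nodes in the same layer run concurrently, failed nodes are retried, and each attempt is appended to a log.

```python
import asyncio


async def run_node(name, action, audit_log, retries=2):
    """Run one node, retrying on failure, and record every attempt."""
    for attempt in range(1, retries + 2):
        try:
            result = await action()
            audit_log.append({"node": name, "attempt": attempt, "status": "ok"})
            return result
        except Exception as exc:
            audit_log.append({"node": name, "attempt": attempt,
                              "status": f"error: {exc}"})
    raise RuntimeError(f"node {name!r} failed after {retries + 1} attempts")


async def run_layers(layers, actions, audit_log):
    """Run layers in order; nodes inside one layer run concurrently."""
    results = {}
    for layer in layers:
        outputs = await asyncio.gather(
            *(run_node(n, actions[n], audit_log) for n in layer))
        results.update(zip(layer, outputs))
    return results


attempts = {"report": 0}


async def query():
    return "rows"


async def dashboard():
    await asyncio.sleep(0.01)           # simulated work
    return "dashboard.html"


async def report():
    attempts["report"] += 1
    if attempts["report"] == 1:         # fail once to exercise the retry path
        raise TimeoutError("transient failure")
    return "report.pdf"


actions = {"query": query, "dashboard": dashboard, "report": report}
audit_log = []
results = asyncio.run(
    run_layers([["query"], ["dashboard", "report"]], actions, audit_log))
print(results)
for entry in audit_log:
    print(entry)
```

In a real deployment the log entries would also carry the user, inputs, and timestamps needed for provenance; they are omitted here to keep the sketch short.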

Researched, protected, and validated.

DeepEye has received strong external recognition — through awards, patents, and academic publication.

AI Agent 2025 Best Open Source Project Award
Competition Award
Best Open Source Project
AI Agent 2025
Best Project Award
5 Patents Granted
Intellectual Property
China National Intellectual Property Administration
Autonomous Data Analysis Method & System
DeepEye Demo Paper
Academic Publications
10+ Publications
Published in top-tier AI and database conferences,
including SIGMOD, VLDB, ICLR, ICML and NeurIPS.
HKUST(GZ) DIAL Lab
Data Intelligence and Analytics Lab, HKUST(GZ)