OpenCrow Architecture

A deep dive into the monorepo structure, modules, and internal flow.

Architecture Overview

OpenCrow is a monorepo built using Turbo, consisting of multiple applications and packages. This guide covers how the components interact and the core logic behind the agent execution engine.

Monorepo Structure

apps/app: The Next.js dashboard used by administrators to configure agents, define client-side tools, and view chat histories.
apps/backend: The Node.js (Express) backend server. This is the "brain" of OpenCrow, handling vector embeddings, database interactions, LLM orchestration, and Tool merging.
apps/website: The public-facing landing page and documentation site.
packages/widget: The React widget package that end-users embed into their applications to provide the chat interface.
sample-website: A Next.js demo application demonstrating how the widget functions in a real-world scenario (an e-commerce store).

The Tooling Engine

The core innovation in OpenCrow is its dual-layer tool execution capability.

Server-Side Tools

These are defined by providing a URL to an OpenAPI (Swagger) specification in the dashboard.

The backend downloads and parses the OpenAPI JSON/YAML.
It converts these API definitions into a format Native to the LLM (e.g., Gemini function declarations).
When the LLM decides to use one of these tools, it makes a request to the Backend.
The Backend then acts as a proxy, securely forwarding the request to the target server defined in the OpenAPI spec.

Client-Side Tools

These are custom tools defined directly in the OpenCrow Dashboard using a JSON schema. They are intended to manipulate the UI or state of the user's application.

The backend loads the JSON definitions of these client-side tools and merges them with the server-side tools.
Both sets of tools are passed simultaneously to the LLM context.
If the LLM triggers a Client-Side Tool, the backend recognizes it.
Instead of executing it (because the backend has no access to the browser window), the backend returns a specific payload back to the Widget.
The Widget then matches the tool name to the JavaScript function passed into its props and executes it locally in the user's browser.

Tech Stack Summary

Database: PostgreSQL with Prisma ORM.
Generative AI: Google Generative AI (@google/generative-ai) is the primary LLM engine.
Vector Database: LanceDB is used for document ingestion and retrieval (Knowledge Base).
Caching/State: Redis (via ioredis) handles rate limiting and potentially session states.
Validation: Zod is used for runtime payload validation across the stack.