v1.0 — open source — MIT

The knowledge base
that thinks

Documents go in. Entities get extracted. Relations get mapped. Contradictions get flagged. You ask questions — get grounded answers with source citations.

$ curl -X POST /knowledge/search -d '{"query":"architecture decisions"}'
# hybrid search: BM25 + vector + RRF fusion + graph expansion + re-ranking
[1] Architecture Decisions    rrf:0.0328 — doc-ff68cc39
[2] Technology Stack          rrf:0.0315 — doc-aaf5084c
[3] Team and Organization     rrf:0.0303 — doc-3c3971e4
Sources: [1] Architecture Decisions [2] Technology Stack
The big picture

From documents to decisions.
Here's what actually happens.

A B2B team adds their product docs, meeting notes, and architecture decisions. Here's the system in action:

EXAMPLE: Your team adds 50 product documents
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

STEP 1 ▸ INGEST
  PRD.md
  architecture-v2.md      ──▸ index
  meeting-2026-03-15.md       chunk
  onboarding-guide.md         embed
  api-reference.md

STEP 2 ▸ UNDERSTAND
  Entities extracted:
    [person]   Sarah Chen (CTO)
    [tech]     React, FastAPI, Redis
    [decision] "Use SQLite over PG"
    [concept]  Multi-tenancy
    ... 45 more
  Relations mapped:
    Sarah   ──created──▸ Architecture v2
    Redis   ──replaces──▸ Memcached
    FastAPI ──uses──▸ Python

STEP 3 ▸ DETECT
  Conflicts found:
    PRD says "Redis" but meeting notes say "Memcached"
  Synthesis compiled:
    "Redis" → 4 mentions across 3 documents
    → synthesis page ready

STEP 4 ▸ ASK
  Q: "Who decided on the cache layer and what were the tradeoffs?"
  A: Sarah Chen decided in the March meeting to use Redis over
     Memcached for cluster support.
     [1] meeting-2026-03-15.md
     [2] architecture-v2.md
     [3] PRD.md

STEP 5 ▸ AGENTS ACT
  Researcher searches KB → finds 3 relevant docs → reads them
  Researcher delegates to Writer → "draft a summary of cache decisions"
  Writer creates new document → auto-triggers entity extraction
  Knowledge Compiler updates synthesis page for "Redis" entity
  Auditor runs weekly lint → flags the Redis/Memcached conflict as critical
Real-world demo

Feen — a real company
running on pixl-kb.

We populated a workspace with every document from Feen, a real Belgian B2B SaaS built by Pixl SRL in Brussels. Product, engineering, sales, legal, operations — 25 markdown files. Then ran the full pipeline and captured what came out.

25     Documents
6      Categories
362    Chunks indexed
50     Entities
4      Conflicts found
55/55  Tests passing

Product — 6 docs

Overview, Invoicing, Document AI, Banking, Pricing (€19 Standard), Roadmap 2026.

Engineering — 6 docs

Architecture, DB schema, AI agents, Peppol integration, CI/CD, March 2026 incident postmortem.

Sales & Marketing — 4 docs

Positioning, sales playbook, competitive analysis, customer research.

Operations — 4 docs

Company overview, monthly VAT process, Q1 2026 board update, Q2 OKRs.

Customer Success — 3 docs

Onboarding checklist, FAQ, Peppol readiness guide.

Legal & Compliance — 2 docs

GDPR & data residency, security policy.

Contradictions the graph surfaced
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

⚠ CRITICAL  Peppol mandate timing
  Readiness Guide : "Jan 1, 2026 hard deadline"
  Q2 2026 OKRs    : "phased rollout through 2026"
  auto-flagged on knowledge graph

⚠ WARNING   Pricing tier drift
  Pricing doc     : only €19 Standard plan
  Q1 board update : "€39 Pro tier planned Q3"
  surfaced in conflicts drawer

⚠ WARNING   AI accuracy claim drift
  Product overview: "99% accuracy"
  AI Agents doc   : "97.8% / 94% benchmarks"
  marketing vs engineering mismatch

⚠ INFO      Entity description drift
  "Sarah Chen" described as CTO in one doc and engineering lead in another
  linter auto-detection
Workflow: Research Report
━━━━━━━━━━━━━━━━━━━━━━━━━━
prompt  "Analyze NovaTech's competitive position in the CI/CD market"
agent   researcher
status  completed
tokens  57,524
nodes   gather → analyze → report
gate    human approval passed

RAG pipeline (all stages on)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━
chunking  header
embed     all-MiniLM-L6-v2 (384d, local)
dense     on
rerank    on (top-5)
query T   on
graph     on

Git-backed versioning worked

Every document save became a real git commit in the workspace repo. Diff viewer, restore, and history — same as a real IDE.

Hybrid search returned real scores

RRF-fused BM25 + vector. "How does Feen extract data from a Belgian invoice?" returned the Pricing FAQ with a fused score of 11.52 — correct source, first hit.
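The fusion step itself is a small formula: each ranked list contributes 1/(k + rank) per document, and the contributions are summed. A minimal sketch in Python, assuming the conventional k = 60 (the project's actual constant and weighting are not stated here; the function name is illustrative):

```python
def rrf_fuse(rankings, k=60):
    """Merge ranked lists of doc IDs via Reciprocal Rank Fusion.

    Each list contributes 1/(k + rank) for every document it contains;
    documents ranked highly in several lists accumulate the most score.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy example: BM25 and dense retrieval each return a ranked list.
bm25 = ["doc-a", "doc-b", "doc-c"]
dense = ["doc-a", "doc-d", "doc-b"]
fused = rrf_fuse([bm25, dense])  # doc-a, ranked first in both, wins
```

Because RRF works only on ranks, BM25 scores and cosine similarities never need to share a scale, which is the main reason it is the standard choice for hybrid search.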

4 agents with memory + goals

Document Extractor, Cash Flow Analyst, Peppol Compliance Bot, Customer Success Assistant. Persistent key-value memory, goal tracking with progress bars.

55/55 API tests passed

Full regression suite across auth, workspaces, categories, documents, knowledge, agents, channels, RAG config, evaluation — every endpoint green.

1 embedded app rendered

An HTML financial dashboard showing Q1 2026 MRR €18,240 / ARR €218,880 / 14-month runway — loaded straight inside the console.

The company behind the data

Pixl SRL — Brussels, Belgium

Founded 2024 by Hamza Mounir. Builds Feen, an AI-powered accounting automation platform for Belgian SMEs. €19/month Standard tier, 14-day free trial, GDPR and PSD2 compliant, Peppol-ready for the January 2026 B2B mandate.

FastAPI · PostgreSQL · Next.js 15 · React 19 · GPT-4o · GPT-4o Vision · Cloud Run · Peppol · UBL XML · PSD2
Legal name  Pixl SRL
VAT         BE 0805.449.693
Stage       Seed · 2024
Product     Feen — €19/mo
AI capabilities

What the AI actually does

Not "AI-powered" marketing fluff. Here's every concrete AI capability, what model runs it, and why.

Capability · Model · How it works

Entity Extraction · Haiku · Reads a document, outputs structured JSON with entities, types, relations, confidence
Query Rewriting · Haiku · Expands ambiguous queries with synonyms and related terms before search
Query Decomposition · Haiku · Splits complex multi-hop questions into 2-4 sub-queries, runs each, merges via RRF
HyDE · Haiku · Generates a hypothetical answer, embeds it, searches document-to-document instead of query-to-document
Re-ranking · Haiku · Scores each search result 0-10 for relevance, batched 8 per call, re-sorts by score
Self-RAG · Haiku · Assesses whether retrieved context is sufficient, insufficient, or irrelevant before generating an answer
Context Compression · Haiku · Summarizes low-ranked context chunks when over token budget instead of hard truncation
Synthesis Compilation · Haiku · Reads all mentions of an entity across docs, writes a compiled topic summary page
Agent Chat · Sonnet · Multi-turn conversations, 13 tools via function calling, memory-aware persona
RAGAS Evaluation · Haiku · Scores faithfulness, relevance, precision, recall of RAG pipeline output (0-1 each)
Dense Embeddings · MiniLM (local) · 384-dimensional vectors for semantic search, normalized, cosine similarity, stored as BLOBs
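The last capability (normalized vectors compared by cosine similarity and stored as BLOBs) can be sketched with the standard library alone. The helper names below are illustrative, not the project's API; the key property is that once vectors are normalized, a plain dot product is the cosine similarity:

```python
import math
import struct

def to_blob(vec):
    """Normalize a vector and pack it as float32 bytes (a SQLite BLOB)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return struct.pack(f"{len(vec)}f", *(x / norm for x in vec))

def from_blob(blob):
    """Unpack float32 bytes back into a list of floats."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))

def cosine(a_blob, b_blob):
    """Dot product of two unit vectors == their cosine similarity."""
    return sum(x * y for x, y in zip(from_blob(a_blob), from_blob(b_blob)))
```

Normalizing at write time keeps query-time scoring to a single dot product per stored embedding, which is why it is a common choice for local, CPU-only search.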
How search works

Five stages. Each one toggleable.

Every query flows left to right through the pipeline. Disable any stage per workspace via feature flags.

Transform

Classify. Rewrite. Decompose multi-hop. Generate HyDE.

Hybrid Search

BM25 sparse + vector dense. Reciprocal Rank Fusion.

Graph Expand

Entity BFS. Relation traversal. Surface missed docs.

Re-rank

LLM scores 0-10. Batched. Results re-sorted.

Citations

[1][2] refs. Neighbor chunks. Token budget.
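The re-rank stage above (LLM scores 0-10, batched, results re-sorted) reduces to simple control flow. A sketch, with `score_fn` standing in for the real batched LLM call and the batch size of 8 taken from the capability table:

```python
def rerank(results, score_fn, batch_size=8, top_k=5):
    """Score results in batches, then re-sort by score and keep top_k.

    score_fn takes a batch of results and returns one score per result;
    in the real pipeline this would be a single LLM call per batch.
    """
    scored = []
    for i in range(0, len(results), batch_size):
        batch = results[i:i + batch_size]
        scored.extend(zip(batch, score_fn(batch)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored[:top_k]]

# Toy usage with a deterministic stand-in scorer (longer ID = higher score).
docs = [f"doc-{i}" for i in range(12)]
top = rerank(docs, lambda batch: [len(d) for d in batch])
```

Batching matters because it turns N scoring calls into N/8, trading a little prompt complexity for a large latency win.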

Knowledge engine

Not a folder of files.
A living knowledge graph.

Every document feeds the graph. Entities link. Contradictions surface. Synthesis pages compile automatically.

Entity Extraction

8 types via Claude Haiku. Person, technology, concept, decision, organization, process, project, metric. Confidence scores + aliases.

Typed Relations

uses, depends_on, part_of, created_by, replaces, contradicts. Weighted edges with evidence. BFS traversal.
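The BFS traversal over typed edges can be sketched in a few lines; the `(src, relation, dst)` tuple shape and function name are assumptions for illustration, not the project's schema:

```python
from collections import deque

def graph_expand(edges, start, max_hops=2):
    """Breadth-first traversal from start, out to max_hops hops.

    edges: iterable of (src, relation, dst) triples.
    Returns (node, relation, depth) for every newly reached node.
    """
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
    seen, frontier, reached = {start}, deque([(start, 0)]), []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for rel, dst in adj.get(node, []):
            if dst not in seen:
                seen.add(dst)
                reached.append((dst, rel, depth + 1))
                frontier.append((dst, depth + 1))
    return reached
```

In retrieval, the nodes reached this way point at documents that mention related entities, which is how graph expansion surfaces results that neither BM25 nor the vector index ranked highly.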

Conflict Detection

Two docs say different things about the same entity? Flagged automatically. Severity: info, warning, critical.
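The core of conflict detection is grouping claims about the same entity and flagging disagreement. A minimal sketch under an assumed `(doc, entity, attribute, value)` claim shape (the real system extracts these via the LLM; the function name is illustrative):

```python
def detect_conflicts(claims):
    """Group claims by (entity, attribute); flag any with differing values.

    claims: iterable of (doc, entity, attribute, value) tuples.
    """
    by_key = {}
    for doc, entity, attr, value in claims:
        by_key.setdefault((entity, attr), set()).add((doc, value))
    conflicts = []
    for (entity, attr), pairs in by_key.items():
        if len({v for _, v in pairs}) > 1:
            conflicts.append({"entity": entity, "attribute": attr,
                              "claims": sorted(pairs)})
    return conflicts

# Toy example mirroring the Redis/Memcached case above.
claims = [("PRD.md", "cache layer", "choice", "Redis"),
          ("meeting-2026-03-15.md", "cache layer", "choice", "Memcached")]
found = detect_conflicts(claims)
```

Severity (info, warning, critical) would then be assigned per conflict, e.g. by attribute type or by how load-bearing the entity is.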

Synthesis Pages

Knowledge Compiler reads all mentions of an entity, calls LLM, writes a compiled summary. Stored in .synthesis/. Updates when sources change.

Knowledge Linter

Stale docs. Orphan entities. Unresolved conflicts. Pending proposals. Low-confidence extractions. Health report on demand.

RAGAS Evaluation

Faithfulness, relevance, precision, recall. Auto-generated Q&A pairs. Benchmark runner. Measure your RAG — don't guess.

Knowledge Graph
━━━━━━━━━━━━━━━━━━━━━━━━━
Pixl ──created_by──▸ Hamza Mounir
├──uses──▸ FastAPI
│   └──part_of──▸ Python
├──uses──▸ SQLite
│   └──uses──▸ FTS5
├──uses──▸ Claude API
│   ├── Haiku (extraction)
│   └── Sonnet (agents)
└──uses──▸ sentence-transformers
    └── all-MiniLM-L6-v2
        384d, local CPU
nodes: 38  edges: 44  types: 8
Documents

Write. Version. Search. Import.

Real editor. Real version control. Not a markdown textarea.

# Architecture Decisions
├── git log --oneline
│   a2b6be8 refactor: split modules
│   18c9567 feat: hybrid search
│   e7a7ec9 feat: re-ranking
├── entities extracted: 14
├── relations created: 16
└── status: published

Git-backed versioning

Every save is a commit. Full history. Diff viewer. Restore any version. Each workspace has its own git repo — no shared state.

auto-commit · diff · restore
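Save-as-commit is straightforward to sketch with the git CLI. This assumes `git` is on PATH; the helper name and layout are illustrative, not the project's code:

```python
import pathlib
import subprocess
import tempfile

def save_and_commit(repo, filename, content, message):
    """Write a document and record the save as a git commit in the repo."""
    (pathlib.Path(repo) / filename).write_text(content)
    subprocess.run(["git", "-C", repo, "add", filename], check=True)
    subprocess.run(["git", "-C", repo, "commit", "-m", message], check=True)

# One repo per workspace, as the text describes: init a fresh one and save.
repo = tempfile.mkdtemp()
subprocess.run(["git", "init", repo], check=True)
subprocess.run(["git", "-C", repo, "config", "user.email", "kb@example.com"], check=True)
subprocess.run(["git", "-C", repo, "config", "user.name", "kb"], check=True)
save_and_commit(repo, "doc.md", "# Hello\n", "docs: first save")
```

With real commits underneath, diff, restore, and history come for free from standard git plumbing instead of a bespoke version table.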
Tiptap Editor
━━━━━━━━━━━━━━━━━
/ slash commands
  heading · bullet list · code block
  image · table · blockquote
Formats: .md .pdf .csv
Cmd+K  command palette
Cmd+/  AI panel
Cmd+?  entity sidebar

Rich editor + viewers

Tiptap WYSIWYG with slash commands. Images, tables, code blocks. Upload PDFs and CSVs — inline viewers render them. Cmd+K command palette for everything.

tiptap · pdf · csv · markdown
$ POST /knowledge/ingest/url
  { "url": "https://docs.example.com/api" }
  fetch HTML
  convert to markdown
  create document
  run knowledge pipeline
  extract entities
  index into FTS5 + embeddings
Done. One API call.

URL import pipeline

Paste a URL. System fetches, converts to markdown, creates a document, indexes it, extracts entities, links the graph. One click. SSRF-protected.

import · html→md · pipeline
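The HTML-to-markdown step can be illustrated with a toy stdlib converter that handles only headings and paragraphs; the real pipeline presumably uses a fuller converter, and the class name here is an invention for the sketch:

```python
from html.parser import HTMLParser

class HtmlToMarkdown(HTMLParser):
    """Toy HTML-to-markdown: h1-h3 become #-headings, p becomes a paragraph."""

    def __init__(self):
        super().__init__()
        self.out, self.prefix = [], ""

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self.prefix = "#" * int(tag[1]) + " "  # h2 -> "## "
        elif tag == "p":
            self.prefix = ""

    def handle_data(self, data):
        if data.strip():
            self.out.append(self.prefix + data.strip())
            self.prefix = ""

    def to_markdown(self):
        return "\n\n".join(self.out)
```

After conversion, the resulting markdown just re-enters the normal document pipeline (chunk, embed, extract entities), which is why URL import costs one API call.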
Agents

22 agents. 13 tools. Real autonomy.

Not chatbots. Agents with memory, goals, scheduled jobs, and the ability to delegate tasks to each other.

content

Writer

Generate docs from prompts

content

Editor

Review, grammar, consistency

research

Researcher

Deep analysis + citations

research

Analyst

Comparisons + insights

leadership

CEO

Strategy + decisions

leadership

CTO

Architecture + roadmap

product

PM

User stories + specs

automation

Knowledge Compiler

Synthesis from graph

marketing

SEO

Keywords + optimization

sales

Sales

Proposals + outreach

ops

Legal

Contracts + compliance

automation

Auditor

KB health + gaps


Function Calling

search_knowledge, read_document, create_document, delegate_to_agent, reflect_and_retrieve — 13 tools via Claude API


Memory + Goals

Persistent key-value memory. Goal tracking with progress bars. Context carries across conversations.


Task Delegation

Researcher → Writer → Editor. Priority levels. Status tracking. Inter-agent workflows.


Cron Jobs

Schedule agents via cron expressions. Daily summaries. Weekly reports. APScheduler backend.
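APScheduler does the real scheduling; to make the cron expressions concrete, here is a toy matcher for three of the five fields (minute, hour, day-of-week), supporting only `*`, plain numbers, and `*/n` steps. Everything about it is a simplification for illustration:

```python
def cron_matches(expr, minute, hour, dow):
    """Does a 3-field cron expression ("minute hour day-of-week") fire now?

    Supports "*" (any), a plain number (exact), and "*/n" (every n units).
    Real cron also has ranges, lists, and two more fields; omitted here.
    """
    def field_ok(field, value):
        if field == "*":
            return True
        if field.startswith("*/"):
            return value % int(field[2:]) == 0
        return value == int(field)

    f_min, f_hour, f_dow = expr.split()
    return field_ok(f_min, minute) and field_ok(f_hour, hour) and field_ok(f_dow, dow)
```

So "0 9 1" fires at 09:00 on Mondays, the shape a daily-summary or weekly-report agent job would take.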


Agentic RAG

Self-RAG assesses quality. CRAG rewrites bad queries. Iterative retrieval refines across 3 hops.
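The iterative loop (assess the context, rewrite the query when it falls short, cap at 3 hops) reduces to simple control flow. In this sketch, `retrieve`, `assess`, and `rewrite` are stubs for the real search and LLM calls:

```python
def answer_with_self_rag(query, retrieve, assess, rewrite, max_hops=3):
    """Retrieve iteratively: keep context, rewrite the query when the
    assessor says it is insufficient, stop when sufficient or out of hops."""
    context = []
    for _ in range(max_hops):
        context += retrieve(query)
        if assess(query, context) == "sufficient":
            break
        query = rewrite(query)
    return query, context

# Deterministic stubs: each hop adds one chunk; two chunks are "enough".
q, ctx = answer_with_self_rag(
    "cache decision",
    retrieve=lambda q: [f"chunk for {q}"],
    assess=lambda q, ctx: "sufficient" if len(ctx) >= 2 else "insufficient",
    rewrite=lambda q: q + " (expanded)",
)
```

The cap matters: without it, a query the corpus genuinely cannot answer would loop forever instead of falling through to an honest "insufficient context" answer.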


Streaming Chat

SSE streaming. 6 doc actions: summarize, simplify, translate, expand, find related, generate FAQ.

Platform

Everything else

Workflows, channels, dashboards, themes, auth. The full picture.

workflow

Workflows

YAML templates. Session execution. Human approval gates. Generate → Review → Publish.

@mention

Channels

Slack-like messaging. @mention agents — they respond via LLM. Pin. Reply threads.

dashboard

Knowledge Dashboard

Index status. Lint issues. Conflicts. Proposals. URL import. Evaluation scores. All in one view.

config

RAG Settings

Toggle dense search, re-ranking, query transform, graph retrieval. Chunk strategy. Per-workspace.
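The header chunk strategy mentioned here can be sketched as splitting a document at markdown heading lines, keeping each heading with its body; the function name is illustrative:

```python
def chunk_by_headers(markdown):
    """Split markdown into chunks at heading lines (lines starting with #),
    keeping each heading together with the text beneath it."""
    chunks, current = [], []
    for line in markdown.splitlines():
        if line.startswith("#") and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks
```

Header-aligned chunks tend to embed better than fixed-size windows because each chunk is one coherent topic rather than an arbitrary slice across two.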

theme

10 Themes

Midnight, Aurora, Ember, Forest, Sakura, Meadow, Sky, Lavender, Light, Dark.

auth

Auth + RBAC

JWT. Owner / Admin / Editor / Viewer. Per-workspace API keys. CORS. SSRF protection.

Cmd+K

Command Palette

Search docs, navigate views, run actions. FTS5 results with RRF scores and chunk type badges.

onboard

Onboarding

First-run wizard. Company context. Agent template suggestions. Guided setup.

128    Endpoints
32     Tables
22     Agents
13     Tools
55/55  Tests pass
Stack

Runs on one machine. No vendor lock-in.

Python · FastAPI
React · TypeScript
SQLite · WAL + FTS5
Claude · Anthropic
MiniLM · 384d local
Git · per workspace