Introduction
LatticeDB is a production-grade hybrid graph/vector database that runs entirely in the browser, with no backend required.
What is LatticeDB?
LatticeDB combines two powerful database paradigms into a single, unified engine:
- Vector Search Engine - HNSW-based approximate nearest neighbor search with SIMD acceleration
- Graph Database - Full Cypher query language support with BFS/DFS traversal
This hybrid approach enables powerful use cases that neither paradigm can achieve alone, such as:
- Finding similar documents AND their relationships
- Semantic search with graph-based re-ranking
- Knowledge graphs with embedding-based similarity
Key Features
Browser-Native Execution
LatticeDB compiles to WebAssembly and runs entirely in the browser:
- Zero server costs - No backend infrastructure required
- Sub-millisecond latency - No network round-trips
- Privacy by default - Data never leaves the user’s device
- Offline-capable - Works without internet connectivity
Extreme Performance
In benchmarks, LatticeDB outperforms industry-standard databases:
| vs Qdrant (Vector) | Speedup |
|---|---|
| Search | 1.4x faster |
| Upsert | 177x faster |
| Retrieve | 52x faster |
| Scroll | 7.4x faster |
| vs Neo4j (Graph) | Speedup |
|---|---|
| Node MATCH | 62x faster |
| Filter queries | 5-45x faster |
| ORDER BY | 8x faster |
API Compatibility
- Qdrant-compatible REST API - Drop-in replacement for existing vector search apps
- Cypher query language - Familiar syntax for graph operations
- Rust, TypeScript, and Python client libraries
Use Cases
RAG Applications
Build retrieval-augmented generation apps that run entirely in the browser:
User Query → Vector Search → Context Retrieval → LLM Response
Knowledge Graphs
Create and query knowledge graphs with semantic similarity:
MATCH (doc:Document)-[:REFERENCES]->(topic:Topic)
WHERE doc.embedding <-> $query_embedding < 0.5
RETURN doc, topic
Personal AI Assistants
Build privacy-preserving AI assistants where all data stays local:
- Chat history with semantic search
- Personal knowledge bases
- Offline-capable reasoning
Architecture Overview
┌─────────────────────────────────────────────────┐
│ LatticeDB │
├─────────────────────────────────────────────────┤
│ ┌─────────────┐ ┌─────────────┐ │
│ │ Vector │ │ Graph │ │
│ │ Engine │◄──►│ Engine │ │
│ │ (HNSW) │ │ (Cypher) │ │
│ └─────────────┘ └─────────────┘ │
├─────────────────────────────────────────────────┤
│ Unified Storage Layer │
│ (MemStorage | DiskStorage | OPFS) │
└─────────────────────────────────────────────────┘
Getting Started
Ready to dive in? Start with the Installation guide.
License
LatticeDB is dual-licensed under MIT and Apache 2.0. Choose whichever license works best for your project.
Installation
LatticeDB can be used in multiple environments: as a Rust library, a standalone server, or in the browser via WebAssembly.
Rust Library
Add LatticeDB to your Cargo.toml:
[dependencies]
lattice-core = "0.1"
lattice-storage = "0.1"
For server applications with HTTP endpoints:
[dependencies]
lattice-server = "0.1"
Standalone Server
Pre-built Binaries
Download pre-built binaries from the releases page:
- `lattice-db-linux-x64` - Linux (x86_64)
- `lattice-db-macos-x64` - macOS (Intel)
- `lattice-db-macos-arm64` - macOS (Apple Silicon)
- `lattice-db-windows-x64` - Windows (x86_64)
From Source
# Clone the repository
git clone https://github.com/Avarok-Cybersecurity/lattice-db.git
cd lattice-db
# Build release binary
cargo build --release -p lattice-server
# Binary is at target/release/lattice-server
Running the Server
# Start server on default port 6333
./lattice-server
# Or specify a custom port
./lattice-server --port 8080
WASM / Browser
NPM Package
npm install lattice-db
CDN
<script type="module">
import init, { LatticeDB } from 'https://unpkg.com/lattice-db/lattice_db.js';
await init();
const db = new LatticeDB();
</script>
Build from Source
# Install wasm-pack
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
# Build WASM package
wasm-pack build crates/lattice-server --target web --out-dir pkg --no-default-features --features wasm
Docker
docker run -p 6333:6333 avarok-cybersecurity/lattice-db:latest
System Requirements
Native
- Rust 1.75+ (for building from source)
- 64-bit OS (Linux, macOS, or Windows)
- 2GB RAM minimum (varies by dataset size)
WASM
- Modern browser with WebAssembly support:
- Chrome 89+
- Firefox 89+
- Safari 15+
- Edge 89+
- SIMD support recommended for best performance (enabled by default in modern browsers)
Feature Flags
LatticeDB uses feature flags for optional functionality:
| Feature | Description | Default |
|---|---|---|
| simd | SIMD-accelerated distance calculations | Enabled |
| mmap | Memory-mapped vector storage | Disabled |
| openapi | OpenAPI/Swagger documentation | Disabled |
Enable features in Cargo.toml:
[dependencies]
lattice-core = { version = "0.1", features = ["simd", "mmap"] }
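The `simd` feature accelerates the distance calculations at the heart of search. As a point of reference, here is a plain scalar cosine-distance function — an illustrative sketch, not lattice-core's actual implementation; the SIMD path computes the same quantity over packed f32 lanes:

```rust
/// Scalar cosine distance: 1 - (a . b) / (|a| * |b|).
/// Illustrative only -- the `simd` feature vectorizes this loop.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (x, y) in a.iter().zip(b.iter()) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    1.0 - dot / (na.sqrt() * nb.sqrt())
}

fn main() {
    // Orthogonal vectors: similarity 0, so distance 1.
    let a = [1.0, 0.0, 0.0];
    let b = [0.0, 1.0, 0.0];
    println!("distance = {}", cosine_distance(&a, &b));
}
```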
Verify Installation
Native
# Run tests
cargo test --workspace
# Run benchmarks
cargo run -p lattice-bench --release --example quick_vector_bench
WASM
# Run WASM tests in headless Chrome
wasm-pack test --headless --chrome crates/lattice-core
Next Steps
- Quick Start - Create your first collection
- WASM Browser Setup - Detailed browser integration guide
Quick Start
This guide will help you create your first LatticeDB collection, add vectors and graph data, and run queries.
Creating a Collection
Rust
#![allow(unused)]
fn main() {
use lattice_core::{
CollectionConfig, CollectionEngine, Distance, HnswConfig, VectorConfig,
};
// Create collection configuration
let config = CollectionConfig::new(
"my_collection",
VectorConfig::new(128, Distance::Cosine), // 128-dimensional vectors
HnswConfig {
m: 16, // Connections per node
m0: 32, // Layer 0 connections
ml: 0.36, // Level multiplier
ef: 100, // Search queue size
ef_construction: 200, // Build quality
},
);
// Create the collection engine
let mut engine = CollectionEngine::new(config)?;
}
REST API
curl -X PUT http://localhost:6333/collections/my_collection \
-H "Content-Type: application/json" \
-d '{
"vectors": {
"size": 128,
"distance": "Cosine"
},
"hnsw_config": {
"m": 16,
"ef_construct": 200
}
}'
Adding Points (Vectors + Metadata)
Rust
#![allow(unused)]
fn main() {
use lattice_core::Point;
// Create a point with vector and metadata
// Payload values are stored as JSON-encoded bytes
let point = Point::new_vector(
1, // Point ID
vec![0.1, 0.2, 0.3, /* ... 128 dimensions */],
)
.with_field("title", serde_json::to_vec("Introduction to LatticeDB").unwrap())
.with_field("category", serde_json::to_vec("documentation").unwrap());
// Upsert the point
engine.upsert_points(vec![point])?;
}
REST API
curl -X PUT http://localhost:6333/collections/my_collection/points \
-H "Content-Type: application/json" \
-d '{
"points": [
{
"id": 1,
"vector": [0.1, 0.2, 0.3],
"payload": {
"title": "Introduction to LatticeDB",
"category": "documentation"
}
}
]
}'
Vector Search
Rust
#![allow(unused)]
fn main() {
use lattice_core::SearchQuery;
// Create a search query
let query = SearchQuery::new(query_vector)
.with_limit(10) // Return top 10 results
.with_ef(100); // Search quality
// Execute search
let results = engine.search(&query)?;
for result in results {
println!("ID: {}, Score: {}", result.id, result.score);
}
}
REST API
curl -X POST http://localhost:6333/collections/my_collection/points/query \
-H "Content-Type: application/json" \
-d '{
"query": [0.1, 0.2, 0.3],
"limit": 10,
"with_payload": true
}'
Adding Graph Edges
Rust
#![allow(unused)]
fn main() {
// Add an edge between two points
engine.add_edge(
1, // Source point ID
2, // Target point ID
"REFERENCES", // Relation type
0.9, // Edge weight
)?;
}
REST API
curl -X POST http://localhost:6333/collections/my_collection/graph/edges \
-H "Content-Type: application/json" \
-d '{
"source_id": 1,
"target_id": 2,
"weight": 0.9,
"relation": "REFERENCES"
}'
Cypher Queries
Rust
#![allow(unused)]
fn main() {
use lattice_core::{CypherHandler, DefaultCypherHandler};
use std::collections::HashMap;
let handler = DefaultCypherHandler::new();
// Execute a Cypher query
let result = handler.query(
"MATCH (n:Document) WHERE n.category = 'documentation' RETURN n.title",
&mut engine,
HashMap::new(),
)?;
for row in result.rows {
println!("{:?}", row);
}
}
REST API
curl -X POST http://localhost:6333/collections/my_collection/graph/query \
-H "Content-Type: application/json" \
-d '{
"query": "MATCH (n:Document) WHERE n.category = $cat RETURN n.title",
"params": {
"cat": "documentation"
}
}'
Hybrid Query Example
Combine vector search with graph traversal:
#![allow(unused)]
fn main() {
// 1. Find similar documents via vector search
let similar = engine.search(&SearchQuery::new(query_vector).with_limit(5))?;
// 2. For each result, find related documents via graph
for result in similar {
let cypher = format!(
"MATCH (n)-[:REFERENCES]->(related) WHERE id(n) = {} RETURN related",
result.id
);
let related = handler.query(&cypher, &mut engine, HashMap::new())?;
println!("Document {} references: {:?}", result.id, related);
}
}
Complete Example
Here’s a complete example that demonstrates the hybrid capabilities:
use lattice_core::*;
use std::collections::HashMap;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create collection
let config = CollectionConfig::new(
"knowledge_base",
VectorConfig::new(128, Distance::Cosine),
HnswConfig::default_for_dim(128),
);
let mut engine = CollectionEngine::new(config)?;
let handler = DefaultCypherHandler::new();
// Add documents with embeddings (placeholder data for the example)
let documents = vec![
    (1u64, "Intro".to_string(), vec![0.1f32; 128]),
    (2u64, "Advanced".to_string(), vec![0.2f32; 128]),
];
for (id, title, embedding) in documents {
let point = Point::new_vector(id, embedding)
.with_field("title", serde_json::to_vec(&title).unwrap());
engine.upsert_points(vec![point])?;
}
// Add relationships via Cypher
handler.query(
"MATCH (a:Document), (b:Document)
WHERE a.title = 'Intro' AND b.title = 'Advanced'
CREATE (a)-[:NEXT]->(b)",
&mut engine,
HashMap::new(),
)?;
// Hybrid query: similar docs + their neighbors
let query_embedding = vec![0.1f32; 128];
let results = engine.search(&SearchQuery::new(query_embedding).with_limit(3))?;
for result in results {
let neighbors = handler.query(
&format!("MATCH (n)-[r]->(m) WHERE id(n) = {} RETURN m, type(r)", result.id),
&mut engine,
HashMap::new(),
)?;
println!("Doc {}: {:?}", result.id, neighbors);
}
Ok(())
}
Next Steps
- WASM Browser Setup - Run LatticeDB in the browser
- HNSW Index - Understanding vector search
- Cypher Query Language - Full Cypher reference
WASM Browser Setup
LatticeDB compiles to WebAssembly (WASM) and runs entirely in the browser. This guide covers integration options and best practices.
Quick Setup
ES Modules
<!DOCTYPE html>
<html>
<head>
<title>LatticeDB Demo</title>
</head>
<body>
<script type="module">
import init, { LatticeDB } from './lattice_db.js';
async function main() {
// Initialize WASM module
await init();
// Create database instance
const db = new LatticeDB();
// Create a collection
db.createCollection('my_collection', {
vectors: { size: 128, distance: 'Cosine' }
});
// Add points
db.upsert('my_collection', [
{
id: 1,
vector: Array.from({ length: 128 }, () => 0.1),
payload: { title: 'Hello World' }
}
]);
// Search
const results = db.search(
'my_collection',
Array.from({ length: 128 }, () => 0.1),
10 // limit
);
console.log('Results:', results);
}
main();
</script>
</body>
</html>
Service Worker (Planned)
Note: Service Worker transport is planned for a future release. Currently, use the direct LatticeDB API from the main thread or a Web Worker.
Storage
LatticeDB currently uses in-memory storage by default. All data is lost when the page reloads.
Future releases will support:
- Origin Private File System (OPFS) for persistent storage
- IndexedDB fallback for older browsers
For now, if you need persistence, consider serializing your data to localStorage or IndexedDB separately.
Framework Integration
React
import { useEffect, useState, useRef } from 'react';
import init, { LatticeDB } from 'lattice-db';
function useLatticeDB(collectionName, vectorSize) {
const dbRef = useRef(null);
const [ready, setReady] = useState(false);
useEffect(() => {
async function initialize() {
await init();
const db = new LatticeDB();
db.createCollection(collectionName, {
vectors: { size: vectorSize, distance: 'Cosine' }
});
dbRef.current = db;
setReady(true);
}
initialize();
}, [collectionName, vectorSize]);
return { db: dbRef.current, ready };
}
function SearchComponent() {
const { db, ready } = useLatticeDB('my_collection', 128);
const [results, setResults] = useState([]);
const handleSearch = async (query) => {
if (!db) return;
const embedding = await getEmbedding(query); // Your embedding function
const searchResults = db.search('my_collection', embedding, 10);
setResults(searchResults);
};
if (!ready) return <div>Loading...</div>;
return (
<div>
<input onChange={(e) => handleSearch(e.target.value)} />
<ul>
{results.map(r => <li key={r.id}>{r.payload?.title}</li>)}
</ul>
</div>
);
}
Vue
<script setup>
import { ref, onMounted } from 'vue';
import init, { LatticeDB } from 'lattice-db';
const db = ref(null);
const results = ref([]);
onMounted(async () => {
await init();
const instance = new LatticeDB();
instance.createCollection('my_collection', {
vectors: { size: 128, distance: 'Cosine' }
});
db.value = instance;
});
function search(query) {
const embedding = getEmbedding(query); // Your embedding function
results.value = db.value.search('my_collection', embedding, 10);
}
</script>
<template>
<input @input="search($event.target.value)" />
<ul>
<li v-for="r in results" :key="r.id">{{ r.payload?.title }}</li>
</ul>
</template>
Performance Tips
1. Use TypedArrays
Always use Float32Array for vectors:
// Good - zero-copy transfer to WASM
const vector = new Float32Array([0.1, 0.2, 0.3, ...]);
// Bad - requires conversion
const vector = [0.1, 0.2, 0.3, ...];
2. Batch Operations
Batch upserts for better performance:
// Good - single WASM call with multiple points
db.upsert('my_collection', [point1, point2, point3]);
// Bad - multiple WASM calls
for (const point of points) {
db.upsert('my_collection', [point]);
}
3. Web Workers
Offload heavy operations to a Web Worker:
// worker.js
import init, { LatticeDB } from 'lattice-db';
let db;
self.onmessage = async ({ data }) => {
if (data.type === 'init') {
await init();
db = new LatticeDB();
db.createCollection(data.collection, data.config);
self.postMessage({ type: 'ready' });
}
if (data.type === 'search') {
const results = db.search(data.collection, data.vector, data.limit);
self.postMessage({ type: 'results', results });
}
};
4. SIMD
WASM SIMD is enabled by default for 4-8x faster distance calculations. Ensure your bundler preserves SIMD instructions:
// vite.config.js
export default {
optimizeDeps: {
exclude: ['lattice-db'] // Don't transform WASM
}
};
Browser Compatibility
| Browser | WASM | SIMD | OPFS | Service Workers |
|---|---|---|---|---|
| Chrome 89+ | ✅ | ✅ | ✅ | ✅ |
| Firefox 89+ | ✅ | ✅ | ✅ (111+) | ✅ |
| Safari 15+ | ✅ | ✅ (16.4+) | ✅ | ✅ |
| Edge 89+ | ✅ | ✅ | ✅ | ✅ |
Debugging
Memory Usage
Monitor WASM memory consumption:
const stats = db.memoryStats();
console.log(`Vectors: ${stats.vectorBytes} bytes`);
console.log(`Index: ${stats.indexBytes} bytes`);
console.log(`Total: ${stats.totalBytes} bytes`);
Performance Profiling
Use the browser’s Performance API:
performance.mark('search-start');
const results = await db.search(query);
performance.mark('search-end');
performance.measure('search', 'search-start', 'search-end');
const [measure] = performance.getEntriesByName('search');
console.log(`Search took ${measure.duration}ms`);
Next Steps
- Architecture Overview - How LatticeDB works internally
- Performance Tuning - Optimization strategies
Architecture Overview
LatticeDB is designed from the ground up for cross-platform execution: the same core logic runs identically on native servers and in WebAssembly browsers. This chapter explains the key architectural decisions that make this possible.
High-Level Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Application Layer │
├─────────────────────────────────────────────────────────────────┤
│ ┌───────────────────┐ ┌───────────────────┐ │
│ │ REST Handlers │ │ Cypher Executor │ │
│ │ (Qdrant-compat) │ │ (Graph Queries) │ │
│ └─────────┬─────────┘ └─────────┬─────────┘ │
│ │ │ │
│ └──────────┬──────────────────┘ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ CollectionEngine ││
│ │ ┌─────────────────┐ ┌─────────────────┐ ││
│ │ │ HNSW Index │◄─────►│ Graph Store │ ││
│ │ │ (Vector Search) │ │ (Adjacency) │ ││
│ │ └─────────────────┘ └─────────────────┘ ││
│ └─────────────────────────────────────────────────────────────┘│
├─────────────────────────────────────────────────────────────────┤
│ Abstraction Boundary │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ LatticeStorage │ │ LatticeTransport│ │
│ │ (trait) │ │ (trait) │ │
│ └────────┬────────┘ └────────┬────────┘ │
├───────────┼────────────────────────────┼────────────────────────┤
│ ▼ ▼ │
│ ┌─────────────────────────────────────────────────────────────┐│
│ │ Platform Implementations ││
│ │ ││
│ │ Native: WASM: ││
│ │ ├─ DiskStorage ├─ OpfsStorage ││
│ │ └─ AxumTransport └─ ServiceWorkerTransport ││
│ └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘
Core Design Principles
1. Zero I/O in Core Logic
The lattice-core crate contains all business logic but never imports I/O primitives:
- No `std::fs` or file operations
- No `tokio::net` or networking
- No `web_sys` or browser APIs
Instead, core logic defines traits (LatticeStorage, LatticeTransport) that abstract these operations. Platform-specific crates provide concrete implementations.
2. Page-Based Storage Model
All persistent data is organized into fixed-size pages:
#![allow(unused)]
fn main() {
// Storage is just pages and metadata
trait LatticeStorage {
async fn read_page(&self, page_id: u64) -> StorageResult<Page>;
async fn write_page(&self, page_id: u64, data: &[u8]) -> StorageResult<()>;
async fn get_meta(&self, key: &str) -> StorageResult<Option<Vec<u8>>>;
async fn set_meta(&self, key: &str, value: &[u8]) -> StorageResult<()>;
}
}
This maps naturally to different backends:
| Backend | Page Access Pattern |
|---|---|
| Memory | HashMap<u64, Vec<u8>> |
| Disk | offset = page_id * PAGE_SIZE |
| OPFS | Same offset-based access |
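The offset arithmetic used by the disk-style backends above can be sketched as follows (the `PAGE_SIZE` value here is illustrative; the real constant lives in lattice-core):

```rust
// Illustrative page size; the actual constant is defined in lattice-core.
const PAGE_SIZE: u64 = 4096;

/// Byte offset of a page in an offset-addressed backend (Disk, OPFS).
fn page_offset(page_id: u64) -> u64 {
    page_id * PAGE_SIZE
}

fn main() {
    for id in [0, 1, 42] {
        println!("page {id} starts at byte {}", page_offset(id));
    }
}
```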
3. Async-First Design
All storage and transport operations are async:
#![allow(unused)]
fn main() {
// Works with tokio (native) or wasm-bindgen-futures (browser)
let page = storage.read_page(42).await?;
}
Platform differences are handled via conditional compilation:
#![allow(unused)]
fn main() {
// Native: requires Send + Sync for thread safety
#[cfg(not(target_arch = "wasm32"))]
#[async_trait]
pub trait LatticeStorage: Send + Sync { ... }
// WASM: single-threaded, no Send bounds
#[cfg(target_arch = "wasm32")]
#[async_trait(?Send)]
pub trait LatticeStorage { ... }
}
Data Flow
Vector Search Flow
Query Vector
│
▼
┌─────────────┐
│ HNSW Index │ ← Hierarchical navigation
└─────────────┘
│
▼
┌─────────────┐
│ Layer 0 │ ← Dense candidate selection
└─────────────┘
│
▼
┌─────────────┐
│ Distance │ ← SIMD-accelerated comparison
│ Calculation │
└─────────────┘
│
▼
┌─────────────┐
│ Top-K │ ← Priority queue extraction
│ Results │
└─────────────┘
Hybrid Query Flow
Cypher Query: MATCH (n)-[:SIMILAR]->()
WHERE n.embedding <-> $query < 0.5
│
▼
┌─────────────────┐
│ Cypher Parser │
└─────────────────┘
│
▼
┌─────────────────┐
│ Vector Predicate│ ← Recognized as embedding comparison
└─────────────────┘
│
├─────────────────────────┐
▼ ▼
┌─────────────┐ ┌─────────────┐
│ HNSW Search │ │Graph Traverse│
│ (candidates)│ │ (neighbors) │
└─────────────┘ └─────────────┘
│ │
└───────────┬─────────────┘
▼
┌─────────────┐
│ Merge & │
│ Filter │
└─────────────┘
│
▼
┌─────────────┐
│ Results │
└─────────────┘
Memory Layout
Dense Vector Storage
Vectors are stored in a flat, cache-friendly layout:
┌────────────────────────────────────────────────────────┐
│ Vector 0: [f32; DIM] │ Vector 1: [f32; DIM] │ ... │
└────────────────────────────────────────────────────────┘
│
▼
Access: base_ptr + (id * DIM * sizeof(f32))
This enables:
- Cache-efficient sequential access during index construction
- SIMD-friendly alignment for distance calculations
- Zero-copy access via memory mapping (native) or typed arrays (WASM)
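A minimal sketch of this flat layout — vector `id` occupies the half-open element range `[id * DIM, (id + 1) * DIM)` — might look like this (the type and method names are hypothetical, not lattice-core's API):

```rust
/// Flat, row-major vector storage: vector `id` occupies
/// elements [id * dim, (id + 1) * dim) of one contiguous buffer.
struct DenseVectors {
    dim: usize,
    data: Vec<f32>,
}

impl DenseVectors {
    fn new(dim: usize) -> Self {
        Self { dim, data: Vec::new() }
    }

    /// Append a vector and return its ID.
    fn push(&mut self, v: &[f32]) -> usize {
        assert_eq!(v.len(), self.dim);
        let id = self.data.len() / self.dim;
        self.data.extend_from_slice(v);
        id
    }

    /// Borrow a vector by ID -- pure pointer arithmetic, no copies.
    fn get(&self, id: usize) -> &[f32] {
        &self.data[id * self.dim..(id + 1) * self.dim]
    }
}

fn main() {
    let mut store = DenseVectors::new(3);
    let id = store.push(&[0.1, 0.2, 0.3]);
    println!("vector {id}: {:?}", store.get(id));
}
```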
HNSW Layer Structure
Layer 2: [sparse entry points]
│
▼
Layer 1: [medium density nodes]
│
▼
Layer 0: [all nodes, dense connections]
Each layer stores:
- Node IDs present at that layer
- Neighbor lists (connections to other nodes)
- Entry points for search initialization
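Layer membership follows the standard HNSW geometric distribution: a node's top layer is `floor(-ln(u) * ml)` for a uniform draw `u`, which is why higher layers are exponentially sparser. A sketch, using the `ml = 0.36` multiplier from the Quick Start config (assuming LatticeDB uses the standard formula):

```rust
/// Standard HNSW level assignment: level = floor(-ln(u) * ml),
/// where u is uniform in (0, 1]. Small u values (rare) climb higher.
fn assign_level(u: f64, ml: f64) -> usize {
    (-u.ln() * ml).floor() as usize
}

fn main() {
    let ml = 0.36; // the `ml` value from the Quick Start HnswConfig
    for u in [0.9, 0.5, 0.01, 0.0001] {
        println!("u = {u:>7}: top layer {}", assign_level(u, ml));
    }
}
```

With `ml = 0.36`, the vast majority of nodes land only on layer 0, a few reach layer 1, and layer 3+ entry points are very rare.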
Thread Safety (Native)
On native platforms, LatticeDB uses concurrent data structures:
#![allow(unused)]
fn main() {
// Read-write lock for the HNSW index (an async RwLock, e.g. tokio::sync::RwLock)
let index = RwLock::new(hnsw_index);
// Multiple readers can search simultaneously
let guard = index.read().await;
let results = guard.search(&query);
// Single writer for mutations
let mut guard = index.write().await;
guard.insert(point);
}
On WASM, all operations are single-threaded, but the same code works because Rust’s type system handles the Send/Sync bounds at compile time.
Error Handling
All operations return explicit Result types:
#![allow(unused)]
fn main() {
pub enum StorageError {
PageNotFound { page_id: u64 },
Io { message: String },
Serialization { message: String },
ReadOnly,
CapacityExceeded,
}
pub type StorageResult<T> = Result<T, StorageError>;
}
Errors are never silently swallowed. The ? operator propagates errors up the call stack, and handlers convert them to appropriate HTTP status codes.
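One plausible way handlers could map these errors to HTTP status codes — the exact codes shown are an assumption for illustration, not the library's verified mapping:

```rust
/// Mirror of lattice-core's StorageError variants (fields trimmed).
enum StorageError {
    PageNotFound { page_id: u64 },
    Io { message: String },
    Serialization { message: String },
    ReadOnly,
    CapacityExceeded,
}

/// Hypothetical error-to-status mapping; the real handlers
/// may choose different codes.
fn http_status(err: &StorageError) -> u16 {
    match err {
        StorageError::PageNotFound { .. } => 404, // Not Found
        StorageError::Serialization { .. } => 400, // Bad Request
        StorageError::ReadOnly => 403,             // Forbidden
        StorageError::CapacityExceeded => 507,     // Insufficient Storage
        StorageError::Io { .. } => 500,            // Internal Server Error
    }
}

fn main() {
    let err = StorageError::PageNotFound { page_id: 42 };
    println!("status = {}", http_status(&err));
}
```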
Next Steps
- SBIO Pattern - Deep dive into the I/O abstraction pattern
- Crate Structure - Detailed breakdown of each crate
SBIO Pattern
Separation of Business Logic and I/O (SBIO) is the architectural pattern that enables LatticeDB to run on both native servers and WebAssembly browsers from a single codebase.
The Problem
Traditional database code mixes business logic with I/O operations:
#![allow(unused)]
fn main() {
// Bad: Business logic directly performs I/O
fn search(&self, query: &[f32]) -> Vec<SearchResult> {
// Direct file system access - won't work in browser!
let index_data = std::fs::read("index.bin").unwrap();
let index: HnswIndex = deserialize(&index_data);
// Direct network call - different API on each platform
let embeddings = reqwest::get("http://model-server/embed")
.await.unwrap();
index.search(query)
}
}
This code has several problems:
- Platform-specific APIs: `std::fs` doesn't exist in WASM
- Tight coupling: can't swap storage backends
- Untestable: hard to mock the file system in tests
- Error handling: `unwrap()` hides failures
The Solution
SBIO separates concerns into three layers:
┌─────────────────────────────────────────┐
│ Business Logic (Pure) │
│ - No I/O imports │
│ - Defines traits for dependencies │
│ - Contains all algorithms │
├─────────────────────────────────────────┤
│ Abstraction Boundary │
│ - LatticeStorage trait │
│ - LatticeTransport trait │
├─────────────────────────────────────────┤
│ Platform Implementations │
│ - DiskStorage / OpfsStorage │
│ - AxumTransport / ServiceWorker │
└─────────────────────────────────────────┘
Storage Trait
The LatticeStorage trait abstracts all persistent storage:
#![allow(unused)]
fn main() {
/// Abstract storage interface (SBIO boundary)
#[async_trait]
pub trait LatticeStorage: Send + Sync {
/// Retrieve metadata by key
async fn get_meta(&self, key: &str) -> StorageResult<Option<Vec<u8>>>;
/// Store metadata
async fn set_meta(&self, key: &str, value: &[u8]) -> StorageResult<()>;
/// Read a page by ID
async fn read_page(&self, page_id: u64) -> StorageResult<Page>;
/// Write a page (create or overwrite)
async fn write_page(&self, page_id: u64, data: &[u8]) -> StorageResult<()>;
/// Flush pending writes to durable storage
async fn sync(&self) -> StorageResult<()>;
}
}
Why Pages?
The page-based model is chosen because it maps naturally to all storage backends:
| Backend | Page Implementation |
|---|---|
| MemStorage | HashMap<u64, Vec<u8>> - pages are hash map entries |
| DiskStorage | File with offset = page_id * PAGE_SIZE |
| OpfsStorage | OPFS file with same offset calculation |
| IndexedDB | Could use page_id as object store key |
This abstraction is low-level enough to be efficient but high-level enough to hide platform differences.
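The MemStorage idea from the table can be sketched in a few lines. This is a synchronous, std-only simplification — the real `LatticeStorage` trait is async — but the page and metadata bookkeeping is the same:

```rust
use std::collections::HashMap;

/// Synchronous sketch of MemStorage. The real trait is async,
/// but pages are just hash-map entries keyed by page ID.
struct MemStorage {
    pages: HashMap<u64, Vec<u8>>,
    meta: HashMap<String, Vec<u8>>,
}

impl MemStorage {
    fn new() -> Self {
        Self { pages: HashMap::new(), meta: HashMap::new() }
    }

    fn write_page(&mut self, page_id: u64, data: &[u8]) {
        self.pages.insert(page_id, data.to_vec());
    }

    fn read_page(&self, page_id: u64) -> Option<&[u8]> {
        self.pages.get(&page_id).map(|v| v.as_slice())
    }

    fn set_meta(&mut self, key: &str, value: &[u8]) {
        self.meta.insert(key.to_string(), value.to_vec());
    }

    fn get_meta(&self, key: &str) -> Option<&[u8]> {
        self.meta.get(key).map(|v| v.as_slice())
    }
}

fn main() {
    let mut storage = MemStorage::new();
    storage.write_page(0, b"hello");
    storage.set_meta("version", b"1");
    println!("{:?}", storage.read_page(0));
}
```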
Transport Trait
The LatticeTransport trait abstracts the “server” concept:
#![allow(unused)]
fn main() {
/// Abstract transport interface (SBIO boundary)
#[async_trait]
pub trait LatticeTransport: Send + Sync {
type Error: std::error::Error + Send + Sync + 'static;
/// Start serving requests
async fn serve<H, Fut>(self, handler: H) -> Result<(), Self::Error>
where
H: Fn(LatticeRequest) -> Fut + Send + Sync + Clone + 'static,
Fut: Future<Output = LatticeResponse> + Send + 'static;
}
}
The request/response types are platform-agnostic:
#![allow(unused)]
fn main() {
pub struct LatticeRequest {
pub method: String, // GET, POST, PUT, DELETE
pub path: String, // /collections/{name}/points
pub body: Vec<u8>, // JSON payload
pub headers: HashMap<String, String>,
}
pub struct LatticeResponse {
pub status: u16, // HTTP status code
pub body: Vec<u8>, // JSON response
pub headers: HashMap<String, String>,
}
}
Platform Implementations
Native (Server)
#![allow(unused)]
fn main() {
// DiskStorage: File-based storage using tokio::fs
pub struct DiskStorage {
data_file: tokio::fs::File,
meta_file: tokio::fs::File,
}
impl LatticeStorage for DiskStorage {
async fn read_page(&self, page_id: u64) -> StorageResult<Page> {
let offset = page_id * PAGE_SIZE;
self.data_file.seek(SeekFrom::Start(offset)).await?;
let mut buf = vec![0u8; PAGE_SIZE];
self.data_file.read_exact(&mut buf).await?;
Ok(buf)
}
}
// AxumTransport: HTTP server using Axum
pub struct AxumTransport {
bind_addr: SocketAddr,
}
impl LatticeTransport for AxumTransport {
async fn serve<H, Fut>(self, handler: H) -> Result<(), Self::Error> {
let app = Router::new()
.fallback(move |req| convert_and_call(handler.clone(), req));
axum::Server::bind(&self.bind_addr)
.serve(app.into_make_service())
.await
}
}
}
WASM (Browser)
#![allow(unused)]
fn main() {
// OpfsStorage: Origin Private File System
pub struct OpfsStorage {
root: web_sys::FileSystemDirectoryHandle,
}
impl LatticeStorage for OpfsStorage {
async fn read_page(&self, page_id: u64) -> StorageResult<Page> {
let file = self.root.get_file_handle("data.bin").await?;
let blob = file.get_file().await?;
let offset = (page_id * PAGE_SIZE) as i32;
let slice = blob.slice_with_i32_and_i32(offset, offset + PAGE_SIZE as i32)?;
let array_buffer = slice.array_buffer().await?;
Ok(js_sys::Uint8Array::new(&array_buffer).to_vec())
}
}
// ServiceWorkerTransport: Fetch event interception
pub struct ServiceWorkerTransport;
impl LatticeTransport for ServiceWorkerTransport {
async fn serve<H, Fut>(self, handler: H) -> Result<(), Self::Error> {
let closure = Closure::wrap(Box::new(move |event: FetchEvent| {
let request = convert_fetch_to_lattice(event.request());
let future = handler(request);
event.respond_with(&future_to_promise(async move {
let response = future.await;
convert_lattice_to_fetch(response)
}));
}));
// Register for fetch events
js_sys::global()
.add_event_listener_with_callback("fetch", closure.as_ref())
}
}
}
WASM Conditional Compilation
The traits have different bounds for native vs WASM:
#![allow(unused)]
fn main() {
// Native: requires Send + Sync for multi-threaded runtime
#[cfg(not(target_arch = "wasm32"))]
#[async_trait]
pub trait LatticeStorage: Send + Sync { ... }
// WASM: single-threaded, no Send bounds needed
#[cfg(target_arch = "wasm32")]
#[async_trait(?Send)]
pub trait LatticeStorage { ... }
}
The ?Send annotation tells async_trait that the returned futures need not be Send. This is necessary on WASM because JavaScript's Promise is not Send.
Benefits
1. Testability
Business logic can be tested with MemStorage:
#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_search() {
let storage = MemStorage::new();
let engine = CollectionEngine::new(config, storage);
engine.upsert(point).await.unwrap();
let results = engine.search(&query).await.unwrap();
assert_eq!(results.len(), 1);
}
}
No file system setup, no cleanup, no flaky tests.
2. Single Codebase
The same CollectionEngine code works everywhere:
#![allow(unused)]
fn main() {
// Server
let engine = CollectionEngine::new(config, DiskStorage::new(path));
// Browser
let engine = CollectionEngine::new(config, OpfsStorage::new());
// Tests
let engine = CollectionEngine::new(config, MemStorage::new());
}
3. Explicit Dependencies
All I/O dependencies are visible in function signatures:
#![allow(unused)]
fn main() {
// Clear: this function needs storage
async fn build_index<S: LatticeStorage>(storage: &S) { ... }
// Hidden: what I/O does this do?
async fn build_index() { ... } // Bad!
}
4. Error Propagation
Explicit Result types force error handling:
#![allow(unused)]
fn main() {
pub enum StorageError {
PageNotFound { page_id: u64 },
Io { message: String },
Serialization { message: String },
ReadOnly,
CapacityExceeded,
}
// Errors propagate via ?
let page = storage.read_page(42)?;
}
Common Patterns
Dependency Injection
#![allow(unused)]
fn main() {
pub struct CollectionEngine<S: LatticeStorage> {
storage: S,
index: HnswIndex,
}
impl<S: LatticeStorage> CollectionEngine<S> {
pub fn new(config: CollectionConfig, storage: S) -> Self {
Self { storage, index: HnswIndex::new(config) }
}
}
}
Factory Functions
#![allow(unused)]
fn main() {
// Platform-specific factory
#[cfg(not(target_arch = "wasm32"))]
pub fn create_storage(path: &Path) -> impl LatticeStorage {
DiskStorage::new(path)
}
#[cfg(target_arch = "wasm32")]
pub fn create_storage() -> impl LatticeStorage {
OpfsStorage::new()
}
}
Next Steps
- Crate Structure - How the codebase is organized
- HNSW Index - Vector search implementation
Crate Structure
LatticeDB is organized as a Cargo workspace with multiple crates, each with a specific responsibility. This chapter explains the purpose and contents of each crate.
Workspace Overview
lattice-db/
├── Cargo.toml # Workspace configuration
├── crates/
│ ├── lattice-core/ # Pure business logic
│ ├── lattice-storage/ # Storage implementations
│ ├── lattice-server/ # HTTP/WASM server
│ └── lattice-bench/ # Benchmarks
└── book/ # This documentation
Dependency Graph
┌─────────────────┐
│ lattice-bench │
│ (benchmarks) │
└────────┬────────┘
│
┌──────────────┴──────────────┐
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ lattice-server │ │ lattice-storage │
│ (HTTP/WASM API) │ │ (Storage impls) │
└────────┬────────┘ └────────┬────────┘
│ │
└──────────────┬──────────────┘
│
▼
┌─────────────────┐
│ lattice-core │
│ (Pure business │
│ logic) │
└─────────────────┘
lattice-core
Purpose: Contains all business logic with zero I/O dependencies.
Modules
| Module | Description |
|---|---|
| types/ | Data structures: Point, Edge, Vector, configs |
| index/ | HNSW algorithm, distance metrics, quantization |
| graph/ | Adjacency storage, traversal algorithms |
| cypher/ | Cypher parser and query executor |
| engine/ | CollectionEngine that orchestrates everything |
| storage.rs | LatticeStorage trait definition |
| transport.rs | LatticeTransport trait definition |
| error.rs | Error types and Result aliases |
Key Types
#![allow(unused)]
fn main() {
// Re-exported from lattice-core
pub use engine::collection::CollectionEngine;
pub use index::hnsw::HnswIndex;
pub use storage::{LatticeStorage, StorageError};
pub use transport::{LatticeRequest, LatticeResponse, LatticeTransport};
pub use types::collection::{CollectionConfig, Distance, HnswConfig};
pub use types::point::{Edge, Point, PointId, Vector};
pub use types::query::{SearchQuery, SearchResult};
}
Feature Flags
| Feature | Description | Default |
|---|---|---|
| simd | SIMD-accelerated distance calculations | Enabled |
Dependencies
lattice-core has minimal dependencies to stay portable:
[dependencies]
async-trait = "0.1"
thiserror = "1.0"
serde = { version = "1.0", features = ["derive"] }
# SIMD for distance calculations (optional)
wide = { version = "0.7", optional = true }
No I/O crates: No tokio, std::fs, reqwest, or web_sys.
lattice-storage
Purpose: Platform-specific LatticeStorage implementations.
Implementations
| Type | Platform | Description |
|---|---|---|
| MemStorage | All | In-memory HashMap, for testing |
| DiskStorage | Native | File-based using tokio::fs |
| OpfsStorage | WASM | Browser Origin Private File System |
Feature Flags
| Feature | Description | Default |
|---|---|---|
| native | Enables DiskStorage | Disabled |
| wasm | Enables OpfsStorage | Disabled |
Usage
# Server application
[dependencies]
lattice-storage = { version = "0.1", features = ["native"] }
# Browser application
[dependencies]
lattice-storage = { version = "0.1", features = ["wasm"] }
Code Example
#![allow(unused)]
fn main() {
use lattice_storage::MemStorage;
// In-memory storage for testing
let storage = MemStorage::new();
storage.write_page(0, b"hello").await?;
let page = storage.read_page(0).await?;
}
lattice-server
Purpose: HTTP API and transport implementations.
Modules
| Module | Description |
|---|---|
| dto/ | Data Transfer Objects (JSON serialization) |
| handlers/ | Request handlers for each endpoint |
| router.rs | Route matching and dispatch |
| axum_transport.rs | Native HTTP server (Axum) |
| service_worker.rs | WASM fetch event handler |
| openapi.rs | OpenAPI documentation generator |
REST API Endpoints
PUT /collections/{name} Create collection
GET /collections/{name} Get collection info
DELETE /collections/{name} Delete collection
PUT /collections/{name}/points Upsert points
POST /collections/{name}/points/query Vector search
POST /collections/{name}/points/scroll Paginated retrieval
POST /collections/{name}/graph/edges Add graph edges
POST /collections/{name}/graph/query Cypher query
Feature Flags
| Feature | Description | Default |
|---|---|---|
| native | Enables AxumTransport | Disabled |
| wasm | Enables ServiceWorkerTransport | Disabled |
| openapi | Enables OpenAPI documentation | Disabled |
Usage
// Native server
use lattice_server::{axum_transport::AxumTransport, router::*};
#[tokio::main]
async fn main() {
let state = new_app_state();
let transport = AxumTransport::new("0.0.0.0:6333");
transport.serve(move |request| {
let state = state.clone();
async move { route(state, request).await }
}).await.unwrap();
}
lattice-bench
Purpose: Benchmarks comparing LatticeDB to Qdrant and Neo4j.
Benchmarks
| Benchmark | Description |
|---|---|
| vector_ops | Vector operations (search, upsert, retrieve, scroll) |
| cypher_comparison | Cypher queries vs Neo4j |
| quick_vector_bench | Fast iteration benchmark |
Running Benchmarks
# All benchmarks
cargo bench -p lattice-bench
# Specific benchmark
cargo bench -p lattice-bench --bench vector_ops
# Quick iteration
cargo run -p lattice-bench --release --example quick_vector_bench
Output
Benchmarks use Criterion and output:
- Console summary with mean/stddev
- HTML reports in target/criterion/
- JSON data for CI integration
Building for Different Platforms
Native Server
cargo build --release -p lattice-server --features native
WASM Browser
# Install wasm-pack
cargo install wasm-pack
# Build WASM package
wasm-pack build crates/lattice-server \
--target web \
--out-dir pkg \
--no-default-features \
--features wasm
All Tests
# Native tests
cargo test --workspace
# WASM tests (requires Chrome)
wasm-pack test --headless --chrome crates/lattice-core
Adding New Features
New Storage Backend
- Create implementation in lattice-storage/src/:
#![allow(unused)]
fn main() {
// my_storage.rs
pub struct MyStorage { ... }
impl LatticeStorage for MyStorage {
async fn read_page(&self, page_id: u64) -> StorageResult<Page> { ... }
// ... other methods
}
}
- Add feature flag in Cargo.toml:
[features]
my-backend = ["some-dependency"]
[dependencies]
some-dependency = { version = "1.0", optional = true }
- Conditionally export:
#![allow(unused)]
fn main() {
#[cfg(feature = "my-backend")]
pub mod my_storage;
#[cfg(feature = "my-backend")]
pub use my_storage::MyStorage;
}
New API Endpoint
- Add handler in lattice-server/src/handlers/:
#![allow(unused)]
fn main() {
pub async fn my_handler<S: LatticeStorage>(
state: &AppState<S>,
request: &LatticeRequest,
) -> LatticeResponse {
// Handle request
}
}
- Add route in router.rs:
#![allow(unused)]
fn main() {
("POST", path) if path.ends_with("/my-endpoint") => {
my_handler(state, request).await
}
}
- Add DTO types if needed in dto/.
Next Steps
- HNSW Index - How vector search works
- Cypher Language - Graph query implementation
HNSW Index
LatticeDB uses HNSW (Hierarchical Navigable Small World) for approximate nearest neighbor search. This chapter explains how the algorithm works and how to tune it for your use case.
Algorithm Overview
HNSW constructs a multi-layer graph where:
- Upper layers are sparse and allow quick navigation to the general region
- Lower layers are dense and enable precise local search
- Layer 0 contains all points with the maximum connectivity
Layer 2: ●───────────────────────● (sparse, fast navigation)
│ │
▼ ▼
Layer 1: ●─────●─────●─────●─────● (medium density)
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
Layer 0: ●─●─●─●─●─●─●─●─●─●─●─●─● (dense, all points)
Search Process
Phase 1: Top-Down Navigation
Starting from a random entry point at the top layer:
- Find the nearest neighbor in the current layer (greedy search)
- Use that node as the entry point for the next layer
- Repeat until reaching layer 0
#![allow(unused)]
fn main() {
// Simplified pseudocode
let mut current = entry_point;
for layer in (1..=max_layer).rev() {
current = find_nearest_in_layer(query, current, layer);
}
}
Phase 2: Layer 0 Search
At layer 0, perform a beam search to find ef candidates:
- Maintain a priority queue of candidates (sorted by distance)
- Explore neighbors of the closest unvisited candidate
- Add promising neighbors to the queue
- Stop when the closest candidate is farther than the farthest result
#![allow(unused)]
fn main() {
// Returns top-k results from ef candidates
let candidates = search_layer(query, entry_points, ef, 0); // layer 0
candidates.into_iter().take(k).collect()
}
Configuration
HnswConfig Parameters
#![allow(unused)]
fn main() {
pub struct HnswConfig {
pub m: usize, // Connections per node (default: 16)
pub m0: usize, // Layer 0 connections (default: 2 * m = 32)
pub ml: f64, // Level multiplier (default: 1/ln(m))
pub ef: usize, // Search queue size (default: 100)
pub ef_construction: usize, // Build queue size (default: 200)
}
}
Parameter Guidelines
| Parameter | Description | Trade-off |
|---|---|---|
| m | Connections per node in upper layers | Higher = better recall, more memory |
| m0 | Connections per node in layer 0 | Higher = better recall, more memory |
| ef | Search queue size | Higher = better recall, slower search |
| ef_construction | Build queue size | Higher = better index quality, slower build |
Recommended Values
| Dataset Size | m | m0 | ef_construction | Notes |
|---|---|---|---|---|
| < 10K | 8 | 16 | 100 | Low memory, fast build |
| 10K - 100K | 16 | 32 | 200 | Balanced (default) |
| 100K - 1M | 24 | 48 | 300 | Higher recall |
| > 1M | 32 | 64 | 400 | Maximum recall |
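The table above can be encoded as a small helper. This is an illustrative sketch: `Params` and `params_for` are local names for this example, not part of LatticeDB's API.

```rust
// Pick HNSW parameters from the recommended-values table by dataset size.
// `Params` mirrors the relevant HnswConfig fields for illustration only.
struct Params {
    m: usize,
    m0: usize,
    ef_construction: usize,
}

fn params_for(n_points: usize) -> Params {
    match n_points {
        n if n < 10_000 => Params { m: 8, m0: 16, ef_construction: 100 },
        n if n < 100_000 => Params { m: 16, m0: 32, ef_construction: 200 },
        n if n < 1_000_000 => Params { m: 24, m0: 48, ef_construction: 300 },
        _ => Params { m: 32, m0: 64, ef_construction: 400 },
    }
}

fn main() {
    // A 50K-point dataset lands in the balanced default tier
    let p = params_for(50_000);
    println!("m={} m0={} ef_construction={}", p.m, p.m0, p.ef_construction);
}
```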
Usage
Basic Search
#![allow(unused)]
fn main() {
use lattice_core::{HnswIndex, Distance, HnswConfig};
// Create index with default config
let config = HnswConfig::default();
let mut index = HnswIndex::new(config, Distance::Cosine);
// Insert points
for point in points {
index.insert(&point);
}
// Search: find 10 nearest neighbors with ef=100
let results = index.search(&query_vector, 10, 100); // k=10, ef=100
}
Batch Search
For multiple queries, batch search is more efficient:
#![allow(unused)]
fn main() {
// Prepare query references
let query_refs: Vec<&[f32]> = queries.iter()
.map(|q| q.as_slice())
.collect();
// Parallel batch search (uses rayon on native)
let results = index.search_batch(&query_refs, 10, 100); // k=10, ef=100
}
Adjusting ef at Search Time
ef can be adjusted per-query to trade off speed vs recall:
#![allow(unused)]
fn main() {
// Fast search (lower recall): ef=50
let quick_results = index.search(&query, 10, 50);
// High-recall search (slower): ef=200
let accurate_results = index.search(&query, 10, 200);
}
Memory Layout
Dense Vector Storage
Vectors are stored in a flat, contiguous array for cache efficiency:
┌──────────────────────────────────────────────────────┐
│ [v0_d0, v0_d1, ..., v0_dn] [v1_d0, v1_d1, ..., v1_dn] │
└──────────────────────────────────────────────────────┘
Vector 0 (dim=n) Vector 1 (dim=n)
Access pattern: &vectors[id * dim .. (id + 1) * dim]
Benefits:
- Cache-friendly: Sequential memory access during index construction
- SIMD-friendly: Aligned access for vectorized distance calculations
- Predictable: O(1) indexed access via dense array
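The flat layout can be sketched in plain Rust. This is an illustration of the idea, not LatticeDB's internal type:

```rust
// All vectors live in one contiguous Vec<f32>; a point's vector is the
// slice at [id * dim .. (id + 1) * dim].
struct DenseVectors {
    dim: usize,
    data: Vec<f32>,
}

impl DenseVectors {
    fn new(dim: usize) -> Self {
        Self { dim, data: Vec::new() }
    }

    // Append a vector; its id is its position in the flat array
    fn push(&mut self, v: &[f32]) -> usize {
        assert_eq!(v.len(), self.dim);
        let id = self.data.len() / self.dim;
        self.data.extend_from_slice(v);
        id
    }

    // O(1) indexed access, no pointer chasing
    fn get(&self, id: usize) -> &[f32] {
        &self.data[id * self.dim..(id + 1) * self.dim]
    }
}

fn main() {
    let mut vs = DenseVectors::new(3);
    let a = vs.push(&[1.0, 2.0, 3.0]);
    let b = vs.push(&[4.0, 5.0, 6.0]);
    assert_eq!(vs.get(a), &[1.0, 2.0, 3.0][..]);
    assert_eq!(vs.get(b), &[4.0, 5.0, 6.0][..]);
}
```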
Layer Storage
Each layer stores:
- Node list: Points present at this layer
- Neighbor lists: Connections for each node
#![allow(unused)]
fn main() {
struct HnswNode {
id: PointId,
level: u16, // Highest layer this node appears in
// neighbors[layer] = Vec<PointId>
neighbors: SmallVec<[Vec<PointId>; 4]>,
}
}
Optimizations
1. Shortcut Search (VLDB 2025)
If the best neighbor doesn’t change at a layer, we can skip redundant distance calculations:
#![allow(unused)]
fn main() {
// Track if we improved at each layer
let (new_current, new_dist, improved) =
search_layer_single_with_shortcut(query, current, current_dist, layer);
if !improved && layer > 1 {
// Can potentially skip to next layer faster
continue;
}
}
2. Software Prefetching
Hide memory latency by prefetching future vectors:
#![allow(unused)]
fn main() {
// Prefetch vectors ahead of current iteration
const PREFETCH_DISTANCE: usize = 4;
for (i, &id) in neighbor_ids.iter().enumerate() {
if i + PREFETCH_DISTANCE < neighbor_ids.len() {
prefetch_read(vectors.get_ptr(neighbor_ids[i + PREFETCH_DISTANCE]));
}
// Calculate distance for current id
distances.push(calc_distance(query, vectors.get(id)));
}
}
3. Thread-Local Scratch Space
Avoid allocation per search by reusing scratch space:
#![allow(unused)]
fn main() {
thread_local! {
static SCRATCH: RefCell<SearchScratch> = RefCell::new(SearchScratch::new());
}
fn search_layer(&self, ...) -> Vec<Candidate> {
SCRATCH.with(|scratch| {
let mut scratch = scratch.borrow_mut();
scratch.clear(); // Reuse allocated memory
// ... perform search ...
})
}
}
4. Vec-Based Results
Use a Vec instead of BinaryHeap for results, with periodic compaction:
#![allow(unused)]
fn main() {
// Add to results
results.push(candidate);
// Compact when 2x over limit (amortizes sort cost)
if results.len() >= ef * 2 {
results.sort_unstable_by(|a, b| a.distance.partial_cmp(&b.distance).unwrap());
results.truncate(ef);
}
}
PQ Acceleration
For very large indexes, Product Quantization can accelerate search:
#![allow(unused)]
fn main() {
// Build PQ accelerator (one-time cost)
let accelerator = index.build_pq_accelerator(
8, // m: 8 subvectors for 128-dim
10000, // training_sample_size: vectors for training
);
// Search with PQ-accelerated coarse filtering
let results = index.search_with_pq(
&query,
10, // k
100, // ef
&accelerator,
3, // rerank_factor: re-rank top 30 candidates
);
}
See Quantization for details.
Performance Characteristics
| Operation | Complexity | Notes |
|---|---|---|
| Insert | O(log N × m × ef_construction) | Per-point amortized |
| Search | O(log N × m × ef) | Per-query |
| Delete | O(m) | Just removes node and edges |
| Memory | O(N × (dim + m × layers)) | Vectors + graph |
Typical search latency (128-dim, 100K vectors, cosine):
- ef=50: ~50 µs
- ef=100: ~100 µs
- ef=200: ~200 µs
Next Steps
- Distance Metrics - Choosing the right distance function
- Quantization - Memory-efficient storage
- SIMD Optimization - Hardware acceleration
Distance Metrics
LatticeDB supports multiple distance metrics for vector similarity search. This chapter explains each metric and when to use it.
Available Metrics
Cosine Distance
#![allow(unused)]
fn main() {
Distance::Cosine
}
Measures the angle between vectors, ignoring magnitude:
cosine_distance(a, b) = 1 - (a · b) / (||a|| × ||b||)
Range: [0, 2]
- 0 = identical direction
- 1 = orthogonal (90°)
- 2 = opposite direction
Best for:
- Text embeddings (word2vec, BERT, etc.)
- Normalized vectors
- When magnitude doesn’t matter
Example:
#![allow(unused)]
fn main() {
let calc = DistanceCalculator::new(Distance::Cosine);
let a = vec![1.0, 0.0];
let b = vec![0.707, 0.707]; // 45° angle
let dist = calc.calculate(&a, &b);
// dist ≈ 0.293 (1 - cos(45°))
}
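The same result can be reproduced by implementing the formula directly in plain Rust, independent of `DistanceCalculator`:

```rust
// cosine_distance(a, b) = 1 - (a · b) / (||a|| × ||b||)
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm_a * norm_b)
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.707, 0.707]; // 45° angle
    let dist = cosine_distance(&a, &b);
    assert!((dist - 0.293).abs() < 1e-3); // 1 - cos(45°)
}
```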
Euclidean Distance (L2)
#![allow(unused)]
fn main() {
Distance::Euclid
}
Standard straight-line distance:
euclidean_distance(a, b) = sqrt(Σ(aᵢ - bᵢ)²)
Range: [0, ∞)
- 0 = identical vectors
Best for:
- Image embeddings
- Geographic coordinates
- Physical measurements
- When absolute differences matter
Example:
#![allow(unused)]
fn main() {
let calc = DistanceCalculator::new(Distance::Euclid);
let a = vec![0.0, 0.0];
let b = vec![3.0, 4.0];
let dist = calc.calculate(&a, &b);
// dist = 5.0 (3-4-5 triangle)
}
Dot Product Distance
#![allow(unused)]
fn main() {
Distance::Dot
}
Negated dot product (so lower = more similar):
dot_distance(a, b) = -(a · b)
Range: (-∞, ∞)
- More negative = more similar (higher original dot product)
Best for:
- Maximum Inner Product Search (MIPS)
- Recommendation systems
- Pre-normalized vectors where you want raw similarity scores
Example:
#![allow(unused)]
fn main() {
let calc = DistanceCalculator::new(Distance::Dot);
let a = vec![1.0, 2.0, 3.0];
let b = vec![4.0, 5.0, 6.0];
let dist = calc.calculate(&a, &b);
// dist = -32 (negated: 1*4 + 2*5 + 3*6 = 32)
}
Choosing a Metric
| Use Case | Recommended Metric | Reason |
|---|---|---|
| Text embeddings | Cosine | Angle-based similarity, magnitude-invariant |
| Image embeddings | Euclidean | Pixel-level differences |
| Pre-normalized vectors | Dot | Faster (no normalization needed) |
| Recommendations | Dot | Higher dot product = higher relevance |
| Geographic data | Euclidean | Physical distance |
Cosine vs Dot Product
If your vectors are unit-normalized (||v|| = 1), cosine and dot product are equivalent:
For unit vectors: cosine_similarity = dot_product
Therefore: cosine_distance = 1 - dot_product
LatticeDB provides a fast path for normalized vectors:
#![allow(unused)]
fn main() {
// Fast cosine distance for pre-normalized vectors (25-30% faster)
let dist = cosine_distance_normalized(&normalized_a, &normalized_b);
}
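The equivalence can be checked numerically in a standalone sketch; `normalize` here is a local helper for the example, not a LatticeDB API:

```rust
// Scale a vector to unit length
fn normalize(v: &mut [f32]) {
    let n: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    v.iter_mut().for_each(|x| *x /= n);
}

fn main() {
    let mut a = vec![3.0_f32, 4.0];
    let mut b = vec![1.0_f32, 2.0];
    normalize(&mut a);
    normalize(&mut b);

    let dot: f32 = a.iter().zip(&b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    let cosine_dist = 1.0 - dot / (norm_a * norm_b);

    // For unit vectors, cosine_distance == 1 - dot_product
    assert!((cosine_dist - (1.0 - dot)).abs() < 1e-6);
}
```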
Implementation Details
Distance Calculator
All distance functions are accessed through DistanceCalculator:
#![allow(unused)]
fn main() {
use lattice_core::{DistanceCalculator, Distance};
let calc = DistanceCalculator::new(Distance::Cosine);
// Single calculation
let dist = calc.calculate(&vec_a, &vec_b);
// Get the metric type
assert_eq!(calc.metric(), Distance::Cosine);
}
Lower is Better
All distance functions return values where lower = more similar:
#![allow(unused)]
fn main() {
// Identical vectors
let same = calc.calculate(&v, &v);
// same ≈ 0.0 (for all metrics)
// Most similar results first
results.sort_by(|a, b| a.distance.partial_cmp(&b.distance).unwrap());
}
This convention enables consistent use with min-heaps in search algorithms.
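For example, a candidate type ordered by distance works directly with std::collections::BinaryHeap wrapped in Reverse. `Candidate` here is an illustrative type, not LatticeDB's internal candidate:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Order candidates by distance using f32::total_cmp so they satisfy Ord
#[derive(PartialEq)]
struct Candidate {
    distance: f32,
    id: u64,
}
impl Eq for Candidate {}
impl PartialOrd for Candidate {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Candidate {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        self.distance.total_cmp(&other.distance)
    }
}

fn main() {
    // Reverse turns BinaryHeap's max-heap into a min-heap:
    // the closest (lowest-distance) candidate pops first
    let mut heap = BinaryHeap::new();
    for (id, distance) in [(1, 0.9), (2, 0.1), (3, 0.5)] {
        heap.push(Reverse(Candidate { distance, id }));
    }
    let closest = heap.pop().unwrap().0;
    assert_eq!(closest.id, 2); // lowest distance = most similar
}
```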
Dimension Requirements
All vectors must have the same dimension:
#![allow(unused)]
fn main() {
let a = vec![1.0, 2.0];
let b = vec![1.0, 2.0, 3.0];
// Panics in debug builds:
// calc.calculate(&a, &b); // Dimension mismatch!
// In release builds the result is incorrect, and SIMD paths may read out of bounds
}
The index validates dimensions at insertion time.
Performance
SIMD Acceleration
Distance calculations are SIMD-accelerated on supported platforms:
| Platform | Instruction Set | Vectors Processed |
|---|---|---|
| x86_64 | AVX2 + FMA | 8 floats/cycle |
| aarch64 | NEON | 4-16 floats/cycle (4x unrolled) |
| WASM | Scalar | 4 floats (auto-vectorized) |
Scalar Fallback
For small vectors or unsupported platforms, scalar code with 4x unrolling:
#![allow(unused)]
fn main() {
// Unrolled for better auto-vectorization
for i in 0..chunks {
let base = i * 4;
let d0 = a[base] - b[base];
let d1 = a[base + 1] - b[base + 1];
let d2 = a[base + 2] - b[base + 2];
let d3 = a[base + 3] - b[base + 3];
sum += d0*d0 + d1*d1 + d2*d2 + d3*d3;
}
}
Benchmark Results
Typical throughput for 128-dimensional vectors:
| Metric | Scalar | SIMD (x86) | SIMD (aarch64) |
|---|---|---|---|
| Cosine | 120 ns | 25 ns | 20 ns |
| Euclidean | 100 ns | 20 ns | 15 ns |
| Dot | 90 ns | 18 ns | 14 ns |
Best Practices
1. Normalize Early
If using cosine distance, normalize vectors once at insertion:
#![allow(unused)]
fn main() {
fn normalize(v: &mut [f32]) {
let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
if norm > 0.0 {
for x in v.iter_mut() {
*x /= norm;
}
}
}
// Normalize before insertion
normalize(&mut embedding);
index.insert(&Point::new_vector(id, embedding));
}
2. Use Consistent Metrics
Always use the same metric for insertion and search:
#![allow(unused)]
fn main() {
// Create index with cosine distance
let mut index = HnswIndex::new(config, Distance::Cosine);
// Insertions use cosine distance internally
index.insert(&point);
// Searches use cosine distance
let results = index.search(&query, k, ef);
}
3. Check Vector Quality
Validate embeddings before insertion:
#![allow(unused)]
fn main() {
fn validate_vector(v: &[f32]) -> bool {
// Check for NaN/Inf
if v.iter().any(|&x| x.is_nan() || x.is_infinite()) {
return false;
}
// Check for zero vectors (problematic for cosine)
let norm: f32 = v.iter().map(|x| x * x).sum();
if norm < 1e-10 {
return false;
}
true
}
}
Next Steps
- SIMD Optimization - Hardware-specific acceleration
- Quantization - Memory-efficient distance computation
Quantization
Quantization reduces memory usage and can accelerate distance computation. LatticeDB supports two quantization methods:
- Scalar Quantization (SQ): 4x memory reduction (f32 → i8)
- Product Quantization (PQ): 32-64x memory reduction
Scalar Quantization
Overview
Scalar quantization maps each f32 value to an i8 using min-max scaling:
quantized[i] = round((original[i] - offset) / scale)
dequantized[i] = quantized[i] * scale + offset
where:
offset = min(original)
scale = (max(original) - min(original)) / 255
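A minimal sketch of these formulas in plain Rust. Codes are stored as u8 here, since min-max scaling maps into 0..=255; see `QuantizedVector` for the actual API:

```rust
// quantized[i] = round((original[i] - offset) / scale)
// Returns (codes, scale, offset)
fn quantize(v: &[f32]) -> (Vec<u8>, f32, f32) {
    let min = v.iter().cloned().fold(f32::INFINITY, f32::min);
    let max = v.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let scale = (max - min) / 255.0;
    let codes = v.iter().map(|&x| ((x - min) / scale).round() as u8).collect();
    (codes, scale, min) // min is the offset
}

// dequantized[i] = quantized[i] * scale + offset (lossy)
fn dequantize(codes: &[u8], scale: f32, offset: f32) -> Vec<f32> {
    codes.iter().map(|&q| q as f32 * scale + offset).collect()
}

fn main() {
    let original = vec![0.1, 0.5, 0.9, -0.3, 0.0];
    let (codes, scale, offset) = quantize(&original);
    let recovered = dequantize(&codes, scale, offset);
    // Reconstruction error is bounded by one quantization step
    for (o, r) in original.iter().zip(&recovered) {
        assert!((o - r).abs() <= scale);
    }
}
```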
Memory Savings
| Dimension | Original (f32) | Quantized (i8) | Savings |
|---|---|---|---|
| 128 | 512 bytes | 136 bytes | 3.8x |
| 256 | 1024 bytes | 264 bytes | 3.9x |
| 512 | 2048 bytes | 520 bytes | 3.9x |
| 1024 | 4096 bytes | 1032 bytes | 4.0x |
The 8-byte overhead is for scale and offset values.
Usage
#![allow(unused)]
fn main() {
use lattice_core::QuantizedVector;
// Quantize a vector
let original = vec![0.1, 0.5, 0.9, -0.3, 0.0];
let quantized = QuantizedVector::quantize(&original);
// Check memory usage
println!("Original: {} bytes", original.len() * 4);
println!("Quantized: {} bytes", quantized.memory_size());
// Dequantize (lossy)
let recovered = quantized.dequantize();
// Asymmetric distance (quantized DB vector vs f32 query)
let query = vec![0.2, 0.4, 0.8, -0.2, 0.1];
let distance = quantized.dot_distance_asymmetric(&query);
}
Asymmetric Distance
For search, we compute distance between:
- Database vectors: Quantized (memory-efficient)
- Query vector: Full precision (accurate)
#![allow(unused)]
fn main() {
impl QuantizedVector {
// Quantized × f32 distance
pub fn dot_distance_asymmetric(&self, query: &[f32]) -> f32;
pub fn euclidean_distance_asymmetric(&self, query: &[f32]) -> f32;
pub fn cosine_distance_asymmetric(&self, query: &[f32]) -> f32;
}
}
Accuracy
Scalar quantization maintains good accuracy for most use cases:
| Metric | Recall@10 (128-dim, 100K vectors) |
|---|---|
| Original (f32) | 99.5% |
| Quantized (i8) | 98.2% |
The 1-2% recall loss is often acceptable given the 4x memory savings.
Product Quantization (PQ)
Overview
PQ achieves higher compression by:
- Splitting vectors into M subvectors
- Quantizing each subvector to a cluster centroid
- Storing only the cluster index (1 byte per subvector)
Original: [f32; 128] = 512 bytes
Split: 8 subvectors of [f32; 16]
Quantize: 8 cluster indices = 8 bytes
Compression: 64x
Building a PQ Accelerator
#![allow(unused)]
fn main() {
// After building the HNSW index
let accelerator = index.build_pq_accelerator(
8, // m: 8 subvectors
10000, // training_sample_size: vectors for training
);
// Check compression
println!("Compression: {}x", accelerator.compression_ratio());
// Output: Compression: 64x (for 128-dim vectors)
}
PQ-Accelerated Search
PQ enables two-phase search:
- Coarse filtering: Fast approximate distances using PQ codes
- Re-ranking: Exact distances for top candidates
#![allow(unused)]
fn main() {
let results = index.search_with_pq(
&query,
10, // k
100, // ef
&accelerator,
3, // rerank_factor: re-rank 30 candidates for final 10
);
}
Distance Table Lookup
For each query, PQ builds a distance table:
#![allow(unused)]
fn main() {
// Precompute distances from query to all centroids
let dist_table = accelerator.pq.build_distance_table(&query);
// O(M) distance lookup instead of O(D)
let approx_dist = accelerator.approximate_distance_with_table(&dist_table, point_id);
}
This reduces distance computation from O(D) to O(M), where M << D.
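A toy version of the table lookup makes the O(M) cost concrete. The helper functions are illustrative, not LatticeDB's PQ implementation:

```rust
// Precompute squared distances from each query subvector to each centroid.
// codebooks[s][c] = centroid c for subvector s.
fn build_distance_table(
    query: &[f32],
    codebooks: &[Vec<Vec<f32>>],
    sub_dim: usize,
) -> Vec<Vec<f32>> {
    codebooks
        .iter()
        .enumerate()
        .map(|(s, book)| {
            let q = &query[s * sub_dim..(s + 1) * sub_dim];
            book.iter()
                .map(|c| q.iter().zip(c).map(|(a, b)| (a - b) * (a - b)).sum())
                .collect()
        })
        .collect()
}

// A database vector is just M one-byte codes: its approximate distance
// costs M table lookups instead of a D-dimensional computation.
fn approx_distance(table: &[Vec<f32>], codes: &[usize]) -> f32 {
    codes.iter().enumerate().map(|(s, &c)| table[s][c]).sum()
}

fn main() {
    // D = 4 split into M = 2 subvectors of 2 dims, 2 centroids each
    let codebooks = vec![
        vec![vec![0.0, 0.0], vec![1.0, 1.0]],
        vec![vec![0.0, 1.0], vec![1.0, 0.0]],
    ];
    let query = [1.0, 1.0, 0.0, 1.0];
    let table = build_distance_table(&query, &codebooks, 2);
    // Codes [1, 0] encode [1,1, 0,1], which matches the query exactly
    assert_eq!(approx_distance(&table, &[1, 0]), 0.0);
}
```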
PQ Memory Usage
| Dimension | M | Original | PQ Code | Codebook | Compression |
|---|---|---|---|---|---|
| 128 | 8 | 512 bytes/vec | 8 bytes/vec | 128 KB | 64x compression |
| 256 | 16 | 1024 bytes/vec | 16 bytes/vec | 256 KB | 64x compression |
| 512 | 32 | 2048 bytes/vec | 32 bytes/vec | 512 KB | 64x compression |
PQ vs SQ Trade-offs
| Aspect | Scalar Quantization | Product Quantization |
|---|---|---|
| Compression | 4x | 32-64x |
| Accuracy loss | ~1-2% | ~3-5% |
| Build time | Fast (O(N)) | Slow (k-means training) |
| Search speed | Same as f32 | Faster (O(M) distances) |
| Memory per vector | N bytes | M bytes |
| Use case | Moderate memory savings | Maximum compression |
Combining Quantization Methods
For very large datasets, combine both methods:
#![allow(unused)]
fn main() {
// 1. Quantize database vectors with SQ (4x savings)
let quantized_vectors: Vec<QuantizedVector> = vectors
.iter()
.map(|v| QuantizedVector::quantize(v))
.collect();
// 2. Build PQ accelerator for fast search (additional 16x in search)
let accelerator = index.build_pq_accelerator(8, 10000);
// 3. Two-phase search:
// - PQ for coarse filtering (very fast)
// - SQ for re-ranking (accurate enough)
let candidates = index.search_with_pq(&query, 100, 200, &accelerator, 1);
let results = rerank_with_sq(&candidates, &quantized_vectors, &query, k);
}
When to Use Quantization
| Dataset Size | Recommendation |
|---|---|
| < 100K | No quantization needed |
| 100K - 1M | Scalar quantization |
| 1M - 10M | Scalar + PQ acceleration |
| > 10M | Full PQ with re-ranking |
Best Practices
1. Train PQ on Representative Data
#![allow(unused)]
fn main() {
// Use a sample that represents your data distribution
let training_sample: Vec<Vector> = dataset
.iter()
.step_by((dataset.len() / 10000).max(1))
.cloned()
.collect();
let accelerator = index.build_pq_accelerator(8, training_sample.len());
}
2. Choose M Based on Dimension
#![allow(unused)]
fn main() {
// Rule of thumb: dim / M should be >= 8
let m = match dim {
d if d <= 64 => 4,
d if d <= 128 => 8,
d if d <= 256 => 16,
d if d <= 512 => 32,
_ => 64,
};
}
3. Re-rank Enough Candidates
#![allow(unused)]
fn main() {
// Higher rerank_factor = better recall, slower search
let rerank_factor = match recall_target {
r if r >= 0.99 => 5,
r if r >= 0.95 => 3,
_ => 2,
};
}
4. Update PQ Incrementally
#![allow(unused)]
fn main() {
// When adding new vectors
index.insert(&new_point);
accelerator.add(new_point.id, &new_point.vector);
// When removing vectors
index.delete(point_id);
accelerator.remove(point_id);
}
Next Steps
- SIMD Optimization - Hardware acceleration for distance calculations
- Performance Tuning - Overall optimization strategies
SIMD Optimization
LatticeDB uses Single Instruction, Multiple Data (SIMD) instructions to accelerate distance calculations. This chapter explains the SIMD implementation and how to maximize performance.
Supported Platforms
| Platform | Instruction Set | Vectors per Cycle | Feature |
|---|---|---|---|
| x86_64 | AVX2 + FMA | 8 × f32 | simd |
| aarch64 | NEON | 4-16 × f32 (unrolled) | simd |
| WASM | Scalar (auto-vectorized) | 4 × f32 | - |
Enabling SIMD
SIMD is enabled by default via the simd feature flag:
# Cargo.toml
[dependencies]
lattice-core = { version = "0.1", features = ["simd"] } # default
To disable SIMD (for debugging or compatibility):
[dependencies]
lattice-core = { version = "0.1", default-features = false }
Runtime Detection
LatticeDB detects SIMD support at runtime and falls back to scalar code if unavailable:
#![allow(unused)]
fn main() {
// x86_64: Check for AVX2 + FMA
#[cfg(all(feature = "simd", target_arch = "x86_64"))]
static SIMD_SUPPORT: OnceLock<bool> = OnceLock::new();
fn has_avx2_fma() -> bool {
*SIMD_SUPPORT.get_or_init(|| {
is_x86_feature_detected!("avx2") && is_x86_feature_detected!("fma")
})
}
}
The check is cached after the first call, so there’s no per-operation overhead.
x86_64 Implementation (AVX2)
Cosine Distance
Processes 8 floats per iteration:
#![allow(unused)]
fn main() {
#[target_feature(enable = "avx2")]
pub unsafe fn cosine_distance_avx2(a: &[f32], b: &[f32]) -> f32 {
let len = a.len();
let chunks = len / 8;
let mut dot_sum = _mm256_setzero_ps();
let mut norm_a_sum = _mm256_setzero_ps();
let mut norm_b_sum = _mm256_setzero_ps();
for i in 0..chunks {
let base = i * 8;
let va = _mm256_loadu_ps(a.as_ptr().add(base));
let vb = _mm256_loadu_ps(b.as_ptr().add(base));
// Fused multiply-add: dot_sum += va * vb
dot_sum = _mm256_fmadd_ps(va, vb, dot_sum);
norm_a_sum = _mm256_fmadd_ps(va, va, norm_a_sum);
norm_b_sum = _mm256_fmadd_ps(vb, vb, norm_b_sum);
}
// Horizontal sum and scalar remainder handling
let dot = hsum_avx(dot_sum) + scalar_remainder(...);
let norm_a = hsum_avx(norm_a_sum) + scalar_remainder(...);
let norm_b = hsum_avx(norm_b_sum) + scalar_remainder(...);
1.0 - (dot / (norm_a * norm_b).sqrt())
}
}
Euclidean Distance
#![allow(unused)]
fn main() {
#[target_feature(enable = "avx2")]
pub unsafe fn euclidean_distance_avx2(a: &[f32], b: &[f32]) -> f32 {
let chunks = a.len() / 8;
let mut sum = _mm256_setzero_ps();
for i in 0..chunks {
let va = _mm256_loadu_ps(a.as_ptr().add(i * 8));
let vb = _mm256_loadu_ps(b.as_ptr().add(i * 8));
let diff = _mm256_sub_ps(va, vb);
sum = _mm256_fmadd_ps(diff, diff, sum); // sum += diff²
}
hsum_avx(sum).sqrt()
}
}
Horizontal Sum
Reduces 8 floats to 1:
#![allow(unused)]
fn main() {
#[target_feature(enable = "avx2")]
unsafe fn hsum_avx(v: __m256) -> f32 {
// [a,b,c,d,e,f,g,h] -> [a+e,b+f,c+g,d+h]
let vlow = _mm256_castps256_ps128(v);
let vhigh = _mm256_extractf128_ps(v, 1);
let sum128 = _mm_add_ps(vlow, vhigh);
// [a+e,b+f,c+g,d+h] -> [a+e+c+g,b+f+d+h]
let hi64 = _mm_movehl_ps(sum128, sum128);
let sum64 = _mm_add_ps(sum128, hi64);
// Final reduction
let hi32 = _mm_shuffle_ps(sum64, sum64, 1);
_mm_cvtss_f32(_mm_add_ss(sum64, hi32))
}
}
aarch64 Implementation (NEON)
4x Unrolling
Apple Silicon (M1/M2) can sustain 4 FMA operations per cycle, so we unroll 4x:
#![allow(unused)]
fn main() {
pub unsafe fn cosine_distance_neon(a: &[f32], b: &[f32]) -> f32 {
let chunks16 = a.len() / 16;
// 4 accumulators for pipeline utilization
let mut dot0 = vdupq_n_f32(0.0);
let mut dot1 = vdupq_n_f32(0.0);
let mut dot2 = vdupq_n_f32(0.0);
let mut dot3 = vdupq_n_f32(0.0);
for i in 0..chunks16 {
let base = i * 16;
// Load 16 floats (4 NEON registers each)
let va0 = vld1q_f32(a.as_ptr().add(base));
let va1 = vld1q_f32(a.as_ptr().add(base + 4));
let va2 = vld1q_f32(a.as_ptr().add(base + 8));
let va3 = vld1q_f32(a.as_ptr().add(base + 12));
let vb0 = vld1q_f32(b.as_ptr().add(base));
let vb1 = vld1q_f32(b.as_ptr().add(base + 4));
let vb2 = vld1q_f32(b.as_ptr().add(base + 8));
let vb3 = vld1q_f32(b.as_ptr().add(base + 12));
// 4 independent FMAs per cycle
dot0 = vfmaq_f32(dot0, va0, vb0);
dot1 = vfmaq_f32(dot1, va1, vb1);
dot2 = vfmaq_f32(dot2, va2, vb2);
dot3 = vfmaq_f32(dot3, va3, vb3);
}
// Combine accumulators (norm accumulators, handled the same way, and the
// final 1 - dot/(||a||·||b||) step are omitted here for brevity)
let sum = vaddq_f32(vaddq_f32(dot0, dot1), vaddq_f32(dot2, dot3));
vaddvq_f32(sum) // NEON horizontal sum
}
}
Performance Impact
| Platform | Scalar | SIMD | Speedup |
|---|---|---|---|
| x86_64 (AVX2) | 120 ns | 25 ns | 4.8x |
| M1 (NEON, 1x) | 90 ns | 22 ns | 4.1x |
| M1 (NEON, 4x) | 90 ns | 14 ns | 6.4x |
Scalar Fallback
For small vectors or unsupported platforms:
#![allow(unused)]
fn main() {
fn cosine_distance_scalar(a: &[f32], b: &[f32]) -> f32 {
let (mut dot, mut norm_a, mut norm_b) = (0.0f32, 0.0f32, 0.0f32);
// 4x unroll for better compiler auto-vectorization
let chunks = a.len() / 4;
for i in 0..chunks {
let base = i * 4;
let (a0, a1, a2, a3) = (a[base], a[base+1], a[base+2], a[base+3]);
let (b0, b1, b2, b3) = (b[base], b[base+1], b[base+2], b[base+3]);
dot += a0*b0 + a1*b1 + a2*b2 + a3*b3;
norm_a += a0*a0 + a1*a1 + a2*a2 + a3*a3;
norm_b += b0*b0 + b1*b1 + b2*b2 + b3*b3;
}
// ... handle remainder, then combine:
1.0 - dot / (norm_a * norm_b).sqrt()
}
}
WASM Considerations
No WASM SIMD by Default
WebAssembly SIMD is available but requires explicit opt-in:
# .cargo/config.toml
[target.wasm32-unknown-unknown]
rustflags = ["-C", "target-feature=+simd128"]
LatticeDB uses scalar code for WASM to ensure broad browser compatibility.
Browser SIMD Support
| Browser | WASM SIMD | Status |
|---|---|---|
| Chrome 91+ | ✅ | Stable |
| Firefox 89+ | ✅ | Stable |
| Safari 16.4+ | ✅ | Stable |
| Edge 91+ | ✅ | Stable |
Dispatch Logic
The distance functions automatically select the best implementation:
#![allow(unused)]
fn main() {
pub fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
// x86_64: Use AVX2 for vectors >= 16 elements
#[cfg(all(feature = "simd", target_arch = "x86_64"))]
{
if a.len() >= 16 && has_avx2_fma() {
return unsafe { simd_x86::cosine_distance_avx2(a, b) };
}
}
// aarch64: Use NEON for vectors >= 8 elements
#[cfg(all(feature = "simd", target_arch = "aarch64"))]
{
if a.len() >= 8 {
return unsafe { simd_neon::cosine_distance_neon(a, b) };
}
}
// Fallback
cosine_distance_scalar(a, b)
}
}
Benchmarking SIMD
Quick Benchmark
cargo run -p lattice-bench --release --example quick_vector_bench
Full Criterion Benchmark
cargo bench -p lattice-bench --bench vector_ops
Compare Scalar vs SIMD
#![allow(unused)]
fn main() {
// Force scalar for comparison
let scalar_time = {
let start = Instant::now();
for _ in 0..iterations {
let _ = cosine_distance_scalar(&a, &b);
}
start.elapsed()
};
// Dispatch (uses SIMD if available)
let simd_time = {
let start = Instant::now();
for _ in 0..iterations {
let _ = cosine_distance(&a, &b);
}
start.elapsed()
};
println!("Speedup: {:.2}x", scalar_time.as_nanos() as f64 / simd_time.as_nanos() as f64);
}
Tips for Maximum Performance
1. Use Aligned Vectors
Aligned loads are faster than unaligned:
#![allow(unused)]
fn main() {
// std has no aligned Vec (aligned allocation requires std::alloc::Layout),
// so LatticeDB uses unaligned loads for flexibility
}
2. Prefer Power-of-2 Dimensions
#![allow(unused)]
fn main() {
// Good: Multiple of 8 (AVX2) or 16 (NEON 4x)
let dim = 128; // 16 chunks of 8
// Less efficient: Remainder handling needed
let dim = 100; // 12 chunks of 8 + 4 remainder
}
3. Batch Operations
Amortize function call overhead:
#![allow(unused)]
fn main() {
// Good: Batch multiple distances
let distances = index.calc_distances_batch(&query, &neighbor_ids);
// Less efficient: Individual calls
for id in neighbor_ids {
let dist = calc_distance(&query, &vectors[id]);
}
}
4. Keep Vectors Hot
Access vectors sequentially to keep them in cache:
#![allow(unused)]
fn main() {
// Good: Sequential access
for id in 0..n {
process(vectors.get_by_idx(id));
}
// Less efficient: Random access
for id in random_ids {
process(vectors.get(id)); // Cache misses
}
}
Next Steps
- Performance Tuning - Overall optimization strategies
- Benchmarks - Detailed performance numbers
Graph Model
LatticeDB implements a property graph model where nodes (points) can have properties and labeled edges connecting them. This chapter explains the graph data model and how to work with it.
Core Concepts
Nodes (Points)
In LatticeDB, nodes are called Points. Each point has:
- ID: Unique 64-bit identifier (PointId)
- Vector: Dense embedding for similarity search
- Payload: Key-value properties
- Labels: Optional node type labels
#![allow(unused)]
fn main() {
use lattice_core::{Point, Edge};
// Create a node with vector and properties
let point = Point::new_vector(1, vec![0.1, 0.2, 0.3])
.with_payload("name", "Alice")
.with_payload("age", 30)
.with_label("Person");
}
Edges
Edges connect nodes with optional:
- Weight: f32 similarity/relevance score
- Relation type: String label for the relationship
#![allow(unused)]
fn main() {
// Edge from node 1 to node 2
let edge = Edge::new(2, 0.9, "KNOWS");
// Fields: target, weight, relation
}
Graph Structure
The graph is stored as an adjacency list:
Point 1 → [(Edge to 2), (Edge to 3)]
Point 2 → [(Edge to 3)]
Point 3 → [(Edge to 1)]
This enables efficient:
- O(1) neighbor lookup by source node
- O(E/N) average edge retrieval
- O(1) edge insertion
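The adjacency-list idea can be sketched with plain `std` types (a simplified stand-in, not LatticeDB's actual `Edge`/engine types): insertion is a hash lookup plus a `Vec` push, and neighbor retrieval is a single hash lookup returning a slice of E/N entries on average.

```rust
use std::collections::HashMap;

// Simplified stand-ins for illustration only
type PointId = u64;

#[derive(Debug, Clone)]
struct Edge {
    target: PointId,
    weight: f32,
}

#[derive(Default)]
struct AdjacencyList {
    edges: HashMap<PointId, Vec<Edge>>,
}

impl AdjacencyList {
    // O(1) amortized: hash lookup + Vec push
    fn add_edge(&mut self, source: PointId, edge: Edge) {
        self.edges.entry(source).or_default().push(edge);
    }

    // O(1) hash lookup; the slice holds E/N entries on average
    fn neighbors(&self, source: PointId) -> &[Edge] {
        self.edges.get(&source).map(|v| v.as_slice()).unwrap_or(&[])
    }
}

fn main() {
    let mut graph = AdjacencyList::default();
    graph.add_edge(1, Edge { target: 2, weight: 0.9 });
    graph.add_edge(1, Edge { target: 3, weight: 0.7 });
    graph.add_edge(2, Edge { target: 3, weight: 0.8 });
    println!("neighbors of 1: {:?}", graph.neighbors(1));
}
```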
Working with Graphs
Adding Edges
#![allow(unused)]
fn main() {
use lattice_core::{CollectionEngine, Edge};
// Add a single edge
engine.add_edge(1, Edge::new(2, 0.9, "REFERENCES"))?;
// Add multiple edges
engine.add_edge(1, Edge::new(3, 0.7, "REFERENCES"))?;
engine.add_edge(2, Edge::new(3, 0.8, "CITES"))?;
}
Getting Neighbors
#![allow(unused)]
fn main() {
// Get all outgoing edges from node 1
let neighbors = engine.get_neighbors(1)?;
for edge in neighbors {
println!("→ {} (weight: {}, type: {})",
edge.target, edge.weight, edge.relation);
}
}
REST API
# Add an edge
curl -X POST http://localhost:6333/collections/my_collection/graph/edges \
-H "Content-Type: application/json" \
-d '{
"from_id": 1,
"to_id": 2,
"weight": 0.9,
"relation": "REFERENCES"
}'
# Get neighbors
curl http://localhost:6333/collections/my_collection/graph/neighbors/1
Edge Properties
Weight
Edge weight is a f32 value typically representing:
- Similarity: Higher = more similar (0.0 to 1.0)
- Relevance: Higher = more relevant
- Distance: Lower = closer (depending on use case)
#![allow(unused)]
fn main() {
// High-confidence relationship
Edge::new(2, 0.95, "CONFIRMED_MATCH");
// Lower confidence
Edge::new(3, 0.6, "POSSIBLE_MATCH");
}
Relation Types
Relation types are strings that categorize edges:
#![allow(unused)]
fn main() {
// Document relationships
Edge::new(2, 0.9, "REFERENCES");
Edge::new(3, 0.8, "CITES");
Edge::new(4, 0.7, "RELATED_TO");
// Social relationships
Edge::new(2, 1.0, "KNOWS");
Edge::new(3, 1.0, "WORKS_WITH");
// Hierarchical relationships
Edge::new(2, 1.0, "PARENT_OF");
Edge::new(3, 1.0, "CHILD_OF");
}
Directionality
Edges in LatticeDB are directed:
Node A --[KNOWS]--> Node B
This means:
- Edge A→B exists
- Edge B→A does NOT automatically exist
To create bidirectional relationships:
#![allow(unused)]
fn main() {
// Bidirectional KNOWS relationship
engine.add_edge(1, Edge::new(2, 1.0, "KNOWS"))?;
engine.add_edge(2, Edge::new(1, 1.0, "KNOWS"))?;
}
Or match the relationship undirected in Cypher:
// Match edges in either direction
MATCH (a)-[:KNOWS]-(b)
WHERE id(a) = 1
RETURN b
Labels (Node Types)
Labels categorize nodes:
#![allow(unused)]
fn main() {
let person = Point::new_vector(1, embedding)
.with_label("Person")
.with_label("Employee"); // Multiple labels
let company = Point::new_vector(2, embedding)
.with_label("Company");
}
Query by label:
// Find all Person nodes
MATCH (n:Person) RETURN n
// Find Person nodes that are also Employees
MATCH (n:Person:Employee) RETURN n
Hybrid Queries
The power of LatticeDB is combining vector search with graph traversal:
Vector Search → Graph Expansion
#![allow(unused)]
fn main() {
// 1. Find similar documents via vector search
let similar = engine.search(&SearchQuery::new(query_vector).with_limit(5))?;
// 2. Expand each result via graph traversal
for result in similar {
let references = engine.get_neighbors_by_type(result.id, "REFERENCES")?;
println!("Document {} references: {:?}", result.id, references);
}
}
Cypher with Vector Predicates
// Find similar documents and their references
MATCH (doc:Document)-[:REFERENCES]->(ref:Document)
WHERE doc.embedding <-> $query_embedding < 0.5
RETURN doc.title, ref.title
Graph → Vector Reranking
#![allow(unused)]
fn main() {
// 1. Get candidates from graph traversal
let cypher_results = handler.query(
"MATCH (n:Person)-[:WORKS_AT]->(c:Company {name: 'Acme'}) RETURN n",
&mut engine,
params,
)?;
// 2. Rerank by vector similarity
let mut candidates: Vec<_> = cypher_results.rows
.iter()
.filter_map(|row| {
let id = row.get("n")?.as_node_id()?;
let vec = engine.get_vector(id)?;
let dist = distance.calculate(&query_vec, vec);
Some((id, dist))
})
.collect();
candidates.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
}
Memory Layout
Adjacency Storage
Edges are stored in a HashMap<PointId, Vec<Edge>>:
┌────────────────────────────────────────┐
│ Node 1 → [Edge(2, 0.9), Edge(3, 0.7)] │
│ Node 2 → [Edge(3, 0.8)] │
│ Node 3 → [Edge(1, 0.5)] │
└────────────────────────────────────────┘
Memory per edge: ~20 bytes (target_id + weight + relation string pointer)
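You can measure a candidate edge layout yourself with `std::mem::size_of`. This sketch uses a hypothetical struct mirroring the description above, not LatticeDB's actual type; note that struct padding and Rust's 16-byte fat pointer for a boxed string push the measured size above the back-of-envelope figure.

```rust
use std::mem::size_of;

// Hypothetical layout for illustration only (not LatticeDB's real Edge)
#[allow(dead_code)]
struct Edge {
    target_id: u64,     // 8 bytes
    weight: f32,        // 4 bytes
    relation: Box<str>, // 16 bytes (pointer + length) on 64-bit targets
}

fn main() {
    // Padding rounds the total up to the struct's alignment
    println!("per-edge struct size: {} bytes", size_of::<Edge>());
}
```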
Index Structure
For efficient traversal, LatticeDB maintains:
- Forward index: Source → [Edges] (primary storage)
- Point lookup: ID → Point (for properties)
- Label index: Label → [PointIds] (for label queries)
Best Practices
1. Use Meaningful Relation Types
#![allow(unused)]
fn main() {
// Good: Specific, queryable
Edge::new(2, 0.9, "AUTHORED_BY");
Edge::new(3, 0.8, "PUBLISHED_IN");
// Bad: Generic, hard to filter
Edge::new(2, 0.9, "RELATED");
}
2. Normalize Weights
#![allow(unused)]
fn main() {
// Good: Consistent 0-1 scale
Edge::new(2, 0.95, "HIGH_CONFIDENCE");
Edge::new(3, 0.60, "MEDIUM_CONFIDENCE");
// Bad: Inconsistent scales
Edge::new(2, 100.0, "TYPE_A");
Edge::new(3, 0.6, "TYPE_B");
}
3. Consider Edge Density
High edge counts per node can impact traversal performance:
| Edges per Node | Performance | Use Case |
|---|---|---|
| 1-10 | Excellent | Typical relationships |
| 10-100 | Good | Dense graphs |
| 100+ | Consider filtering | Social networks |
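For high-degree nodes, one common mitigation is to keep only the k strongest edges before traversal. A minimal sketch (illustrative helper, not a LatticeDB API):

```rust
// Hypothetical pruning helper: keep a node's k highest-weight edges
#[derive(Debug, Clone)]
struct Edge {
    target: u64,
    weight: f32,
}

fn prune_top_k(mut edges: Vec<Edge>, k: usize) -> Vec<Edge> {
    // Sort by descending weight; total_cmp gives a total order over f32
    edges.sort_by(|a, b| b.weight.total_cmp(&a.weight));
    edges.truncate(k);
    edges
}

fn main() {
    let edges = vec![
        Edge { target: 2, weight: 0.4 },
        Edge { target: 3, weight: 0.9 },
        Edge { target: 4, weight: 0.7 },
    ];
    // Only the two strongest connections (targets 3 and 4) survive
    println!("{:?}", prune_top_k(edges, 2));
}
```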
4. Batch Edge Operations
#![allow(unused)]
fn main() {
// Good: Batch insert
let edges = vec![
(1, Edge::new(2, 0.9, "TYPE")),
(1, Edge::new(3, 0.8, "TYPE")),
(2, Edge::new(3, 0.7, "TYPE")),
];
for (source, edge) in edges {
engine.add_edge(source, edge)?;
}
// Even better: match endpoints and create edges in one Cypher statement
handler.query(
"UNWIND $edges AS e MATCH (a), (b) WHERE id(a) = e.source AND id(b) = e.target
CREATE (a)-[r:TYPE {weight: e.weight}]->(b)",
&mut engine,
params,
)?;
}
Next Steps
- Cypher Query Language - Graph query syntax
- Traversal Algorithms - BFS, DFS, and path finding
Cypher Query Language
LatticeDB implements a subset of the openCypher query language for graph operations. This chapter covers the supported syntax and patterns.
Query Structure
A typical Cypher query follows this structure:
MATCH <pattern>
WHERE <predicate>
RETURN <expression>
ORDER BY <expression>
SKIP <number>
LIMIT <number>
MATCH Clause
Node Patterns
Match all nodes:
MATCH (n) RETURN n
Match nodes with a label:
MATCH (n:Person) RETURN n
Match nodes with multiple labels:
MATCH (n:Person:Employee) RETURN n
Match nodes with properties:
MATCH (n:Person {name: "Alice"}) RETURN n
Relationship Patterns
Match outgoing relationships:
MATCH (a)-[r]->(b) RETURN a, r, b
Match with relationship type:
MATCH (a)-[r:KNOWS]->(b) RETURN a, b
Match with multiple relationship types:
MATCH (a)-[r:KNOWS|WORKS_WITH]->(b) RETURN a, b
Match in either direction:
MATCH (a)-[r:KNOWS]-(b) RETURN a, b
Variable-Length Paths
Match paths of specific length:
MATCH (a)-[*2]->(b) RETURN a, b // Exactly 2 hops
Match paths within a range:
MATCH (a)-[*1..3]->(b) RETURN a, b // 1 to 3 hops
Match paths up to a limit:
MATCH (a)-[*..5]->(b) RETURN a, b // Up to 5 hops
WHERE Clause
Comparison Operators
WHERE n.age > 25
WHERE n.age >= 25
WHERE n.age < 30
WHERE n.age <= 30
WHERE n.name = "Alice"
WHERE n.name <> "Bob"
Logical Operators
WHERE n.age > 25 AND n.active = true
WHERE n.role = "admin" OR n.role = "superuser"
WHERE NOT n.deleted
String Matching
WHERE n.name STARTS WITH "Al"
WHERE n.email ENDS WITH "@example.com"
WHERE n.description CONTAINS "important"
List Membership
WHERE n.status IN ["active", "pending"]
WHERE n.category IN $allowed_categories
NULL Checks
WHERE n.email IS NOT NULL
WHERE n.deleted IS NULL
Property Existence
WHERE exists(n.email)
WHERE NOT exists(n.deleted_at)
RETURN Clause
Return Nodes and Properties
RETURN n // Return entire node
RETURN n.name // Return single property
RETURN n.name, n.age // Return multiple properties
RETURN n.name AS fullName // Alias
Aggregations
RETURN count(n) // Count
RETURN count(DISTINCT n.category) // Distinct count
RETURN sum(n.amount) // Sum
RETURN avg(n.score) // Average
RETURN min(n.created_at) // Minimum
RETURN max(n.updated_at) // Maximum
RETURN collect(n.name) // Collect into list
Grouping
MATCH (n:Person)
RETURN n.department, count(n) AS count
ORDER BY, SKIP, LIMIT
Sorting
ORDER BY n.name // Ascending (default)
ORDER BY n.name ASC // Explicit ascending
ORDER BY n.name DESC // Descending
ORDER BY n.last_name, n.first_name // Multiple columns
Pagination
SKIP 10 LIMIT 20 // Skip first 10, return next 20
LIMIT 100 // Return first 100
CREATE Clause
Create Nodes
CREATE (n:Person {name: "Alice", age: 30})
Create Relationships
MATCH (a:Person {name: "Alice"}), (b:Person {name: "Bob"})
CREATE (a)-[:KNOWS {since: 2020}]->(b)
Create with RETURN
CREATE (n:Person {name: "Charlie"})
RETURN n
DELETE Clause
Delete Nodes
MATCH (n:Person {name: "DeleteMe"})
DELETE n
Delete Relationships
MATCH (a)-[r:KNOWS]->(b)
WHERE a.name = "Alice" AND b.name = "Bob"
DELETE r
DETACH DELETE (Nodes + Edges)
MATCH (n:Person {name: "DeleteMe"})
DETACH DELETE n // Deletes the node and all its relationships
SET Clause
Update Properties
MATCH (n:Person {name: "Alice"})
SET n.age = 31
Set Multiple Properties
MATCH (n:Person {name: "Alice"})
SET n.age = 31, n.updated = true
Remove Property
MATCH (n:Person {name: "Alice"})
SET n.temp = NULL // Removes the property
Parameters
Use $ prefix for query parameters:
MATCH (n:Person {name: $name})
WHERE n.age > $min_age
RETURN n
LIMIT $limit
Rust Usage
#![allow(unused)]
fn main() {
use std::collections::HashMap;
use lattice_core::CypherValue;
let mut params = HashMap::new();
params.insert("name".to_string(), CypherValue::String("Alice".into()));
params.insert("min_age".to_string(), CypherValue::Int(25));
params.insert("limit".to_string(), CypherValue::Int(10));
let result = handler.query(
"MATCH (n:Person {name: $name}) WHERE n.age > $min_age RETURN n LIMIT $limit",
&mut engine,
params,
)?;
}
REST API
curl -X POST http://localhost:6333/collections/my_collection/graph/query \
-H "Content-Type: application/json" \
-d '{
"query": "MATCH (n:Person {name: $name}) RETURN n",
"parameters": {
"name": "Alice"
}
}'
Query Examples
Find Connected Nodes
// Find all people Alice knows
MATCH (alice:Person {name: "Alice"})-[:KNOWS]->(friend)
RETURN friend.name
Find Paths
// Find path from Alice to Bob (up to 4 hops)
MATCH path = (alice:Person {name: "Alice"})-[*1..4]->(bob:Person {name: "Bob"})
RETURN path
Aggregation Query
// Count employees by department
MATCH (e:Employee)-[:WORKS_IN]->(d:Department)
RETURN d.name AS department, count(e) AS employee_count
ORDER BY employee_count DESC
Subgraph Extraction
// Get a subgraph around a node
MATCH (center:Person {name: "Alice"})-[r*1..2]-(connected)
RETURN center, r, connected
Hybrid Vector + Graph
// Find similar documents and their authors
MATCH (doc:Document)-[:AUTHORED_BY]->(author:Person)
WHERE doc.embedding <-> $query_embedding < 0.5
RETURN doc.title, author.name
ORDER BY doc.embedding <-> $query_embedding
LIMIT 10
Query Execution
Architecture
Query String
│
▼
┌─────────────┐
│ Parser │ Pest grammar → AST
└─────────────┘
│
▼
┌─────────────┐
│ Planner │ AST → Logical Plan
└─────────────┘
│
▼
┌─────────────┐
│ Executor │ Execute against storage
└─────────────┘
│
▼
Results
Handler API
#![allow(unused)]
fn main() {
use lattice_core::cypher::{CypherHandler, DefaultCypherHandler};
let handler = DefaultCypherHandler::new();
// Execute a query
let result = handler.query(
"MATCH (n:Person) RETURN n.name",
&mut engine,
HashMap::new(),
)?;
// Process results
for row in result.rows {
if let Some(name) = row.get("name") {
println!("Name: {:?}", name);
}
}
// Check execution stats
println!("Rows returned: {}", result.stats.rows_returned);
println!("Execution time: {:?}", result.stats.execution_time);
}
Limitations
Current implementation supports a subset of openCypher:
| Feature | Status | Notes |
|---|---|---|
| MATCH | ✅ | Single and multi-pattern |
| WHERE | ✅ | Basic predicates |
| RETURN | ✅ | Properties, aliases, aggregates |
| CREATE | ✅ | Nodes and relationships |
| DELETE | ✅ | With DETACH |
| SET | ✅ | Property updates |
| LIMIT/SKIP | ✅ | Pagination |
| ORDER BY | ✅ | Single/multi column |
| WITH | ⚠️ | Basic support |
| UNWIND | ⚠️ | Limited |
| MERGE | ❌ | Not yet implemented |
| FOREACH | ❌ | Not yet implemented |
| CALL | ❌ | Procedures not supported |
Next Steps
- Traversal Algorithms - BFS, DFS, and path finding
- Graph Model - Understanding the data model
Traversal Algorithms
LatticeDB provides iterators for graph traversal, enabling efficient exploration of connected nodes. This chapter covers BFS, DFS, and path finding.
Traversal Types
| Algorithm | Order | Use Case |
|---|---|---|
| BFS | Level by level | Shortest paths, nearest neighbors |
| DFS | Deep first | Exhaustive search, topological sort |
Breadth-First Search (BFS)
BFS visits nodes level by level, finding shortest paths first.
Algorithm
Start: [A]
Level 0: A
Level 1: B, C (neighbors of A)
Level 2: D, E (neighbors of B, C)
...
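The level-by-level order above comes from a FIFO queue. As a self-contained sketch in plain `std` (independent of LatticeDB's `BfsIterator`, whose exact behavior may differ), visiting each node once and yielding `(node, depth)` pairs looks like:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

// Queue-based BFS over a simple adjacency map, bounded by max_depth
fn bfs(start: u64, max_depth: usize, adj: &HashMap<u64, Vec<u64>>) -> Vec<(u64, usize)> {
    let mut visited = HashSet::from([start]);
    let mut queue = VecDeque::from([(start, 0)]);
    let mut order = Vec::new();
    while let Some((node, depth)) = queue.pop_front() {
        order.push((node, depth));
        if depth == max_depth {
            continue; // do not expand past the depth bound
        }
        for &next in adj.get(&node).map(|v| v.as_slice()).unwrap_or(&[]) {
            if visited.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    order
}

fn main() {
    // 1 → {2, 3}, 2 → {4}, 3 → {4}: node 4 is reached at depth 2 only once
    let adj = HashMap::from([(1, vec![2, 3]), (2, vec![4]), (3, vec![4])]);
    println!("{:?}", bfs(1, 2, &adj)); // (1,0), (2,1), (3,1), (4,2)
}
```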
Usage
#![allow(unused)]
fn main() {
use lattice_core::graph::BfsIterator;
// Define neighbor lookup function
let get_neighbors = |node_id| {
engine.get_neighbors(node_id)
.map(|edges| edges.iter().map(|e| e.target).collect())
.unwrap_or_default()
};
// Create BFS iterator
let bfs = BfsIterator::new(
start_node, // Starting node ID
3, // Maximum traversal depth
get_neighbors,
);
// Iterate over nodes with depth
for (node_id, depth) in bfs {
println!("Node {} at depth {}", node_id, depth);
}
}
Example: Find All Nodes Within 2 Hops
#![allow(unused)]
fn main() {
let within_2_hops: Vec<PointId> = BfsIterator::new(start, 2, get_neighbors)
.map(|(id, _)| id)
.collect();
}
Example: Level-Grouped Results
#![allow(unused)]
fn main() {
let mut levels: HashMap<usize, Vec<PointId>> = HashMap::new();
for (node_id, depth) in BfsIterator::new(start, 5, get_neighbors) {
levels.entry(depth).or_default().push(node_id);
}
for level in 0..=5 {
if let Some(nodes) = levels.get(&level) {
println!("Level {}: {} nodes", level, nodes.len());
}
}
}
Depth-First Search (DFS)
DFS explores as far as possible along each branch before backtracking.
Algorithm
Start: [A]
Visit A → B → D (go deep)
Backtrack to B → E
Backtrack to A → C → F
...
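The go-deep-then-backtrack order falls out of a LIFO stack. A self-contained sketch in plain `std` (again independent of LatticeDB's `DfsIterator`): neighbors are pushed in reverse so they pop in their original order.

```rust
use std::collections::{HashMap, HashSet};

// Stack-based DFS over a simple adjacency map, bounded by max_depth
fn dfs(start: u64, max_depth: usize, adj: &HashMap<u64, Vec<u64>>) -> Vec<(u64, usize)> {
    let mut visited = HashSet::new();
    let mut stack = vec![(start, 0)];
    let mut order = Vec::new();
    while let Some((node, depth)) = stack.pop() {
        if !visited.insert(node) {
            continue; // already reached via another branch
        }
        order.push((node, depth));
        if depth == max_depth {
            continue;
        }
        // Reverse push so the first neighbor is explored first
        for &next in adj.get(&node).map(|v| v.as_slice()).unwrap_or(&[]).iter().rev() {
            if !visited.contains(&next) {
                stack.push((next, depth + 1));
            }
        }
    }
    order
}

fn main() {
    // 1 → 2 → 4 (go deep), then backtrack to 1 → 3
    let adj = HashMap::from([(1, vec![2, 3]), (2, vec![4])]);
    println!("{:?}", dfs(1, 10, &adj)); // (1,0), (2,1), (4,2), (3,1)
}
```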
Usage
#![allow(unused)]
fn main() {
use lattice_core::graph::DfsIterator;
let dfs = DfsIterator::new(
start_node,
10, // Maximum traversal depth
get_neighbors,
);
for (node_id, depth) in dfs {
println!("Visiting node {} at depth {}", node_id, depth);
}
}
Example: Find All Reachable Nodes
#![allow(unused)]
fn main() {
let reachable: HashSet<PointId> = DfsIterator::new(start, usize::MAX, get_neighbors)
.map(|(id, _)| id)
.collect();
}
Example: Path Recording
#![allow(unused)]
fn main() {
let mut paths: Vec<Vec<PointId>> = Vec::new();
let mut current_path = Vec::new();
for (node_id, depth) in DfsIterator::new(start, 5, get_neighbors) {
// Truncate path to current depth
current_path.truncate(depth);
current_path.push(node_id);
// If leaf node, save path
if get_neighbors(node_id).is_empty() {
paths.push(current_path.clone());
}
}
}
GraphPath
GraphPath represents a sequence of nodes with total weight.
Creation
#![allow(unused)]
fn main() {
use lattice_core::graph::GraphPath;
// Start a path
let path = GraphPath::new(1);
assert_eq!(path.len(), 0); // No edges yet
// Extend the path
let path = path.extend(2, 0.5); // Add node 2 with edge weight 0.5
assert_eq!(path.len(), 1); // One edge
let path = path.extend(3, 0.3);
assert_eq!(path.len(), 2);
assert_eq!(path.total_weight, 0.8);
}
Properties
#![allow(unused)]
fn main() {
let path = GraphPath::new(1)
.extend(2, 0.5)
.extend(3, 0.3);
assert_eq!(path.start(), Some(1));
assert_eq!(path.end(), Some(3));
assert_eq!(path.nodes, vec![1, 2, 3]);
assert_eq!(path.total_weight, 0.8);
}
Shortest Path (Dijkstra)
For weighted shortest paths, combine BFS with weight tracking:
#![allow(unused)]
fn main() {
use std::collections::{BinaryHeap, HashMap};
use std::cmp::Reverse;
fn shortest_path(
start: PointId,
end: PointId,
get_edges: impl Fn(PointId) -> Vec<Edge>,
) -> Option<GraphPath> {
let mut distances: HashMap<PointId, f32> = HashMap::new();
let mut predecessors: HashMap<PointId, (PointId, f32)> = HashMap::new();
let mut heap = BinaryHeap::new();
distances.insert(start, 0.0);
// f32 is not Ord, so the heap stores the IEEE-754 bit pattern of the
// distance; for non-negative weights, bit order matches numeric order.
heap.push(Reverse((0.0f32.to_bits(), start)));
while let Some(Reverse((dist_bits, node))) = heap.pop() {
let dist = f32::from_bits(dist_bits);
if node == end {
// Reconstruct path
return Some(reconstruct_path(start, end, &predecessors));
}
if dist > *distances.get(&node).unwrap_or(&f32::MAX) {
continue;
}
for edge in get_edges(node) {
let new_dist = dist + edge.weight;
if new_dist < *distances.get(&edge.target).unwrap_or(&f32::MAX) {
distances.insert(edge.target, new_dist);
predecessors.insert(edge.target, (node, edge.weight));
heap.push(Reverse((new_dist.to_bits(), edge.target)));
}
}
}
None // No path found
}
fn reconstruct_path(
start: PointId,
end: PointId,
predecessors: &HashMap<PointId, (PointId, f32)>,
) -> GraphPath {
// Walk predecessors backwards from `end`, then rebuild the path forward
let mut steps = Vec::new();
let mut current = end;
while current != start {
match predecessors.get(&current) {
Some(&(pred, weight)) => {
steps.push((current, weight));
current = pred;
}
None => break, // `end` unreachable from `start`
}
}
let mut path = GraphPath::new(start);
for (node, weight) in steps.into_iter().rev() {
path = path.extend(node, weight);
}
path
}
}
Filtered Traversal
Filter edges during traversal:
By Relation Type
#![allow(unused)]
fn main() {
let get_knows_neighbors = |node_id| {
engine.get_neighbors(node_id)
.map(|edges| {
edges.iter()
.filter(|e| e.relation == "KNOWS")
.map(|e| e.target)
.collect()
})
.unwrap_or_default()
};
let social_network: Vec<_> = BfsIterator::new(start, 3, get_knows_neighbors).collect();
}
By Weight Threshold
#![allow(unused)]
fn main() {
let get_strong_connections = |node_id| {
engine.get_neighbors(node_id)
.map(|edges| {
edges.iter()
.filter(|e| e.weight > 0.8) // Only strong connections
.map(|e| e.target)
.collect()
})
.unwrap_or_default()
};
}
By Node Property
#![allow(unused)]
fn main() {
let get_active_neighbors = |node_id| {
engine.get_neighbors(node_id)
.map(|edges| {
edges.iter()
.filter(|e| {
// Check if target node is active
engine.get_point(e.target)
.and_then(|p| p.payload.get("active"))
.map(|v| v.as_bool().unwrap_or(false))
.unwrap_or(false)
})
.map(|e| e.target)
.collect()
})
.unwrap_or_default()
};
}
Cypher Traversal
Cypher queries compile to traversal operations:
Single Hop
MATCH (a:Person)-[:KNOWS]->(b)
WHERE a.name = "Alice"
RETURN b
Compiles to:
#![allow(unused)]
fn main() {
let alice = find_by_label_and_property("Person", "name", "Alice");
let friends = get_neighbors_by_type(alice, "KNOWS");
}
Variable Length
MATCH (a:Person)-[:KNOWS*1..3]->(b)
WHERE a.name = "Alice"
RETURN DISTINCT b
Compiles to:
#![allow(unused)]
fn main() {
let alice = find_by_label_and_property("Person", "name", "Alice");
let mut results = HashSet::new();
for (node, depth) in BfsIterator::new(alice, 3, get_knows_neighbors) {
if depth >= 1 && depth <= 3 {
results.insert(node);
}
}
}
Performance Considerations
Traversal Complexity
| Algorithm | Time | Space | Notes |
|---|---|---|---|
| BFS | O(V + E) | O(V) | Queue-based, finds shortest |
| DFS | O(V + E) | O(V) | Stack-based, memory efficient |
| Dijkstra | O((V + E) log V) | O(V) | Weighted shortest path |
Optimization Tips
1. Limit Depth
#![allow(unused)]
fn main() {
// Good: Bounded traversal
BfsIterator::new(start, 3, get_neighbors)
// Risky: Unbounded can be slow on dense graphs
BfsIterator::new(start, usize::MAX, get_neighbors)
}
2. Early Termination
#![allow(unused)]
fn main() {
// Stop when target found
for (node, _) in BfsIterator::new(start, 10, get_neighbors) {
if node == target {
break;
}
}
}
3. Batch Neighbor Lookups
#![allow(unused)]
fn main() {
// Prefetch neighbors for nodes at current level
let level_nodes: Vec<_> = current_level.collect();
let all_neighbors: HashMap<_, _> = level_nodes.iter()
.map(|&id| (id, engine.get_neighbors(id)))
.collect();
}
4. Use Indexes
#![allow(unused)]
fn main() {
// For filtered traversal, use label index
let persons = engine.get_nodes_by_label("Person")?;
for person in persons {
// Already filtered to Person label
}
}
Next Steps
- Graph Model - Understanding nodes and edges
- Cypher Language - Declarative graph queries
REST API
LatticeDB provides a Qdrant-compatible REST API for vector operations plus extensions for graph queries. This chapter documents all available endpoints.
Base URL
http://localhost:6333
Default port is 6333 (same as Qdrant for compatibility).
Collections
Create Collection
PUT /collections/{collection_name}
Request Body:
{
"vectors": {
"size": 128,
"distance": "Cosine"
},
"hnsw_config": {
"m": 16,
"ef_construct": 200
}
}
Fields:
| Field | Required | Description |
|---|---|---|
| vectors.size | Yes | Vector dimension |
| vectors.distance | Yes | Cosine, Euclid, or Dot |
| hnsw_config | No | HNSW index configuration (uses defaults if omitted) |
| hnsw_config.m | Yes* | Max connections per node (*required if hnsw_config provided) |
| hnsw_config.ef_construct | Yes* | Build-time search queue size (*required if hnsw_config provided) |
| hnsw_config.m0 | No | Layer 0 connections (default: 2*m) |
| hnsw_config.ml | No | Level multiplier (default: 1/ln(m)) |
Response:
{
"status": "ok",
"time": 0.001
}
Get Collection Info
GET /collections/{collection_name}
Response:
{
"status": "ok",
"result": {
"name": "my_collection",
"vectors_count": 10000,
"config": {
"vectors": {
"size": 128,
"distance": "Cosine"
},
"hnsw_config": {
"m": 16,
"ef_construct": 200
}
}
}
}
Delete Collection
DELETE /collections/{collection_name}
Response:
{
"status": "ok",
"time": 0.002
}
List Collections
GET /collections
Response:
{
"status": "ok",
"result": {
"collections": [
{"name": "collection_1"},
{"name": "collection_2"}
]
}
}
Import/Export
Binary import/export for collection backup and migration. Uses rkyv zero-copy serialization for efficient transfer.
Export Collection
GET /collections/{collection_name}/export
Response Headers:
| Header | Description |
|---|---|
| Content-Type | application/octet-stream |
| X-Lattice-Format-Version | Binary format version (currently 1) |
| X-Lattice-Point-Count | Number of points in collection |
| X-Lattice-Dimension | Vector dimension |
| Content-Disposition | attachment; filename="{name}.lattice" |
Response Body: Binary data (rkyv serialized collection)
Import Collection
POST /collections/{collection_name}/import?mode={mode}
Query Parameters:
| Parameter | Required | Values | Description |
|---|---|---|---|
| mode | Yes | create, replace, merge | Import behavior |
Import Modes:
- create: Create a new collection (fails with 409 if it exists)
- replace: Drop the existing collection and create a new one
- merge: Add points to the existing collection (skips duplicates)
Request:
- Content-Type: application/octet-stream
- Body: Binary data from a previous export
Response:
{
"status": "ok",
"result": {
"points_imported": 1000,
"points_skipped": 50,
"dimension": 128,
"mode": "merge"
}
}
Error Codes:
- 400: Invalid mode, corrupted data, or dimension mismatch (merge mode)
- 404: Collection not found (merge mode only)
- 409: Collection already exists (create mode)
- 413: Payload too large (>1GB limit)
cURL Examples:
# Export collection to file
curl http://localhost:6333/collections/docs/export -o backup.lattice
# Import as new collection (create mode)
curl -X POST "http://localhost:6333/collections/new_docs/import?mode=create" \
-H "Content-Type: application/octet-stream" \
--data-binary @backup.lattice
# Merge into existing collection
curl -X POST "http://localhost:6333/collections/docs/import?mode=merge" \
-H "Content-Type: application/octet-stream" \
--data-binary @backup.lattice
# Replace existing collection
curl -X POST "http://localhost:6333/collections/docs/import?mode=replace" \
-H "Content-Type: application/octet-stream" \
--data-binary @backup.lattice
Points (Vectors)
Upsert Points
PUT /collections/{collection_name}/points
Request Body:
{
"points": [
{
"id": 1,
"vector": [0.1, 0.2, 0.3, ...],
"payload": {
"title": "Document 1",
"category": "tech"
}
},
{
"id": 2,
"vector": [0.4, 0.5, 0.6, ...],
"payload": {
"title": "Document 2",
"category": "science"
}
}
]
}
Response:
{
"status": "ok",
"result": {
"operation_id": 123,
"status": "completed"
}
}
Get Points
Retrieve points by their IDs (batch operation).
POST /collections/{collection_name}/points
Request Body:
{
"ids": [1, 2, 3],
"with_payload": true,
"with_vector": false
}
Response:
{
"status": "ok",
"result": [
{
"id": 1,
"payload": {
"title": "Document 1",
"category": "tech"
}
},
{
"id": 2,
"payload": {
"title": "Document 2",
"category": "science"
}
}
]
}
Delete Points
POST /collections/{collection_name}/points/delete
Request Body:
{
"points": [1, 2, 3]
}
Response:
{
"status": "ok",
"result": {
"operation_id": 124,
"status": "completed"
}
}
Search
Vector Search
POST /collections/{collection_name}/points/query
Request Body:
{
"query": [0.1, 0.2, 0.3, ...],
"limit": 10,
"ef": 100,
"with_payload": true,
"with_vector": false
}
Parameters:
- query: Query vector (required)
- limit: Number of results (default: 10)
- ef: Search quality parameter (default: 100)
- with_payload: Include payload in results (default: true)
- with_vector: Include vector in results (default: false)
Response:
{
"status": "ok",
"result": [
{
"id": 42,
"score": 0.95,
"payload": {
"title": "Most Similar Document"
}
},
{
"id": 17,
"score": 0.89,
"payload": {
"title": "Second Most Similar"
}
}
],
"time": 0.001
}
Filtered Search
POST /collections/{collection_name}/points/query
Request Body:
{
"query": [0.1, 0.2, 0.3, ...],
"limit": 10,
"filter": {
"must": [
{
"key": "category",
"match": {"value": "tech"}
}
],
"must_not": [
{
"key": "archived",
"match": {"value": true}
}
]
}
}
Scroll (Pagination)
POST /collections/{collection_name}/points/scroll
Request Body:
{
"limit": 100,
"offset": 0,
"with_payload": true,
"with_vector": false,
"filter": {
"must": [
{
"key": "category",
"match": {"value": "tech"}
}
]
}
}
Response:
{
"status": "ok",
"result": {
"points": [...],
"next_page_offset": 100
}
}
Batch Search
Execute multiple search queries in a single request for better efficiency.
POST /collections/{collection_name}/points/search/batch
Request Body:
{
"searches": [
{
"vector": [0.1, 0.2, 0.3, ...],
"limit": 10,
"with_payload": true
},
{
"vector": [0.4, 0.5, 0.6, ...],
"limit": 5,
"params": { "ef": 200 }
}
]
}
Response:
{
"status": "ok",
"result": [
[
{ "id": 42, "score": 0.95, "payload": {...} },
{ "id": 17, "score": 0.89, "payload": {...} }
],
[
{ "id": 8, "score": 0.91, "payload": {...} }
]
]
}
Graph Operations
Add Edge
POST /collections/{collection_name}/graph/edges
Request Body:
{
"from_id": 1,
"to_id": 2,
"weight": 0.9,
"relation": "REFERENCES"
}
Response:
{
"status": "ok",
"result": {
"created": true
}
}
Get Neighbors
GET /collections/{collection_name}/graph/neighbors/{point_id}
Query Parameters:
- relation: Filter by relation type (optional)
- direction: outgoing (default), incoming, or both
Response:
{
"status": "ok",
"result": {
"neighbors": [
{
"id": 2,
"weight": 0.9,
"relation": "REFERENCES"
},
{
"id": 3,
"weight": 0.7,
"relation": "CITES"
}
]
}
}
Traverse Graph
Perform BFS/DFS traversal from a starting point.
POST /collections/{collection_name}/graph/traverse
Request Body:
{
"start_id": 1,
"max_depth": 3,
"relations": ["KNOWS", "REFERENCES"]
}
Parameters:
- start_id: Starting point ID (required)
- max_depth: Maximum traversal depth (required, max: 100)
- relations: Filter by relation types (optional; null = all relations)
Response:
{
"status": "ok",
"result": {
"visited": [2, 5, 8, 12],
"edges": [
{"from_id": 1, "to_id": 2, "relation": "KNOWS", "weight": 0.95},
{"from_id": 2, "to_id": 5, "relation": "KNOWS", "weight": 0.87}
],
"max_depth_reached": 2
}
}
Cypher Query
POST /collections/{collection_name}/graph/query
Request Body:
{
"query": "MATCH (n:Person) WHERE n.age > $min_age RETURN n.name",
"parameters": {
"min_age": 25
}
}
Response:
{
"status": "ok",
"result": {
"columns": ["n.name"],
"rows": [
["Alice"],
["Bob"],
["Charlie"]
],
"stats": {
"nodes_scanned": 100,
"rows_returned": 3,
"execution_time_ms": 5
}
}
}
Error Responses
All errors return a consistent format:
{
"status": "error",
"message": "Collection 'xyz' not found",
"code": 404
}
Error Codes
| Code | Meaning |
|---|---|
| 400 | Bad Request - Invalid parameters |
| 404 | Not Found - Collection/point doesn’t exist |
| 409 | Conflict - Collection already exists |
| 500 | Internal Server Error |
Headers
Request Headers
| Header | Value | Description |
|---|---|---|
| Content-Type | application/json | Required for POST/PUT |
| Accept | application/json | Optional |
Response Headers
| Header | Value |
|---|---|
| Content-Type | application/json |
| Server-Timing | Timing breakdown: body;dur=X, handler;dur=Y, total;dur=Z (microseconds) |
Authentication
LatticeDB supports API key and Bearer token authentication. By default, authentication is disabled.
Enabling Authentication
Set one or both environment variables:
# API Key authentication
LATTICE_API_KEYS=key1,key2,key3
# Bearer token authentication
LATTICE_BEARER_TOKENS=token1,token2
Making Authenticated Requests
# Using API Key
curl -H "Authorization: ApiKey your-api-key" \
http://localhost:6333/collections
# Using Bearer Token
curl -H "Authorization: Bearer your-token" \
http://localhost:6333/collections
Public Endpoints
These endpoints do not require authentication:
- GET / - Root endpoint
- GET /health, /healthz - Health check
- GET /ready, /readyz - Readiness check
Rate Limiting
By default, LatticeDB has no rate limiting. Enable it for production:
LATTICE_RATE_LIMIT=1 # Any value enables rate limiting
Default limits: 100 requests/second, burst capacity 200
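These defaults behave like a token bucket: the bucket holds up to 200 tokens (the burst capacity) and refills at 100 tokens per second. An illustrative sketch of that semantics (not LatticeDB's actual limiter implementation):

```rust
// Token-bucket sketch of the documented defaults: refill 100/s, burst 200
struct TokenBucket {
    capacity: f64,       // burst capacity
    tokens: f64,         // currently available tokens
    refill_per_sec: f64, // steady-state request rate
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec }
    }

    // `elapsed_secs` is the time since the previous call; returns
    // whether the request is admitted.
    fn try_acquire(&mut self, elapsed_secs: f64) -> bool {
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true // request admitted
        } else {
            false // would return 429 Too Many Requests
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(200.0, 100.0);
    // A burst of 200 back-to-back requests is admitted...
    let admitted = (0..200).filter(|_| bucket.try_acquire(0.0)).count();
    // ...but the 201st is rejected until tokens refill
    println!("admitted={}, next={}", admitted, bucket.try_acquire(0.0));
}
```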
Rate Limit Headers
When rate limiting is enabled, responses include:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Requests allowed per second |
| X-RateLimit-Remaining | Requests remaining in current window |
| X-RateLimit-Reset | Seconds until limit resets |
429 Too Many Requests is returned when limits are exceeded.
TLS/HTTPS
Enable TLS for encrypted connections:
LATTICE_TLS_CERT=/path/to/cert.pem
LATTICE_TLS_KEY=/path/to/key.pem
Requires building with --features tls.
CORS
CORS is enabled by default for browser access:
Access-Control-Allow-Origin: *
Access-Control-Allow-Methods: GET, POST, PUT, DELETE, OPTIONS
Access-Control-Allow-Headers: Content-Type
Environment Variables Reference
| Variable | Description | Default |
|---|---|---|
| LATTICE_HOST | Server bind address | 0.0.0.0 |
| LATTICE_PORT | Server port | 6333 |
| LATTICE_API_KEYS | Comma-separated API keys for authentication | (disabled) |
| LATTICE_BEARER_TOKENS | Comma-separated Bearer tokens for authentication | (disabled) |
| LATTICE_RATE_LIMIT | Enable rate limiting (any value) | (disabled) |
| LATTICE_TLS_CERT | Path to TLS certificate file | (disabled) |
| LATTICE_TLS_KEY | Path to TLS private key file | (disabled) |
| LATTICE_DATA_DIR | Data persistence directory | ./data |
| LATTICE_LOG_LEVEL | Logging verbosity (error, warn, info, debug, trace) | info |
cURL Examples
Create and Populate
# Create collection
curl -X PUT http://localhost:6333/collections/docs \
-H "Content-Type: application/json" \
-d '{"vectors": {"size": 128, "distance": "Cosine"}}'
# Add points
curl -X PUT http://localhost:6333/collections/docs/points \
-H "Content-Type: application/json" \
-d '{
"points": [
{"id": 1, "vector": [0.1, 0.2, ...], "payload": {"title": "Doc 1"}}
]
}'
# Search
curl -X POST http://localhost:6333/collections/docs/points/query \
-H "Content-Type: application/json" \
-d '{"query": [0.1, 0.2, ...], "limit": 5}'
Graph Operations
# Add edge
curl -X POST http://localhost:6333/collections/docs/graph/edges \
-H "Content-Type: application/json" \
-d '{"from_id": 1, "to_id": 2, "weight": 0.9, "relation": "REFS"}'
# Cypher query
curl -X POST http://localhost:6333/collections/docs/graph/query \
-H "Content-Type: application/json" \
-d '{"query": "MATCH (n) RETURN n LIMIT 10"}'
Next Steps
- Rust API - Native Rust integration
- TypeScript API - Browser/Node.js client
Rust API
This chapter documents the Rust API for LatticeDB, covering both the core library and server usage.
Installation
# Core library only
[dependencies]
lattice-core = "0.1"
# With storage implementations
lattice-storage = { version = "0.1", features = ["native"] }
# Full server
lattice-server = { version = "0.1", features = ["native"] }
Core Types
Point
#![allow(unused)]
fn main() {
use lattice_core::{CypherValue, Point, PointId};
// Create a point with vector
let point = Point::new_vector(1, vec![0.1, 0.2, 0.3]);
// Add payload (builder pattern)
let point = Point::new_vector(1, vec![0.1, 0.2, 0.3])
.with_payload("title", "My Document")
.with_payload("score", 0.95)
.with_payload("tags", vec!["rust", "database"])
.with_label("Document");
// Access fields
let id: PointId = point.id;
let vector: &[f32] = &point.vector;
let title: Option<&CypherValue> = point.payload.get("title");
}
Edge
#![allow(unused)]
fn main() {
use lattice_core::{Edge, PointId};
// Create an edge: Edge::new(target, weight, relation)
let edge = Edge::new(2, 0.9, "REFERENCES");
// Access fields
let target: PointId = edge.target;
let weight: f32 = edge.weight;
let relation: &str = &edge.relation;
}
Vector
#![allow(unused)]
fn main() {
use lattice_core::Vector;
// Vector is an alias for Vec<f32>
let vector: Vector = vec![0.1, 0.2, 0.3, 0.4];
// Create from iterator
let vector: Vector = (0..128)
.map(|i| (i as f32) / 128.0)
.collect();
}
Configuration
CollectionConfig
#![allow(unused)]
fn main() {
use lattice_core::{CollectionConfig, VectorConfig, HnswConfig, Distance};
let config = CollectionConfig::new(
"my_collection",
VectorConfig::new(128, Distance::Cosine),
HnswConfig {
m: 16,
m0: 32,
ml: 0.36,
ef: 100,
ef_construction: 200,
},
);
}
HnswConfig Helpers
#![allow(unused)]
fn main() {
use lattice_core::HnswConfig;
// Default config
let config = HnswConfig::default();
// Recommended ml for given m
let ml = HnswConfig::recommended_ml(16); // ~0.36
// Config optimized for dimension
let config = HnswConfig::default_for_dim(128);
}
Collection Engine
Creating an Engine
#![allow(unused)]
fn main() {
use lattice_core::CollectionEngine;
use lattice_storage::MemStorage;
// With in-memory storage
let storage = MemStorage::new();
let mut engine = CollectionEngine::new(config, storage)?;
}
Upsert Operations
#![allow(unused)]
fn main() {
// Single point
engine.upsert(point)?;
// Batch upsert
let points = vec![point1, point2, point3];
for point in points {
engine.upsert(point)?;
}
// Upsert returns info
let result = engine.upsert(point)?;
println!("Upserted point {}, was_insert: {}", result.id, result.was_insert);
}
Search Operations
#![allow(unused)]
fn main() {
use lattice_core::SearchQuery;
// Basic search
let query = SearchQuery::new(query_vector)
.with_limit(10);
let results = engine.search(&query)?;
for result in results {
println!("ID: {}, Score: {}", result.id, result.score);
}
// Search with ef parameter
let query = SearchQuery::new(query_vector)
.with_limit(10)
.with_ef(200); // Higher ef = better recall
// Search returning payloads
let results = engine.search_with_payload(&query)?;
for result in results {
let payload = result.payload.as_ref();
println!("Title: {:?}", payload.and_then(|p| p.get("title")));
}
}
Retrieve Operations
#![allow(unused)]
fn main() {
// Get single point
let point = engine.get_point(42)?;
// Get multiple points
let points = engine.get_points(&[1, 2, 3])?;
// Check existence
if engine.has_point(42)? {
println!("Point exists");
}
}
Delete Operations
#![allow(unused)]
fn main() {
// Delete single point
let deleted = engine.delete(42)?;
// Delete multiple points
for id in [1, 2, 3] {
engine.delete(id)?;
}
}
Scroll (Pagination)
#![allow(unused)]
fn main() {
use lattice_core::ScrollQuery;
let query = ScrollQuery::new()
.with_limit(100)
.with_offset(0);
let result = engine.scroll(&query)?;
for point in result.points {
println!("Point: {}", point.id);
}
// Iterate all points
let mut offset = 0;
loop {
let result = engine.scroll(
&ScrollQuery::new().with_limit(100).with_offset(offset)
)?;
if result.points.is_empty() {
break;
}
for point in &result.points {
process(point);
}
offset += result.points.len();
}
}
Graph Operations
Adding Edges
#![allow(unused)]
fn main() {
use lattice_core::Edge;
// Add single edge
engine.add_edge(1, Edge::new(2, 0.9, "REFERENCES"))?;
// Add multiple edges from same source
engine.add_edge(1, Edge::new(3, 0.7, "CITES"))?;
engine.add_edge(1, Edge::new(4, 0.8, "RELATED_TO"))?;
}
Getting Neighbors
#![allow(unused)]
fn main() {
// Get all outgoing edges
let neighbors = engine.get_neighbors(1)?;
for edge in neighbors {
println!("→ {} (weight: {}, type: {})",
edge.target, edge.weight, edge.relation);
}
// Filter by relation type
let references = engine.get_neighbors_by_type(1, "REFERENCES")?;
}
Graph Traversal
#![allow(unused)]
fn main() {
use lattice_core::graph::{BfsIterator, DfsIterator};
// BFS from node 1, max depth 3
let get_neighbors = |id| {
engine.get_neighbors(id)
.map(|edges| edges.iter().map(|e| e.target).collect())
.unwrap_or_default()
};
for (node_id, depth) in BfsIterator::new(1, 3, get_neighbors) {
println!("Node {} at depth {}", node_id, depth);
}
// DFS traversal
for (node_id, depth) in DfsIterator::new(1, 5, get_neighbors) {
println!("Visiting {} at depth {}", node_id, depth);
}
}
Cypher Queries
#![allow(unused)]
fn main() {
use lattice_core::CypherValue;
use lattice_core::cypher::{CypherHandler, DefaultCypherHandler};
use std::collections::HashMap;
let handler = DefaultCypherHandler::new();
// Simple query
let result = handler.query(
"MATCH (n:Person) RETURN n.name",
&mut engine,
HashMap::new(),
)?;
// Query with parameters
let mut params = HashMap::new();
params.insert("min_age".into(), CypherValue::Int(25));
params.insert("category".into(), CypherValue::String("tech".into()));
let result = handler.query(
"MATCH (n:Person) WHERE n.age > $min_age RETURN n.name, n.age",
&mut engine,
params,
)?;
// Process results
println!("Columns: {:?}", result.columns);
for row in result.rows {
let name = row.get("name");
let age = row.get("age");
println!("{:?}, {:?}", name, age);
}
// Check stats
println!("Execution time: {:?}", result.stats.execution_time);
println!("Rows returned: {}", result.stats.rows_returned);
}
HNSW Index Direct Access
#![allow(unused)]
fn main() {
use lattice_core::{HnswIndex, Distance, HnswConfig};
// Create index directly (without storage)
let config = HnswConfig::default();
let mut index = HnswIndex::new(config, Distance::Cosine);
// Insert points
index.insert(&point);
// Search (k = 10, ef = 100)
let results = index.search(&query_vector, 10, 100);
// Batch search (k = 10, ef = 100)
let queries: Vec<&[f32]> = query_vectors.iter()
    .map(|v| v.as_slice())
    .collect();
let batch_results = index.search_batch(&queries, 10, 100);
// Get statistics
println!("Index size: {}", index.len());
println!("Layer counts: {:?}", index.layer_counts());
println!("Memory: {} bytes", index.vector_memory_bytes());
}
Storage Implementations
MemStorage
#![allow(unused)]
fn main() {
use lattice_storage::MemStorage;
let storage = MemStorage::new();
// Fast, ephemeral, for testing
}
DiskStorage (Native)
#![allow(unused)]
fn main() {
use lattice_storage::DiskStorage;
use std::path::Path;
let storage = DiskStorage::new(Path::new("./data"))?;
// Persistent, uses tokio::fs
}
OpfsStorage (WASM)
#![allow(unused)]
fn main() {
#[cfg(target_arch = "wasm32")]
use lattice_storage::OpfsStorage;
let storage = OpfsStorage::new().await?;
// Browser persistent storage
}
Server Usage
Starting the Server
use lattice_server::{
axum_transport::AxumTransport,
router::{new_app_state, route},
};
#[tokio::main]
async fn main() {
let state = new_app_state();
let transport = AxumTransport::new("0.0.0.0:6333");
transport.serve(move |request| {
let state = state.clone();
async move { route(state, request).await }
}).await.unwrap();
}
Custom Handlers
#![allow(unused)]
fn main() {
use lattice_server::handlers;
use lattice_core::{LatticeRequest, LatticeResponse};
async fn custom_route(
state: &AppState,
request: &LatticeRequest,
) -> LatticeResponse {
// Access collection (error handling elided for brevity)
let collections = state.collections.read().await;
let engine = collections.get("my_collection").expect("collection exists");
// Process request
let result = engine.search(&query).expect("search failed");
// Return response
LatticeResponse::ok(serde_json::to_vec(&result).expect("serialization failed"))
}
}
Error Handling
#![allow(unused)]
fn main() {
use lattice_core::{LatticeError, LatticeResult};
fn process() -> LatticeResult<()> {
let engine = CollectionEngine::new(config, storage)?;
match engine.get_point(999) {
Ok(point) => println!("Found: {:?}", point),
Err(LatticeError::NotFound { .. }) => println!("Not found"),
Err(e) => return Err(e),
}
Ok(())
}
// Error types
match error {
    LatticeError::NotFound { resource, id } => { /* ... */ }
    LatticeError::InvalidConfig { message } => { /* ... */ }
    LatticeError::Storage(storage_error) => { /* ... */ }
    LatticeError::Cypher(cypher_error) => { /* ... */ }
}
}
Next Steps
- TypeScript API - Browser/Node.js client
- REST API - HTTP endpoints
TypeScript API
LatticeDB provides a TypeScript/JavaScript client for browser and Node.js environments. The client works with both the WASM build (in-browser) and the REST API (remote server).
Installation
npm install lattice-db
Quick Start
Browser (WASM)
import { LatticeDB } from 'lattice-db';
async function main() {
// Initialize WASM module and get database instance
const db = await LatticeDB.init();
// Create a collection
db.createCollection('my_collection', {
vectors: { size: 128, distance: 'Cosine' }
});
// Add points
db.upsert('my_collection', [
{
id: 1,
vector: new Float32Array(128).fill(0.1),
payload: { title: 'Hello World' }
}
]);
// Search
const results = db.search(
'my_collection',
new Float32Array(128).fill(0.1),
10
);
console.log('Results:', results);
}
main();
Node.js (REST Client)
import { LatticeClient } from 'lattice-db';
const client = new LatticeClient('http://localhost:6333');
// Create collection
await client.createCollection('my_collection', {
vectors: { size: 128, distance: 'Cosine' }
});
// Upsert points
await client.upsert('my_collection', {
points: [
{ id: 1, vector: [...], payload: { title: 'Doc 1' } }
]
});
// Search
const results = await client.search('my_collection', {
query: [...],
limit: 10
});
WASM API
Initialization
import { LatticeDB } from 'lattice-db';
// Initialize WASM and get database instance
const db = await LatticeDB.init();
// Or with custom WASM path
const db = await LatticeDB.init('/path/to/lattice.wasm');
Creating a Collection
// Create a collection with default HNSW config
db.createCollection('my_collection', {
vectors: { size: 128, distance: 'Cosine' }
});
// Create with custom HNSW config
db.createCollection('my_collection', {
vectors: { size: 128, distance: 'Cosine' },
hnsw_config: { m: 16, ef_construct: 200 }
});
Upsert Points
// Batch upsert (always array)
db.upsert('my_collection', [
{
id: 1,
vector: new Float32Array([0.1, 0.2, ...]),
payload: { title: 'My Document', tags: ['rust', 'database'] }
},
{
id: 2,
vector: new Float32Array([0.3, 0.4, ...]),
payload: { title: 'Another Doc' }
}
]);
Search
// Basic search
const results = db.search(
'my_collection',
queryVector, // Float32Array
10 // limit
);
// Search with options (snake_case for WASM binding)
const results = db.search('my_collection', queryVector, 10, {
with_payload: true, // Include payload in results (default: true)
with_vector: false, // Include vector in results (default: false)
score_threshold: 0.5 // Optional minimum score filter
});
// Results format
for (const result of results) {
console.log(`ID: ${result.id}, Score: ${result.score}`);
console.log(`Payload: ${JSON.stringify(result.payload)}`);
}
Retrieve Points
// Get multiple points by ID
const points = db.getPoints('my_collection',
BigUint64Array.from([1n, 2n, 3n]),
true, // withPayload
false // withVector
);
Delete Points
// Delete by IDs
db.deletePoints('my_collection', BigUint64Array.from([1n, 2n, 3n]));
Graph Operations
// Add edge
db.addEdge('my_collection', 1n, 2n, 'REFERENCES', 0.9);
// Traverse graph
const result = db.traverse('my_collection', 1n, 2, ['REFERENCES']);
// Cypher query
const result = db.query(
'my_collection',
'MATCH (n:Person) WHERE n.age > $minAge RETURN n.name',
{ minAge: 25 }
);
for (const row of result.rows) {
console.log(row['n.name']);
}
Collection Management
// List all collections
const collections = db.listCollections();
// Get collection info
const info = db.getCollection('my_collection');
console.log(`Points: ${info.vectors_count}`);
// Delete collection
db.deleteCollection('my_collection');
REST Client
Creating a Client
import { LatticeClient } from 'lattice-db';
const client = new LatticeClient('http://localhost:6333', {
timeout: 30000, // Request timeout in ms
headers: { // Custom headers
'Authorization': 'Bearer token'
}
});
Collections
// Create (REST API uses snake_case)
await client.createCollection('docs', {
vectors: { size: 128, distance: 'Cosine' },
hnsw_config: { m: 16, ef_construct: 200 }
});
// Get info
const info = await client.getCollection('docs');
console.log(`Vectors: ${info.vectors_count}`);
// List all
const collections = await client.listCollections();
// Delete
await client.deleteCollection('docs');
Points
// Upsert
await client.upsert('docs', {
points: [
{ id: 1, vector: [...], payload: { title: 'Doc 1' } }
]
});
// Get
const point = await client.getPoint('docs', 1);
// Delete
await client.deletePoints('docs', { points: [1, 2, 3] });
Search
const results = await client.search('docs', {
query: [...],
limit: 10,
filter: {
must: [
{ key: 'category', match: { value: 'tech' } }
]
},
with_payload: true // Note: REST API uses snake_case
});
Scroll
// First page
let result = await client.scroll('docs', {
limit: 100,
with_payload: true // Note: REST API uses snake_case
});
// Subsequent pages
while (result.next_page_offset !== null) {
result = await client.scroll('docs', {
limit: 100,
offset: result.next_page_offset
});
// Process result.points
}
Graph
// Add edge
await client.addEdge('docs', {
sourceId: 1,
targetId: 2,
weight: 0.9,
relation: 'REFERENCES'
});
// Get neighbors
const neighbors = await client.getNeighbors('docs', 1);
// Cypher query
const result = await client.cypherQuery('docs', {
query: 'MATCH (n:Person) RETURN n.name',
parameters: {}
});
React Integration
Hook
import { useEffect, useState } from 'react';
import { LatticeDB } from 'lattice-db';
function useLatticeDB() {
const [db, setDb] = useState<LatticeDB | null>(null);
const [loading, setLoading] = useState(true);
const [error, setError] = useState<Error | null>(null);
useEffect(() => {
async function initialize() {
try {
const instance = await LatticeDB.init();
setDb(instance);
} catch (e) {
setError(e as Error);
} finally {
setLoading(false);
}
}
initialize();
}, []);
return { db, loading, error };
}
Component
function SearchComponent() {
const { db, loading } = useLatticeDB();
const [results, setResults] = useState<SearchResult[]>([]);
// Create collection on first load
useEffect(() => {
if (!db) return;
try {
db.createCollection('docs', {
vectors: { size: 128, distance: 'Cosine' }
});
} catch {
// Collection may already exist
}
}, [db]);
const handleSearch = (queryVector: Float32Array) => {
if (!db) return;
const searchResults = db.search('docs', Array.from(queryVector), 10);
setResults(searchResults);
};
if (loading) return <div>Loading...</div>;
return (
<div>
<SearchInput onSearch={handleSearch} />
<ResultsList results={results} />
</div>
);
}
Vue Integration
<script setup lang="ts">
import { ref, onMounted } from 'vue';
import { LatticeDB } from 'lattice-db';
const db = ref<LatticeDB | null>(null);
const results = ref<SearchResult[]>([]);
onMounted(async () => {
db.value = await LatticeDB.init();
// Create collection (will throw if already exists)
try {
db.value.createCollection('docs', {
vectors: { size: 128, distance: 'Cosine' }
});
} catch {
// Collection may already exist
}
});
function search(vector: Float32Array) {
if (!db.value) return;
results.value = db.value.search('docs', Array.from(vector), 10);
}
</script>
<template>
<SearchInput @search="search" />
<ResultsList :results="results" />
</template>
TypeScript Types
// Distance metrics
type DistanceMetric = 'Cosine' | 'Euclid' | 'Dot';
// Collection configuration
interface CollectionConfig {
vectors: { size: number; distance: DistanceMetric };
hnsw_config?: { m: number; m0?: number; ef_construct: number };
}
// Point to upsert
interface Point {
id: number;
vector: number[];
payload?: Record<string, unknown>;
}
// Search options
interface SearchOptions {
with_payload?: boolean;
with_vector?: boolean;
score_threshold?: number;
}
// Search result
interface SearchResult {
id: number;
score: number;
payload?: Record<string, unknown>;
vector?: number[];
}
// Graph traversal result
interface TraversalResult {
nodes: { id: number; depth: number; payload?: Record<string, unknown> }[];
edges: { from: number; to: number; relation: string; weight: number }[];
}
// Cypher query result
interface CypherResult {
columns: string[];
rows: Record<string, unknown>[];
}
Error Handling
try {
db.search('my_collection', queryVector, 10);
} catch (error) {
if (error instanceof Error) {
if (error.message.includes('not found')) {
console.log('Collection not found');
} else if (error.message.includes('dimension')) {
console.log('Vector dimension mismatch');
} else {
console.log('Error:', error.message);
}
}
}
Performance Tips
Batch Upserts
// Good: Single call for multiple points
db.upsert('my_collection', [
{ id: 1, vector: [...], payload: { title: 'Doc 1' } },
{ id: 2, vector: [...], payload: { title: 'Doc 2' } },
{ id: 3, vector: [...], payload: { title: 'Doc 3' } }
]);
// Bad: Multiple calls
for (const point of points) {
db.upsert('my_collection', [point]);
}
Web Worker
Move LatticeDB to a Web Worker to keep the main thread responsive:
// worker.ts
import { LatticeDB } from 'lattice-db';
let db: LatticeDB;
self.onmessage = async ({ data }) => {
if (data.type === 'init') {
db = await LatticeDB.init();
db.createCollection('docs', {
vectors: { size: data.vectorSize, distance: 'Cosine' }
});
self.postMessage({ type: 'ready' });
}
if (data.type === 'search') {
const results = db.search('docs', data.vector, data.limit);
self.postMessage({ type: 'results', results });
}
};
Next Steps
- REST API - HTTP endpoints
- WASM Browser Setup - Detailed browser guide
Benchmarks
LatticeDB is benchmarked against industry-standard databases: Qdrant for vector operations and Neo4j for graph queries. This chapter presents the benchmark methodology and results.
Summary
LatticeDB In-Memory wins every benchmarked operation against both Qdrant and Neo4j.
Vector Operations (1,000 points, 128 dimensions)
| Operation | LatticeDB In-Memory¹ | LatticeDB HTTP² | Qdrant HTTP |
|---|---|---|---|
| Search | 84 µs | 168 µs | 330 µs |
| Upsert | 0.76 µs | 115 µs | 287 µs |
| Retrieve | 2.2 µs | — | 306 µs |
| Scroll | 18 µs | — | 398 µs |
¹ In-memory performance applies to browser/WASM deployments (no network overhead)
² HTTP server uses simd-json, Hyper with pipelining, TCP_NODELAY
Graph Operations: LatticeDB vs Neo4j Bolt
| Operation | LatticeDB In-Memory³ | LatticeDB HTTP⁴ | Neo4j Bolt |
|---|---|---|---|
| match_all | 74 µs | 85 µs | 1,147 µs |
| match_by_label | 72 µs | 110 µs | 816 µs |
| match_with_limit | 12 µs | 72 µs | 596 µs |
| order_by | 120 µs | 173 µs | 889 µs |
| where_property | 619 µs | 965 µs | 3,136 µs |
³ In-memory applies to browser/WASM deployments (no network overhead)
⁴ HTTP server uses Hyper with pipelining, TCP_NODELAY
Benchmark Setup
Hardware
All benchmarks run on:
- CPU: Apple M1 Pro (10 cores)
- RAM: 16 GB
- Storage: NVMe SSD
Dataset
Vector Benchmarks:
- 10,000 vectors
- 128 dimensions
- Cosine distance
- Random normalized vectors
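Random unit-norm vectors of this shape can be reproduced without an external crate, e.g. with a small xorshift PRNG (a sketch of the dataset shape, not the benchmark suite's actual generator):

```rust
// Generate a unit-norm vector of the given dimension. `seed` must be nonzero.
fn normalized_vector(dim: usize, seed: &mut u64) -> Vec<f32> {
    let mut v: Vec<f32> = (0..dim)
        .map(|_| {
            // xorshift64: cheap, deterministic, good enough for test data
            *seed ^= *seed << 13;
            *seed ^= *seed >> 7;
            *seed ^= *seed << 17;
            (*seed as f64 / u64::MAX as f64) as f32 - 0.5
        })
        .collect();
    // Scale to unit L2 norm.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    for x in &mut v {
        *x /= norm;
    }
    v
}

fn main() {
    let mut seed = 42;
    let v = normalized_vector(128, &mut seed);
    let norm: f32 = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    println!("dim = {}, norm = {:.3}", v.len(), norm);
}
```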
Graph Benchmarks:
- 1,000 nodes with labels
- Varied properties (string, integer)
- Cypher queries of increasing complexity
Competitors
| Database | Version | Configuration |
|---|---|---|
| Qdrant | 1.7.x | Docker, default settings |
| Neo4j | 5.x | Docker, Community Edition |
| LatticeDB | 0.1 | Native binary, SIMD enabled |
Vector Benchmark Details
Search Benchmark
Query: Find 10 nearest neighbors from 10,000 vectors
#![allow(unused)]
fn main() {
// LatticeDB
let results = engine.search(&SearchQuery::new(query_vec).with_limit(10))?;
// Qdrant equivalent:
//   POST /collections/{name}/points/search
//   {"vector": [...], "limit": 10}
}
Results:
- LatticeDB: 106 µs (p50), 142 µs (p99)
- Qdrant: 150 µs (p50), 280 µs (p99)
Why LatticeDB is faster:
- SIMD-accelerated distance calculations (4x unrolling on NEON)
- Dense vector storage (cache-friendly)
- Thread-local scratch space (no allocation per search)
- Shortcut-enabled HNSW traversal
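The 4x unrolling mentioned above keeps four independent accumulators so the CPU can overlap multiply-adds across iterations. A portable scalar sketch of the idea (real builds use NEON/AVX2 intrinsics rather than this code):

```rust
// Dot product with 4 independent accumulators (assumes equal-length slices).
fn dot_unrolled(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    let mut acc = [0.0f32; 4];
    let chunks = a.len() / 4;
    for i in 0..chunks {
        let j = i * 4;
        acc[0] += a[j] * b[j];
        acc[1] += a[j + 1] * b[j + 1];
        acc[2] += a[j + 2] * b[j + 2];
        acc[3] += a[j + 3] * b[j + 3];
    }
    let mut sum = acc[0] + acc[1] + acc[2] + acc[3];
    // Handle the tail that doesn't fill a group of 4.
    for j in chunks * 4..a.len() {
        sum += a[j] * b[j];
    }
    sum
}

fn main() {
    let a = [1.0, 2.0, 3.0, 4.0, 5.0];
    let b = [1.0; 5];
    println!("{}", dot_unrolled(&a, &b)); // 15
}
```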
Upsert Benchmark
Operation: Insert single point with 128-dim vector
#![allow(unused)]
fn main() {
// LatticeDB
engine.upsert(point)?;
// Qdrant equivalent:
//   PUT /collections/{name}/points
//   {"points": [{"id": 1, "vector": [...]}]}
}
Results:
- LatticeDB: 0.51 µs
- Qdrant: 90 µs
177x faster due to:
- No network overhead (in-process)
- Optimized HNSW insertion with pre-computed distances
- Memory-mapped graph storage
Retrieve Benchmark
Operation: Get point by ID
#![allow(unused)]
fn main() {
// LatticeDB
let point = engine.get_point(id)?;
// Qdrant equivalent:
//   GET /collections/{name}/points/{id}
}
Results:
- LatticeDB: 2.61 µs
- Qdrant: 135 µs
52x faster due to:
- Direct HashMap lookup
- No HTTP serialization/deserialization
Scroll Benchmark
Operation: Paginate through all points (100 per page)
#![allow(unused)]
fn main() {
// LatticeDB
let result = engine.scroll(&ScrollQuery::new().with_limit(100))?;
// Qdrant equivalent:
//   POST /collections/{name}/points/scroll
//   {"limit": 100}
}
Results:
- LatticeDB: 18 µs
- Qdrant: 133 µs
7.4x faster due to:
- Sequential memory access
- No HTTP overhead
Graph Benchmark Details
match_all
MATCH (n) RETURN n LIMIT 100
- LatticeDB In-Memory: 74 µs
- LatticeDB HTTP: 85 µs
- Neo4j Bolt: 1,147 µs
- 13x faster (HTTP vs Bolt)
match_by_label
MATCH (n:Person) RETURN n LIMIT 100
- LatticeDB In-Memory: 72 µs
- LatticeDB HTTP: 110 µs
- Neo4j Bolt: 816 µs
- 7x faster (HTTP vs Bolt)
match_with_limit
MATCH (n:Person) RETURN n LIMIT 10
- LatticeDB In-Memory: 12 µs
- LatticeDB HTTP: 72 µs
- Neo4j Bolt: 596 µs
- 8x faster (HTTP vs Bolt)
order_by
MATCH (n:Person) RETURN n.name ORDER BY n.name LIMIT 50
- LatticeDB In-Memory: 120 µs
- LatticeDB HTTP: 173 µs
- Neo4j Bolt: 889 µs
- 5x faster (HTTP vs Bolt)
where_property
MATCH (n:Person) WHERE n.age > 30 RETURN n
- LatticeDB In-Memory: 619 µs
- LatticeDB HTTP: 965 µs
- Neo4j Bolt: 3,136 µs
- 3x faster (HTTP vs Bolt)
Why LatticeDB is Faster
vs Qdrant
In-Memory (Browser/WASM):
- No network overhead: LatticeDB runs in-process or in-browser
- SIMD optimizations: AVX2/NEON distance calculations
- Memory efficiency: Dense vector storage, thread-local caches
- Optimized HNSW: Shortcut search, prefetching
HTTP Mode (Server):
- Raw Hyper: Direct HTTP/1.1 with minimal abstraction
- simd-json: SIMD-accelerated JSON parsing/serialization
- TCP_NODELAY: Lower latency with Nagle's algorithm disabled
- HTTP pipelining: Concurrent request processing
- Zero-copy paths: Static string allocations, fast response building
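TCP_NODELAY is a standard socket option; with std networking it is a single call on the stream. A self-contained loopback sketch (illustrative — the server configures this through Hyper's connection setup rather than a direct std call):

```rust
use std::net::{TcpListener, TcpStream};

fn main() -> std::io::Result<()> {
    // Bind to an ephemeral port and open a loopback connection to it.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let client = TcpStream::connect(listener.local_addr()?)?;
    // Disable Nagle's algorithm: small writes are sent immediately
    // instead of being buffered and coalesced.
    client.set_nodelay(true)?;
    assert!(client.nodelay()?);
    println!("TCP_NODELAY enabled");
    Ok(())
}
```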
vs Neo4j
In-Memory Mode (embedded/browser):
- Lightweight runtime: No JVM overhead
- Efficient data structures: Rust-native HashMap, Vec
- Query compilation: Direct execution vs interpreted Cypher
- Cache-friendly layout: Sequential memory access
HTTP Mode (server deployments):
- LatticeDB HTTP uses the same optimized Cypher engine
- Neo4j Bolt is a binary protocol (more efficient than HTTP)
- Use `http_graph_profiler` to compare server deployment performance
Running Benchmarks
Prerequisites
# Install cargo-criterion
cargo install cargo-criterion
# For comparison benchmarks, start competitors
docker run -p 6333:6333 qdrant/qdrant
docker run -p 7474:7474 -p 7687:7687 neo4j
Quick Benchmarks
# Vector operations (HTTP)
cargo run -p lattice-bench --release --example http_profiler
# Graph operations (in-memory)
cargo run -p lattice-bench --release --example graph_profiler
# Graph operations (HTTP vs Bolt)
cargo run -p lattice-bench --release --example http_graph_profiler
Full Criterion Benchmarks
# Vector operations
cargo bench -p lattice-bench --bench vector_ops
# Graph operations
cargo bench -p lattice-bench --bench cypher_comparison
View Reports
open target/criterion/report/index.html
Reproducing Results
All benchmarks are reproducible:
git clone https://github.com/Avarok-Cybersecurity/lattice-db
cd lattice-db
# Run all benchmarks
cargo bench -p lattice-bench
# Results saved to:
# - target/criterion/*/report/index.html (HTML)
# - target/criterion/*/new/estimates.json (JSON)
Next Steps
- Tuning Guide - Optimize for your use case
- SIMD Optimization - Hardware acceleration details
Tuning Guide
This chapter covers performance optimization strategies for LatticeDB, including HNSW parameters, memory configuration, and query optimization.
HNSW Parameter Tuning
Core Parameters
| Parameter | Default | Range | Effect |
|---|---|---|---|
| `m` | 16 | 4-64 | Connections per node |
| `m0` | 32 | 8-128 | Layer 0 connections |
| `ef_construction` | 200 | 50-500 | Build quality |
| `ef` | 100 | 10-500 | Search quality |
Tuning for Recall
Higher recall (accuracy) requires:
- Higher `m` and `m0`
- Higher `ef_construction`
- Higher `ef` at search time
#![allow(unused)]
fn main() {
// High recall configuration
let config = HnswConfig {
m: 32,
m0: 64,
ef_construction: 400,
ef: 200,
ml: HnswConfig::recommended_ml(32),
};
}
Trade-offs:
- Higher memory usage (more edges per node)
- Slower index construction
- Slower searches (more candidates explored)
Tuning for Speed
Faster search with acceptable recall:
- Lower `m` and `m0`
- Lower `ef` at search time
#![allow(unused)]
fn main() {
// Fast search configuration
let config = HnswConfig {
m: 8,
m0: 16,
ef_construction: 100,
ef: 50,
ml: HnswConfig::recommended_ml(8),
};
}
Trade-offs:
- Lower recall (may miss some neighbors)
- Lower memory usage
Tuning for Memory
Minimize memory footprint:
#![allow(unused)]
fn main() {
// Memory-efficient configuration
let config = HnswConfig {
m: 8,
m0: 16,
ef_construction: 100,
ef: 100,
ml: HnswConfig::recommended_ml(8),
};
}
Combine with quantization:
#![allow(unused)]
fn main() {
// Use scalar quantization (4x memory reduction)
let quantized = QuantizedVector::quantize(&vector);
}
Dataset Size Guidelines
| Dataset Size | m | m0 | ef_construction | Memory/Vector |
|---|---|---|---|---|
| < 1K | 8 | 16 | 100 | ~200 bytes |
| 1K - 10K | 12 | 24 | 150 | ~300 bytes |
| 10K - 100K | 16 | 32 | 200 | ~400 bytes |
| 100K - 1M | 24 | 48 | 300 | ~600 bytes |
| > 1M | 32 | 64 | 400 | ~800 bytes |
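The Memory/Vector column follows directly from the edge counts, assuming roughly 8 bytes per stored neighbor id across layers (an approximation introduced here, not an exact accounting of LatticeDB internals; the raw f32 vector adds `dim * 4` bytes on top):

```rust
// Approximate HNSW index overhead per vector:
// (m + m0) neighbor ids at ~8 bytes each.
fn index_overhead_bytes(m: usize, m0: usize) -> usize {
    (m + m0) * 8
}

fn main() {
    // Roughly matches the ~400 bytes listed for the 10K-100K row (m = 16, m0 = 32).
    println!("{} bytes", index_overhead_bytes(16, 32));
}
```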
Search Optimization
Adjusting ef at Runtime
`ef` can be tuned per query:
#![allow(unused)]
fn main() {
// Quick search (lower recall)
let fast_results = engine.search(&query.with_ef(50))?;
// High-quality search (higher recall)
let accurate_results = engine.search(&query.with_ef(300))?;
}
Batch Queries
For multiple queries, use batch search:
#![allow(unused)]
fn main() {
// 5-10x faster than individual searches (k = 10, ef = 100)
let results = index.search_batch(&queries, 10, 100);
}
Benefits:
- Parallel processing (on native)
- Better cache utilization
- Amortized overhead
Pre-filtering
Filter before vector search when possible:
#![allow(unused)]
fn main() {
// Instead of post-filtering 10K results...
let all_results = engine.search(&query.with_limit(10000))?;
let filtered: Vec<_> = all_results
.into_iter()
.filter(|r| r.payload.get("category") == Some("tech"))
.take(10)
.collect();
// Pre-filter using graph/index, then search only among candidates (k = 10)
let tech_ids = engine.get_nodes_by_label("tech")?;
let results = engine.search_among(&query, &tech_ids, 10)?;
}
Memory Optimization
Vector Storage Options
| Option | Memory | Speed | Use Case |
|---|---|---|---|
| Dense (default) | 100% | Fastest | Small-medium datasets |
| Scalar Quantized | 25% | 95% | Large datasets |
| Product Quantized | 3-5% | 80% | Very large datasets |
| Memory-mapped | Variable | 90% | Larger than RAM |
Enabling Quantization
#![allow(unused)]
fn main() {
// Scalar quantization
use lattice_core::QuantizedVector;
let quantized_vectors: Vec<QuantizedVector> = vectors
.iter()
.map(|v| QuantizedVector::quantize(v))
.collect();
// Product quantization accelerator (m = 8 subspaces, 10,000 training vectors)
let accelerator = index.build_pq_accelerator(8, 10_000);
// Re-rank top candidates with exact distances (rerank factor 3)
let results = index.search_with_pq(&query, k, ef, &accelerator, 3);
}
Memory-Mapped Storage (Native)
#![allow(unused)]
fn main() {
// Export vectors to mmap file
index.export_vectors_mmap(Path::new("vectors.mmap"))?;
// Load with mmap (vectors stay on disk, loaded on demand)
let mmap_store = MmapVectorStore::open(Path::new("vectors.mmap"))?;
}
Storage Optimization
Choosing a Backend
| Backend | Persistence | Speed | Use Case |
|---|---|---|---|
| MemStorage | No | Fastest | Testing, ephemeral |
| DiskStorage | Yes | Fast | Server deployments |
| OpfsStorage | Yes | Medium | Browser persistent |
| IndexedDB | Yes | Slower | Browser fallback |
Page Size Tuning
For disk storage, larger pages improve sequential access:
#![allow(unused)]
fn main() {
// Default: 4KB pages
let storage = DiskStorage::with_page_size(4096);
// Larger pages for bulk operations
let storage = DiskStorage::with_page_size(64 * 1024); // 64KB
}
Query Optimization
Cypher Query Patterns
Use labels for filtering:
// Good: Uses label index
MATCH (n:Person) WHERE n.age > 25 RETURN n
// Less efficient: Full scan
MATCH (n) WHERE n.type = 'Person' AND n.age > 25 RETURN n
Limit early:
// Good: LIMIT lets the engine keep only the top 10 while sorting
MATCH (n:Person) RETURN n ORDER BY n.name LIMIT 10
// Less efficient: Sorts and returns everything
MATCH (n:Person) RETURN n ORDER BY n.name
Use parameters:
// Good: Parameterized query plan can be cached
MATCH (n:Person {name: $name}) RETURN n
// Less efficient: Re-parsed on every call
MATCH (n:Person {name: 'Alice'}) RETURN n
Hybrid Query Optimization
Vector-first for similarity:
#![allow(unused)]
fn main() {
// Good: Vector search narrows candidates
let similar = engine.search(&query.with_limit(100))?;
let expanded = expand_graph(&similar);
// Less efficient: Graph-first with large result set
let all_docs = engine.query("MATCH (n:Document) RETURN n")?;
let similar = filter_by_vector(&all_docs, &query);
}
Graph-first for structured queries:
#![allow(unused)]
fn main() {
// Good: Graph query with few results
let authors = engine.query(
"MATCH (p:Person)-[:AUTHORED]->(d:Document {topic: $topic}) RETURN p",
params
)?;
let ranked = rank_by_vector(&authors, &query);
// Less efficient: Vector search on entire corpus
let all_similar = engine.search(&query.with_limit(1000))?;
let authors = filter_by_graph(&all_similar);
}
Monitoring
Memory Statistics
#![allow(unused)]
fn main() {
let stats = engine.stats();
println!("Vectors: {} ({} bytes)", stats.vector_count, stats.vector_bytes);
println!("Index: {} bytes", stats.index_bytes);
println!("Graph: {} edges", stats.edge_count);
}
Query Performance
#![allow(unused)]
fn main() {
use std::time::Instant;
let start = Instant::now();
let results = engine.search(&query)?;
let duration = start.elapsed();
println!("Search took {:?}", duration);
println!("Returned {} results", results.len());
}
Profiling
# CPU profiling with flamegraph
cargo install flamegraph
cargo flamegraph --bin my_benchmark
# Memory profiling
cargo install heaptrack
heaptrack ./target/release/my_benchmark
Platform-Specific Tips
Native (Server)
- Enable LTO in release builds
- Use memory-mapped storage for large datasets
- Configure thread pool size based on CPU cores
- Consider NUMA awareness for multi-socket systems
# Cargo.toml
[profile.release]
lto = true
codegen-units = 1
WASM (Browser)
- Use OPFS for persistent storage
- Limit concurrent operations (single-threaded)
- Offload heavy operations to Web Workers
- Pre-load WASM module during page load
// Preload WASM
const wasmPromise = init();
// Later, when needed
await wasmPromise;
const db = await LatticeDB.create(config);
Troubleshooting
Slow Search
- Check the `ef` parameter (too low = poor recall, too high = slow)
- Verify SIMD is enabled (`cargo build --features simd`)
- Profile to identify the bottleneck (distance calc vs graph traversal)
High Memory Usage
- Consider quantization (4-32x reduction)
- Use memory-mapped storage
- Reduce the `m` and `m0` parameters
- Check for memory leaks with a profiler
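To see why quantization cuts memory by the quoted factors, consider int8 scalar quantization: each 4-byte `f32` becomes a 1-byte `i8`. This sketch is only illustrative; LatticeDB's actual quantization scheme may differ:

```rust
/// Illustrative int8 scalar quantization for values in [-1, 1]:
/// 4x smaller at the cost of bounded precision loss.
fn quantize(v: &[f32]) -> Vec<i8> {
    v.iter()
        .map(|&x| (x.clamp(-1.0, 1.0) * 127.0).round() as i8)
        .collect()
}

fn dequantize(q: &[i8]) -> Vec<f32> {
    q.iter().map(|&x| x as f32 / 127.0).collect()
}

fn main() {
    let v = vec![0.1_f32, -0.5, 0.9];
    let q = quantize(&v);
    // Round-trip error is bounded by half a quantization step (~1/254).
    for (a, b) in v.iter().zip(dequantize(&q).iter()) {
        assert!((a - b).abs() < 0.01);
    }
    // 4 bytes per f32 shrink to 1 byte per i8.
    assert_eq!(std::mem::size_of_val(&q[..]), v.len());
}
```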
Poor Recall
- Increase `ef` at search time
- Increase `ef_construction` and rebuild the index
- Verify the distance metric matches your data
- Check for data quality issues (zero vectors, outliers)
Next Steps
- Benchmarks - Detailed performance numbers
- HNSW Index - Algorithm deep dive
Browser Demo
LatticeDB runs entirely in the browser via WebAssembly. No server required.
Live Demo
Download the standalone HTML demo and open it locally:
curl -O https://raw.githubusercontent.com/Avarok-Cybersecurity/lattice-db/main/examples/browser-demo.html
open browser-demo.html # macOS
# or: xdg-open browser-demo.html # Linux
# or: start browser-demo.html # Windows
Using from CDN
Import directly from GitHub Pages:
<script type="module">
const CDN = 'https://avarok-cybersecurity.github.io/lattice-db';
const { LatticeDB } = await import(`${CDN}/js/lattice-db.min.js`);
const db = await LatticeDB.init(`${CDN}/wasm/lattice_server_bg.wasm`);
// Create a collection
db.createCollection('vectors', {
vectors: { size: 128, distance: 'Cosine' }
});
// Insert data
db.upsert('vectors', [
{ id: 1, vector: new Array(128).fill(0.1), payload: { name: 'example' } }
]);
// Search
const results = db.search('vectors', new Array(128).fill(0.1), 5);
console.log(results);
</script>
Available Bundles
| File | Format | Size | Use Case |
|---|---|---|---|
| `lattice-db.min.js` | ESM (minified) | ~15KB | Production |
| `lattice-db.esm.js` | ESM | ~25KB | Development |
| `lattice-db.js` | CommonJS | ~25KB | Node.js/bundlers |
| `lattice_server_bg.wasm` | WASM | ~500KB | Required runtime |
NPM Installation
For bundled applications, install from npm:
npm install lattice-db
import { LatticeDB } from 'lattice-db';
const db = await LatticeDB.init();
See TypeScript API for complete documentation.
Development Setup
This chapter guides you through setting up a development environment for contributing to LatticeDB.
Prerequisites
Required Tools
- Rust 1.75+: Install via rustup
- Git: For version control
- wasm-pack: For WASM builds (optional)
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install wasm-pack (for WASM development)
curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
Optional Tools
- Docker: For running comparison databases (Qdrant, Neo4j)
- mdbook: For building documentation
- criterion: For benchmarking
# Install mdbook
cargo install mdbook
# Install criterion CLI
cargo install cargo-criterion
Getting the Source
# Clone the repository
git clone https://github.com/Avarok-Cybersecurity/lattice-db.git
cd lattice-db
# Check out a feature branch
git checkout -b my-feature
Project Structure
lattice-db/
├── Cargo.toml # Workspace configuration
├── crates/
│ ├── lattice-core/ # Core library (pure logic)
│ ├── lattice-storage/ # Storage implementations
│ ├── lattice-server/ # HTTP/WASM server
│ └── lattice-bench/ # Benchmarks
├── book/ # Documentation (mdbook)
├── .github/
│ └── workflows/ # CI/CD pipelines
└── README.md
Building
Native Build
# Debug build (fast compilation)
cargo build
# Release build (optimized)
cargo build --release
# Build specific crate
cargo build -p lattice-core
WASM Build
# Build WASM package
wasm-pack build crates/lattice-server \
--target web \
--out-dir pkg \
--no-default-features \
--features wasm
# Output is in crates/lattice-server/pkg/
Build with Features
# Build with SIMD optimization
cargo build --release --features simd
# Build with memory-mapped storage
cargo build --release --features mmap
# Build with OpenAPI documentation
cargo build -p lattice-server --features openapi
Running Tests
All Tests
# Run all tests
cargo test --workspace
# Run with output
cargo test --workspace -- --nocapture
Specific Tests
# Test specific crate
cargo test -p lattice-core
# Test specific module
cargo test -p lattice-core --lib hnsw
# Test specific function
cargo test -p lattice-core test_search_returns_k_results
WASM Tests
# Requires Chrome installed
wasm-pack test --headless --chrome crates/lattice-core
# Run in Firefox
wasm-pack test --headless --firefox crates/lattice-core
Running Benchmarks
Quick Benchmark
# Fast iteration benchmark
cargo run -p lattice-bench --release --example quick_vector_bench
Full Criterion Benchmarks
# All benchmarks
cargo bench -p lattice-bench
# Specific benchmark
cargo bench -p lattice-bench --bench vector_ops
# With comparison to baseline
cargo bench -p lattice-bench -- --baseline main
View Reports
# Open HTML report
open target/criterion/report/index.html
Documentation
Build API Docs
# Generate rustdoc
cargo doc --workspace --no-deps
# Open in browser
cargo doc --workspace --no-deps --open
Build the Book
# Build mdbook
mdbook build book
# Serve locally with hot reload
mdbook serve book --open
Code Style
Formatting
# Format all code
cargo fmt --all
# Check formatting
cargo fmt --all -- --check
Linting
# Run clippy
cargo clippy --workspace -- -D warnings
# With all targets
cargo clippy --workspace --all-targets -- -D warnings
Pre-commit Hook
# Install pre-commit hook
cat > .git/hooks/pre-commit << 'EOF'
#!/bin/sh
cargo fmt --all -- --check
cargo clippy --workspace -- -D warnings
cargo test --workspace
EOF
chmod +x .git/hooks/pre-commit
IDE Setup
VS Code
Recommended extensions:
- rust-analyzer: Rust language support
- CodeLLDB: Debugging
- Even Better TOML: TOML syntax
- Error Lens: Inline error display
Settings (.vscode/settings.json):
{
"rust-analyzer.check.command": "clippy",
"rust-analyzer.cargo.features": ["simd"],
"[rust]": {
"editor.formatOnSave": true
}
}
JetBrains (RustRover/CLion)
- Install Rust plugin
- Enable “Run rustfmt on save”
- Configure Clippy as external linter
Debugging
Native
# Debug build
cargo build
# Run with debugger (VS Code/CodeLLDB)
# Or use lldb/gdb directly
lldb target/debug/lattice-server
WASM
# Build with debug info
wasm-pack build --dev crates/lattice-server --target web
# Use browser DevTools:
# - Sources tab for breakpoints
# - Console for WASM errors
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `RUST_LOG` | Log level | `info` |
| `RUST_BACKTRACE` | Enable backtraces | `0` |
| `LATTICE_PORT` | Server port | `6333` |
# Example
RUST_LOG=debug cargo run -p lattice-server
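Resolving a variable like `LATTICE_PORT` with its documented default can be sketched as follows (the `port_from` helper is hypothetical, shown only to make the fallback behavior concrete):

```rust
/// Hypothetical resolution of LATTICE_PORT: parse if present and
/// valid, otherwise fall back to the documented default of 6333.
fn port_from(raw: Option<&str>) -> u16 {
    raw.and_then(|v| v.parse().ok()).unwrap_or(6333)
}

fn main() {
    // In the server this would be fed from std::env::var("LATTICE_PORT").
    assert_eq!(port_from(None), 6333);          // unset -> default
    assert_eq!(port_from(Some("7000")), 7000);  // valid override
    assert_eq!(port_from(Some("nope")), 6333);  // unparsable -> default
}
```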
Docker Development
Run Comparison Databases
# Qdrant
docker run -p 6333:6333 qdrant/qdrant
# Neo4j
docker run -p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
neo4j:community
Build LatticeDB Image
docker build -t lattice-db .
docker run -p 6333:6333 lattice-db
Troubleshooting
Build Failures
# Clean and rebuild
cargo clean
cargo build
# Update dependencies
cargo update
Test Failures
# Run with backtrace
RUST_BACKTRACE=1 cargo test
# Run single-threaded (for debugging)
cargo test -- --test-threads=1
WASM Issues
# Ensure wasm32 target is installed
rustup target add wasm32-unknown-unknown
# Check wasm-pack version
wasm-pack --version # Should be 0.12+
Next Steps
- Testing - Testing guidelines and patterns
- Architecture - Understand the codebase
Testing
This chapter covers testing guidelines, patterns, and best practices for contributing to LatticeDB.
Test Organization
Test Location
| Test Type | Location | Command |
|---|---|---|
| Unit tests | `src/*.rs` (inline `#[cfg(test)]`) | `cargo test` |
| Integration tests | `tests/*.rs` | `cargo test --test <name>` |
| WASM tests | `src/*.rs` with `wasm_bindgen_test` | `wasm-pack test` |
| Benchmarks | `benches/*.rs` | `cargo bench` |
Module Structure
#![allow(unused)]
fn main() {
// src/hnsw.rs
pub struct HnswIndex { ... }
impl HnswIndex {
pub fn search(&self, query: &[f32], k: usize, ef: usize) -> Vec<SearchResult> {
// Implementation
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_search_returns_k_results() {
// Test implementation
}
}
}
Writing Unit Tests
Basic Test Pattern
#![allow(unused)]
fn main() {
#[test]
fn test_function_name_describes_behavior() {
// Arrange
let index = HnswIndex::new(test_config(), Distance::Cosine);
let point = Point::new_vector(1, vec![0.1, 0.2, 0.3]);
// Act
index.insert(&point);
let results = index.search(&[0.1, 0.2, 0.3], 10, 100);
// Assert
assert_eq!(results.len(), 1);
assert_eq!(results[0].id, 1);
}
}
Test Naming Convention
#![allow(unused)]
fn main() {
// Good: Describes behavior
#[test]
fn test_search_returns_empty_for_empty_index() { ... }
#[test]
fn test_insert_updates_existing_point_with_same_id() { ... }
#[test]
fn test_delete_returns_false_for_nonexistent_point() { ... }
// Bad: Vague names
#[test]
fn test_search() { ... }
#[test]
fn test_insert() { ... }
}
Test Helpers
#![allow(unused)]
fn main() {
// Common test configuration
fn test_config() -> HnswConfig {
HnswConfig {
m: 16,
m0: 32,
ml: HnswConfig::recommended_ml(16),
ef: 100,
ef_construction: 200,
}
}
// Random vector generation (deterministic)
fn random_vector(dim: usize, seed: u64) -> Vector {
let mut rng = seed;
(0..dim)
.map(|_| {
rng = rng.wrapping_mul(6364136223846793005).wrapping_add(1);
(rng as f64 / u64::MAX as f64) as f32
})
.collect()
}
// Approximate equality for floats
fn approx_eq(a: f32, b: f32, epsilon: f32) -> bool {
(a - b).abs() < epsilon
}
}
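The value of the seeded LCG in `random_vector` is reproducibility: the same seed always yields the same vector, so test failures can be replayed exactly. A standalone check (with `Vector` spelled as `Vec<f32>` so the sketch compiles on its own):

```rust
/// Same LCG-based helper as above, written against plain `Vec<f32>`.
fn random_vector(dim: usize, seed: u64) -> Vec<f32> {
    let mut rng = seed;
    (0..dim)
        .map(|_| {
            rng = rng.wrapping_mul(6364136223846793005).wrapping_add(1);
            (rng as f64 / u64::MAX as f64) as f32
        })
        .collect()
}

fn main() {
    // Same seed -> identical vector: tests stay deterministic.
    assert_eq!(random_vector(8, 42), random_vector(8, 42));
    // Different seeds diverge.
    assert_ne!(random_vector(8, 42), random_vector(8, 43));
    // All components land in [0, 1].
    assert!(random_vector(128, 7).iter().all(|&x| (0.0..=1.0).contains(&x)));
}
```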
WASM Tests
Configuration
#![allow(unused)]
fn main() {
// At the top of the test module
#[cfg(all(target_arch = "wasm32", test))]
wasm_bindgen_test::wasm_bindgen_test_configure!(run_in_browser);
}
Test Attribute
#![allow(unused)]
fn main() {
#[cfg(target_arch = "wasm32")]
use wasm_bindgen_test::wasm_bindgen_test as test;
#[test]
fn test_works_on_both_native_and_wasm() {
// This test runs on both platforms
}
}
Async WASM Tests
#![allow(unused)]
fn main() {
#[cfg(target_arch = "wasm32")]
use wasm_bindgen_test::wasm_bindgen_test;
#[wasm_bindgen_test]
async fn test_async_storage_operation() {
let storage = OpfsStorage::new().await.unwrap();
storage.write_page(0, b"hello").await.unwrap();
let page = storage.read_page(0).await.unwrap();
assert_eq!(page, b"hello");
}
}
Running WASM Tests
# Chrome (headless)
wasm-pack test --headless --chrome crates/lattice-core
# Firefox
wasm-pack test --headless --firefox crates/lattice-core
# With output
wasm-pack test --headless --chrome crates/lattice-core -- --nocapture
Testing Async Code
Basic Async Test
#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_async_operation() {
let storage = MemStorage::new();
storage.write_page(0, b"test").await.unwrap();
let data = storage.read_page(0).await.unwrap();
assert_eq!(data, b"test");
}
}
Testing with Timeouts
#![allow(unused)]
fn main() {
#[tokio::test]
async fn test_operation_completes_quickly() {
let result = tokio::time::timeout(
Duration::from_secs(5),
expensive_operation()
).await;
assert!(result.is_ok(), "Operation timed out");
}
}
Property-Based Testing
Using proptest
#![allow(unused)]
fn main() {
use proptest::prelude::*;
proptest! {
#[test]
fn test_quantize_dequantize_preserves_order(
a in prop::collection::vec(-1.0f32..1.0, 1..128),
b in prop::collection::vec(-1.0f32..1.0, 1..128),
) {
// Ensure same length
let len = a.len().min(b.len());
let a = &a[..len];
let b = &b[..len];
let qa = QuantizedVector::quantize(a);
let qb = QuantizedVector::quantize(b);
let original_dist = cosine_distance(a, b);
let quantized_dist = qa.cosine_distance_asymmetric(b);
// Quantization should preserve relative ordering
// (within some error margin)
prop_assert!((original_dist - quantized_dist).abs() < 0.2);
}
}
}
Integration Tests
Test File Structure
#![allow(unused)]
fn main() {
// tests/integration_test.rs
use lattice_core::*;
use lattice_storage::MemStorage;
#[tokio::test]
async fn test_full_workflow() {
// Create collection
let config = CollectionConfig::new(
"test_collection",
VectorConfig::new(128, Distance::Cosine),
HnswConfig::default(),
);
let storage = MemStorage::new();
let mut engine = CollectionEngine::new(config, storage).unwrap();
// Insert points
for i in 0..100 {
let point = Point::new_vector(i, random_vector(128, i));
engine.upsert(point).unwrap();
}
// Search
let query = random_vector(128, 999);
let results = engine.search(&SearchQuery::new(query).with_limit(10)).unwrap();
assert_eq!(results.len(), 10);
}
}
Benchmarks
Criterion Benchmark
#![allow(unused)]
fn main() {
// benches/search_bench.rs
use criterion::{black_box, criterion_group, criterion_main, Criterion};
fn bench_search(c: &mut Criterion) {
// Setup
let index = create_test_index(10000);
let query = random_vector(128, 0);
c.bench_function("search_10k", |b| {
b.iter(|| {
black_box(index.search(&query, 10, 100))
})
});
}
criterion_group!(benches, bench_search);
criterion_main!(benches);
}
Running Benchmarks
# Run all benchmarks
cargo bench -p lattice-bench
# Run specific benchmark
cargo bench -p lattice-bench -- search
# Compare to baseline
cargo bench -p lattice-bench -- --baseline main
Testing Best Practices
1. Test Edge Cases
#![allow(unused)]
fn main() {
#[test]
fn test_search_empty_index() {
let index = HnswIndex::new(config, Distance::Cosine);
let results = index.search(&[0.1, 0.2], 10, 100);
assert!(results.is_empty());
}
#[test]
fn test_search_k_greater_than_index_size() {
let mut index = HnswIndex::new(config, Distance::Cosine);
index.insert(&Point::new_vector(1, vec![0.1, 0.2]));
let results = index.search(&[0.1, 0.2], 100, 100);
assert_eq!(results.len(), 1); // Returns available, not k
}
#[test]
fn test_quantize_zero_vector() {
let quantized = QuantizedVector::quantize(&[0.0, 0.0, 0.0]);
assert!(!quantized.is_empty());
}
}
2. Test Error Conditions
#![allow(unused)]
fn main() {
#[test]
fn test_delete_nonexistent_returns_false() {
let mut index = HnswIndex::new(config, Distance::Cosine);
assert!(!index.delete(999));
}
#[tokio::test]
async fn test_storage_error_on_missing_page() {
let storage = MemStorage::new();
let result = storage.read_page(999).await;
assert!(matches!(result, Err(StorageError::PageNotFound { .. })));
}
}
}
3. Use Descriptive Assertions
#![allow(unused)]
fn main() {
// Good: Clear failure message
assert_eq!(
results.len(),
10,
"Expected 10 results but got {}, query: {:?}",
results.len(),
query
);
// Bad: Unhelpful failure
assert!(results.len() == 10);
}
4. Isolate Tests
#![allow(unused)]
fn main() {
// Good: Each test is independent
#[test]
fn test_insert_single() {
let mut index = HnswIndex::new(config, Distance::Cosine);
index.insert(&point);
assert_eq!(index.len(), 1);
}
// Bad: Tests depend on shared state
static mut SHARED_INDEX: Option<HnswIndex> = None;
#[test]
fn test_1_insert() {
unsafe { SHARED_INDEX = Some(HnswIndex::new(...)); }
}
#[test]
fn test_2_search() {
// Fails if test_1 didn't run first!
}
}
5. Test Recall Statistically
#![allow(unused)]
fn main() {
#[test]
fn test_recall_at_least_90_percent() {
let index = create_index_with_1000_points();
let distance = DistanceCalculator::new(Distance::Cosine);
let mut total_recall = 0.0;
let num_queries = 100;
for q in 0..num_queries {
let query = random_vector(128, 10000 + q);
// Ground truth via brute force
let gt = brute_force_search(&index, &query, 10);
// HNSW search
let results = index.search(&query, 10, 100);
// Calculate recall
let hits = gt.iter()
.filter(|&id| results.iter().any(|r| r.id == *id))
.count();
total_recall += hits as f64 / 10.0;
}
let avg_recall = total_recall / num_queries as f64;
assert!(
avg_recall >= 0.9,
"Average recall {:.3} below 90% threshold",
avg_recall
);
}
}
Coverage
Generate Coverage Report
# Install grcov
cargo install grcov
# Run tests with coverage
CARGO_INCREMENTAL=0 RUSTFLAGS='-Cinstrument-coverage' \
LLVM_PROFILE_FILE='cargo-test-%p-%m.profraw' \
cargo test --workspace
# Generate HTML report
grcov . --binary-path ./target/debug/deps/ -s . -t html --branch --ignore-not-existing -o ./target/coverage/
# Open report
open target/coverage/index.html
Next Steps
- Development Setup - Environment configuration
- Architecture - Understanding the codebase