P2P and Multi-Network Architecture

1. Overview

CyberEco is a human-centered digital ecosystem where 30+ interconnected applications share identity, permissions, financial data, groups, and notifications through a unified data layer. Today that data layer runs on Firebase via the FirebaseStorageAdapter. Tomorrow it must run on any physical network -- internet, WiFi Direct, Bluetooth, LoRa, wired LAN -- with no required central server.

This document defines the architecture that makes that transition possible. The key insight is that CyberEco's existing StorageAdapter abstraction already decouples business logic from storage infrastructure. We extend that same principle to networking: just as StorageAdapter abstracts where data is stored, a new TransportAdapter abstracts how data moves between devices.

The end state is a system where:

  • A family in a rural area with no internet can sync expenses via Bluetooth mesh.
  • A neighborhood governance vote runs over a local WiFi Direct network during an outage.
  • A global community of contributors replicates financial records across continents via libp2p.
  • All of these scenarios use the exact same DataLayerService, PermissionService, SyncService, and domain services that exist in @cyber-eco/services today -- unchanged.

2. Design Principles

Transport Agnosticism

Just as StorageAdapter abstracts storage backends, TransportAdapter abstracts networking. Domain services never know whether data arrived via WebRTC, Bluetooth, LoRa, or a USB cable. The transport layer is a pluggable concern, injected at runtime.

Offline-First

Devices work locally at all times. Every read hits local storage first. Every write succeeds locally and replicates when connectivity allows. There is no distinction between "online mode" and "offline mode" -- there is only "local-first mode with optional replication."

Progressive Connectivity

A device starts with whatever transports are physically available. A phone in airplane mode uses only its local store. Turn on Bluetooth and it joins the nearest mesh. Connect to WiFi and it discovers more peers. Reach the internet and it joins the global network. Each layer adds reach but none is required.

Data Sovereignty

Users control where their data lives physically. Data can be pinned to specific devices, replicated only within a geographic region, or shared globally -- at the user's discretion. No central authority decides where bits are stored.

Resilient by Default

Network partitions are normal, not exceptional. The system is designed for the case where peers disconnect, networks fragment, and devices go offline for days. CRDTs ensure that when partitions heal, data converges automatically without human intervention.


3. Transport Abstraction Layer

3.1 TransportAdapter Interface

The TransportAdapter is the networking equivalent of StorageAdapter. Every physical network -- Bluetooth, WiFi Direct, LoRa, internet -- implements this single interface.

type TransportType = 'internet' | 'bluetooth' | 'wifi-direct' | 'lora' | 'wired';

interface TransportCapabilities {
  maxMessageSize: number;       // bytes
  averageBandwidth: number;     // bytes per second
  averageLatency: number;       // milliseconds
  supportsStreaming: boolean;
  supportsBroadcast: boolean;
  estimatedRange: string;       // human-readable, e.g. "~100m", "global"
  powerConsumption: 'low' | 'medium' | 'high';
}

interface PeerInfo {
  peerId: string;               // Base58(SHA-256(publicKey))
  displayName?: string;
  transports: TransportType[];  // which transports this peer is reachable on
  lastSeen: number;             // unix timestamp
  reputation: number;           // 0.0 - 1.0
  metadata?: Record<string, unknown>;
}

type Priority = 'critical' | 'high' | 'normal' | 'low' | 'bulk';

interface NetworkMessage {
  id: string;                   // unique message ID (UUID v4)
  type: string;                 // message type for routing
  payload: Uint8Array;          // CBOR-encoded content
  sender: string;               // sender peerId
  recipient?: string;           // target peerId (undefined = broadcast)
  timestamp: number;            // unix timestamp
  ttl: number;                  // time-to-live in seconds
  hopCount: number;             // incremented at each relay
  maxHops: number;              // maximum allowed hops
  signature: Uint8Array;        // Ed25519 signature over (type + payload + sender + timestamp)
}

type Unsubscribe = () => void;

interface TransportAdapter {
  // Connection lifecycle
  start(): Promise<void>;
  stop(): Promise<void>;
  isActive(): boolean;

  // Peer management
  getPeers(): PeerInfo[];
  connectToPeer(peerId: string): Promise<void>;
  disconnectFromPeer(peerId: string): Promise<void>;

  // Messaging
  send(peerId: string, message: NetworkMessage): Promise<void>;
  broadcast(message: NetworkMessage): Promise<void>;
  onMessage(callback: (peerId: string, message: NetworkMessage) => void): Unsubscribe;

  // Discovery
  onPeerDiscovered(callback: (peer: PeerInfo) => void): Unsubscribe;
  onPeerLost(callback: (peerId: string) => void): Unsubscribe;

  // Metadata
  getTransportType(): TransportType;
  getCapabilities(): TransportCapabilities;
}

3.2 Transport Implementations

InternetTransport (WebRTC + libp2p)

The internet transport provides global reach using battle-tested peer-to-peer protocols.

Protocol stack:

  • libp2p as the core networking framework
  • Kademlia DHT for distributed peer discovery (no central directory)
  • WebRTC data channels for browser-to-browser communication
  • TCP and QUIC for server-side and native nodes
  • Circuit relay v2 for NAT traversal when direct connections fail
  • GossipSub for pub/sub topic-based messaging

Characteristics:

| Property | Value |
| --- | --- |
| Bandwidth | High (limited by user's connection) |
| Latency | Low to medium (50-200ms typical) |
| Range | Global |
| Power | Medium to high |
| Best for | Bulk replication, global sync, large file transfer |

Discovery: Peers bootstrap from a configurable set of DNS seed nodes, then discover additional peers via Kademlia DHT walks. CyberEco-specific peers are found by advertising and looking up a well-known rendezvous key in the DHT (/cybereco/peers/v1).

NAT traversal: When two peers cannot connect directly (symmetric NAT, corporate firewalls), they use circuit relay nodes operated by CyberEco community members. Relay operators earn CYE tokens for bandwidth contributed.

BluetoothTransport (BLE Mesh)

Bluetooth Low Energy provides always-on, low-power device-to-device communication.

Protocol stack:

  • Bluetooth Low Energy 5.0+ as the physical layer
  • Bluetooth Mesh Profile for multi-hop mesh networking
  • GATT services for structured data exchange
  • BLE advertising for peer discovery (custom manufacturer data field)

Characteristics:

| Property | Value |
| --- | --- |
| Bandwidth | Low (~1 Mbps theoretical, ~100 kbps practical for mesh) |
| Latency | Low (10-50ms per hop) |
| Range | ~100m line-of-sight, reduced indoors |
| Power | Very low |
| Best for | Identity sync, permission checks, notifications, presence |

Discovery: Devices continuously advertise a CyberEco-specific BLE service UUID. Scanning devices detect advertisements and initiate GATT connections. The mesh profile enables multi-hop message relay, extending effective range beyond direct Bluetooth reach.

Payload strategy: Given the low bandwidth, BLE carries only compact payloads: identity tokens, permission grants, notification headers, presence heartbeats. Bulk data is deferred to WiFi Direct or internet transports.

WiFiDirectTransport

WiFi Direct provides high-bandwidth, low-latency local communication without requiring an access point.

Protocol stack:

  • Wi-Fi Direct (Wi-Fi P2P) for direct device connections
  • mDNS / DNS-SD for zero-configuration service discovery
  • TCP for reliable data transfer
  • TLS 1.3 over the TCP connection for encryption

Characteristics:

| Property | Value |
| --- | --- |
| Bandwidth | High (~250 Mbps) |
| Latency | Very low (<10ms) |
| Range | ~200m |
| Power | Medium |
| Best for | Bulk data sync, file sharing, local group operations, database replication |

Discovery: Devices announce a _cybereco._tcp.local service via mDNS. The TXT record includes the peer's public key fingerprint, supported protocol version, and available collections. Other devices on the same WiFi Direct group discover these announcements automatically.

Group formation: When multiple CyberEco devices are in range, one is elected as the WiFi Direct group owner (preferring the device with highest uptime and battery). The group owner acts as a local relay, enabling all devices in the group to communicate without individual pairings.

LoRaTransport (Meshtastic-inspired)

LoRa provides extreme-range, low-power communication for scenarios where no other network exists.

Protocol stack:

  • LoRa (Long Range) physical layer at 868 MHz (EU) or 915 MHz (US)
  • Meshtastic-compatible mesh protocol for multi-hop routing
  • Store-and-forward at each hop for reliability
  • Time-slotted ALOHA for channel access

Characteristics:

| Property | Value |
| --- | --- |
| Bandwidth | Very low (~0.3-50 kbps depending on spreading factor) |
| Latency | High (seconds to minutes for multi-hop) |
| Range | 1-15 km per hop, effectively unlimited with mesh |
| Power | Very low |
| Best for | Presence beacons, emergency notifications, remote area connectivity |

Discovery: Nodes broadcast periodic beacon frames containing their peerId, supported message types, and a compact bloom filter of collections they replicate. Other nodes within radio range receive these beacons and add the sender to their peer table.

Payload strategy: LoRa's extreme bandwidth constraints require aggressive message compression. Messages are CBOR-encoded with field-level compression. Typical payloads: presence heartbeat (32 bytes), notification header (64 bytes), permission grant (96 bytes). Bulk data is never sent over LoRa -- only metadata and pointers.

Multi-hop routing: Messages carry a hop limit (maxHops, default: 5 for LoRa) and a hop counter (hopCount). Each relay node increments hopCount and discards the message once hopCount reaches maxHops. Nodes maintain a small routing table of recently seen peers and their directions (which neighbor delivered their messages). This enables approximate geographic routing without GPS.

WiredTransport (LAN)

Wired Ethernet provides the highest bandwidth and lowest latency for fixed installations.

Protocol stack:

  • UDP multicast (224.0.0.251, port 5353) for discovery (mDNS)
  • TCP for reliable data transfer
  • TLS 1.3 for encryption

Characteristics:

| Property | Value |
| --- | --- |
| Bandwidth | Very high (1 Gbps+) |
| Latency | Very low (<1ms) |
| Range | Local network |
| Power | N/A (wall-powered) |
| Best for | Home servers, office deployments, server clusters, NAS backup |

Discovery: Identical to WiFi Direct (mDNS/DNS-SD), but on the wired network interface. The _cybereco._tcp.local service announcement works across both wired and wireless mDNS.

Use cases: A Raspberry Pi running as a home CyberEco node, pinning the family's data locally. An office server acting as a high-availability cache for the team's shared collections. A cluster of machines providing redundant storage for a community organization.

3.3 Multi-Transport Manager

The MultiTransportManager aggregates all active transports and provides unified peer management and intelligent message routing.

class MultiTransportManager {
  private transports: Map<TransportType, TransportAdapter>;
  private peerTransportMap: Map<string, Set<TransportType>>;  // peerId -> available transports

  // Transport lifecycle
  addTransport(transport: TransportAdapter): void;
  removeTransport(type: TransportType): void;
  getActiveTransports(): TransportAdapter[];

  // Unified peer view (merges peers across all transports)
  getAllPeers(): PeerInfo[];
  getPeerTransports(peerId: string): TransportType[];

  // Smart routing
  send(peerId: string, message: NetworkMessage, priority?: Priority): Promise<void>;
  broadcast(message: NetworkMessage): Promise<void>;

  // Events (aggregated from all transports)
  onMessage(callback: (peerId: string, message: NetworkMessage) => void): Unsubscribe;
  onPeerDiscovered(callback: (peer: PeerInfo) => void): Unsubscribe;
  onPeerLost(callback: (peerId: string) => void): Unsubscribe;
}

Routing logic:

The manager selects the optimal transport for each message based on payload size, priority, and available transports to the target peer:

| Message Size | Priority | Preferred Transport Order |
| --- | --- | --- |
| < 256 bytes | Critical | LoRa > Bluetooth > WiFi Direct > Wired > Internet |
| < 256 bytes | Normal | Bluetooth > WiFi Direct > Wired > Internet > LoRa |
| 256B - 64KB | Any | WiFi Direct > Bluetooth > Wired > Internet |
| 64KB - 1MB | Any | WiFi Direct > Wired > Internet |
| > 1MB | Any | Wired > WiFi Direct > Internet |
| Any | Bulk (background) | Internet > Wired > WiFi Direct |

Fallback behavior: If the preferred transport fails, the manager attempts the next transport in order. If all transports fail, the message is queued in the store-and-forward buffer for later delivery.

Deduplication: Since a peer may be reachable on multiple transports simultaneously, the manager deduplicates both peers (same peerId seen on Bluetooth and WiFi Direct = one logical peer) and messages (same message ID received on two transports = delivered once).
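
The routing table and fallback rule above can be sketched as a pure function. This is an illustrative sketch, not the MultiTransportManager implementation: the names preferredOrder and routeCandidates are hypothetical, the size boundaries are assumed to be inclusive at 64 KB and 1 MB, and 'high' and 'low' priorities are assumed to follow the "Normal" row because the table does not list them.

```typescript
type TransportType = 'internet' | 'bluetooth' | 'wifi-direct' | 'lora' | 'wired';
type Priority = 'critical' | 'high' | 'normal' | 'low' | 'bulk';

const KB = 1024;
const MB = 1024 * KB;

// Preference order taken from the routing table for a payload size and priority.
function preferredOrder(sizeBytes: number, priority: Priority): TransportType[] {
  if (priority === 'bulk') return ['internet', 'wired', 'wifi-direct'];
  if (sizeBytes < 256) {
    return priority === 'critical'
      ? ['lora', 'bluetooth', 'wifi-direct', 'wired', 'internet']
      // Assumption: every non-critical priority uses the "Normal" row.
      : ['bluetooth', 'wifi-direct', 'wired', 'internet', 'lora'];
  }
  if (sizeBytes <= 64 * KB) return ['wifi-direct', 'bluetooth', 'wired', 'internet'];
  if (sizeBytes <= MB) return ['wifi-direct', 'wired', 'internet'];
  return ['wired', 'wifi-direct', 'internet'];
}

// Transports to attempt for one peer, in fallback order.
// An empty result means the message goes to the store-and-forward buffer.
function routeCandidates(
  sizeBytes: number,
  priority: Priority,
  reachable: Set<TransportType>
): TransportType[] {
  return preferredOrder(sizeBytes, priority).filter((t) => reachable.has(t));
}
```

For example, a 100-byte critical message to a peer reachable over Bluetooth and the internet would be attempted on Bluetooth first and the internet second.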


4. Peer Discovery

4.1 Discovery Protocols

Each transport uses the discovery mechanism native to its physical layer:

| Transport | Discovery Method | Range | Refresh Interval |
| --- | --- | --- | --- |
| Internet | Kademlia DHT, DNS seed nodes, relay server registry | Global | Continuous (DHT walk every 30s) |
| Bluetooth | BLE advertising + active scanning (CyberEco service UUID) | ~100m | Advertisement every 1s, scan every 5s |
| WiFi Direct | mDNS / DNS-SD (_cybereco._tcp.local) | ~200m | Announcement every 10s |
| LoRa | Beacon frames (32-byte compressed peer info) | 1-15 km | Beacon every 30s |
| Wired | mDNS / DNS-SD (_cybereco._tcp.local) | LAN | Announcement every 10s |

Unified peer table: The MultiTransportManager maintains a single peer table that merges discoveries from all transports. A peer discovered on Bluetooth and later also found on WiFi Direct appears as one entry with two available transports.

4.2 Peer Identity

Every CyberEco peer has a persistent cryptographic identity independent of any transport or network:

  • Keypair: Ed25519 (256-bit) -- generated once on first launch, stored in device secure enclave or keychain
  • PeerId: Base58(SHA-256(publicKey)) -- deterministic, at most 44 characters (a 32-byte digest), human-readable-ish
  • Cross-transport identity: The same keypair (and therefore same PeerId) is used across all transports. A phone discovered on Bluetooth with PeerId Qm7xK9... is the same peer when it appears on WiFi Direct.
  • CyberEco identity link (optional): Users may link their PeerId to their CyberEco userId. This is opt-in and not required for P2P operation. Linking enables features like: "show display name instead of PeerId," "apply user's permission rules to P2P data requests," and "earn CYE tokens for relay services."
  • Multiple devices: A single CyberEco user may have multiple devices, each with its own PeerId. Device linking is managed through the user's profile (signed declaration: "these PeerIds belong to me").
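
The Base58 step of the PeerId derivation can be sketched as follows. This is a minimal sketch that assumes the Bitcoin Base58 alphabet and takes the SHA-256 digest of the public key as an input computed elsewhere; base58Encode and peerIdFromDigest are hypothetical helper names, not part of @cyber-eco/services.

```typescript
const BASE58_ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

// Encodes bytes as Base58 by repeated base conversion (256 -> 58).
function base58Encode(bytes: Uint8Array): string {
  const digits: number[] = []; // base-58 digits, least significant first
  for (let b = 0; b < bytes.length; b++) {
    let carry = bytes[b];
    for (let i = 0; i < digits.length; i++) {
      carry += digits[i] * 256;
      digits[i] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  // Each leading zero byte is represented by the first alphabet character.
  let out = '';
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) out += BASE58_ALPHABET[0];
  for (let i = digits.length - 1; i >= 0; i--) out += BASE58_ALPHABET[digits[i]];
  return out;
}

// Derives a PeerId from an already computed SHA-256 digest of the public key.
function peerIdFromDigest(sha256Digest: Uint8Array): string {
  if (sha256Digest.length !== 32) {
    throw new Error('expected a 32-byte SHA-256 digest');
  }
  return base58Encode(sha256Digest);
}
```

Because the input is always 32 bytes, the result never exceeds 44 characters, and the same keypair yields the same PeerId on every transport.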

4.3 Peer Reputation

Each node maintains a local reputation score for every peer it interacts with. Reputation is not global consensus -- it is each node's subjective assessment based on direct experience.

Tracked metrics:

| Metric | Weight | Description |
| --- | --- | --- |
| Uptime reliability | 0.25 | How often the peer is available when expected |
| Response latency | 0.15 | Average time to respond to requests |
| Data integrity | 0.30 | Percentage of messages with valid signatures and correct CIDs |
| Relay willingness | 0.15 | How often the peer relays messages for others |
| Storage contribution | 0.15 | How much data the peer pins for the network |

Reputation uses:

  • Routing decisions: Higher-reputation peers are preferred as relay nodes
  • Replication partner selection: Data is preferentially replicated to reliable peers
  • CYE reward eligibility: Minimum reputation threshold required to earn tokens for network contribution
  • Trust tiers: Peers below 0.3 reputation are deprioritized; peers below 0.1 are disconnected
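
A weighted score and the trust tiers above might be computed as in the sketch below. It assumes each metric has already been normalized to the range 0.0 - 1.0 (for latency, 1.0 meaning fastest); the document does not specify that normalization, and the names PeerMetrics, reputationScore, and trustTier are hypothetical.

```typescript
interface PeerMetrics {
  uptimeReliability: number;   // all fields assumed normalized to 0.0 - 1.0
  responseLatency: number;     // assumption: 1.0 = fastest responses
  dataIntegrity: number;
  relayWillingness: number;
  storageContribution: number;
}

// Weights from the tracked-metrics table; they sum to 1.0.
const WEIGHTS: PeerMetrics = {
  uptimeReliability: 0.25,
  responseLatency: 0.15,
  dataIntegrity: 0.30,
  relayWillingness: 0.15,
  storageContribution: 0.15,
};

function reputationScore(m: PeerMetrics): number {
  return (
    m.uptimeReliability * WEIGHTS.uptimeReliability +
    m.responseLatency * WEIGHTS.responseLatency +
    m.dataIntegrity * WEIGHTS.dataIntegrity +
    m.relayWillingness * WEIGHTS.relayWillingness +
    m.storageContribution * WEIGHTS.storageContribution
  );
}

type TrustTier = 'disconnect' | 'deprioritize' | 'normal';

// Below 0.1 the peer is disconnected; below 0.3 it is deprioritized.
function trustTier(score: number): TrustTier {
  if (score < 0.1) return 'disconnect';
  if (score < 0.3) return 'deprioritize';
  return 'normal';
}
```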

5. Data Replication

5.1 Content-Addressed Storage

Every document in the P2P network is identified by its content, not by a mutable path.

  • Content Identifier (CID): CIDv1(codec=dag-cbor, hash=sha2-256, base=base32)
  • Encoding: Documents are serialized as CBOR (Concise Binary Object Representation) before hashing
  • Deterministic serialization: CBOR keys are sorted lexicographically to ensure identical documents produce identical CIDs regardless of field insertion order

Benefits:

  • Deduplication: Two peers holding the same document can verify this by comparing CIDs without transferring the document
  • Integrity verification: A received document's CID is recomputed and compared to the expected CID; any tampering is detected
  • Cache-friendly: CIDs are immutable -- a CID always points to exactly the same content, making aggressive caching safe
  • Link structure: Documents can reference other documents by CID, forming a verifiable DAG (Directed Acyclic Graph)

Mapping to collections: The CyberEco collection/id model maps to content-addressed storage as follows:

Collection: "users", ID: "user123"
  -> current CID: bafyreig5h7...
  -> document content: { id: "user123", displayName: "Alice", ... }
  -> collection index: "users" -> { "user123": "bafyreig5h7..." }

The collection index itself is a document with its own CID. The root of all collections is a single "root CID" that represents the entire database state at a point in time.
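
The effect of deterministic serialization can be illustrated with a small sketch. It uses sorted-key JSON text as a stand-in for canonical CBOR and stops before hashing, so it does not produce a real CID; the canonicalize function is a hypothetical helper for illustration only.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes a value with object keys in sorted order, so that field
// insertion order never changes the output bytes.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== 'object') {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return '[' + value.map((item) => canonicalize(item)).join(',') + ']';
  }
  const keys = Object.keys(value).sort();
  const fields = keys.map((key) => JSON.stringify(key) + ':' + canonicalize(value[key]));
  return '{' + fields.join(',') + '}';
}

// Two documents with the same content but different insertion order.
const first = canonicalize({ id: 'user123', displayName: 'Alice' });
const second = canonicalize({ displayName: 'Alice', id: 'user123' });
// Identical serialized form, so hashing either one would give the same identifier.
const sameBytes = first === second;
```

In the real pipeline the canonical bytes would be CBOR and would then be hashed with SHA-256 to form the CID.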

5.2 Replication Strategy

Replication factor: Configurable per collection. Defaults:

| Collection | Replication Factor | Rationale |
| --- | --- | --- |
| users | 3 | Core identity, moderate redundancy |
| permissions | 5 | Critical for access control, high redundancy |
| transactions | 3 | Financial data, moderate redundancy |
| groups | 3 | Standard redundancy |
| notifications | 2 | Ephemeral, lower redundancy acceptable |
| App-specific | 2-3 | Configurable by app |

Replication partner selection:

  1. Proximity preference: Replicate first to peers on the same local network (WiFi Direct, Bluetooth, Wired), then to internet peers
  2. Diversity requirement: At least one replica must be on a different physical network than the origin (prevents a single network failure from losing all copies)
  3. Reputation threshold: Only replicate to peers with reputation >= 0.5
  4. User preference: Users can pin data to specific devices ("keep my financial data only on my phone and my home server")

Pinning:

  • User pin: "I want this data on this device" -- the device will not garbage-collect the data regardless of replication factor
  • App pin: "This app needs this collection locally" -- ensures the app has fast local access
  • Network pin: "This data must stay within this geographic region" -- enforced by only replicating to peers with matching region tags

Garbage collection:

  • Unpinned, unreferenced data is eligible for garbage collection after a configurable TTL (default: 30 days)
  • GC runs as a background process, never interrupting active operations
  • Before deleting, the node checks that at least replicationFactor copies exist on other peers (verified by querying the DHT)

5.3 Conflict Resolution (CRDTs)

When two peers edit the same document while disconnected, their changes must merge deterministically when they reconnect. CyberEco uses Conflict-free Replicated Data Types (CRDTs) to guarantee convergence without coordination.

CRDT types used:

Last-Writer-Wins Register (LWWRegister)

For simple scalar fields where the most recent value should win.

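interface HLC {            // assumed shape; see "Causal ordering" below
  physicalTime: number;
  logicalCounter: number;
}

interface LWWRegister<T> {
  value: T;
  timestamp: HLC;  // Hybrid Logical Clock
  peerId: string;  // tiebreaker when timestamps are equal
}

function merge<T>(a: LWWRegister<T>, b: LWWRegister<T>): LWWRegister<T> {
  // Compare physical time first, then the logical counter
  const order = a.timestamp.physicalTime - b.timestamp.physicalTime
    || a.timestamp.logicalCounter - b.timestamp.logicalCounter;
  if (order > 0) return a;
  if (order < 0) return b;
  // Tiebreaker: lexicographically greater peerId wins
  return a.peerId > b.peerId ? a : b;
}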

Used for: User display name, email, preferences, app settings, single-value configuration.

Grow-Only Set (G-Set)

For collections where items are only ever added, never removed.

interface GSet<T> {
  items: Set<T>;
}

function merge<T>(a: GSet<T>, b: GSet<T>): GSet<T> {
  return { items: new Set([...a.items, ...b.items]) };
}

Used for: Audit logs, permission change history, transaction history, activity feeds.

Observed-Remove Set (OR-Set)

For collections where items can be both added and removed.

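interface ORSet<T> {
  items: Map<T, Set<string>>;      // value -> set of add-tags
  tombstones: Map<T, Set<string>>; // value -> set of remove-tags
}

type TagMap<T> = Map<T, Set<string>>;

// Per-value union of two tag maps
function unionTags<T>(a: TagMap<T>, b: TagMap<T>): TagMap<T> {
  const out: TagMap<T> = new Map();
  for (const [value, tags] of [...a, ...b]) out.set(value, new Set([...(out.get(value) ?? []), ...tags]));
  return out;
}

function merge<T>(a: ORSet<T>, b: ORSet<T>): ORSet<T> {
  // Union of all add-tags and remove-tags
  // An item is present if it has at least one add-tag not in tombstones
  return { items: unionTags(a.items, b.items), tombstones: unionTags(a.tombstones, b.tombstones) };
}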

Used for: Group members, app installations, friend lists, notification subscriptions, feature flags.
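
The add, remove, and membership operations are implicit in the OR-Set description above. The sketch below shows one way they could work, including the "add wins" outcome for a concurrent remove and re-add. The function names and the caller-supplied unique tags are assumptions made for illustration.

```typescript
interface ORSet<T> {
  items: Map<T, Set<string>>;      // value -> set of add-tags
  tombstones: Map<T, Set<string>>; // value -> set of remove-tags
}

function createORSet<T>(): ORSet<T> {
  return { items: new Map<T, Set<string>>(), tombstones: new Map<T, Set<string>>() };
}

function addTag<T>(map: Map<T, Set<string>>, value: T, tag: string): void {
  const tags = map.get(value) ?? new Set<string>();
  tags.add(tag);
  map.set(value, tags);
}

// Adding records a fresh tag; the caller must supply a globally unique one.
function add<T>(set: ORSet<T>, value: T, tag: string): void {
  addTag(set.items, value, tag);
}

// Removing tombstones only the add-tags this replica has observed so far.
function remove<T>(set: ORSet<T>, value: T): void {
  for (const tag of set.items.get(value) ?? new Set<string>()) {
    addTag(set.tombstones, value, tag);
  }
}

// Present if at least one add-tag has not been tombstoned.
function has<T>(set: ORSet<T>, value: T): boolean {
  const removed = set.tombstones.get(value) ?? new Set<string>();
  for (const tag of set.items.get(value) ?? new Set<string>()) {
    if (!removed.has(tag)) return true;
  }
  return false;
}

function unionTags<T>(a: Map<T, Set<string>>, b: Map<T, Set<string>>): Map<T, Set<string>> {
  const out = new Map<T, Set<string>>();
  for (const source of [a, b]) {
    for (const [value, tags] of source) {
      for (const tag of tags) addTag(out, value, tag);
    }
  }
  return out;
}

function merge<T>(a: ORSet<T>, b: ORSet<T>): ORSet<T> {
  return { items: unionTags(a.items, b.items), tombstones: unionTags(a.tombstones, b.tombstones) };
}

// Phone adds alice, laptop syncs and removes her, phone re-adds concurrently.
const phone = createORSet<string>();
add(phone, 'alice', 'phone-1');
const laptop = merge(createORSet<string>(), phone);
remove(laptop, 'alice');
add(phone, 'alice', 'phone-2');
const merged = merge(phone, laptop);
```

After the merge, alice is still a member: the laptop only tombstoned the tag it had observed (phone-1), so the concurrent re-add (phone-2) survives.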

Counter CRDT (PN-Counter)

For numeric values that can be incremented and decremented.

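interface PNCounter {
  increments: Map<string, number>;  // peerId -> total increments by this peer
  decrements: Map<string, number>;  // peerId -> total decrements by this peer
}

function value(counter: PNCounter): number {
  const inc = [...counter.increments.values()].reduce((a, b) => a + b, 0);
  const dec = [...counter.decrements.values()].reduce((a, b) => a + b, 0);
  return inc - dec;
}

// Per-peer maximum of two count maps
function mergeCounts(a: Map<string, number>, b: Map<string, number>): Map<string, number> {
  const out = new Map(a);
  for (const [peer, count] of b) {
    out.set(peer, Math.max(out.get(peer) ?? 0, count));
  }
  return out;
}

function merge(a: PNCounter, b: PNCounter): PNCounter {
  // For each peerId, take max(a.increments[peer], b.increments[peer])
  // and max(a.decrements[peer], b.decrements[peer])
  return {
    increments: mergeCounts(a.increments, b.increments),
    decrements: mergeCounts(a.decrements, b.decrements),
  };
}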

Used for: Unread notification count, vote tallies, transaction running totals.

Merge contract: All merge functions satisfy three properties:

  1. Commutative: merge(a, b) === merge(b, a)
  2. Associative: merge(merge(a, b), c) === merge(a, merge(b, c))
  3. Idempotent: merge(a, a) === a

These properties guarantee that peers converge to the same state regardless of the order in which they receive updates, and regardless of receiving the same update multiple times.

Causal ordering: Hybrid Logical Clocks (HLC) combine physical time with logical counters to provide causal ordering without synchronized clocks. Each event gets a timestamp (physicalTime, logicalCounter, peerId). HLCs are monotonically increasing per peer and respect causality: if event A caused event B, then hlc(A) < hlc(B).
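
The clock update rules can be sketched as below. This follows the commonly published HLC algorithm rather than any confirmed CyberEco implementation; the class name, method names, and the injectable wallClock parameter are assumptions for illustration.

```typescript
interface HLC {
  physicalTime: number;   // milliseconds
  logicalCounter: number;
  peerId: string;
}

class HybridLogicalClock {
  private physicalTime = 0;
  private logicalCounter = 0;
  private readonly peerId: string;
  private readonly wallClock: () => number;

  constructor(peerId: string, wallClock: () => number = () => Date.now()) {
    this.peerId = peerId;
    this.wallClock = wallClock;
  }

  // Timestamp for a local event or an outgoing message.
  now(): HLC {
    const wall = this.wallClock();
    if (wall > this.physicalTime) {
      this.physicalTime = wall;
      this.logicalCounter = 0;
    } else {
      this.logicalCounter += 1; // wall clock stalled or went backwards
    }
    return this.stamp();
  }

  // Advances the clock past a timestamp received from another peer.
  receive(remote: HLC): HLC {
    const previous = this.physicalTime;
    const latest = Math.max(previous, remote.physicalTime, this.wallClock());
    if (latest === previous && latest === remote.physicalTime) {
      this.logicalCounter = Math.max(this.logicalCounter, remote.logicalCounter) + 1;
    } else if (latest === previous) {
      this.logicalCounter += 1;
    } else if (latest === remote.physicalTime) {
      this.logicalCounter = remote.logicalCounter + 1;
    } else {
      this.logicalCounter = 0;
    }
    this.physicalTime = latest;
    return this.stamp();
  }

  private stamp(): HLC {
    return { physicalTime: this.physicalTime, logicalCounter: this.logicalCounter, peerId: this.peerId };
  }
}

// Total order: physical time, then logical counter, then peerId.
function compareHLC(a: HLC, b: HLC): number {
  if (a.physicalTime !== b.physicalTime) return a.physicalTime - b.physicalTime;
  if (a.logicalCounter !== b.logicalCounter) return a.logicalCounter - b.logicalCounter;
  return a.peerId < b.peerId ? -1 : a.peerId > b.peerId ? 1 : 0;
}
```

Even when the receiving peer's wall clock lags behind the sender's, the timestamp returned by receive sorts after the received one, which is the causality property described above.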

5.4 Merkle DAG for Document History

Each document version forms a node in a Merkle DAG (Directed Acyclic Graph), where edges point from a version to its parent(s).

Initial:    v1 (CID: bafy...a1)
              |
Edit by A:  v2 (CID: bafy...b2, parent: bafy...a1)
              |
         +----+----+
         |         |
Edit A:  v3a       v3b  :Edit B (concurrent, disconnected)
         |         |
         +----+----+
              |
Merge:   v4 (CID: bafy...d4, parents: [bafy...c3a, bafy...c3b])

Properties:

  • Full history: Every version is permanently recorded and retrievable by CID
  • Branching: Concurrent edits by disconnected peers create branches naturally
  • Merging: When branches are discovered, CRDT merge creates a new version with multiple parents
  • Audit trail: The complete edit history is cryptographically linked -- no version can be altered without changing all descendant CIDs
  • Selective sync: A peer can request only the versions it is missing by comparing its DAG tips with a remote peer's DAG tips
  • Rollback: Any previous version can be restored by creating a new version whose content matches the old version
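
Selective sync can be sketched as a walk backwards from the remote peer's tips that stops at versions the local peer already holds. The sketch assumes that holding a version implies holding all of its ancestors; VersionNode and missingVersions are hypothetical names, and the short version labels stand in for real CIDs.

```typescript
interface VersionNode {
  cid: string;
  parents: string[]; // CIDs of the parent versions
}

// Returns the CIDs reachable from the remote tips that are absent locally.
function missingVersions(
  remoteDag: Map<string, VersionNode>,
  localCids: Set<string>,
  remoteTips: string[]
): string[] {
  const missing: string[] = [];
  const visited = new Set<string>();
  const stack = [...remoteTips];
  while (stack.length > 0) {
    const cid = stack.pop() as string;
    // Stop at anything already seen or already held locally.
    if (visited.has(cid) || localCids.has(cid)) continue;
    visited.add(cid);
    missing.push(cid);
    const node = remoteDag.get(cid);
    if (node) stack.push(...node.parents);
  }
  return missing;
}

// The history from the diagram above, with placeholder labels instead of CIDs.
const remoteDag = new Map<string, VersionNode>([
  ['v1', { cid: 'v1', parents: [] }],
  ['v2', { cid: 'v2', parents: ['v1'] }],
  ['v3a', { cid: 'v3a', parents: ['v2'] }],
  ['v3b', { cid: 'v3b', parents: ['v2'] }],
  ['v4', { cid: 'v4', parents: ['v3a', 'v3b'] }],
]);

// Peer A made edit v3a itself and has not yet seen B's branch or the merge.
const toFetch = missingVersions(remoteDag, new Set(['v1', 'v2', 'v3a']), ['v4']);
```

Here peer A only needs to fetch v4 and v3b; the shared ancestors are never transferred.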

6. P2PStorageAdapter

The P2PStorageAdapter implements the existing StorageAdapter interface from @cyber-eco/types. This is the critical integration point: by implementing the same interface that FirebaseStorageAdapter implements, the entire DataLayerService orchestration pipeline -- permissions, caching, sync events, webhooks -- works without modification.

class P2PStorageAdapter implements StorageAdapter {
  private localStore: StorageAdapter;
  private transports: MultiTransportManager;
  private replicationManager: ReplicationManager;
  private crdtEngine: CRDTEngine;
  private syncQueue: SyncQueue;

  constructor(config: {
    localStore: StorageAdapter;       // IndexedDB, SQLite, LevelDB, etc.
    transports: MultiTransportManager;
    replicationFactor: number;        // default: 3
    conflictStrategy: ConflictResolution;
  });

  // --- StorageAdapter implementation ---

  async getDocument<T>(collection: string, id: string): Promise<T | null> {
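    // 1. Check local store (instant)
    const local = await this.localStore.getDocument<T>(collection, id);
    if (local !== null) return local;

    // A local tombstone means the document was deleted here; do not resurrect it from peers
    const tombstone = await this.localStore.getDocument(collection, `${id}.__tombstone`);
    if (tombstone !== null) return null;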

    // 2. Query connected peers (parallel requests to K nearest peers)
    const remote = await this.queryPeers<T>(collection, id);
    if (remote !== null) {
      // Cache locally for future reads
      await this.localStore.setDocument(collection, id, remote);
      return remote;
    }

    // 3. Fall back to internet gateway (if an internet transport is active)
    const gateway = await this.queryGateway<T>(collection, id);
    if (gateway !== null) {
      await this.localStore.setDocument(collection, id, gateway);
      return gateway;
    }

    return null;
  }

  async setDocument<T>(
    collection: string,
    id: string,
    data: T,
    options?: WriteOptions
  ): Promise<WriteResult> {
    // 1. Write to local store immediately (optimistic, never blocks)
    const result = await this.localStore.setDocument(collection, id, data, options);

    // 2. Compute CID for the new version
    const cid = await this.crdtEngine.computeCID(data);

    // 3. Enqueue replication to connected peers (async, best-effort)
    this.syncQueue.enqueue({
      type: 'replicate',
      collection,
      id,
      cid,
      data,
      replicationFactor: this.replicationManager.getFactorForCollection(collection),
    });

    // 4. Return immediately -- local write is the source of truth
    return result;
  }

  async updateDocument(
    collection: string,
    id: string,
    data: Record<string, unknown>
  ): Promise<WriteResult> {
    // 1. Read current local version
    const current = await this.localStore.getDocument<Record<string, unknown>>(collection, id);

    // 2. Apply update locally via CRDT merge
    const merged = current
      ? this.crdtEngine.applyUpdate(current, data)
      : data;

    // 3. Write merged result locally
    const result = await this.localStore.setDocument(collection, id, merged, { merge: true });

    // 4. Enqueue replication
    this.syncQueue.enqueue({
      type: 'replicate',
      collection,
      id,
      cid: await this.crdtEngine.computeCID(merged),
      data: merged,
      replicationFactor: this.replicationManager.getFactorForCollection(collection),
    });

    return result;
  }

  async deleteDocument(collection: string, id: string): Promise<WriteResult> {
    // 1. Mark as tombstone locally (CRDT delete = add tombstone, not physical delete)
    const tombstone = this.crdtEngine.createTombstone(collection, id);
    await this.localStore.setDocument(collection, `${id}.__tombstone`, tombstone);

    // 2. Remove from local store
    const result = await this.localStore.deleteDocument(collection, id);

    // 3. Replicate tombstone
    this.syncQueue.enqueue({
      type: 'replicate-tombstone',
      collection,
      id,
      tombstone,
    });

    return result;
  }

  async query<T>(
    collection: string,
    filters: QueryFilter[],
    options?: QueryOptions
  ): Promise<PaginatedResult<T>> {
    // Queries execute against local store only
    // Remote data is pulled in via replication, not per-query
    return this.localStore.query<T>(collection, filters, options);
  }

  async batchWrite(operations: BatchOperation[]): Promise<BatchResult> {
    // Execute batch locally
    const result = await this.localStore.batchWrite(operations);

    // Enqueue each operation for replication
    for (const op of operations) {
      if (op.type === 'delete') {
        this.syncQueue.enqueue({
          type: 'replicate-tombstone',
          collection: op.collection,
          id: op.id,
          tombstone: this.crdtEngine.createTombstone(op.collection, op.id),
        });
      } else {
        this.syncQueue.enqueue({
          type: 'replicate',
          collection: op.collection,
          id: op.id,
          cid: await this.crdtEngine.computeCID(op.data),
          data: op.data,
          replicationFactor: this.replicationManager.getFactorForCollection(op.collection),
        });
      }
    }

    return result;
  }

  subscribe<T>(
    collection: string,
    id: string,
    callback: (data: T | null) => void
  ): Unsubscribe {
    // Subscribe to local store changes
    // Remote updates arrive via replication and are written to local store,
    // which triggers the local subscription
    return this.localStore.subscribe<T>(collection, id, callback);
  }

  subscribeToQuery<T>(
    collection: string,
    filters: QueryFilter[],
    callback: (data: T[]) => void
  ): Unsubscribe {
    return this.localStore.subscribeToQuery<T>(collection, filters, callback);
  }

  serverTimestamp(): unknown {
    // In P2P mode, timestamps are HLC (Hybrid Logical Clock) values
    return this.crdtEngine.now();
  }

  generateId(collection: string): string {
    // Generate a globally unique ID without coordination
    // Format: <timestamp>-<random>-<peerId-prefix>
    return this.crdtEngine.generateId();
  }
}

Key Behaviors

Reads are local-first. The getDocument method checks local storage before ever touching the network. In most cases, replication has already brought the data locally, and the read completes without any network round-trip. Network queries happen only on cache misses.

Writes are optimistic. The setDocument and updateDocument methods write to local storage and return immediately. Replication happens asynchronously in the background. The user never waits for network round-trips on writes.

Subscriptions are local. The subscribe method watches the local store. When remote updates arrive via replication, they are written to the local store, which triggers the local subscription callback. This means the subscription API is identical whether running on Firebase (real-time listener) or P2P (local store + replication-driven updates).

Queries work locally. The query method executes entirely against the local store. This ensures consistent performance regardless of network conditions. Data that has not yet been replicated locally will not appear in query results -- this is the explicit trade-off for offline-first operation.

Deletes are tombstones. In a distributed system, a delete must be a positive assertion ("this document is deleted") rather than an absence. Tombstones are replicated like any other document and are eventually garbage-collected.


7. Offline-First Mesh Networking

7.1 Store-and-Forward

When a message cannot be delivered directly to its intended recipient (because the recipient is not currently connected to any shared transport), intermediate peers store the message and attempt delivery later.

Message lifecycle:

  1. Origin: Sender creates message with TTL (default: 7 days) and maxHops (default: 10)
  2. Relay: If recipient is not directly connected, message is sent to the sender's connected peers
  3. Storage: Each relay peer stores the message in a bounded buffer (default: 1000 messages); size limits and eviction order are described under Buffer management below
  4. Forwarding: When a relay peer discovers a new peer, it checks its buffer for messages addressed to that peer and delivers them
  5. Delivery confirmation: Recipient sends an ACK back to the sender (also via store-and-forward if necessary)
  6. Expiration: Messages older than their TTL are discarded without delivery
  7. Hop enforcement: Each relay increments hopCount. If hopCount reaches maxHops, the message is discarded
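
The expiration and hop enforcement steps can be sketched as a single relay decision. The sketch assumes that timestamp is in unix seconds (matching ttl) and that a message is dropped as soon as the incremented hopCount reaches maxHops; RelayHeader and relayDecision are hypothetical names covering only the NetworkMessage fields involved.

```typescript
// The subset of NetworkMessage fields that a relay needs to inspect.
interface RelayHeader {
  timestamp: number; // unix seconds (assumed)
  ttl: number;       // time-to-live in seconds
  hopCount: number;
  maxHops: number;
}

type RelayDecision =
  | { action: 'forward'; header: RelayHeader }
  | { action: 'drop'; reason: 'expired' | 'hop-limit' };

function relayDecision(header: RelayHeader, nowSeconds: number): RelayDecision {
  // Expiration: messages older than their TTL are discarded.
  if (nowSeconds - header.timestamp > header.ttl) {
    return { action: 'drop', reason: 'expired' };
  }
  // Hop enforcement: increment first, then compare against the limit.
  const hopCount = header.hopCount + 1;
  if (hopCount >= header.maxHops) {
    return { action: 'drop', reason: 'hop-limit' };
  }
  return { action: 'forward', header: { ...header, hopCount } };
}
```

Because the Ed25519 signature covers type, payload, sender, and timestamp but not hopCount, a relay can update the hop counter without invalidating the sender's signature.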

Buffer management:

  • Maximum buffer size is configurable (default: 10 MB per transport)
  • Messages are prioritized: critical > high > normal > low > bulk
  • When buffer is full, lowest-priority messages with the shortest remaining TTL are evicted first
  • Messages addressed to peers with higher reputation scores are retained preferentially
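
A minimal sketch of the eviction rule, assuming a byte-based capacity and assuming that reputation acts as the final tie-breaker after priority and remaining TTL (the list above does not say how the three criteria combine). All type and function names are hypothetical.

```typescript
type Priority = 'critical' | 'high' | 'normal' | 'low' | 'bulk';

// Higher rank means the message is kept longer.
const PRIORITY_RANK: Record<Priority, number> = {
  critical: 4, high: 3, normal: 2, low: 1, bulk: 0,
};

interface BufferedMessage {
  id: string;
  priority: Priority;
  expiresAt: number;            // milliseconds since epoch
  sizeBytes: number;
  recipientReputation: number;  // higher is better
}

// Orders messages so that the first element is the first to be evicted.
function evictionOrder(a: BufferedMessage, b: BufferedMessage): number {
  const byPriority = PRIORITY_RANK[a.priority] - PRIORITY_RANK[b.priority];
  if (byPriority !== 0) return byPriority;           // lowest priority first
  const byTtl = a.expiresAt - b.expiresAt;
  if (byTtl !== 0) return byTtl;                     // shortest remaining TTL first
  return a.recipientReputation - b.recipientReputation; // lowest reputation first
}

// Evicts messages until `incomingBytes` more bytes fit within `maxBytes`.
function evictUntilFits(
  buffer: BufferedMessage[],
  incomingBytes: number,
  maxBytes: number,
): { kept: BufferedMessage[]; evicted: BufferedMessage[] } {
  const kept = buffer.slice().sort(evictionOrder);
  const evicted: BufferedMessage[] = [];
  let used = kept.reduce((sum, m) => sum + m.sizeBytes, 0);
  while (used + incomingBytes > maxBytes && kept.length > 0) {
    const victim = kept.shift() as BufferedMessage;
    evicted.push(victim);
    used -= victim.sizeBytes;
  }
  return { kept, evicted };
}
```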

7.2 Mesh Topologies

The network self-organizes into one of three topologies depending on the number of peers and transport characteristics:

Full mesh (< 20 peers)

Every peer maintains a direct connection to every other peer. Suitable for small groups: a family sharing expenses, a study group collaborating on notes.

    A --- B
   /|\ / |\
  / | X  | \
 /  |/ \ |  \
C --+--- D---E
  • Pros: Minimum latency (single hop), maximum redundancy
  • Cons: O(n^2) connections, does not scale

Partial mesh (20 - 1000 peers)

Each peer connects to K neighbors (default: K=6-8), forming a structured overlay. Routing uses a Kademlia-like distance metric to find peers responsible for specific data.

A -- B -- C
|    |    |
D -- E -- F
|    |    |
G -- H -- I
  • Pros: O(log n) routing, scalable, resilient to individual node failures
  • Cons: Higher latency than full mesh (multiple hops)

Star (any size, when a super-peer is available)

One node acts as a relay for all others. Used when a high-availability node is present: a home server on a wired connection, an office gateway, a community relay node.

    B
    |
A --S-- C
    |
    D
  • Pros: Simple, low latency (every peer is one hop from the relay, so any two peers are at most two hops apart), easy NAT traversal
  • Cons: Single point of failure (mitigated by fallback to mesh when star center is lost)

Automatic adaptation: The MultiTransportManager monitors peer count and transport capabilities, and transitions between topologies as conditions change. If the star center goes offline, the network falls back to partial mesh within seconds.
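
The selection rule can be sketched as a pure function of peer count and super-peer availability. Two points are assumptions rather than something the text specifies: that star is preferred whenever a super-peer is available, and that groups above 1000 peers without a super-peer stay on partial mesh. The function names are hypothetical and do not describe the MultiTransportManager API.

```typescript
type Topology = 'full-mesh' | 'partial-mesh' | 'star';

const FULL_MESH_LIMIT = 20; // full mesh is used below this peer count

function chooseTopology(peerCount: number, superPeerAvailable: boolean): Topology {
  if (superPeerAvailable) return 'star';           // assumed preference order
  if (peerCount < FULL_MESH_LIMIT) return 'full-mesh';
  return 'partial-mesh';                           // also used above 1000 peers (assumption)
}

// Number of links in a full mesh of n peers: n(n-1)/2.
function fullMeshLinks(n: number): number {
  return (n * (n - 1)) / 2;
}
```

For the largest full mesh (19 peers) this gives 171 links, which illustrates why the quadratic growth forces a switch to partial mesh at around 20 peers.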

7.3 Network Partitions

Network partitions -- where a group of peers becomes disconnected from the rest -- are treated as a normal operating condition, not an error.

During partition:

  • Each partition continues operating independently
  • Writes succeed locally and replicate within the partition
  • Reads return data available within the partition
  • No operations block waiting for the other partition

On reconnection:

  1. Peers exchange their DAG tip CIDs (the CID of their most recent version of each collection index)
  2. If tips differ, peers exchange the versions they are missing (using the Merkle DAG to identify exactly which versions are needed)
  3. CRDT merge functions resolve any conflicts deterministically
  4. Both peers arrive at the same merged state
  5. The merged state is propagated to other peers via normal replication

Split-brain detection:

Vector clocks enable detection of concurrent edits that happened during a partition:

  • If peer A's vector clock dominates peer B's (every entry is greater than or equal to B's, and at least one entry is strictly greater), A has all of B's updates (no conflict)
  • If neither clock dominates the other and the clocks are not equal, the edits are concurrent (true conflict, resolved by CRDT merge)
  • The merge result is a new version with a vector clock that dominates both inputs
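
The comparison and merge rules above can be sketched as follows. The clock representation (a map from peer id to counter) and the function names are assumptions for illustration. The merge takes the entry-wise maximum and then increments the merging peer's own entry, so the result strictly dominates both inputs.

```typescript
// A vector clock maps each peer id to the number of updates seen from it.
type VectorClock = Record<string, number>;
type ClockOrder = 'equal' | 'dominates' | 'dominated' | 'concurrent';

// Compares clock a against clock b. Missing entries count as 0.
function compareClocks(a: VectorClock, b: VectorClock): ClockOrder {
  let aAhead = false;
  let bAhead = false;
  for (const id of Object.keys(a).concat(Object.keys(b))) {
    const x = a[id] ?? 0;
    const y = b[id] ?? 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return 'concurrent'; // true conflict, needs CRDT merge
  if (aAhead) return 'dominates';            // a already contains all of b
  if (bAhead) return 'dominated';            // b already contains all of a
  return 'equal';
}

// Entry-wise maximum, plus one tick for the peer performing the merge.
function mergeClocks(a: VectorClock, b: VectorClock, mergingPeer: string): VectorClock {
  const merged: VectorClock = {};
  for (const id of Object.keys(a).concat(Object.keys(b))) {
    merged[id] = Math.max(a[id] ?? 0, b[id] ?? 0);
  }
  merged[mergingPeer] = (merged[mergingPeer] ?? 0) + 1;
  return merged;
}
```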

8. Security

8.1 Transport Security

All inter-peer communication is encrypted regardless of transport layer:

| Transport | Encryption Protocol | Key Exchange | Notes |
|---|---|---|---|
| Internet (TCP) | TLS 1.3 | ECDHE | Standard TLS with certificate pinning |
| Internet (QUIC) | TLS 1.3 (built-in) | ECDHE | QUIC mandates TLS 1.3 |
| Internet (WebRTC) | DTLS 1.2 + SRTP | ECDHE | Standard WebRTC security |
| Bluetooth (BLE) | Custom over GATT | X25519 | BLE's native encryption is insufficient; CyberEco adds Noise_XX |
| WiFi Direct | TLS 1.3 over TCP | ECDHE | Same as wired |
| LoRa | Noise_IK protocol | X25519 | Lightweight, designed for constrained devices |
| Wired (LAN) | TLS 1.3 over TCP | ECDHE | Same as internet TCP |

Perfect forward secrecy: All key exchanges use ephemeral keys. Compromising a long-term key does not compromise past sessions.

Message authentication: Every NetworkMessage includes an Ed25519 signature over (type + payload + sender + timestamp). Receiving peers verify the signature before processing. Messages with invalid signatures are dropped and the sender's reputation is decremented.

8.2 Data Security

End-to-end encryption for private data:

Private documents (e.g., financial records, personal notes) are encrypted before leaving the originating device. The encryption uses the recipient's Ed25519 public key (converted to X25519 for Diffie-Hellman). Relay peers and storage nodes see only ciphertext.

Sender:
  1. Generate ephemeral X25519 keypair
  2. Derive shared secret: ECDH(ephemeral_private, recipient_X25519_public)
  3. Derive encryption key: HKDF-SHA256(shared_secret, "cybereco-e2ee-v1")
  4. Encrypt: AES-256-GCM(key, nonce, plaintext)
  5. Send: (ephemeral_public, nonce, ciphertext)

Recipient:
  1. Derive shared secret: ECDH(recipient_X25519_private, ephemeral_public)
  2. Derive decryption key: HKDF-SHA256(shared_secret, "cybereco-e2ee-v1")
  3. Decrypt: AES-256-GCM(key, nonce, ciphertext)
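
The sender and recipient steps can be sketched with Node.js's built-in `crypto` module. Several details are assumptions, since the steps above do not fix them: the label is passed as the HKDF `info` input with a zero-length salt, the nonce is 12 random bytes, and the GCM authentication tag travels alongside the ciphertext. The sketch also starts from an X25519 recipient key directly and leaves out the Ed25519-to-X25519 conversion.

```typescript
import {
  createCipheriv, createDecipheriv, diffieHellman,
  generateKeyPairSync, hkdfSync, randomBytes, KeyObject,
} from 'node:crypto';

const HKDF_LABEL = 'cybereco-e2ee-v1';

interface Envelope {
  ephemeralPublic: KeyObject;
  nonce: Buffer;
  ciphertext: Buffer;
  tag: Buffer; // AES-GCM authentication tag
}

// ECDH followed by HKDF-SHA256, producing a 32-byte AES-256 key.
function deriveKey(privateKey: KeyObject, publicKey: KeyObject): Buffer {
  const shared = diffieHellman({ privateKey, publicKey });
  return Buffer.from(hkdfSync('sha256', shared, Buffer.alloc(0), HKDF_LABEL, 32));
}

function encryptFor(recipientPublic: KeyObject, plaintext: Buffer): Envelope {
  const ephemeral = generateKeyPairSync('x25519');               // sender step 1
  const key = deriveKey(ephemeral.privateKey, recipientPublic);  // sender steps 2-3
  const nonce = randomBytes(12);
  const cipher = createCipheriv('aes-256-gcm', key, nonce);      // sender step 4
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { ephemeralPublic: ephemeral.publicKey, nonce, ciphertext, tag: cipher.getAuthTag() };
}

function decryptWith(recipientPrivate: KeyObject, envelope: Envelope): Buffer {
  const key = deriveKey(recipientPrivate, envelope.ephemeralPublic); // recipient steps 1-2
  const decipher = createDecipheriv('aes-256-gcm', key, envelope.nonce);
  decipher.setAuthTag(envelope.tag);
  // final() throws if the ciphertext or tag was modified in transit.
  return Buffer.concat([decipher.update(envelope.ciphertext), decipher.final()]);
}
```

Because a fresh ephemeral keypair is generated per message, each envelope is encrypted under a different key even when the recipient is the same.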

Group encryption (threshold encryption):

Shared group data (e.g., group expenses, shared budgets) uses k-of-n threshold encryption. A group of n members requires at least k members to cooperate to decrypt the data. This prevents any single member (or compromised device) from accessing group data alone, while allowing the group to function even if some members are offline.

  • Scheme: Shamir's Secret Sharing for key shares, AES-256-GCM for data encryption
  • Key rotation: Group keys are rotated when members join or leave
  • Default threshold: k = ceil(n/2 + 1) for groups of 3+, k = n for pairs
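
As a worked example of the default threshold, the function below evaluates the formula exactly as written. The function name and the rejection of groups smaller than two are assumptions.

```typescript
// Default k for a group of n members: k = n for pairs, ceil(n/2 + 1) for n >= 3.
function defaultThreshold(n: number): number {
  if (!Number.isInteger(n) || n < 2) {
    throw new RangeError('group size must be an integer of at least 2');
  }
  if (n === 2) return 2;
  return Math.ceil(n / 2 + 1);
}

// How many members can be offline while the group can still decrypt.
function toleratedOffline(n: number): number {
  return n - defaultThreshold(n);
}
```

Evaluated for small groups, this gives k = 3 for n = 3, k = 3 for n = 4, k = 4 for n = 5, and k = 4 for n = 6. With this formula, groups of two or three members need every member present to decrypt; tolerance for offline members begins at n = 4.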

Integration with encrypted computation:

For use cases requiring computation on encrypted data (e.g., aggregating financial totals across users without revealing individual amounts), the P2P layer integrates with the Encrypted Computation layer (see ENCRYPTED_COMPUTATION.md). The P2P network transports encrypted shares; computation happens on encrypted data; only the result is decrypted.

8.3 Peer Authentication

Before exchanging any application data, peers authenticate each other using a challenge-response protocol:

Initiator                              Responder
    |                                      |
    |-- 1. Hello(initiator_pubkey) ------->|
    |                                      |
    |<- 2. Challenge(nonce_r, resp_pub) ---|
    |                                      |
    |-- 3. Response(                   --->|
    |       sign(nonce_r, init_priv),      |
    |       nonce_i                        |
    |   )                                  |
    |                                      |
    |<- 4. Confirm(                    ----|
    |       sign(nonce_i, resp_priv)       |
    |   )                                  |
    |                                      |
    [---- Authenticated session ----]

Steps:

  1. Initiator sends their Ed25519 public key
  2. Responder generates a random nonce, sends it along with their own public key
  3. Initiator signs the responder's nonce with their private key, and sends their own nonce
  4. Responder verifies the initiator's signature, signs the initiator's nonce, and sends it back
  5. Initiator verifies the responder's signature
  6. Both peers are now mutually authenticated
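
The handshake steps can be sketched in a single process with Node.js's built-in Ed25519 support. The `Peer` shape, the `mutualAuth` name, and the 32-byte nonce size are assumptions; a real implementation would exchange these values as messages over a transport rather than as local variables.

```typescript
import { generateKeyPairSync, randomBytes, sign, verify, KeyObject } from 'node:crypto';

interface Peer {
  publicKey: KeyObject;   // the key the peer announces
  privateKey: KeyObject;  // the key the peer actually holds
}

function newPeer(): Peer {
  return generateKeyPairSync('ed25519');
}

// Runs the four-message exchange and reports whether both sides verified.
function mutualAuth(initiator: Peer, responder: Peer): boolean {
  // 1. Hello: the initiator announces its public key.
  const initiatorPub = initiator.publicKey;

  // 2. Challenge: the responder sends a fresh nonce and its own public key.
  const nonceR = randomBytes(32);
  const responderPub = responder.publicKey;

  // 3. Response: the initiator signs the responder's nonce and sends its own nonce.
  const initiatorSig = sign(null, nonceR, initiator.privateKey);
  const nonceI = randomBytes(32);

  // 4. Confirm: the responder verifies, then signs the initiator's nonce.
  if (!verify(null, nonceR, initiatorPub, initiatorSig)) return false;
  const responderSig = sign(null, nonceI, responder.privateKey);

  // 5. The initiator verifies the responder's signature.
  return verify(null, nonceI, responderPub, responderSig);
}
```

The fresh nonce on each side is what prevents replay: a signature captured from an earlier session does not match the new challenge.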

Optional CyberEco identity verification:

After peer authentication, if both peers have linked their PeerId to a CyberEco userId, they can verify the link by checking the signed declaration in the user's profile (stored in the P2P network or fetched from an internet gateway).

Access control:

  • Allowlist: "Only accept connections from these PeerIds" -- for private family networks
  • Blocklist: "Never accept connections from these PeerIds" -- for blocking malicious peers
  • Open: "Accept connections from any authenticated peer" -- default for public networks
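
The three modes can be sketched as a small policy check that runs after peer authentication succeeds. The `AccessPolicy` type and `acceptsConnection` function are hypothetical names, and PeerIds are represented as plain strings for simplicity.

```typescript
type AccessPolicy =
  | { mode: 'allowlist'; peers: ReadonlySet<string> }
  | { mode: 'blocklist'; peers: ReadonlySet<string> }
  | { mode: 'open' };

// Decides whether an already-authenticated peer may connect.
function acceptsConnection(policy: AccessPolicy, peerId: string): boolean {
  switch (policy.mode) {
    case 'allowlist':
      return policy.peers.has(peerId);   // only listed peers
    case 'blocklist':
      return !policy.peers.has(peerId);  // everyone except listed peers
    case 'open':
      return true;                       // any authenticated peer
  }
}
```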

9. Integration with Existing Architecture

9.1 How It Extends the Current StorageAdapter

The existing architecture already provides the abstraction needed for P2P. The DataLayerService in @cyber-eco/services depends only on the StorageAdapter interface from @cyber-eco/types. It has no dependency on @cyber-eco/firebase. Swapping the storage backend means changing only the adapter passed to createDataLayer(); the rest of the configuration stays the same:

// Current: Firebase backend
import { FirebaseStorageAdapter } from '@cyber-eco/firebase';

const dataLayer = createDataLayer({
  adapter: new FirebaseStorageAdapter(() => getFirestore()),
  cache: { ttl: { default: 180000 } },
  sync: { enabled: true },
  permissions: { enabled: true },
});

// Future: P2P backend
import { P2PStorageAdapter } from '@cyber-eco/p2p';
import { IndexedDBStorageAdapter } from '@cyber-eco/indexeddb';

const dataLayer = createDataLayer({
  adapter: new P2PStorageAdapter({
    localStore: new IndexedDBStorageAdapter(),
    transports: multiTransportManager,
    replicationFactor: 3,
    conflictStrategy: 'merge',
  }),
  cache: { ttl: { default: 180000 } },
  sync: { enabled: true },
  permissions: { enabled: true },
});

// Hybrid: Firebase + P2P (Firebase as authoritative, P2P for offline/local)
import { HybridStorageAdapter } from '@cyber-eco/p2p';

const dataLayer = createDataLayer({
  adapter: new HybridStorageAdapter({
    primary: new FirebaseStorageAdapter(() => getFirestore()),
    secondary: new P2PStorageAdapter({
      localStore: new IndexedDBStorageAdapter(),
      transports: multiTransportManager,
      replicationFactor: 3,
      conflictStrategy: 'merge',
    }),
    strategy: 'primary-with-offline-fallback',
  }),
  cache: { ttl: { default: 180000 } },
  sync: { enabled: true },
  permissions: { enabled: true },
});

What does not change:

  • DataLayerService orchestration (permission check, cache lookup, adapter call, sync broadcast, webhook emit)
  • PermissionService, SharedDataService, NotificationService, DashboardService, DataExportService
  • All domain logic in consuming applications
  • The createDataLayer() factory wiring
  • The CyberEcoDataLayer return type

9.2 How It Extends SyncService

The current SyncService in @cyber-eco/services is an in-process event broadcaster. It maintains a set of subscriber callbacks and broadcasts SyncEvent objects when data changes occur. In P2P mode, SyncService gains a second source of events: remote peers.

Current flow (Firebase):

DataLayerService.create() -> SyncService.broadcast(event) -> local subscribers

P2P flow:

DataLayerService.create()
  -> SyncService.broadcast(event) -> local subscribers
  -> P2PStorageAdapter replicates to peers
  -> Remote peer receives replication
  -> Remote peer writes to its local store
  -> Remote peer's DataLayerService detects change
  -> Remote peer's SyncService.broadcast(event) -> remote peer's local subscribers

The key insight is that SyncService does not need to change. The P2P replication happens at the StorageAdapter level. When a remote update arrives and is written to the local store, the local subscribe callback fires, which the DataLayerService already handles by updating cache and notifying subscribers.

Extended SyncService (optional, for real-time P2P events):

For use cases requiring lower-latency cross-device sync (e.g., collaborative editing), the SyncService can be extended to also broadcast events over the MultiTransportManager:

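class P2PSyncService extends SyncService {
  // Constructor shape and field types are illustrative assumptions;
  // any arguments SyncService's own constructor takes are omitted.
  constructor(
    private readonly transports: MultiTransportManager,
    private readonly localPeerId: string,
    private readonly privateKey: Uint8Array,
  ) {
    super();
    this.listenForRemoteEvents();
  }

  broadcast(event: SyncEvent): void {
    // Broadcast to local subscribers (existing behavior)
    super.broadcast(event);

    // Also broadcast to connected peers (new P2P behavior)
    const type = 'sync-event';
    const payload = cborEncode(event);
    const sender = this.localPeerId;
    const timestamp = Date.now();
    const message: NetworkMessage = {
      id: generateUUID(),
      type,
      payload,
      sender,
      timestamp,
      ttl: 3600,
      hopCount: 0,
      maxHops: 5,
      // Sign the fields listed in section 8.1 (type + payload + sender + timestamp),
      // not the raw event
      signature: sign({ type, payload, sender, timestamp }, this.privateKey),
    };
    this.transports.broadcast(message);
  }

  // Listen for remote sync events
  private listenForRemoteEvents(): void {
    this.transports.onMessage((peerId, message) => {
      if (message.type === 'sync-event') {
        const event = cborDecode(message.payload) as SyncEvent;
        event.source = peerId;

        // Merge via CRDT if needed, then notify local subscribers only.
        // Calling super.broadcast (not this.broadcast) avoids re-sending the event to peers.
        super.broadcast(event);
      }
    });
  }
}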

9.3 Future Packages

The P2P architecture introduces two new packages to the CyberEco ecosystem, following the same modular pattern as existing packages:

| Package | Purpose | Dependencies |
|---|---|---|
| @cyber-eco/p2p | TransportAdapter interface and implementations, MultiTransportManager, P2PStorageAdapter, HybridStorageAdapter, ReplicationManager, SyncQueue | @cyber-eco/types, @cyber-eco/crdt |
| @cyber-eco/crdt | CRDT implementations (LWWRegister, G-Set, OR-Set, PN-Counter), Hybrid Logical Clocks, vector clocks, Merkle DAG, CID computation, merge functions | @cyber-eco/types |

Dependency graph extension:

@cyber-eco/types        (zero runtime deps)
       |
       +----------------------+---------------------+
       |                      |                     |
       v                      v                     v
@cyber-eco/firebase     @cyber-eco/auth        @cyber-eco/crdt
  (peer: firebase)       (peer: firebase)      (zero external deps)
       |                      |                     |
       |    +-----------------+                     |
       |    |                                       |
       v    v                                       v
@cyber-eco/services                          @cyber-eco/p2p
  (NO firebase dependency)                    (peer: libp2p)

@cyber-eco/services remains at the center, depending only on @cyber-eco/types for the StorageAdapter interface. It never imports @cyber-eco/p2p or @cyber-eco/firebase directly. The concrete adapter is injected at runtime by the consuming application.


10. Phased Rollout

| Phase | Timeline | Deliverables | Success Criteria |
|---|---|---|---|
| Phase 0: Design | 2025-2026 | This architecture document. TransportAdapter interface added to @cyber-eco/types. CRDT type definitions. Proof-of-concept prototype (two browsers syncing via WebRTC). | Interface definitions reviewed and merged. Prototype demonstrates round-trip sync. |
| Phase 1: Local P2P | 2026-2027 | @cyber-eco/crdt package (LWWRegister, OR-Set, PN-Counter, HLC, Merkle DAG). WiFi Direct and BLE transport implementations. P2PStorageAdapter with local mesh replication. IndexedDB local store adapter. | Two phones on the same WiFi Direct network can create, read, update, and delete documents with automatic CRDT conflict resolution. Works without internet. |
| Phase 2: Internet P2P | 2027-2028 | libp2p-based InternetTransport. Kademlia DHT discovery. Circuit relay for NAT traversal. Community relay node infrastructure. HybridStorageAdapter (Firebase primary + P2P fallback). | Browser-to-browser sync via WebRTC. Server-to-server sync via TCP/QUIC. Graceful degradation from internet to local mesh when connectivity drops. |
| Phase 3: Multi-Network | 2028-2029 | LoRaTransport implementation. MultiTransportManager with intelligent routing. Store-and-forward mesh networking. Wired LAN transport. Cross-transport deduplication. | A message sent via LoRa in a rural area reaches a peer on the internet via multi-hop relay. A device seamlessly transitions from WiFi Direct to Bluetooth to internet as conditions change. |
| Phase 4: Production Hardening | 2029-2030 | Security audit (transport encryption, E2EE, peer authentication). Performance optimization (replication efficiency, CRDT compaction, GC). Peer reputation system. CYE token integration for relay/storage incentives. Monitoring and observability. | System handles 10,000+ concurrent peers across mixed transports. Security audit passes with no critical findings. CYE incentive model sustains community-operated relay infrastructure. |

Appendix A: Glossary

| Term | Definition |
|---|---|
| BLE | Bluetooth Low Energy -- low-power wireless protocol for short-range communication |
| CBOR | Concise Binary Object Representation -- binary data serialization format |
| CID | Content Identifier -- hash-based address for content-addressed storage |
| CRDT | Conflict-free Replicated Data Type -- data structure that can be merged without coordination |
| CYE | CyberEco token -- the ecosystem's governance and incentive token |
| DAG | Directed Acyclic Graph -- data structure where nodes link to parents but cycles are impossible |
| DHT | Distributed Hash Table -- decentralized key-value lookup (e.g., Kademlia) |
| DTLS | Datagram Transport Layer Security -- TLS for UDP-based protocols |
| E2EE | End-to-End Encryption -- encryption where only sender and recipient can read the data |
| HLC | Hybrid Logical Clock -- clock combining physical time with logical ordering |
| LoRa | Long Range -- low-power, long-range radio protocol |
| LWW | Last-Writer-Wins -- conflict resolution where the most recent write takes precedence |
| mDNS | Multicast DNS -- zero-configuration local network service discovery |
| NAT | Network Address Translation -- the reason direct peer connections are difficult on the internet |
| OR-Set | Observed-Remove Set -- CRDT that supports both add and remove operations |
| PN-Counter | Positive-Negative Counter -- CRDT for distributed increment/decrement |
| TTL | Time to Live -- how long a message or data item remains valid |

Related Documents

| Document | Description |
|---|---|
| ARCHITECTURE.md | Current system architecture, StorageAdapter pattern, DataLayerService orchestration |
| VISION.md | Long-term ecosystem vision, including the 2030-2035 decentralization roadmap |
| TENETS.md | Core design principles that guide all architectural decisions |
| CONSTRAINTS.md | Technical and organizational constraints the architecture must respect |