Projects — Mansoor Mamnoon | C++ Matching Engine · LLMFirewall

LLMFirewall

MCP security proxy enforcing taint-aware least privilege for tool-using LLM agents.

Problem

Tool-using LLM agents are vulnerable to prompt injection through malicious tool outputs and retrieved documents. Regex denylists are enumerable: an attacker who knows or probes the rules can rephrase a payload until none of them match.

System

Four-layer signal pipeline: heuristic denylist → lightweight hashed feature vectors for semantic drift (no external model) → capability gating (taint-aware tool allowlisting) → instruction density scoring. Self-play red-team generates 500+ adaptive attacks.
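The capability-gating layer can be pictured as a lookup from the trust level of whatever is in the agent's context to the set of tools it may call. The following is a minimal sketch, not the project's code: the `Trust` levels, the tool names in `MIN_TRUST`, and the helpers `context_trust` and `is_allowed` are all hypothetical.

```python
from enum import IntEnum


class Trust(IntEnum):
    """Hypothetical trust levels; a lower value means less trusted."""
    UNTRUSTED = 0  # e.g. retrieved web content or raw tool output
    INTERNAL = 1   # e.g. documents from a vetted store
    USER = 2       # direct user instructions


# Hypothetical allowlist: minimum trust the context needs for each tool.
MIN_TRUST = {
    "read_file": Trust.UNTRUSTED,
    "git_log": Trust.INTERNAL,
    "write_file": Trust.USER,
    "sql_exec": Trust.USER,
}


def context_trust(sources):
    """Taint propagation: a context is only as trusted as its least trusted source."""
    return min(sources, default=Trust.USER)


def is_allowed(tool, sources):
    """Allow a tool call only if the tainted context meets the tool's minimum trust."""
    required = MIN_TRUST.get(tool)
    if required is None:
        return False  # default deny for tools missing from the allowlist
    return context_trust(sources) >= required
```

Under these assumptions, a context that mixes user input with an untrusted document can still call `read_file` but loses access to `write_file`, which is the least-privilege behavior the layer is meant to enforce.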

Proof

Attack success rate (ASR) drops from 100% to 16% on a local eval suite of 4,200+ cases. The baseline is an unfiltered passthrough, not a production system. 783 tests. False positive rate (FPR) held at 8.3% via an FPR-constrained threshold sweep.

What I Built

A four-layer signal pipeline fused into a single calibrated policy score:

Heuristic denylist (fast path — regex + keyword)
Semantic drift detector using lightweight hashed feature vectors — catches paraphrased attacks the regex layer cannot see (no external model required, sub-ms)
Capability gating — taint-aware tool allowlisting based on document trust level
Document instruction density scoring — flags embedded adversarial instructions
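To illustrate the second layer, here is one way hashed feature vectors can flag drift without an external model. It is a sketch under assumptions: character trigrams, a 256-bucket vector, and drift defined as cosine distance between the user's task and incoming content are illustrative choices, and `hashed_vector`, `cosine`, and `drift` are hypothetical names.

```python
import hashlib
import math

DIM = 256  # assumed vector width


def hashed_vector(text, dim=DIM, n=3):
    """Hash character n-grams into a fixed-size count vector (feature hashing)."""
    vec = [0.0] * dim
    s = " ".join(text.lower().split())  # normalize case and whitespace
    for i in range(max(len(s) - n + 1, 0)):
        gram = s[i:i + n].encode("utf-8")
        digest = hashlib.blake2b(gram, digest_size=4).digest()
        vec[int.from_bytes(digest, "big") % dim] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def drift(task, content):
    """Drift score: near 0 for matching n-gram profiles, 1 for no overlap."""
    return 1.0 - cosine(hashed_vector(task), hashed_vector(content))
```

Because the vector has a fixed width and needs only a hash per n-gram, scoring stays cheap regardless of vocabulary, which is consistent with the no-model, sub-ms goal stated above.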

Self-play red-team pipeline generates 500+ adaptive attacks across 7 classes via a 3-round mutation loop. Threshold calibration via FPR-constrained sweep holds false positives to 8.3%. Tested against real MCP servers: filesystem, git, and sqlite.
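An FPR-constrained sweep can be sketched as follows. This is an illustration, not the project's calibration code: it assumes a single fused score where higher means more suspicious, blocks when the score meets the threshold, and counts an attack as successful when it is not blocked. The function name `sweep_threshold` is hypothetical.

```python
def sweep_threshold(benign_scores, attack_scores, max_fpr):
    """Pick the lowest threshold whose false positive rate stays within max_fpr.

    A sample is blocked when score >= threshold. FPR can only fall as the
    threshold rises, so the first admissible candidate in ascending order
    blocks the most attacks under the FPR budget.
    Assumes both score lists are non-empty.
    """
    candidates = sorted(set(benign_scores) | set(attack_scores) | {float("inf")})
    for t in candidates:
        fpr = sum(s >= t for s in benign_scores) / len(benign_scores)
        if fpr <= max_fpr:
            asr = sum(s < t for s in attack_scores) / len(attack_scores)
            return t, fpr, asr
    # Not reached: a threshold of infinity blocks nothing, so its FPR is 0.
    raise ValueError("no admissible threshold")
```

For example, with benign scores `[0.1, 0.2, 0.3, 0.9]`, attack scores `[0.5, 0.7, 0.8, 0.95]`, and a budget of `0.25`, the sweep settles on a threshold of `0.5`.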

Results

4,200+ evaluation cases (local eval suite)
100% → 16% ASR vs. no-filter baseline
p95 latency < 1 ms (policy check)
783 tests
FPR held at 8.3%

Ablation: removing the semantic layer has the same effect as removing regex (+4pp ASR each). The two layers catch completely different attack classes — neither is redundant.

Stack

Python · FastAPI · MCP Protocol · hashed feature vectors · NumPy · pytest (783 tests)

GitHub ↗ Design Notes ↗

Limit Order Book + Matching Engine

C++20 exchange-style matching engine for high-frequency trading (HFT) and quantitative systems, sustaining 20M+ messages/sec on a synthetic single-threaded benchmark with a 0.04 µs median and roughly 1 µs p99 latency.

Problem

Exchange matching engines and high-frequency trading (HFT) systems require deterministic order matching at throughputs and latencies that commodity implementations cannot reach. Memory allocation, branch misprediction, and cache misses are the bottlenecks — not algorithm complexity.

System

Price-time priority matching engine with slab allocators, branch elimination, cache-hot pointer layouts, and CPU pinning. Full analytics: VWAP, TWAP, Iceberg, POV with reproducible PnL. Binance US market data connector and TAQ replay.
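Price-time priority means the best price trades first and, at the same price, the earliest order trades first. The sketch below shows only that rule, in Python rather than the engine's C++20, and makes no claim about the real data structures: the `OrderBook` class, its heap layout, integer tick prices, and the fill tuple shape are all assumptions.

```python
import heapq
import itertools


class OrderBook:
    """Toy limit order book with price-time priority (prices in integer ticks)."""

    def __init__(self):
        self._bids = []  # heap of (-price, seq, order): highest price first
        self._asks = []  # heap of (price, seq, order): lowest price first
        self._seq = itertools.count()  # arrival counter breaks price ties (FIFO)

    def submit(self, side, price, qty, order_id):
        """Match an incoming limit order, rest any remainder, and return fills.

        Each fill is (resting_id, incoming_id, price, qty) at the resting price.
        """
        fills = []
        buying = side == "buy"
        book, opp = (self._bids, self._asks) if buying else (self._asks, self._bids)
        while qty > 0 and opp:
            key, _, resting = opp[0]
            best = key if buying else -key
            crosses = price >= best if buying else price <= best
            if not crosses:
                break
            traded = min(qty, resting["qty"])
            fills.append((resting["id"], order_id, best, traded))
            qty -= traded
            resting["qty"] -= traded
            if resting["qty"] == 0:
                heapq.heappop(opp)  # fully filled: remove from the book
        if qty > 0:
            key = -price if buying else price
            heapq.heappush(book, (key, next(self._seq), {"id": order_id, "qty": qty}))
        return fills
```

With two asks resting at the same price, an incoming buy fills the earlier one completely before touching the later one, which is the FIFO fairness property.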

Proof

20.7M msgs/sec on a synthetic local benchmark (single-threaded, compiler-optimized). p50 = 0.04 µs, p99 ≈ 1 µs. Hardware spec and benchmark harness in the GitHub README. Crash recovery via snapshot/resume — mid-file restart produces identical fills.

What I Built

Matching Engine: Price-time priority with FIFO fairness. Limit, market, IOC, FOK, POST_ONLY, and STP order types with deterministic fill semantics.
Performance Engineering: Slab allocators avoid per-order heap allocation and fragmentation, branch elimination reduces misprediction cost, cache-hot pointer layouts improve L1 hit rate, and CPU pinning keeps the thread on one core, avoiding migrations and remote NUMA access.
Market Data: WebSocket + REST connector for Binance US feeds, normalization to Parquet. Replay engine regenerates TAQ-style quotes and trades at 1×–100× speed.
Analytics: Spread, imbalance, depth, volatility, impact curves. VWAP, TWAP, POV, and Iceberg execution strategies with reproducible PnL.
Crash Recovery: Snapshot/resume proof — mid-file restart produces identical fills and PnL as single-pass replay.
Tooling: Streamlit dashboard for real-time replay, Docker + GitHub Actions CI, one-command HTML report generator.
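The snapshot/resume check can be illustrated with a toy replay: process part of a file, serialize state, restart from the snapshot, and compare against a single pass. This is a sketch of the idea only. The `Replayer` class, its JSON snapshot format, and the choice of a running VWAP as the state are hypothetical, and the real engine's state is far larger.

```python
import json


class Replayer:
    """Toy replay state: cumulative notional and volume over a trade stream."""

    def __init__(self, offset=0, notional=0, volume=0):
        self.offset = offset      # index of the next event to process
        self.notional = notional  # sum of price * qty, in integer ticks
        self.volume = volume

    def run(self, events, stop_at=None):
        """Process events from the current offset up to stop_at (or the end)."""
        end = len(events) if stop_at is None else stop_at
        for price, qty in events[self.offset:end]:
            self.notional += price * qty
            self.volume += qty
        self.offset = max(self.offset, end)
        return self

    def snapshot(self):
        """Serialize the full state, including the resume offset."""
        return json.dumps(self.__dict__)

    @classmethod
    def resume(cls, blob):
        return cls(**json.loads(blob))

    def vwap(self):
        """Volume-weighted average price, or None before any trade."""
        return self.notional / self.volume if self.volume else None


events = [(100, 2), (101, 3), (99, 5), (102, 1)]  # (price, qty) pairs
single_pass = Replayer().run(events)
partial = Replayer().run(events, stop_at=2)
restarted = Replayer.resume(partial.snapshot()).run(events)
assert restarted.__dict__ == single_pass.__dict__
```

Keeping prices in integer ticks and storing the offset inside the snapshot are what make the comparison exact in this toy version.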

Results

20.7M msgs/sec (local benchmark)
p50 = 0.04 µs (single-threaded)
p99 ≈ 1 µs (1M event replay)
Deterministic replay
Full analytics suite

Synthetic local benchmark — single-threaded, compiler-optimized build. Hardware spec and benchmark harness in the GitHub README.

Stack

C++20 · Python · FastAPI · WebSocket · pandas / NumPy · Docker · Streamlit · GitHub Actions

GitHub ↗ Design Notes ↗

Edge Deployer — Serverless IDE

Zero-config desktop IDE for deploying serverless APIs to Cloudflare Workers, AWS Lambda@Edge, and Vercel.

Problem

Deploying serverless functions across providers requires context-switching between dashboards, CLIs, and config formats. There's no single interface with live feedback, history, and IaC.

System

Electron desktop app: Monaco editor, multi-cloud deploy logic, Pulumi IaC per provider, real-time log streaming, deployment history, and per-provider ENV variable injection.

Proof

Cloudflare Workers, AWS Lambda@Edge, and Vercel deployed from one interface. Pulumi applies infra dynamically. Code bundling + ZIP export pipeline for Lambda packaging.

What I Built

Monaco-based code editor with language server integration for TypeScript/JavaScript
Multi-cloud deploy logic: Cloudflare Workers API, AWS CLI scaffolding, Vercel REST API
Pulumi IaC: generates and applies infra config per provider dynamically
Real-time deploy logs streamed back to the IDE terminal
Deployment history: local cache of last 5 deploys with timestamp, provider, and status
Environment configuration panel: per-provider ENV vars, dynamic injection at build time
Code bundling + ZIP export pipeline for Lambda packaging
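The deployment history item describes a bounded cache of the last 5 deploys. A minimal sketch of that data structure, in Python rather than the app's TypeScript, could look like this. The `Deploy` record and `DeployHistory` class are hypothetical names, and the real app may persist entries differently.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Deploy:
    timestamp: datetime
    provider: str
    status: str


class DeployHistory:
    """Keeps only the most recent deploys; older entries fall off automatically."""

    def __init__(self, limit=5):
        self._items = deque(maxlen=limit)

    def record(self, provider, status, when=None):
        """Append a deploy, stamping it with the current UTC time if none is given."""
        entry = Deploy(when or datetime.now(timezone.utc), provider, status)
        self._items.append(entry)
        return entry

    def recent(self):
        """Return retained deploys, newest first."""
        return list(reversed(self._items))
```

A `deque` with `maxlen` evicts the oldest entry on every append past the limit, so the history never needs explicit trimming.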

Stack

Electron · React · TypeScript · Monaco Editor · Pulumi · Cloudflare API · AWS CLI · Vercel REST API

GitHub ↗ Design Notes ↗