CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
This project implements a K-Nearest Neighbors (KNN) algorithm using Fully Homomorphic Encryption (FHE) in Rust. The implementation uses TFHE-rs for cryptographic operations and operates on a 10-dimensional synthetic dataset with 100 training points.
Key Architecture
- Homomorphic Encryption: Uses TFHE-rs library for fully homomorphic encryption operations
- Data Processing: Synthetic dataset features are multiplied by 10 and stored as integers, which preserves one decimal place of precision
- KNN Implementation: Complete implementation with multiple algorithms:
  - euclidean_distance(): Optimized distance calculation using precomputed squares
  - perform_knn_selection(): Supports selection sort, bitonic sort, and heap-based selection
  - encrypted_bitonic_sort(): Parallel bitonic sort with power-of-2 padding
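One common way to use precomputed squares is the expansion ||q - p||^2 = ||q||^2 - 2(q·p) + ||p||^2, where the squared sums are computed once per point and only the dot product is computed per pair. The plaintext sketch below illustrates that idea under the assumption that this is the identity euclidean_distance() relies on; PlainPoint and squared_distance are hypothetical names, and the repository's real types hold ciphertexts rather than i32 values. It returns the squared distance, which preserves nearest-neighbor ordering.

```rust
/// Plaintext stand-in for a data point (hypothetical type for illustration).
/// It caches the sum of squared coordinates so it is computed only once.
struct PlainPoint {
    coords: Vec<i32>,
    sq_sum: i32,
}

impl PlainPoint {
    fn new(coords: Vec<i32>) -> Self {
        // Precompute ||p||^2 at construction time.
        let sq_sum: i32 = coords.iter().map(|c| c * c).sum();
        Self { coords, sq_sum }
    }
}

/// Squared Euclidean distance via ||q||^2 - 2(q.p) + ||p||^2.
/// Only the dot product depends on the specific (query, point) pair.
fn squared_distance(query: &PlainPoint, point: &PlainPoint) -> i32 {
    assert_eq!(query.coords.len(), point.coords.len(), "dimension mismatch");
    let dot: i32 = query
        .coords
        .iter()
        .zip(&point.coords)
        .map(|(a, b)| a * b)
        .sum();
    query.sq_sum - 2 * dot + point.sq_sum
}
```

For example, squared_distance between (1, 2, 3) and (4, 6, 3) evaluates 14 - 2·25 + 61, which matches the direct computation 3² + 4² + 0².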
Build and Development Commands
# Build the project
cargo build
# Run with different algorithms
cargo run --bin enc # Default selection sort
cargo run --bin enc -- --algorithm=bitonic # Bitonic sort (fastest for large datasets)
cargo run --bin enc -- --algorithm=heap # Heap-based selection
cargo run --bin enc -- --debug # Debug mode with plaintext verification
cargo run --bin plain # Plaintext version for comparison
# Development commands - ALWAYS use cargo check for verification
cargo check # Use this for code verification, NOT cargo run
cargo test
cargo fmt
cargo clippy
Data Structure
The project processes a synthetic 10-dimensional dataset with these key data structures:
- EncryptedQuery: Query point with precomputed values for optimization
- EncryptedPoint: Training data points with precomputed squared sums
- EncryptedNeighbor: Distance and index pairs for KNN results
- Custom deserializer converts float values to scaled integers (×10) for FHE compatibility
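The float-to-integer conversion can be sketched as follows. This is a minimal illustration, not the repository's deserializer: the function name scale_feature, the use of rounding (rather than truncation), and the choice to reject out-of-range values are all assumptions. The bounds are those of a 14-bit signed integer, the width of FheInt14.

```rust
/// Inclusive bounds of a 14-bit signed integer (the FheInt14 width).
const INT14_MIN: i32 = -(1 << 13); // -8192
const INT14_MAX: i32 = (1 << 13) - 1; // 8191

/// Scale factor applied to every feature (×10 keeps one decimal place).
const SCALE: f64 = 10.0;

/// Hypothetical helper: scale a float feature and return it as an integer,
/// or None if the result is not finite or does not fit in 14 signed bits.
/// Rounding to nearest is an assumption; the real code may truncate.
fn scale_feature(value: f64) -> Option<i16> {
    let scaled = (value * SCALE).round();
    let in_range = scaled >= f64::from(INT14_MIN) && scaled <= f64::from(INT14_MAX);
    if scaled.is_finite() && in_range {
        Some(scaled as i16)
    } else {
        None
    }
}
```

Checking the range before encryption surfaces bad inputs early, since a value that overflows inside a ciphertext cannot be inspected afterwards.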
Dataset
- Training Data: dataset/train.jsonl, containing one query point and 100 10-dimensional training points
- Results: dataset/answer.jsonl and dataset/answer1.jsonl contain KNN classification results in JSON format
Important Technical Notes
- FheInt14 Range: Valid range is -8192 to 8191 (that is, -2^13 to 2^13 - 1). Values outside this range (such as i16::MAX = 32767) will overflow
- Bitonic Sort: Requires up=true for ascending order so that the smallest distances come first. Using false sorts in descending order and yields the largest distances (wrong for KNN)
- Performance: Bitonic sort is fastest for larger datasets due to parallel processing, but requires power-of-2 padding
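The interaction between padding and sort direction can be illustrated with a plaintext bitonic sort. This is a sketch under stated assumptions, not the repository's encrypted_bitonic_sort(): the Neighbor alias, the function names, and the choice of a maximum-distance sentinel for padding are hypothetical, and the encrypted version would need a data-independent compare-and-select in place of the branch in compare_swap.

```rust
/// Plaintext (distance, index) pair standing in for an encrypted neighbor.
type Neighbor = (u32, usize);

/// Order one pair: with up == true the smaller element ends at index i.
fn compare_swap(items: &mut [Neighbor], i: usize, j: usize, up: bool) {
    if (items[i] > items[j]) == up {
        items.swap(i, j);
    }
}

/// Iterative bitonic sort; the slice length must be a power of two.
fn bitonic_sort(items: &mut [Neighbor], up: bool) {
    let n = items.len();
    assert!(n.is_power_of_two(), "length must be a power of two");
    let mut k = 2;
    while k <= n {
        let mut j = k / 2;
        while j > 0 {
            // All pairs within one (k, j) stage are independent of each other,
            // which is what makes the network easy to parallelize.
            for i in 0..n {
                let partner = i ^ j;
                if partner > i {
                    // Blocks of size k alternate direction to form bitonic runs.
                    let block_up = ((i & k) == 0) == up;
                    compare_swap(items, i, partner, block_up);
                }
            }
            j /= 2;
        }
        k *= 2;
    }
}

/// Pad to a power of two, sort ascending, and keep the k nearest entries.
fn k_smallest(mut neighbors: Vec<Neighbor>, k: usize) -> Vec<Neighbor> {
    if neighbors.is_empty() {
        return neighbors;
    }
    let real_len = neighbors.len();
    // Sentinels carry the maximum distance, so an ascending sort moves them
    // to the end where truncation discards them.
    neighbors.resize(real_len.next_power_of_two(), (u32::MAX, usize::MAX));
    bitonic_sort(&mut neighbors, true);
    neighbors.truncate(k.min(real_len));
    neighbors
}
```

With 100 training points the padded length is 128. Sorting with up set to false would place the sentinels and the farthest points first, which is why ascending order matters for KNN.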
Git Workflow Instructions
IMPORTANT: When the user asks to "write commit" or "帮我写commit" (Chinese for "help me write a commit"):
- Do NOT add any files to staging area
- User has already staged the files they want to commit
- Only create the commit with appropriate message for the staged changes