# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview

This project implements a K-Nearest Neighbors (KNN) algorithm using Fully Homomorphic Encryption (FHE) in Rust. The implementation uses TFHE-rs for cryptographic operations and operates on a 10-dimensional synthetic dataset with 100 training points.
## Key Architecture

- **Homomorphic Encryption**: Uses the TFHE-rs library for fully homomorphic encryption operations
- **Data Processing**: Synthetic dataset features are multiplied by 10 so that one decimal place of precision is preserved when they are stored as integers
- **KNN Implementation**: Complete implementation with multiple algorithms:
  - `euclidean_distance()`: Optimized distance calculation using precomputed squares
  - `perform_knn_selection()`: Supports selection sort, bitonic sort, and heap-based selection
  - `encrypted_bitonic_sort()`: Parallel bitonic sort with power-of-2 padding
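
The "precomputed squares" optimization in `euclidean_distance()` is not spelled out in this file. One common way to get it, assumed for this sketch, is the expansion `||q - p||^2 = ||q||^2 + ||p||^2 - 2(q · p)`, where each squared norm is computed once per point. The plaintext Rust below only illustrates that arithmetic on `i32` values. The names `Point`, `squared_norm`, and `squared_distance` are hypothetical and not taken from the repository, and the real code would operate on TFHE-rs ciphertexts rather than plain integers.

```rust
// Plaintext sketch (assumption): squared Euclidean distance using
// precomputed squared norms. All names here are hypothetical.

/// A point together with its squared norm, computed once at construction.
struct Point {
    coords: Vec<i32>,
    squared_norm: i32,
}

impl Point {
    fn new(coords: Vec<i32>) -> Self {
        // Sum of squares of the coordinates, stored for reuse.
        let squared_norm: i32 = coords.iter().map(|c| c * c).sum();
        Point { coords, squared_norm }
    }
}

/// ||q - p||^2 = ||q||^2 + ||p||^2 - 2 * (q . p)
/// Only the dot product has to be computed per (query, point) pair.
fn squared_distance(q: &Point, p: &Point) -> i32 {
    let dot: i32 = q.coords.iter().zip(&p.coords).map(|(a, b)| a * b).sum();
    q.squared_norm + p.squared_norm - 2 * dot
}

/// Direct form, kept only to check the expansion against.
fn squared_distance_direct(q: &Point, p: &Point) -> i32 {
    q.coords
        .iter()
        .zip(&p.coords)
        .map(|(a, b)| (a - b) * (a - b))
        .sum()
}
```

For `q = (1, 2, 3)` and `p = (4, 6, 3)`, both functions return 25 (norms 14 and 61, dot product 25). The square root is skipped because ranking neighbors by squared distance gives the same order.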
## Build and Development Commands

```bash
# Build the project
cargo build

# Run with different algorithms
cargo run --bin enc                          # Default selection sort
cargo run --bin enc -- --algorithm=bitonic   # Bitonic sort (fastest for large datasets)
cargo run --bin enc -- --algorithm=heap      # Heap-based selection
cargo run --bin enc -- --debug               # Debug mode with plaintext verification
cargo run --bin plain                        # Plaintext version for comparison

# Development commands - ALWAYS use cargo check for verification
cargo check   # Use this for code verification, NOT cargo run
cargo test
cargo fmt
cargo clippy
```
## Data Structure

The project processes a synthetic 10-dimensional dataset with these key data structures:

- `EncryptedQuery`: Query point with precomputed values for optimization
- `EncryptedPoint`: Training data points with precomputed squared sums
- `EncryptedNeighbor`: Distance and index pairs for KNN results
- Custom deserializer converts float values to scaled integers (×10) for FHE compatibility
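
The float-to-integer conversion can be sketched as follows. This is a standalone illustration, not the repository's deserializer: the function name `scale_feature`, the constants, the choice to round rather than truncate, and the decision to reject out-of-range values are all assumptions. The bounds are those of a 14-bit signed integer, -8192 to 8191.

```rust
// Sketch (assumption): scale a float feature by 10 and check that the result
// fits in a 14-bit signed integer. All names here are hypothetical.

const SCALE: f64 = 10.0;
const INT14_MIN: i16 = -8192; // -2^13
const INT14_MAX: i16 = 8191; // 2^13 - 1

/// Returns the scaled integer, or `None` if the input is not finite
/// or the scaled value falls outside the 14-bit signed range.
fn scale_feature(value: f64) -> Option<i16> {
    if !value.is_finite() {
        return None;
    }
    // Rounding (rather than truncating) is an assumption of this sketch.
    let scaled = (value * SCALE).round();
    if scaled < f64::from(INT14_MIN) || scaled > f64::from(INT14_MAX) {
        return None;
    }
    Some(scaled as i16)
}
```

With this sketch, `scale_feature(3.7)` yields `Some(37)`, while `scale_feature(819.2)` yields `None` because 8192 is one past the upper bound.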
## Dataset

- **Training Data**: `dataset/train.jsonl` containing one query point and 100 10-dimensional training points
- **Results**: `dataset/answer.jsonl` and `dataset/answer1.jsonl` contain KNN classification results in JSON format
## Important Technical Notes

- **FheInt14 Range**: Valid range is -8192 to 8191 (-2^13 to 2^13 - 1). Values outside this range (like i16::MAX = 32767) will overflow
- **Bitonic Sort**: Requires `up=true` for ascending order to get smallest distances first. Using `false` sorts in descending order, which puts the largest distances first (wrong for KNN)
- **Performance**: Bitonic sort is fastest for larger datasets due to parallel processing, but requires power-of-2 padding
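
To make the `up` flag and the power-of-2 padding concrete, here is a plaintext sketch of a recursive bitonic sort over `(distance, index)` pairs. It is an illustration under assumptions, not the repository's `encrypted_bitonic_sort()`: the function names, the `PAD` sentinel, and the sequential recursion are all hypothetical. In the encrypted version each compare-and-swap would be a homomorphic comparison and selection, and the independent swaps within a merge stage are what can run in parallel.

```rust
// Plaintext sketch (assumption): bitonic sort on (distance, index) pairs
// with power-of-2 padding. All names here are hypothetical.

/// Padding entry; it compares greater than any real entry, so it sorts last
/// when the final order is ascending.
const PAD: (i32, usize) = (i32::MAX, usize::MAX);

/// After this call, v[i] <= v[j] if `up` is true, v[i] >= v[j] otherwise.
fn compare_swap(v: &mut [(i32, usize)], i: usize, j: usize, up: bool) {
    if (v[i] > v[j]) == up {
        v.swap(i, j);
    }
}

/// Turns the bitonic sequence v[lo..lo + n] into a sorted one.
fn bitonic_merge(v: &mut [(i32, usize)], lo: usize, n: usize, up: bool) {
    if n > 1 {
        let half = n / 2;
        for i in lo..lo + half {
            compare_swap(v, i, i + half, up);
        }
        bitonic_merge(v, lo, half, up);
        bitonic_merge(v, lo + half, half, up);
    }
}

/// Sorts v[lo..lo + n]; `n` must be a power of two.
fn bitonic_sort(v: &mut [(i32, usize)], lo: usize, n: usize, up: bool) {
    if n > 1 {
        let half = n / 2;
        // Sort the halves in opposite directions to form a bitonic sequence.
        bitonic_sort(v, lo, half, true);
        bitonic_sort(v, lo + half, half, false);
        bitonic_merge(v, lo, n, up);
    }
}

/// Pads to the next power of two, sorts ascending (`up = true`),
/// drops the padding, and keeps the `k` smallest distances.
fn k_nearest(distances: &[i32], k: usize) -> Vec<(i32, usize)> {
    let mut v: Vec<(i32, usize)> = distances.iter().copied().zip(0..).collect();
    let padded = v.len().next_power_of_two();
    v.resize(padded, PAD);
    bitonic_sort(&mut v, 0, padded, true);
    v.truncate(distances.len()); // padding sits at the end after an ascending sort
    v.truncate(k);
    v
}
```

For the distances `[50, 10, 40, 20, 30]` and `k = 3`, the input is padded from 5 to 8 entries and the result is `[(10, 1), (20, 3), (30, 4)]`. Passing `false` at the top level would instead move the padding and the largest distances to the front, which is the failure mode described above.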
## Git Workflow Instructions

**IMPORTANT**: When the user asks to "write commit" or "帮我写commit":

- Do NOT add any files to the staging area
- The user has already staged the files they want to commit
- Only create the commit, with an appropriate message for the staged changes