Final Project

Khana – Indian Food Classification and Detection

Results

Part               | Method                           | Result
1 – Classification | ConvNeXt-Base, 2-stage training  | 95.87% val acc (baseline 91%)
2 – Detection      | SAM + ConvNeXt classifier        | Khana 80-class labels, Precision/Recall
3 – BEV            | Homography warp + Detection      | Natural image → top-down → detection
Bonus – NeRF       | Coarse+Fine NeRF, ArUco poses    | 22 dB PSNR, 20k steps, T4 GPU

Links

Part 1

Image Classification

Train a classifier on the Khana dataset and beat the 91% baseline.

Used ConvNeXt-Base pretrained on ImageNet-22k. Two-stage training: 224px then 320px finetune. Mixup, cutmix, RandAugment, layer-wise LR decay, and weighted sampling for class balance.
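Two of the techniques listed above reduce to small formulas, sketched here in plain Python. This is an illustration under assumptions, not the code in codes/train.py: the function names, the decay factor of 0.5, the four layer groups, and the example labels are all hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-sample weight = 1 / (number of samples in that sample's class).

    Every class then contributes the same total weight, so rare classes
    are drawn as often as common ones when sampling by weight.
    """
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def layerwise_lr(base_lr, num_groups, decay):
    """Learning rate per layer group; index 0 is the group nearest the input.

    The last group (the classifier head) gets base_lr, and each earlier
    group is scaled down by one more factor of `decay`.
    """
    return [base_lr * decay ** (num_groups - 1 - i) for i in range(num_groups)]


# Illustrative labels only (hypothetical class names).
labels = ["dosa", "dosa", "dosa", "idli"]
weights = inverse_frequency_weights(labels)

# Hypothetical settings: 4 layer groups, decay 0.5.
lrs = layerwise_lr(base_lr=1e-3, num_groups=4, decay=0.5)

print(weights)
print(lrs)
```

In a PyTorch pipeline, weights like these are typically handed to torch.utils.data.WeightedRandomSampler, and the per-group rates become separate optimizer parameter groups.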

Final val accuracy: 95.87%, which is 4.87 percentage points above the 91% baseline.

Part 2

Thali Detection

Detect food items in clean thali images with correct Khana labels.

SAM generates food region masks, and each crop is passed through the trained ConvNeXt classifier to get a Khana label. NMS and confidence filtering then remove duplicate and low-confidence detections.
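The cleanup step can be sketched as a confidence threshold followed by greedy, class-agnostic NMS. This is a minimal sketch; the thresholds (0.5 for both), the (box, label, score) tuple layout, and the example detections are assumptions rather than values taken from codes/detect_thali.py.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_thresh=0.5, iou_thresh=0.5):
    """Drop low-confidence detections, then apply greedy NMS.

    dets: list of (box, label, score) tuples.
    A detection is kept only if it does not overlap an already kept,
    higher-scoring detection by more than iou_thresh.
    """
    candidates = sorted(
        (d for d in dets if d[2] >= conf_thresh),
        key=lambda d: d[2],
        reverse=True,
    )
    kept = []
    for det in candidates:
        if all(iou(det[0], k[0]) <= iou_thresh for k in kept):
            kept.append(det)
    return kept


# Hypothetical detections: two overlapping boxes, one separate, one weak.
dets = [
    ((0, 0, 10, 10), "rice", 0.9),
    ((1, 1, 10, 10), "rice", 0.8),
    ((20, 20, 30, 30), "dal", 0.7),
    ((40, 40, 50, 50), "roti", 0.2),
]
kept = filter_detections(dets)
print([(label, score) for _, label, score in kept])
```

The second rice box overlaps the first with IoU 0.81 and is suppressed, and the roti box falls below the confidence threshold.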

Evaluated by Precision/Recall on labels only.
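One way to read "labels only" is to compare the multiset of predicted labels with the multiset of ground-truth labels for an image, ignoring box positions. The sketch below assumes that interpretation; the function name and the example labels are hypothetical, and the project's actual scoring may differ.

```python
from collections import Counter


def label_precision_recall(predicted, ground_truth):
    """Precision and recall over label multisets, ignoring localization.

    A predicted label counts as a true positive if the same label is
    still unmatched in the ground truth (multiset intersection).
    """
    pred, gt = Counter(predicted), Counter(ground_truth)
    true_pos = sum((pred & gt).values())
    precision = true_pos / sum(pred.values()) if pred else 0.0
    recall = true_pos / sum(gt.values()) if gt else 0.0
    return precision, recall


# Hypothetical labels for a single thali image.
predicted = ["rice", "dal", "roti", "roti"]
ground_truth = ["rice", "dal", "sabzi"]
precision, recall = label_precision_recall(predicted, ground_truth)
print(precision, recall)
```

Here two of the four predictions match, giving precision 0.5, and two of the three true items are found, giving recall 2/3.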

Part 3

BEV Transform + Detection

Convert natural angled images to bird's eye view, then detect.

Interactive mode: click 4 corners of the tray. Auto mode: Hough circle detection. A perspective warp flattens the view before running the Part 2 pipeline.
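Four clicked corners are enough to determine the perspective warp. The sketch below solves for the 3x3 homography directly with NumPy so the math is visible; the corner coordinates and the 512x512 output size are made-up values. In OpenCV the same matrix would normally come from cv2.getPerspectiveTransform and be applied with cv2.warpPerspective, though whether the project uses those calls is an assumption.

```python
import numpy as np


def homography_from_corners(src, dst):
    """Solve for H (3x3, H[2, 2] = 1) mapping 4 src points to 4 dst points.

    Each pair (x, y) -> (u, v) gives two linear equations in the
    remaining 8 unknown entries of H.
    """
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def warp_point(H, pt):
    """Apply H to a 2D point using homogeneous coordinates."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w


# Hypothetical clicked tray corners (top-left, top-right, bottom-right,
# bottom-left) and the square top-down target they should map to.
src = [(120, 80), (520, 100), (600, 420), (60, 400)]
dst = [(0, 0), (511, 0), (511, 511), (0, 511)]

H = homography_from_corners(src, dst)
for s in src:
    print(s, "->", np.round(warp_point(H, s), 3))
```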

Evaluated qualitatively by visual inspection.

Bonus

NeRF of Thali

Neural Radiance Field from 51 real thali photos.

Camera calibrated with an ArUco grid (RMS = 1.21 px). Poses estimated using solvePnP. 2D neural field warmup (34 dB). Full 3D NeRF trained for 20k iterations on a T4 GPU.

Final PSNR: 22 dB. Rendered a 60-frame spiral video.
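For reference, PSNR relates to mean squared error by PSNR = 10 · log10(MAX² / MSE). The sketch below assumes pixel values scaled to [0, 1] (MAX = 1), which is a common convention for NeRF training but is not stated in the source; the function names are illustrative.

```python
import math


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_from_psnr(psnr_db, max_value=1.0):
    """Inverse of psnr(): the MSE that corresponds to a PSNR in dB."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))


# MSE implied by the two reported figures, assuming [0, 1] pixel values.
mse_3d = mse_from_psnr(22.0)  # full 3D NeRF
mse_2d = mse_from_psnr(34.0)  # 2D neural field warmup
print(f"{mse_3d:.5f} {mse_2d:.6f}")
```

Under that assumption, 22 dB corresponds to an MSE of about 0.0063 and 34 dB to about 0.0004, so the 2D fit has roughly 16 times lower squared error than the 3D reconstruction.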

How to Run

Setup
pip install torch torchvision timm segment-anything opencv-python pillow matplotlib imageio

Part 1 – Train & Predict
python codes/train.py
python codes/predict_khana.py

Part 2 – Detection
python codes/detect_thali.py path/to/thali.jpg --bev none

Part 3 – BEV + Detection
python codes/detect_thali.py path/to/thali.jpg --bev interactive
python codes/detect_thali.py path/to/thali.jpg --bev auto

Bonus – NeRF
python codes/step0_calibrate.py
python codes/step0_pose_and_dataset.py
python codes/step1_neural_field_2d.py --image path/to/image.jpg
python codes/step2_nerf_3d.py

Model weights are on Google Drive (link above).