MediaPipe Models and Model Cards
- Face Detection
- Face Mesh
- Iris
- Hands
- Pose
- Holistic
- Selfie Segmentation
- Hair Segmentation
- Object Detection
- Objectron
- KNIFT
Face Detection
- Short-range model (best for faces within 2 meters of the camera): TFLite model, TFLite model quantized for EdgeTPU/Coral, Model card
- Full-range model (dense, best for faces within 5 meters of the camera): TFLite model, Model card
- Full-range model (sparse, best for faces within 5 meters of the camera): TFLite model, Model card
The full-range dense and sparse models achieve the same F-score but differ in the underlying metrics: the dense model has slightly better recall, whereas the sparse model has better precision. The sparse model is also about 30% faster on CPU via XNNPACK, while on GPU the two models show comparable latency. Depending on your application, you may prefer one over the other.
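To see how two models can share an F-score while trading precision against recall, here is a minimal sketch. It assumes the F-score in question is the usual F1 (the harmonic mean of precision and recall). The numbers are hypothetical and chosen only for illustration; they are not measurements of the dense or sparse model.

```python
def f_score(precision: float, recall: float) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values, NOT measured on the MediaPipe models.
# "dense" leans towards recall, "sparse" leans towards precision.
dense = {"precision": 0.90, "recall": 0.96}
sparse = {"precision": 0.96, "recall": 0.90}

dense_f1 = f_score(**dense)
sparse_f1 = f_score(**sparse)

# Both lines print the same value, because F1 is symmetric in its arguments.
print(f"dense  F1 = {dense_f1:.4f}")
print(f"sparse F1 = {sparse_f1:.4f}")
```

Because the score is symmetric, it cannot tell you which kind of error a model makes. If missed faces are costlier than false detections in your application, recall is the metric to weigh, and vice versa.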
Face Mesh
- Face landmark model: TFLite model, TF.js model
- Face landmark model w/ attention (aka Attention Mesh): TFLite model
- Model card, Model card (w/ attention)
Iris
- Iris landmark model: TFLite model
- Model card
Hands
- Palm detection model: TFLite model (lite), TFLite model (full), TF.js model
- Hand landmark model: TFLite model (lite), TFLite model (full), TF.js model
- Model card
Pose
- Pose detection model: TFLite model
- Pose landmark model: TFLite model (lite), TFLite model (full), TFLite model (heavy)
- Model card
Holistic
- Hand recrop model: TFLite model
Selfie Segmentation
Hair Segmentation
Object Detection
Objectron
- TFLite model for shoes
- TFLite model for chairs
- TFLite model for cameras
- TFLite model for cups
- Single-stage TFLite model for shoes
- Single-stage TFLite model for chairs
- Model card
KNIFT
- TFLite model for up to 200 keypoints
- TFLite model for up to 400 keypoints
- TFLite model for up to 1000 keypoints
- Model card
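The three KNIFT variants differ in how many keypoints they accept. As a rough sketch, the helper below picks the smallest listed capacity that fits a given keypoint count. The function name `pick_knift_capacity` is hypothetical, and preferring the smallest variant is an assumption made for illustration, not documented guidance.

```python
# Keypoint capacities of the three KNIFT models listed above.
KNIFT_CAPACITIES = (200, 400, 1000)


def pick_knift_capacity(num_keypoints: int) -> int:
    """Return the smallest listed KNIFT capacity that fits num_keypoints.

    Hypothetical helper: choosing the smallest variant that fits is an
    assumption, not a rule stated by the model documentation.
    """
    if num_keypoints < 1:
        raise ValueError("num_keypoints must be at least 1")
    for capacity in KNIFT_CAPACITIES:
        if num_keypoints <= capacity:
            return capacity
    raise ValueError(
        f"{num_keypoints} keypoints exceed the largest listed capacity "
        f"({KNIFT_CAPACITIES[-1]})"
    )


print(pick_knift_capacity(150))
print(pick_knift_capacity(350))
```

For inputs that exceed 1000 keypoints, the helper raises an error rather than guessing; in practice you would need to reduce the keypoint count first.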