MediaPipe Models and Model Cards
- Face Detection
 - Face Mesh
 - Iris
 - Hands
 - Pose
 - Holistic
 - Selfie Segmentation
 - Hair Segmentation
 - Object Detection
 - Objectron
 - KNIFT
 
Face Detection
- Short-range model (best for faces within 2 meters of the camera): TFLite model, TFLite model quantized for EdgeTPU/Coral, Model card
 - Full-range model (dense, best for faces within 5 meters of the camera): TFLite model, Model card
 - Full-range model (sparse, best for faces within 5 meters of the camera): TFLite model, Model card
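
The range guidance in the list above can be read as a simple selection rule. The following is a minimal sketch, not part of MediaPipe: the function name `pick_face_detector` and the returned labels are hypothetical, and treating the 2 m and 5 m limits as inclusive is an assumption. Only the two distance figures come from the list.

```python
from typing import Optional

# Distance limits taken from the list above, in meters.
SHORT_RANGE_MAX_M = 2.0
FULL_RANGE_MAX_M = 5.0


def pick_face_detector(max_face_distance_m: float) -> Optional[str]:
    """Return a (hypothetical) label for the variant suited to the distance.

    Returns None when the distance is beyond what the list recommends.
    """
    if max_face_distance_m < 0:
        raise ValueError("distance must be non-negative")
    if max_face_distance_m <= SHORT_RANGE_MAX_M:  # inclusive bound is an assumption
        return "short-range"
    if max_face_distance_m <= FULL_RANGE_MAX_M:
        return "full-range"
    return None
```

For example, a selfie-style application with faces about 1 m away would map to the short-range model, while a room-scale camera at 3 to 4 m would map to one of the full-range models.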
 
The full-range dense and sparse models achieve the same F-score but differ in the underlying metrics: the dense model is slightly better in recall, whereas the sparse model is better in precision. In terms of speed, the sparse model is ~30% faster when executed on CPU via XNNPACK, while on GPU the two models show comparable latency. Depending on your application, you may prefer one over the other.
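
To see how two models can share an F-score while differing in precision and recall, note that the F1 score is the harmonic mean of the two, so swapping their values leaves it unchanged. The sketch below assumes the F-score meant here is F1. The precision and recall numbers are invented for illustration and are not taken from the model cards.

```python
import math


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not measured results.
dense = {"precision": 0.90, "recall": 0.94}   # stronger recall
sparse = {"precision": 0.94, "recall": 0.90}  # stronger precision

dense_f1 = f1_score(**dense)
sparse_f1 = f1_score(**sparse)

# Same F1 (up to floating-point rounding) despite different trade-offs.
same_f1 = math.isclose(dense_f1, sparse_f1, rel_tol=1e-9)
```

Which variant suits an application then depends on whether missed faces (recall) or false detections (precision) are more costly, and on the target hardware.
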
Face Mesh
- Face landmark model: TFLite model, TF.js model
 - Face landmark model with attention (also known as Attention Mesh): TFLite model
 - Model card, Model card (with attention)
 
Iris
- Iris landmark model: TFLite model
 - Model card
 
Hands
- Palm detection model: TFLite model (lite), TFLite model (full), TF.js model
 - Hand landmark model: TFLite model (lite), TFLite model (full), TF.js model
 - Model card
 
Pose
- Pose detection model: TFLite model
 - Pose landmark model: TFLite model (lite), TFLite model (full), TFLite model (heavy)
 - Model card
 
Holistic
- Hand recrop model: TFLite model
 
Selfie Segmentation

Hair Segmentation

Object Detection

Objectron
- TFLite model for shoes
 - TFLite model for chairs
 - TFLite model for cameras
 - TFLite model for cups
 - Single-stage TFLite model for shoes
 - Single-stage TFLite model for chairs
 - Model card
 
KNIFT
- TFLite model for up to 200 keypoints
 - TFLite model for up to 400 keypoints
 - TFLite model for up to 1000 keypoints
 - Model card