Selected image:
Draw a box around each musical system in the image
Model weights path (defined in acai_omr/config.py): tf_omr_train/vitomr.pth
Encoding image latent