Use with DataLad¶
Version control your fitness data with DataLad for full history tracking.
What is DataLad?¶
DataLad is a data management tool built on Git and git-annex. It lets you:
- Track changes to large files (photos, GPS data)
- Version your entire fitness history
- Sync data across machines
- Share datasets with others
Setting Up DataLad¶
Install DataLad¶
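DataLad is published on PyPI and conda-forge, and it needs git-annex available on the PATH. The sketch below lists two common install routes as comments (pick whichever fits your system; neither is prescribed by MyKrok) and then checks that both tools can be found.

```shell
# Install with one of the following (shown as comments, run the one you prefer):
#   python3 -m pip install datalad          # git-annex must be installed separately
#   conda install -c conda-forge datalad    # git-annex is also packaged on conda-forge

# Afterwards, confirm that both tools are on the PATH.
missing=0
for tool in datalad git-annex; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done
```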
Create a DataLad Dataset¶
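The exact MyKrok command is not reproduced here. As a minimal plain-DataLad sketch, `datalad create` with the `text2git` procedure sets up a comparable split between Git and git-annex. The helper name `init_dataset` and the example path are hypothetical.

```shell
# Hypothetical helper: turn an existing (possibly non-empty) directory into a dataset.
init_dataset() {
  if [ -z "$1" ]; then
    echo "usage: init_dataset DATA_DIR" >&2
    return 2
  fi
  # --force permits a non-empty directory; -c text2git writes .gitattributes rules
  # that keep text files in Git and send binary files to git-annex.
  datalad create --force -c text2git "$1"
}

# Example: init_dataset "$HOME/mykrok-data"
```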
This initializes your data directory as a DataLad dataset with:
- Git for text files (JSON, TSV)
- git-annex for binary files (photos, Parquet)
- Appropriate .gitattributes rules
Workflow¶
After Each Sync¶
Save changes to the dataset:
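A minimal sketch of that step, assuming the dataset directory is passed as the first argument. `datalad save` records every change in the dataset. The helper names and the message format are only suggestions.

```shell
# Build a commit message that records the sync date.
sync_message() {
  printf 'Sync %s' "$(date +%Y-%m-%d)"
}

# Save everything that changed in the dataset at the given path.
save_sync() {
  datalad save -d "$1" -m "$(sync_message)"
}

# Example: save_sync "$HOME/mykrok-data"
```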
View History¶
See all changes:
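Because a DataLad dataset is a Git repository, plain `git log` is enough to browse its history. The helper below is a hypothetical convenience wrapper; its second argument optionally restricts the log to one path inside the dataset.

```shell
# $1: dataset directory, $2: optional path inside it (defaults to the whole dataset)
show_history() {
  # --stat lists the files touched by each save
  git -C "$1" log --oneline --stat -- "${2:-.}"
}

# Example: show_history "$HOME/mykrok-data" data/athl=username
```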
Restore Previous Version¶
# View a specific activity from the past
git show HEAD~10:data/athl=username/ses=20251218T063000/info.json
# Restore a file
git checkout HEAD~10 -- data/athl=username/ses=20251218T063000/info.json
Remote Storage¶
Push to GitHub¶
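One possible setup, assuming the GitHub repository already exists and you have SSH access to it: register it as a sibling named `github`, then push. GitHub does not host git-annex content, so this publishes the Git history (JSON, TSV, file pointers), while annexed files need a separate annex-capable remote. The helper names, user, and repository below are placeholders.

```shell
# Compose the SSH URL for a GitHub repository.
github_url() {
  printf 'git@github.com:%s/%s.git' "$1" "$2"
}

# $1: dataset directory, $2: GitHub user or organization, $3: repository name
push_to_github() {
  datalad siblings add -d "$1" -s github --url "$(github_url "$2" "$3")" &&
    datalad push -d "$1" --to github
}

# Example: push_to_github "$HOME/mykrok-data" your-user fitness-data
```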
Push to Another Machine¶
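A sketch for a machine reachable over SSH: `datalad create-sibling` creates the remote repository, and `datalad push` then transfers both the history and the annexed content. The sibling name `backup` and the example URL are assumptions.

```shell
# $1: dataset directory, $2: SSH target, e.g. ssh://user@host/path or host:path
push_to_host() {
  case "$2" in
    *:*) ;;  # looks like an SSH URL or host:path
    *) echo "expected an SSH target, got: $2" >&2; return 1 ;;
  esac
  datalad create-sibling -d "$1" -s backup "$2" &&
    datalad push -d "$1" --to backup
}

# Example: push_to_host "$HOME/mykrok-data" ssh://user@server.example/home/user/mykrok-data
```

After the first run the sibling already exists, so later updates only need `datalad push --to backup`.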
Automated Saves¶
Combine with cron for automatic version control:
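A hedged sketch of such a job: the function below saves only when something changed, and the crontab line in the comment shows how a script containing it might be scheduled. The schedule, script path, and log path are placeholders, and the MyKrok sync command itself is left out because it is not shown in this section.

```shell
# Example crontab entry (hypothetical schedule and paths):
#   30 2 * * * /home/user/bin/mykrok-save.sh >> /home/user/mykrok-save.log 2>&1

# $1: dataset directory
autosave() {
  dir=$1
  # `git status --porcelain` prints nothing when the working tree is clean
  if [ -z "$(git -C "$dir" status --porcelain)" ]; then
    echo "nothing to save"
    return 0
  fi
  datalad save -d "$dir" -m "Automated save $(date +%Y-%m-%dT%H:%M)"
}

# Example: autosave "$HOME/mykrok-data"
```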
Large File Handling¶
Photos and Parquet files are stored in git-annex:
# List annexed files and where their content is stored
git annex whereis data/athl=username/ses=20251218T063000/photos/
# Get specific files
datalad get data/athl=username/ses=20251218T063000/photos/
# Drop local copies to save space (content stays available from a remote)
datalad drop data/athl=username/ses=20251218T063000/photos/
Migration from Existing Data¶
If you already have MyKrok data:
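One way to do this with plain DataLad, assuming the existing data sits in a single directory: create the dataset in place, then save the files that are already there as the first commit. The helper name and commit message are hypothetical.

```shell
# $1: directory that already contains MyKrok data
migrate_existing() {
  dir=$1
  # A .datalad directory marks an existing dataset; do not re-create it.
  if [ -d "$dir/.datalad" ]; then
    echo "already a DataLad dataset: $dir"
    return 0
  fi
  # --force allows creating a dataset in a non-empty directory.
  datalad create --force -c text2git "$dir" &&
    datalad save -d "$dir" -m "Import existing MyKrok data"
}

# Example: migrate_existing "$HOME/mykrok-data"
```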
Best Practices¶
- Save after each sync: keeps the history clean
- Use meaningful messages: include the date or activity count
- Push regularly: keeps the remote backup current
- Don't modify files manually: let MyKrok manage them
Troubleshooting¶
"Not a datalad dataset"¶
Initialize the dataset:
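If the directory was never set up, `datalad create --force -c text2git <data-dir>` initializes it. Before doing that, it may be worth checking whether the command simply ran outside an existing dataset. The hypothetical helper below walks up from a directory until it finds a `.datalad/config` file, which marks a dataset root.

```shell
# Print the root of the dataset containing $1 (default: current directory).
# Returns non-zero when no dataset is found.
find_dataset_root() {
  dir=$(cd "${1:-.}" && pwd) || return 1
  while :; do
    if [ -f "$dir/.datalad/config" ]; then
      printf '%s\n' "$dir"
      return 0
    fi
    [ "$dir" = "/" ] && return 1
    dir=$(dirname "$dir")
  done
}

# Example: find_dataset_root . || echo "no dataset here; initialize one first"
```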
git-annex Lock Issues¶
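By default git-annex keeps annexed files locked, that is, as read-only links into the annex, so attempts to overwrite them fail until they are unlocked. Assuming that is the problem at hand, the sketch below unlocks a file only when it is still a link. The helper names are hypothetical, and the link check does not apply to repositories in adjusted (unlocked) mode.

```shell
# A locked annexed file is a symbolic link into .git/annex/objects.
is_locked() {
  [ -L "$1" ]
}

# $1: path to an annexed file
unlock_if_needed() {
  if is_locked "$1"; then
    datalad unlock "$1"
  else
    echo "already writable: $1"
  fi
}

# After editing, `datalad save` records the change and locks the file again.
```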
Large Clone Size¶
Use partial clones:
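One way to keep a clone small, assuming the goal is to avoid downloading all annexed content: `datalad clone` fetches the history and file pointers only, and `datalad get` then retrieves just the paths you need. The helper name and arguments are hypothetical.

```shell
# $1: source URL or path, $2: target directory, $3: optional path inside the dataset to fetch
clone_without_content() {
  if [ "$#" -lt 2 ]; then
    echo "usage: clone_without_content SOURCE TARGET [PATH]" >&2
    return 2
  fi
  datalad clone "$1" "$2" || return 1
  if [ -n "$3" ]; then
    # Paths on the command line are relative to the current directory.
    datalad get -d "$2" "$2/$3"
  fi
}

# Example: clone_without_content ssh://user@server.example/home/user/mykrok-data ./mykrok-data data/athl=username
```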