This folder is a starter template for a GitHub annotation repository that works with Cellucid's community annotation UI.
The core idea: each annotator writes their own file (no shared edits / no merge conflicts), and Cellucid compiles the merged consensus view in the browser on Pull.
- annotations/schema.json - JSON schema reference for user vote files
- annotations/config.json - Dataset binding + author-controlled annotatable fields + per-field consensus settings
- annotations/users/*.json - Per-user suggestions & votes (conflict-free collaboration)
- annotations/moderation/merges.json - Optional author-only merges (maintainers/admins)
- scripts/validate_user_files.py - Validation script (run by CI and usable locally)
- .github/workflows/validate.yml - GitHub Actions workflow (validation)
This template is designed for many annotators to collaborate safely:
- Each person contributes only annotations/users/ghid_<id>.json.
- Authors (maintain/admin) can optionally curate annotations/moderation/merges.json.
- In Cellucid, Pull downloads the raw files under annotations/users/ and annotations/moderation/ (SHA-based: downloads only what changed) and compiles a merged view locally.
  - The browser cache is scoped by datasetId + repo + branch + GitHub user.id (multi-user + multi-project safe).
- Cellucid can export a locally-built consensus.json snapshot from the sidebar (useful for downstream tooling); it is not committed back to the repo.
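
To illustrate why scoping the cache by datasetId + repo + branch + user.id keeps users and projects from colliding, here is a minimal sketch. The `CacheScope` type, the `cache_key` helper, the `|` separator, and the sample values are all hypothetical; the key format Cellucid actually uses is not documented here.

```python
from typing import NamedTuple


class CacheScope(NamedTuple):
    """Hypothetical container for the four scope components."""
    dataset_id: str
    repo: str      # "owner/repo"
    branch: str
    user_id: int   # GitHub numeric user id


def cache_key(scope: CacheScope) -> str:
    # Assumption: a simple joined string; any injective encoding would do.
    return "|".join(
        [scope.dataset_id, f"{scope.repo}@{scope.branch}", str(scope.user_id)]
    )


# Made-up example values.
a = CacheScope("pbmc_demo", "lab/annotations", "main", 1001)
b = a._replace(user_id=1002)   # same project, different GitHub user
c = a._replace(branch="dev")   # same user, different branch

# Each scope gets its own key, so cached files never mix.
print(cache_key(a))
print(cache_key(b))
print(cache_key(c))
```

Changing any one component yields a different key, which is the property the "multi-user + multi-project safe" note relies on.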
- Create a new GitHub repo from this template folder contents.
- Configure annotations/config.json to match your dataset id(s) and annotatable field(s).
  - supportedDatasets[] may include multiple dataset ids.
  - Authors can also update fieldsToAnnotate, annotatableSettings (minAnnotators, threshold), and closedFields via the Cellucid UI (Publish writes back to annotations/config.json).
- Each collaborator writes only their own file under annotations/users/.
- In Cellucid, connect via GitHub App sign-in (no token paste). Users with write access publish directly; others publish via fork + Pull Request.
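
The per-field settings minAnnotators and threshold are not defined in detail here. The sketch below shows one plausible reading, offered as an assumption rather than Cellucid's actual rule: consensus exists only when at least minAnnotators people have voted and the leading label holds at least a threshold share of those votes. The function name, the vote structure, and the labels are hypothetical.

```python
from collections import Counter
from typing import Optional


def consensus_label(
    votes: dict[str, str], min_annotators: int, threshold: float
) -> Optional[str]:
    """Return the winning label, or None if no consensus.

    votes maps an annotator id to the label they voted for (assumed shape).
    """
    if not votes or len(votes) < min_annotators:
        return None  # not enough annotators yet
    label, count = Counter(votes.values()).most_common(1)[0]
    # Assumption: threshold is a fraction of all votes cast in the bucket.
    return label if count / len(votes) >= threshold else None


votes = {"ghid_1": "T cell", "ghid_2": "T cell", "ghid_3": "NK cell"}
print(consensus_label(votes, min_annotators=3, threshold=0.6))
print(consensus_label(votes, min_annotators=4, threshold=0.6))
```

With these sample votes the leading label holds 2 of 3 votes, so the first call returns it and the second returns None because a fourth annotator is still required.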
This template includes one workflow:
File: .github/workflows/validate.yml
- Runs on pushes and pull requests that touch annotations/** or scripts/**.
- Validates human/client-authored inputs:
  - annotations/config.json
  - annotations/users/*.json
  - annotations/moderation/merges.json (optional)
- Executes:

      python scripts/validate_user_files.py

If this fails, fix the JSON files in annotations/ (do not edit any derived/exported outputs).
You can run the same checks locally (Python 3.10+ recommended; CI uses Python 3.11):

    # Validate inputs (what humans/clients write)
    python scripts/validate_user_files.py

If you maintain the repo and want to "merge" suggestions (e.g. two different labels that should be treated as the same), you can add annotations/moderation/merges.json.
- This file is optional and typically restricted to maintainers/admins.
- Merges create a mapping from fromSuggestionId → intoSuggestionId within the same bucket.
- Bucket key format: <fieldKey>:<categoryLabel>. If fieldKey contains :, Cellucid encodes it as fk~<urlencoded> (example: fk~celltype%3Acoarse:...).
- Cellucid applies this mapping at runtime when computing bundle vote totals and consensus.
- In the Cellucid UI, authors can add merges by dragging a suggestion card onto another.
- The merge dialog includes an optional note.
- You can later edit or delete the merge note from the bundle’s View merged modal (the merge mapping stays the same).
- When a merge note is edited, the merge record may include editedAt (timestamp of the note edit) in addition to at (timestamp of the merge creation).
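
The bucket key encoding and the runtime merge mapping described above can be sketched as follows. This is an illustration under stated assumptions, not Cellucid's implementation: the function names are hypothetical, the category label is assumed to be left unencoded, merge chains are assumed to be followed to their final target, and vote counts are simplified to one integer per suggestion.

```python
from collections import defaultdict
from urllib.parse import quote


def bucket_key(field_key: str, category_label: str) -> str:
    """Build <fieldKey>:<categoryLabel>, encoding a fieldKey that contains ':'."""
    if ":" in field_key:
        # safe="" percent-encodes every reserved character, including ':'.
        field_key = "fk~" + quote(field_key, safe="")
    return f"{field_key}:{category_label}"


def resolve(suggestion_id: str, merges: dict[str, str]) -> str:
    """Follow fromSuggestionId -> intoSuggestionId links (assumed transitive)."""
    seen = set()
    while suggestion_id in merges and suggestion_id not in seen:
        seen.add(suggestion_id)  # guard against accidental cycles
        suggestion_id = merges[suggestion_id]
    return suggestion_id


def bundle_totals(votes: dict[str, int], merges: dict[str, str]) -> dict[str, int]:
    """Sum per-suggestion vote counts into their merge targets."""
    totals: dict[str, int] = defaultdict(int)
    for suggestion_id, count in votes.items():
        totals[resolve(suggestion_id, merges)] += count
    return dict(totals)


print(bucket_key("celltype:coarse", "cluster 3"))
print(bundle_totals({"s1": 2, "s2": 1, "s3": 4, "s4": 1}, {"s2": "s1", "s3": "s2"}))
```

Because the mapping is applied when totals are computed, the per-user files stay untouched; removing a merge entry simply restores the separate counts on the next compile.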
User files include identity metadata that Cellucid stores in each annotations/users/*.json:
- githubUserId (stable GitHub numeric id; file identity is ghid_<id>)
- login (GitHub username; informational only)
- displayName, title, orcid, linkedin (optional; LinkedIn is handle-only)
- datasets (optional): informational record of dataset ids and annotatable fields the user has accessed
- Suggestions may include editedAt when the proposer edits a suggestion (e.g. label/evidence/ontology id/markers).
- Comments include editedAt when a comment is edited.
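
Since file identity is tied to ghid_<id>, a quick local check is that a file's name agrees with the githubUserId stored inside it. The sketch below is not the contents of scripts/validate_user_files.py; the helper name and error messages are hypothetical, and it assumes githubUserId is stored as a JSON integer.

```python
import re

# File names are expected to look like ghid_<numeric id>.json.
USER_FILE_RE = re.compile(r"^ghid_(\d+)\.json$")


def check_user_file(filename: str, doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the identity matches."""
    match = USER_FILE_RE.match(filename)
    if not match:
        return [f"{filename}: name does not match ghid_<id>.json"]
    errors = []
    # Assumption: githubUserId is an integer in the JSON document.
    if doc.get("githubUserId") != int(match.group(1)):
        errors.append(f"{filename}: githubUserId does not match the file name")
    return errors


print(check_user_file("ghid_1001.json", {"githubUserId": 1001, "login": "alice"}))
print(check_user_file("ghid_1001.json", {"githubUserId": 2002, "login": "bob"}))
```

The login field is deliberately ignored here because it is informational only and can change when a user renames their account.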
This template does not commit a merged consensus artifact. Instead:
- Cellucid pulls the raw per-user files (annotations/users/*.json) and optional merges file (annotations/moderation/merges.json)
- Cellucid compiles the merged view locally in the browser on Pull
- You can download a compiled consensus.json snapshot from the sidebar when needed
The first Pull has to populate the local raw-file cache.
After that, Pull uses GitHub sha values to download only the user/merge files that changed.
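
The SHA-based incremental Pull can be pictured as a comparison between the remote file listing and the local cache. This is a minimal sketch under assumptions: `plan_pull` is a hypothetical helper, both inputs are assumed to be path-to-sha mappings, the sample paths and sha strings are made up, and evicting files that disappeared remotely is a guess about cache behavior.

```python
def plan_pull(
    remote: dict[str, str], cached: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Decide which files to fetch and which cached files to drop.

    remote and cached both map a repo path to its blob sha.
    """
    # Fetch anything new or whose sha differs from the cached copy.
    to_download = sorted(p for p, sha in remote.items() if cached.get(p) != sha)
    # Assumption: files deleted upstream are removed from the cache too.
    to_evict = sorted(p for p in cached if p not in remote)
    return to_download, to_evict


remote = {
    "annotations/users/ghid_1.json": "aaa111",
    "annotations/users/ghid_2.json": "bbb999",   # changed since last Pull
    "annotations/moderation/merges.json": "ccc333",  # new file
}
cached = {
    "annotations/users/ghid_1.json": "aaa111",
    "annotations/users/ghid_2.json": "bbb222",
    "annotations/users/ghid_3.json": "ddd444",   # no longer in the repo
}

print(plan_pull(remote, cached))
print(plan_pull(remote, {}))  # first Pull: empty cache, everything is fetched
```

With an empty cache every remote file is selected, which matches the note that the first Pull has to populate the cache.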
In the Cellucid sidebar:
- Use Remove downloaded files to clear the raw-file cache for the current cache scope (datasetId, owner/repo@branch, user.id), then Pull again.
If the currently loaded dataset id is not listed in annotations/config.json for the connected repo:
- Annotators are blocked (no Pull / no viewing annotations).
- Authors can connect anyway and Publish updated settings; this adds/updates the matching supportedDatasets[] entry in annotations/config.json.