BioVault uses SyftBox's Access Control List (ACL) system to protect sensitive biological data. This document explains how those permissions work and how BioVault tests them to keep data secure.
SyftBox uses a hierarchical, file-based permission system with YAML configuration files (syft.pub.yaml) placed in directories throughout the datasite structure. Each file controls access to its directory and subdirectories.
- Owner-Based Access Control: Datasite owners have implicit full access to their files
- Hierarchical Rules: Permissions cascade from parent to child directories
- Terminal Nodes: Can prevent subdirectories from overriding parent permissions
- Template-Based Patterns: Dynamic access control based on user identity
- Granular Permissions: Separate controls for read, write, and admin operations
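To make the hierarchical lookup concrete, here is a minimal Python sketch that lists the syft.pub.yaml locations that could govern a file, from the file's own directory up to the datasite root. The function name candidate_acl_files, the example paths, and the nearest-first ordering are illustrative assumptions, not SyftBox's actual resolution code, and terminal nodes are not modelled.

```python
from pathlib import PurePosixPath

ACL_FILENAME = "syft.pub.yaml"


def candidate_acl_files(file_path: str, datasite_root: str) -> list[str]:
    """Return possible ACL file paths, nearest directory first.

    Hypothetical helper: walks from the file's directory up to the
    datasite root and records where a permission file could live.
    """
    root = PurePosixPath(datasite_root)
    current = PurePosixPath(file_path).parent
    candidates = []
    while True:
        candidates.append(str(current / ACL_FILENAME))
        # Stop at the datasite root, or at the top of the path if the
        # file is not under the root at all.
        if current == root or current == current.parent:
            break
        current = current.parent
    return candidates


paths = candidate_acl_files(
    "datasites/alice@example.com/private/bob@example.com/secret.txt",
    "datasites/alice@example.com",
)
```

A resolver built on this list would read the first file that exists and, if that file is marked as a terminal node, stop there instead of consulting deeper directories.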
BioVault includes comprehensive permission tests to ensure data security. Here's how the system protects data through various scenarios:
Client1 (@syftbox.net) has created several folders in their datasite to test different sharing scenarios:
- private/ - Personal space with user-specific folders
- public/ - Content everyone can read
- shared/ - Content shared with specific collaborators
- dynamic/ - Folder for testing permission changes
The system tests against three clients:
- [email protected]: Primary data owner
- [email protected]: Trusted collaborator
- [email protected]: Potential bad actor (used for security testing)
Goal: Each user can only access files in their own email-named subfolder
Permission file: datasites/[email protected]/private/syft.pub.yaml
rules:
  - pattern: '{{.UserEmail}}/*'
    access:
      admin: []
      read:
        - 'USER'
      write:
        - 'USER'
File structure:
datasites/[email protected]/private/
├── syft.pub.yaml
└── [email protected]/
    └── secret.txt
Security guarantee:
- ✅ Client1 accessing private/[email protected]/* → Allowed (template matches their email)
- ❌ Client2 accessing private/[email protected]/* → Denied (template doesn't match)
- ❌ Bad actor accessing private/[email protected]/* → Denied (template doesn't match)
The {{.UserEmail}} template ensures user-specific isolation - each user can only access a folder matching their email address.
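The isolation rule can be illustrated with a small Python sketch. It substitutes the requester's email into the {{.UserEmail}}/* pattern and checks whether a path (relative to private/) is a file directly inside that folder. The helper names and the example.com addresses are hypothetical, and the real SyftBox matcher may behave differently in edge cases.

```python
TEMPLATE = "{{.UserEmail}}"


def resolve_pattern(pattern: str, requester_email: str) -> str:
    """Replace the template token with the requesting user's email."""
    return pattern.replace(TEMPLATE, requester_email)


def in_own_folder(requester_email: str, relative_path: str) -> bool:
    """True if relative_path is a file directly inside the requester's folder."""
    resolved = resolve_pattern(TEMPLATE + "/*", requester_email)
    folder = resolved[:-1]  # drop the trailing '*', keeping "email/"
    if not relative_path.startswith(folder):
        return False
    remainder = relative_path[len(folder):]
    # '*' is treated as a single path segment: non-empty, no further '/'.
    return remainder != "" and "/" not in remainder


owner_ok = in_own_folder("client1@example.com", "client1@example.com/secret.txt")
other_ok = in_own_folder("client2@example.com", "client1@example.com/secret.txt")
```

Because the pattern is resolved per request, the same rule yields a different allowed folder for every user, which is why Client2 and the bad actor never match Client1's folder.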
Goal: Everyone can read files, but only the owner can modify them
Permission file: datasites/[email protected]/public/syft.pub.yaml
rules:
  - pattern: '**'
    access:
      admin: []
      read:
        - '*'
      write: []
Security guarantee:
- ✅ Everyone (client1, client2, bad actor) can read files
- ✅ Client1 (owner) can write new files
- ❌ Client2 and bad actor cannot write or modify files
This pattern is ideal for sharing public datasets or results while maintaining control.
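A minimal Python sketch of the decision this rule implies: the owner is always allowed, and anyone else is allowed only if the relevant list contains their email or the * wildcard. The Access dataclass and is_allowed function are illustrative names, and treating admin as implying read and write is an assumption drawn from the "full control" wording, not confirmed SyftBox behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Access:
    """In-memory stand-in for the 'access' block of one rule."""
    admin: list[str] = field(default_factory=list)
    read: list[str] = field(default_factory=list)
    write: list[str] = field(default_factory=list)


def _listed(user: str, entries: list[str]) -> bool:
    return "*" in entries or user in entries


def is_allowed(requester: str, owner: str, access: Access, operation: str) -> bool:
    """Decide 'read', 'write' or 'admin' for a requester."""
    if requester == owner:
        return True  # implicit full access for the datasite owner
    if _listed(requester, access.admin):
        return True  # assumption: admin implies every other operation
    return _listed(requester, getattr(access, operation))


public = Access(read=["*"])
everyone_reads = is_allowed("client2@example.com", "client1@example.com", public, "read")
others_write = is_allowed("client2@example.com", "client1@example.com", public, "write")
owner_writes = is_allowed("client1@example.com", "client1@example.com", public, "write")
```

The same function covers a rule that shares with named collaborators: replacing the read list with a single email address admits only that user.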
Goal: Share with specific trusted users only
Permission file: datasites/[email protected]/shared/syft.pub.yaml
rules:
  - pattern: '**'
    access:
      admin: []
      read:
        - '[email protected]'
      write: []
Security guarantee:
- ✅ Client1 (owner) has full access (implicit)
- ✅ Client2 can read files
- ❌ Client2 cannot modify files
- ❌ Bad actor cannot read or write
This enables controlled collaboration with specific team members.
Goal: Test that permission changes propagate correctly
Initial state (private):
rules:
  - pattern: '{{.UserEmail}}/*'
    access:
      admin: []
      read:
        - 'USER'
      write:
        - 'USER'
Updated state (public):
rules:
  - pattern: '**/*'
    access:
      admin: []
      read:
        - '*'
      write: []
Security guarantee:
- Permission changes propagate immediately
- Access control is enforced in real-time
- No caching issues that could lead to unauthorized access
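One way to picture the "no stale cache" guarantee is a checker that fetches the current rule on every request instead of keeping a copy. The Python sketch below is only a model of that idea; LiveAcl, the loader callback, and the example.com addresses are hypothetical, and the source does not describe how SyftBox actually propagates changes.

```python
from typing import Callable


class LiveAcl:
    """Evaluates read access against rules fetched fresh on every check."""

    def __init__(self, load_readers: Callable[[], list[str]]) -> None:
        # Hypothetical loader: returns the rule's current 'read' list.
        self._load_readers = load_readers

    def can_read(self, requester: str) -> bool:
        readers = self._load_readers()  # no copy is kept between calls
        return "*" in readers or requester in readers


# The rule starts restrictive and is then opened to everyone.
state = {"read": ["client1@example.com"]}
acl = LiveAcl(lambda: state["read"])
before = acl.can_read("stranger@example.com")
state["read"] = ["*"]
after = acl.can_read("stranger@example.com")
```

The trade-off is one rule lookup per request; a cached design would need explicit invalidation whenever a permission file changes to offer the same guarantee.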
The SyftBox server requires specific YAML formatting with proper indentation:
rules:
  - pattern: '**' # Note the 2-space indentation before the dash
    access:
      admin: [] # Required field, even if empty
      read:
        - '*' # List items with proper indentation
      write: [] # Empty list means owner-only write
Critical formatting rules:
- Use 2 spaces before the - in rules array
- Always include the admin: [] field
- Use ** for current directory and subdirectories
- Use * for current directory only
- Empty arrays [] mean no explicit permissions (owner still has implicit access)
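The structural rules above can be checked before a permission file is deployed. The following Python sketch validates an already-parsed document (a plain dict, as any YAML parser would produce) and reports problems as strings. The function name validate_rules and the exact messages are illustrative; only the admin requirement comes from the text, while requiring read and write to be lists when present is an assumption.

```python
def validate_rules(document: dict) -> list[str]:
    """Return a list of human-readable problems; empty means it looks valid."""
    rules = document.get("rules")
    if not isinstance(rules, list) or not rules:
        return ["'rules' must be a non-empty list"]

    problems = []
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule {index}: must be a mapping")
            continue
        if not isinstance(rule.get("pattern"), str):
            problems.append(f"rule {index}: missing 'pattern'")
        access = rule.get("access")
        if not isinstance(access, dict):
            problems.append(f"rule {index}: missing 'access'")
            continue
        # 'admin' is required even when empty.
        if not isinstance(access.get("admin"), list):
            problems.append(f"rule {index}: 'admin' must be a list (use [] if empty)")
        # Assumption: 'read' and 'write' are optional but must be lists if given.
        for name in ("read", "write"):
            if name in access and not isinstance(access[name], list):
                problems.append(f"rule {index}: '{name}' must be a list")
    return problems


good = {"rules": [{"pattern": "**", "access": {"admin": [], "read": ["*"], "write": []}}]}
bad = {"rules": [{"pattern": "**", "access": {"read": ["*"], "write": []}}]}
```

Indentation errors themselves surface earlier, when the YAML text is parsed; this check only covers the shape of the result.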
- Datasite owners always have implicit full access to all their files
- No need to explicitly list the owner in permission rules
- Owner is determined from the first path segment (e.g., alice in /alice/data/)
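As a small illustration of the last point, this Python sketch extracts the owner from a path by taking its first non-empty segment. The names owner_of and is_owner are hypothetical; the real server may normalize paths differently.

```python
def owner_of(path: str) -> str:
    """Return the first non-empty path segment, treated as the datasite owner."""
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError("path has no segments")
    return segments[0]


def is_owner(requester: str, path: str) -> bool:
    """True if the requester owns the datasite that contains the path."""
    return owner_of(path) == requester


example_owner = owner_of("/alice/data/")
```

An access check would typically call is_owner first and skip rule evaluation entirely when it returns True, which is why owners never need to appear in the permission lists.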
- Read: View file contents and metadata
- Write: Create, modify, or delete files
- Admin: Full control including modifying ACL files
User tokens in access lists:
- *: Wildcard representing all users (public access)
- USER: Dynamic token that resolves to the requesting user's email
- {{.UserEmail}}: Template that creates user-specific paths
Path patterns:
- *: Matches files in current directory only
- **: Matches files in current directory and all subdirectories
- **/*: Matches files in all subdirectories (not current directory)
- {{.UserEmail}}/*: Template pattern for user-specific folders
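The path-pattern semantics listed above can be expressed as a translation to regular expressions. This Python sketch follows the descriptions in this document literally; it is not SyftBox's matcher, and the function names are illustrative. Note that many glob libraries let a leading **/ match zero directories, which would make **/* match top-level files too, so the behavior is worth verifying against the real server.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a path pattern into a regex, per this document's semantics."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append(r"(?:[^/]+/)+")  # one or more directory levels
            i += 3
        elif pattern.startswith("**", i):
            parts.append(r".+")  # anything, including '/'
            i += 2
        elif pattern[i] == "*":
            parts.append(r"[^/]+")  # a single path segment
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, relative_path: str) -> bool:
    """True if the whole relative path matches the pattern."""
    return pattern_to_regex(pattern).fullmatch(relative_path) is not None


top_level_only = (matches("*", "a.txt"), matches("*", "sub/a.txt"))
```

Template patterns can reuse the same matcher once {{.UserEmail}} has been replaced with the requester's email.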
BioVault will use SyftBox permissions to:
- Keep raw genomic data private by default
- Share analysis results selectively with collaborators
- Publish aggregate statistics publicly while protecting individual data
- Enable time-limited access for specific research projects
- All access attempts are logged
- Permission changes are tracked
- Terminal nodes prevent accidental permission escalation
- Regular permission audits can be automated
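As a rough idea of what an automated audit could look like, the Python sketch below walks a directory tree and flags every syft.pub.yaml that contains a bare * list entry, meaning some operation is open to all users. It is a line-based text scan rather than a YAML parse, and the function name find_wildcard_acls is hypothetical; it is not a BioVault or SyftBox tool.

```python
import os
import re

ACL_FILENAME = "syft.pub.yaml"

# Matches list entries such as "- '*'" or "- *", with an optional comment.
# A rule line like "- pattern: '**'" does not match.
WILDCARD_ENTRY = re.compile(r"""^\s*-\s*['"]?\*['"]?\s*(#.*)?$""")


def find_wildcard_acls(root: str) -> list[str]:
    """Return sorted paths of permission files that grant access to everyone."""
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if ACL_FILENAME not in filenames:
            continue
        path = os.path.join(dirpath, ACL_FILENAME)
        with open(path, encoding="utf-8") as handle:
            if any(WILDCARD_ENTRY.match(line) for line in handle):
                flagged.append(path)
    return sorted(flagged)
```

Running this on a schedule and comparing the result with an approved list of public directories would surface accidental wildcard grants; a production audit should parse the YAML properly instead of matching lines.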
- Start Private: All data should be private by default
- Use Terminal Nodes: For sensitive genomic data directories
- Explicit Sharing: Only share with specific email addresses, avoid wildcards
- Regular Audits: Review permission files periodically
- Test Permissions: Always verify access controls before sharing sensitive data
- Template Isolation: Use {{.UserEmail}} patterns for user-specific workspaces
BioVault includes automated security tests that verify:
- User isolation works correctly
- Public sharing is read-only
- Specific user permissions are enforced
- Bad actors cannot access restricted data
- Permission changes propagate correctly
These tests run nightly in CI to ensure the security model remains intact.
If you discover a security vulnerability in BioVault or its use of SyftBox permissions, please report it to:
- Email: [email protected] (placeholder)
- Do not disclose security issues publicly until they have been addressed