Thank you for your interest in contributing to OmniTranscripts! This guide will help you get started with contributing to this YouTube video transcription API.
# Fork the repository on GitHub, then clone your fork
git clone https://github.com/YOUR_USERNAME/OmniTranscripts.git
cd OmniTranscripts
# Add upstream remote
git remote add upstream https://github.com/wilmoore/OmniTranscripts.git# Install dependencies (see docs/development.md for details)
make setup
# Copy environment configuration
cp .env.example .env
# Install Go dependencies
go mod tidy
# Run tests to ensure everything works
make test-short# Sync with upstream
git fetch upstream
git checkout main
git merge upstream/main
# Create feature branch
git checkout -b feature/your-feature-nameHelp us identify and fix issues:
- Search existing issues first to avoid duplicates
- Use the bug report template when creating new issues
- Include detailed information:
- Go version and OS
- Steps to reproduce
- Expected vs actual behavior
- Relevant log outputs
- Minimal reproduction case
Example Bug Report:
## Bug Description
Transcription fails for videos longer than 10 minutes with "context deadline exceeded" error.
## Steps to Reproduce
1. Submit video URL: https://www.youtube.com/watch?v=LONG_VIDEO_ID
2. Wait for processing
3. Check job status
## Expected Behavior
Video should be transcribed successfully
## Actual Behavior
Job fails with timeout error after 5 minutes
## Environment
- Go version: 1.23.0
- OS: Ubuntu 22.04
- OmniTranscripts version: main branchSuggest new functionality:
- Check the roadmap and existing feature requests
- Describe the use case and problem it solves
- Provide implementation suggestions if you have ideas
- Consider backwards compatibility
Example Feature Request:
## Feature Description
Add support for batch processing multiple videos in a single API call
## Use Case
Users often need to transcribe multiple related videos (e.g., a video series) and would benefit from submitting them all at once.
## Proposed API
```json
{
"urls": [
"https://www.youtube.com/watch?v=video1",
"https://www.youtube.com/watch?v=video2"
],
"batch_options": {
"priority": "normal",
"webhook_url": "https://example.com/webhook"
}
}- Add new
/transcribe/batchendpoint - Return batch job ID for tracking all videos
- Support batch-level webhooks
### 🔧 Code Contributions
#### Areas Needing Help
- **Performance optimizations**: Improve transcription speed
- **Error handling**: Better error messages and recovery
- **Testing**: Increase test coverage
- **Documentation**: API docs, guides, examples
- **Integrations**: SDK libraries for different languages
- **Monitoring**: Metrics and observability features
#### Before You Start
1. **Discuss large changes** in an issue first
2. **Follow the coding standards** (see below)
3. **Write tests** for new functionality
4. **Update documentation** as needed
## Coding Standards
### Go Code Style
#### General Guidelines
```go
// Good: Clear, descriptive names
func processTranscriptionJob(job *models.Job) error {
return nil
}
// Bad: Unclear abbreviations
func procTxJob(j *models.Job) error {
return nil
}
// Good: Descriptive error with context
if err := validateURL(req.URL); err != nil {
return fmt.Errorf("invalid YouTube URL %q: %w", req.URL, err)
}
// Good: Structured error response
return &errs.Error{
Code: errs.InvalidArgument,
Message: "URL validation failed",
Details: map[string]interface{}{
"url": req.URL,
"reason": "not a valid YouTube URL",
},
}// ProcessTranscription orchestrates the complete video transcription pipeline.
// It downloads the video, extracts audio, and generates timestamped transcripts.
//
// The function handles both short videos (processed synchronously) and long videos
// (processed asynchronously via job queue).
//
// Parameters:
// - url: YouTube video URL to transcribe
// - jobID: Unique identifier for tracking processing progress
//
// Returns:
// - transcript: Complete text transcription
// - segments: Timestamped transcript segments
// - error: Processing error or nil on success
func ProcessTranscription(url, jobID string) (string, []models.Segment, error) {
// Implementation...
}func TestProcessTranscription(t *testing.T) {
tests := []struct {
name string
url string
jobID string
wantErr bool
wantSegments int
}{
{
name: "valid short video",
url: "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
jobID: "test_job_123",
wantErr: false,
wantSegments: 10, // Approximate expected segments
},
{
name: "invalid URL",
url: "not-a-url",
jobID: "test_job_456",
wantErr: true,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
transcript, segments, err := ProcessTranscription(tt.url, tt.jobID)
if (err != nil) != tt.wantErr {
t.Errorf("ProcessTranscription() error = %v, wantErr %v", err, tt.wantErr)
return
}
if !tt.wantErr {
assert.NotEmpty(t, transcript)
assert.Len(t, segments, tt.wantSegments)
}
})
}
}// Good: Clear field names and validation tags
type TranscribeRequest struct {
URL string `json:"url" validate:"required,url"`
Options *TranscribeOptions `json:"options,omitempty"`
}
type TranscribeOptions struct {
Model string `json:"model,omitempty" validate:"oneof=tiny small base large"`
Language string `json:"language,omitempty" validate:"len=2"`
}
// Good: Consistent response structure
type TranscribeResponse struct {
JobID string `json:"job_id,omitempty"`
Transcript string `json:"transcript,omitempty"`
Segments []models.Segment `json:"segments,omitempty"`
Status string `json:"status"`
CreatedAt time.Time `json:"created_at"`
}// Consistent error format
type ErrorResponse struct {
Error string `json:"error"`
Code string `json:"code,omitempty"`
Details map[string]interface{} `json:"details,omitempty"`
}# Make your changes
vim lib/transcription.go
# Run tests frequently
make test-short
# Check code quality
make lint
make fmt# Run all tests
make test
# Run specific tests
go test -v ./lib/ -run TestProcessTranscription
# Test with coverage
make test-coverage
# Performance tests (if relevant)
make benchmark- Update relevant
.mdfiles indocs/ - Add inline code comments for new functions
- Update API documentation if endpoints change
- Add examples for new features
<type>(<scope>): <description>
<body>
<footer>
feat: New featurefix: Bug fixdocs: Documentation changesstyle: Code style changes (formatting, etc.)refactor: Code refactoringtest: Adding or updating testschore: Maintenance tasks
# Good commit messages
git commit -m "feat(api): add batch transcription endpoint
Support processing multiple videos in a single request.
Includes new /transcribe/batch endpoint with job tracking.
Closes #123"
git commit -m "fix(transcription): handle audio extraction timeout
Increase FFmpeg timeout for long videos and add retry logic.
Fixes issue where 20+ minute videos would fail consistently.
Fixes #456"
git commit -m "docs(api): update endpoint documentation
Add examples for new batch processing endpoint and
clarify error response formats."# Sync with upstream
git fetch upstream
git rebase upstream/main
# Final check
make check # Runs fmt, lint, vet, and test-short
# Push to your fork
git push origin feature/your-feature-nameWhen creating a PR, include:
- Description: Clear explanation of changes
- Motivation: Why is this change needed?
- Testing: How was this tested?
- Breaking Changes: Any backwards compatibility issues?
- Checklist: Confirm you've completed all requirements
Example PR Description:
## Description
Adds support for batch processing multiple YouTube videos in a single API request.
## Motivation
Users frequently need to transcribe multiple related videos and currently have to make separate API calls for each one. This feature allows them to submit multiple URLs at once and track progress via a single batch job ID.
## Changes
- Added `/transcribe/batch` endpoint
- New `BatchTranscribeRequest` and `BatchTranscribeResponse` models
- Batch job tracking and status aggregation
- Updated documentation and examples
## Testing
- Unit tests for batch processing logic
- Integration tests with real YouTube videos
- Load testing with 10+ videos per batch
- Manual testing via curl and Postman
## Breaking Changes
None - this is a new endpoint with no changes to existing APIs.
## Checklist
- [x] Tests pass locally
- [x] Code follows style guidelines
- [x] Documentation updated
- [x] No breaking changes
- [x] Performance impact assessed- Automated checks: CI will run tests and linting
- Code review: Maintainers will review your code
- Feedback: You may need to make changes
- Approval: Once approved, your PR will be merged
- Be responsive: Reply to comments promptly
- Ask questions: If feedback isn't clear, ask for clarification
- Make requested changes: Address all feedback before requesting re-review
- Update tests: If functionality changes, update tests accordingly
We follow the Contributor Covenant to ensure a welcoming environment for all contributors.
- Be respectful: Treat all community members with respect
- Be constructive: Provide helpful feedback and suggestions
- Be patient: Maintainers are volunteers with limited time
- Stay on topic: Keep discussions focused on the project
- Documentation: Check the docs first
- GitHub Discussions: Ask questions and share ideas
- Issues: Report bugs and request features
- Discord (if available): Real-time chat with other contributors
All contributors are recognized in:
- Repository contributors page
- Release notes for significant contributions
- Special mentions for major features or bug fixes
Regular contributors who demonstrate:
- Consistent high-quality contributions
- Good understanding of the codebase
- Helpful community participation
- Commitment to project goals
May be invited to become maintainers with additional responsibilities.
- Development Guide - Development setup and workflows
- Architecture - Technical architecture details
- API Documentation - Complete API reference
- Deployment Guide - Production deployment options
- Troubleshooting - Common issues and solutions
Thank you for contributing to OmniTranscripts! Your contributions help make video transcription more accessible and powerful for everyone.