A comprehensive Flask-based API for assessing IATI (International Aid Transparency Initiative) data quality across programmes and projects. This API validates IATI activities against defined quality standards including attribute completeness, sector/location percentages, and document publication requirements. The response also includes available segmentations for a client to easily display available filters.
- Activity Validation: Validates H1 (programmes) and H2 (projects) activities against multiple criteria
- Attribute Checks: Title, description, dates, sectors, locations, and participating organizations
- Document Publication: Validates Business Case, Logical Framework, and Annual Review publications
- Segmentation: Filter by countries, regions, and sectors; the response includes available_segmentations, the unique codes present in the data, ready for use as filter values
- Redis Caching: 24-hour cache with daily refresh
- Docker Compose: Easy deployment with Redis and Flask API
- Comprehensive Tests: Full pytest suite with edge case coverage
See DIAGRAMS.md for system architecture, request pipeline, Solr query construction, validation flow, data models, and cache strategy.
- Python 3.11+
- UV (package manager)
- Docker and Docker Compose
- Solr instance with IATI data
git clone <repository-url>
cd iati-dqa-api
The repo includes a committed uv.lock, so all dependency versions are pinned for reproducible installs.
cp .env.example .env
# Edit .env with your Solr URL and other settings
docker-compose up -d
The API will be available at http://localhost:5000
# Install all dependencies from uv.lock and install pre-commit hooks
make dev
# Run Redis
docker-compose up redis -d
# Run Flask app
python -m flask --app app.main:app run --debug
GET /dqa/health
Response:
{
"status": "healthy",
"redis": "connected",
"timestamp": "2024-02-12T10:30:00"
}
POST /dqa
Content-Type: application/json
{
"organisation": "GB-GOV-1",
"segmentation": {
"countries": ["AF", "BD"],
"regions": ["298"],
"sectors": ["151", "15170"]
},
"require_funding_and_accountable": false,
"include_exemptions": true,
"skip_cache": false,
"failed_activities": true
}
Response:
{
"summary": {
"organisation": "GB-GOV-1",
"total_programmes": 45,
"total_projects": 230,
"total_budget": 125000000.0,
"financial_year": "2024-2025"
},
"failed_activities": [
{
"iati_identifier": "GB-GOV-1-12345",
"hierarchy": 1,
"title": "Short",
"activity_status": "2",
"attributes": [
{
"attribute": "title",
"status": "fail",
"message": "Title is too short (5 characters, minimum 10 required)",
"details": {"length": 5}
}
],
"documents": [],
"overall_status": "fail",
"failure_count": 1
}
],
"pass_count": 270,
"fail_count": 5,
"not_applicable_count": 150,
"generated_at": "2024-02-12T10:30:00",
"percentages": {
"title_percentage": 45
},
"available_segmentations": {
"countries": ["AF", "BD", "KE"],
"regions": ["298"],
"sectors": ["15110", "15170", "72010"]
}
}
POST /dqa/cache/clear
Content-Type: application/json
{ "pattern": "dqa:*" }
Response:
{
"cleared": 42,
"pattern": "dqa:*"
}
The data/ directory holds JSON arrays used by the validator (default dates, document exemptions, etc.). These endpoints let you inspect and edit them at runtime without restarting the service.
List available configs
GET /dqa/config
{ "configs": ["default_dates", "document_validation_exemptions", "non_acronyms"] }
Get all values in a config
GET /dqa/config/<config_name>
{ "config_name": "default_dates", "values": ["1900-01-01", "1970-01-01"] }
Edit a config - action is one of add, remove, or update:
PATCH /dqa/config/<config_name>
Content-Type: application/json
// Add a value
{ "action": "add", "value": "2000-01-01" }
// Remove a value
{ "action": "remove", "value": "1900-01-01" }
// Replace a value
{ "action": "update", "old_value": "1900-01-01", "new_value": "1901-06-01" }
Returns the full updated list on success. Error responses: 400 (bad request), 404 (config or value not found), 409 (value already exists).
Note: Changes to default_dates are applied to the running process immediately. All other edits are persisted to disk and picked up on the next request that reads the file.
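The add, remove, and update actions and their error codes can be pictured as a small pure function over the stored list. This is only a sketch of the documented behaviour, not the service's actual code: the function name apply_config_action is hypothetical, the 200 success status is an assumption (the text above only says the updated list is returned), and so is the choice to answer 409 when an update's new_value already exists.

```python
from typing import Any, Dict, List, Tuple


def apply_config_action(values: List[str], body: Dict[str, Any]) -> Tuple[int, List[str]]:
    """Return (http_status, resulting_values) without mutating `values`.

    Hypothetical helper: mirrors the documented 400/404/409 responses.
    """
    action = body.get("action")

    if action == "add":
        value = body.get("value")
        if not isinstance(value, str):
            return 400, values  # missing or malformed value
        if value in values:
            return 409, values  # value already exists
        return 200, values + [value]

    if action == "remove":
        value = body.get("value")
        if not isinstance(value, str):
            return 400, values
        if value not in values:
            return 404, values  # value not found
        return 200, [v for v in values if v != value]

    if action == "update":
        old, new = body.get("old_value"), body.get("new_value")
        if not isinstance(old, str) or not isinstance(new, str):
            return 400, values
        if old not in values:
            return 404, values
        if new in values:
            return 409, values  # assumption: duplicates are rejected on update too
        return 200, [new if v == old else v for v in values]

    return 400, values  # unknown or missing action


if __name__ == "__main__":
    dates = ["1900-01-01", "1970-01-01"]
    print(apply_config_action(dates, {"action": "add", "value": "2000-01-01"}))
```

Keeping the function free of file and HTTP concerns makes each status code easy to exercise in a unit test; persisting the returned list to data/ would be a separate step.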
See VALIDATOR_RULES.md for the full per-attribute and per-document validation logic, including conditions, statuses, messages, and percentage calculations.
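As an illustration of what a single attribute check might look like, the sketch below reproduces the title failure shown in the example POST /dqa response. It is not taken from the codebase: the function name check_title is hypothetical, the 10-character minimum is inferred from that example message only, and the whitespace stripping, the "pass" status string, and the empty pass message are assumptions. VALIDATOR_RULES.md remains the authoritative description.

```python
from typing import Any, Dict, Optional

# Inferred from the example failure message; the real threshold may differ.
MIN_TITLE_LENGTH = 10


def check_title(title: Optional[str], min_length: int = MIN_TITLE_LENGTH) -> Dict[str, Any]:
    """Return an attribute result shaped like the entries in `attributes`."""
    text = (title or "").strip()  # assumption: surrounding whitespace is ignored
    length = len(text)
    if length < min_length:
        return {
            "attribute": "title",
            "status": "fail",
            "message": (
                f"Title is too short ({length} characters, "
                f"minimum {min_length} required)"
            ),
            "details": {"length": length},
        }
    # Assumption: a passing check reports status "pass" with an empty message.
    return {
        "attribute": "title",
        "status": "pass",
        "message": "",
        "details": {"length": length},
    }


if __name__ == "__main__":
    print(check_title("Short"))
```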
pytest # run all tests with coverage
pytest -v # verbose output
See DEVELOPMENT.md for more test commands and guidance.
| Variable | Description | Default |
|---|---|---|
| SOLR_URL | Solr instance URL | http://localhost:8983/solr/activity |
| REDIS_URL | Redis connection URL | redis://localhost:6379/0 |
| CACHE_TTL | Cache TTL in seconds | 86400 (24 hours) |
| SECRET_KEY | API authentication key (Authorization header) | ZIMMERMAN |
| DEFAULT_DATES | Comma-separated default dates | 1900-01-01,1970-01-01 |
| BUSINESS_CASE_EXEMPTION_MONTHS | Months before BC required | 3 |
| LOGICAL_FRAMEWORK_EXEMPTION_MONTHS | Months before LF required | 3 |
| ANNUAL_REVIEW_EXEMPTION_MONTHS | Months before AR required | 19 |
| SECTOR_TOLERANCE | Sector percentage tolerance | 0.02 |
| LOCATION_TOLERANCE | Location percentage tolerance | 0.02 |
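For orientation, here is one way these variables could be read into a typed settings object with the documented defaults. It is a minimal sketch, not the project's actual configuration module: the Settings class and load_settings function are hypothetical names, and SECRET_KEY is left out deliberately so the example carries no credential.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    solr_url: str
    redis_url: str
    cache_ttl: int
    default_dates: Tuple[str, ...]
    business_case_exemption_months: int
    logical_framework_exemption_months: int
    annual_review_exemption_months: int
    sector_tolerance: float
    location_tolerance: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (os.environ by default), using the table's defaults."""
    env = os.environ if env is None else env
    raw_dates = env.get("DEFAULT_DATES", "1900-01-01,1970-01-01")
    return Settings(
        solr_url=env.get("SOLR_URL", "http://localhost:8983/solr/activity"),
        redis_url=env.get("REDIS_URL", "redis://localhost:6379/0"),
        cache_ttl=int(env.get("CACHE_TTL", "86400")),
        # Split the comma-separated list and drop empty entries.
        default_dates=tuple(d.strip() for d in raw_dates.split(",") if d.strip()),
        business_case_exemption_months=int(env.get("BUSINESS_CASE_EXEMPTION_MONTHS", "3")),
        logical_framework_exemption_months=int(env.get("LOGICAL_FRAMEWORK_EXEMPTION_MONTHS", "3")),
        annual_review_exemption_months=int(env.get("ANNUAL_REVIEW_EXEMPTION_MONTHS", "19")),
        sector_tolerance=float(env.get("SECTOR_TOLERANCE", "0.02")),
        location_tolerance=float(env.get("LOCATION_TOLERANCE", "0.02")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dict instead of os.environ, for example load_settings({"CACHE_TTL": "60"}), keeps the loader easy to test without touching the real environment.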
See DEVELOPMENT.md for the full annotated project structure.
This project was co-developed with AI to accelerate feature delivery. All code has been manually reviewed and tested for quality.
See DEVELOPMENT.md for setup, formatting, linting, adding new validations, debugging, and more.
- Monitoring: Implement logging and metrics
- Scaling: Use Redis Cluster for high availability
- Rate Limiting: Add rate limiting middleware
- HTTPS: Deploy behind reverse proxy with SSL
docker build -t iati-dqa-api:latest .
docker-compose -f docker-compose.yml up -d
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
For issues and questions: