Memory Management with Stash and Buf
Paral provides two mechanisms for sharing data between tasks: stash and buf. Use them to pass data from one task to the next, cache results, and coordinate workflows that span multiple tasks.
Stash: Task-Scoped Memory
The stash is a temporary memory store shared between a task and every task that depends on it, directly or through the dependency chain. Think of it as an ephemeral workspace that is cleaned up automatically once the chain finishes.
Key Characteristics of Stash
- Scoped: Available only to the current task and its dependency chain
- Temporary: Automatically freed when the task (and all dependent tasks) complete
- Secure: Data is isolated from tasks outside the chain and is freed afterwards, so nothing lingers in memory
- Sequential: Shared with tasks that depend on the current task
Basic Stash Usage
task process_data {
-> @stash["user_count"] << wc -l < users.csv
-> @stash["timestamp"] << date +%s
}
@depend process_data
task generate_report {
-> @printf("Processing %d users at %s", @stash('user_count'), @stash('timestamp')) > report.txt
-> @stash["report_path"] << echo "reports/$(date +%F).txt"
}
@depend generate_report
task upload_report {
-> aws s3 cp report.txt @sprintf("s3://mybucket/%s", @stash("report_path"))
-> @printf("Upload completed for %d users", @stash('user_count'))
}
In this example:
- process_data stores user count and timestamp in stash
- generate_report can access both values and adds its own
- upload_report has access to all stash data from the dependency chain
- Once upload_report completes, all stash data is automatically freed
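The lifetime rules above can be pictured with a small Python model. This is only a conceptual sketch, not Paral's implementation: the `StashChain` class, its methods, and the value 42 are hypothetical, and it assumes the chain runs strictly in dependency order.

```python
class StashChain:
    """Toy model of a stash shared along one dependency chain (not Paral's code)."""

    def __init__(self, task_names):
        self.pending = list(task_names)  # tasks that have not finished yet, in order
        self.stash = {}                  # values visible to every task in the chain

    def run(self, name, action):
        # A task may only run once everything it depends on has finished.
        if not self.pending or self.pending[0] != name:
            raise RuntimeError(f"{name} cannot run before its dependency finishes")
        result = action(self.stash)
        self.pending.pop(0)
        if not self.pending:
            self.stash.clear()           # last task in the chain is done: free everything
        return result


def process_data(stash):
    stash["user_count"] = 42             # hypothetical value


def generate_report(stash):
    stash["report_path"] = "reports/today.txt"
    return f"Processing {stash['user_count']} users"


def upload_report(stash):
    return sorted(stash)                 # every key stored earlier in the chain


chain = StashChain(["process_data", "generate_report", "upload_report"])
chain.run("process_data", process_data)
report_line = chain.run("generate_report", generate_report)
keys_seen_by_upload = chain.run("upload_report", upload_report)
stash_after_chain = dict(chain.stash)    # empty once the chain has completed
```

In this model the last task still sees both keys, and the stash is empty immediately after it returns, which mirrors the cleanup behavior described in the list.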
Advanced Stash Example: Data Pipeline
task extract_emails {
-> @stash["raw_emails"] << grep -oE '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' contacts.txt
-> @stash["email_count"] << echo "${@stash('raw_emails')}" | wc -l
}
@depend extract_emails
task validate_emails {
-> @stash["valid_emails"] << echo @sprintf(@stash('raw_emails')) | python3 scripts/validate_emails.py
-> @stash["validation_report"] << @sprintf("Validated %s emails", @stash('email_count'))
}
@depend validate_emails
task send_newsletters {
-> python3 scripts/send_bulk.py --emails=@sprintf(@stash('valid_emails'))
-> @printf("%s - Newsletter sent!", @stash('validation_report'))
}
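The pipeline pipes the raw addresses into scripts/validate_emails.py, which is not shown here. As a rough idea of what such a script might contain, the following sketch assumes one address per line on standard input and prints the unique, well-formed ones. The function names and the sample input are hypothetical, and the pattern is intentionally as simple as the grep expression in extract_emails.

```python
import io
import re
import sys

# Same deliberately simple shape as the grep pattern, anchored to a whole line.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def validate(lines):
    """Return unique, well-formed addresses in first-seen order, lowercased."""
    seen = set()
    valid = []
    for line in lines:
        address = line.strip().lower()
        if address and EMAIL_RE.match(address) and address not in seen:
            seen.add(address)
            valid.append(address)
    return valid


def main(stream=sys.stdin, out=sys.stdout):
    # Read candidate addresses from the stream and print the valid ones.
    for address in validate(stream):
        print(address, file=out)


# Demonstration on in-memory input; a real script would simply call main().
sample = io.StringIO("Ann@Example.com\nnot-an-email\nann@example.com\nbob@test.org\n")
captured = io.StringIO()
main(sample, captured)
```

Writing one address per line keeps the output easy to store in the stash and to pass on to the next task.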
Buf: Global Persistent Memory
The buf is a global memory store shared across all tasks in your Paral execution. Unlike stash, buf data persists throughout the entire workflow and can be accessed by any task.
Key Characteristics of Buf
- Global: Accessible from any task in the workflow
- Persistent: Data remains available until the entire Paral execution completes
- Shared: All tasks can read and write to the same buf space
- Coordination: Perfect for task synchronization and global state
Basic Buf Usage
task initialize {
-> @buf["config"] << cat config.json
-> @buf["start_time"] << date +%s
}
task process_frontend {
-> npm run build
-> @buf["frontend_status"] << echo "completed"
}
task process_backend {
-> go build ./cmd/server
-> @buf["backend_status"] << echo "completed"
}
@wait @isnot(@buf("frontend_status"), "") @isnot(@buf("backend_status"), "")
task deploy {
-> @printf("Both services ready. Config: %s", @buf('config'))
-> sh scripts/deploy.sh
-> @buf["deployment_time"] << date +%s
}
Task Coordination with Buf
Buf also enables task coordination through the @wait directive. A task annotated with @wait does not start until every listed condition on buf holds:
task download_dataset {
-> wget https://example.com/data.csv -O dataset.csv
-> @buf["dataset_ready"] << echo "true"
}
task setup_database {
-> docker run -d postgres:13
-> sleep 10
-> @buf["db_ready"] << echo "true"
}
# Wait for both prerequisites before proceeding
@wait @isnot(@buf("dataset_ready"), "") @isnot(@buf("db_ready"), "")
task import_data {
-> psql -c "\copy data FROM 'dataset.csv' CSV HEADER"
-> @buf["import_complete"] << date
}
@wait @isnot(@buf("import_complete"), "")
task run_analysis {
-> python3 analysis.py --data=imported
-> @printf("Analysis completed at %s", @buf('import_complete'))
}
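One way to reason about @wait is as a blocking check that is re-evaluated whenever buf changes. The Python sketch below models that idea with a condition variable. It is an illustration under assumptions, not a description of how Paral schedules tasks: the `Buf` class and its methods are hypothetical, and treating an unset key as an empty string is inferred from the `@isnot(@buf("..."), "")` idiom used above.

```python
import threading


class Buf:
    """Toy model of a global store with blocking waits (not Paral's code)."""

    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def set(self, key, value):
        with self._cond:
            self._data[key] = value
            self._cond.notify_all()          # wake waiters so they re-check

    def get(self, key):
        with self._cond:
            return self._data.get(key, "")   # assumption: unset keys read as ""

    def wait_until_set(self, *keys, timeout=5.0):
        # Block until every key holds a non-empty value, or the timeout expires.
        with self._cond:
            return self._cond.wait_for(
                lambda: all(self._data.get(k, "") != "" for k in keys),
                timeout=timeout,
            )


buf = Buf()
events = []


def import_data():
    # Plays the role of the task gated on dataset_ready and db_ready.
    if buf.wait_until_set("dataset_ready", "db_ready"):
        events.append("import_data")
        buf.set("import_complete", "done")


def producer(key):
    # Plays the role of download_dataset or setup_database.
    events.append(key)
    buf.set(key, "true")


threads = [
    threading.Thread(target=import_data),
    threading.Thread(target=producer, args=("dataset_ready",)),
    threading.Thread(target=producer, args=("db_ready",)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whatever order the threads start in, the gated step runs only after both producers have written their keys, which is the guarantee the @wait lines express.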
Real-World Example: CI/CD Pipeline
# Initialize global configuration
task setup {
-> @buf["build_id"] << git rev-parse --short HEAD
-> @buf["environment"] << echo @getenv("ENV", "staging")
-> @buf["artifact_bucket"] << echo @sprintf("builds-%s", @buf('environment'))
}
# Parallel testing tasks
task test_unit {
-> npm test
-> @buf["unit_tests"] << echo "passed"
}
task test_integration {
-> npm run test:integration
-> @buf["integration_tests"] << echo "passed"
}
task security_scan {
-> npm audit --audit-level=moderate
-> @buf["security_scan"] << echo "passed"
}
task build_app {
-> npm run build
-> tar -czf @sprintf("app-%s.tar.gz", @buf("build_id")) dist/
-> @buf["artifact_path"] << @sprintf("app-%s.tar.gz", @buf('build_id'))
}
# Deploy using shared configuration
@depend build_app
task deploy {
-> aws s3 cp @buf("artifact_path") @sprintf("s3://%s/", @buf("artifact_bucket"))
-> kubectl set image deployment/app @sprintf("app=myregistry/app:%s", @buf("build_id"))
-> @printf("Deployed %s to %s", @buf('build_id'), @buf('environment'))
}
Choosing Between Stash and Buf
Use Stash when:
- Data is only needed within a specific task dependency chain
- You want automatic memory cleanup for security
- Working with sensitive data that shouldn't persist
- Processing temporary intermediate results
Use Buf when:
- Multiple unrelated tasks need access to the same data
- Coordinating parallel tasks with shared state
- Caching expensive computations for reuse
- Maintaining global configuration throughout the workflow
Advanced Patterns
task analyze_logs {
-> @buf["analysis_start"] << date +%s
-> @stash["raw_logs"] << tail -n 1000 /var/log/app.log
-> @stash["error_count"] << echo @stash('raw_logs') | grep ERROR | wc -l
}
@depend analyze_logs
task generate_alert {
-> @if(@is(@stash("error_count"), "gt", 10)){
@buf["alert_needed"] << echo "true"
@buf["error_details"] << @sprintf("High error rate: %d errors", @stash('error_count'))
}
}
@wait @isnot(@buf("alert_needed"), "")
task send_alert {
-> slack-cli send @buf('error_details') --channel=#alerts
}
Dynamic Task Coordination
task check_services {
-> @buf["web_status"] << curl -s -o /dev/null -w "%{http_code}" http://web:8080/health
-> @buf["api_status"] << curl -s -o /dev/null -w "%{http_code}" http://api:3000/health
-> @buf["db_status"] << pg_isready -q -h database -p 5432 && echo "200" || echo "500"
}
@wait @is(@buf("web_status"), "200") @is(@buf("api_status"), "200") @is(@buf("db_status"), "200")
task run_e2e_tests {
-> npm run test:e2e
-> @buf["e2e_complete"] << date
}
Performance Tips
- Cache expensive operations in buf for reuse across multiple tasks
- Use stash for intermediate processing steps that don't need global access
- Leverage @wait conditions to prevent unnecessary polling or resource waste