Configuring Provenance Tracking
This guide covers how to configure provenance tracking in MetricEngine for different use cases, from development and debugging to production environments.
Quick Start
Enable/Disable Provenance
from metricengine.provenance_config import enable_provenance, disable_provenance
# Disable provenance globally
disable_provenance()
# Re-enable provenance
enable_provenance()
Performance Mode
For production environments where performance is critical:
from metricengine.provenance_config import set_performance_mode
# Configure for optimal performance
set_performance_mode()
Debug Mode
For development and debugging:
from metricengine.provenance_config import set_debug_mode
# Enable all features and detailed logging
set_debug_mode()
Configuration Options
Core Feature Toggles
from metricengine.provenance_config import update_global_config
# Control which types of operations are tracked
update_global_config(
    enabled=True,             # Master switch for all provenance
    track_literals=True,      # Track literal value creation
    track_operations=True,    # Track arithmetic operations
    track_calculations=True,  # Track engine calculations
)
Error Handling
import logging
# Configure error handling behavior
update_global_config(
    fail_on_error=False,        # Degrade gracefully on errors
    log_errors=True,            # Log provenance errors
    log_level=logging.WARNING,  # Log level for errors
)
Performance Controls
# Tune performance-related settings
update_global_config(
    max_history_depth=1000,     # Limit provenance chain depth
    enable_spans=True,          # Enable calculation spans
    enable_id_interning=True,   # Intern IDs to save memory
    max_hash_cache_size=10000,  # Cache size for hash operations
)
Memory Management
# Configure memory usage
update_global_config(
    enable_weak_refs=False,          # Use weak references in graphs
    max_graph_size=10000,            # Maximum nodes in provenance graph
    enable_history_truncation=True,  # Enable history truncation
)
Debugging Options
# Enable debugging features
update_global_config(
    debug_mode=True,            # Enable debug information
    include_stack_traces=True,  # Include stack traces in errors
)
Context-Specific Configuration
Use the provenance_config context manager for temporary configuration changes:
from metricengine.provenance_config import provenance_config
# Temporarily disable provenance for performance-critical code
with provenance_config(enabled=False):
    # Provenance tracking disabled in this block
    result = expensive_calculation()
# Provenance tracking restored outside the block
Multiple Overrides
# Override multiple settings temporarily
with provenance_config(
    track_literals=False,
    enable_spans=False,
    max_history_depth=100,
):
    # Reduced provenance tracking for this block
    results = batch_calculations()
Environment-Specific Configurations
Development Environment
import logging
from metricengine.provenance_config import ProvenanceConfig, set_global_config
# Full-featured configuration for development
dev_config = ProvenanceConfig(
    enabled=True,
    track_literals=True,
    track_operations=True,
    track_calculations=True,
    fail_on_error=False,
    log_errors=True,
    log_level=logging.DEBUG,
    debug_mode=True,
    include_stack_traces=True,
    max_history_depth=10000,
    enable_spans=True,
)
set_global_config(dev_config)
Production Environment
# Optimized configuration for production
prod_config = ProvenanceConfig(
    enabled=True,
    track_literals=False,      # Skip literals for performance
    track_operations=True,
    track_calculations=True,
    fail_on_error=False,       # Always degrade gracefully
    log_errors=True,
    log_level=logging.ERROR,   # Only log serious errors
    debug_mode=False,
    include_stack_traces=False,
    max_history_depth=500,     # Limit history depth
    enable_spans=False,        # Disable spans for performance
    enable_id_interning=True,  # Enable memory optimizations
    max_hash_cache_size=5000,
    enable_weak_refs=True,
    enable_history_truncation=True,
)
set_global_config(prod_config)
Testing Environment
# Configuration for automated testing
test_config = ProvenanceConfig(
    enabled=True,
    track_literals=True,
    track_operations=True,
    track_calculations=True,
    fail_on_error=True,  # Fail fast in tests
    log_errors=False,    # Reduce test noise
    debug_mode=False,
    max_history_depth=1000,
    enable_spans=True,
)
set_global_config(test_config)
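One way to choose between these configurations at startup is to key them by environment name. The sketch below is an assumption-laden illustration: the `APP_ENV` variable and the `settings_for` helper are hypothetical names, and the dictionaries hold only an abbreviated subset of the values shown above. The returned dictionary could then be unpacked into `ProvenanceConfig(**settings)` in the same way as the keyword arguments in the examples above.

```python
import logging
import os

# Abbreviated settings copied from the environment examples in this guide.
ENVIRONMENT_SETTINGS = {
    "development": {
        "fail_on_error": False,
        "log_level": logging.DEBUG,
        "debug_mode": True,
        "max_history_depth": 10000,
    },
    "testing": {
        "fail_on_error": True,
        "log_errors": False,
        "max_history_depth": 1000,
    },
    "production": {
        "track_literals": False,
        "fail_on_error": False,
        "log_level": logging.ERROR,
        "max_history_depth": 500,
        "enable_spans": False,
    },
}


def settings_for(env_name=None):
    """Return a copy of the keyword arguments for the named environment.

    Falls back to the APP_ENV environment variable (a hypothetical name),
    and then to "production" as the most conservative choice.
    """
    name = (env_name or os.environ.get("APP_ENV", "production")).lower()
    if name not in ENVIRONMENT_SETTINGS:
        raise ValueError(f"Unknown environment: {name!r}")
    return dict(ENVIRONMENT_SETTINGS[name])


settings = settings_for("testing")
```

Defaulting to the production settings means a missing or unset variable never turns on debug features by accident.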
Performance Tuning
High-Performance Applications
For applications where every microsecond counts:
# Minimal provenance tracking
update_global_config(
    track_literals=False,      # Skip literal tracking
    enable_spans=False,        # Disable span tracking
    max_history_depth=50,      # Very limited history
    enable_id_interning=True,  # Memory optimization
    max_hash_cache_size=1000,  # Smaller cache
    enable_weak_refs=True,     # Prevent memory leaks
)
Memory-Constrained Environments
For environments with limited memory:
# Memory-optimized configuration
update_global_config(
    max_history_depth=100,           # Limit history depth
    max_graph_size=1000,             # Limit graph size
    enable_id_interning=True,        # Intern IDs to save memory
    enable_weak_refs=True,           # Use weak references
    enable_history_truncation=True,  # Enable truncation
    max_hash_cache_size=500,         # Smaller cache
)
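As a rough illustration of why a depth limit bounds memory, the sketch below keeps only the most recent entries of a growing history. It shows the general idea of a bounded history using the standard library. It does not show how MetricEngine truncates provenance chains, which this guide does not specify, and `record_operations` is a hypothetical helper.

```python
from collections import deque


def record_operations(operations, max_history_depth):
    """Keep at most max_history_depth of the most recent operations."""
    history = deque(maxlen=max_history_depth)
    for op in operations:
        # Once the deque is full, appending drops the oldest entry.
        history.append(op)
    return list(history)


kept = record_operations([f"op{i}" for i in range(10)], max_history_depth=3)
```

Memory use stays proportional to the limit rather than to the number of operations performed, at the cost of losing the oldest history.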
Monitoring and Diagnostics
Check Configuration Status
from metricengine.provenance_config import get_config
# Get current configuration
config = get_config()
print(f"Provenance enabled: {config.enabled}")
print(f"Track operations: {config.track_operations}")
print(f"Max history depth: {config.max_history_depth}")
Check Provenance Availability
from metricengine.provenance_config import is_provenance_available
if is_provenance_available():
    print("Provenance tracking is available and functional")
else:
    print("Provenance tracking is disabled or unavailable")
Error Logging
Configure logging to monitor provenance issues:
import logging
# Set up logging for provenance errors
logging.getLogger('metricengine.provenance_config').setLevel(logging.WARNING)
# Enable detailed error logging
update_global_config(
    log_errors=True,
    log_level=logging.INFO,
    include_stack_traces=True,
)
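To check that logger-level filtering behaves as expected, you can attach an in-memory handler to the same logger name. The sketch below uses only the standard `logging` module. `ListHandler` is a hypothetical helper, and the two log calls are simulated records standing in for whatever the library would emit.

```python
import logging


class ListHandler(logging.Handler):
    """Collect log messages in memory, e.g. for tests or health checks."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(f"{record.levelname}: {record.getMessage()}")


logger = logging.getLogger("metricengine.provenance_config")
logger.setLevel(logging.WARNING)
handler = ListHandler()
logger.addHandler(handler)

# Simulated records for illustration only.
logger.info("provenance cache warmed")          # Below WARNING, dropped
logger.warning("provenance generation failed")  # Kept

captured = list(handler.messages)
logger.removeHandler(handler)
```

Note that the logger's own level filters records before any handler sees them, so a logger set to `WARNING` discards `INFO` records regardless of handler settings.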
Best Practices
1. Environment-Specific Configuration
Always configure provenance based on your environment:
Development: Enable all features for debugging
Testing: Enable strict error handling
Production: Optimize for performance and stability
2. Gradual Rollout
When enabling provenance in production:
Start with minimal tracking (track_calculations=True only)
Monitor performance impact
Gradually enable more features as needed
3. Memory Management
For long-running applications:
Enable history truncation
Set reasonable limits on graph size
Use weak references to prevent memory leaks
Monitor memory usage over time
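For the last point, the standard library's `tracemalloc` module is one way to measure how much memory a workload allocates. The sketch below is generic: `measure_peak_memory` and `build_history` are hypothetical names, and `build_history` is a stand-in for a calculation that accumulates provenance-like records.

```python
import tracemalloc


def measure_peak_memory(workload):
    """Run workload() and return (result, peak bytes traced while it ran)."""
    tracemalloc.start()
    try:
        result = workload()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_history(depth=10_000):
    # Stand-in workload: a list of small records, one per operation.
    return [("op", i) for i in range(depth)]


history, peak_bytes = measure_peak_memory(build_history)
print(f"Peak traced memory: {peak_bytes / 1024:.1f} KiB")
```

Running the same measurement periodically, or before and after a configuration change, gives a simple way to see whether limits such as the graph size are having an effect.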
4. Error Handling
Configure error handling appropriately:
Development: fail_on_error=False with detailed logging
Production: fail_on_error=False with minimal logging
Testing: fail_on_error=True to catch issues early
5. Performance Testing
Always benchmark your specific use case:
import time
from metricengine.provenance_config import provenance_config
# Benchmark with and without provenance
def benchmark_calculation():
    start = time.perf_counter()  # Monotonic, high-resolution timer
    complex_calculation()  # Your calculation here
    return time.perf_counter() - start
# Test with provenance
with_provenance = benchmark_calculation()
# Test without provenance
with provenance_config(enabled=False):
    without_provenance = benchmark_calculation()
overhead = (with_provenance - without_provenance) / without_provenance * 100
print(f"Provenance overhead: {overhead:.2f}%")
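A single timed run is noisy, so it can help to repeat the measurement and keep the best time. The following sketch uses the standard `timeit` module. `workload`, `best_time`, and `overhead_percent` are hypothetical names, and `workload` is a stand-in for your own calculation.

```python
import timeit


def workload():
    # Stand-in for your calculation.
    return sum(i * i for i in range(10_000))


def best_time(func, repeat=5, number=20):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


def overhead_percent(baseline, measured):
    """Relative overhead of measured versus baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (measured - baseline) / baseline * 100


baseline = best_time(workload)
print(f"Best per-call time: {baseline * 1e6:.1f} microseconds")
```

Taking the minimum rather than the mean reduces the influence of unrelated system activity. To compare configurations, call `best_time` once under each and pass the two results to `overhead_percent`.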
Troubleshooting
High Memory Usage
If provenance is using too much memory:
Reduce max_history_depth
Enable enable_history_truncation
Use enable_weak_refs=True
Reduce max_graph_size
Performance Issues
If provenance is impacting performance:
Disable literal tracking: track_literals=False
Disable spans: enable_spans=False
Reduce cache size: max_hash_cache_size=1000
Use performance mode: set_performance_mode()
Missing Provenance Data
If provenance data is missing:
Check if provenance is enabled: get_config().enabled
Verify specific tracking is enabled (literals, operations, calculations)
Check for error logs that might indicate failures
Ensure you’re not in a context with provenance disabled
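The first two checks can be automated with a small helper. This is a sketch under assumptions: `diagnose_missing_provenance` is a hypothetical function, and it assumes the configuration object exposes the tracking flags as attributes with the same names as the keyword arguments used in this guide. The stand-in object below takes the place of the value returned by `get_config()`.

```python
from types import SimpleNamespace


def diagnose_missing_provenance(config):
    """Return human-readable reasons why provenance data might be missing."""
    problems = []
    if not config.enabled:
        problems.append("provenance is disabled (enabled=False)")
    for flag in ("track_literals", "track_operations", "track_calculations"):
        # Treat a missing attribute as "on" so the check stays conservative.
        if not getattr(config, flag, True):
            problems.append(f"{flag} is off")
    return problems


# Stand-in object; with MetricEngine you would pass get_config() instead.
sample = SimpleNamespace(
    enabled=True,
    track_literals=False,
    track_operations=True,
    track_calculations=True,
)
findings = diagnose_missing_provenance(sample)
```

An empty list means the configuration itself is not the cause, which points toward the error logs or an enclosing context that disables provenance.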
Error Messages
Common error scenarios and solutions:
“Unknown configuration option”: Check spelling of configuration keys
“Provenance generation failed”: Enable error logging to see details
“Hash collision detected”: Extremely rare; contact support if this occurs
“Graph size limit exceeded”: Increase max_graph_size or enable truncation