Data Generators
Lariv relies heavily on programmatically generated synthetic data, both during local testing and in CI/CD pipelines.
The Architecture
Because Lariv's plugins are strictly decoupled, data generation scripts cannot simply be run in an arbitrary order. For example, the p_invoices plugin cannot generate mock invoices until the p_orders plugin has generated mock orders.
To solve this, Lariv uses a GeneratorRegistry that topologically sorts the registered generators by their declared dependencies.
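To make the ordering idea concrete, here is a minimal sketch of a topological sort over generator keys, using the standard library's graphlib module. It is not Lariv's implementation. The dependency map is hypothetical, and the assumption that orders_generator depends on users_generator is made only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: generator key -> keys it depends on.
# The orders -> users edge is an assumption made for this example.
dependencies = {
    "users_generator": [],
    "orders_generator": ["users_generator"],
    "invoices_generator": ["users_generator", "orders_generator"],
}


def resolve_order(deps):
    """Return the keys so that every dependency precedes its dependents."""
    return list(TopologicalSorter(deps).static_order())


order = resolve_order(dependencies)
print(order)
```

With this particular map only one valid order exists: users first, then orders, then invoices. When several orders are valid, a topological sort may return any of them.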
Writing a Generator
Every plugin that needs data generation implements a plain Python class decorated with GeneratorRegistry.register, which takes the string key under which the generator is registered.
    # plugins/p_invoices/generator.py
    from lariv.registry import GeneratorRegistry


    @GeneratorRegistry.register("invoices_generator")
    class InvoiceGenerator:
        # Explicitly define dependencies on other generators by their string keys
        dependencies = ["users_generator", "orders_generator"]

        def run(self):
            """The entrypoint method invoked by generate_data."""
            from .models import Invoice

            print("Generating mock invoices...")
            Invoice.objects.create(...)
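The following sketch shows one way a registry with a register decorator and a run_all method could fit together. SketchRegistry and its internals are hypothetical stand-ins written for illustration, not the actual lariv.registry code. The generators only record their names instead of touching a database, and UserGenerator and OrderGenerator are assumed names.

```python
from graphlib import TopologicalSorter


class SketchRegistry:
    """Illustrative stand-in for GeneratorRegistry (hypothetical internals)."""

    _generators = {}

    @classmethod
    def register(cls, key):
        # Decorator factory: store the class under its key, return it unchanged.
        def decorator(generator_cls):
            cls._generators[key] = generator_cls
            return generator_cls
        return decorator

    @classmethod
    def run_all(cls):
        # Build key -> dependencies; classes without the attribute have none.
        graph = {
            key: list(getattr(gen, "dependencies", []))
            for key, gen in cls._generators.items()
        }
        missing = {dep for deps in graph.values() for dep in deps} - set(graph)
        if missing:
            raise KeyError(f"Unregistered dependencies: {sorted(missing)}")
        for key in TopologicalSorter(graph).static_order():
            cls._generators[key]().run()


executed = []  # records the order in which generators actually ran


# Registered in "wrong" order on purpose: invoices first, users last.
@SketchRegistry.register("invoices_generator")
class InvoiceGenerator:
    dependencies = ["users_generator", "orders_generator"]

    def run(self):
        executed.append("invoices_generator")


@SketchRegistry.register("orders_generator")
class OrderGenerator:
    dependencies = ["users_generator"]  # assumed for illustration

    def run(self):
        executed.append("orders_generator")


@SketchRegistry.register("users_generator")
class UserGenerator:
    def run(self):
        executed.append("users_generator")


SketchRegistry.run_all()
print(executed)
```

In this sketch, registration order does not matter: the declared dependencies alone determine the execution order.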
Running Generators
The generate_data.py script (invoked via python manage.py generate_data) acts as the conductor: it delegates ordering and execution to the registry.
    # lariv/generate_data.py
    from lariv.registry import GeneratorRegistry


    class DataGenerator:
        def generate_all_data(self):
            print("\nStarting data generation...")
            # Automatically resolves the DAG dependencies and executes `.run()` in order
            GeneratorRegistry.run_all()
When generate_all_data runs, GeneratorRegistry.run_all() resolves the dependency graph and ensures users_generator and orders_generator complete successfully before attempting to execute InvoiceGenerator.run().
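The "complete successfully before" guarantee can be sketched as a fail-fast loop: if a dependency raises, nothing that comes after it in the order is run. Whether Lariv aborts the whole run or handles failures differently is not stated here, so the behavior below is an assumption. run_in_order and the make_* functions are hypothetical names.

```python
from graphlib import TopologicalSorter


def run_in_order(generators, dependencies):
    """Run callables in dependency order, stopping at the first failure."""
    completed = []
    for key in TopologicalSorter(dependencies).static_order():
        generators[key]()  # an exception here skips every later generator
        completed.append(key)
    return completed


ran = []  # keys of generators that finished successfully


def make_users():
    ran.append("users_generator")


def make_orders():
    raise RuntimeError("orders failed")  # simulated failure


def make_invoices():
    ran.append("invoices_generator")


generators = {
    "users_generator": make_users,
    "orders_generator": make_orders,
    "invoices_generator": make_invoices,
}
dependencies = {
    "users_generator": [],
    "orders_generator": ["users_generator"],  # assumed for illustration
    "invoices_generator": ["users_generator", "orders_generator"],
}

try:
    run_in_order(generators, dependencies)
except RuntimeError as exc:
    print(f"Aborted: {exc}")

print(ran)
```

Because the orders step fails, the invoices step is never attempted, so no invoices are created against missing orders.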