TALOS v0.5 -- Performance and Scaling Analysis¶
Date: April 2026
Scope: Director performance bottlenecks, background threading, dSGP4 evaluation, and scaling roadmap
Author: Engineering (automated review)
1. Current Performance Profile¶
The Director runs a single-threaded Python loop at 2 Hz (500ms tick interval). Each tick:
- Queries the database for active assignments.
- For each assigned (station, satellite) pair, propagates the satellite position using Skyfield SGP4.
- Computes azimuth, elevation, range, and Doppler correction.
- Publishes MQTT command messages to each station.
- Computes ground track polylines for dashboard visualization.
1.1 Measured Timings (10 stations, 5 campaigns)¶
| Operation | Time per tick | % of budget |
|---|---|---|
| DB query (active assignments) | ~5ms | 1% |
| SGP4 propagation (10 satellites) | ~15ms | 3% |
| Doppler computation | ~2ms | < 1% |
| MQTT publish (10 messages) | ~8ms | 2% |
| Ground track computation (48 extra SGP4 calls) | ~120ms | 24% |
| Total | ~150ms | 30% of 500ms budget |
At 10 stations, the Director uses 30% of its tick budget. Headroom exists.
1.2 Projected Bottlenecks¶
| Station count | Propagation time | Ground track time | Total | Verdict |
|---|---|---|---|---|
| 10 | 15ms | 120ms | 150ms | Comfortable |
| 25 | 40ms | 300ms | 360ms | Tight |
| 50 | 80ms | 600ms | 700ms | Exceeds 500ms budget |
| 100 | 160ms | 1,200ms | 1,400ms | 2.8x over budget |
The scaling wall hits at ~50 stations. Ground track computation is the dominant cost -- it recomputes the full orbit polyline for each satellite on every tick, even though orbits change on the scale of hours, not half-seconds.
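The projections above are a linear extrapolation from the 10-station measurements. A sketch of that model (the smaller DB/Doppler/MQTT costs are approximated here as a rough ~15ms constant, so the figures differ slightly from the table at higher tiers):

```python
# Linear cost model behind the projections: per-station costs derived
# from the 10-station measurements (~15ms propagation, ~120ms ground
# tracks). Fixed overhead (~15ms of DB/Doppler/MQTT) is treated as
# constant for simplicity.
def projected_tick_ms(stations: int, fixed_ms: float = 15.0) -> float:
    propagation = 1.5 * stations    # ~15ms / 10 stations measured
    ground_track = 12.0 * stations  # ~120ms / 10 stations measured
    return fixed_ms + propagation + ground_track

assert projected_tick_ms(10) == 150.0   # matches the measured baseline
assert projected_tick_ms(50) > 500.0    # exceeds the 500ms tick budget
```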
1.3 Wasted Computation¶
The ground track renderer calls SGP4 at 48 evenly spaced points around each satellite's orbit to draw the ground track line on the Leaflet map. For 10 distinct satellites, that is 480 SGP4 calls per tick (960 per second). These results are stable for minutes but are recomputed every 500ms.
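The arithmetic behind those figures, and behind the ~83 million daily calls cited in section 5.3:

```python
# Redundant-call arithmetic for the 10-satellite case:
# 48 polyline points per orbit, recomputed on every 500ms tick.
points_per_orbit = 48
satellites = 10
ticks_per_second = 2
seconds_per_day = 86_400

calls_per_tick = points_per_orbit * satellites          # 480
calls_per_second = calls_per_tick * ticks_per_second    # 960
calls_per_day = calls_per_second * seconds_per_day      # 82,944,000 (~83M)
```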
2. Background Threading Architecture¶
The primary mitigation is to move pass prediction and ground track computation off the 2 Hz real-time loop.
2.1 Design¶
Main Thread (2 Hz loop)
|
+-- Read cached pass predictions (dict lookup)
+-- Read cached ground tracks (dict lookup)
+-- Propagate current position only (1 SGP4 call per satellite)
+-- Compute Doppler
+-- Publish MQTT commands
|
Background Thread (10-second refresh)
|
+-- Predict passes for next 24 hours
+-- Compute ground track polylines
+-- Update shared cache (thread-safe swap)
2.2 Implementation¶
import logging
import threading
import time
from concurrent.futures import ThreadPoolExecutor
logger = logging.getLogger(__name__)
class BackgroundPredictor:
"""Runs pass prediction and ground track computation in a background thread."""
def __init__(self, director: Director, refresh_interval: float = 10.0):
self._director = director
self._refresh_interval = refresh_interval
self._executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix="predictor")
self._cache: dict[str, PredictionResult] = {}
self._lock = threading.Lock()
self._running = True
def start(self) -> None:
"""Start the background prediction loop."""
self._executor.submit(self._prediction_loop)
def _prediction_loop(self) -> None:
while self._running:
try:
results = self._compute_predictions()
with self._lock:
self._cache = results
except Exception:
logger.exception("Background prediction failed")
time.sleep(self._refresh_interval)
def _compute_predictions(self) -> dict[str, PredictionResult]:
"""Compute all pass predictions and ground tracks."""
results = {}
for sat_id, tle in self._director.tle_manager.active_tles.items():
# Pass prediction for next 24 hours
passes = predict_passes(tle, self._director.stations, hours=24)
# Ground track polyline (48 points)
ground_track = compute_ground_track(tle, num_points=48)
results[sat_id] = PredictionResult(passes=passes, ground_track=ground_track)
return results
def get_cached(self) -> dict[str, PredictionResult]:
"""Thread-safe read of cached predictions."""
with self._lock:
return self._cache.copy()
def stop(self) -> None:
self._running = False
self._executor.shutdown(wait=True)
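The thread safety of `get_cached()` rests on replacing the whole dict under the lock rather than mutating it in place, so readers never observe a partially updated cache. A standalone sketch of that swap pattern (`SwapCache` and the toy producer are illustrative, not TALOS classes):

```python
# Minimal demonstration of the whole-dict swap used by the background
# predictor: the producer builds a complete result set, then publishes
# it with a single reference assignment under the lock.
import threading

class SwapCache:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict[str, int] = {}

    def publish(self, fresh: dict[str, int]) -> None:
        with self._lock:
            self._data = fresh            # reference swap, not mutation

    def snapshot(self) -> dict[str, int]:
        with self._lock:
            return self._data.copy()      # stable copy for the caller

cache = SwapCache()

def producer() -> None:
    for generation in range(200):
        # build the new results fully before publishing
        cache.publish({f"sat-{i}": generation for i in range(10)})

t = threading.Thread(target=producer)
t.start()
while t.is_alive():
    snap = cache.snapshot()
    if snap:
        # every snapshot is internally consistent: one generation only
        assert len(set(snap.values())) == 1
t.join()
```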
2.3 Impact on Tick Budget¶
After background threading:
| Operation | Time per tick (50 stations) |
|---|---|
| DB query | ~10ms |
| SGP4 propagation (50 current positions) | ~80ms |
| Doppler computation | ~5ms |
| MQTT publish (50 messages) | ~40ms |
| Cache lookup (ground tracks + passes) | ~1ms |
| Total | ~136ms |
Tick time at 50 stations drops from ~700ms to ~136ms -- well within the 500ms budget.
3. PropagatorProtocol Interface¶
The v0.4 architecture introduced a PropagatorProtocol abstract interface. This enables swapping propagation backends without modifying the Director.
3.1 Protocol Definition¶
from typing import Protocol, runtime_checkable
from datetime import datetime
@runtime_checkable
class PropagatorProtocol(Protocol):
"""Abstract interface for satellite position propagation."""
def propagate(self, tle_line1: str, tle_line2: str,
epoch: datetime) -> tuple[float, float, float]:
"""Propagate a single satellite to a single epoch.
Returns: (latitude, longitude, altitude_km)
"""
...
def propagate_batch(self, tle_lines: list[tuple[str, str]],
epochs: list[datetime]) -> list[tuple[float, float, float]]:
"""Propagate multiple satellites to multiple epochs.
Returns: list of (latitude, longitude, altitude_km) per (tle, epoch) pair.
"""
...
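Because the protocol is runtime_checkable, any class with matching method names satisfies it structurally -- no inheritance required. A toy backend illustrating this (`FixedPointPropagator` is a hypothetical test double, not a TALOS class):

```python
# Structural typing demo: a stub backend satisfies PropagatorProtocol
# purely by exposing methods with the right names.
from datetime import datetime
from typing import Protocol, runtime_checkable

@runtime_checkable
class PropagatorProtocol(Protocol):
    def propagate(self, tle_line1: str, tle_line2: str,
                  epoch: datetime) -> tuple[float, float, float]: ...
    def propagate_batch(self, tle_lines: list[tuple[str, str]],
                        epochs: list[datetime]
                        ) -> list[tuple[float, float, float]]: ...

class FixedPointPropagator:
    """Hypothetical test double: always reports the same position."""
    def propagate(self, tle_line1, tle_line2, epoch):
        return (0.0, 0.0, 550.0)
    def propagate_batch(self, tle_lines, epochs):
        return [(0.0, 0.0, 550.0) for _ in tle_lines]

backend = FixedPointPropagator()
# isinstance() on a runtime_checkable Protocol checks method presence
assert isinstance(backend, PropagatorProtocol)
```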
3.2 Backend Selection¶
import os
def create_propagator() -> PropagatorProtocol:
backend = os.environ.get("TALOS_PROPAGATOR", "skyfield")
if backend == "dsgp4":
from talos.propagators.dsgp4_backend import DSgp4Propagator
return DSgp4Propagator()
else:
from talos.propagators.skyfield_backend import SkyfieldPropagator
return SkyfieldPropagator()
4. dSGP4 Evaluation¶
dSGP4 is a differentiable SGP4 implementation built on PyTorch, developed by ESA's Advanced Concepts Team. It enables GPU-accelerated batch propagation of satellite orbits.
4.1 Key Characteristics¶
| Property | Value |
|---|---|
| Package | dsgp4 (PyPI/conda-forge) |
| Version | 1.1.5 (latest stable) |
| Backend | PyTorch (CUDA, Metal, CPU) |
| License | Apache 2.0 |
| Key feature | Batch propagation of N satellites x M epochs in a single GPU kernel |
| Differentiability | Gradients through SGP4 (useful for orbit determination, not needed for TALOS) |
4.2 API Usage¶
import dsgp4
import torch
# Parse TLEs
tles = dsgp4.tle.load_from_lines(tle_line_pairs)
# Propagate N satellites to M epochs (batch)
# times_since_epoch: Tensor of shape (N, M) in minutes
positions, velocities = dsgp4.propagate_batch(
tles,
tsinces=times_since_epoch,
)
# positions: Tensor of shape (N, M, 3) -- TEME coordinates in km
4.3 Performance Comparison¶
Benchmark: propagate 100 satellites across 1,000 time steps (100,000 total propagations).
| Backend | Hardware | Time | Speedup |
|---|---|---|---|
| Skyfield (sequential) | CPU (single core) | ~12.0s | 1x |
| dSGP4 (CPU batch) | CPU (8 cores) | ~1.2s | 10x |
| dSGP4 (CUDA batch) | NVIDIA RTX 3060 | ~0.12s | 100x |
| dSGP4 (Metal batch) | Apple M2 | ~0.25s | 48x |
For the Director's real-time loop (propagating current positions only), the speedup is modest -- Skyfield already handles 50 single-point propagations in ~80ms. The batch advantage of dSGP4 is decisive for:
- Campaign planning: Propagate 100+ satellites across a 24-hour prediction horizon (millions of points).
- Pass prediction: Compute rise/set events for all station-satellite pairs simultaneously.
- Ground track generation: Batch-compute 48-point polylines for all satellites at once.
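To size the planning batch concretely (the step sizes below are illustrative assumptions, not values from the measurements above):

```python
# Sizing the (N, M) batch for a 24-hour campaign-planning horizon.
def batch_size(satellites: int, horizon_hours: int, step_s: int) -> int:
    """Total single-point propagations: N satellites x M epochs."""
    epochs = horizon_hours * 3600 // step_s
    return satellites * epochs

coarse = batch_size(100, 24, 30)  # 30s steps: 288,000 propagations
fine = batch_size(100, 24, 1)     # 1s steps: 8,640,000 propagations
```

At fine time steps the horizon reaches the millions of points where GPU batch propagation pays off.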
4.4 Dependency Cost¶
| Dependency | Size | Notes |
|---|---|---|
| dsgp4 | ~2 MB | Small, pure Python + PyTorch |
| torch (CPU) | ~200 MB | Required; CPU-only variant avoids CUDA bloat |
| torch (CUDA) | ~2 GB | Only for GPU acceleration |
Recommendation: make dSGP4 an optional dependency. Install torch CPU-only in the Docker image. GPU support is opt-in via a CUDA-enabled image variant.
4.5 Integration Path¶
- Implement a DSgp4Propagator class conforming to PropagatorProtocol.
- Add the TALOS_PROPAGATOR=dsgp4 environment variable toggle.
- Use dSGP4 batch mode in the BackgroundPredictor for pass prediction and ground tracks.
- Keep Skyfield as the default for single-point real-time propagation (lighter dependency).
5. Ground Track Caching¶
Independent of the propagator backend, ground track polylines should be cached.
5.1 Cache Design¶
from dataclasses import dataclass
from datetime import datetime, timedelta
@dataclass
class CachedGroundTrack:
satellite_id: int
polyline: list[tuple[float, float]] # (lat, lon) pairs
computed_at: datetime
tle_epoch: datetime
ttl: timedelta = timedelta(minutes=5)
@property
def is_stale(self) -> bool:
return datetime.utcnow() - self.computed_at > self.ttl
5.2 Cache Invalidation¶
Recompute a ground track when:
- The cache entry is older than 5 minutes.
- The satellite's TLE has been updated (new epoch).
- A manual refresh is requested via API.
5.3 Impact¶
At 10 satellites, caching eliminates 480 SGP4 calls per tick (960/second). Over a 24-hour period, that is ~83 million avoided SGP4 computations. The cache hit rate should exceed 99.5% under normal operation.
6. Load Testing Baseline¶
Establish measurable targets for each scaling tier.
6.1 Test Scenarios¶
| Tier | Stations | Campaigns | Satellites | Target tick time |
|---|---|---|---|---|
| Small | 10 | 5 | 5 | < 100ms |
| Medium | 50 | 20 | 20 | < 250ms |
| Large | 100 | 50 | 50 | < 400ms |
| XL | 200 | 100 | 100 | < 500ms |
6.2 Test Infrastructure¶
# Load test: simulate N stations publishing heartbeats
# and verify Director processes all within tick budget
import asyncio
import json
from datetime import datetime
import aiomqtt
async def simulate_stations(n: int, broker_host: str):
"""Simulate N stations sending heartbeats."""
async with aiomqtt.Client(broker_host) as client:
for i in range(n):
topic = f"talos/test-org/gs/station-{i:04d}/heartbeat"
payload = json.dumps({
"station_id": f"station-{i:04d}",
"timestamp": datetime.utcnow().isoformat(),
"status": "online",
})
await client.publish(topic, payload)
6.3 Metrics to Track¶
| Metric | Prometheus Name | Alert Threshold |
|---|---|---|
| Tick duration (p50) | talos_director_tick_duration_seconds | > 250ms |
| Tick duration (p99) | talos_director_tick_duration_seconds | > 450ms |
| Propagation time | talos_director_propagation_duration_seconds | > 200ms |
| MQTT publish latency | talos_director_mqtt_publish_duration_seconds | > 100ms |
| Background prediction time | talos_director_prediction_duration_seconds | > 30s |
| Active stations | talos_director_active_stations | informational |
| Active campaigns | talos_director_active_campaigns | informational |
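For completeness, a stdlib-only sketch of how the p50/p99 thresholds could be evaluated from raw tick samples (production would query Prometheus histogram quantiles instead; `tick_percentiles` is an illustrative helper):

```python
# Compute p50/p99 tick durations from raw samples with the stdlib.
import statistics

def tick_percentiles(durations_s: list[float]) -> tuple[float, float]:
    """Return (p50, p99) of tick durations in seconds."""
    cuts = statistics.quantiles(durations_s, n=100)  # 99 cut points
    return cuts[49], cuts[98]                        # p50, p99

# 98 fast ticks plus two slow outliers
ticks = [0.120] * 98 + [0.300, 0.480]
p50, p99 = tick_percentiles(ticks)
assert p50 < 0.250   # under the p50 alert threshold
assert p99 > 0.450   # would trip the p99 alert
```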
7. Scaling Roadmap¶
7.1 Phase 1: 10-50 Stations (v0.5)¶
- Background threading for pass prediction and ground tracks.
- Ground track caching with 5-minute TTL.
- Skyfield remains the default propagator.
- Single Director instance.
7.2 Phase 2: 50-100 Stations (v0.6)¶
- dSGP4 batch propagation for background prediction.
- Connection pooling for MQTT publishes.
- Database query optimization (prepared statements, index tuning).
- Single Director instance with tuned Python (uvloop if applicable).
7.3 Phase 3: 100-500 Stations (v0.7+)¶
- Regional Director sharding (Americas, Europe, Asia-Pacific).
- MQTT 5.0 shared subscriptions for load distribution.
- Evaluate NATS JetStream as message broker.
- Consider Rust Director for the hot path (tokio + rumqttc).
7.4 Scaling Decision Matrix¶
| Metric | Action Required |
|---|---|
| p99 tick > 400ms | Enable background threading |
| p99 tick > 450ms with threading | Switch to dSGP4 batch mode |
| Active stations > 100 | Evaluate regional sharding |
| MQTT broker CPU > 80% | Evaluate NATS migration |
| Single Director cannot keep up | Implement Rust Director |
8. Memory Considerations¶
8.1 Current Memory Profile¶
| Component | Resident Memory |
|---|---|
| Director process | ~80 MB |
| Skyfield Earth satellite objects (10) | ~5 MB |
| Ground track cache (10 satellites) | ~1 MB |
| Pass prediction cache (10 stations x 10 sats x 24h) | ~2 MB |
8.2 Projected at Scale¶
| Station count | Estimated Director Memory |
|---|---|
| 10 | ~90 MB |
| 50 | ~150 MB |
| 100 | ~250 MB |
| 200 | ~400 MB |
Memory is not a concern at any projected scale. The Director's footprint is dominated by Skyfield's ephemeris data (~50 MB) and satellite objects, not by the station count.
9. Implementation Priority¶
| Task | Impact | Effort | Priority |
|---|---|---|---|
| Background threading | High -- unblocks 50-station tier | 2-3 days | P0 |
| Ground track caching | High -- eliminates 99%+ of redundant SGP4 calls | 1-2 days | P0 |
| Load test harness | Medium -- establishes measurable baselines | 2-3 days | P1 |
| PropagatorProtocol backends | Medium -- enables dSGP4 without Director changes | 1-2 days | P1 |
| dSGP4 batch integration | Medium -- needed for 100+ station tier | 3-5 days | P2 |
| Database query optimization | Low -- not a bottleneck yet | 1 day | P2 |
Summary¶
The Director's scaling ceiling at ~50 stations is primarily caused by redundant ground track computation, not by the core propagation loop. Background threading and caching alone provide a 5x improvement in tick budget utilization. dSGP4 extends the runway to 100+ stations for batch workloads (prediction, planning) while Skyfield remains efficient for real-time single-point propagation. The two-propagator strategy via PropagatorProtocol avoids a forced migration and lets each backend serve its strength.