When Trains Meet Snowstorms: Turning Weather Chaos into Predictable Rail Operations

Opening — Why this matters now

Railway delays are one of those problems everyone experiences and almost no one truly understands. Passengers blame weather. Operators blame operations. Data scientists blame missing variables. Everyone is partially correct.

What has quietly shifted in recent years is not the weather itself, but our ability to observe it alongside operations—continuously, spatially, and at scale. As rail systems push toward AI‑assisted scheduling, predictive maintenance, and real‑time disruption management, delay prediction without weather is no longer just incomplete—it is structurally misleading.

This paper arrives at the right moment: not with another model, but with something more foundational—a properly engineered dataset that finally lets weather and rail operations talk to each other.

Background — Context and prior art

Railway analytics has never suffered from a lack of data. Timetables, GPS traces, rolling stock logs, and station events are widely available across Europe and Asia. What has been missing is integration.

Most public railway datasets focus on what trains did, not under what conditions they did it. Weather, despite being repeatedly cited as a major delay driver—especially in Nordic and alpine regions—has typically been left out or reduced to a handful of coarse indicators.

Previous studies that attempted weather-aware delay prediction often relied on limited variables (temperature, rainfall) or narrow geographies. Others acknowledged weather as important future work, a polite academic way of saying “we know this matters, but the data is painful.”

The Finnish context makes this omission particularly costly. With a largely single-track network and temperatures ranging from −40°C to +30°C, small disruptions propagate fast. Weather is not noise here; it is structure.

Analysis — What the paper actually does

Instead of proposing a new neural architecture, the authors do something more operationally valuable: they build infrastructure.

Two data worlds, finally aligned

The dataset fuses:

Railway operations data from Finland’s Digitraffic system (2018–2024), covering schedules, arrivals, departures, delays, cancellations, and stop-level granularity.
Meteorological observations from 209 Finnish Meteorological Institute stations, with up to 13 weather variables at 1‑minute or 10‑minute resolution.

The key is not collection, but alignment.

Spatial–temporal matching that respects reality

Each train stop is matched to the nearest weather station using Haversine distance, then aligned in time using nearest-neighbor matching within tolerance windows. When a weather station lacks a specific sensor (snow depth, visibility, precipitation), a 50 km spatial fallback retrieves the closest valid measurement.

This is not mathematically elegant. It is operationally sane.
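
The matching logic described above can be sketched in a few functions. This is a minimal illustration, not the authors' code: the dictionary layout, the function names, the use of epoch seconds for timestamps, and the example coordinates are all assumptions made for the sketch. Only the Haversine distance, the nearest-in-time match within a tolerance, and the 50 km sensor fallback come from the description.

```python
import math
from bisect import bisect_left

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(stop, stations, sensor=None, max_km=None):
    """Closest station, optionally restricted to stations that carry
    `sensor` and lie within `max_km`. Returns None if nothing qualifies."""
    best, best_d = None, float("inf")
    for st in stations:
        if sensor is not None and sensor not in st["sensors"]:
            continue
        d = haversine_km(stop["lat"], stop["lon"], st["lat"], st["lon"])
        if d < best_d and (max_km is None or d <= max_km):
            best, best_d = st, d
    return best


def station_for_sensor(stop, stations, sensor, fallback_km=50.0):
    """Use the nearest station if it has the sensor; otherwise fall back
    to the closest station within `fallback_km` that does."""
    primary = nearest_station(stop, stations)
    if primary is not None and sensor in primary["sensors"]:
        return primary
    return nearest_station(stop, stations, sensor=sensor, max_km=fallback_km)


def nearest_in_time(obs_times, target, tolerance):
    """Index of the observation closest to `target`, or None if the gap
    exceeds `tolerance`. `obs_times` must be sorted (epoch seconds here)."""
    if not obs_times:
        return None
    i = bisect_left(obs_times, target)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(obs_times)]
    j = min(candidates, key=lambda k: abs(obs_times[k] - target))
    return j if abs(obs_times[j] - target) <= tolerance else None


# Hypothetical stop and stations, for illustration only.
stop = {"lat": 60.0, "lon": 25.0}
stations = [
    {"id": "A", "lat": 60.0, "lon": 25.1, "sensors": {"temperature"}},
    {"id": "B", "lat": 60.2, "lon": 25.0, "sensors": {"temperature", "snow_depth"}},
    {"id": "C", "lat": 62.0, "lon": 25.0, "sensors": {"visibility"}},
]
```

In this toy setup, a temperature lookup resolves to station A (about 5.6 km away), a snow-depth lookup falls back to station B (about 22 km), and a visibility lookup returns nothing because the only station with that sensor lies well beyond 50 km. Returning None rather than a distant reading keeps missing data visible instead of silently substituting weather from another region.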

Feature engineering that understands time

Rather than treating hours, weekdays, and months as linear numbers, the dataset applies cyclical sine–cosine encoding, preserving temporal continuity (midnight is close to 23:00, February is not “far” from January).
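
A small sketch makes the continuity argument concrete. The function names and the choice of periods (24 for hours, 12 for months) are assumptions for illustration; the dataset's exact column layout is not specified here.

```python
import math


def cyclical(value, period):
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def circular_gap(a, b, period):
    """Euclidean distance between two values after cyclical encoding."""
    s1, c1 = cyclical(a, period)
    s2, c2 = cyclical(b, period)
    return math.hypot(s1 - s2, c1 - c2)


# Hours: 23:00 and 00:00 end up as close as 00:00 and 01:00.
gap_wrap = circular_gap(23, 0, 24)
gap_step = circular_gap(0, 1, 24)

# Months: December (12) to January (1) matches January to February.
gap_dec_jan = circular_gap(12, 1, 12)
gap_jan_feb = circular_gap(1, 2, 12)
```

With a plain integer encoding, 23:00 and 00:00 would sit 23 units apart. After encoding, the wrap-around gap equals the gap of any other one-hour step, and the largest possible gap (midnight versus noon) is exactly 2.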

Weather variables are robust-scaled using interquartile ranges, acknowledging that meteorological data is inherently outlier-prone.
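
As a rough sketch of what robust scaling means in practice: each value is centered and divided by the interquartile range. Centering on the median, the "inclusive" quartile method, and the guard for a zero IQR are assumptions of this example, not details confirmed by the paper.

```python
import statistics


def robust_scale(values):
    """Scale values as (x - median) / IQR, which limits the influence
    of extreme readings on the scaling parameters."""
    if len(values) < 2:
        raise ValueError("need at least two values to estimate quartiles")
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        iqr = 1.0  # assumption: leave constant columns unscaled
    return [(v - median) / iqr for v in values]


# Hypothetical readings: the second series contains one extreme value.
calm = robust_scale([1, 2, 3, 4, 5])
stormy = robust_scale([1, 2, 3, 4, 100])
```

Here the median (3) and IQR (2) are identical for both series, so the four ordinary readings scale to the same values in each case and only the extreme reading stands out. A mean and standard deviation would instead be dragged toward the outlier and compress everything else.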

Delay is not one thing—and the dataset proves it

The most underappreciated contribution is how the paper treats delay itself.

Three targets are provided:

Target	What it captures	Why it matters
Cumulative delay	End-to-end passenger experience	Useful for passenger information systems
Origin-offset delay	Removes initial lateness	Cleaner operational signal
Station-specific delay	Isolates incremental delay per segment	Critical for diagnosing causes

Most delays, it turns out, are inherited—not created locally. Without this distinction, models learn propagation, not causality.
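
One plausible reading of the three targets, worked through for a single train, is sketched below. The formulas are assumptions inferred from the table descriptions (in particular, treating the first stop's increment as the initial delay itself), and the function name and sample times are hypothetical.

```python
def delay_targets(scheduled, actual):
    """Compute three per-stop delay targets from scheduled and actual
    times, both given in minutes and listed in stop order."""
    if len(scheduled) != len(actual) or not scheduled:
        raise ValueError("scheduled and actual must be equal-length and non-empty")
    # Cumulative delay: how late the train is at each stop.
    cumulative = [a - s for s, a in zip(scheduled, actual)]
    # Origin-offset delay: cumulative delay minus lateness at the first stop.
    origin_offset = [d - cumulative[0] for d in cumulative]
    # Station-specific delay: delay added since the previous stop.
    # Assumption: the first stop's increment is its own initial delay.
    station_specific = [cumulative[0]] + [
        cur - prev for prev, cur in zip(cumulative, cumulative[1:])
    ]
    return {
        "cumulative": cumulative,
        "origin_offset": origin_offset,
        "station_specific": station_specific,
    }


# Hypothetical train: departs 5 minutes late, loses 1 more minute on the
# first segment, holds steady, then loses 3 minutes on the last segment.
targets = delay_targets(scheduled=[0, 30, 60, 90], actual=[5, 36, 66, 99])
# targets["cumulative"]       -> [5, 6, 6, 9]
# targets["origin_offset"]    -> [0, 1, 1, 4]
# targets["station_specific"] -> [5, 1, 0, 3]
```

The train arrives 9 minutes late at its final stop, yet only 3 of those minutes were added on the last segment. A model trained on the cumulative figure would attribute all 9 minutes to conditions at that stop.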

Findings — What the data reveals

The descriptive analysis alone justifies the dataset.

Seasonal reality, quantified

Winter months regularly exceed 25% delay incidence.
Summer performs better overall, but June spikes—likely due to traffic volume rather than weather.

Weekly structure

Fridays are consistently the worst-performing day.
Saturdays are the most reliable—an operational pattern, not a meteorological one.

Delay severity distribution

Medium delays (10–15 minutes) dominate. Extreme delays exist, but they are not the norm—important when designing loss functions and evaluation metrics.

Baseline ML result (and why it matters)

A straightforward XGBoost regression, evaluated separately on each of the three delay targets, achieves:

Target	MAE (minutes)
Station-specific delay	2.73
Cumulative delay	4.21
Origin-offset delay	4.81

The lesson is not that XGBoost is impressive. It is that the right target formulation makes prediction easier.
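
For readers less familiar with the metric, mean absolute error is simply the average size of the miss, in the same unit as the target (minutes here). The sketch below uses made-up values and a hypothetical function name; it does not reproduce the paper's model or data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be equal-length and non-empty")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed vs. predicted delays in minutes.
observed = [0, 2, 5, 12]
predicted = [1, 2, 3, 8]
mae = mean_absolute_error(observed, predicted)  # (1 + 0 + 2 + 4) / 4 = 1.75
```

An MAE of 2.73 minutes therefore means predictions miss the station-specific delay by under three minutes on average.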

Implications — What this enables next

This dataset quietly unlocks several directions that were previously speculative:

Weather impact attribution: With station-level delay increments, weather variables can be tied to the segment where a delay was actually added, which is a far stronger basis for attribution than correlation with end-to-end lateness.
Graph and sequence models: Delay propagation is inherently sequential and networked. Transformers and GNNs now have the data structure they require.
Infrastructure vulnerability mapping: Repeated weather-sensitive delay hotspots point directly to physical bottlenecks, not just operational ones.
Operational realism in AI systems: Models that ignore weather tend to look accurate until winter arrives. This dataset removes that illusion.

Conclusion — Less mystery, more mechanics

This work does not promise perfect delay prediction. It does something better: it removes a long-standing excuse.

By integrating operational railway data with high-resolution meteorological observations—carefully, transparently, and at national scale—the authors provide a foundation that future models can actually stand on.

In an industry where “weather disruption” is often treated as an act of God, this dataset reframes it as a measurable, modelable, and ultimately manageable variable.

Cognaptus: Automate the Present, Incubate the Future.
