Revisiting Expected Possession Value in Football: Introducing a U-Net Architecture, Reward and Risk for Passes, and a Benchmark

Thijs Overmeer [1,2,a], Tim Janssen [3,b] and Wim P. M. Nuijten [1,2,c]

[1] Eindhoven AI Systems Institute, Eindhoven University of Technology, Eindhoven, The Netherlands

[2] Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands

[3] Royal Dutch Football Association (KNVB), Zeist, The Netherlands

Keywords: Expected Possession Value, U-Net Architecture, Football Analytics, Pass Analysis, Risk-Reward Decomposition, Machine Learning in Sports.

Abstract:

This paper presents an Expected Possession Value (EPV) model for football with three main new components:

a U-Net-inspired convolutional neural network architecture, ball height as a feature, and a dual-component

pass value model that analyzes reward and risk. We furthermore introduce the Overmeer–Janssen–Nuijten Pass

Expected Possession Value benchmark (OJN-Pass-EPV benchmark), which enables a quantitative evaluation

of EPV models by using pairs of game states with given relative EPVs. The presented EPV model achieves

good results in model loss and Expected Calibration Error on a dataset containing Dutch Eredivisie and 2022

FIFA Men’s World Cup matches and correctly identiﬁes the higher value state in 78% of the game state

pairs in the OJN-Pass-EPV benchmark, demonstrating its ability to accurately assess goal-scoring potential.

Our ﬁndings enable more precise EPV estimations, support risk-reward analysis for passing decisions, and

establish quality control standards for EPV models.

1 INTRODUCTION

Football analytics is increasingly important for gain-

ing a competitive edge. This paper focuses on a spe-

ciﬁc metric in the expanding realm of football data

analysis: Expected Possession Value (EPV). EPV

quantiﬁes the net probability of goal outcomes within

a ﬁxed time horizon: the probability that the team

in possession scores minus the probability that they

concede within τ seconds. Following Fernández et al.

(2021), who propose a horizon consistent with the av-

erage possession duration, we set τ = 15 seconds in

this study. The resulting value is a continuous, zero-

centered measure of goal-scoring potential with range

in [−1, 1].

This work addresses four key research questions

(RQs) in EPV modeling:

First, can we develop a high-quality EPV model using modern deep learning architectures? We investigate whether U-Net convolutional neural networks, successful in medical imaging and other spatial domains, can capture the complex spatial patterns in football. For this, we introduce OJN-EPV (Overmeer–Janssen–Nuijten Expected Possession Value), a U-Net–type architecture focused on pass EPV, tested on the Dutch Eredivisie dataset and the 2022 FIFA Men’s World Cup dataset.

Second, does incorporating ball height as a feature improve pass prediction? We examine whether adding the vertical dimension enables the model to distinguish between aerial and ground passes, potentially improving prediction accuracy for different types of passing scenarios.

Third, can decomposing pass value into risk and reward components provide more actionable insights? We define reward as the probability of scoring and risk as the probability of conceding within 15 seconds, for both successful and unsuccessful passes.

Fourth, how can we establish standardized evaluation methods for EPV models? We introduce the Overmeer–Janssen–Nuijten Pass Expected Possession Value benchmark (OJN-Pass-EPV benchmark), consisting of 50 expert-validated pairs of game states for relative pass-value comparison.

a https://orcid.org/0009-0003-9108-1909

b https://orcid.org/0000-0002-8050-1176

c https://orcid.org/0000-0003-0351-2768


Overmeer, T., Janssen, T. and Nuijten, W. P. M.

Revisiting Expected Possession Value in Football: Introducing a U-Net Architecture, Reward and Risk for Passes, and a Benchmark.

DOI: 10.5220/0013784300003988

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 13th International Conference on Sport Sciences Research and Technology Support (icSPORTS 2025), pages 100-109

ISBN: 978-989-758-771-9; ISSN: 2184-3201

1.1 Main Contributions

Based on our research questions, this work delivers

the following contributions:

• A U-Net-based EPV model achieving strong per-

formance across all components: pass success,

pass likelihood, and pass value prediction with

low loss and strong calibration.

• Empirical validation that ball height improves

pass likelihood predictions, enabling the model to

distinguish aerial from ground passes while ac-

counting for their inherent uncertainty.

• Decomposition of pass value into interpretable re-

ward and risk components, enabling tactical anal-

ysis of volatility in passing decisions beyond a

single metric.

• The OJN-Pass-EPV benchmark with 50 expert-

validated game state pairs, establishing the ﬁrst

standardized quantitative evaluation framework

for EPV models.

2 RELATED WORK

Quantifying the value of actions in football follows

two main lines: event-only models and tracking-

based models. Event-only approaches leverage the

broad availability of event data, while tracking-based

models exploit full-player spatiotemporal data to capture off-ball context.

Event-based metrics such as Rudd’s Markov mod-

els (Rudd, 2011), VAEP (Decroos et al., 2019), Ex-

pected Threat (xThreat) (Singh, 2019), and ECOM

(Bransen et al., 2019) estimate the added value of ac-

tions from event data. They scale well across com-

petitions but cannot distinguish game states that share

identical on-ball actions yet differ in off-ball position-

ing, which limits their ability to evaluate relative pass

value in otherwise similar situations.

Tracking-based approaches combine tracking

data with events to model spatial dynamics. Metrics

such as Dangerousity (Link et al., 2016) and expected

pass (Anzer and Bauer, 2022) incorporate, for example,

player locations and velocities, offering richer con-

text. However, many works either simplify represen-

tation (e.g., coarse zones) or only partially encode the

full game state, and the resulting models are not al-

ways transparently interpretable for practitioners.

Fernández et al. (2019) introduce Expected Possession Value (EPV) to football as a tracking-based

framework that estimates, at each event of a pass,

dribble, or shot, the net probability of scoring mi-

nus conceding within a ﬁxed horizon. EPV decom-

poses possession into actions (passes, carries, shots)

and produces spatially interpretable surfaces, includ-

ing pass value within a ﬁxed horizon (commonly 15

seconds) and pass success probabilities. Fernández et al. (2021) further refine components such as pass

likelihood, dribble and shot evaluation, and action se-

lection, and evaluate models with calibration and loss

metrics.

Beyond single-number value, prior work has ex-

amined risk–reward trade-offs for passes using track-

ing data (Goes et al., 2022; Power et al., 2017), typi-

cally deﬁning risk via interception likelihood and re-

ward via tactical outcomes. Our perspective differs

in objective: we deﬁne reward and risk directly in

terms of future goal scoring and conceding probabili-

ties within a 15-second horizon, yielding spatial value

surfaces for both successful and unsuccessful passes.

Two gaps remain salient. First, ball height is

rarely modeled, despite empirical differences between aerial and ground passes (Håland et al., 2020).

Second, standardized relative evaluation for pass EPV

is lacking; existing works predominantly report ag-

gregate loss and calibration, which do not capture

whether a model ranks two closely related states in

the expert-expected order. Our work addresses both

by incorporating ball height explicitly, and by intro-

ducing an expert-validated benchmark of paired game

states for relative assessment.

3 METHODOLOGY

3.1 Data Collection

The data used in this study are provided by the Koninklijke Nederlandse Voetbalbond (KNVB) and comprise tracking data from TRACAB and event data from OPTA. They cover the 2021/22 and 2022/23 seasons of the Dutch Eredivisie as well as the 2022 FIFA Men’s World Cup.

Our analysis utilizes data from 624 Eredivisie

matches and 63 2022 FIFA Men’s World Cup

matches. This combination captures a diverse array

of performance levels and playing styles, thereby pro-

viding a robust foundation for OJN-EPV.

3.2 Data Preprocessing and Feature

Engineering

We transform the raw event and tracking data for in-

tegration into our models through the following steps:

• Coordinate Normalization and Grid Scaling:

Player and ball coordinates are ﬁrst normalized to


a standard pitch dimension (105m x 68m) based

on the venue information associated with each

match, addressing variations in actual pitch sizes.

These normalized coordinates are then scaled to

ﬁt a 104×68 grid representation for efﬁcient pro-

cessing in NumPy and TensorFlow. Velocities are

smoothed using a Savitzky-Golay ﬁlter to reduce

noise.

• Direction Standardization and Cleaning: We

standardize the data by ensuring all attacks pro-

ceed uniformly from left to right. Additionally,

we remove instances of players recorded outside

the pitch boundaries to improve data integrity.

• Real Playing Time Calculation: To accurately as-

sess pass value, we calculate the actual playing

time, excluding periods when the ball is out of

play. This ensures that our evaluation window of

15 seconds following each pass reﬂects only the

active duration of the game, providing a more pre-

cise assessment of in-game actions.

• Data Alignment: To ensure synchronicity, we

align the event data with the tracking data. This

ensures that each pass event is accurately reﬂected

in the tracking data, enabling precise spatial and

temporal analysis. The tracking data includes ball

height (z-axis). Positional data from optical track-

ing systems inherently contains noise (with typi-

cal errors around 7–8 cm (Linke et al., 2020)); our

1 m grid resolution is robust to such deviations in

the (x, y) positions.
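The coordinate normalization, direction standardization, and grid scaling steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `to_grid_cell`, the `attacking_left_to_right` flag, and the convention that raw coordinates are in metres with the origin at the centre spot are all hypothetical.

```python
STD_LENGTH, STD_WIDTH = 105.0, 68.0   # standard pitch size in metres (from the text)
GRID_X, GRID_Y = 104, 68              # grid size used by the model (from the text)


def to_grid_cell(x, y, pitch_length, pitch_width, attacking_left_to_right=True):
    """Map a raw position to a (column, row) grid cell, or None if off the pitch.

    Assumption: raw coordinates are in metres with the origin at the centre spot.
    """
    # Rescale from the venue's actual pitch size to the standard pitch.
    x_std = x * STD_LENGTH / pitch_length
    y_std = y * STD_WIDTH / pitch_width

    # Direction standardization: mirror both axes so every attack runs left to right.
    if not attacking_left_to_right:
        x_std, y_std = -x_std, -y_std

    # Drop positions recorded outside the pitch boundaries.
    if abs(x_std) > STD_LENGTH / 2 or abs(y_std) > STD_WIDTH / 2:
        return None

    # Shift the origin to a pitch corner and scale to the grid.
    col = int((x_std + STD_LENGTH / 2) / STD_LENGTH * GRID_X)
    row = int((y_std + STD_WIDTH / 2) / STD_WIDTH * GRID_Y)

    # Positions exactly on the far goal line or touchline fall into the last cell.
    return min(col, GRID_X - 1), min(row, GRID_Y - 1)
```

With these assumptions, the centre spot maps to cell (52, 34), and a position outside the touchlines returns `None` so it can be filtered out during cleaning.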

For the pass likelihood, pass success, and pass value

models, we use the features described in Fernández et al. (2021) and additionally incorporate the z-value

(height) of the ball.

3.3 Model Architecture

We select a U-Net–type architecture (Ronneberger

et al., 2015) due to its proven effectiveness in image

segmentation tasks, which share similarities with pre-

dicting dense, spatially-aware surfaces like pass EPV

across the pitch. The U-Net’s encoder-decoder struc-

ture with skip connections allows the model to cap-

ture both ﬁne-grained local details (e.g., player prox-

imity) and broader global context (e.g., overall team

formation), which are both crucial for accurate EPV

estimation.

Our pass OJN-EPV model takes a multi-channel

grid representation of the game state over the pitch

with dimensions (104×68) and produces a single out-

put grid of the same size. Each cell corresponds to

the predicted quantity at that location (e.g., pass suc-

cess probability, pass likelihood, or pass value). The

model comprises encoder and decoder blocks with

max pooling, replication padding, attention gates, and

concatenation layers. A diagram is provided in Fig-

ure 1.

Each encoder block applies two repetitions of:

replication padding, a convolution with a 5×5 kernel,

batch normalization, and a LeakyReLU activation (al-

pha = 0.1). The number of ﬁlters per block is 16,

32, and 64 in the contracting path, then 32 and 16 in

the expanding path to mirror the U-shape. Decoder

blocks consist of upsampling, replication padding, a

5×5 convolution with the corresponding number of

ﬁlters, batch normalization, and LeakyReLU (alpha

= 0.1).

Downsampling is performed by max pooling after

the ﬁrst two encoder blocks; pooling is omitted after

the third to preserve spatial resolution. The most con-

tracted feature maps are 26×17.
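The 26×17 figure follows from halving the 104×68 input twice. The short sketch below reproduces that arithmetic. It assumes 2×2 max pooling with stride 2 and size-preserving convolutions (which replication padding would give for a 5×5 kernel with stride 1); neither detail is stated explicitly in the text, and the function name is hypothetical.

```python
def feature_map_sizes(height, width, num_blocks=3, pooled_blocks=2, pool=2):
    """Return the (height, width) of the feature maps produced by each encoder block."""
    sizes = []
    for block in range(num_blocks):
        # Assumption: padded convolutions keep the spatial size unchanged,
        # so only pooling alters the resolution.
        sizes.append((height, width))
        # Pooling follows the first `pooled_blocks` blocks only.
        if block < pooled_blocks:
            height, width = height // pool, width // pool
    return sizes


sizes = feature_map_sizes(104, 68)
# sizes == [(104, 68), (52, 34), (26, 17)]
```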

In the decoder, feature maps are upsampled. At-

tention gates modulate the high-resolution encoder

features using a gating signal from the decoder before

concatenation. The concatenated features are then

processed by the decoder convolutional blocks, com-

bining local detail with global context.

The ﬁnal layer uses a sigmoid activation for the

pass success model and softmax over the 104 × 68

grid for the pass likelihood model. For pass value,

we employ a softmax per grid cell with three classes

indicating outcomes within 15 seconds: goal for the

passing team, no goal, or goal for the opponent.
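To make the three-class output concrete, the sketch below turns the logits of a single grid cell into class probabilities and then into a reward, a risk, and a net value. It is an illustration only: the class ordering (goal for the passing team, no goal, goal for the opponent) and the function names are assumptions, not details confirmed by the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cell_value(logits):
    """Convert one cell's three-class logits into reward, risk, and net value.

    Assumed class order: (goal for passing team, no goal, goal for opponent).
    """
    p_score, _p_no_goal, p_concede = softmax(logits)
    return {
        "reward": p_score,             # probability of scoring within the horizon
        "risk": p_concede,             # probability of conceding within the horizon
        "value": p_score - p_concede,  # net value, which lies in [-1, 1]
    }
```

Equal logits give equal probabilities for all three classes and therefore a net value of zero; a larger first logit shifts the value toward the passing team.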

3.4 Model Training and Evaluation

We split the matches into training, validation, and test

sets using an 80-10-10 split for Eredivisie matches

and a 60-20-20 split for 2022 FIFA Men’s World Cup

matches. Due to the smaller size of the 2022 FIFA

Men’s World Cup dataset, we assign a higher per-

centage of samples to the validation and test sets to

enhance their statistical relevance. Table 1 shows

the distribution of successful and unsuccessful passes

across both datasets.

We first train on the larger Eredivisie dataset and subsequently fine-tune on the 2022 FIFA Men’s World Cup data. The training employs a cyclic learning rate that oscillates between a base learning rate of $1 \times 10^{-6}$ and a maximum learning rate of $1 \times 10^{-4}$, following a triangular policy with a full cycle lasting 8 epochs (Smith, 2017). This schedule helps the optimizer avoid poor local minima. For fine-tuning on the World Cup data, the maximum learning rate is decreased to $1 \times 10^{-5}$.

Figure 1: High-level U-Net architecture with encoder–decoder and skip connections used in OJN-EPV.

Table 1: Comparison of Successful and Unsuccessful Passes.

Dataset                     Total     Train     Val      Test     % Success
Eredivisie                  507,953   406,495   49,542   51,916   79.79%
2022 FIFA Men’s World Cup   58,569    34,093    11,787   12,689   81.52%

A batch size of 128 is used for all OJN-EPV models. Training stops when the validation loss does not improve for 8 consecutive epochs. After training converges, we select the epoch that provides a suitable balance between the loss and expected calibration error (ECE). The Adam optimizer with default settings in TensorFlow 2.18 is employed for all models.
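The triangular cyclic learning-rate policy used in this section can be sketched as below, following the general formula in Smith (2017). The sketch evaluates the schedule per (possibly fractional) epoch; whether the authors update the rate per epoch or per batch is not stated, so that granularity and the function name are assumptions.

```python
import math


def triangular_lr(epoch, base_lr=1e-6, max_lr=1e-4, cycle_epochs=8):
    """Triangular cyclic learning rate; `epoch` may be fractional."""
    half_cycle = cycle_epochs / 2                 # epochs spent rising (or falling)
    cycle = math.floor(1 + epoch / cycle_epochs)  # 1-based index of the current cycle
    # Distance from the peak of the current cycle, scaled to the range [0, 1].
    x = abs(epoch / half_cycle - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)


# The rate starts at the base value, peaks mid-cycle, and returns to the base value.
schedule = [triangular_lr(e) for e in range(9)]
```

For the fine-tuning stage, the same function would be called with `max_lr=1e-5`.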

Both pass success and pass value models employ

temperature scaling as a post-processing step. The

optimal temperature value, ranging from 0.1 to 2 with

a step size of 0.1, is selected to minimize the ECE on

the validation set.
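The temperature search described above can be illustrated for the binary pass success case. The sketch computes a bin-based ECE and picks the temperature from the grid 0.1, 0.2, ..., 2.0 that minimizes it. The specific ECE variant (equal-width bins on the predicted probability of the positive class) and all function names are assumptions; the multi-class pass value models would apply the temperature to the logits before the softmax instead.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def expected_calibration_error(probs, labels, n_bins=10):
    """ECE with equal-width bins over the predicted probability."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        mean_conf = sum(p for p, _ in items) / len(items)
        observed = sum(y for _, y in items) / len(items)
        ece += len(items) / total * abs(mean_conf - observed)
    return ece


def best_temperature(logits, labels):
    """Pick the temperature in {0.1, ..., 2.0} that minimizes validation ECE."""
    candidates = [round(0.1 * k, 1) for k in range(1, 21)]
    return min(
        candidates,
        key=lambda t: expected_calibration_error(
            [sigmoid(z / t) for z in logits], labels
        ),
    )
```

A selected temperature of 1.0 leaves the predictions unchanged, which is the outcome reported for most models in Section 4.3.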

3.5 Optimal Pass Location

Identiﬁcation

To operationalize OJN-EPV for evaluating passes (as

visualized in Figure 4), we create a pitch-wide surface

that combines pass success, pass value, and the like-

lihood of a pass arriving at each location. We restrict

the outputs to reasonably probable destinations using

a likelihood threshold.

Definition 1 (OJN-EPV Output). The model output for location (x, y) is:

\[
\mathrm{Output}(x, y) =
\begin{cases}
V(x, y) & \text{if } L(x, y) > 0.001 \\
0 & \text{otherwise}
\end{cases}
\tag{1}
\]

where

\[
V(x, y) = S(x, y)\, V_{\text{success}}(x, y) + (1 - S(x, y))\, V_{\text{no success}}(x, y)
\tag{2}
\]

\[
V_{\text{success}}(x, y) = P_{\text{score}}(x, y \mid \text{success}) - P_{\text{concede}}(x, y \mid \text{success})
\tag{3}
\]

\[
V_{\text{no success}}(x, y) = P_{\text{score}}(x, y \mid \text{no success}) - P_{\text{concede}}(x, y \mid \text{no success})
\tag{4}
\]

with:

• $V(x, y)$: estimated value of a pass that ends up at location (x, y)

• $L(x, y)$: likelihood that a pass ends up at location (x, y)

• $S(x, y)$: probability of a successful pass to (x, y)

• $P_{\text{score}}(x, y \mid \text{success})$: probability of scoring after a successful pass to (x, y)

• $P_{\text{concede}}(x, y \mid \text{success})$: probability of conceding after a successful pass to (x, y)

• $P_{\text{score}}(x, y \mid \text{no success})$: probability of scoring after an unsuccessful pass to (x, y)

• $P_{\text{concede}}(x, y \mid \text{no success})$: probability of conceding after an unsuccessful pass to (x, y)


We set L(x, y) > 0.001 as a practical threshold in Def-

inition 1. This hyperparameter can be adjusted: lower

values surface less likely (but potentially more cre-

ative) options, whereas higher values restrict recom-

mendations to more traditional and realistic destina-

tions. This output deﬁnition enables the identiﬁcation

of optimal pass locations by evaluating Output(x, y)

across all feasible locations on the pitch. The location

with the highest output value represents the model’s

recommendation for the most valuable pass option,

accounting for both the probability of successfully

executing the pass and its expected impact on scor-

ing/conceding probabilities. This practical applica-

tion is demonstrated in our analysis of real game situ-

ations (see Figure 4 in Section 5).
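A minimal sketch of how Definition 1 could be evaluated on a grid is given below. It uses nested lists in place of the model's 104×68 surfaces, and the names `output_surface`, `best_pass_location`, `v_success`, and `v_no_success` are hypothetical. One design choice is an assumption on our part: the best location is searched only among cells that pass the likelihood threshold, so that a masked cell with output 0 cannot win when every feasible option has negative value.

```python
def output_surface(s, v_success, v_no_success, likelihood, threshold=0.001):
    """Evaluate Definition 1 cell by cell; all inputs are equally sized 2D lists."""
    rows, cols = len(s), len(s[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if likelihood[i][j] > threshold:
                # Success-weighted mix of the two conditional pass values.
                out[i][j] = (
                    s[i][j] * v_success[i][j]
                    + (1.0 - s[i][j]) * v_no_success[i][j]
                )
    return out


def best_pass_location(s, v_success, v_no_success, likelihood, threshold=0.001):
    """Return the (row, col) of the highest-valued feasible cell, or None."""
    out = output_surface(s, v_success, v_no_success, likelihood, threshold)
    feasible = [
        (out[i][j], (i, j))
        for i in range(len(out))
        for j in range(len(out[0]))
        if likelihood[i][j] > threshold
    ]
    return max(feasible)[1] if feasible else None
```

Raising `threshold` shrinks the feasible set to more conventional destinations, which mirrors the trade-off described above.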

3.6 Benchmark Creation and

Evaluation

To enable quantitative evaluation of the performance

of EPV models, we create the OJN-Pass-EPV bench-

mark. This benchmark consists of 50 modiﬁed game

state pairs, where we use a real game state and realis-

tically alter aspects of it (e.g. player positions and ve-

locities). While this number is modest, it represents a

ﬁrst step towards standardized quantitative evaluation

of pass EPV models, focusing on challenging scenar-

ios identiﬁed by experts.

We rely on a panel of three football experts, in-

cluding members of the Royal Dutch Football As-

sociation (KNVB) with backgrounds in performance

and data analysis, to assess which of the two game

states in each game state pair has a higher pass EPV.

This approach follows guidelines from Davis et al.

(2024), who emphasize expert evaluation as essential

for sports analytics validation. We focus on relative

EPVs rather than absolute EPVs, as the latter are of-

ten more subjective. In designing the benchmark, we

select game state pairs that we expect to have widely

accepted relative values, and we only include game

state pairs in the benchmark set if all three football

experts agree on their relative value.

To obtain the performance of OJN-EPV on the benchmark, we first reduce each game state to a single scalar pass EPV. For this, we take the expectation of the pass value over all pass endpoints with respect to the pass likelihood, as in Definition 2. We then compare the scalar pass EPV of the original game state with that of the modified game state and report the percentage of pairs for which the model’s ranking agrees with the ranking given by the experts.

Definition 2 (Scalar Pass EPV).

\[
\mathrm{EPV}_{\text{pass}} = \sum_{x, y} L(x, y)\, V(x, y)
\tag{5}
\]

with $V(x, y)$ as in Definition 1.
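The benchmark scoring procedure can be sketched in a few lines. The code below computes the likelihood-weighted scalar pass EPV for a game state and then the share of pairs on which a model agrees with the experts. The data layout (a pair of scalars per benchmark item, with the expert choice encoded as "a" or "b") and the rule that ties count as disagreement are assumptions made for illustration; the published benchmark may encode this differently.

```python
def scalar_pass_epv(likelihood, value):
    """Likelihood-weighted sum of the pass value surface (both are 2D lists)."""
    return sum(
        l * v
        for row_l, row_v in zip(likelihood, value)
        for l, v in zip(row_l, row_v)
    )


def benchmark_accuracy(model_epv_pairs, expert_choices):
    """Fraction of pairs where the model ranks the states as the experts do.

    model_epv_pairs: list of (epv_state_a, epv_state_b) tuples.
    expert_choices: list of "a" or "b", the state the experts rate higher.
    """
    correct = 0
    for (epv_a, epv_b), choice in zip(model_epv_pairs, expert_choices):
        if epv_a > epv_b:
            predicted = "a"
        elif epv_b > epv_a:
            predicted = "b"
        else:
            predicted = None  # assumption: a tie never counts as agreement
        correct += predicted == choice
    return correct / len(expert_choices)
```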

4 RESULTS

This section presents our main results and ﬁndings,

beginning with a feature ablation study. Building

upon the optimal feature set identiﬁed, we subse-

quently present the results of an architectural study

aimed at determining the ideal number of parame-

ters for OJN-EPV. The best-performing architecture

is then employed to evaluate the performance of the

resulting model. Losses and calibration metrics in this

section are computed on held-out splits of the Eredi-

visie and 2022 FIFA Men’s World Cup datasets (Ta-

bles 2 and 3). The OJN-Pass-EPV benchmark in Sec-

tion 4.6 is a separate, expert-validated set of 50 paired

game states used only for relative evaluation.

4.1 Feature Ablation Study

In our feature ablation study, we investigate the ef-

fects of various feature combinations on model per-

formance. Across all models, adding features beyond

the fundamental features (i) player positions and ve-

locities, (ii) distance to the ball for every location, and

(iii) ball height yields only negligible performance

gains in aggregate metrics such as loss and ECE. In

practice, the only additional features we include are

distance to goal and angle to goal. Closer inspec-

tion of the value models, including visual evaluations

of predicted probability surfaces, reveals that these

two features substantially improve contextual accu-

racy. Without them, the model underestimates value

in scenarios near the penalty area. We retain these

features speciﬁcally because they improve benchmark

performance even when global loss and calibration

change only marginally.

4.2 Architecture Study

To validate the model architecture, we examine various setups, specifically focusing on the number of filters (8, 16, 32) and the kernel size (3×3, 5×5). Using only 8 filters significantly reduces performance for all models except the pass value models. Models with 32 filters perform slightly better than those with 16 filters, but because the improvement is small and the parameter count is significantly higher, we use 16 filters with 5×5 kernels for the OJN-EPV model. This configuration results in 372,355 parameters for the pass success model, 372,359 for the pass likelihood model, and 373,201 parameters for both pass value models (successful and unsuccessful).

4.3 Model Performance

This section assesses the performance of the OJN-

EPV model, focusing on the loss and calibration met-

rics. We use ECE with 10 bins as the calibration met-

ric. To enhance model calibration, we apply tem-

perature scaling to the pass success and pass value

models. All models, except one, demonstrate optimal

calibration with a temperature of 1.0, indicating no

need for temperature scaling. Only the pass value model for unsuccessful passes, trained and validated on the Eredivisie data, exhibits improved calibration with a temperature value of 1.1. We do not report ECE for the pass likelihood surface, because it is a spatial distribution (softmax over 104 × 68 endpoints) rather than a binary

probability; bin-based calibration is not well-posed

for this output. Table 2 presents the overall perfor-

mance metrics for models trained on Eredivisie data,

while Table 3 shows results for models ﬁne-tuned on

2022 FIFA Men’s World Cup data. Loss and ECE

curves remain stable over epochs, indicating consis-

tent convergence behavior.

The per-class loss and ECE of the pass value models, shown in Tables 4 and 5, give a more detailed picture of how the models handle each outcome. Loss and ECE are low for all three classes, indicating that the models forecast well whether a pass is followed by the team in possession scoring, conceding, or neither within a window of 15 seconds.

4.4 Inﬂuence of Ball Height on

Predictions

Incorporating the ball’s height (z-axis) into the model

signiﬁcantly impacts the pass likelihood predictions

by adding the crucial vertical dimension to the anal-

ysis. Figure 2 illustrates this by contrasting two sce-

narios: a ground pass (Figure 2a) and an aerial pass

with the ball at 2 meters high (Figure 2b). When

trained without the ball height feature on the Eredi-

visie dataset, the pass likelihood model achieves a

loss of 4.7478, compared to 4.7225 with ball height

included (as shown in Table 2). While this aggre-

gate metric shows only minor changes, reﬂecting the

relative rarity of aerial passes, Figure 2 demonstrates

the practical importance of considering ball height for

speciﬁc passing situations, where the model recog-

nizes that aerial passes can be made over opponents

while also showing increased uncertainty about the

pass destination.

(a) Ground pass scenario (z=0 m). Red team in possession;

yellow dot marks the ball carrier belonging to the red team;

pink dot marks the ball.

(b) Aerial pass scenario (z=2 m). Same game state as top

panel but with the ball at 2 meters in the z-axis.

Figure 2: Impact of ball height on pass likelihood pre-

diction. Top: ground pass (z=0 m) with blocked passing

lanes; Bottom: same state with an aerial pass (z=2 m) that

can be played over defenders. Aerial passes yield broader,

more diffuse likelihoods, reﬂecting lower precision relative

to ground passes.

4.5 Risk and Reward Decomposition

Decomposing pass EPV into reward and risk compo-

nents offers a more nuanced perspective on the com-

plexities of passing decisions in football. By sepa-

rating the potential positive and negative impacts of a

pass, we can gain deeper insights into the underlying

volatility of seemingly straightforward game states.

Figure 3 depicts one such game state, where the de-

composition of EPV reveals the trade-offs inherent in

a passing decision.

Table 2: Loss and ECE on both datasets for models trained on Eredivisie data. Note: ERE = Eredivisie; WC = World Cup.

Model                       Loss (ERE)   ECE (ERE)   Loss (WC)   ECE (WC)
Pass Success                0.1558       0.0024      0.1355      0.0122
Pass Likelihood             4.7225       -           4.4528      -
Pass Value (Successful)     0.0689       0.0016      0.0835      0.0060
Pass Value (Unsuccessful)   0.0663       0.0042      0.0726      0.0056

Table 3: Loss and ECE on both datasets for models fine-tuned on 2022 FIFA Men’s World Cup data. Note: ERE = Eredivisie; WC = World Cup.

Model                       Loss (ERE)   ECE (ERE)   Loss (WC)   ECE (WC)
Pass Success                0.1568       0.0090      0.1326      0.0047
Pass Likelihood             4.7227       -           4.4367      -
Pass Value (Successful)     0.0687       0.0024      0.0836      0.0065
Pass Value (Unsuccessful)   0.0671       0.0045      0.0740      0.0050

In this game state, the pass value model trained on successful passes assigns a slightly negative overall value (i.e., −0.0047) for the end location of the pass marked with a “+”. This implies that even if the

pass is completed, the blue team is still considered

more likely to score within 15 seconds than the red

team. Building on nuanced risk-reward assessments

in passing, such as Goes et al. (2022) who quantiﬁed

risk via interception probability and reward through

multiple tactical factors, our analysis, grounded in the

EPV framework, offers a complementary perspective

focused on ultimate outcomes. Instead of solely fo-

cusing on pass completion or immediate tactical ad-

vantage, we decompose the predicted pass value into

’reward’ (the probability of scoring within 15 sec-

onds) and ’risk’ (the probability of conceding within

15 seconds). This direct link to future goal events

allows for evaluating the potential volatility and net

goal impact inherent in different passing options, dis-

tinct from the immediate risk of turnover or speciﬁc

tactical gains. Even a successful pass can lead to an

unpredictable game state, potentially detrimental for

the team in possession, depending on factors such as

opponent pressure. This assessment is based on po-

tential outcomes: the blue team has a 0.0199 proba-

bility of scoring compared to 0.0152 for the red team.

This analysis illustrates the complex trade-offs that

can be present in passing decisions.
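As a worked check of the numbers in this example, the value of the completed pass is the scoring probability of the team in possession (red) minus its conceding probability, that is, the scoring probability of the blue team. The subscript "success" is notation introduced here for the value surface conditioned on a successful pass:

```latex
\[
V_{\text{success}}(x, y)
  = P_{\text{score}}(x, y \mid \text{success}) - P_{\text{concede}}(x, y \mid \text{success})
  = 0.0152 - 0.0199
  = -0.0047
\]
```

The reward (0.0152) and the risk (0.0199) are both small in absolute terms, yet their difference is enough to make the net value of the completed pass negative.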

OJN-EPV quantiﬁes these potential outcomes (re-

ward vs. risk for different options) to inform player

and coach decision-making, rather than prescribing a

single ’best’ action, which may depend on factors be-

yond the model’s scope (e.g., game state, tactical in-

structions, individual player risk tolerance).

4.6 Benchmark Performance and

Validation

Before incorporating distance to goal and angle to goal features in our pass value models, the benchmark performance for the model trained only on the Eredivisie dataset is 68% on the OJN-Pass-EPV benchmark, rising to 70% after fine-tuning on the 2022 FIFA Men’s World Cup data. Once we add these two features, the global loss and calibration metrics do not show a pronounced improvement. However, the benchmark performance for both the Eredivisie-trained and 2022 FIFA Men’s World Cup fine-tuned models increases to 78%.

Figure 3: Pass value decomposition in a single game state. Red team is in possession; the ball carrier is marked with a yellow dot; the intended pass destination is marked by “+”. The example illustrates how OJN-EPV can assign slightly negative overall value even for a completed pass when conceding risk outweighs reward in the horizon of 15 seconds.

5 DISCUSSION

This section discusses our ﬁndings and their implica-

tions for football analytics.

5.1 Technical Achievements

5.1.1 U-Net Architecture Performance

Our results indicate that the U-Net architecture delivers strong performance on the OJN-Pass-EPV benchmark. Its effectiveness stems from the U-Net’s ability to first capture broad, contextual information like the overall team formation in its encoder, and then re-integrate fine-grained local details, such as player proximity, using skip connections to produce a spatially precise output.

Table 4: Pass value model - loss and ECE by class for model trained on Eredivisie data.

Class                           Loss (Eredivisie)   ECE (Eredivisie)   Loss (WC)   ECE (WC)
Scoring goal (Successful)       0.0590              0.0018             0.0689      0.0048
No goal (Successful)            0.0660              0.0040             0.0791      0.0035
Conceding goal (Successful)     0.0096              0.0023             0.0144      0.0022
Scoring goal (Unsuccessful)     0.0379              0.0060             0.0357      0.0074
No goal (Unsuccessful)          0.0620              0.0105             0.0663      0.0134
Conceding goal (Unsuccessful)   0.0271              0.0041             0.0362      0.0062

Table 5: Pass value model - loss and ECE by class for model fine-tuned on 2022 FIFA Men’s World Cup data.

Class                           Loss (Eredivisie)   ECE (Eredivisie)   Loss (WC)   ECE (WC)
Scoring goal (Successful)       0.0590              0.0011             0.0693      0.0057
No goal (Successful)            0.0658              0.0032             0.0793      0.0042
Conceding goal (Successful)     0.0093              0.0021             0.0141      0.0020
Scoring goal (Unsuccessful)     0.0385              0.0064             0.0366      0.0081
No goal (Unsuccessful)          0.0619              0.0095             0.0661      0.0125
Conceding goal (Unsuccessful)   0.0272              0.0039             0.0365      0.0058

The choice of 16 ﬁlters with 5x5 kernels balances

model complexity and performance. While 32 ﬁlters

give marginal gains, the parameter cost is dispropor-

tionate.

5.1.2 Ball Height Impact

Incorporating ball height improves the model’s abil-

ity to distinguish between ground and aerial passes.

As shown in the Results, the model recognizes that

aerial passes can bypass defenders while exhibiting

increased spatial uncertainty, aligning with domain

knowledge that headers are less precise than ground

passes.

5.1.3 Risk-Reward Decomposition

Decomposing pass value into separate reward and

risk components provides a more nuanced view than

single-value EPV approaches. By quantifying the

probability of scoring (reward) and conceding (risk)

for both successful and unsuccessful passes, the

model surfaces the volatility of different options.

5.2 Methodological Considerations

5.2.1 Benchmark Design

The OJN-Pass-EPV benchmark represents a ﬁrst step

toward standardized evaluation in EPV modeling.

By focusing on relative comparisons between paired

game states rather than absolute values, we reduce

subjectivity in target labels. Unanimous expert agree-

ment ensures high-conﬁdence ground truth, though it

limits the set to 50 pairs. The public repository in-

cludes the 50 paired states and the expert selections.

5.2.2 Training Strategy

We train on Eredivisie data and ﬁne-tune on 2022

FIFA Men’s World Cup data; similar qualitative

behavior across both suggests the model captures

general football principles rather than competition-

speciﬁc patterns.

5.3 Comparison with Previous Work

Our attempt to implement the Fernández et al. (2021) model as a baseline ran into vanishing gradients during training, which prevented a direct comparison and motivated our U-Net approach with skip connections and LeakyReLU activations (Glorot et al., 2011).

The OJN-Pass-EPV benchmark provides a quantitative measure previously unavailable in the field. Incorporating distance and angle to goal improves relative value assessment even when aggregate loss and calibration metrics change only marginally.

5.4 Practical Implications

We illustrate how our model can be employed as a decision support tool, utilizing Definition 1 by evaluating the model surface across the pitch (Figure 4). The figure shows the locations that our model estimates to be optimal given the current game state. As discussed in subsection 4.5, these outputs should be considered a supportive tool for practitioners, taking into account current limitations such as the model’s player-agnostic nature.

Beyond benchmark accuracy, OJN-EPV supports player- and team-level decision profiling. At the player level, analysts can quantify a risk profile by measuring the share of high-variance options (high risk–high reward) selected relative to available alternatives, and detect systematic biases toward conservative choices. At the team level, aggregated risk–reward tendencies reveal phase- or opponent-specific strategies (e.g., increased aerial risk under high press). These diagnostics help align coaching intent with on-pitch execution.

Figure 4: Decision analysis with OJN-EPV. Red team is in possession; the ball carrier is marked with a yellow dot. The player’s actual pass destination is marked by “+”; the model-recommended optimal location is circled (Definition 1). The comparison highlights potential areas for decision-making refinement.

OJN-EPV supports applied analysis across the match cycle: pre-match scouting of opponent passing patterns, half-time assessment of risk-taking shifts, and post-match comparison of actual decisions to model surfaces.

5.5 Limitations & Future Work

Three benchmark errors are attributable to offside; rule-aware masking of offside receivers at pass start would likely correct these cases. Offside is defined for receivers at pass start, not for abstract endpoints, so a location-only mask is inadequate. A practical remedy is a post-processing step that assigns candidate endpoints to intended receivers and sets S(x, y) = 0 for receivers who are offside at pass start.
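A minimal sketch of such a post-processing step is given below. It rests on several assumptions that are not part of our model: each candidate endpoint is assigned to the nearest teammate (a stand-in for the intended receiver), the attacking team plays toward increasing x, and offside is approximated from x-coordinates only (opponent half, ahead of the ball, ahead of the second-last defender). The function names, the grid layout, and the 1 m cell size default are hypothetical.

```python
import numpy as np


def offside_flags(receivers_x, ball_x, defenders_x, halfway_x):
    """Simplified offside test at pass start, attacking toward increasing x."""
    receivers_x = np.asarray(receivers_x, dtype=float)
    # The second-last defender (goalkeeper included) sets the offside line.
    second_last = np.sort(np.asarray(defenders_x, dtype=float))[-2]
    return (receivers_x > halfway_x) & (receivers_x > ball_x) & (receivers_x > second_last)


def mask_offside_endpoints(surface, receivers_xy, offside, cell_size=1.0):
    """Zero every cell whose nearest receiver is offside at pass start."""
    receivers_xy = np.asarray(receivers_xy, dtype=float)
    offside = np.asarray(offside, dtype=bool)
    height, width = surface.shape
    # Cell-centre coordinates in metres: x along columns, y along rows.
    xs = (np.arange(width) + 0.5) * cell_size
    ys = (np.arange(height) + 0.5) * cell_size
    grid_x, grid_y = np.meshgrid(xs, ys)  # both shaped (height, width)
    # Squared distance from each cell to each receiver: (n_receivers, height, width).
    dx = grid_x[None, :, :] - receivers_xy[:, 0, None, None]
    dy = grid_y[None, :, :] - receivers_xy[:, 1, None, None]
    nearest = np.argmin(dx**2 + dy**2, axis=0)
    masked = surface.copy()
    masked[offside[nearest]] = 0.0
    return masked


# Toy example on a 4 x 6 grid with two receivers, the second one offside.
toy_flags = offside_flags([60.0, 90.0], ball_x=70.0,
                          defenders_x=[80.0, 85.0, 100.0], halfway_x=52.5)
toy_surface = np.ones((4, 6))
toy_masked = mask_offside_endpoints(toy_surface, [[1.0, 2.0], [5.0, 2.0]], [False, True])
print(toy_flags)
print(toy_masked)
```

Nearest-receiver assignment is the weakest assumption here: a through ball into space may be intended for a player who is not the closest one to the endpoint, so a receiver model based on reachability would be a more faithful choice.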

Future work includes implementing explicit offside handling via receiver-aware masking, expanding the OJN-Pass-EPV set beyond 50 pairs, increasing spatial resolution beyond 1 m to capture finer nuances, and developing player-specific models that account for individual abilities and preferences.

6 CONCLUSION

We present OJN-EPV with four core contributions: a U-Net architecture for spatial EPV at modest scale (372K parameters), incorporation of ball height to distinguish aerial from ground passes, a risk-reward decomposition defined for both successful and unsuccessful passes, and the OJN-Pass-EPV benchmark of 50 expert-validated pairs for standardized, relative evaluation.

Our model demonstrates strong performance across both Eredivisie and World Cup datasets, achieving low loss, robust calibration, and 78% accuracy on our benchmark. These results affirm the value of each contribution: the U-Net architecture produces high-quality spatial EPV surfaces, ball height integration refines pass predictions, the risk-reward framework offers actionable insights, and the benchmark provides a necessary tool for standardized evaluation, thereby answering RQ1–RQ4.

To foster transparency and drive further innova-

tion, the OJN-Pass-EPV dataset is publicly available

at https://github.com/EAISI/OJN-EPV-benchmark.

This resource provides the community with a stan-

dardized tool to rigorously compare and validate

new EPV models, moving beyond aggregate met-

rics toward expert-aligned relative assessments.

Ultimately, this work not only introduces a novel,

high-performing EPV model but also establishes

a more robust methodological foundation for its

evaluation, setting a new baseline for future research

in football analytics.

ACKNOWLEDGEMENTS

We thank the Royal Dutch Football Association (KNVB) for providing data access and the expert panel for benchmark validation. We used OpenAI’s GPT-4 to review the manuscript for grammar and spelling. No content, data, code, figures, or results were generated or altered by AI tools; the authors remain fully responsible for the content.

REFERENCES

Anzer, G. and Bauer, P. (2022). Expected passes: Determining the difficulty of a pass in football (soccer) using spatio-temporal data. Data Mining and Knowledge Discovery, 36(1):295–317.

Bransen, L., Van Haaren, J., and van de Velden, M. (2019). Measuring soccer players’ contributions to chance creation by valuing their passes. Journal of Quantitative Analysis in Sports, 15(2):97–116.



Davis, J., Bransen, L., Devos, L., Jaspers, A., Meert, W.,

Robberechts, P., Van Haaren, J., and Van Roy, M.

(2024). Methodology and evaluation in sports ana-

lytics: Challenges, approaches, and lessons learned.

Machine Learning, 113(9):6977–7010.

Decroos, T., Bransen, L., Van Haaren, J., and Davis, J.

(2019). Actions speak louder than goals: Valuing

player actions in soccer. In Proceedings of the 25th

ACM SIGKDD International Conference on Knowl-

edge Discovery & Data Mining, pages 1851–1861.

Fernández, J., Bornn, L., and Cervone, D. (2019). Decomposing the immeasurable sport: A deep learning expected possession value framework for soccer. In 13th MIT Sloan Sports Analytics Conference.

Fernández, J., Bornn, L., and Cervone, D. (2021). A framework for the fine-grained evaluation of the instantaneous expected value of soccer possessions. Machine Learning, 110(6):1389–1427.

Glorot, X., Bordes, A., and Bengio, Y. (2011). Deep sparse rectifier neural networks. In Gordon, G., Dunson, D., and Dudík, M., editors, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), volume 15, pages 315–323, Fort Lauderdale, FL, USA. PMLR.

Goes, F., Schwarz, E., Elferink-Gemser, M., Lemmink,

K., and Brink, M. (2022). A risk-reward assessment

of passing decisions: comparison between positional

roles using tracking data from professional men’s soc-

cer. Science and Medicine in Football, 6(3):372–380.

Håland, E. M., Wiig, A. S., Stålhane, M., and Hvattum, L. M. (2020). Evaluating passing ability in association football. IMA Journal of Management Mathematics, 31(1):91–116.

Link, D., Lang, S., and Seidenschwarz, P. (2016). Real time quantification of dangerousity in football using spatiotemporal tracking data. PLOS ONE, 11(12):e0168768.

Linke, D., Link, D., and Lames, M. (2020). Football-specific validity of TRACAB’s optical video tracking systems. PLOS ONE, 15(3):1–17.

Power, P., Ruiz, H., Wei, X., and Lucey, P. (2017). Not

all passes are created equal: Objectively measuring

the risk and reward of passes in soccer from tracking

data. In Proceedings of the 23rd ACM SIGKDD In-

ternational Conference on Knowledge Discovery and

Data Mining, pages 1605–1613, Halifax, NS, Canada.

Association for Computing Machinery.

Ronneberger, O., Fischer, P., and Brox, T. (2015). U-

net: Convolutional networks for biomedical im-

age segmentation. In Medical Image Computing

and Computer-Assisted Intervention–MICCAI 2015,

pages 234–241. Springer.

Rudd, S. (2011). A framework for tactical analysis and individual offensive production assessment in soccer using Markov chains. NESSIS 2011 presentation. (Accessed April 8, 2025).

Singh, K. (2019). Expected threat. Online Resource.

Smith, L. N. (2017). Cyclical learning rates for training

neural networks. In 2017 IEEE Winter Conference on

Applications of Computer Vision (WACV), pages 464–

472. IEEE.
