
Forecasting Bluefin Tuna Trends

Group MDA Project

Project Type:

Time Series Forecasting, RStudio

Date:

GitHub Repo:

Project Background

This project forecasts monthly wholesale prices of Bluefin tuna at Tokyo's historic Tsukiji market, using data from 2003 to 2016. Bluefin tuna is a high-demand, high-value commodity, and accurate price forecasts can help buyers, sellers, and sustainability stakeholders make better operational and procurement decisions.

Executive Summary

We developed and evaluated multiple time series forecasting models to predict tuna prices for 2017, with a focus on seasonality and economic disruptions (e.g., the 2009 dip).


Two models stood out:

  • Auto-ARIMA

  • TSLM (Time Series Linear Model)

Both models demonstrated strong accuracy with low error metrics (RMSE, MAE, MAPE). Final forecasts projected winter peaks and summer dips, consistent with historical trends. These predictions could guide smarter pricing, stockpiling, and market planning.

Data Exploration

Bluefin distribution.png

Histogram of Bluefin Prices: The histogram indicates a left-skewed distribution with a peak near 3,500 yen per kg. Prices range widely, but the majority of observations fall within this price bracket.

Boxplot of Monthly Price Variation: Prices tend to be higher and more volatile in the winter months (December through March), which could indicate increased demand or limited supply during this period.

Lower and more stable prices in the summer months (May to August) could suggest a seasonal increase in supply or lower demand.

Price Variation Bluefin.png

Price of Caught Fresh Bluefin Tuna by the Japanese Fleet Over Time: The peaks and valleys suggest a seasonal pattern; the STL decomposition gives the specifics.

STL Decomposition of Trend Line of Fresh Bluefin Tuna: The trend component shows no sustained upward or downward movement, apart from one very large price decrease in 2009. The seasonal component (season_year) suggests multiplicative seasonality.

Bluefin stl.png
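Multiplicative seasonality is the reason a log transform shows up later in the model list. The following is a minimal Python sketch of that idea, not the project's R code: the level of 3,500 yen per kg echoes the histogram peak, while the seasonal factors are invented purely for illustration. It shows that the seasonal swing scales with the price level on the raw scale but stays constant on the log scale.

```python
import math

# Hypothetical inputs: a flat price level and made-up monthly seasonal factors
# (January first). Only the 3,500 yen/kg level echoes the text above.
LEVEL = 3500.0
SEASONAL_FACTORS = [1.30, 1.20, 1.10, 1.00, 0.90, 0.85,
                    0.85, 0.90, 0.95, 1.00, 1.05, 1.25]


def multiplicative_year(level, factors):
    """One year of prices where each month is level * seasonal factor."""
    return [level * f for f in factors]


def swing(series):
    """Peak-to-trough range of a series."""
    return max(series) - min(series)


low_year = multiplicative_year(LEVEL, SEASONAL_FACTORS)
high_year = multiplicative_year(2 * LEVEL, SEASONAL_FACTORS)

# Raw scale: doubling the level doubles the seasonal swing.
raw_ratio = swing(high_year) / swing(low_year)

# Log scale: the swing no longer depends on the level (additive seasonality).
log_swing_low = swing([math.log(v) for v in low_year])
log_swing_high = swing([math.log(v) for v in high_year])
log_swing_gap = log_swing_high - log_swing_low

print(f"raw swing ratio: {raw_ratio:.3f}, log swing gap: {log_swing_gap:.3e}")
```

After the transform, a linear model with additive month terms can represent the seasonal pattern, which is why a log-transformed linear model is a natural candidate for this kind of series.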

Forecasting

Model Selection

Training Data Isolation: We held out the last two years of data (24 months) as the test set and used all earlier observations as the training set for model building.
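The holdout split can be sketched in a few lines of Python. This is an illustration rather than the project's R code: `split_train_test` is a hypothetical helper, and the placeholder series only stands in for the 168 monthly observations mentioned later on this page.

```python
def split_train_test(series, test_size=24):
    """Split a time-ordered series, keeping the last `test_size` points for testing."""
    if not 0 < test_size < len(series):
        raise ValueError("test_size must be between 1 and len(series) - 1")
    return series[:-test_size], series[-test_size:]


# Placeholder values standing in for 168 monthly prices (2003 to 2016).
monthly_prices = list(range(168))

train, test = split_train_test(monthly_prices, test_size=24)
print(len(train), len(test))  # 144 24
```

The split is by position, never random, so the test set always lies strictly after the training set in time.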

The data has monthly multiplicative seasonality and is therefore non-stationary. We fit three candidate models:

  • auto_ARIMA

  • auto_ETS

  • Log-transformed TSLM

Because of the dip in 2009 (shown in the STL decomposition), we added a recession indicator variable.
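A recession indicator is a 0/1 column aligned with the monthly series. The Python sketch below shows one way to build it. The page does not say which months were flagged, so marking all twelve months of 2009 is an assumption, and `month_index` and `recession_indicator` are hypothetical helper names.

```python
def month_index(start_year, n_months):
    """Return (year, month) tuples for n_months consecutive months from January."""
    return [(start_year + i // 12, i % 12 + 1) for i in range(n_months)]


def recession_indicator(months, start=(2009, 1), end=(2009, 12)):
    """Return 1 for months inside [start, end] and 0 otherwise.

    The default window (all of 2009) is an assumption made for this sketch.
    """
    return [1 if start <= ym <= end else 0 for ym in months]


months = month_index(2003, 168)          # January 2003 through December 2016
recession = recession_indicator(months)

print(months[0], months[-1], sum(recession))  # (2003, 1) (2016, 12) 12
```

In a regression model the column enters as one extra predictor, so the 2009 drop is absorbed by its coefficient instead of distorting the seasonal estimates.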

Bluefin forecast 2.png

Auto-ARIMA and TSLM had the lowest RMSE, MAE, and MAPE of the three models, so we carried both into cross-validation.
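For reference, the three error metrics can be computed as in the Python sketch below. The actual and predicted values are made-up numbers chosen so the arithmetic is easy to check by hand; they are not results from this project.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the data (yen per kg here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; requires non-zero actual values."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up example values (yen per kg), not project results.
actual = [3000.0, 4000.0, 5000.0]
predicted = [3300.0, 3800.0, 5000.0]

print(f"RMSE={rmse(actual, predicted):.1f}  "
      f"MAE={mae(actual, predicted):.1f}  "
      f"MAPE={mape(actual, predicted):.1f}%")
```

RMSE and MAE are in yen per kg, while MAPE is scale-free, which makes it the easiest of the three to communicate to non-technical stakeholders.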

Cross-validation: the data set was split sequentially so that each model was trained and tested multiple times, which checks that its accuracy is stable.

Parameters: There are 168 available observations. We used 80 observations in the first training period and stepped forward 3 observations at a time, resulting in 30 different training periods.

The formula to calculate the number of training periods is:

image.png
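The formula image is not reproduced here. One count that is consistent with the stated numbers is floor((168 - 80) / 3) + 1 = 30. The Python sketch below is a reconstruction under that assumption, not the authors' code; it enumerates expanding training windows, and `training_window_ends` and `count_training_periods` are hypothetical names.

```python
def training_window_ends(n_obs, initial, step):
    """End positions (exclusive) of expanding training windows.

    The first window covers observations [0, initial); each later window
    adds `step` more observations, as long as it fits inside the data.
    """
    return list(range(initial, n_obs + 1, step))


def count_training_periods(n_obs, initial, step):
    """Closed-form count: floor((n_obs - initial) / step) + 1."""
    return (n_obs - initial) // step + 1


ends = training_window_ends(n_obs=168, initial=80, step=3)
print(len(ends), ends[0], ends[-1])          # 30 80 167
print(count_training_periods(168, 80, 3))    # 30
```

Each window is then used to fit a model and forecast the observations that follow it, so every fold is evaluated only on data the model has not seen.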

Auto-ARIMA performed best. Its residuals appear homoskedastic and normally distributed. There is one significant autocorrelation at lag 9, but it is small and does not occur at lags 1 or 2, so it is not a major concern.
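The residual check can be illustrated with a small Python sketch of the sample autocorrelation function and the usual approximate 95% bound of ±1.96/√n. This is a generic illustration, not the diagnostic output of the project's model, and the alternating toy series is chosen only because its autocorrelations are easy to verify by hand.

```python
import math


def acf(series, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose autocorrelation falls outside the approximate 95% bound."""
    bound = 1.96 / math.sqrt(len(series))
    return [k for k, r in enumerate(acf(series, max_lag), start=1) if abs(r) > bound]


# Toy input: a strictly alternating series, which is strongly autocorrelated.
toy = [1.0, -1.0] * 20

print([round(r, 3) for r in acf(toy, 2)])  # [-0.975, 0.95]
print(significant_lags(toy, 2))            # [1, 2]
```

With many lags tested at the 95% level, an occasional isolated spike is expected by chance, which is one reason a single modest spike at a higher lag is usually treated as minor.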

Final Forecast:

Recommendations

  • For Fish Markets & Distributors: Plan bulk purchases and pricing strategies for the summer months (May-August), when prices are consistent.

  • For Restaurant Supply Chains: Anticipate lower prices in summer for promotional offers or limited-time dishes to increase customer traffic.


  • For Researchers or Economists: Incorporate external market signals (e.g., recessions, import bans) into future models to improve accuracy.

Citation

tcashion. (n.d.). Tokyo wholesale tuna prices [Dataset]. Kaggle. Retrieved June 25, 2025, from https://www.kaggle.com/datasets/tcashion/tokyo-wholesale-tuna-prices

Thank you! Please message any suggestions.
