Abstract
When voyage report data are used as the main data source for ship fuel efficiency analysis, the weather and sea condition information they contain is often regarded as unreliable. To address this issue, this study uses AIS data to obtain the ship's actual geographical positions along its sailing trajectory and then retrieves the corresponding weather and sea condition information from publicly accessible meteorological data sources. These more reliable data on the weather and sea conditions the ship sails through are fused into the voyage report data to improve the accuracy of ship fuel consumption rate models. Eight containerships of 8100 TEU to 14,000 TEU from a global shipping company were used in the experiments. For each ship, nine datasets were constructed through data fusion, and eleven widely adopted machine learning models were tested. The experimental results show that fusing voyage report data, AIS data, and meteorological data improves the fit performance of machine learning models for forecasting ship fuel consumption rate. On the best datasets, several decision tree-based models perform promisingly, including extremely randomized trees (ET), AdaBoost (AB), Gradient Tree Boosting (GB), and XGBoost (XG). With these datasets, their R² values on the training sets are all above 0.96 and mostly reach 0.99–1.00, while their R² values on the test sets range from 0.75 to 0.90. The fit errors of ET, AB, GB, and XG on daily bunker fuel consumption, measured by RMSE and MAE, are usually between 0.8 and 4.5 ton/day. These results are slightly better than those of our previous study, which confirms the benefit of using the ship's actual geographical positions recorded in AIS data, rather than estimated positions derived from the great circle route, to retrieve the weather and sea conditions the ship sails through.
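To make the fusion idea concrete, the following is a minimal sketch in Python, not the study's actual pipeline. It assumes that each voyage report covers a time window, that weather can be sampled at each AIS position inside that window, and that the samples are combined by a simple average. The names `AisPoint`, `VoyageReport`, `fuse_report`, and `toy_lookup`, the weather variables, and all numbers are hypothetical and chosen only for illustration; a real lookup would query a meteorological data source.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class AisPoint:
    time: datetime
    lat: float
    lon: float


@dataclass
class VoyageReport:
    start: datetime
    end: datetime
    fuel_tons_per_day: float


# Hypothetical signature: (lat, lon, time) -> weather variables at that point.
WeatherLookup = Callable[[float, float, datetime], Dict[str, float]]


def fuse_report(report: VoyageReport, track: List[AisPoint],
                lookup: WeatherLookup) -> Dict[str, float]:
    """Attach weather averaged over the AIS positions inside one report window.

    Simple averaging is an assumption made for this sketch.
    """
    points = [p for p in track if report.start <= p.time < report.end]
    row: Dict[str, float] = {
        "fuel_tons_per_day": report.fuel_tons_per_day,
        "n_ais_points": len(points),
    }
    if not points:
        return row  # no AIS coverage: leave the weather features out
    samples = [lookup(p.lat, p.lon, p.time) for p in points]
    for key in samples[0]:
        row[f"mean_{key}"] = mean(s[key] for s in samples)
    return row


def toy_lookup(lat: float, lon: float, t: datetime) -> Dict[str, float]:
    """Stand-in for a meteorological data source; values are made up."""
    return {"wind_speed_ms": 5.0 + lat, "wave_height_m": 1.0 + lon / 100.0}


t0 = datetime(2020, 1, 1)
track = [
    AisPoint(t0 + timedelta(hours=1), 1.0, 100.0),
    AisPoint(t0 + timedelta(hours=2), 3.0, 200.0),
    AisPoint(t0 + timedelta(hours=30), 9.0, 300.0),  # outside the report window
]
report = VoyageReport(t0, t0 + timedelta(hours=24), fuel_tons_per_day=100.0)

row = fuse_report(report, track, toy_lookup)
print(row)
```

Each fused row pairs the reported fuel consumption with weather features sampled along the track actually sailed, which is the kind of input a fuel consumption rate model could then be trained on.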