INTRODUCTION

Mobility information extracted from Mobile Phone Data (MPD) has improved substantially in recent years. MPD is collected continuously, 24 hours a day, every day of the year (24/7/365). In the Netherlands these data are prepared by Mezuro based on data from the mobile network operator. Using these data (over 17 billion location-based events per month), regular and irregular traffic patterns can be determined at local, regional and national scales for any time period, including the average working day, which is commonly used for transport policy purposes.

During the last few years we have made substantial improvements in processing and augmenting MPD to create reliable information that can be used for traffic monitoring and for high-quality origin-destination (OD) matrices in transport models. In this paper we share our experiences, results from ongoing research, and our vision on the use of MPD in the near future.

In transport models many different data sources are used, e.g. transport network, socio-economic data, survey data, ANPR data and traffic counts. Our studies show that new techniques that include MPD can make the modelling process more efficient and create higher quality models.
Regarding the improvement of OD-matrices in transport models, we performed several studies with the Dutch National Model System and the transport model of the Metropolitan Region Rotterdam–The Hague. We learned that the trip distribution, i.e. the structure of the synthetic OD-matrix, can be improved significantly using MPD. For example, MPD performs much better for OD-relations that are difficult to capture with a gravity model because they stem from historical patterns, e.g. spatial policy, such as the Zoetermeer–The Hague connection. Also, MPD-based OD-matrices compare much better with commuter data than traditionally built synthetic OD-matrices.

Recently we carried out several use cases with MPD commissioned by the Dutch National Data Warehouse for Traffic Information (NDW) to explore the possibilities to use origin destination information based on mobile sources. These use cases comprise the assessment of flows at intersections, selected-link analyses on motorways, and OD-analyses at regional levels.

Lastly, a new use of MPD concerns transport modelling for Out-of-Home (OOH) advertising. For OOH, complete traffic flows, i.e. motorized traffic and slow modes, must be assessed at billboard locations. This information is used to determine the value of different advertising locations. Clearly, the availability of up-to-date, dynamic OD-data at a national scale is highly beneficial for this purpose.

TRAVEL INFORMATION FROM MOBILE PHONE DATA

Living in a world without a smartphone is no longer imaginable. 93% of the Dutch adult population (18+) owns a smartphone, which makes the Netherlands the number one smartphone country (Deloitte, 2017). Between 2013 and 2017 the share of smartphone users in the Netherlands rose from less than 75% to 93%. Penetration is high not only among young adults but also among the elderly, and is above 90% for all age classes considered (18-24, 25-34, 35-44, 45-54 and 55+ years). The biggest increase in this period was among people over 55 (from about 50% in 2014 to about 90% in 2017). Since smartphones communicate constantly with cell towers, the travel patterns of users can be determined for every hour of the day, every day of the year (24/7/365). Although the location information is less granular than GPS positioning, it can be used to position users in different zones.

For several years Mezuro and DAT.Mobility have worked together to analyze anonymized CDR/EDR data from the mobile network operator and to build Origin-Destination (OD) matrices, which can be used for several purposes in the transport domain but are also of interest for other domains. Based on over 550 million daily location observations from about 4 million active users it is possible to monitor the movement of people in the Netherlands. With a sample size of about 30% of the Dutch population, this is a very extensive data source. For comparison: OViN, the other main source in the Netherlands that provides information on movement patterns at a national level, is based on one-day travel diaries of 35,000 respondents per year (CBS, 2018).

Determination of trips

From the raw event data, trips and destinations are created by rule-based algorithms. First, the raw data are cleaned to remove noise. Some devices have a low number of events, and for these devices travel and stay information is incomplete. Also, some smartphone events are generated automatically, causing impossible travel patterns. After filtering, the locations of destinations are determined. A destination is defined as a location where a device stays for at least 30 minutes. This period is chosen to detect all major destinations and to exclude locations along a trip that are not a trip destination (e.g. traffic jams and short stops). The hour in which a trip is made is defined by the mid-time of the trip. Each origin and destination of a trip is assigned to an area (zone). In total, the Netherlands is divided into more than 1,200 areas. The national OD-matrix can be assigned to a network, resulting in link flows that are fully based on MPD. Figure 2.1 shows how event data from mobile phones lead to road volumes.

Figure 2.1: from mobile phone event data to road volumes
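The rule-based steps described above can be illustrated with a minimal sketch. It assumes that the cleaned events of one device are available as a time-sorted list of (timestamp, zone) pairs; the function names and the handling of consecutive stays in the same zone are assumptions made for illustration and do not describe the production algorithms.

```python
from datetime import datetime, timedelta

MIN_STAY = timedelta(minutes=30)  # stay threshold taken from the text


def find_destinations(events):
    """events: time-sorted list of (timestamp, zone) for one device.

    Returns (zone, arrival, departure) for every run of consecutive events
    in the same zone that spans at least MIN_STAY.
    """
    stays = []
    i = 0
    while i < len(events):
        j = i
        # Extend the run while the device is observed in the same zone.
        while j + 1 < len(events) and events[j + 1][1] == events[i][1]:
            j += 1
        arrival, departure = events[i][0], events[j][0]
        if departure - arrival >= MIN_STAY:
            stays.append((events[i][1], arrival, departure))
        i = j + 1
    return stays


def build_trips(stays):
    """A trip connects two consecutive destinations.

    The trip hour is the hour of the mid-time between departure and arrival.
    Consecutive stays in the same zone are skipped (an assumption).
    """
    trips = []
    for (origin, _, dep), (dest, arr, _) in zip(stays, stays[1:]):
        if origin == dest:
            continue
        mid_time = dep + (arr - dep) / 2
        trips.append((origin, dest, mid_time.hour))
    return trips
```

In this sketch a single event in a zone passed on the way (for example a traffic jam) never reaches the 30-minute threshold and is therefore not treated as a destination.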

The privacy of the data is guaranteed by one-way hashing of phone numbers and by allowing only aggregates that contain more than 15 users at OD-level. The hash is renewed every month, so unique devices can be linked from day to day within one month, which makes it possible to derive trip frequencies (within that month). Aggregates must have a device count of more than 15 to make it impossible to derive individual trip information. Because trips are first aggregated over multiple days and the average number of trips per day is determined afterwards, the reported number of trips per day may be lower than 16.
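A minimal sketch of these two privacy rules is given below. The use of SHA-256 and of a secret key that changes every month (`month_key`) are assumptions for illustration; the text only states that phone numbers are hashed one-way and that the hash changes monthly.

```python
import hashlib
from collections import defaultdict

MIN_DEVICES = 15  # aggregates must contain MORE than 15 devices


def pseudonymize(phone_number, month_key):
    """One-way hash of a phone number.

    month_key is a hypothetical secret that is replaced every month, so the
    same number maps to a different identifier in each month.
    """
    return hashlib.sha256(f"{month_key}:{phone_number}".encode()).hexdigest()


def aggregate_od(trips):
    """trips: iterable of (device_id, origin, destination).

    Returns {(origin, destination): trip_count}, keeping only OD pairs
    that were observed for more than MIN_DEVICES distinct devices.
    """
    devices = defaultdict(set)
    counts = defaultdict(int)
    for device_id, origin, destination in trips:
        devices[(origin, destination)].add(device_id)
        counts[(origin, destination)] += 1
    return {od: n for od, n in counts.items() if len(devices[od]) > MIN_DEVICES}
```

Note that the threshold applies to the number of distinct devices, not to the number of trips, which is why an averaged daily trip count can still end up below 16.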

All trips are assigned to OD-pairs. At municipality-to-municipality level, OD-pairs that are far apart and have low numbers of inhabitants are filtered out by the privacy rule (>15). To compensate for this, the province-to-province level is used to estimate the number of trips for the missing municipality-to-municipality pairs, and subsequently, in a similar way, for the zone-to-zone level. Provinces are aggregations of municipalities. The Netherlands consists of 12 provinces and around 400 municipalities.

The observed trips are subsequently expanded, from the origin side, to population totals using validated algorithms. These algorithms consider the market share of the mobile network operator, the number of inhabitants per area and the penetration of mobile devices by age group. This results in origin-destination matrices with person movements at the national level.
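To illustrate how the three ingredients named above could be combined, the sketch below computes a simple expansion factor per origin area. The formula (inhabitants divided by the expected number of observed devices) is an assumption for illustration only; the validated algorithms themselves are not described in this paper.

```python
def expansion_factor(inhabitants_by_age, penetration_by_age, market_share):
    """Hypothetical expansion factor for one origin area.

    inhabitants_by_age: {age_group: number of inhabitants}
    penetration_by_age: {age_group: share of people owning a device}
    market_share: share of devices served by the network operator
    """
    expected_devices = sum(
        n * penetration_by_age[group] * market_share
        for group, n in inhabitants_by_age.items()
    )
    return sum(inhabitants_by_age.values()) / expected_devices


def expand_origin(od_row, factor):
    """Scale all observed trips leaving one origin area by its factor."""
    return {destination: trips * factor for destination, trips in od_row.items()}
```

For example, an area with 2,000 inhabitants, a device penetration of 80% in every age group and an operator market share of 25% would get a factor of 5 in this sketch.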

Trips are split into train and non-train trips. This division is based on the locations and characteristics of the cell towers near railway lines. Using the national average car occupancy per travel distance, non-train person trips are converted into a national matrix of vehicle trips for each unique day of the year. Figure 2.2 shows an example of the national MPD-matrix of vehicle trips assigned to the national road network, resulting in flows.
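The conversion from person trips to vehicle trips can be sketched as follows. The distance classes and occupancy values are purely illustrative placeholders, not the national averages used in the study.

```python
import bisect

# Hypothetical distance classes (km) and car occupancies (persons per vehicle).
DISTANCE_BOUNDS_KM = [10, 30, 100]   # upper bounds of the first three classes
OCCUPANCY = [1.2, 1.3, 1.5, 1.8]     # one value per class, illustrative only


def vehicle_trips(person_trips, distance_km):
    """Convert non-train person trips into vehicle trips for one OD pair."""
    distance_class = bisect.bisect_right(DISTANCE_BOUNDS_KM, distance_km)
    return person_trips / OCCUPANCY[distance_class]
```

With these placeholder values, 120 person trips over 5 km correspond to 100 vehicle trips.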

Figure 2.2: Travel flows on the Dutch road network obtained from Mobile Phone Data on a working day at 08:00

USAGES OF MOBILE PHONE DATA

In this chapter we describe several usages of OD-information obtained from mobile phone data.

Improving the Quality of Transport Models

In recent research projects we showed that it is possible to improve the quality of a transport model by adjusting the structure, i.e. the distribution, of the (synthetic) car OD-matrix.

From comparisons with OViN, the national travel survey, we found that trips with travel distances of less than 8 kilometers are under-represented in the mobile phone data set. This proved to be a bias that is hard to correct. In areas where cell towers cover a large area, detection of short-distance trips is not feasible. In cities, where many cell towers each cover a smaller area, short-distance trips are detected more easily. So, at this stage trip information from MPD is limited to trips of more than 8 kilometers.

Transport Model Rotterdam region

In Wismans et al. (2018) the a priori model for the Rotterdam region was enriched with mobile phone data. The assignment results of the enriched matrix match the counts better than those of the a priori matrix, indicating a better quality of the matrix (see Figure 3.1).

Figure 3.1: Scatter plots assignment values versus counts for a priori model and enriched model (transport model Rotterdam region)

The RMSE of the enriched model versus counts is 13% better than the RMSE of the a priori model. This means we estimated a better performing a priori OD-matrix, which indicates that the model calibration effects will become smaller. Smaller calibration effects indicate an improvement in the predictive power of a transport model. Also, the differences in the structure of the OD-matrices caused by the MPD-enrichment are consistent with the experience of regional traffic engineering experts.
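For reference, the comparison behind this figure can be sketched as follows: the RMSE is computed over all count locations for both assignments, and the improvement is expressed relative to the a priori model. The function names are chosen for illustration.

```python
import math


def rmse(modelled, counted):
    """Root mean squared error between assigned flows and traffic counts."""
    squared = [(m - c) ** 2 for m, c in zip(modelled, counted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(rmse_prior, rmse_enriched):
    """Share by which the enriched model lowers the RMSE of the a priori model."""
    return (rmse_prior - rmse_enriched) / rmse_prior
```

A value of 0.13 returned by `relative_improvement` corresponds to the 13% mentioned above.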

Based on this research the Metropolitan region Rotterdam The Hague (MRDH) has acknowledged the added value of the use of mobile phone data and decided to use it in the update of their transport model.

Strategic national model system (LMS)

The Dutch national model (called LMS) is used for two main purposes: a) answering policy questions regarding major infrastructure investments, and b) assessing the effect of regional infrastructure projects in the exploration and planning phase. We investigated to what extent the quality of the a priori car matrix of the LMS at national level can be increased by enrichment with mobile phone data; see last year's ETC paper (Joksimovic et al., 2017). The models with and without MPD-enrichment were both calibrated.

The MPD-enrichment is performed by imposing the distribution of the MPD at municipal level on the LMS-model. This implies that the number of trips at municipal level is not changed, but the distribution of the trips from and to each municipality is. An example of such a change in the distribution is the relation between the city of Zoetermeer and The Hague. Zoetermeer was developed as a suburb of The Hague, so historically many commuters living in Zoetermeer work in The Hague. This relation is clearly present in the mobile phone data, and much more strongly than in the LMS a priori model. In a transport model it is hard to capture these kinds of historical relations, which again shows the added value of incorporating mobile phone data in the LMS-model.
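One way to impose an observed distribution while keeping the modelled trip totals per municipality is iterative proportional fitting (IPF), with the MPD-matrix as seed and the model totals as targets. The sketch below illustrates that general technique under this assumption; it is not a description of the procedure actually used for the LMS.

```python
def ipf(seed, row_targets, col_targets, iterations=50):
    """Scale `seed` so that its row and column sums match the targets.

    seed: municipality-by-municipality matrix (list of lists), e.g. from MPD
    row_targets / col_targets: modelled departures / arrivals per municipality
    """
    matrix = [row[:] for row in seed]
    for _ in range(iterations):
        # Scale every row to the modelled number of departures.
        for i, target in enumerate(row_targets):
            total = sum(matrix[i])
            if total > 0:
                matrix[i] = [v * target / total for v in matrix[i]]
        # Scale every column to the modelled number of arrivals.
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in matrix)
            if total > 0:
                for row in matrix:
                    row[j] *= target / total
    return matrix
```

The result keeps the relative structure of the seed (for instance a strong Zoetermeer–The Hague relation) while the totals per municipality stay equal to those of the model.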

Changes in the OD-structure caused by the MPD-enrichment are not evident from the standard validation criteria that are used in the LMS-method. This can be explained by the fact that these validation criteria mainly focus on aspects other than the OD-structure. However, analysis shows that the MPD-enrichment leads to plausible improvements at a national level. For example, the upper pictures in Figure 3.2 show the congestion in the afternoon peak period for the models with and without MPD. These are compared with the result of the calibrated model in the lower picture. The change in OD-structure caused by the MPD leads to a picture with more congestion locations, which is more in line with the calibrated model.

Figure 3.2: Congestion of LMS model without & with MPD-enrichment compared with the calibrated model (afternoon peak period)

Also, a comparison was made with a commuter database from Statistics Netherlands (CBS). The commuter data contain the number of jobs of employees by municipality of work in relation to municipality of residence, and therefore form a matrix at municipal level. Although the commuter data are not mobility data, they provide a very good indication of the distribution of the number of trips between municipalities. Therefore, the AM peak models without and with MPD-enrichment were compared using a distance measure that describes, for each municipality, the mean absolute difference between the distribution of the model and that of the commuter data. Figure 3.3 shows the frequency distribution over municipalities of an index of this distance measure, where an index value < 100 means that the model without MPD-enrichment is more in line with the commuter data and an index value > 100 means that the model with MPD-enrichment is more in line with the commuter data. The figure clearly shows that the MPD-enrichment results in a better correspondence of the OD-matrix with the commuter data.

Figure 3.3: Histogram of difference LMS models (without and with MPD) with commuter data
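The comparison can be made concrete with a small sketch. The exact definition of the index is not given in this paper; the version below, in which the distance of the model without MPD is divided by the distance of the model with MPD, is an assumption that is merely consistent with the interpretation of values below and above 100.

```python
def shares(row):
    """Distribution of one municipality's trips over all destinations."""
    total = sum(row)
    return [value / total for value in row]


def mean_abs_diff(model_row, commuter_row):
    """Mean absolute difference between two destination distributions."""
    a, b = shares(model_row), shares(commuter_row)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def enrichment_index(row_without, row_with, commuter_row):
    """Hypothetical index: > 100 if the MPD-enriched model is closer to the
    commuter data. Undefined if the enriched model matches them exactly."""
    d_without = mean_abs_diff(row_without, commuter_row)
    d_with = mean_abs_diff(row_with, commuter_row)
    return 100.0 * d_without / d_with
```

In this sketch an index of 200 means that the distance to the commuter distribution is halved by the enrichment.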

Determining Origins and Destinations at Specific Location

The Dutch National Data Warehouse for Traffic Information (NDW) commissioned a pilot study into the usability of mobile (phone) data for origin-destination applications. In this study NDW collaborated with regional directorates of Rijkswaterstaat and the Port of Rotterdam Authority. Nine use cases were defined aimed at mapping intersection flows, selected-link analysis and travel time analysis at different locations in the Netherlands.

Here we describe the use case 'A29 Heinenoord-tunnel', a tunnel located south of Rotterdam that is of great importance for the accessibility between Rotterdam and the south-western part of the Netherlands. Major maintenance will take place at the Heinenoord-tunnel in 2019-2020, with a very serious impact on traffic for nine months. Therefore, it is desirable to have insight into the travel patterns of the users of this tunnel to determine the impact on the alternative routes during the maintenance. In this use case, a selected-link analysis of the A29 at the Heinenoord-tunnel was requested to analyze the origins and destinations of the traffic along this route. The analyses were carried out for an average working day in September 2016 and in September 2017, for an average weekend day in September 2017, and for a tunnel closure during two weekends at the end of June and the beginning of July 2017.

Comparison with counts
To assess the quality of the traffic flows derived from MPD, the flows in the Heinenoord-tunnel were compared with counts from the NDW-database (loop detector data), see Table 3.1.

Period                        Count (vehicles/24h)   MPD-flow (vehicles/24h)   Difference
Working day September 2016    49,210                 50,300                    2%
Working day September 2017    53,800                 49,900                    -7%
Weekend day September 2017    35,900                 27,500                    -23%

Table 3.1: Comparison count and MPD-flow

The differences between counts and MPD-flows are limited for the working days, but for the weekend day the MPD-flow is considerably lower than the count. A possible explanation for this difference is that business users carry only their private mobile device in the weekend and not their work device, which may lead to an underestimation of the weekend flows derived from MPD. Further research is needed to explain this difference.
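The percentages in Table 3.1 follow from the relative difference of the MPD-flow with respect to the count, as the short check below shows.

```python
def relative_difference(mpd_flow, count):
    """Difference of the MPD-flow relative to the loop detector count."""
    return (mpd_flow - count) / count


# (count, MPD-flow) in vehicles per 24 hours, taken from Table 3.1
TABLE_3_1 = {
    "working day September 2016": (49210, 50300),
    "working day September 2017": (53800, 49900),
    "weekend day September 2017": (35900, 27500),
}

for period, (count, mpd_flow) in TABLE_3_1.items():
    print(f"{period}: {100 * relative_difference(mpd_flow, count):.0f}%")
```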

Selected link working day (September 2017)
The results of the selected-link analysis in northern direction are presented in two ways. The left panel of Figure 3.4 shows the selected link at municipal level in the form of a 'dot map'. The right panel shows the information at zonal level in the form of a 'spot map'. The blue dots/spots show the shares of the origins of the traffic through the tunnel and the orange dots/spots represent the destinations. The spot map offers more detail, but the numbers in the matrix can become so small that relationships can no longer be seen. Dot maps are better suited to show such relationships. An illustration of this is The Hague: in the spot map there seems to be little relationship with areas within The Hague, while the dot map shows a clear relationship with The Hague. The flow based on MPD corresponds well with the traffic counts in the Heinenoord-tunnel (see Table 3.1).

Figure 3.4 Shares of origins and destinations in the Heinenoord-tunnel based on MPD-data
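The principle of a selected-link analysis can be sketched as follows: all OD-pairs whose route uses the selected link are collected, and their trips are summarized as shares per origin and per destination. The sketch assumes a single route per OD-pair, which is a simplification; the data structures are hypothetical.

```python
from collections import defaultdict


def selected_link(od_flows, routes, link):
    """od_flows: {(origin, destination): trips}
    routes: {(origin, destination): list of link ids} (one route per OD pair)

    Returns (origin shares, destination shares, total flow on the link).
    """
    origins = defaultdict(float)
    destinations = defaultdict(float)
    for (origin, destination), trips in od_flows.items():
        if link in routes.get((origin, destination), ()):
            origins[origin] += trips
            destinations[destination] += trips
    total = sum(origins.values())
    if total == 0:
        return {}, {}, 0.0
    return (
        {zone: v / total for zone, v in origins.items()},
        {zone: v / total for zone, v in destinations.items()},
        total,
    )
```

Aggregating the zones to municipalities before calling such a function gives the coarser dot-map view, in which small zonal flows add up to a visible relationship.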


Comparison selected link average working day and weekend day (September 2017)
In Figure 3.5 the origins and destinations of the users of the Heinenoord-tunnel during a working day and a weekend day are compared. As described before, the MPD-flow of the weekend day is below the count. However, under the assumption that this difference applies equally to all OD-relations, the figure with the relative distributions of the origins and destinations remains usable. Again, the results are plausible. For example, on the weekend day the share is lower in the Harbour Area (part of Rotterdam) and higher in the city of Barendrecht (with large shopping areas including an IKEA store).

 

Figure 3.5: Comparison of origins and destinations of traffic in the Heinenoord-tunnel based on MPD-data during a working day and a weekend day (green dots: higher share in weekend day; red dots: higher share in working day)

Out-of-Home advertising

A new use of MPD concerns transport modelling for Out-of-Home (OOH) advertising. For OOH, complete traffic flows, i.e. both motorized traffic and slow modes, must be assessed at billboard locations. This information is used to determine the value of different advertising locations. Clearly, the availability of up-to-date, dynamic OD-data at a national scale is highly beneficial for this purpose.

In the previous sections the structure of an estimated (modelled) OD-matrix of a transport model was adapted according to the MPD. For OOH advertising we analyzed whether it is possible to determine vehicle flows directly from the MPD. As mentioned in the section on improving the quality of transport models, short-distance trips are under-represented in MPD. Assigning the OD-matrix will therefore lead to underestimation of flows on urban roads. A generic algorithm was developed which determines flows from the MPD. In this algorithm short-distance trips are added based on socio-economic data (e.g. inhabitants and number of jobs). We analyzed the national assignment in several regions in the Netherlands by comparison with traffic counts.

Table 3.2 shows the correlation coefficients for 4 different road types in the Metropolitan region Rotterdam The Hague (MRDH) for the 24h period and for the AM and PM peak periods. Figure 3.6 shows the corresponding scatter plots for the 24h period.

 

Table 3.2: Correlation coefficients MPD-flows vs. counts in Metropolitan region Rotterdam The Hague

Figure 3.6: Scatterplot MPD-flows vs. counts for 4 different type of roads in Metropolitan region Rotterdam The Hague

The results shown in the figure above are good for motorways, quite good for provincial roads, and acceptable for urban access roads and residential roads. Considering that average count values have been used in this comparison, and that the spread of flows on lower order roads is larger than on higher order roads, this is a promising approach.
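A comparison of this kind can be sketched as a Pearson correlation coefficient between MPD-flows and counts, computed separately per road type. The record layout below is an assumption for illustration.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_road_type(records):
    """records: iterable of (road_type, mpd_flow, count) per count location."""
    flows = defaultdict(list)
    counts = defaultdict(list)
    for road_type, mpd_flow, count in records:
        flows[road_type].append(mpd_flow)
        counts[road_type].append(count)
    return {rt: pearson(flows[rt], counts[rt]) for rt in flows}
```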

Currently we are working on an assignment procedure at a very detailed level, which will result in traffic flows for every road on the Dutch network obtained directly from MPD and will presumably lead to even better results, especially on lower order roads.

OUR VISION AND FUTURE DEVELOPMENTS

Traffic flows
The results of the data analysis in the previous chapter show that by using MPD it is possible to estimate flows, in numbers of motor vehicles, for each road section in the Netherlands, for each hour of the day and for each day of the year (24/7/365). The flows are based on measured data and give a truthful picture of the traffic situation: the underlying movement patterns are not mathematically estimated, as they are in a traffic model. MPD can be used for improving the structure of OD-matrices in transport models as well as directly for determining traffic flows.

Transition in transport modelling
Building a classic transport model requires a lot of effort (and therefore also costs) to determine the OD-matrices for the base year situation. The use of MPD (probably in combination with other big data sources such as FCD) opens up possibilities for data-driven modelling, which will lead to a major transition in transport modelling, as shown in Figure 4.1.

In current practice a synthetic model (OD-matrix) is drawn up for the base year situation. Using measurements (like traffic counts) this synthetic OD-matrix is calibrated, which results in the a posteriori OD-matrix that describes the actual current traffic patterns. To determine the OD-matrix for the forecast year, growth factors are applied to the a posteriori OD-matrix, where the growth factors are based on the ratio of the synthetic OD-matrices for the forecast year and the base year.
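The growth factor step described above can be sketched as follows. The treatment of OD-pairs without synthetic base-year trips (growth factor 1) is an assumption made to keep the example well defined.

```python
def forecast_od(a_posteriori_base, synthetic_base, synthetic_forecast):
    """Apply synthetic growth factors to the calibrated base-year matrix.

    All arguments are dictionaries {(origin, destination): trips}.
    growth = synthetic forecast / synthetic base, per OD pair.
    """
    forecast = {}
    for od, calibrated_trips in a_posteriori_base.items():
        base = synthetic_base.get(od, 0.0)
        # Assumption: without synthetic base-year trips no growth is applied.
        growth = synthetic_forecast.get(od, 0.0) / base if base > 0 else 1.0
        forecast[od] = calibrated_trips * growth
    return forecast
```

For example, 120 calibrated trips with a synthetic growth from 100 to 110 trips result in 132 forecast trips.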

We believe that in the near future it will no longer be necessary to determine an OD-matrix for the base year situation using a model. Using the travel pattern information from MPD together with other data sources, it will be possible to determine an OD-matrix which is completely based on measured data (a data-driven OD-matrix). For the determination of an OD-matrix for a forecast year, however, a behavior model will still be needed to determine growth factors, similar to current practice. The better the synthetic model of the base year approximates the data-driven base-year OD-matrix, the better the forecast model will be. Therefore, quality measures obtained from the data-driven OD-matrix can be used for validating the quality of the synthetic model.

So, sound transport modelling will remain very important in the future for many applications, but the modelling process will change.

Figure 4.1: Transition in transport modelling

All kinds of time periods
In addition to the usual analysis in the transport domain for an average working day, MPD also opens up options for analysis of other time periods. Gaining insight into travel patterns and traffic flows on shopping evenings, weekend days, public holidays and during events becomes relatively easy. Also, the phasing of (re)construction works, which often take place in the evening and at night, can be supported. This is of high value not only in the transport domain but also in the domain of OOH.

Not only vehicle movements, but also train
At this stage we can determine in MPD whether a trip is made by train or not. In this paper we focused on the non-train trips in MPD, but there is also interest from public transport companies. Recently, the MPD for both train and non-train trips were used in a study in the eastern part of the Netherlands on the delineation of public transport concession areas. Current borders of the concession areas of the public transport providers run straight through urban areas. In this research star maps were made based on MPD which show the current transport tension between the defined urban areas and the surrounding municipalities (see Figure 4.2).

Figure 4.2: Transport tension in the east of the Netherlands for 3 urban areas

Future developments
The subdivision of non-train trips into passenger car traffic, truck traffic and other modes (such as slow modes) is still under development. Because the travel characteristics of a device are known for one month, it seems possible to identify trips that presumably correspond to trucks (e.g. based on regular trips towards specific destinations like industrial locations, long-distance trips, and time of day). A combination (fusion) with FCD is also an option worth exploring.

Another topic we will work on is identifying trips made by users with more than one device (e.g. private and work devices). If we can recognize these duplications, the quality of the data for weekend days will improve further.

Limitations of MPD
Of course, MPD has some limitations. As mentioned, short-distance trips are currently under-represented. As the technology of the mobile phone network develops, we expect to be able to determine the origins and destinations of trips more precisely in the near future, and therefore also to capture short-distance trips.

BIBLIOGRAPHY

CBS (2018), Onderzoek Verplaatsing in Nederland 2017
Deloitte (2017), Deloitte Global Mobile Consumer Survey 2017 – The Netherlands

Joksimovic, D., Friso, K., and Keij, J., (2017) Recent developments of big data in the Dutch national model – Study with mobile phone data, European Transport Conference 2017, Barcelona

Wismans, L.J.J., Friso, K., Rijsdijk, J., de Graaf, S.W., and Keij, J., (2018) Improving A Priori Demand Estimates Transport Models using Mobile Phone Data: A Rotterdam-Region Case, Journal of Urban Technology, DOI: 10.1080/10630732.2018.1442075