Michael Baker - Thesis - Problems in Longterm Forecasting and Planning
About | Contents
Previous: 2. Vehicle Refuelling Infrastructure | This: 3. Transport Input-Output Study | Next: 4. Review of Lessons learnt
As I explained in Chapter 1, for the further work I intended to do on freight transport, I was going to use input-output tables to determine the net physical output of different commodities. This interest in freight transport and the use of input-output tables led to a third area of work. This was collecting data on the production and consumption of transport, and resource use by transport, in an input-output framework. The work was supported by the Transport and Road Research Laboratory and was principally of use to the Central Statistical Office in their construction of the 1974 input-output tables.
I thought that working on the 1974 tables would help me gain access to them. I also saw possibilities of relating freight transport demand to the physical output of, and inputs to, individual industries and hoped that working on the 1974 input-output tables might also help towards this.
The aim of the Transport Input-Output (TIO) study was to provide data on resource flows associated with the transport sector of the UK economy, compatible with the input-output tables compiled by the Central Statistical Office. Several input-output tables have been constructed for the UK. Those for 1954, 1963, 1968 and 1974 (Central Statistical Office 1961, 1970, 1973, 1981) were based on detailed censuses of production. However, the censuses of production do not cover the transport industries, which was the main reason why the TIO study was necessary. The principal objective was to prepare entries for the input-output tables. This work took so long that the other objectives, extending the work to other years such as 1968 and examining the linearity of the relationship between inputs and outputs, could not be pursued.
The results of the TIO study reported in Baker (1979a, 1979b, and 1980) cover the financial, and some physical, inputs to, and outputs of, the transport industries. They also cover the road transport output of, and transport consumed by, other industries. The main sources used were published statistics, including the reports and accounts of nationalised undertakings. I also used some unpublished statistics.
In this chapter I briefly describe how I obtained my results and then describe lessons learnt about reliability (or otherwise) of officially collected and compiled statistics.
To facilitate my description of the work I did I shall first describe input-output tables, and the main sources used in their compilation.
Input-output tables are a means of showing the inter-relationships between the producers and consumers in the economy during a particular year. The principal difference between these tables and the more familiar analyses presented in the National Accounts is that the input-output tables show the flows between industries as buyers of each others' outputs in addition to the 'final' expenditure (by consumers, government, investors and on exports) and income identified in the National Accounts. In input-output terms the economy is considered as a system of industries linked together by flows of goods and services ... (Johnson 1976)
Input-output tables were originally developed by Leontief et al. (1951) for the American economy in the 1930s. The use of the inter-industry information meant that a much more complete picture of the economy could be built up, with consequently improved possibilities for economic forecasting. In particular this method allows the implications of a change in demand for a particular commodity to be calculated for each industry in the economy rather than simply as an aggregate for the economy as a whole. In addition the input-output table is a system within which all the national accounts aggregates can be contained. Moreover, because it demands that inputs equal outputs for each industry, it is a framework within which discrepancies between expenditure and income measures of the gross domestic product can be investigated in detail.
There are three basic tables. These are the Output, Input and Import Tables. The first shows the output of commodities by each industry. The second shows the consumption of domestically produced commodities by each industry and final demand. Finally the third shows the consumption of imported commodities by each industry and final demand. The input table also shows imports of goods and services, net taxes on expenditure, and value added for each industry so that the total inputs equal the total outputs for each industry. Since the input table also shows the purchases of commodities by consumption, investment and exports (which make up final demand) the total production of a commodity from the output table is equal to the total consumption of each commodity in the input table. The import table is an expansion of the imports vector in the input table.
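The two accounting identities described above can be illustrated with a minimal sketch. The figures are invented for a toy economy with two industries and two commodities, and the names `output`, `domestic_inputs`, `final_demand` and `primary_inputs` are hypothetical; none of them come from the 1974 tables.

```python
# Toy figures only: two industries (A, B) and two commodities (a, b).

# Output table: rows are industries, columns are commodities.
output = [
    [100, 10],  # industry A's output of commodities a and b
    [5, 80],    # industry B's output of commodities a and b
]

# Input table: consumption of domestically produced commodities.
# Rows are commodities, columns are the consuming industries.
domestic_inputs = [
    [20, 15],  # commodity a used by industries A and B
    [30, 10],  # commodity b used by industries A and B
]
final_demand = [70, 50]    # consumption, investment and exports, per commodity
primary_inputs = [60, 60]  # imports + net taxes + value added, per industry


def industry_totals(output, domestic_inputs, primary_inputs):
    """Return (total inputs, total outputs) for each industry."""
    totals = []
    for i, row in enumerate(output):
        inputs = sum(r[i] for r in domestic_inputs) + primary_inputs[i]
        totals.append((inputs, sum(row)))
    return totals


def commodity_totals(output, domestic_inputs, final_demand):
    """Return (total production, total consumption) for each commodity."""
    totals = []
    for c, row in enumerate(domestic_inputs):
        production = sum(industry[c] for industry in output)
        consumption = sum(row) + final_demand[c]
        totals.append((production, consumption))
    return totals


# Both identities hold when each pair contains two equal numbers.
print(industry_totals(output, domestic_inputs, primary_inputs))
print(commodity_totals(output, domestic_inputs, final_demand))
```

In a real compilation the two sides rarely agree at first, and the size of the mismatch is one indication of where the underlying data are weak.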
The classification of industries used in the 1974 input-output tables is based upon the Standard Industrial Classification 1968 (SIC, Central Statistical Office 1975). The industries are defined in terms of one or more Minimum List Headings (MLH) from the SIC. Commodities are defined as the principal product of an industry. Consequently the outputs of service industries, including the transport industries, are considered to be commodities.
The major source of data from which input-output tables are constructed is the Census of Production and Purchases Inquiry (COP and PI, Business Statistics Office 1978 and 1979). The reports of the COP and PI cover all manufacturing industries, that is MLHs 101 to 603. The COP does not cover Agriculture, Forestry and Fishing, nor does it cover Transport, Distribution, or other private and public services. The basic reporting unit for the COP and PI is the establishment. Establishments are classified by their principal product and not by their ownership. For example the rail workshops of British Rail Engineering Ltd. are classified as MLHs 384 or 385 (Locomotive and Railway Track Equipment, or Railway Carriages and Wagons and Trams), not as MLH 701 Railways. Since the major source of data for the input-output tables is based upon the establishment, so are the tables themselves.
From the PI can be built up a table of the consumption of commodities by each industry. However the PI does not distinguish between domestically produced and imported commodities. Consequently to obtain the input table, imports must be subtracted from the table of consumption. The basic source of data on imports is the Trade of the United Kingdom (Her Majesty’s Customs and Excise annual). This lists in very great detail the imports and exports of goods. Each commodity listed as an import is assigned to an industry or final demand which is assumed to import it.
As mentioned above the grouping into industry/commodity groups which I used was that used in the 1974 input-output tables. It was based upon single or groups of Minimum List Headings from the Standard Industrial Classification 1968 (Central Statistical Office 1975). An extract from the SIC, of Order XXII - Transport and Communication, rearranged into the input-output groupings is given in Table 3.1.
Table 3.1 SIC Order XXII - Transport and Communication
| I.O. No | Minimum List Heading | Heading and description |
| 93 | 701 | Railways: Railways including both the underground and surface railways operated by the London Transport Board. Ancillary undertakings, such as locomotive, carriage and wagon workshops, catering services, air, omnibus or steamer services, docks and canals, are classified in their appropriate headings. |
| 94 | 702 | Road Passenger Transport: 1. Omnibus and tramway service: The operation of omnibus, motor coach, trolleybus and tramway services. 2. Taxis and private hire cars: The operation of taxi-cabs and private-hire cars: owner-drivers are included. Car hire is also included. |
| | 703 | Road Haulage Contracting for General Hire or Reward: Cartage and haulage contractors (whether using motor or horse-drawn vehicles) of all types, including furniture removers, mainly operating for general hire or reward. Hire of commercial vehicles is included. Establishments mainly carrying goods in connection with another business operated under common ownership or control are classified in Heading 704. |
| | 704 | Other Road Haulage: Cartage and haulage undertakings (whether using motor or horse-drawn vehicles) of all types mainly engaged in carrying goods in connection with another business operated under common ownership or control. |
| 95 | 705 | Sea Transport: 1. Shipping company (shore establishments): The shore establishments of companies (including railways) operating sea-going ships for the conveyance of either passengers or cargo. The operation of fishing vessels is classified in Heading 003. 2. Shipping company (sea-going personnel): The crews of sea-going merchant ships, other than fishing vessels. 3. Pilotage: The provision of pilots for sea-going ships. |
| | 706 | Port and Inland Water Transport: Harbour, dock, canal, lighthouse, lightship, etc. authorities, and establishments conducting marine salvage operations; the loading and unloading of vessels and the operation of tugs, lighters, barges, ferries, etc. in ports and inland waterways. The hiring of pleasure boats, punts, etc. is classified in Heading 882. |
| 96 | 707 | Air Transport: Air line companies operating on regular schedules or on charter (including establishments of Commonwealth and foreign air lines in the United Kingdom), and aerodromes, including airports, air traffic control centres, and communication centres operated by the Board of Trade. Flying schools and flying and glider clubs are classified in Headings 709 and 882 respectively. |
| | 709 | Miscellaneous Transport Services and Storage: 1. Services incidental to transport: Ship brokers, freight brokers, shipping agents, forwarding agents, travel ticket agents, tourist and excursion agents and similar establishments which facilitate the transport of passengers or goods but are not transport operators; flying schools, motoring schools, car parks, the road patrols and other motoring services of the motorists' organisations; the operations of toll roads and toll bridges. 2. Storage: Warehouses (including bonded warehouses), cold storage, furniture repositories, safe deposits, etc. 3. Other: Providing messenger service or porterage; hiring hand trucks, barrows, tradesmen's cycles, bath-chairs, etc. |
| 97 | 708 | Postal Services and Telecommunications: All Post Office establishments, except the factories manufacturing and repairing telephone and telegraph apparatus (classified in Heading 363) and the Post Office Savings Department and Post Office Giro (classified in Heading 861); cable and radio services (excluding broadcasting and radio relay services) and other telephone or telegraph services. |
Figures 3.1 and 3.2 show the numbers I was attempting to obtain in the TIO study.
Figure 3.1 Numbers required for the Output Table
Figure 3.2 Numbers required for the Input and Import Tables
On the output side I was attempting to obtain figures on the output of all commodities by the Transport industries. In total this was 412 numbers (4 industries by 103 commodities). However the transport industries have very few outputs other than transport, so most of these numbers were zero. I was also trying to find the output of transport by non-transport industries. This was 396 numbers (4 commodities by 99 industries).
Due to difficulties in obtaining data I assumed that the only transport which was produced by non transport industries was road haulage. That is I assumed all rail, ship, air and road passenger transport was only produced by the respective industries. This meant that I assumed that 297 of the required numbers were zero.
On the input side I was concerned with the inputs of commodities and value added (108) to the transport industries (4), which meant I was looking for 424 numbers, and also with the inputs of transport (4) to non-transport industries (99), which led to looking for 396 numbers.
To determine the output of road haulage by all industries I estimated the total of all inputs to industries for the provision of road haulage. Potentially this could have required 11124 numbers (108 commodities and value added by 103 industries).
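Several of the counts quoted above are simple products of the table dimensions. The short sketch below reproduces those that follow directly from the dimensions stated in the text; the constant names are hypothetical labels chosen for illustration.

```python
# Table dimensions as stated in the text.
TRANSPORT_INDUSTRIES = 4
TRANSPORT_COMMODITIES = 4
NON_TRANSPORT_INDUSTRIES = 99
ALL_INDUSTRIES = 103
ALL_COMMODITIES = 103
INPUT_ROWS = 108  # commodities and value added

counts = {
    # Output of every commodity by each transport industry.
    "output_by_transport_industries": TRANSPORT_INDUSTRIES * ALL_COMMODITIES,
    # Output of each transport commodity by each non-transport industry.
    "transport_output_by_other_industries": TRANSPORT_COMMODITIES * NON_TRANSPORT_INDUSTRIES,
    # Of those, the cells assumed to be zero (every mode except road haulage).
    "assumed_zero": (TRANSPORT_COMMODITIES - 1) * NON_TRANSPORT_INDUSTRIES,
    # Inputs to the road haulage operations of every industry.
    "road_haulage_inputs": INPUT_ROWS * ALL_INDUSTRIES,
}

for name, value in counts.items():
    print(f"{name}: {value}")
```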
Apart from the output of the Transport industries I was unable to obtain all of the numbers required. For all but the input of Transport to industries I was able to find the total inputs or outputs but was not able to get a complete breakdown into the required categories. For example I was only able to obtain one number representing the output of road haulage by Agriculture, Forestry and Fishing, where I required two, one for Agriculture and one for Forestry and Fishing.
The extent to which I was able to obtain all of the required numbers is illustrated in Figures 3.3 to 3.7. These show the proportion of the totals of input or output respectively (on the vertical axis) which were at the required level of disaggregation and by how much the others fell short of this (coverage factor, on the horizontal axis). The coverage factor is the ratio of numbers found to numbers required.
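As a rough sketch of how such a coverage profile could be tabulated, the function below groups the value of each estimate by its coverage factor and reports each group's share of the total. The values are invented; only the Agriculture, Forestry and Fishing case (one number found where two were required) is taken from the text, and the function name `coverage_profile` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def coverage_profile(entries):
    """Share of the total value obtained at each coverage factor.

    entries: list of (value, numbers_found, numbers_required) tuples.
    The coverage factor is numbers_found / numbers_required.
    """
    total = sum(value for value, _, _ in entries)
    shares = defaultdict(float)
    for value, found, required in entries:
        shares[Fraction(found, required)] += value / total
    return dict(shares)


# Hypothetical values in £m.
entries = [
    (60.0, 1, 1),  # found at the required level of disaggregation
    (30.0, 1, 2),  # e.g. one figure where two were required
    (10.0, 1, 4),  # one figure covering four required cells
]

profile = coverage_profile(entries)
for factor in sorted(profile, reverse=True):
    print(f"coverage factor {factor}: {profile[factor]:.0%} of total")
```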
Figure 3.3 Extent to which numbers were found - Output of Transport (by non transport industries)
Figure 3.4 Extent to which numbers were found - Output by Transport Industries
Figure 3.5 Extent to which numbers were found - Input of Transport to Industries
Figure 3.6 Extent to which numbers were found - Input to Transport Industries
Figure 3.7 Extent to which numbers were found - Input to the Road Haulage produced by all industries
Having become a World expert on UK transport statistics (Johnson 1979, see Appendix 5) and been congratulated on the thoroughness and care that have obviously been taken to arrive at ... very comprehensive results [in the TIO study] (Lockyer 1979, see Appendix 5), I still have many reservations about the reliability of the work I did.
There were two basic lessons which I learnt from the TIO study. These were that statistics are often unreliable and that very little attention is paid to assessing or specifying their reliability. In the following sections I give examples of why statistics are unreliable and why I have concluded that too little attention is paid to reliability.
From my work on the TIO study I can identify five ways in which the way that statistics are compiled leads to their being unreliable. These can be summarised as secrecy, estimation of incomplete data, the tortuous recombination of data, dubious sources of data, and incomplete enumeration.
In the TIO study some of the data I obtained was released to me on the basis that I did not reveal its source. For some other data I was able to quote the source but was unable to quote the data without manipulating it sufficiently so that the original data could not be inferred from my report.
In the first case, where I was unable to quote the source of my data, I had to resort to saying that CSO have provided . . estimates (Baker 1979a). The Input-output section of CSO, for whom the TIO study was conducted, know what the source is and how reliable they think this source is. However anyone else who reads my report (Baker 1979a) will be unable either to reproduce my work or to judge how reliable these estimates might be.
In the other cases of secrecy I was able to quote my sources. However, because of the apparently sensitive nature of the data which had been collected (General Council of British Shipping 1977), and the rules governing confidentiality (Department of Transport 1979a), I was unable to quote the basic data from which I worked. Again in these two cases anyone reading my report who does not have access to either of these two sources will be unable to check for themselves whether the work I did was reasonable or accurate.
It is one of the basic assumptions underlying the operation of a free market (upon which economic theory is based) that everyone has access to perfect information. However these examples show some of the ways in which secrecy prevents free access to information which would be a necessary prerequisite for perfect information.
There are very few sources of data [1] which are based upon complete surveys. One of these is the Population Census. Most sources of data do not cover the whole of the population being surveyed and some way is used to make whole population estimates.
In the TIO study I came across several cases in which either I or others had had to make such estimates. I shall describe those necessary for data from the Census of Production, the Continuing Survey of Road Goods Transport and the National Travel Survey.
The data which I used from the Census of Production (Business Statistics Office 1978 & 1979) was only collected for larger establishments (that is, those employing 50 or more people). To obtain estimates of expenditure by all establishments I had to multiply the larger establishments' expenditures by the ratio of total expenditure by all establishments to that by larger establishments. This would only give a correct result if the patterns of expenditure of larger and smaller establishments were the same. I think it is quite certain that they are not, but there is no way of knowing because a detailed breakdown of smaller establishments' expenditures is not available.
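The grossing-up step just described can be sketched as follows. The figures and the function name `gross_up` are hypothetical; the point is that every item is scaled by the same ratio, which is exactly the assumption that larger and smaller establishments have the same pattern of expenditure.

```python
def gross_up(larger_by_item, total_larger, total_all):
    """Scale itemised expenditure of larger establishments to all establishments.

    larger_by_item: expenditure by item, larger establishments only.
    total_larger:   total expenditure by larger establishments.
    total_all:      total expenditure by all establishments.
    """
    ratio = total_all / total_larger
    # The same ratio is applied to every item, so the pattern of
    # expenditure is assumed identical for small and large establishments.
    return {item: value * ratio for item, value in larger_by_item.items()}


# Hypothetical figures in £m for one industry.
larger = {"fuel": 40.0, "tyres and spares": 10.0}
estimate = gross_up(larger, total_larger=200.0, total_all=250.0)
print(estimate)
```

If smaller establishments in fact spent proportionally more on, say, fuel, this estimate would understate fuel and overstate the other items, with no way to detect the error from the published data.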
The Continuing Survey of Road Goods Transport (CSRGT, Department of Transport 1979a) is a sample survey. Of the 560.1 thousand goods vehicles over 3.5 tonnes Gross Vehicle Weight (GVW) in 1974 (Department of Transport annual a), 17640 were surveyed. The operator of each vehicle was required to provide details of the work performed by the vehicle during one week. Approximately equal numbers of vehicles were sampled in each week of the year. The grossing factors for numbers of vehicles were approximately 30 and those for vehicle km were approximately 1500.
I used data from both the 1972/3 and the 1975/6 National Travel Surveys (NTS, Department of Transport 1979b). These were both sample surveys in which households were asked to give details of all journeys made by the members of the households during one week. As with the CSRGT the weeks were spread throughout the year of the survey. The 1972/3 NTS covered a population of 16836 out of a UK population of 51.786 million and that for 1975/6 covered 27906 out of 52.306 million. This led me to use grossing factors of 160387 and 97735 respectively for data from the two NTSs.
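The grossing factors quoted for these surveys appear consistent with scaling the sample up to the whole population and, for quantities recorded over a single week, scaling one week up to a year. That formula is an inference from the figures rather than something stated in the text; it reproduces the 1972/3 NTS factor of 160387 and gives values close to the approximate CSRGT factors. The function names are hypothetical.

```python
WEEKS_PER_YEAR = 365 / 7


def unit_factor(population, sample):
    """Factor for counts of units (vehicles or people): population / sample."""
    return population / sample


def annual_factor(population, sample):
    """Factor for one-week activity (e.g. vehicle km) grossed to a full year."""
    return unit_factor(population, sample) * WEEKS_PER_YEAR


# CSRGT 1974: 560.1 thousand vehicles, 17640 surveyed.
csrgt_vehicles = unit_factor(560_100, 17_640)
csrgt_vehicle_km = annual_factor(560_100, 17_640)

# NTS: surveyed population against UK population.
nts_1972_73 = annual_factor(51_786_000, 16_836)
nts_1975_76 = annual_factor(52_306_000, 27_906)

print(f"CSRGT vehicles:   {csrgt_vehicles:.1f}")
print(f"CSRGT vehicle km: {csrgt_vehicle_km:.0f}")
print(f"NTS 1972/3:       {nts_1972_73:.0f}")
print(f"NTS 1975/6:       {nts_1975_76:.0f}")
```

Factors of this size mean that each recorded journey stands for roughly a hundred thousand journeys in the national estimate, so a single misreported diary entry has a visible effect on any finely disaggregated figure.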
I shall illustrate the necessity of resorting to tortuous recombination of data by describing what I had to do to make estimates of the financial inputs to, and output of, road haulage.
What I wanted were the expenditures made by each of the 102 input-output industries in their operation of road haulage. I also wanted these expenditures to be broken down by the 102 input-output commodities and split between the provision of own account and public haulage operations. From the total of all expenditures (including profits) I would be able to estimate the output of road haulage by each industry.
The data which I used and their sources are shown in Table 3.2.
Table 3.2 Data sources on Road Haulage
| Source | Data on | Coverage | Data from |
| ACoP/PI | fuel, tyres + spares, repairs + maintenance, licences | Manufacturing Industry by MLH | survey of all larger establishments |
| Various | as for ACoP/PI | Distribution | as for ACoP/PI |
| CM | unit costs /km: fuel, lubricants, tyres, maintenance, depreciation; /vehicle: insurance, licences, interest, rent + rates, wages | GVW | Unknown |
| CSRGT | number of vehicles and vehicle km | SIC order by own account & public haulage and by GVW | sample survey grossed to national total |
| OCinRFT | fuel + lubricants, spares, tyres, other materials, hiring vehicles, insurance, licences, rates, HP interest, other overheads, wages, depreciation | own account & public haulage in 1965 | sample survey grossed to national totals |
None of this data covered all inputs to the road haulage operation of all industries. The Annual Census of Production and Purchases Inquiry (ACoP/PI, Business Statistics Office 1978 & 1979) covered about three quarters of the cost of commodities purchased for road haulage operations by manufacturing industries. This was the only source of data which was sufficiently disaggregated to get details for individual IO industries. (I obtained similar data to that provided by ACoP/PI for Distribution from the surveys of Wholesaling and of Retail Distribution (Business Statistics Office 1979 and Business Statistics Office 1977).)
From the Continuing Survey of Road Goods Transport (CSRGT, Department of Transport 1979a) I was able to obtain estimates of the numbers of vehicles and vehicle km broken down by Gross Vehicle Weight (GVW) and SIC order. This was the only source of data which covered all road haulage.
From the Commercial Motor Tables of Operating Costs (CM, Commercial Motor 1975) I was able to obtain unit costs for the operation of road goods vehicles (per vehicle km and per vehicle) broken down by GVW. This was the only source of data which covered all of the inputs to the road haulage operation (that is, the total of all these costs is equal to the output of road haulage).
To make comparisons between the categories of expenditure from ACoP/PI and CM and those I wanted (i.e. in IO commodities) I used Operating Costs in Road Freight Transport (OCinRFT, Edwards and Bayliss 1970). However the data in this was for 1965.
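One plausible way in which per-km and per-vehicle unit costs of the CM kind could be combined with vehicle numbers and vehicle km of the CSRGT kind is sketched below. This is an illustration only and is not a reconstruction of the actual procedure; the weight classes, the figures and the function name `operating_cost` are all hypothetical.

```python
def operating_cost(vehicles, vehicle_km, cost_per_vehicle, cost_per_km):
    """Total annual operating cost for one weight class.

    Standing costs (insurance, licences, wages, ...) scale with the number
    of vehicles; running costs (fuel, tyres, ...) scale with vehicle km.
    """
    return vehicles * cost_per_vehicle + vehicle_km * cost_per_km


# Hypothetical activity by GVW class: (number of vehicles, vehicle km per year).
activity = {
    "3.5 to 8 tonnes": (100, 2_000_000),
    "over 8 tonnes": (50, 3_000_000),
}

# Hypothetical unit costs by GVW class: (£ per vehicle per year, £ per vehicle km).
unit_costs = {
    "3.5 to 8 tonnes": (1500.0, 0.25),
    "over 8 tonnes": (3000.0, 0.5),
}

costs = {
    gvw: operating_cost(*activity[gvw], *unit_costs[gvw])
    for gvw in activity
}
total_cost = sum(costs.values())
print(costs, total_cost)
```

Any error in the unit costs carries straight through to the total, which is why the unknown provenance of such costs matters.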
The above can be summarised by saying:
The ways in which these sources were combined is illustrated in Figure 3.8.
Figure 3.8 How Data on Road Haulage was manipulated
Basically I did the following.
This is just one example of the tortuous recombination of data out of many which I could have chosen from the TIO study. In fact I had to employ similar procedures for each of the transport industries. Such tortuous recombination of data is necessary when the data being used is from different sources. In such cases it is usual that the data was originally collected for different purposes, has different coverage and uses different classifications.
In situations where the tortuous recombination of data is necessary it is often difficult to understand what was done. This is important because it then becomes very difficult to follow how the reliability of the input data affects the reliability of the output. Also, the more complex the manipulation, the more chance there is for errors to be made. Such errors include the misallocation of numbers to categories when converting from one classification system to another.
In the TIO study I had to resort to several dubious sources. That is sources which I either considered to be unreliable and/or I knew nothing of the way their statistics were compiled and/or I did not think the data was representative. In each of the cases where I did use a dubious source it was because it was the only source of data I could find. One example of a dubious source is Commercial Motor (1975) from which I obtained unit operating costs for goods vehicles (see above) and buses and taxis. In each issue of the Commercial Motor Tables of Operating Costs details are given on how the various costs have changed since the previous year. For example:
Following the average inflationary trend of 10 per cent, ... , insurance premiums have been increased in the main by that amount as has the rent and rates factor. (Commercial Motor 1975)
No indication is given of how the data used in the Tables is collected, but it seems unlikely that it is simply inflated by some arbitrary amount each year, as the above might suggest, since such a procedure would lead to very inaccurate costs after several years. Another example of a dubious source I used was data provided to me by CSO on the inputs to Agriculture. In this case I considered the data to be dubious because I was totally unaware of how it had been compiled and I knew of no surveys upon which it might have been based.
A third example of a dubious source was a National Board for Prices and Incomes (NBPI, 1968) report on London taxi fares. I used it to make an estimate of the number of vehicle miles run by taxis per passenger mile. I did this so that I could convert estimates of passenger km, from the National Travel Survey (Department of Transport 1979b), to vehicle km. However I would be very surprised if the operation of London taxis led to the same ratio of vehicle to passenger km as that for taxis and hire cars nationally.
When using dubious sources such as these I feel uncomfortable because I don't know how reliable they are, and often suspect that they are of poor reliability.
Another reliability problem I came across in the TIO study was that of enumeration. For each of the industries I covered I had to try and ensure that I collected data on the whole industry. To do this I first had to list or enumerate the components of the industry. However, short of having a list of all establishments in the UK and assigning each one to an industry, there is no way of knowing if a list of the components of an industry is complete.
As an example of incomplete enumeration I recently found that the list of 'other railways' in Department of Transport (annual a) that I used was incomplete. In this instance the omission will have had negligible effect on the reliability of the estimates for railways since 'other railways' are such a small component of the total. However this illustrates the fact that it is not possible to be certain that all components of an industry have been enumerated (without an inordinate amount of work, see above). In brief I cannot know what I do not know.
In the TIO study I also came across a longstanding case of incomplete enumeration. For many years the Department of Transport (and its predecessors), in making tables of Freight Transport in GB by all modes (Department of Transport annual a, Central Statistical Office annual a), have used British Waterways (BWB) figures for total waterway freight transport. They have done this without making any reference to any other inland waterway operations. Baldwin (1977) has estimated (on the basis of a survey) that only a small part of inland waterway freight is carried on BWB waterways.
In the previous section I looked at some of the reasons why statistics are unreliable. In this section I shall look at the reasons why I and others have paid very little attention to specifying the reliability of statistics.
Probably the most important reason for not specifying the reliability of statistics is that there appears to be very little demand for estimates of reliability. For example when the TIO study was set up there was no request that estimates of reliability be made. The necessity for such estimates was only agreed upon later at one of the project progress meetings.
A second reason for little attention being paid to the reliability of statistical estimates is that it is generally acknowledged that to make rigorous or objective estimates of reliability is either very difficult or impossible. In this situation the best that can usually be done is for the person compiling the statistics to make subjective judgements of their reliability.
One of the few descriptions of the reliability of a set of official statistics is given in Maurice (1968). It was upon that description that I based my own classification of reliability in my report on the TIO study.
It will be clear from the review of the data used that it is impossible to calculate statistical margins of error of the kind that are derived from random samples for any of the aggregates or for most of their components. It is however possible, from knowledge of the data, to form very rough and subjective judgements of the range of reasonable doubt attaching to the estimates. This is done in the sections concerned with the detailed estimates. Wherever possible these judgements are standardised by the use of uniform gradings, as follows: (Baker 1979a)
Margin of error
A ± less than 3 per cent
B ± 3 per cent to 10 per cent
C ± more than 10 per cent
The terms 'good', 'fair' and 'poor' are used at some points and are broadly equivalent to categories A, B and C. Like margins of error derived from random samples, these judgements do not represent absolute certainty. They may be taken to mean that, in the opinion of the author and in the present state of knowledge, there is, say, a 90 per cent chance that the true value of the figures referred to lies within the limits of the grading.
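The grading scheme quoted above amounts to a simple classification of a judged margin of error. A minimal sketch follows; the function name `reliability_grade` is hypothetical, and the treatment of the boundary values (exactly 3 and exactly 10 per cent fall in B) follows the wording of the gradings.

```python
GRADE_LABELS = {"A": "good", "B": "fair", "C": "poor"}


def reliability_grade(margin_pct):
    """Map a judged margin of error (plus or minus, in per cent) to a grading."""
    if margin_pct < 0:
        raise ValueError("margin of error must be non-negative")
    if margin_pct < 3:
        return "A"
    if margin_pct <= 10:
        return "B"
    return "C"


for margin in (2.0, 3.0, 10.0, 25.0):
    grade = reliability_grade(margin)
    print(f"±{margin} per cent: {grade} ({GRADE_LABELS[grade]})")
```

The margin fed into such a function is itself a subjective judgement, so the grading only standardises how the judgement is reported; it does not make it more objective.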
The main reason why so little attention is paid to specifying reliability is the amount of work which would be required to make anything other than subjective assessments. For example, to determine the reliability of the road freight transport estimates in the TIO study (as described above) it would be necessary to conduct at the least a sample survey of all road freight transport. Such a survey would take longer and require more effort than my original estimates. It would also yield results which were more reliable than the original estimates. However it might be very hard to estimate how reliable, since to do so might need an even bigger survey.
One way of getting an idea of the reliability of estimates is to make the estimates by using two or more independent methods. For example in my work on Road Haulage (see above) I made an estimate of the gross profit of the road haulage industry. After making a preliminary estimate of the inputs to and output of road haulage I received a request from F Johnson of the CSO for advice on sources of data on road haulage. He was interested in the gross profit in the Road Haulage industry and commented that my estimate was about twice the revenue estimate (see correspondence in Appendix 5). In my reply I sent details of my second round estimates, in which my estimate of gross profit was about half of the first estimate. Up to this point the two estimates were independent (in that different sources were used and no communication had taken place between those making the estimates). However before finishing the work on road haulage I made a third round set of estimates (as shown in Appendix 5). In this third round the estimate of gross profit (£m 434.8) was very similar to that in the second round (£m 448.6). However had the revenue estimate been very different from my second round estimate it is likely that this would have influenced my third round estimate. It is likely that the nearness of the second round and revenue estimates reduced my attention to the gross profit estimate in the third round. The point of all this is that by the third round my estimate of gross profit was no longer independent of the revenue estimate, simply because I knew of its existence.
Independent estimates can be used as a check on their own reliability. However it is very difficult to arrive at independent estimates.
Finally, when making subjective judgements of reliability I am very aware that I am likely to overestimate their reliability rather than underestimate it. This is because the higher the reliability the more worthwhile is the result, and so the more worthwhile was the work involved. I don't like doing worthless work. I suspect that I may well have overestimated the reliability of the results in the TIO study for this very reason, and I also suspect that others are prone to the same pitfall.
From the TIO study I learnt lessons, or reinforced earlier lessons, on the nature of statistics. These lessons are that data is difficult to obtain, that the data which does exist is not necessarily freely available, that it is often unreliable, and that different sources often give different answers for the same thing. Finally I found that gathering and collating statistics can be very monotonous.
[1] That is data of the sort which is used in making long term forecasts (such as economic, transport and energy) for policy purposes.
[2] Together Retail and Wholesale distribution form one SIC order which is also one IO industry.
Copyright © Michael Baker 1981,2005. All Rights Reserved.