# ANNEX B. METHODOLOGICAL ANNEX

## 1. INTRODUCTION

This annex documents the methodology used in deriving the projections.

The module on production of tobacco leaf includes tobacco area and tobacco yield equations that determine domestic production. The impact and significance of various factors such as tobacco leaf price at farm level, technology, irrigation, possible competition with other crops, tobacco types, etc., are examined and their importance in modelling is considered. Stocks are not considered in detail but as a balancing item.

The demand for tobacco leaf is usually modelled as a function of income, price, population of age 15 and above, and a vector of shift variables that include learning, preferences, anti-smoking measures, advertising, etc.

Import and export demand functions are estimated and linked to the world price through price transmission equations. The world price is a weighted average of the unit values of exports from the four major exporters.
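As a rough illustration of the world price construction described above, the sketch below computes a weighted average of export unit values. The annex does not state the weights or list the four exporters at this point, so the export-volume weights, the exporter labels and all figures are assumptions made purely for illustration.

```python
def world_price(unit_values, weights):
    """Weighted average of export unit values (US$/kg).

    unit_values and weights are dicts keyed by exporter. Using export
    volumes as weights is an assumption, not the annex's stated method.
    """
    total_weight = sum(weights[c] for c in unit_values)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(unit_values[c] * weights[c] for c in unit_values) / total_weight


# Hypothetical exporters, unit values (US$/kg) and export volumes
unit_values = {"A": 3.0, "B": 2.0, "C": 4.0, "D": 1.0}
export_volumes = {"A": 100.0, "B": 200.0, "C": 50.0, "D": 150.0}

pw = world_price(unit_values, export_volumes)
print(round(pw, 3))
```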

The model closes with trade using excess supply and excess demand functions for each country and for the world, balancing excess supply and excess demand and revising the world price appropriately.
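The closure described above can be sketched as a search for the world price at which excess supply summed over all countries is zero. The constant-elasticity supply and demand functions, the bisection search and every number below are illustrative assumptions; the annex does not specify the algorithm used to revise the world price.

```python
def net_excess_supply(price, countries):
    """World excess supply (supply minus demand) at a given price."""
    total = 0.0
    for c in countries:
        supply = c["s0"] * price ** c["es"]   # hypothetical supply curve
        demand = c["d0"] * price ** c["ed"]   # hypothetical demand curve
        total += supply - demand
    return total


def clear_market(countries, lo=0.01, hi=100.0, tol=1e-10, max_iter=200):
    """Bisection on the world price until excess supply is about zero."""
    if net_excess_supply(lo, countries) > 0 or net_excess_supply(hi, countries) < 0:
        raise ValueError("price bracket does not contain an equilibrium")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if net_excess_supply(mid, countries) < 0:
            lo = mid   # excess demand: the price must rise
        else:
            hi = mid   # excess supply: the price must fall
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Two hypothetical countries: a net exporter and a net importer
countries = [
    {"s0": 100.0, "es": 0.5, "d0": 20.0, "ed": -0.3},
    {"s0": 10.0, "es": 0.5, "d0": 90.0, "ed": -0.3},
]
equilibrium_price = clear_market(countries)
```

In this toy setup total supply and total demand are equal at a price of 1, so the search converges there; the first country exports what the second imports.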

## 2. A STANDARD COMMODITY MODEL

A standard commodity model can be specified in either linear or logarithmic (double log) form; the linear form is given in A and the double log form in B. The model has 7 equations with 7 endogenous variables, six exogenous variables and 18 parameters. It can be specified as a simultaneous or a recursive model; in this study it is specified as a recursive model.

A. Commodity model in linear form

 (1) (2) (3) (4) (5) (6) (7)

B. Commodity model in double log form

 (1) (2) (3) (4) (5) (6) (7)

Endogenous Variables

- At: area
- Yt: yield
- Dt: demand
- Xt: exports
- Pc: price of commodity
- It: ending stocks
- St: supply

Exogenous Variables

- Pw: world price
- P0: price index of other commodities
- T: time trend (technology index)
- E: private expenditure (GDP)
- N: population
- Z: demand shifters
- SSR: self-sufficiency ratio
- PM: border price of imports
- : World price in the world market relative to the domestic price

## 3. MODEL SPECIFICATION AND DATA

Given the large number of countries for which the model was estimated, and the fact that the results were to be used in a projections model, we chose a standard model specification for all countries and regions. This leaves little flexibility for modelling production supply and consumption taxation regimes, which differ considerably from one country to another. However, the double log specification is convenient and well suited to such an extensive estimation effort, which covers all countries of the world.

The data used in the estimation come from various sources, including USDA and FAO, and are summarized in Statistical Annex A. This data set does not include information on prices, either tobacco leaf prices (farm level) or consumption prices. Some price information was available for certain countries and is given in Table B.1.

### Table B.1 Farm level prices (US\$/kg)

 Year Average prices Malawi Zimbabwe Turkey United States 1980 1.35 1981 2.30 1982 2.02 1983 1.35 1984 1.39 1.66 4.00 1985 1.22 1.86 3.66 1986 1.60 1.89 3.41 1987 1.79 1.59 3.48 1988 2.06 2.42 3.56 1989 1.67 2.02 3.69 1990 2.17 1.86 3.78 1991 2.56 2.28 4.13 3.87 1992 1.90 1.82 4.05 3.91 1993 1.21 1.24 4.33 3.86 1994 1.41 1.73 3.81 3.90 1995 1.66 2.12 2.97 4.03 1996 1.94 2.94 3.32 4.14 1997 1.72 2.33 3.47 3.98 1998 1.36 1.72 3.36 4.04

Source: USDA

For those countries for which farm level prices were not available we used a farm price index derived from the export unit value, deflated appropriately. We used the dollar deflator to convert prices to constant US dollars. The deflator used is given in Table B.2. No attempt was made to construct a consumption price because of the lack of suitable data; as a result, no price variable enters the estimated demand functions.

### Table B.2 US dollar deflator

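| Year | Deflator | Year | Deflator | Year | Deflator |
|------|----------|------|----------|------|----------|
| 1970 | 0.295 | 1980 | 0.581 | 1990 | 0.881 |
| 1971 | 0.310 | 1981 | 0.636 | 1991 | 0.913 |
| 1972 | 0.323 | 1982 | 0.675 | 1992 | 0.936 |
| 1973 | 0.341 | 1983 | 0.701 | 1993 | 0.958 |
| 1974 | 0.372 | 1984 | 0.728 | 1994 | 0.978 |
| 1975 | 0.407 | 1985 | 0.751 | 1995 | 1.000 |
| 1976 | 0.430 | 1986 | 0.767 | 1996 | 1.020 |
| 1977 | 0.458 | 1987 | 0.790 | 1997 | 1.038 |
| 1978 | 0.491 | 1988 | 0.817 | 1998 | 1.050 |
| 1979 | 0.532 | 1989 | 0.848 | | |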

Source: OECD
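The deflation step can be sketched as follows, assuming the conventional operation of dividing a nominal value by the deflator for the same year (the deflator equals 1.000 in 1995, so results are in 1995 dollars). The deflator values are taken from Table B.2; the nominal unit value and the function name are hypothetical.

```python
# Selected deflator values from Table B.2 (1995 = 1.000)
DEFLATOR = {1990: 0.881, 1995: 1.000, 1998: 1.050}


def to_constant_dollars(nominal_value, year, deflator=DEFLATOR):
    """Convert a nominal US$ value to constant (1995) US$.

    Dividing by the index is the assumed convention; the annex only
    says that prices were deflated with the dollar deflator.
    """
    return nominal_value / deflator[year]


# Hypothetical nominal export unit value of 2.10 US$/kg in 1998
real_unit_value = to_constant_dollars(2.10, 1998)
print(round(real_unit_value, 2))
```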

## 4. THE METHODOLOGY OF PROJECTIONS

The projections were obtained following a methodology that essentially included three stages. The first stage included the construction of the model, the second stage included the calibration and validation of the model and the third stage included the simulations of the model that produced the projections. The endeavour was to base the projections as much as possible on an empirically estimated commodity model and on official forecasts for the exogenous variables for the simulations for the period up to 2010.

Undertaking an empirical estimation of the commodity model, however, for 164 countries would have been not only an impossible task, but also unnecessary. Thus, the model was estimated for the eight major countries that produce more than 80 percent of world tobacco leaf production and to a large extent are major consumers as well. The remaining countries were then aggregated into regional groups.

Thus, in the first stage, production, demand, export and import equations were estimated for the major producing countries: China, India, Brazil, the United States, the EU, Turkey, Zimbabwe and Malawi. In addition, the same set of equations was estimated for the following groups of countries: other Europe, area of the former USSR, Oceania, other developed, other Africa, other Latin America, other Near East and other Far East. The specification of the model is given in section 5 and the estimation results using data for the period 1970-1998 are provided in section 6.

In the second stage the simulation model was constructed in an Excel spreadsheet covering all 164 countries of the standard list and all regional groupings. The model consists of five equations in double log form, i.e., demand, area, yield, exports and imports. The three-year average 1997-1999 (“1998”) was used as the base year of the model. Because the model was estimated in double log form, the estimated parameters can be interpreted directly as elasticities, which are constant throughout the data range and the projection period. These parameters were incorporated into the model.
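To see why double log coefficients can be read as constant elasticities, consider a generic equation with a single regressor. The symbols $y$, $x$, $a$ and $b$ are illustrative and do not come from the model listing:

$$
\ln y_t = a + b \ln x_t
\quad\Longrightarrow\quad
\frac{\partial \ln y_t}{\partial \ln x_t} = b,
\qquad
\frac{y_t}{y_0} = \left(\frac{x_t}{x_0}\right)^{b}.
$$

Under this form a 1 percent change in $x$ changes $y$ by approximately $b$ percent at every point of the data range, which is what allows the same parameter to be carried from the estimation period into the projection period.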

In a limited number of cases the estimated parameters did not conform with a priori expectations from theory. One major constraint on improving the estimation results was that the functional forms were kept the same across countries to facilitate the construction of the model in a spreadsheet. Parameters that had the wrong sign or an implausible magnitude were replaced with values consistent with evidence available in the literature.

The most important of these cases is the lack of a price-related parameter in the estimated demand function. The inability to construct a consumption price for each country forced us to estimate simple Engel curves and to introduce price elasticities from the literature directly as price parameters. Such elasticities could not be found at the country level and were assumed, from the evidence available (Zhang, 2000), to be -0.30 for developed countries and -0.80 for developing countries.
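A minimal sketch of how an estimated income elasticity and an imposed price elasticity might be combined in a demand projection is given below. The two price elasticities are those quoted above; the function name, the ratio form of the calculation and the example figures are assumptions for illustration, not the spreadsheet's actual formulas.

```python
# Price elasticities assumed in the text (Zhang, 2000)
PRICE_ELASTICITY = {"developed": -0.30, "developing": -0.80}


def project_demand(base_demand, income_elasticity, group,
                   income_ratio, price_ratio, population_ratio):
    """Project total demand from a base-year level.

    income_ratio, price_ratio and population_ratio are projected values
    divided by base-year values; income is per capita expenditure.
    Per capita demand follows a constant-elasticity (double log) form.
    """
    per_capita_ratio = (income_ratio ** income_elasticity
                        * price_ratio ** PRICE_ELASTICITY[group])
    return base_demand * per_capita_ratio * population_ratio


# Hypothetical case: 10 percent real price rise, nothing else changes
developed = project_demand(100.0, 0.5, "developed", 1.0, 1.10, 1.0)
developing = project_demand(100.0, 0.5, "developing", 1.0, 1.10, 1.0)
print(round(developed, 2), round(developing, 2))
```

With the larger absolute elasticity, the same price rise reduces projected demand more in the developing-country case.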

The model was calibrated to the base year (“1998”) by adjusting the intercept.
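Calibration by intercept adjustment can be sketched as solving for the constant that makes a double log equation reproduce the base-year value exactly, with the elasticities held fixed. The driver names, elasticities and base-year figures below are hypothetical.

```python
import math


def calibrate_intercept(base_value, base_drivers, elasticities):
    """Intercept a such that a + sum(b_k * ln x_k) equals ln(base_value)."""
    fitted = sum(elasticities[k] * math.log(base_drivers[k]) for k in elasticities)
    return math.log(base_value) - fitted


def predict(intercept, drivers, elasticities):
    """Level prediction from a double log equation."""
    log_value = intercept + sum(
        elasticities[k] * math.log(drivers[k]) for k in elasticities
    )
    return math.exp(log_value)


# Hypothetical base-year ("1998") data and elasticities
elasticities = {"income": 0.5, "trend": -0.1}
base_drivers = {"income": 1200.0, "trend": 29.0}
base_value = 250.0

intercept = calibrate_intercept(base_value, base_drivers, elasticities)
reproduced = predict(intercept, base_drivers, elasticities)
```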

The model was also validated by examining its predictive ability for the period 1992-1997. The validation results show that variability at the country level was high and that predictive ability at the country level was rather poor, though adequate. One reason is that, except for the eight major countries, the parameters are estimated from regional data and thus may not reflect country-level conditions. This implies that country-level projections may not capture all country details and peculiarities. At the regional level (such as Africa, Latin America, etc.) or the world level, however, the predictive ability is very good because country-level projection inaccuracies average out.
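The averaging-out effect can be illustrated with a small, entirely hypothetical example in which country-level errors have opposite signs. The error measure (absolute percentage error) and all numbers are assumptions; the annex does not report which validation statistic was used.

```python
def abs_pct_error(actual, predicted):
    """Absolute percentage error of a single prediction."""
    return abs(predicted - actual) / actual * 100.0


# Hypothetical actual and predicted values for three countries in one region
actual = {"A": 100.0, "B": 200.0, "C": 50.0}
predicted = {"A": 120.0, "B": 180.0, "C": 50.0}

country_errors = {c: abs_pct_error(actual[c], predicted[c]) for c in actual}
regional_error = abs_pct_error(sum(actual.values()), sum(predicted.values()))
```

Here the over-prediction for country A offsets the under-prediction for country B, so the regional total is reproduced even though two of the three country predictions are off.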

In the third stage, the model constructed in the first stage and calibrated in the second stage was used to simulate the period 1999-2010. The projections were obtained for each country of the standard country list, using the constructed model and official forecasts for the exogenous variables, such as GDP and population for each country.

In addition, the results were examined for consistency with a priori knowledge about the various countries and their production, demand and trade conditions. Such country knowledge was incorporated into the projection results using judgement and discretion for countries for which country level studies were available, such as China, Turkey, India, Zimbabwe and Malawi, and also for the EU and the United States, for which some general information is available from other sources. The consistency of the projection results was examined mainly by comparing the net trade position implied by projected production and demand for the main producing and trading countries with the results of the independently projected exports and imports.
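The net trade comparison described above can be sketched as below. The sketch assumes that all quantities are on the same weight basis and that stock changes are negligible; the tolerance threshold and the figures are hypothetical.

```python
def net_trade_gap(production, demand, exports, imports):
    """Net exports implied by production and demand, minus net exports
    from the independently projected trade equations."""
    implied_net_exports = production - demand
    projected_net_exports = exports - imports
    return implied_net_exports - projected_net_exports


def is_consistent(production, demand, exports, imports, tolerance=0.05):
    """True if the gap is within a share of production (assumed threshold)."""
    gap = net_trade_gap(production, demand, exports, imports)
    return abs(gap) <= tolerance * production


# Hypothetical projections (thousand tonnes)
print(net_trade_gap(500.0, 300.0, 220.0, 20.0))
print(is_consistent(500.0, 300.0, 260.0, 20.0))
```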

## 5. DATA SOURCES AND DEFINITIONS USED IN THE ESTIMATION OF THE COMMODITY MODEL

The model covers all major countries of the world. The standard country list used for the study is given in Annex A. It includes 164 countries grouped into developed and developing countries and further into the same regional groups as used in the analysis of past trends in Chapter 2. China is sometimes singled out because of its size.

Production data include tobacco leaf production in farm weight, as well as the area under tobacco in hectares and yield in farm weight for all countries of the world for the period 1970 to 2000. There are also detailed data by type of tobacco, such as Virginia, burley, etc. Consumption data are in dry weight for all countries of the standard country list. Similarly, there are export and import data (quantity and value) in dry weight for each country in the country list for the period 1970-1999.

Consumption and trade data are aggregate data providing no distinction between the various types of tobacco. Tobacco leaf consumption data at the country level are derived using a supply utilization account and data on stocks, production, imports and exports. Consumption data do not include 'on farm' consumption of home produced tobacco, which is believed to account for a substantial part of consumption in many developing countries.
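A supply utilization account of the kind mentioned above derives consumption as a residual. The sketch below uses the usual accounting identity, with all items assumed to be on the same weight basis; the function name and the figures are hypothetical.

```python
def apparent_consumption(production, imports, exports,
                         opening_stocks, ending_stocks):
    """Consumption as the residual of a supply utilization account:
    production + imports - exports - change in stocks."""
    stock_change = ending_stocks - opening_stocks
    return production + imports - exports - stock_change


# Hypothetical country-year (thousand tonnes)
consumption = apparent_consumption(
    production=100.0, imports=30.0, exports=40.0,
    opening_stocks=10.0, ending_stocks=15.0,
)
print(consumption)
```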

Data on cigarette production and consumption exist for a number of countries, but they are incomplete. Therefore the analysis did not include cigarette demand and supply, but consumption of all tobacco products is translated into leaf equivalent. Thus, the analysis includes only one level (tobacco leaf).

The major deficiency of the data set and the study is the lack of price data. Price data at the farm level and for consumption are very scarce. Several reasons explain this deficiency. First, tobacco is a differentiated commodity with various tobacco types produced and traded. Countries and manufacturers buy the particular types they need to achieve the desired blend for the cigarettes they produce, and therefore they may appear as both importers and exporters. Each price corresponds to a particular tobacco type, and prices of different tobacco types differ widely. Thus, there is no homogeneous commodity called tobacco but various tobacco types traded internationally at prices that differ substantially.

Furthermore, there are substantial quality differences within each tobacco type. In addition, in some leaf producing countries, such as Brazil, manufacturers operate by direct contract, so price data are not available. Producers in countries such as Zimbabwe and Malawi sell in auction markets and there are good price data for these countries. For Turkey there are also good price data, maintained by the national tobacco monopoly. For the United States and the EU price data were obtained from other sources. For some other countries no price data are available for either production or consumption.

## 6. ESTIMATION RESULTS

The estimation results used for deriving the parameters to construct the commodity model are presented in this section. The major problem in the estimation was the lack of price data. Because of lack of data for consumption, the demand equation was estimated as a simple Engel curve. The prices used in the area equation are the deflated tobacco average prices, while for the export supply and import demand functions we used the deflated export unit value and deflated import unit value, respectively. SSR is the self-sufficiency ratio and is the quotient of production over consumption.

The t-statistic of each estimated parameter is reported in brackets. The data cover the period 1970 to 1998 and the model was estimated by OLS.
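For readers who want to see how a coefficient, its t-statistic and R2 arise in the simplest of these equations, the sketch below fits a yield-type equation, lnY = a + b*lnT, by OLS using closed-form formulas. The data are synthetic, with a trend index running from 1 to 29 assumed to stand for 1970-1998, and the function name is hypothetical; none of the numbers reproduce the annex's results.

```python
import math


def ols_simple(x, y):
    """OLS for y = a + b*x with t-statistics and R2 (single regressor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    s2 = sse / (n - 2)                       # residual variance
    se_b = math.sqrt(s2 / sxx)
    se_a = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return {"a": a, "b": b, "t_a": a / se_a, "t_b": b / se_b,
            "r2": 1.0 - sse / sst}


# Synthetic data: trend index 1..29 with a small alternating disturbance
trend = list(range(1, 30))
ln_t = [math.log(t) for t in trend]
ln_yield = [0.2 + 0.15 * lt + 0.02 * (-1) ** i for i, lt in enumerate(ln_t)]

fit = ols_simple(ln_t, ln_yield)
```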

Given the uniform specification used and the lack of long time series for certain variables, the results are generally considered adequate. The statistical fit was generally good, although in some cases, notably the yield equation, it is low. Yields, however, respond to uncertain factors such as weather conditions and other short term influences. For the purpose of this task the long term trend of yields is adequate, and therefore the yield equations include only a time trend and no price response.

The statistical significance of the estimated parameters was generally adequate, although there are several cases of low statistical significance or, worse, of wrong signs. We chose to report the results as they have been obtained to inform the reader. Some adjustments, however, were necessary as is generally the case for the construction of a model in order that it produces meaningful and operational results.

Some of the estimated import and export functions were problematic, in particular in cases where not much activity for imports or exports exists for the particular country. Again, in these cases some adjustments were made in the model so that parameters conform with theory, as is usually the case in a modelling exercise.

The statistical results for each country and for each group of countries are presented below.

United States

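- Area: $\ln A_t = \underset{(1.98)}{4.661} + \underset{(1.415)}{0.303}\,\ln A_{t-1} + \underset{(2.176)}{1.333}\,\ln P_t + \underset{(2.512)}{0.699}\,\ln$, $R^2 = 0.59$
- Yield: $\ln Y = \underset{(2.4.35)}{0.807} + \underset{(1.137)}{0.014}\,\ln T$, $R^2 = 0.04$
- Demand: $\ln(D/N) = \underset{(-9.196)}{3.830} - \underset{(-4.727)}{0.683}\,\ln(E/N)$, $R^2 = 0.45$
- Exports: $\ln X_t = \underset{(4.518)}{4.577} + \underset{(1.183)}{0.208}\,\ln X_{t-1} + \underset{(1.328)}{0.249}\,\ln(\mathrm{SSR}) + \underset{(2.461)}{0.433}\,\ln(P_w/P_x)$, $R^2 = 0.61$
- Imports: $\ln M_t = \underset{(2.286)}{1.901} + \underset{(5.026)}{0.668}\,\ln M_{t-1} - \underset{(-0.516)}{0.209}\,\ln(\mathrm{SSR}) - \underset{(-0.378)}{0.086}\,\ln P_M$, $R^2 = 0.62$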

EU

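- Area: $\ln A_t = \underset{(0.447)}{0.395} + \underset{(10.49)}{0.859}\,\ln A_{t-1} + \underset{(3.579)}{0.220}\,\ln P_x$, $R^2 = 0.89$
- Yield: $\ln Y = \underset{(3.149)}{0.195} + \underset{(6.920)}{0.165}\,\ln T$, $R^2 = 0.63$
- Demand: $\ln(D/N) = \underset{(-30.919)}{-7.200} + \underset{(4.092)}{0.383}\,\ln(E/N) + \underset{(5.061)}{0.197}\,\text{Dummy1} - \underset{(-0.928)}{0.03}\,\text{Dummy2}$, $R^2 = 0.74$
- Exports: $\ln X_t = \underset{(1.132)}{0.505} + \underset{(9.124)}{0.803}\,\ln X_{t-1} - \underset{(-1.804)}{0.673}\,\ln(\mathrm{SSR}) + \underset{(1.962)}{0.415}\,\ln(P_w/P_x)$, $R^2 = 0.88$
- Imports: $\ln M_t = \underset{(3.296)}{3.263} + \underset{(4.184)}{0.526}\,\ln M_{t-1} - \underset{(-1.902)}{0.438}\,\ln(\mathrm{SSR}) - \underset{(-2.832)}{0.322}\,\ln P_M$, $R^2 = 0.81$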

Other Europe

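- Area: $\ln A_t = \underset{(1.388)}{0.873} + \underset{(17.429)}{0.930}\,\ln A_{t-1} + \underset{(0.268)}{0.038}\,\ln P_x - \underset{(-1.124)}{0.029}\,\ln T - \underset{(-5.027)}{0.25}\,\text{Dummy}$, $R^2 = 0.96$
- Yield: $\ln Y = \underset{(0.852)}{0.059} + \underset{(2.072)}{0.055}\,\ln T$, $R^2 = 0.13$
- Demand: $\ln(D/N) = \underset{(-11.139)}{7.243} + \underset{(2.240)}{1.360}\,\ln(E/N) - \underset{(-2.820)}{0.236}\,\ln T$, $R^2 = 0.26$
- Exports: $\ln X_t = \underset{(8.639)}{4.142} + \underset{(1.060)}{0.107}\,\ln X_{t-1} + \underset{(4.995)}{0.665}\,\ln(\mathrm{SSR}) + \underset{(4.062)}{0.745}\,\ln(P_w/P_x)$, $R^2 = 0.88$
- Imports: $\ln M_t = \underset{(4.194)}{2.409} + \underset{(5.147)}{0.556}\,\ln M_{t-1} - \underset{(-3.532)}{0.342}\,\ln(\mathrm{SSR}) - \underset{(-2.421)}{0.238}\,\ln P_M$, $R^2 = 0.90$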

Area of the former USSR

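- Area: $\ln A_t = \underset{(5.205)}{5.708} + \underset{(5.079)}{0.486}\,\ln A_{t-1} + \underset{(6.196)}{0.337}\,\ln P_x$, $R^2 = 0.92$
- Yield: $\ln Y = \underset{(5.402)}{0.423} + \underset{(0.345)}{0.010}\,\ln T$, $R^2 = 0.004$
- Demand: $\ln(D/N) = \underset{(-46.128)}{-6.111} + \underset{(2.373)}{0.545}\,\ln(E/N) - \underset{(-4.399)}{0.29}\,\ln T$, $R^2 = 0.44$
- Exports: $\ln X_t = \underset{(1.726)}{0.313} + \underset{(1.967)}{0.227}\,\ln X_{t-1} - \underset{(-0.363)}{0.226}\,\ln(\mathrm{SSR}) + \underset{(8.456)}{2.329}\,\ln(P_w/P_x)$, $R^2 = 0.91$
- Imports: $\ln M_t = \underset{(3.907)}{2.397} + \underset{(5.854)}{0.682}\,\ln M_{t-1} - \underset{(-0.821)}{0.195}\,\ln(\mathrm{SSR}) - \underset{(-4.387)}{0.528}\,\ln P_M - \underset{(-2.323)}{0.019}\,T$, $R^2 = 0.87$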

Other developed

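- Area: $\ln A_t = \underset{(1.925)}{1.378} + \underset{(12.554)}{0.855}\,\ln A_{t-1} + \underset{(2.468)}{0.196}\,\ln P_x$, $R^2 = 0.95$
- Yield: $\ln Y = \underset{(0.849)}{0.059} + \underset{(2.073)}{0.056}\,\ln T$, $R^2 = 0.13$
- Demand: $\ln(D/N) = \underset{(-9.853)}{-3.114} - \underset{(-8.925)}{1.315}\,\ln(E/N) + \underset{(3.358)}{0.114}\,\ln T$, $R^2 = 0.91$
- Exports: $\ln X_t = \underset{(3.675)}{2.88} + \underset{(1.356)}{0.266}\,\ln X_{t-1} - \underset{(-1.696)}{0.638}\,\ln(\mathrm{SSR}) + \underset{(0.537)}{0.162}\,\ln(P_w/P_x) - \underset{(-1.746)}{0.18}\,\ln T$, $R^2 = 0.25$
- Imports: $\ln M_t = \underset{(3.481)}{2.372} + \underset{(5.002)}{0.613}\,\ln M_{t-1} - \underset{(-0.344)}{0.086}\,\ln(\mathrm{SSR}) - \underset{(-1.365)}{0.291}\,\ln P_M$, $R^2 = 0.75$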

Zimbabwe

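- Area: $\ln A_t = \underset{(0.620)}{1.105} + \underset{(5.891)}{0.903}\,\ln A_{t-1} + \underset{(0.138)}{0.023}\,\ln P_t$, $R^2 = 0.80$
- Yield: $\ln Y = \underset{(0.753)}{0.046} + \underset{(10.425)}{0.246}\,\ln T$, $R^2 = 0.80$
- Demand: $\ln(D/N) = \underset{(-19.916)}{-5.868} + \underset{(1.361)}{1.29}\,\ln(E/N) + \underset{(1.844)}{0.122}\,\ln T - \underset{(-3.589)}{0.572}\,\text{Dummy1}$, $R^2 = 0.39$
- Exports: $\ln X_t = \underset{(0.862)}{0.491} + \underset{(8.136)}{0.918}\,\ln X_{t-1} - \underset{(-0.479)}{0.071}\,\ln(\mathrm{SSR}) + \underset{(0.836)}{0.244}\,\ln(P_w/P_x)$, $R^2 = 0.78$
- Imports: $\ln M_t = \underset{(0.919)}{1.380} + \underset{(1.864)}{0.529}\,\ln M_{t-1} - \underset{(-0.748)}{0.609}\,\ln(\mathrm{SSR}) + \underset{(0.742)}{0.338}\,\ln P_M$, $R^2 = 0.42$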

Malawi

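- Area: $\ln A_t = \underset{(2.897)}{8.722} + \underset{(1.048)}{0.268}\,\ln A_{t-1} - \underset{(-2.563)}{0.482}\,\ln P_t + \underset{(2.370)}{0.306}\,\text{Dummy1} - \underset{(-1.641)}{0.218}\,\text{Dummy2}$, $R^2 = 0.65$
- Yield: $\ln Y = \underset{(-9.933)}{-0.891} + \underset{(8.063)}{0.278}\,\ln T$, $R^2 = 0.70$
- Demand: $\ln(D/N) = \underset{(-2.595)}{-5.863} + \underset{(0.888)}{1.034}\,\ln(E/N) + \underset{(4.165)}{0.550}\,\ln T$, $R^2 = 0.43$
- Exports: $\ln X_t = \underset{(-0.122)}{-0.043} + \underset{(12.479)}{0.936}\,\ln X_{t-1} + \underset{(2.877)}{0.182}\,\ln(\mathrm{SSR}) + \underset{(0.633)}{0.145}\,\ln(P_w/P_x)$, $R^2 = 0.89$
- Imports: $\ln M_t = \underset{(0.703)}{0.433} + \underset{(3.885)}{0.937}\,\ln M_{t-1} - \underset{(-0.097)}{0.024}\,\ln(\mathrm{SSR}) - \underset{(-1.015)}{0.482}\,\ln P_M$, $R^2 = 0.61$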

Other Africa

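- Area: $\ln A_t = \underset{(1.332)}{2.214} + \underset{(5.847)}{0.813}\,\ln A_{t-1} + \underset{(0.120)}{0.004}\,\ln P_x$, $R^2 = 0.60$
- Yield: $\ln Y = \underset{(-15.271)}{-0.568} + \underset{(5.780)}{0.082}\,\ln T$, $R^2 = 0.55$
- Demand: $\ln(D/N) = \underset{(-22.908)}{-6.962} + \underset{(2.244)}{0.991}\,\ln(E/N) - \underset{(-7.873)}{0.162}\,\ln T$, $R^2 = 0.70$
- Exports: $\ln X_t = \underset{(4.092)}{1.937} + \underset{(2.873)}{0.326}\,\ln X_{t-1} + \underset{(3.014)}{1.210}\,\ln(\mathrm{SSR}) + \underset{(5.708)}{0.985}\,\ln(P_w/P_x)$, $R^2 = 0.79$
- Imports: $\ln M_t = \underset{(7.783)}{4.283} - \underset{(-0.734)}{0.093}\,\ln M_{t-1} - \underset{(-4.627)}{0.842}\,\ln(\mathrm{SSR}) - \underset{(-1.296)}{0.195}\,\ln P_M$, $R^2 = 0.49$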

Brazil

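- Area: $\ln A_t = \underset{(1.972)}{3.066} + \underset{(6.002)}{0.751}\,\ln A_{t-1} + \underset{(1.348)}{0.066}\,\ln P_x$, $R^2 = 0.66$
- Yield: $\ln Y = \underset{(-3.341)}{-0.159} + \underset{(10.084)}{0.185}\,\ln T$, $R^2 = 0.79$
- Demand: $\ln(D/N) = \underset{(-57.909)}{-6.502} - \underset{(1.426)}{0.256}\,\ln(E/N)$, $R^2 = 0.07$
- Exports: $\ln X_t = \underset{(2.154)}{1.566} + \underset{(4.046)}{0.569}\,\ln X_{t-1} + \underset{(5.817)}{1.309}\,\ln(\mathrm{SSR}_t) - \underset{(-0.853)}{0.167}\,\ln(P_w/P_x)$, $R^2 = 0.85$
- Imports: $\ln M_t = \underset{(0.284)}{0.592} - \underset{(-0.362)}{0.126}\,\ln M_{t-1} - \underset{(-0.326)}{1.004}\,\ln(\mathrm{SSR}_t) - \underset{(-1.171)}{0.639}\,\ln P_M$, $R^2 = 0.18$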

Other Latin America

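- Area: $\ln A_t = \underset{(3.201)}{5.779} + \underset{(3.782)}{0.542}\,\ln A_{t-1} + \underset{(2.147)}{0.197}\,\ln P_x - \underset{(-2.957)}{0.109}\,\ln T$, $R^2 = 0.85$
- Yield: $\ln Y = \underset{(0.782)}{0.026} + \underset{(6.364)}{0.084}\,\ln T$, $R^2 = 0.60$
- Demand: $\ln(D/N) = \underset{(-21.414)}{-6.782} + \underset{(1.279)}{0.824}\,\ln(E/N) - \underset{(-6.034)}{0.294}\,\ln T$, $R^2 = 0.72$
- Exports: $\ln X_t = \underset{(4.869)}{3.135} + \underset{(1.960)}{0.281}\,\ln X_{t-1} + \underset{(0.370)}{0.141}\,\ln(\mathrm{SSR}) + \underset{(4.566)}{0.496}\,\ln(P_w/P_x)$, $R^2 = 0.58$
- Imports: $\ln M_t = \underset{(3.865)}{2.666} + \underset{(3.077)}{0.433}\,\ln M_{t-1} + \underset{(0.542)}{0.479}\,\ln(\mathrm{SSR}) - \underset{(-2.824)}{0.744}\,\ln P_M$, $R^2 = 0.58$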

Turkey

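- Area: $\ln A_t = \underset{(9.165)}{13.783} - \underset{(-1.222)}{0.159}\,\ln A_{t-1} + \underset{(3.329)}{0.528}\,\ln P_t + \underset{(3.448)}{0.174}\,\text{Dummy2} - \underset{(-1.846)}{0.108}\,\text{Dummy3}$, $R^2 = 0.97$
- Yield: $\ln Y = \underset{(-8.445)}{-0.716} + \underset{(6.969)}{0.227}\,\ln T$, $R^2 = 0.64$
- Demand: $\ln(D/N) = \underset{(-63.852)}{-6.043} + \underset{(1.339)}{0.286}\,\ln(E/N) - \underset{(-1.734)}{0.151}\,\text{Dummy1}$, $R^2 = 0.13$
- Exports: $\ln X_t = \underset{(3.173)}{2.913} + \underset{(1.833)}{0.365}\,\ln X_{t-1} - \underset{(-0.086)}{0.019}\,\ln(\mathrm{SSR}) + \underset{(0.454)}{0.143}\,\ln(P_w/P_x)$, $R^2 = 0.12$
- Imports: $\ln M_t = \underset{(2.721)}{7.863} + \underset{(3.919)}{0.480}\,\ln M_{t-1} + \underset{(1.062)}{0.670}\,\ln(\mathrm{SSR}) - \underset{(-2.240)}{3.510}\,\ln P_M$, $R^2 = 0.85$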

Other Near East

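- Area: $\ln A_t = \underset{(1.465)}{2.589} + \underset{(4.471)}{0.756}\,\ln A_{t-1} + \underset{(0.584)}{0.042}\,\ln P_x$, $R^2 = 0.69$
- Yield: $\ln Y = \underset{(-3.323)}{-0.245} + \underset{(4.752)}{0.135}\,\ln T$, $R^2 = 0.45$
- Demand: $\ln(D/N) = \underset{(-36.879)}{-8.793} + \underset{(2.340)}{0.691}\,\ln(E/N)$, $R^2 = 0.16$
- Exports: $\ln X_t = \underset{(3.517)}{1.768} + \underset{(1.547)}{0.275}\,\ln X_{t-1} + \underset{(1.894)}{0.832}\,\ln(\mathrm{SSR}) + \underset{(1.809)}{0.544}\,\ln(P_w/P_x)$, $R^2 = 0.36$
- Imports: $\ln M_t = \underset{(2.455)}{0.783} + \underset{(10.217)}{0.761}\,\ln M_{t-1} - \underset{(-2.386)}{0.291}\,\ln(\mathrm{SSR}) + \underset{(0.190)}{0.016}\,\ln P_M$, $R^2 = 0.92$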

China

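- Area: $\ln A_t = \underset{(2.246)}{3.602} + \underset{(7.076)}{0.759}\,\ln A_{t-1} - \underset{(-1.480)}{0.311}\,\ln P_x$, $R^2 = 0.85$
- Yield: $\ln Y = \underset{(9.225)}{0.893} - \underset{(-3.470)}{0.115}\,\ln T - \underset{(-0.881)}{0.070}\,\text{Dummy1} - \underset{(-4.771)}{0.214}\,\text{Dummy2}$, $R^2 = 0.66$
- Demand: $\ln(D/N) = \underset{(-53.692)}{-5.553} + \underset{(9.089)}{0.549}\,\ln(E/N)$, $R^2 = 0.75$
- Exports: $\ln X_t = \underset{(0.655)}{0.294} + \underset{(8.809)}{0.908}\,\ln X_{t-1} + \underset{(1.555)}{0.720}\,\ln(\mathrm{SSR}) + \underset{(0.346)}{0.108}\,\ln(P_w/P_x)$, $R^2 = 0.76$
- Imports: $\ln M_t = \underset{(1.120)}{0.523} + \underset{(5.291)}{0.818}\,\ln M_{t-1} + \underset{(1.780)}{1.322}\,\ln(\mathrm{SSR}) - \underset{(-0.181)}{0.020}\,\ln P_M$, $R^2 = 0.61$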

India

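- Area: $\ln A_t = \underset{(50.235)}{12.601} + \underset{(1.788)}{0}\,\ln A_{t-1} + \underset{(0.242)}{0.016}\,\ln P_x + \underset{(0.482)}{0.028}\,\text{Dummy1} - \underset{(-0.473)}{0.017}\,\ln T$, $R^2 = 0.22$
- Yield: $\ln Y = \underset{(-9.491)}{-0.381} + \underset{(13.237)}{0.205}\,\ln T$, $R^2 = 0.86$
- Demand: $\ln(D/N) = \underset{(-116.50)}{-7.400} + \underset{(2.422)}{0.139}\,\ln(E/N)$, $R^2 = 0.18$
- Exports: $\ln X_t = \underset{(5.450)}{3.745} + \underset{(0.161)}{0.026}\,\ln X_{t-1} + \underset{(2.887)}{1.247}\,\ln(\mathrm{SSR}) + \underset{(1.911)}{0.359}\,\ln(P_w/P_x)$, $R^2 = 0.31$
- Imports: $\ln M_t = \underset{(-0.723)}{-1.924} - \underset{(-0.055)}{0.057}\,\ln M_{t-1} - \underset{(-0.009)}{0.042}\,\ln(\mathrm{SSR}) - \underset{(-0.286)}{0.154}\,\ln P_M$, $R^2 = 0.16$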

Other Far East

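- Area: $\ln A_t = \underset{(2.961)}{6.325} + \underset{(3.116)}{0.510}\,\ln A_{t-1} + \underset{(2.105)}{0.205}\,\ln P_x$, $R^2 = 0.55$
- Yield: $\ln Y = \underset{(-4.408)}{-0.254} + \underset{(4.795)}{0.106}\,\ln T$, $R^2 = 0.45$
- Demand: $\ln(D/N) = \underset{(-56.629)}{-6.771} + \underset{(0.139)}{0.011}\,\ln(E/N) - \underset{(-2.269)}{0.085}\,\ln T$, $R^2 = 0.53$
- Exports: $\ln X_t = \underset{(5.238)}{4.711} - \underset{(-0.012)}{0.002}\,\ln X_{t-1} + \underset{(3.383)}{1.220}\,\ln(\mathrm{SSR}) + \underset{(0.686)}{0.143}\,\ln(P_w/P_x)$, $R^2 = 0.56$
- Imports: $\ln M_t = \underset{(1.802)}{1.567} + \underset{(5.799)}{0.721}\,\ln M_{t-1} - \underset{(-1.777)}{0.699}\,\ln(\mathrm{SSR}) - \underset{(-0.560)}{0.160}\,\ln P_M$, $R^2 = 0.76$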

## 7. ADDITIONAL ESTIMATION RESULTS

An effort was made to estimate the required parameters for some other large countries that have an important position in either world tobacco production or consumption, such as Indonesia, Egypt, Algeria, etc. The results, however, were not promising, and the model was maintained in its form with the parameters shown in section 6 above.