Using Machine Learning to Capture Heterogeneity in Trade Agreements

  • Research Article
  • Open Economies Review

Abstract

This paper uses machine learning techniques to capture heterogeneity in free trade agreements. The tools of machine learning allow us to quantify several features of trade agreements, including volume, comprehensiveness, and legal enforceability. Combining machine learning results with gravity analysis of trade, we find that more comprehensive agreements result in larger estimates of the impact of trade agreements. In addition, we identify the policy provisions that have the most substantial effect on creating trade flows. In particular, legally binding provisions on antidumping, capital mobility, competition, customs harmonization, dispute settlement mechanism, e-commerce, environment, export and import restrictions, freedom of transit, investment, investor-state dispute settlement, labor, public procurement, sanitary and phytosanitary measures, services, technical barriers to trade, telecommunications, and transparency tend to have the largest trade creation effects.

Data Availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Notes

  1. We recognize that this implies that the choice and complexity of provisions are endogenous. However, the theoretical and empirical determination of which provisions are included in a trade agreement is beyond the scope of this paper.

  2. See Ding and He (2004) for the similarities between K-means clustering and principal component analysis.

  3. These data are similar to the database used in the Kohl, Brakman, and Garretsen (2016) (KBG) study. The main difference is that these data cover more periods. For each agreement, there are hyperlinks to associated PDF documents that contain the terms of the agreement or modifications to the agreement.

  4. Appendix A lists all the countries used in the empirical analysis.

  5. Appendix B lists the bilateral agreements for each pair and the years in force.

  6. In the robustness section, we repeat the analysis from the main body of this paper after removing extremely rare and extremely common words and two-word phrases; the main results still hold.

  7. Appendix C discusses the methods in detail.

  8. Although we assume that there exist K centers, we will eventually update K on the basis of the value of our loss function and the application at hand.

  9. We could also use seven clusters, as indicated by the elbow method; however, as we will discuss, four and five clusters provide a cleaner economic interpretation.

  10. Although this paper reports results for positive trade flows only, we have also estimated the model by Poisson pseudo-maximum likelihood (PPML) with zero trade flows included. These results are available on request.
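
Notes 8 and 9 refer to choosing the number of clusters K from the value of the loss function (the elbow method). The following is a minimal sketch of that idea, not the authors' implementation: it assumes a plain Lloyd-style K-means rather than the Hartigan and Wong (1979) algorithm cited in the references, uses the within-cluster sum of squared distances as the loss, and runs on hypothetical two-dimensional toy points instead of tf-idf vectors of agreement texts.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd-style K-means. Returns (centers, labels, loss), where loss is
    the within-cluster sum of squared distances to the assigned center."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers at k distinct data points
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    loss = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, loss

# Hypothetical toy data: three well-separated groups of points.
points = [[0, 0], [0, 1], [1, 0],
          [10, 10], [10, 11], [11, 10],
          [20, 0], [20, 1], [21, 0]]

# Loss for each candidate K; the "elbow" is where the decline flattens.
losses = {k: kmeans(points, k)[2] for k in range(1, 6)}
for k, loss in losses.items():
    print(k, round(loss, 2))
```

Plotting `losses` against K and looking for the point where further increases in K yield little reduction gives the elbow; as Note 9 indicates, the final choice can also weigh how interpretable the resulting clusters are.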

References

  • Anderson JE (1979) A theoretical foundation for the gravity equation. Am Econ Rev 69:106–116

  • Anderson JE, Milot CA, Yotov YV (2011) The incidence of geography on Canada’s services trade. National Bureau of Economic Research, Cambridge, MA

  • Anderson JE, van Wincoop E (2003) Gravity with gravitas: a solution to the border puzzle. Am Econ Rev 93(1):170–192

  • Anderson JE, Yotov YV (2016) Terms of trade and global efficiency effects of free trade agreements, 1990–2002. J Int Econ 99:279–298

  • Baier SL, Bergstrand JH (2007) Do free trade agreements actually increase members’ international trade? J Int Econ 71(1):72–95

  • Baier SL, Bergstrand JH (2017) Economic Integration Agreements: Historical Database of Entry into Economic Integration Agreements, 1960–2000. Ann Arbor, MI. https://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/29762

  • Baier SL, Bergstrand JH, Clance MW (2018) Heterogeneous effects of economic integration agreements. J Dev Econ 135(C):587–608

  • Baier SL, Bergstrand JH, Feng M (2014) Economic integration agreements and the margins of international trade. J Int Econ 93(2):339–350

  • Baier SL, Yotov Y, Zylkin T (2019) On the widely differing effects of free trade agreements: lessons from twenty years of trade integration. J Int Econ 116(C):206–226

  • Bergstrand JH (1985) The gravity equation in international trade: some microeconomic foundations and empirical evidence. Rev Econ Stat 67(3):474–481

  • Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern Recogn 37(9):1757–1771

  • Breinlich H, Corradi V, Rocha N, Ruta M, Santos Silva JM, Zylkin T (2021) Machine learning in international trade research: evaluating the impact of trade agreements. Discussion Paper in Economics (DP 05/21), University of Surrey

  • Ding C, He X (2004) K-means clustering via principal component analysis. In: Proceedings of the Twenty-first International Conference on Machine Learning, p. 29

  • Eaton J, Kortum S (2002) Technology, geography, and trade. Econometrica 70(5):1741–1779

  • Egger P, Larch M, Staub KE, Winkelmann R (2011) The trade effects of endogenous preferential trade agreements. Am Econ J Econ Pol 3(3):113–143

  • Feenstra RC (2006) Advanced international trade: theory and evidence. Princeton University Press

  • Frankel J, Stein E, Wei S-J (1995) Trading Blocs and the Americas: the natural, the unnatural, and the super-natural. J Dev Econ 47(1):61–95

  • Frankel J, Stein E, Wei S-J (1997) Regional Trading Blocs in the World Economic System. Peterson Institute for International Economics, Washington, DC

  • Friedman J, Hastie T, Tibshirani R (2001) The Elements of Statistical Learning. Springer Series in Statistics. Springer, New York

  • Hartigan JA, Wong MA (1979) Algorithm AS 136: A K-means clustering algorithm. J R Stat Soc Ser C Appl Stat 28(1):100–108

  • Head K, Mayer T (2014) Gravity Equations: Workhorse, Toolkit, and Cookbook. In: Gopinath G, Helpman E, Rogoff K (eds) Handbook of International Economics, vol 4. Elsevier, Amsterdam, pp 131–195

  • Hofmann C, Osnago A, Ruta M (2019) The content of preferential trade agreements. World Trade Rev 18(3):365–398

  • Horn H, Mavroidis PC, Sapir A (2010) Beyond the WTO? An Anatomy of EU and US Preferential Trade Agreements. World Econ 33(11):1565–1588

  • Kohl T (2014) Do we really know that trade agreements increase trade? Rev World Econ 150(3):443–469

  • Kohl T, Brakman S, Garretsen H (2016) Do Trade Agreements Stimulate International Trade Differently? Evidence from 296 Trade Agreements. World Econ 39(1):97–131

  • Mattoo A, Mulabdic A, Ruta M (2022) Trade creation and trade diversion in deep trade agreements. Can J Econ 55(3):1598–1637

  • McLaughlin PA, Sherouse O (2016) QuantGov: A Policy Analytics Platform. QuantGov, October 31

  • Melitz MJ (2003) The impact of trade on intra-industry reallocations and aggregate industry productivity. Econometrica 71(6):1695–1725

  • Orefice G, Rocha N (2014) Deep integration and production networks: an empirical analysis. World Econ 37(1):106–136

  • Rosen H (2004) Free Trade Agreements as Foreign Policy Tools: The US-Israel and US-Jordan FTAs. In: Schott JJ (ed) Free Trade Agreements: US Strategies and Priorities. Peterson Institute for International Economics, Washington, DC, pp 51–77

  • Salton G, McGill MJ (1983) Introduction to Modern Information Retrieval. McGraw-Hill, New York

  • Santos Silva JMC, Tenreyro S (2006) The log of gravity. Rev Econ Stat 88(4):641–658

  • Tinbergen J (1962) Shaping the World Economy. Twentieth Century Fund, New York

Acknowledgements

We thank the Editor, George Tavlas, and an anonymous reviewer for excellent comments that have enhanced the paper significantly. We also thank Patrick McLaughlin, Oliver Sherouse, Robert Tamura, Gerald Dwyer, Michal Jerzmanowski, Steven Johnson, Samuel Standaert, Yamin Ahmad, and seminar participants at Clemson University, as well as participants at the 2019 Georgetown Center for Economic Research conference and the 12th Southeastern International/Development Economics Workshop. All remaining errors are ours.

Author information

Corresponding author

Correspondence to Scott L. Baier.

Ethics declarations

Competing Interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Countries Included in the Gravity Dataset

Albania, Angola, Argentina, Armenia, Australia, Austria, Azerbaijan, Bahamas, Bahrain, Bangladesh, Barbados, Belarus, Belgium, Benin, Bhutan, Bolivia, Bosnia and Herzegovina, Brazil, Brunei Darussalam, Bulgaria, Burkina Faso, Burundi, Cambodia, Cameroon, Canada, Cape Verde, Central African Republic, Chad, Chile, China, Colombia, Comoros, Congo, Costa Rica, Côte d’Ivoire, Croatia, Cyprus, Czech Republic, Democratic Republic of the Congo, Denmark, Djibouti, Dominica, Dominican Republic, Ecuador, Egypt, Equatorial Guinea, Estonia, Ethiopia, Fiji, Finland, France, Gabon, Gambia, Georgia, Germany, Ghana, Greece, Grenada, Guatemala, Guinea, Guinea-Bissau, Honduras, Hong Kong SAR (China), Hungary, Iceland, India, Indonesia, Iran, Iraq, Ireland, Israel, Italy, Jamaica, Japan, Jordan, Kazakhstan, Kenya, Korea, Kuwait, Kyrgyzstan, Lao People’s Democratic Republic, Latvia, Lebanon, Lesotho, Lithuania, Luxembourg, Macao, Macedonia, Madagascar, Malawi, Malaysia, Maldives, Mali, Malta, Mauritania, Mauritius, Mexico, Moldova, Mongolia, Morocco, Mozambique, Namibia, Nepal, Netherlands, New Zealand, Niger, Nigeria, Norway, Oman, Pakistan, Panama, Paraguay, Peru, Philippines, Poland, Portugal, Qatar, Romania, Russia, Rwanda, St. Kitts and Nevis, St. Lucia, St. Vincent and the Grenadines, São Tomé and Príncipe, Saudi Arabia, Senegal, Sierra Leone, Singapore, Slovakia, Slovenia, South Africa, Spain, Sri Lanka, Sudan, Suriname, Swaziland, Sweden, Switzerland, Syrian Arab Republic, Tajikistan, Tanzania, Thailand, Togo, Trinidad and Tobago, Tunisia, Turkey, Turkmenistan, Uganda, Ukraine, United Kingdom, United States, Uruguay, Uzbekistan, Venezuela, Vietnam, Yemen, Zambia, and Zimbabwe

Appendix B Trade Agreements by Year of Enforcement

Before 1970: European Economic Community (EEC) (1957), European Community (1958), European Free Trade Agreement (EFTA) (1960), EEC-Turkey (Ankara Agreement) (1963), Southern African Customs Union (1969)

1972: EFTA-EEC (Austria), EFTA-EEC (Norway), EFTA-EEC (Portugal), EFTA-EEC (Sweden), EFTA-EEC (Switzerland)

1973: Caribbean Community (CARICOM), EC-Iceland

1974: EC-Cyprus

1975: EEC-Israel

1981: Gulf Cooperation Council, EEC-Greece

1983: Australia–New Zealand FTA

1985: EEC–Spain, Portugal, USA-Israel

1988: Andean Community (Cartagena Agreement), Canada–US Free Trade Agreement (1988)

1991: Common Market for Eastern and Southern Africa (COMESA)

1992: Association of Southeast Asian Nations (ASEAN), Central European Free Trade Agreement (CEFTA), EC–Czech Republic, EFTA-Slovakia, EFTA-Turkey, European Union Treaty

1993: Armenia-Russia, Czech Republic–Slovakia, EC-Hungary, EFTA-Bulgaria, EFTA–Czech Republic, EFTA-Hungary, EFTA-Israel, EFTA-Poland, EFTA-Romania, EFTA-Slovakia, EU-Poland FTA, Russia-Azerbaijan, Russia-Belarus, Russia-Kazakhstan, Russia-Tajikistan, Russia-Turkmenistan, Russia-Uzbekistan

1994: Baltic Free Trade Agreement–Industrial FTA, CARICOM-Colombia, EC-Bulgaria, European Economic Area, North American Free Trade Agreement, Russia-Georgia, Russia-Kyrgyzstan, Russia-Ukraine

1995: COMESA, EC-Israel, EC-Latvia, EC-Lithuania, EC-Turkey, EFTA-Slovenia, Mercosur (Argentina, Brazil, Paraguay, Uruguay), Mexico-Bolivia, Mexico-Colombia-Venezuela, Mexico–Costa Rica, West African Economic Monetary Union (WAEMU)

1996: Armenia-Kyrgyzstan, Armenia-Moldova, Azerbaijan-Ukraine, Bolivia-Chile, Canada-Chile, Czech Republic–Estonia, Czech Republic–Israel, EC-Morocco, EFTA-Estonia, EFTA-Latvia, Kazakhstan-Kyrgyzstan, Mercosur-Bolivia, Mercosur-Chile, Turkey-Israel, Turkmenistan-Ukraine, Uzbekistan-Ukraine

1997: Armenia-Turkmenistan, Armenia-Ukraine, Canada-Israel, Czech Republic–Latvia, Czech Republic–Lithuania, Czech Republic–Turkey, EFTA-Lithuania, EFTA-Morocco, Estonia-Slovenia, Estonia-Ukraine, Georgia-Azerbaijan, Georgia-Ukraine, Hungary-Israel, Israel–Slovak Republic, Kyrgyzstan-Moldova, Latvia-Slovenia, Lithuania-Slovenia, Macedonia-Slovenia, Poland-Israel, Poland-Lithuania, Slovak Republic–Estonia, Slovak Republic–Latvia, Slovak Republic–Lithuania, Turkey-Hungary

1998: Chile-Mexico, EC-Estonia, EC-Tunisia, India–Sri Lanka, Kyrgyzstan-Ukraine, Mercosur–Andean Community, Pan Arab Free Trade Agreement (PAFTA), Turkey-Bulgaria

1999: Armenia-Georgia, CEFTA-Bulgaria, EC-Slovenia, Egypt-Jordan, Egypt-Morocco, Hungary-Estonia, Israel-Slovenia, Kyrgyzstan-Uzbekistan, Lithuania-Turkey, Poland-Latvia, Turkey-Estonia, Turkey-Macedonia, Turkey-Poland, Turkey–Slovak Republic, SICA (Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua, Panama)

2000: Bulgaria-Macedonia, Central African Economic and Monetary Community (CEMAC), EC-Mexico, EC–South Africa FTA, EFTA-Morocco, Georgia-Kazakhstan, Georgia-Turkmenistan, Hungary-Latvia, Hungary-Lithuania, Mexico-Israel, New Zealand–Singapore, WAEMU

2001: Bosnia and Herzegovina–Croatia, East African Community (EAC), EFTA-FYROM, Guatemala-Mexico, Honduras-Mexico, Southern African Development Community, Turkey-Latvia, Turkey-Slovenia

2002: Armenia-Kazakhstan, Bulgaria-Israel, Central America–Dominican Republic, CARICOM–Dominican Republic, Chile–Costa Rica, EC-Croatia, EC-Jordan, EC-Macedonia, EFTA-Croatia, EFTA-Jordan, EFTA-Mexico, Eurasian Economic Community, South African Customs Union, Turkey–Bosnia and Herzegovina, Turkey-Croatia

2010: ASEAN-China, ASEAN-India, ASEAN-Japan, ASEAN–New Zealand–Australia, Canada-Peru, China–Costa Rica, China-Peru, EFTA-Albania, Eurasian Economic Community Customs Union, India-Korea, India-Nepal, India-Thailand, Japan-Vietnam, Peru-Singapore, Switzerland-Japan

2011: Malaysia–New Zealand, Turkey-Jordan

2012: Albania-Iceland, Albania-Norway, D-8 Preferential Trade Agreement, EFTA-Colombia, EFTA-Peru, EU-Korea, Japan-Peru, Korea-EU, Korea-Peru, Malaysia-Chile, Malaysia-India

Appendix C From Text Documents to Numerical Feature Vectors

For either clustering or classification analysis, the text documents must first be converted to vectors of real numbers. We follow a three-step procedure commonly employed in the natural language processing literature to transform text documents into numerical feature vectors. The first step, commonly referred to as tokenization, assigns an integer identifier to each word or two-word combination. The trade agreement documents were tokenized using unigram (single-word) and bigram (two-word phrase) counts. Words for tokenization are defined as sequences of two or more alphabetic characters, excluding stop words such as pronouns, articles, and prepositions, which carry little meaning in differentiating one set of documents from another. We also remove punctuation, numbers, and white space. The second step is to count the number of occurrences of each token in each document in the collection of documents, commonly referred to as the corpus. The final step is to normalize each document so that it has a feature vector of fixed size, and to assign diminishing weight to tokens that occur in the majority of documents. We use the tf-idf scheme developed by Salton and McGill (1983) to obtain the weight for each token.
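
The three steps described above can be illustrated with a short, self-contained sketch. It is not the authors' code, and several details are assumptions made for illustration: the stop-word list is a hypothetical, heavily abbreviated one, the idf term uses a common smoothed variant that may differ from the exact Salton and McGill (1983) weighting, each document vector is scaled to unit Euclidean length, and the three example sentences are invented.

```python
import math
import re
from collections import Counter

# Hypothetical, abbreviated stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "an", "of", "to", "in", "and", "or", "for", "on",
              "by", "it", "its", "shall", "be"}

def tokenize(text):
    """Step 1: unigrams and bigrams built from alphabetic words of two or
    more letters, with stop words removed. Punctuation, numbers, and white
    space are dropped by the regular expression. Bigrams are formed after
    stop-word removal."""
    words = [w for w in re.findall(r"[a-z]{2,}", text.lower())
             if w not in STOP_WORDS]
    bigrams = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + bigrams

def tfidf_matrix(documents):
    """Steps 2 and 3: count tokens per document, then weight and normalize."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    vocab = sorted(set().union(*counts))  # fixed token order for every document
    n_docs = len(documents)
    doc_freq = {t: sum(1 for c in counts if t in c) for t in vocab}
    # Smoothed idf (assumed variant): tokens found in most documents get
    # a weight close to 1, rare tokens get a larger weight.
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    rows = []
    for c in counts:
        row = [c[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])  # unit-length feature vector
    return vocab, rows

# Invented example "agreements" (not taken from the corpus).
docs = [
    "The parties shall eliminate customs duties on goods.",
    "The parties shall establish a dispute settlement panel.",
    "Customs duties on goods originating in the parties.",
]
vocab, X = tfidf_matrix(docs)
print(len(vocab), "tokens;", len(X), "documents")
```

Each row of `X` has one entry per token in `vocab`, so every document is represented by a vector of the same length regardless of how long the original text is. In the second example sentence, "dispute" receives a larger weight than "parties" because "parties" appears in all three documents.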

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Baier, S.L., Regmi, N.R. Using Machine Learning to Capture Heterogeneity in Trade Agreements. Open Econ Rev 34, 863–894 (2023). https://doi.org/10.1007/s11079-022-09685-3

  • DOI: https://doi.org/10.1007/s11079-022-09685-3
