
Detecting DeFi securities violations from token smart contract code

Abstract

Decentralized Finance (DeFi) is a system of financial products and services built and delivered through smart contracts on various blockchains. In recent years, DeFi has gained popularity and market capitalization. However, it has also been connected to crime, particularly various types of securities violations. The lack of Know Your Customer requirements in DeFi poses challenges for governments trying to mitigate potential offenses. This study aims to determine whether this problem is suited to a machine learning approach, namely, whether we can identify DeFi projects potentially engaging in securities violations based on their tokens’ smart contract code. We adapted prior works on detecting specific types of securities violations across Ethereum by building classifiers based on features extracted from DeFi projects’ tokens’ smart contract code (specifically, opcode-based features). Our final model was a random forest model that achieved an 80% F-1 score against a baseline of 50%. Notably, we further explored the code-based features that are the most important to our model’s performance in more detail by analyzing tokens’ Solidity code and conducting cosine similarity analyses. We found that one element of the code that our opcode-based features can capture is the implementation of the SafeMath library, although this does not account for the entirety of our features. Another contribution of our study is a new dataset, comprising (a) a verified ground truth dataset for tokens involved in securities violations and (b) a set of legitimate tokens from a reputable DeFi aggregator. This paper further discusses the potential use of a model like ours by prosecutors in enforcement efforts and connects it to a wider legal context.

Introduction

Decentralized Finance (DeFi) refers to a suite of financial products and services delivered in a decentralized and permissionless manner through smart contractsFootnote 1 on a blockchainFootnote 2. Ethereum is a leading example of such a blockchain. The promoters of DeFi have proclaimed it to be the future of finance (Gapusan 2021), an assertion supported by an increase in its market capitalization of more than 8000% between May 2020 and May 2021 (Wintermeyer 2021). Unfortunately, criminal activity in the DeFi ecosystem has grown along with its value. As of August 2021, 54% of cryptocurrency fraud was DeFi-related, compared to only 3% in the previous year (CipherTrace 2021). Furthermore, a vast number of new DeFi projects are created daily and anyone is permitted to create them. Collectively, these present challenges for law enforcement. The volume of projects, coupled with the magnitude of criminal offenses, makes the development of an automated fraud-detection method to guide investigative efforts particularly critical.

Securities violations are a category of crime that affects the cryptocurrency space (Eversheds Sutherland Ltd 2018; Musiala et al. 2020; Podgor 2019). Securities violations refer to offenses related to the registration of securities and to misrepresentation in the purchase or sale of securities, including pyramid schemes and foreign exchange scams (FBI 2021). Preliminary empirical research on decentralized exchanges (one of DeFi’s core product offerings) points to the prevalence of specific types of securities violations on these platforms (such as exit scamsFootnote 3, advance fee fraudFootnote 4, and market manipulation) (Xia et al. 2021). Others have chronicled securities violations, such as Ponzi schemes involving decentralized applications (dApps) (Hu et al. 2021).Footnote 5 This limited empirical work suggests possible approaches for identifying general scam tokens and certain types of securities violations, such as Ponzi schemes, in the wider cryptocurrency universe. Although we acknowledge the body of work using opcode-based features to identify malicious activity (see, for example, Santos et al. (2013)), research has not yet explored the automated detection of securities violations (a) defined generally (rather than specific types of securities violations such as scam tokens or Ponzi schemes), (b) across a broader subspace of the DeFi ecosystem (i.e., ERC-20 tokens on all DeFi platforms, instead of a single decentralized exchange), and (c) alongside detailed analyses of the ERC-20 tokens’ smart contract code. An automated approach is preferable because of the sheer volume of DeFi projects that already exist and are created daily.

Against this background, we seek to answer the following research questions: (1) Is a machine learning approach appropriate for identifying DeFi projects likely to violate U.S. securities laws?Footnote 6 (2) What are the reasons, at the feature level, for which such a model is or is not successful for this classification problem? This study presents and critically evaluates the first method for the automated detection of various types of securities violations in the DeFi ecosystem based on their token’s smart contract code, providing a tool that may identify starting points for further investigation. The contributions of this study are as follows.

  • We build a classifier to detect DeFi projects committing various types of securities violations. Our work is the first to expand existing machine learning-based classification models to encompass multiple types of securities violations.

  • We use and make available a new dataset of violations verified by court actions.

  • Our work is the first to prioritize the explainability of classification decisions in terms of opcode-based features.

Finally, our results contribute to the theory and practice of financial markets. Forecasting, detecting, and deterring financial fraud is critical for maintaining overall financial stability (Shams et al. 2021). In particular, “frauds harm the integrity of financial markets and disrupt the mechanism of efficient allocation of financial resources” (Shams et al. 2021). This is particularly pertinent in the cryptocurrency space (Shams et al. 2021), especially as these markets have become more entwined with traditional markets (Wang et al. 2022).

Decentralized finance (DeFi)

DeFi refers to a collection of financial products and services made possible by smart contracts built on various blockchains, most commonly the Ethereum blockchain. DeFi offers traditional financial products and services such as loans, derivatives, and currency exchange in a decentralized manner through smart contracts. DeFi is an open-source, permissionless system that is not operated by a central authority. Rather than transacting with one another through an intermediary, such as a centralized exchange, user interactions occur through dApps created by smart contracts on a blockchain (Schär 2021). This section describes our DeFi system model and briefly outlines its main components. Because it is the subject of our research, we focus our explanation on the Ethereum-based DeFi space, although DeFi exists on various blockchains.

Ethereum-based DeFi system model

Before explaining DeFi in more detail, we define our system model. The Ethereum-based DeFi system model can be conceptualized as a five-layer system consisting of network, blockchain consensus, smart contract, DeFi protocol, and auxiliary services layers (Zhou et al. 2023). The network layer is concerned with communicating data across and within the various layers. It involves several elements, including network communication protocols and the Ethereum network; in particular, it includes communication among Ethereum peers/nodes. The consensus layer refers to the consensus mechanism of the Ethereum blockchain (Zhou et al. 2023). At the time of our research, this was still Proof-of-Work (Wood 2021). The consensus layer also encompasses nodes’ actions that rely on the consensus mechanism, such as “data propagation,” “data verification,” executing transactions, and mining blocks. Although the first two layers are implicit in our research, this study primarily focuses on the smart contract layer. This includes the smart contract code that creates the ERC-20 tokens from which we derived our dataset, and which creates the dApps that use these tokens. The smart contract layer also includes transactions executed by smart contracts, the Ethereum Virtual Machine (EVM) state, and state transitions upon the execution of DeFi transactions. The DeFi protocol layer refers to the decentralized applications with which users interact, whereas the auxiliary services layer involves services that facilitate DeFi’s functioning, such as wallets and off-chain oracles. We describe these elements in more detail below and provide a visual representation in Fig. 1.

Fig. 1 Ethereum-based DeFi system model. Five layers of the Ethereum-based DeFi system (from bottom to top): network layer, blockchain consensus layer, smart contract layer, DeFi protocol layer, and auxiliary services layer

Ethereum

Ethereum functions as a distributed virtual machine and is the platform on which most of the DeFi ecosystem currently operates. This section focuses on the features of Ethereum that are most relevant to the functioning of DeFi.Footnote 7

In addition to holding balances, Ethereum accounts can store smart contract code and other information. A smart contract is a computer program that automatically performs certain actions when specific conditions—such as payments—are met (Narayanan et al. 2016). Smart contract code is immutable and publicly available on the blockchain. Smart contracts allow parties who do not trust one another to enter into contracts. Rather than trusting each other or a third party to execute the contract, smart contracts ensure that the terms will be executed as coded into the contract (Bartoletti et al. 2020a).

Before smart contracts are executed, they must be compiled so that the Ethereum Virtual Machine (EVM) can deploy and interpret them. Once compiled, Ethereum smart contracts are represented as an array of bytes, often referred to as “bytecode”. The EVM is a stack-based environmentFootnote 8 operating on 256-bit words. It reads bytecode as operational codes (opcodes), which are instructions drawn from a set of 144 possible operations. Opcodes include actions such as retrieving the address of an account interacting with the contract, performing various mathematical operations, and storing information (Crytic 2021; Wood 2021).
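
To make the bytecode-to-opcode relationship concrete, the sketch below uses the Pyevmasm disassembler (employed later in our Method) to decode the five-byte memory-setup prelude that the Solidity compiler emits at the start of most contracts; the snippet is illustrative rather than drawn from any token in our dataset.

```python
from pyevmasm import disassemble_hex  # pip install pyevmasm

# "6080604052" is the standard Solidity prelude: push 0x80 and 0x40 onto the
# stack, then MSTORE to initialize the free-memory pointer.
print(disassemble_hex("0x6080604052"))
# PUSH1 0x80
# PUSH1 0x40
# MSTORE
```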

Tokens

Many DeFi projects also have associated tokens created by smart contracts, which either entitle holders to something within the dApp (analogous to a video game’s in-game currency) or serve as “governance tokens.” For example, UNI is the governance token for the Uniswap decentralized exchange (Uniswap 2021). Another example is the SCRT token, which, in addition to being a governance token, is required to pay transaction fees on the Secret network (Secret Network 2021). Holders of governance tokens can vote on the future of projects, and their voting power is proportional to the number of governance tokens they hold. Most non-NFT DeFi tokens on Ethereum follow the ERC-20 (Ethereum Request for Comment) standard, which facilitates interoperability among projects. The ERC-20 standard specifies core token capabilities, such as transferring tokens among accounts, querying balances, and reporting the total supply (BitcoinWiki 2021). In this study, we focus on ERC-20 tokens.
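
Because every ERC-20 token exposes the same interface, a few lines of web3.py (the library we later use for data collection) can query any such token. The sketch below is illustrative only: the Infura endpoint and token address are placeholders, and the minimal ABI covers just the functions shown.

```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://mainnet.infura.io/v3/<PROJECT_ID>"))  # placeholder

# Minimal ABI: a subset of the functions every ERC-20 token must implement.
ERC20_ABI = [
    {"name": "symbol", "inputs": [], "outputs": [{"type": "string"}],
     "stateMutability": "view", "type": "function"},
    {"name": "totalSupply", "inputs": [], "outputs": [{"type": "uint256"}],
     "stateMutability": "view", "type": "function"},
    {"name": "balanceOf", "inputs": [{"name": "account", "type": "address"}],
     "outputs": [{"type": "uint256"}], "stateMutability": "view", "type": "function"},
]

token = w3.eth.contract(address="0x...", abi=ERC20_ABI)  # hypothetical token address
print(token.functions.symbol().call(), token.functions.totalSupply().call())
```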

dApps

Developers create dApps that serve as the interfaces to execute these smart contracts. DeFi’s current core product offerings, including decentralized exchanges (dexes), lending products, prediction markets, insurance, and other financial products and services, are delivered through dApps (Hertig 2020). Table 1 lists the primary products that constitute the DeFi ecosystem.

Figure 2 illustrates the process of dApp creation and execution through smart contracts, using Uniswap (a popular dex) as an example. As the figure shows, DeFi users must have a Web3 software wallet to hold DeFi tokens and interact with the dApps. These wallets can be considered akin to mobile banking applications and exhibit similar features (sending transactions, showing balances, etc.). However, unlike banking applications, users retain custody of their funds and can send transactions and execute other functions directly rather than through an intermediary institution (Ethereum 2021). Using cryptographic digital signatures, users approve connections to their Web3 wallets, “sign in” to dApps, and approve interactions with the smart contracts on these platforms through their wallet.

Fig. 2 DApp creation and functioning. The first box contains an excerpt from the Solidity code for the exchange function of the popular dex Uniswap. The second box shows the same code compiled into EVM-readable bytecode. The third box shows the transaction that deployed this code to the Ethereum blockchain, thereby creating the dApp. The branch of Fig. 2 labelled “front-end” shows Uniswap’s user interface for a sample exchange operation of 645.49035 USDT to ETH. Below the exchange interface is a screenshot of the Web3 wallet MetaMask. To execute a transaction like the exchange depicted above, the user must connect their Web3 wallet to the relevant dApp via the wallet’s browser extension. From there, they can approve the transaction. After the transaction is executed, the user’s MetaMask wallet automatically reflects the new balances of these cryptocurrencies. On the back-end, the aforementioned exchange takes the form of bytecode (depicted on the “back-end” branch of the figure), which is executed by the EVM. The final box shows the hash of the executed transaction exchanging USDT for ETH. Full details of this exchange can be found at https://etherscan.io/tx/0x7febc16c960a177077ddf0562c9ba21ac9bd5585bacf969d88a6b678e756081a. The full Solidity source code for the Uniswap V2 smart contract can be found at https://etherscan.io/address/0x7a250d5630b4cf539739df2c5dacb4c659f2488d#code

Table 1 Overview of DeFi products

U.S. securities laws

An understanding of U.S. securities law is necessary before defining our DeFi threat model. DeFi has raised alarm in regulatory circles owing to concerns over the potential conflict of DeFi tokens with the existing U.S. securities laws (Blockchain Association 2019). U.S. securities are primarily governed at the federal level by the Securities Act and Exchange Act, although the Sarbanes-Oxley Act, Trust Indenture Act, Investment Advisers Act, and Investment Company Act are also relevant. The Securities and Exchange Commission (SEC) and Financial Industry Regulatory Authority (FINRA) enforce these laws (Practical Law Corporate & Securities 2021).

The Securities Act pertains to the offering and sale of securities. One of the key provisions charged in cryptocurrency cases is Section 5, which requires the registration of the offer and sale of securities and stipulates specific provisions thereof (Practical Law Corporate & Securities 2021).Footnote 9 Other sections detail the required registration informationFootnote 10 and exemptionsFootnote 11 (Practical Law Corporate & Securities 2021). Various SEC enforcement actions have successfully argued that ERC-20 tokens are securities (see, for example, Securities and Exchange Commission v. LBRY (2022)), on the basis that they constitute investment contracts (U.S. Securities and Exchange Commission 2019). For further details on the application of the Howey Test (the SEC’s criteria for determining whether a digital asset constitutes a security), see U.S. Securities and Exchange Commission (2019).

The Exchange Act specifies the reporting requirements of public companies and regulates securities trading through securities exchanges. It also governs securities fraud. Under Section 10(b) and Rule 10b-5, fraud and manipulation in relation to buying or selling securities are illegal. One cannot make false or misleading statements (including omissions) in relation to the sale or purchase of securities, including those exempt from registration under the Securities Act. Sections 12 and 15 of the Exchange Act discuss the registration of securities, securities exchanges, brokers, dealers, and analysts (Practical Law Corporate & Securities 2021). Section 12 regulates the registration of initial public offerings, which is relevant for cryptocurrency initial coin offerings (ICOs).Footnote 12 Finally, Section 13 relates to companies’ reporting obligations under the Exchange Act (Practical Law Corporate & Securities 2021).

In practice, in addition to various registration and reporting violations, U.S. securities laws tend to cover the following fraudulent conduct: high-yield investment programs, Ponzi schemes, pyramid schemes, advance fee fraud, foreign exchange scams, and broker embezzlement (FBI 2021).Footnote 13 Financial frauds have been shown to impact financial stability, market integrity, and resource allocation and, in the case of cryptocurrency frauds, to have an impact on markets in traditional finance (Shams et al. 2021; Xin et al. 2018).

Threat model

We define our threat model in line with the U.S. securities laws described above. Considering this, an “incident” is any activity that is in violation of these laws, such as failing to register a token as a security or an ICO, or committing securities fraud. These actions may be the result of intentionally malicious behaviors or ignorance of the law. Both cases “result in an unexpected financial loss” for users (Zhou et al. 2023). While we do not have an estimate of all losses incurred as a result of DeFi securities violations, the Finiko Ponzi scheme, for example, took $1.1 billion from victims in 2021, and rug pulls stole a further $2.8 billion worth of funds that year (Chainalysis 2022). Vulnerabilities that could lead to such incidents exist at the smart contract, protocol, and auxiliary layers of the DeFi system. The smart contract layer includes both the creation of the ERC-20 token itself (in the case of registration violations) and any malicious elements coded into smart contracts, such as Ponzi schemes, advance fee fraud, or certain types of exit scams. At the protocol layer, market manipulation is the primary attack vector (Zhou et al. 2023). Finally, at the auxiliary layer, both “operational vulnerability” (such as price oracle manipulation) and “information asymmetry” (such as smart contract honeypots) are observed (Zhou et al. 2023). Information asymmetry primarily occurs in securities fraud. Users are often unable to (or do not take the time to) analyze DeFi protocol smart contracts (and the related security risks) before allowing them to utilize their assets. Users’ “understanding of a contract operation” is more likely to come from project marketing materials than from the contract source code itself (Zhou et al. 2023).

This threat model involves several assumptions. The first assumption is that, based on the classification by the SEC of the tokens used to construct our dataset, the DeFi tokens in question are securities under U.S. law. As previously discussed, precedent has been established in this regard. Notably, this also means that many otherwise legitimate projects may operate contrary to U.S. securities laws because they are not appropriately registered. The second assumption is that, in the case of fraud coded into smart contract code, the developers of the DeFi tokens violating securities laws behave maliciously, rather than their violations being the result of errors. Therefore, patching or fixing smart contract code (as discussed in Rodler et al. (2021) and Ferreira Torres et al. (2022)) is not a suitable method for addressing this threat. Registration violations may result from malicious intent or naïve behavior.

The final assumption is that prevention measures have failed in these instances and, therefore, the primary course of justice is detection and prosecution. While prevention is certainly preferable, it is not possible to prevent all crimes. Therefore, detection and prosecution remain important ways to remedy the threats discussed above. Figure 3 shows a visual representation of the threat model.

Fig. 3 Threat model. Threat model for violations of U.S. securities laws, including attacker motivation, incident impact, elements of the system affected, attack vectors, mitigation, and assumptions

Related work on detecting fraud on ethereum

Previous studies have used machine learning to detect specific types of securities violations and fraud on Ethereum. The most common type of securities violation examined in the literature is the smart contract Ponzi scheme (Chen et al. 2021a; Cai et al. 2018; Chen et al. 2018; Fan et al. 2021; Hu and Xu 2021; Hu et al. 2021; Jung et al. 2019; Liu et al. 2022; Wang et al. 2021; Zhang et al. 2021). The various machine learning algorithms employed in these studies to identify such Ponzi schemes—and their relative performance—can be found in Table 2. We acknowledge the existence of various sequential machine learning studies in other contexts (many of which feature more sophisticated classification models). We also note the use of machine learning in cryptocurrency trading research (see, for example, Fang et al. (2022) for a comprehensive review and Sebastião and Godinho (2021) as an empirical example thereof). However, this review is limited to studies that apply sequential machine learning techniques to detect fraud on Ethereum, as the classification problems they seek to solve are most similar to ours.

Table 2 Related work detecting Ethereum Ponzi schemes

Previous studies on smart contract Ponzi schemes have examined code-based features (Chen et al. 2021a; Hu and Xu 2021; Fan et al. 2021), transaction-based features (Hu et al. 2021), or both (Wang et al. 2021; Liu et al. 2022; Jung et al. 2019; Zhang et al. 2021; Chen et al. 2018, 2019). Code-based features include the frequency with which each opcode appears in a smart contract and the length of the smart contract bytecode (Jung et al. 2019). Transaction- and account-based features include the number of unique addresses interacting with a smart contract and the volume of funds transferred into and out of a smart contract (Jung et al. 2019). One study (Chen et al. 2021a) identified four specific Ponzi scheme typologies based on bytecode sequences.

In addition to smart contract Ponzi schemes, other studies that have examined smart contracts use machine learning to detect general fraud and scams (Chen et al. 2021b; Ibrahim et al. 2021; Lašas et al. 2020; Li et al. 2021; Xia et al. 2021; Fan et al. 2022); advance fee fraud (Wilder 2020); smart contract honeypots (Hu and Xu 2021; Chen et al. 2020); ICO scams (Karimov and Wójcik 2021; Wu et al. 2020); and “abnormal contracts” causing financial losses (Aljofey et al. 2022). Notably, one study (Aljofey et al. 2022) also used features based on contract source code as well as those based on opcodes and transactions.

Most existing studies in this field have examined Ethereum smart contracts in general, but some specifically refer to dApps and DeFi in their work (Fan et al. 2021; Hu et al. 2021; Wang et al. 2021; Li et al. 2021; Xia et al. 2021), although they do so to varying degrees, and occasionally conflate DeFi with Ethereum more broadly. Notably, one study (Hu et al. 2021) used machine learning to classify different types of dApp smart contracts into various categories, including gaming, gambling, and finance.

Gaps and issues

Although the results of the studies described in Table 2 suggest that approaches of this nature can perform well at this task, several points of caution have also been raised. The literature concerning smart contract Ponzi scheme detection points to issues of overfitting owing to the imbalance of classifications in many datasets (Fan et al. 2021). Studies have addressed this issue using over- and under-sampling techniques (Chen et al. 2021a; Fan et al. 2021; Wang et al. 2021; Zhang et al. 2021). Other scholars (Chen et al. 2021a) have criticized the interpretability of results based on opcode features; that is, why the presence of certain opcodes points to criminality. Li et al. (2022) emphasize the importance of interpretability of machine learning models applied in financial contexts. However, these studies have given little consideration to whether machine learning techniques are necessary for this task, or superior to potentially simpler approaches. Although the applied methods undoubtedly show high performance, it is possible that similar metrics could be achieved without recourse to these types of techniques.

Another issue in previous studies is the repeated use of two particular datasets (Chen et al. 2018; Bartoletti et al. 2019). Of the studies cited above, four used the Bartoletti et al. (2019) dataset (Chen et al. 2021a; Hu and Xu 2021; Liu et al. 2022; Jung et al. 2019), two used the Chen et al. (2018) dataset (Wang et al. 2021; Zhang et al. 2021), and one study combined both datasets and added additional data (Fan et al. 2021). Although using the same datasets may be helpful for comparing performance, it may be less useful in practice for combating fraud, with any shortcomings of these datasets having a polluting effect on the literature. In fact, when manually inspecting the dataset of Bartoletti et al. (2019), one study (Chen et al. 2021a) identified issues involving duplication and bias. Finally, although it classifies general fraud rather than Ponzi schemes, one article used proprietary company data (Li et al. 2021), which hinders the reproducibility and evaluation of the results.

The majority of existing related studies have examined smart contracts in general, as opposed to ERC-20 token smart contracts specifically. The only other study that specifically examined DeFi token smart contracts using machine learning is Xia et al. (2021). However, their work focused on scam tokens in general (rather than securities violations) and on a single dApp (the dex Uniswap).

Finally, literature reviews in this field highlight the importance of considering cybersecurity concerns and risk mitigation in cryptocurrency and blockchain research (Xu et al. 2019; Fang et al. 2022).

Aims of this paper

This study seeks to fill these gaps by (a) evaluating whether a machine learning approach is appropriate for identifying DeFi projects likely to engage in securities violations, (b) examining securities violations more comprehensively (rather than just scam tokens or Ponzi schemes), (c) investigating these violations across Ethereum-based DeFi rather than specific subspaces, such as decentralized exchanges, and (d) examining the code-based features identified by our model to better explain its performance. In addition, we develop an entirely new dataset of violating and legitimate tokens.

Method

This study derives methods from prior research on the detection of Ethereum smart contract Ponzi schemes, adapting an approach that performs well in that context (Jung et al. 2019) and applying it to DeFi projects engaging in securities violations. We built various classification models based on the features extracted from DeFi tokens’ smart contract code to classify the tokens into two categories: securities violations and legitimate tokens. Figure 4 illustrates the proposed method.

Fig. 4 Methods for detecting securities violations from DeFi token smart contract code

Data collection

To determine whether machine learning can identify, from a token’s smart contract code, that a project may be engaging in securities violations, we first required ground truth sets of both securities violations and legitimate tokens. One source of this information is the token lists compiled by DeFi projects or companies around particular themes. One function of token lists is to help combat token impersonation and scams. Reputable token lists provide users with some assurance that the tokens that appear are not fraudulent. Uniswap posts lists contributed by projects in the community, and users generally follow lists from projects that they trust (Uniswap 2020). The lists contain information such as project websites (important for avoiding phishing attempts), symbols, and smart contract addresses.

Securities violations

The Blockchain Association (BA), a lawyer-led blockchain lobbying organization, created a list of ERC-20 tokens subject to U.S. SEC enforcement actions.Footnote 14 At the time of our study, this list contained 47 tokens and these served as our ground truth for identifying projects engaged in securities violations. Many of these actions involve Initial Coin Offerings (ICOs), primarily in the context of companies or individuals failing to register their token as a security when required and/or making fraudulent misrepresentations in connection with the said token (e.g., Coinseed Token, Tierion, ShipChain SHIP, SALT, UnikoinGold, Boon Tech, and others) (U.S. Securities and Exchange Commission 2022). Other violations include Ponzi schemes (e.g., RGL) and market manipulation (e.g., Veritaseum) (Securities and Exchange Commission v. Natural Diamonds Investment Co., et al. (2019a), Securities and Exchange Commission v. Reginald Middleton, et al. (2019b)). The defendants in these cases are distinct, thereby supporting the independence of the tokens in our violation dataset. We acknowledge the limitations of using such a small set; however, to date, this set comprises all SEC actions involving DeFi tokens. Therefore, it provides a more credible ground truth dataset than searching for individual investment scams and securities violations on blockchain forums (as other datasets do, including (Bartoletti et al. 2019)), because it is more systematic and does not involve any subjective judgment of wrongdoing. Notably, the nature of the list itself highlights the need for a systematic detection method. Most of the actions were either derived from the U.S. government whistleblower program or were well-publicized scams, suggesting that enforcement is currently reliant on these sources (U.S. Securities and Exchange Commission 2021b).

Legitimate projects

The nature of the DeFi industry means that a substantial proportion of tokens may be of questionable validity, even if they have not been formally identified as violations. This poses a challenge when building a dataset of legitimate tokens. If we were to randomly sample all projects, it is likely that problematic tokens would be included, compromising our analysis. Therefore, we adopted an alternative approach that included only tokens for which we had some evidence of credibility. To do this, we used the token list maintained by the DeFi platform Zapper.Footnote 15 Zapper is a DeFi project aggregator that allows users to monitor their liquidity provision, staking, yield farming, and assets across different DeFi protocols. As of November 2021, Zapper had over one million users, $11 billion worth of transaction volume, and raised $15 million in venture funding (Zapper 2021). While they do not claim to provide financial advice to users, they make an effort to internally vet the projects they list, opting for those with audited contracts and reputable teams (zes 2020). To be clear, inclusion in this list does not provide any indication of the “quality” of a token—it is not analogous to a list of “blue chip” stocks—but simply an indication of authenticity. The Zapper list contains 2,146 ERC-20 tokens, which we used as the ground truth for the legitimate tokens. This may not be as representative of DeFi tokens in general as a random sample, but it is the best available source of tokens that have some marker of credibility.

Final dataset

We extracted the smart contract addresses for the ERC-20 tokens on both lists and combined them into a single dataset, with a binary indicator added to flag violations. This provided an initial dataset of 2,193 smart contract addresses. Seven tokens were present in both lists. These likely represent tokens that are otherwise legitimate but violated U.S. securities laws by failing to register as securities. Because the Blockchain Association list is verified by court actions, we removed these seven tokens from our “legitimate” set. This also shows that our dataset captures projects that occupy the “middle ground” with respect to legitimacy, rather than only the extremes of offending and non-offending. Thus, the final dataset consisted of 2,186 tokens (47 of which were the subject of an SEC case in the U.S., in which the individuals or company that created or marketed the token broke securities laws). Our final dataset (including the features described below) can be found here: https://osf.io/xcdz6/?view_only=5a61a06ae9154493b67b24fa4979eddb.
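
The assembly of the final dataset can be summarized in a few lines of pandas; this is a minimal sketch, assuming ba_addresses and zapper_addresses hold the contract addresses extracted from the Blockchain Association and Zapper lists, respectively.

```python
import pandas as pd

# ba_addresses: 47 tokens subject to SEC actions; zapper_addresses: 2,146 Zapper tokens.
violations = pd.DataFrame({"address": ba_addresses}).assign(violation=1)
legitimate = pd.DataFrame({"address": zapper_addresses}).assign(violation=0)

# Seven tokens appear on both lists; the court-verified label takes precedence,
# so we drop them from the legitimate set.
legitimate = legitimate[~legitimate["address"].isin(violations["address"])]

dataset = pd.concat([violations, legitimate], ignore_index=True)  # 2,186 tokens
```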

Features

Next, we used the Web3 Python package (web3.py; Ethereum 2023) to extract the bytecode for each token in our dataset. We connected through Infura, a service that lets users interface with the Ethereum blockchain via nodes the company runs, using its API.Footnote 16 Our classification features come from the token smart contract bytecode we collected. We opted to use only code-based features (rather than transaction-based features, for example), following other recent studies that achieved high levels of performance (including 100% precision and recall in Chen et al. (2021a)) in classifying Ethereum-based smart contract Ponzi schemes (Chen et al. 2021a; Fan et al. 2021; Hu et al. 2021). Furthermore, using only code-based features allows for classification as soon as smart contracts are deployed (Jung et al. 2019), rather than waiting to examine the characteristics of associated transactions, and permits the analysis of smart contracts with few transactions (Chen et al. 2021a).
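
A hedged sketch of this extraction step follows, using the v6-style web3.py API, a placeholder Infura project ID, and the dataset dataframe assembled above.

```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://mainnet.infura.io/v3/<PROJECT_ID>"))  # placeholder

def fetch_bytecode(address: str) -> str:
    """Return a contract's deployed (runtime) bytecode as a hex string."""
    return w3.eth.get_code(Web3.to_checksum_address(address)).hex()

bytecodes = {addr: fetch_bytecode(addr) for addr in dataset["address"]}
```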

For this initial analysis, our aim was to keep the classifier simple and computationally inexpensive. Ours is also the first classifier for Ethereum-based DeFi securities violations more broadly; hence, our aim was to obtain a baseline for this novel classification problem in order to determine whether it is suited to machine learning, rather than to improve the state-of-the-art for previously addressed problems (as Fan et al. (2021), Chen et al. (2021a), and others have done for smart contract Ponzi scheme classification).

The EVM bytecode can be computationally “disassembled” into its corresponding opcodes. This process is illustrated in Fig. 4. Following (Jung et al. 2019), we included a feature in our classifier for each opcode that appeared in our smart contracts, representing the frequency with which the opcode appeared in any given smart contract. We used the Pyevmasm Python package (Crytic 2020) to disassemble each contract’s bytecode into its equivalent opcodes and then used a counter to determine the number of times each opcode appeared in the contract.
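
A minimal sketch of this feature-construction step, pairing Pyevmasm's disassembler with a standard counter over the bytecodes collected above, might look as follows.

```python
from collections import Counter
from pyevmasm import disassemble_all  # pip install pyevmasm

def opcode_frequencies(bytecode_hex: str) -> Counter:
    """Disassemble runtime bytecode and count each opcode mnemonic."""
    raw = bytes.fromhex(bytecode_hex.removeprefix("0x"))
    return Counter(instr.name for instr in disassemble_all(raw))

# One feature vector per token: opcode name -> frequency in that contract.
features = {addr: opcode_frequencies(code) for addr, code in bytecodes.items()}
```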

Feature exploration

Prior to building any classification models, we used Elastic Net regression (with \(\alpha=0.001\) and the data mean-centered and normalized using Scikit-learn’s StandardScaler)Footnote 17 to investigate the importance of these features. Table 3 lists the top 10 non-zero coefficients of the model.
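
A sketch of this exploration step is shown below, assuming X is the token-by-opcode frequency matrix, y the binary violation label, and feature_names the corresponding opcode names.

```python
from sklearn.linear_model import ElasticNet
from sklearn.preprocessing import StandardScaler

# Mean-center and scale each opcode-frequency feature, then fit the Elastic Net.
X_scaled = StandardScaler().fit_transform(X)        # X: tokens x opcode frequencies
enet = ElasticNet(alpha=0.001).fit(X_scaled, y)     # y: 1 = violation, 0 = legitimate

# Rank the surviving (non-zero) coefficients by magnitude.
nonzero = [(n, c) for n, c in zip(feature_names, enet.coef_) if c != 0]
for name, coef in sorted(nonzero, key=lambda nc: abs(nc[1]), reverse=True)[:10]:
    print(f"{name}: {coef:.4f}")
```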

Table 3 Feature exploration using Elastic Net regression

Overall, 55 features had non-zero coefficients in our Elastic Net regression model. However, none of these coefficients were particularly large. This is expected because each line of the Solidity code is ultimately translated into several opcodes, which means that multiple opcodes can capture the same behavior or action. Table 4 describes the opcodes listed in Table 3.

Table 4 Opcode descriptions (Crytic 2021)

Classification

First, we used a random forest classifier to determine whether a project was potentially engaged in securities violations. We chose a random forest classifier for the following reasons.

  1. Research involving data similar to ours achieved the best classification results with a random forest compared with other classifiers (Xia et al. 2021; precision: 96.45%, recall: 96.79%, F-1: 96.62%).

  2. While initial work on smart contract Ponzi schemes (Jung et al. 2019) has been optimized in later studies (for example, Fan et al. (2021)), our goal is to establish a performance baseline for classifying Ethereum-based DeFi securities violations. Previous work (Jung et al. 2019) found that the random forest algorithm performed best on their dataset compared with other standard classification algorithms (J48 decision tree and stochastic gradient descent).

  3. Given our primary goal of determining whether machine learning methods are suitable for developing a classifier useful for law enforcement investigations, a model with greater transparency and traceability is most informative. Prior research using machine learning techniques in financial applications (such as Li et al.’s (2022) research, which explores a novel approach to clustering using ten different financial datasets) also underscores the importance of feature interpretability in these contexts (Li et al. 2022).

Given the classification imbalance in our data, we used downsampling of the majority class to balance it with the minority class. Specifically, we randomly sampled 47 smart contracts from the majority class (i.e., from the \(n=2,139\) legitimate contracts) and ran a random forest classifier on the resulting balanced dataset (i.e., 47 violations versus 47 legitimate tokens). This procedure was repeated 100 times with different random samples and the average performance of these 100 iterations was reported.Footnote 18

For each iteration, we used 70% of our data to train our model and 30% for our test set, following previous studies on classifying smart contract Ponzi schemes (Wang et al. 2021). We calculated the accuracy, weighted precision, recall, and F-1 score to evaluate our model (Prellberg and Kramer 2020). We calculated the means of these metrics across 100 iterations to obtain the final performance scores. We analyzed the average feature importance across 100 iterations of our model and then built several subsequent models based on this information.
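
The full evaluation loop can be sketched as follows, with df the dataframe of opcode-frequency features plus address and violation columns. The stratified split and default forest hyperparameters are our assumptions, and substituting scikit-learn's LogisticRegression for the forest yields the simpler comparison models described below.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def one_iteration(df: pd.DataFrame, seed: int) -> float:
    """Downsample the majority class, fit a random forest, return weighted F-1."""
    viol = df[df["violation"] == 1]                                    # n = 47
    legit = df[df["violation"] == 0].sample(n=len(viol), random_state=seed)
    sample = pd.concat([viol, legit])
    X = sample.drop(columns=["address", "violation"])
    y = sample["violation"]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
    return f1_score(y_te, clf.predict(X_te), average="weighted")

scores = [one_iteration(df, seed) for seed in range(100)]
print(f"Mean weighted F-1 across 100 iterations: {np.mean(scores):.3f}")
```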

For the aforementioned reasons, we focused on random forest classification. To comprehensively answer our first research question regarding the suitability of machine learning for this classification task, we needed to build multiple kinds of models, including a simpler approach. Therefore, we also built logistic regression models using our data. Similar to our random forest classifier, we used downsampling across 100 iterations, a 70–30% train-test split, and calculated the accuracy, weighted precision, recall, and F-1 score metrics. After analyzing feature importance, we constructed further models using different sets of features.Footnote 19

Results

Classification

Random forest

Although we used the results of our Elastic Net-based feature exploration as input for some of our models, we performed further feature exploration with random forest models since none of the coefficients were notably large. We built our initial classification model using the frequency of all opcodes contained in our dataset (142 features), employing bootstrapped undersampling to evenly balance the classes in our dataset over 100 iterations. As is evident from the evaluation metrics shown in the top row of Table 5, we achieved satisfactory performance with this model compared with our baseline (50%). We then calculated the relative importance of the features included in the model. The results are listed in Table 6.

Table 5 Random forest model performance with undersampling
Table 6 Feature importance for random forest models with undersampling

Next, we built models using only the 10 features with the highest importance in our original model, the three most important features, and the single most important feature (CALLDATASIZE). The weighted precision, recall, F-1 score, and accuracy are presented in Table 5. We also built models with all 55 features that had non-zero coefficients in our Elastic Net regression model, calculated the feature importance for this model, and then used this information to build models with the 10 features of highest importance in this 55-feature model, the top three features, and the top feature (LT). Finally, we built a model using the top 10 non-zero coefficients in our Elastic Net regression model as our features. We assessed the feature importance for all subsequent models, as reported in Table 6. Table 7 describes those opcodes of high importance to these models that were not described previously.

Table 7 Additional opcode descriptions (Crytic 2021)

Using the F-1 score as our primary metric, we achieved the best performance with RF2, which was our 10-feature model built using the top 10 features from our full-feature model (RF1). This model performed relatively well (F-1 score of 80%) compared with our baseline of 50%. Three features—the frequencies of CALLDATASIZE, LT, and CALLVALUE—were the most important across all random forest models except RF9.

Logistic regression

To answer our research question of whether machine learning is appropriate for identifying DeFi projects likely to be engaging in violations of U.S. securities laws, we built a logistic regression model to see whether a simpler model could correctly classify our data. We used the same bootstrapped undersampling as when constructing our random forest models, subsequently calculated the feature importance, and built further models accordingly. We report the accuracy and weighted precision, recall, and F-1 scores for these models in Table 8 and the feature importance for the top 10 features in Table 9. Table 10 describes the opcodes among the features reported in Table 9 that have not been previously described.

Table 8 Logistic regression model performance with undersampling
Table 9 Feature importance for logistic regression models with undersampling
Table 10 Additional opcode descriptions (Crytic 2021)

Overall, the logistic regression models performed worse than the random forest models. Using the weighted F-1 score as our primary metric, our best-performing logistic regression models were those with the most features, namely, our 142-feature (LR1, with an F-1 score of 73.8%) and our 55-feature models (LR4, with an F-1 score of 72.4%). The other logistic regression models performed closer to our baseline (50%).

There was little overlap between the most important features of our logistic regression and random forest models (except in those built with the top 10 features of our Elastic Net regression model). EXP was one of the most important features in certain random forest and logistic regression models (RF1, RF2, LR1, LR2, LR3). CALLVALUE was among the top 10 most important features for all our models, aside from those built using the top 10 non-zero coefficients of our Elastic Net regression model. CALLER, which was among the top 10 most important features for five of our random forest models, also had high levels of feature importance in logistic regression models LR4, LR5, and LR6.

Using the weighted F-1 score as our primary metric, we achieved the best performance (an F-1 score of 80%) with RF2, which is a 10-feature random forest model. Therefore, this was the final model.

Opcodes

To better understand the performance of our final model, we compared the frequencies with which the 10 opcodes from our final model occurred in each of our classes (violations and legitimate tokens). A t-test was conducted to assess whether the average frequencies were significantly different. The findings from these comparisons are reported in Table 11, and they support the analysis of feature importance in our final model. The mean frequencies of each feature differed significantly between the securities violations and legitimate token sets, with a much larger effect size for the most important feature (CALLDATASIZE) than for the other features.
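
The per-opcode comparison can be sketched as below; the use of Welch's t-test and a pooled-standard-deviation Cohen's d are our assumptions, with viol and legit holding one opcode's frequencies across the contracts of each class.

```python
import numpy as np
from scipy import stats

def compare_opcode(viol: np.ndarray, legit: np.ndarray):
    """Welch's t-test plus Cohen's d (pooled SD) for one opcode's frequencies."""
    t, p = stats.ttest_ind(viol, legit, equal_var=False)
    pooled_sd = np.sqrt((viol.var(ddof=1) + legit.var(ddof=1)) / 2)
    d = (viol.mean() - legit.mean()) / pooled_sd
    return t, p, d
```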

Table 11 Mean comparisons of opcode frequencies and t-test results with Cohen’s d effect size

Analyzing solidity code

To better understand the top three features of our final model, we randomly selected five contracts from our set of securities violations and five contracts from our set of legitimate tokens and analyzed their Solidity code. The contracts, the frequencies with which our model’s top three features occurred in their code, and the version of Solidity in which their code was written can be found in Table 12.

Table 12 Features and Solidity version for contracts analyzed

We used Etherscan,Footnote 20 the Ethereum blockchain explorer, to obtain the Solidity code for each of these tokens. Next, we used RemixFootnote 21, an Ethereum Integrated Development Environment that allows users to write, compile, deploy, and debug Ethereum-based smart contracts, including in virtual environments, to analyze the code. We compiled and deployed each smart contract using the Remix virtual machine.

We used Remix’s “debugger” tool to analyze the transactions deploying each section of the compiled contracts. The debugger tool allows users to examine the opcodes for each transaction chronologically and highlights the corresponding line of the Solidity code for each opcode (each line of the Solidity code is compiled as several opcodes). It also provides information on the functions with which the transaction interacts, local Solidity variables, Solidity state variables, and other information (Remix 2022a, b). However, given that our goal was to better understand the features of our final classification model, our analysis focused on the opcode tool.

We examined all elements of the smart contracts involved in their deployment transactions. Each time one of our target opcodes appeared, we noted the specific aspect of the token smart contract and the corresponding line of the Solidity code.

Though it is difficult to ascertain, from a visual examination of the code, patterns that may be picked up by our classifier, four of our five violating contracts (Dropil, Tierion Network Token, OpportyToken, and Boon Tech) contained the same line of code that resolved to the CALLVALUE opcode in the SafeMath portion of the smart contract: library SafeMath {. Among our legitimate token smart contracts, the CALLVALUE opcode was present in the SafeMath part of the contract in only one of the five tokens (the OST contract). SafeMath is part of the OpenZeppelin smart contract development library, which allows developers to import standard, vetted, and audited Solidity code, such as for ERC-20 tokens (OpenZeppelin 2023). The SafeMath library, in particular, provides overflow checks for arithmetic operations in Solidity; arithmetic operations in Solidity “wrap” on overflow, which can lead to bugs and which attackers could exploit (OpenZeppelin 2023). SafeMath solves this issue by reverting transactions that result in operational overflow (OpenZeppelin 2023).Footnote 22

Additionally, when specifying the implementation of SafeMath for various arithmetic operations, the violating tokens use the “constant” function modifier, as opposed to the “pure” modifier used in the legitimate token smart contracts.Footnote 23 These modifiers dictate whether a given function affects the global state of Ethereum. The use of “constant” indicates that no data from the function is saved or modified, while “pure” adds the attribute that the function also does not read blockchain data (Nabi 2022). While both attributes specify that the function will not write to the Ethereum state, in the case of “pure,” the function also does not read state variables (Modi 2018). The “pure” attribute, being stricter about state modification, provides stronger assurance that arithmetic operations resulting in overflow will not (incorrectly) modify the contract’s state. It is intuitive that legitimate token contracts would provide this additional security and specificity.

We previously noted that each line of Solidity code in a smart contract resolves to multiple opcodes. Our smart contract analysis highlights this point. In previous iterations of our model, JUMPDEST was among the most important features. Various lines of the Solidity code, such as contract BasicToken is ERC20Basic {, involve both this opcode and CALLVALUE. However, we did not notice any distinctions in these lines of code between the violating and legitimate token classes.

CALLVALUE and LT appeared numerous times in the aspects of the smart contracts we analyzed with the Remix debugger tool. However, based on our disassembly of these tokens’ bytecode, the portions of the smart contracts we analyzed did not account for the opcodes’ full frequencies, and the CALLDATASIZE opcode was absent from them altogether. We were unable to execute other transactions in the Remix virtual environment to analyze the entirety of the smart contracts (although we explored a significant portion thereof). Furthermore, given our aim of using code-based features to develop a classifier that can be used immediately upon the deployment of a token contract, these deployment-related aspects are the most relevant for our purposes.

Comparing smart contracts

Given our conclusions regarding the implementation of a particular library in the code as a distinguishing factor between violating and legitimate token contracts, we sought to delve further into potential code reuse as a reason for the performance of our classifier. Prior studies have found that 96% of Ethereum smart contracts contain duplicative elements (although it is unclear if this is the case in the Ethereum-based DeFi ecosystem specifically) (He et al. 2019). In this sense, if legitimate projects borrow code from other legitimate projects and projects violating securities laws borrow from other violating projects, smart contracts within each class will have a high degree of internal consistency.

Cosine similarity for solidity code

We used cosine similarities to analyze code reuse among the Solidity code of the smart contracts that we analyzed individually. To do this, we tokenizedFootnote 24 the Solidity code using FastText (Meta Research 2023) and then calculated the cosine similarities between the vectors for each possible combination of the 10 contracts. The cosine similarity measures the level of similarity between two vectors and is bound to the range of −1 to 1. A cosine similarity of −1 means that the two vectors are perfectly opposite, 1 means they are identical, and 0 means that the two vectors are orthogonal to one another (Han et al. 2012). If code reuse were indeed a possible explanation, we would expect the difference between the cosine similarities within each class and those between violating and legitimate contracts to be more pronounced for violating smart contracts than for legitimate smart contracts. Subsequently, we compared the means of the cosine similarities for each token class. Our results are reported in Table 13.
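
A hedged sketch of this pipeline follows; the corpus file, the skip-gram model, and the flattening of each contract's source into a single "sentence" are our assumptions about implementation details the text leaves open.

```python
import itertools
import numpy as np
import fasttext  # pip install fasttext

# Hypothetical corpus file holding the Solidity source of the ten contracts.
model = fasttext.train_unsupervised("solidity_corpus.txt", model="skipgram")

def embed(source: str) -> np.ndarray:
    # get_sentence_vector rejects newlines, so flatten the source first.
    return model.get_sentence_vector(source.replace("\n", " "))

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# contracts: dict mapping a contract name to its Solidity source (assumed loaded).
vectors = {name: embed(src) for name, src in contracts.items()}
pair_sims = {(a, b): cosine(vectors[a], vectors[b])
             for a, b in itertools.combinations(vectors, 2)}
```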

Table 13 Cosine similarity of smart contract Solidity code

The Solidity code was not significantly different among our classes (at least as measured by cosine similarity). We noted generally high levels of similarity among smart contracts, which is expected given the existence of token standards. Based on these results, it is unlikely that, in the case of these 10 contracts, code reuse explains our classifier’s performance.

Cosine similarity of feature-based vectors

Next, we compared the opcode frequencies of the smart contracts to one another using cosine similarity. This comparison was intended to assess whether the opcodes generated from the tokens’ Solidity code suggested that code reuse could affect our classifier’s performance. In this experiment, we used the cosine similarity of the feature vectors rather than of the Solidity code itself, in contrast to our first cosine similarity experiment, which revealed significant similarities among ERC-20 token smart contracts owing to code and token standards. We hypothesized that using opcode frequencies would better capture the nuances in the code. We converted the frequencies of the opcodes in each smart contract into vectors and compared them. We calculated the cosine similarity for each possible combination of (a) violating smart contracts, (b) legitimate smart contracts, and (c) violating and legitimate smart contracts (“inter-class”). For each set, we obtained the average of the calculated cosine similarities, the results of which are presented in Table 14. We also compared the cosine similarities for the set of violating contracts and the set of legitimate tokens with the inter-class cosine similarities using t-tests. These results are also reported in Table 14.
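
The three sets of comparisons can be computed as in the sketch below, where freqs maps each of the ten contracts to its opcode-frequency vector, and viol_ids and legit_ids name the five contracts in each class.

```python
import numpy as np
from itertools import combinations

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# freqs: contract -> np.ndarray of opcode frequencies (one dimension per opcode).
within_viol  = [cosine(freqs[a], freqs[b]) for a, b in combinations(viol_ids, 2)]
within_legit = [cosine(freqs[a], freqs[b]) for a, b in combinations(legit_ids, 2)]
inter_class  = [cosine(freqs[a], freqs[b]) for a in viol_ids for b in legit_ids]
print(np.mean(within_viol), np.mean(within_legit), np.mean(inter_class))
```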

Table 14 Cosine similarity of smart contract opcode frequencies

Our results show that legitimate smart contract opcodes are slightly more similar to one another than violating contract opcodes are (0.054 and 0.040, respectively). The violating contracts’ opcodes are less similar to each other than they are to legitimate contracts’ opcodes (0.052 for the inter-class comparison versus 0.040 within the violating class). This suggests that there may be slightly more code reuse among legitimate contracts than among violating ones. However, the cosine similarities for both groups were rather small, as were the effect sizes, suggesting that overall code reuse is unlikely to be the primary reason for our classifier’s performance (although this does not preclude the possibility of certain elements of the code, such as the use of the SafeMath library, being at least partially responsible).

Discussion

This study sought to determine whether it is useful to build a machine learning classifier to detect DeFi projects engaging in various types of securities violations from their tokens’ smart contract code. Governments are currently struggling to manage fraud in the DeFi ecosystem, particularly because these platforms do not require “Know Your Client” (KYC) information; hence, a classifier could serve as a triage measure. In addition, we created a new dataset with a verified ground truth. Our research is also novel in its use of the ERC-20 token smart contract code to attempt to detect fraud across the Ethereum-based DeFi ecosystem and its deeper analysis of the features of our final model. Finally, our research contributes to the existing body of work on financial markets as well as its practical applications in that predicting and detecting fraud is crucial for financial stability, market integrity, and market efficiency, particularly in the cryptocurrency space (Shams et al. 2021).

Ultimately, we found that DeFi securities violations were detectable. We developed a suitable starting point for this classification problem that performed significantly better than the baseline (80% F-1 score against a baseline of 50%). Our performance was not as high as that of other models for similar classification problems; however, as described below, this may be due to overfitting and the datasets used in other studies. Previously developed baseline models for related classification problems exhibited performance that was more consistent with ours.

Regarding the second research question, further analysis at the feature level indicated that the success of our model may, in part, be related to the state-based attributes of the functions in the SafeMath library.

Comparisons with prior research

Because our study is the first to attempt to classify DeFi securities violations more broadly, we were unable to compare our model’s performance with that of previous studies. We do note that other studies that successfully built high-performance classifiers for a related classification problem used more complex methods, which improved on several previous studies that addressed the same classification problem (Chen et al. 2021a). In addition, many previous studies, like ours, used data with imbalanced classes, but they did not always account for this in building their models. Fan et al. (2021) criticized previous studies, including (Jung et al. 2019), on this basis. This may have caused the high-performance metrics reported by other studies to be slightly misleading. For example, only 3.6% of smart contracts in the (Chen et al. 2018) dataset are Ponzi schemes.

In contrast to previous studies, our study considered whether machine learning classification is necessary for this task or whether a simpler model may suffice to solve the same problem. Previous studies have found that models built using logistic regression, for example, are much less effective than more complex models (Xia et al. 2021). Our findings confirm this.

Chen et al. (2021a) noted the overall lack of interpretability of classifiers built on code-based features. The opcodes themselves have no obvious interpretation with respect to illegal activity; equally, no opcode offers a straightforward interpretation (there is no "STEAL" opcode or similar). The opcodes whose frequencies constituted the features of our final model were CALLDATASIZE, LT, CALLVALUE, SWAP3, EXP, CALLER, SHR, NUMBER, PUSH5, and ADDRESS. The only way to draw definitive conclusions from opcode-based features is to dissect the Solidity code from which they were compiled. Although it would have been impractical to dissect all 2,186 smart contracts in our dataset, we gleaned some insights about our three most important features from a selected subset. Specifically, we observed that developers of violating tokens may implement the SafeMath library differently in their code. In particular, they appear to use the "constant" modifier when describing how arithmetic operation overflows should be handled, which offers weaker assurances about the lack of state modification by these functions. It is, therefore, intuitive that the use of the "pure" modifier would be associated with the legitimate tokens in this case. However, we note that our analysis of transactions deploying the compiled Solidity code for certain contracts does not capture all opcodes whose frequencies were among the top 10 most important features in our model. Future research could also utilize other frameworks for analyzing token transfer behavior from token bytecode, such as TokenAware (He et al. 2023). This would be particularly useful where contract Solidity code is not publicly available.
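To illustrate how such a ranking of opcode features can be produced, the following sketch uses impurity-based feature importances from a random forest. It is our illustration, not the study's code; the data are synthetic stand-ins, and only the feature names come from the model described above.

```python
# Ranking opcode frequency features by random forest importance (sketch).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
opcode_names = ["CALLDATASIZE", "LT", "CALLVALUE", "SWAP3", "EXP",
                "CALLER", "SHR", "NUMBER", "PUSH5", "ADDRESS"]
X = rng.random((200, len(opcode_names)))  # stand-in opcode frequencies
y = rng.integers(0, 2, 200)               # stand-in violating/legitimate labels

clf = RandomForestClassifier(n_estimators=500, random_state=42).fit(X, y)

# Impurity-based importances; permutation importance is a common,
# less biased alternative when features are correlated.
for name, imp in sorted(zip(opcode_names, clf.feature_importances_),
                        key=lambda pair: pair[1], reverse=True):
    print(f"{name:14s}{imp:.4f}")
```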

The use of code-based features

As in previous studies (Jung et al. 2019; Chen et al. 2021a), we emphasize that our classifier is useful immediately upon deployment of a smart contract to the Ethereum blockchain, regardless of how many wallets have interacted with it. This is a key advantage of using only code-based features for classification, rather than transaction- or account-based features: such a model serves not only as a retroactive tool for investigators but also as a means of preventing future fraud, and it enables investigators to monitor projects that may engage in future securities violations. Because investigations and prosecutions take a long time (up to several years for complex cases), it is important for prosecutors to be able to gather evidence as early as possible. However, we acknowledge that code-based analysis is only one of many techniques, and future research could explore alternatives.
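The deployment-time workflow this enables can be sketched as follows. The library choices (web3.py for fetching on-chain bytecode and pyevmasm for disassembly) are ours for illustration and are not from the study; the endpoint URL and contract address are placeholders.

```python
# Sketch: compute opcode frequency features from deployed bytecode alone,
# with no reliance on transaction or account history.
from collections import Counter

from pyevmasm import disassemble_all
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://mainnet.infura.io/v3/<PROJECT_ID>"))
address = Web3.to_checksum_address(
    "0x0000000000000000000000000000000000000000")  # placeholder token address

bytecode = bytes(w3.eth.get_code(address))          # runtime bytecode on-chain
counts = Counter(insn.name for insn in disassemble_all(bytecode))

total = sum(counts.values()) or 1
features = {op: counts.get(op, 0) / total
            for op in ["CALLDATASIZE", "LT", "CALLVALUE", "SWAP3", "EXP",
                       "CALLER", "SHR", "NUMBER", "PUSH5", "ADDRESS"]}
print(features)  # ready to feed to a trained classifier
```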

Potential applications of our model

Our results highlight the importance of exploring computational triage systems in the enforcement process. This is particularly relevant given that U.S. enforcement agencies appear to rely heavily on submissions to their whistleblower programs (Commodity Futures Trading Commission 2019). A computational model could reduce this reliance and spare the government from paying whistleblowers a portion of successfully recovered funds, which can amount to millions of dollars (U.S. Securities and Exchange Commission 2021a).

A classifier of the type we examined here would be more useful as a triage measure than as a source of evidence, both because of issues surrounding the admissibility of machine learning-generated evidence in U.S. courts and because of the risk of misclassification. Questions may arise about its admissibility under the Fifth Amendment, the Sixth Amendment, and the Federal Rules of Evidence; however, legal scholars do not ultimately consider these impediments to its admission (Nutter 2018).Footnote 25 Even if such evidence is admissible, questions remain about its weight in court. In particular, explaining it to a judge and jury (especially the "black box" calculations involved in developing machine learning models) may lead to it being discounted. Jurors vary in their levels of trust in machines generally, and they must further trust the expert testimony that explains the machine learning tools (Nutter 2018). This is exacerbated by the need for prosecutors to explain complex concepts related to cryptocurrencies in these cases. A machine learning-based tool would more likely lead investigators to compelling transaction-based or qualitative evidence (e.g., marketing material), which a judge and jury can understand more easily and which has been effective in prosecuting cryptocurrency-based financial offenses in the past (see United States v. Constanzo (2018), United States v. Murgio (2016)).

Considering the ambiguity around the Hinman standardFootnote 26 in determining whether a project is sufficiently decentralized to avoid being classified as a security, a machine learning model could also serve as an additional tool for developers making that determination. Finally, such a model may be useful for prospective DeFi participants researching the validity of new projects to protect themselves from fraud.

Limitations and future research

Our study has several limitations. The first is the potential to overfit the model, particularly given imbalanced data (Fan et al. 2021). Because we do not have a separate dataset with known labels on which to test our model, overfitting could remain an issue despite our mitigation efforts. Ultimately, we chose a verified ground truth dataset that was significantly smaller than our set of legitimate tokens. We also acknowledge that our dataset may make this classification problem appear simpler than it is in reality: we chose generally reputable projects for our legitimate token set and, for the violating set, tokens subject to government enforcement action for securities violations. Given the experimental nature of DeFi (and the high risk appetite of its participants), and the initial inclusion of a few violating tokens in the legitimate set, the legitimate set is likely to capture projects that sit in the middle, rather than only at the extremes of offending and non-offending; such middle cases may be harder to classify. It would therefore be useful for future research to develop more datasets of DeFi securities violations to further test and refine our models, including with more advanced sequential machine learning techniques. As the number of verified violations increases, future research could also explore DeFi securities violations using more granular classes, which could reveal different patterns across types of securities violations and further aid prevention and detection efforts. It would also be useful to compare models using code-based features with those using account-based features, or a combination of the two.

Given that the implementation of the SafeMath library appears to account, in part, for our classifier's performance, the classifier may be less effective at classifying tokens created with Solidity versions 0.8.0 or later. However, SafeMath does not account for all of the most important features. Future research should examine this issue using an expanded dataset.
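One simple way to probe this limitation, where verified Solidity source is available, is to check the pragma directive for a pre-0.8.0 compiler target. This is our illustrative sketch, not part of the study's pipeline, and the regex covers only common pragma forms.

```python
# Flag whether verified Solidity source targets a pre-0.8.0 compiler,
# where the SafeMath library is still meaningful (sketch).
import re

def pragma_versions(source):
    return re.findall(r"pragma\s+solidity\s*([^;]+);", source)

def is_safemath_era(source):
    return any(v.strip().lstrip("^>=<~ ").startswith(("0.4", "0.5", "0.6", "0.7"))
               for v in pragma_versions(source))

sample = "pragma solidity ^0.4.24;\ncontract Token { }"
print(pragma_versions(sample))   # ['^0.4.24']
print(is_safemath_era(sample))   # True
```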

There are also limitations to using a classifier like this in practice. Our future research will specifically explore how to use such a classifier to investigate and build a viable legal case. This will involve applying our classifier to other datasets and performing manual, in-depth analyses of the flagged tokens and the interactions with their smart contracts (similar to Xia et al.'s (2021) work on Uniswap scam tokens).

Chen et al. (2021a) raised the issue of bad actors using adversarial obfuscation methods to trick classifiers like the one we propose, and Li et al. (2022) acknowledged this risk in the broader context of anti-fraud measures. We did not explicitly account for this possibility in building our model, nor did we test it against known obfuscation techniques. This could be a useful avenue for future studies.

Further analysis of violating contracts, for example, using methods for analyzing token operational behavior from bytecode (such as TokenAware, which has been applied successfully to discrete instances of fraud (He et al. 2023)), would be fruitful. Applying the methods we proposed in this study to smart contract code on a much larger scale could also prove useful. Manually analyzing token smart contracts using Remix yielded important insights for interpreting the results of our machine learning models, and there is value in extending this further, including in other studies on machine learning-based Ethereum fraud detection. We encourage other scholars to pursue this line of research using our publicly available dataset.

Finally, the jurisdictional focus of our research was limited by the nature of our dataset. Legislation related to cryptocurrencies is changing frequently; in particular, the European Union has recently passed the Regulation on Markets in Cryptoassets (MiCA) (Scicluna and Debono 2023).Footnote 27 A global approach incorporating legislation from multiple jurisdictions is a potential future goal. However, this would bring challenges because the legality of particular applications (equivalent, in technical terms, to the labels of the training data) may vary across jurisdictions and change over time. Overcoming this challenge may require adapting approaches from other domains.

One further prospect is that the output of a classification system such as this may be useful in detecting flaws or risks in novel DeFi functions. While the model would be trained on previously detected violations, it may identify cases as risky not because they correspond to known flaws but because they have underlying similarities to existing cases. In such situations, manual inspection of cases flagged as potentially fraudulent may offer insights into new forms of offending. This type of work contributes to preventing DeFi fraud and the associated losses to individuals. Research on the prevention of such offenses is a crucial complement to the detection work presented in this study.

Conclusions

Our final model achieves good performance (80% F-1 score against a baseline of 50%) in classifying DeFi-based securities violations based on ten features from the projects’ tokens’ smart contract code: the frequencies with which the CALLDATASIZE, LT, CALLVALUE, SWAP3, EXP, CALLER, SHR, NUMBER, PUSH5, and ADDRESS opcodes occurred in the contract. We achieved a higher performance with this random forest model than with logistic regression models, leading us to conclude that this classification problem is well suited to machine learning. Our study is novel because it provides a deeper analysis of the opcode-based features responsible for the performance of our classifier. Although this does not account for all the features, the implementation of the SafeMath library in token smart contracts appears to play a role. Despite the apparent influence of this aspect of the token smart contract code on our classifier, the cosine similarity analyses did not suggest that the overall code reuse was the primary reason for its performance. Overall, a computational model like ours would be highly useful for investigators as a triage tool but could be circumvented by nefarious developers in the future. Therefore, it is important to augment the model as further DeFi projects engaging in securities violations are revealed. This study constitutes the first classifier of securities violations in the emerging and fast-growing Ethereum-based DeFi ecosystem and is a useful first step in tackling the documented problem of DeFi fraud. Our work also contributes a novel dataset of DeFi securities violations with a verified ground truth and connects the use of such a classifier with a wider legal context, including how law enforcement can use it from the detection to prosecution stages of a case.

Availability of data and materials

The datasets generated and analyzed during the current study are available in the OSF repository, Detecting DeFi Securities Violations: https://osf.io/xcdz6/?view_only=5a61a06ae9154493b67b24fa4979eddb.

Notes

  1. Smart contracts are programs stored on a blockchain that automatically perform specified actions when certain conditions are met (Bartoletti et al. 2020a).

  2. A blockchain is a secure, decentralized database composed of entries called blocks, which are cryptographically connected to one another through a hash of the previous block, thereby ensuring its security and resistance to fraud. In the case of cryptocurrencies, blockchains serve as a decentralized, distributed public ledger that records all transactions (Binance Academy 2021; Narayanan et al. 2016). In this sense, blockchains underpin the "decentralized" nature of "decentralized finance," as they allow users to transact with one another in a trustless manner without the need for an intermediary financial institution.

  3. Exit scams, also referred to as “rug pulls”, involve developers of a project stealing all funds invested into their project (Kamps et al. 2022).

  4. Advance fee fraud refers to a scammer convincing a victim to transfer an amount of money in exchange for returning the original amount plus a premium. The fraudster simply takes the original funds (Trozze et al. 2022).

  5. DApps are the user interfaces of DeFi-based products and services.

  6. Developers write smart contracts in a high-level programming language called Solidity (Cai et al. 2018). Smart contracts are responsible for DeFi’s application infrastructure and creating cryptocurrency tokens.

  7. For further details on Ethereum, see the Ethereum Yellow Paper (Wood 2021).

  8. In stack-based programming, “all functions receive arguments from a numerical stack and return their result by pushing it on the stack.” These specific functions come from a set of pre-defined functions (Perkis 1994).

  9. For cryptocurrency case law involving the Securities Act, see (Securities and Exchange Commission 2021, 2019, 2018).

  10. Sections 7 and 10.

  11. Section 3, Section 4, Regulation S, Rule 144A, Regulation D, Rule 144, Rule 701, Section 28.

  12. For cryptocurrency case law involving the Exchange Act, see (Securities and Exchange Commission 2021, 2019, 2018).

  13. For definitions of these offenses, see (Trozze et al. 2022).

  14. https://tokenlists.org/token-list?url=https://raw.githubusercontent.com/The-Blockchain-Association/sec-notice-list/master/ba-sec-list.json.

  15. https://tokenlists.org/token-list?url=https://zapper.fi/api/token-list

  16. https://www.infura.io/.

  17. Elastic Net regression combines the ridge penalty (which shrinks coefficients of correlated variables) and the lasso penalty (which chooses one of the correlated variables and eliminates the others). The alpha value sets the mix between these penalties, with \(\alpha=0\) for full ridge regression and \(\alpha=1\) for full lasso (Hastie et al. 2023). We chose \(\alpha=0.001\) and used Scikit-learn's StandardScaler to pre-process our data to enable convergence of our model. The StandardScaler pre-processes the features in a dataset by "removing the mean and scaling to the unit variance" (scikit-learn developers 2023). A code sketch of this setup appears after these notes.

  18. Undersampling, combined with properly executed cross-validation, performs well on highly imbalanced datasets (Blagus and Lusa 2015). Whereas other works related to ours (Fan et al. 2021) used the Synthetic Minority Over-Sampling Technique (SMOTE) to train on imbalanced data, we chose the more conservative undersampling method. SMOTE combines majority-class undersampling with minority-class oversampling, synthesizing additional data for the minority class (Chawla et al. 2002). The code sketch after these notes also illustrates fold-wise undersampling.

  19. We did not build any more complex machine learning models (like neural networks) to answer our first research question because our dataset was much smaller than those traditionally used to train deep learning models. Neural networks are much less interpretable than simpler machine learning models (Choi et al. 2020) and would therefore be less suitable for our purposes (where the results could potentially be involved in legal proceedings) in any case.

  20. https://etherscan.io/.

  21. https://remix.ethereum.org/.

  22. Notably, the SafeMath library was rendered superfluous by Solidity releases 0.8.0 and above (0.8.0 was released in December 2020 (Solidity Team 2020; Solidity Dev Studio 2020)). We initially hypothesized that violating tokens might rely on older versions of Solidity than legitimate ones because of the lengthy nature of the U.S. justice process. However, further inspection of the Solidity code for each token in our sample revealed that they all relied on Solidity versions between 0.4.13 and 0.5.0 (as shown in Table 12), and all but Gladius include the SafeMath library in their code.

  23. In later versions of Solidity, the “constant” modifier was changed to “view” (The Solidity Authors 2023).

  24. “Token” is used here in the sense of natural language processing, to refer to a portion of text (i.e., a word). It differs from the use of “token” in the rest of this article.

  25. Though a complete discussion of the admissibility of machine learning evidence is outside the scope of this paper, we provide a brief introduction here. The Fifth Amendment relates to an individual's right to due process, which could arise in the context of the "black box" of machine learning calculations; this "black box" refers not only to inexplicable machine learning algorithms but also to lay people's lack of understanding of how these algorithms work. The Sixth Amendment includes the Confrontation Clause, which requires experts to testify in person and submit to cross-examination; this is unlikely to be an issue, as the testimony of a machine learning expert should be satisfactory. The Federal Rules of Evidence around relevance, prejudice, and authenticity may be pertinent as well, and lawyers must further prove the accuracy of the evidence (Grossman 2021). Some argue that, under Rule 702 and Daubert v. Merrell Dow Pharmaceuticals, machine learning evidence is admissible as expert testimony; through Daubert, the court developed four considerations for evaluating expert testimony (Nutter 2018).

  26. The “Hinman standard” refers to William Hinman’s 2018 speech which considered the level of decentralization of a project critical to determining whether it should be classed as a security (Blockchain Association 2019).

  27. Currently, the only implemented EU regulations that apply to cryptocurrencies relate to money laundering. The EU’s securities and investment regulations do not currently apply (Kolinska 2022).
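To make the modeling choices in notes 17 and 18 concrete, the following sketch (ours, with synthetic stand-in data, not the study's code) combines both. Note that scikit-learn expresses the elastic net's ridge/lasso mix as l1_ratio (the alpha in note 17 follows the convention of Hastie et al.), elastic net classification is realized here via logistic regression with an elasticnet penalty, and imbalanced-learn's pipeline is assumed for undersampling inside cross-validation so that only training folds are resampled.

```python
# Sketch consolidating notes 17 and 18: scaling, elastic net, and
# fold-wise undersampling inside cross-validation. Test folds are
# never resampled because the sampler sits inside the pipeline.
import numpy as np
from imblearn.pipeline import Pipeline
from imblearn.under_sampling import RandomUnderSampler
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((600, 10))                 # stand-in opcode frequency features
y = (rng.random(600) < 0.1).astype(int)   # ~10% minority (violating) class

# Elastic net baseline: l1_ratio is the ridge/lasso mix (0 = ridge,
# 1 = lasso), matching the glmnet-style alpha = 0.001 in note 17.
elastic_net = Pipeline([
    ("undersample", RandomUnderSampler(random_state=42)),
    ("scale", StandardScaler()),  # remove the mean, scale to unit variance
    ("clf", LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.001, max_iter=10_000)),
])

random_forest = Pipeline([
    ("undersample", RandomUnderSampler(random_state=42)),
    ("clf", RandomForestClassifier(random_state=42)),
])

for name, model in [("elastic net", elastic_net), ("random forest", random_forest)]:
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```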

Abbreviations

DeFi: Decentralized finance
dApps: Decentralized applications
Opcode: Operational code
EVM: Ethereum virtual machine
Dexes: Decentralized exchanges
KYC: Know your client
SEC: Securities and exchange commission
FINRA: Financial industry regulatory authority
ICO: Initial coin offering
ERC-20: Ethereum request for comment
BA: Blockchain association
SMOTE: Synthetic minority over-sampling technique


Acknowledgements

The authors also thank Antonis Papasavva and Antoine Vendeville for their contributions to our code.

Funding

This work was funded by the UK EPSRC grant EP/S022503/1 that supports the Centre for Doctoral Training in Cybersecurity at UCL.

Author information


Contributions

AT: conceptualization, data collection, analysis, interpretation, and drafting the final manuscript. BK and TD: conceptualization, study design, and feedback on the manuscript. All authors have reviewed the final manuscript.

Corresponding author

Correspondence to Arianna Trozze.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Trozze, A., Kleinberg, B. & Davies, T. Detecting DeFi securities violations from token smart contract code. Financ Innov 10, 78 (2024). https://doi.org/10.1186/s40854-023-00572-5

