The TangleSerguei Popov October 1, 2017. Version 1.3AbstractIn this paper we analyze the mathematical foundations of IOTA, a cryp-tocurrency for the Internet-of-Things (IoT) industry. The main feature of thisnovel cryptocurrency is the tangle, a directed acyclic graph (DAG) for stor-ing transactions. The tangle naturally succeeds the blockchain as its nextevolutionary step, and o ers features that are required to establish a machine-to-machine micropayment system.An essential contribution of this paper is a family of Markov Chain MonteCarlo (MCMC) algorithms. These algorithms select attachment sites on thetangle for a transaction that has just arrived.1 Introduction and description of the systemThe rise and success of Bitcoin during the last six years proved that blockchain tech-nology has real-world value. However, this technology also has a number of drawbacksthat prevent it from being used as a generic platform for cryptocurrencies across theglobe. One notable drawback is the concept of a transaction fee for transactions ofany value. The importance of micropayments will increase in the rapidly developingIoT industry, and paying a fee that is larger than the amount of value being trans-ferred is not logical. Furthermore, it is not easy to get rid of fees in the blockchaininfrastructure since they serve as an incentive for the creators of blocks. This leadsto another issue with existing cryptocurrency technology, namely the heterogeneousnature of the system. There are two distinct types of participants in the system, thosewho issue transactions, and those who approve transactions. The design of this sys-tem creates unavoidable discrimination of some patricipants, which in turn createsa.k.a. mthcl; author’s contact information:

[email protected] icts that make all elements spend resources on con ict resolution. The afore-mentioned issues justify a search for solutions essentially di erent from blockchaintechnology, the basis for Bitcoin and many other cryptocurrencies.In this paper we discuss an innovative approach that does not incorporate blockchaintechnology. This approach is currently being implemented as a cryptocurrency callediota [1], which was designed speci cally for the IoT industry. The purpose of thispaper is to focus on general features of the tangle, and to discuss problems that arisewhen one attempts to get rid of the blockchain and maintain a distributed ledger.The concrete implementation of the iota protocol is not discussed.In general, a tangle-based cryptocurrency works in the following way. Insteadof the global blockchain, there is a DAG that we call the tangle. The transactionsissued by nodes constitute the site set of the tangle graph, which is the ledger forstoring transactions. The edge set of the tangle is obtained in the following way:when a new transaction arrives, it must approve two1 previous transactions. theseapprovals are represented by directed edges, as shown in Figure 12. If there is nota directed edge between transaction A and transaction B, but there is a directedpath of length at least two from A to B, we say that A indirectly approves B. Thereis also the \genesis“ transaction, which is approved either directly or indirectly byall other transactions (Figure 2). The genesis is described in the following way. Inthe beginning of the tangle, there was an address with a balance that contained allof the tokens. The genesis transaction sent these tokens to several other \founder“addresses. Let us stress that all of the tokens were created in the genesis transaction.No tokens will be created in the future, and there will be no mining in the sense thatminers receive monetary rewards \out of thin air“.A quick note on terminology: sites are transactions represented on the tanglegraph. The network is composed of nodes; that is, nodes are entities that issue andvalidate transactions.The main idea of the tangle is the following: to issue a transaction, users mustwork to approve other transactions. Therefore, users who issue a transaction arecontributing to the network’s security. It is assumed that the nodes check if theapproved transactions are not con icting. If a node nds that a transaction is incon ict with the tangle history, the node will not approve the con icting transactionin either a direct or indirect manner3.1This is the simplest approach. One may also study similar systems where transactions mustapprove k other transactions for a general k 2, or have an entirely di erent set of rules.2Time always increases from left to right in each gure.3If a node issues a new transaction that approves con icting transactions, then it risks that othernodes will not approve its new transaction, which will fall into oblivion.2As a transaction receives additional approvals, it is accepted by the system witha higher level of con dence. In other words, it will be di cult to make the systemaccept a double-spending transaction. It is important to observe that we do notimpose any rules for choosing which transactions a node will approve. Instead, weargue that if a large number of nodes follow some \reference“ rule, then for any xednode it is better to stick to a rule of the same kind4. This seems to be a reasonableassumption, especially in the context of IoT, where nodes are specialized chips withpre-installed rmware.In order to issue a transaction, a node does the following:The node chooses two other transactions to approve according to an algorithm.In general, these two transactions may coincide.The node checks if the two transactions are not con icting, and does not approvecon icting transactions.For a node to issue a valid transaction, the node must solve a cryptographicpuzzle similar to those in the Bitcoin blockchain. This is achieved by nding anonce such that the hash of that nonce concatenated with some data from theapproved transaction has a particular form. In the case of the Bitcoin protocol,the hash must have at least a prede ned number of leading zeros.It is important to observe that the iota network is asynchronous. In general, nodesdo not necessarily see the same set of transactions. It should also be noted thatthe tangle may contain con icting transactions. The nodes do not have to achieveconsensus on which valid5 transactions have the right to be in the ledger, meaningall of them can be in the tangle. However, in the case where there are con ictingtransactions, the nodes need to decide which transactions will become orphaned6.The main rule that the nodes use for deciding between two con icting transactions isthe following: a node runs the tip selection algorithm7 (cf. Section 4.1) many times,and sees which of the two transactions is more likely to be indirectly approved by theselected tip. For example, if a transaction was selected 97 times during 100 runs ofthe tip selection algorithm, we say that it is con rmed with 97% con dence.Let us also comment on the following question (cf. [4]): what motivates the nodesto propagate transactions? Every node calculates some statistics, one of which is4We comment more on this at the end of Section 4.15Transactions that are issued according to the protocol.6Orphaned transactions are not indirectly approved by incoming transactions anymore7As mentioned above, there is a good reason to assume that other nodes would follow the samealgorithm for tip selection.3how many new transactions are received from a neighbor. If one particular node is\too lazy“, it will be dropped by its neighbors. Therefore, even if a node does notissue transactions, and hence has no direct incentive to share new transactions thatapprove its own transaction, it still has incentive to participate.After introducing some notation in Section 2, we discuss algorithms for choosingthe two transactions to approve, the rules for measuring the transaction’s overallapproval (Section 3, especially Section 3.1), and possible attack scenarios (Section 4).Also, in the unlikely event that the reader is scared by the formulas, they can jumpdirectly to the \conclusions“ at the end of each section.It should be noted that the idea of using DAGs in the cryptocurrency spacehas been around for some time, see [3, 6, 7, 9, 12]. Speci cally, [7] introduces theGHOST protocol, which proposes a modi cation of the Bitcoin protocol by makingthe main ledger a tree instead of a blockchain. It is shown that such a modi cationreduces con rmation times and improves the overall security of the network. In [9]the authors consider a DAG-based cryptocurrency model. Their model is di erentthan our model for the following reasons: the sites of their DAG are blocks instead ofindividual transactions; the miners in their system compete for transaction fees; andnew tokens may be created by block miners. Also, observe that a solution somewhatsimilar to ours was proposed in [6], although it does not discuss any particular tipapproval strategies. After the rst version of this paper was published, several otherworks that aim to create a DAG-based distributed ledger have appeared, e.g. [8]. Wealso reference another approach [2, 10] that aims to make Bitcoin micropaymentspossible by establishing peer-to-peer payment channels.2 Weights and moreIn this section we de ne the weight of a transaction, and related concepts. Theweight of a transaction is proportional to the amount of work that the issuing nodeinvested into it. In the current implementation of iota, the weight may only assumevalues 3n, where n is a positive integer that belongs to some nonempty interval ofacceptable values8. In fact, it is irrelevant to know how the weight was obtained inpractice. It is only important that every transaction has a positive integer, its weight,attached to it. In general, the idea is that a transaction with a larger weight is more\important“ than a transaction with a smaller weight. To avoid spamming and otherattack styles, it is assumed that no entity can generate an abundance of transactionswith \acceptable“ weights in a short period of time.8This interval should also be nite | see the \large weight attack“ in Section 4.4One of the notions we need is the cumulative weight of a transaction: it is de nedas the own weight of a particular transaction plus the sum of own weights of alltransactions that directly or indirectly approve this transaction. The algorithm forcumulative weight calculation is illustrated in Figure 1. The boxes represent trans-actions, the small number in the SE corner of each box denotes own weight, and thebold number denotes the cumulative weight. For example, transaction F is directlyor indirectly approved by transactions A;B;C;E. The cumulative weight of F is9 = 3 + 1 + 3 + 1 + 1, which is the sum of the own weight of F and the own weightsof A;B;C;E.Let us de ne \tips“ as unapproved transactions in the tangle graph. In the toptangle snapshot of Figure 1, the only tips are A and C. When the new transaction Xarrives and approves A and C in the bottom tangle snapshot, X becomes the onlytip. The cumulative weight of all other transactions increases by 3, the own weightof X.We need to introduce two additional variables for the discussion of approval algo-rithms. First, for a transaction site on the tangle, we introduce itsheight: the length of the longest oriented path to the genesis;depth: the length of the longest reverse-oriented path to some tip.For example, G has height 1 and depth 4 in Figure 2 because of the reverse pathF;D;B;A, while D has height 3 and depth 2. Also, let us introduce the notion ofthe score. By de nition, the score of a transaction is the sum of own weights ofall transactions approved by this transaction plus the own weight of the transactionitself. In Figure 2, the only tips are A and C. Transaction A directly or indirectlyapproves transactions B;D;F;G, so the score of A is 1+3+1+3+1 = 9. Analogously,the score of C is 1 + 1 + 1 + 3 + 1 = 7.In order to understand the arguments presented in this paper, one may safelyassume that all transactions have an own weight equal to 1. From now on, we stickto this assumption. Under this assumption, the cumulative weight of transaction Xbecomes 1 plus the number of transactions that directly or indirectly approve X, andthe score becomes 1 plus the number of transactions that are directly or indirectlyapproved by X.Let us note that, among those de ned in this section, the cumulative weight is(by far!) the most important metric, although height, depth, and score will brie yenter some discussions as well.51111311311412911 6131041113113117451214 9161333ABCDXEFFigure 1: DAG with weight assignments before and after a newly issued transac-tion, X. The boxes represent transactions, the small number in the SE corner of eachbox denotes own weight, and the bold number denotes the cumulative weight.6913113117ACgenesisBDEFGFigure 2: DAG with own weights assigned to each site, and scores calculated forsites A and C.3 Stability of the system, and cutsetsLet L(t) be the total number of tips in the system at time t. One expects that thestochastic process L(t) remains stable9. More precisely, one expects the process to bepositive recurrent, see Sections 4.4 and 6.5 of [11] for formal de nitions. In particular,positive recurrence implies that the limit of P[L(t) = k] as t!1 should exist andbe positive for all k 1. Intuitively, we expect that L(t) should uctuate around aconstant value, and not escape to in nity. If L(t) were to escape to in nity, manyunapproved transactions would be left behind.To analyze the stability properties of L(t), we need to make some assumptions.One assumption is that transactions are issued by a large number of roughly indepen-dent entities, so the process of incoming transactions can be modeled by a Poissonpoint process (cf. e.g. Section 5.3 of [11]). Let be the rate of that Poisson process.For simplicity, let us assume that this rate remains constant in time. Assume thatall devices have approximately the same computing power, and let h be the averagetime a device needs to perform calculations that are required to issue a transaction.Then, let us assume that all nodes behave in the following way: to issue a transac-tion, a node chooses two tips at random and approves them. It should be observedthat, in general, it is not a good idea for the \honest nodes“ to adopt this strategybecause it has a number of practical disadvantages. In particular, it does not o erenough protection against \lazy“ or malicious nodes (see Section 4.1 below). On theother hand, we still consider this model since it is simple to analyze, and may provideinsight into the system’s behavior for more complicated tip selection strategies.9Under an additional assumption that the process is time-homogeneous.7Next, we make a further simplifying assumption that any node, at the momentwhen it issues a transaction, observes not the actual state of the tangle, but the oneexactly h time units ago. This means, in particular, that a transac