
Big Players in Bitcoin: Estimating the Association between Large Holders’ Activity and Price Volatility

William Reginald

Master’s Thesis Final Draft

August 1st, 2018

Bitcoin is a peer-to-peer payments network maintained without any central authority. I utilize the institutional structure of Bitcoin, namely the blockchain technology, to study the relationship between price volatility and the activity of the biggest players. Surprisingly, the overarching finding is that the largest holders are associated with lower price volatility relative to those further down the top of the distribution. This finding is preserved when addresses are stratified into groups based on their corresponding position in the income distribution. Several extensions to the baseline model are presented and limitations of the analysis are discussed. In addition, I develop a roadmap to further progress the cryptocurrency literature by describing the Continuous Income Distribution, and I conclude with a discussion of applying empirical analysis to alternative cryptocurrencies.

1 Introduction

Bitcoin was developed as a proposed solution to the double-spending problem faced by all cash systems. Previously, this problem necessitated a central authority to ensure all transactions were sent from accounts with sufficient balances, in order to prevent individuals from paying for multiple transactions, nearly simultaneously, with the same money. Nakamoto’s primary contribution was developing a decentralized network protocol designed to maintain the security of the cash system (Nakamoto 2009). By aligning the incentives of users, software developers and the mining nodes who maintain the network, Nakamoto created a global community connected via a currency supported without a sovereign state. Ironically, the paper was first published online shortly after the beginning of the 2008 Financial Crisis, when the public’s trust in traditional financial institutions began to dwindle. Bitcoin has since gained interest from many adopters, for a variety of use cases spanning remittances, payments and speculation to money laundering and cybercrime.

Bitcoin recently experienced another meteoric price increase as its market capitalization grew to over $300B USD, surpassing the traditional financial institution Bank of America and nearly J.P. Morgan too [1]. Subsequently, its price decreased substantially and as of July 2018 remains around $7000 USD. This aggregate volatility is reflected in short-term volatility: on many days throughout the sample period, the highest and lowest prices at which Bitcoin was sold differed by over $500, or roughly 5%. In general, cryptocurrencies have been regarded as more volatile than other historically risky financial instruments, such as internet stocks throughout the late 1990s. The characteristics that could give Bitcoin underlying value, such as user and merchant adoption rates, energy costs of production, the number of active wallets, positive regulatory news, innovative use cases etc., do not vary frequently enough to justify the rapid daily swings in price. Indeed, these short-term swings led Yermack (2013) to conclude that Bitcoin does not satisfy the unit of account criterion, implying it is a speculative asset rather than a currency. The first transaction using Bitcoin exemplifies this: in 2010 someone purchased two pizzas for approximately 10,000 BTC, valued then around $30, which would be worth around $70,000,000 in July 2018 and over $200,000,000 in December 2017 [2].

[1] https://coinmarketcap.com/currencies/bitcoin/

This paper aims to study the association between the actions of the richest addresses and the short-term price volatility of Bitcoin. A common theme in the cryptocurrency community is that the big players, informally known as Whales, are often the characters behind price fluctuations. Many early adopters of Bitcoin acquired large quantities at low prices throughout 2009 to 2012 and became immensely wealthy as the price recently skyrocketed. Consequently, I propose there are two transmission channels for their activity to affect price:

• Direct: Taking large positions, enduring short-run losses, to influence the price in their preferred direction (e.g. a large player with a bullish outlook could market-buy a large quantity of Bitcoin, over-paying in the short run to increase their overall wealth if the price increases).

• Indirect: Signalling their outlook by engaging in transaction activity that appears in Bitcoin’s blockchain (e.g. transferring large quantities across addresses could be interpreted as a bullish or bearish sign by the broader market).

I develop an empirical model that relates the actions of the largest players, visible via Bitcoin’s blockchain, to price volatility to capture these effects. The main finding is that it is the players below the very top of the distribution that have the highest degree of association with increases in price volatility. Further, higher volatility is not monotonically associated with more activity, as the busiest addresses do not necessarily have the largest positive coefficients. Rather, the timing of the activity seems to be more important, indicating that brains may trump brawn for large players’ signals to reach the broader market. These results are robust to several measures of volatility, sources of price data and time frequencies of activity. The baseline empirical procedure yields many interesting extensions and possible applications to alternative cryptocurrencies.

[2] https://www.coindesk.com/he-paid-how-much-coindesk-releases-bitcoin-pizza-day-price-tracker/

The paper proceeds as follows: section two provides a background to Bitcoin, section three presents the dataset and summary statistics, section four outlines the empirical model and baseline results, section five discusses possible extensions and section six concludes.

2 Background

2.1 Institutional Setting

In simple terms, Bitcoin comprises several key components: the blockchain technology, the Proof-of-Work (PoW) encryption process and the mining network (Nakamoto 2009). The blockchain is a decentralized public ledger containing a record of all transactions and balances between Bitcoin addresses. In a traditional banking setting, this would be equivalent to the records of each account’s past transactions and balances. An address is simply a 26-35 character alphanumeric identifier, analogous to an email address where the inbox of emails is equivalent to the address’s balance of Bitcoin. One of Nakamoto’s primary objectives was to reduce the trust and reliance the public places on traditional financial institutions. His insight was to create an immutable, time-stamped, pseudo-anonymous and continuously growing dataset that is universally agreed upon and updated by the network. In the aggregate, every node has a copy of the same data and trust in Bitcoin has been effectively decentralized.

In order to maintain trust and align the incentives to maintain the network, Nakamoto implemented the PoW process. Using cryptography, the existing blockchain can be repeatedly updated and verified for correctness, in an efficient manner, yielding a virtually tamper-proof system. The PoW process involves a collision-resistant and non-invertible hashing function, which takes the existing blockchain and a proposed new block of transactions, that have been successfully encrypted, as inputs. The function outputs an updated blockchain ready for the next block of encrypted transactions to be appended. The underlying machinery is much more complex, but this cursory explanation is sufficient to understand the role cryptography plays in Bitcoin.
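As a rough illustration of the hashing step described above, the following Python sketch searches for a nonce that makes a block's SHA-256 digest start with a required number of zeros. It is a toy under stated assumptions, not Bitcoin's actual procedure: the payload layout, the `block_digest` and `mine` helpers and the hex-prefix difficulty rule are hypothetical simplifications (the real protocol hashes a structured block header and compares the result against a numeric target).

```python
import hashlib
from typing import Tuple


def block_digest(previous_hash: str, transactions: str, nonce: int) -> str:
    """Hash the previous block's digest, the new transactions and a nonce."""
    payload = f"{previous_hash}|{transactions}|{nonce}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mine(previous_hash: str, transactions: str, difficulty: int) -> Tuple[int, str]:
    """Try nonces until the digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_digest(previous_hash, transactions, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Hypothetical example: chain one block onto an all-zero "genesis" digest.
genesis = "0" * 64
nonce, digest = mine(genesis, "alice->bob:5", difficulty=3)

# Finding the nonce takes many attempts; checking it takes a single hash.
assert block_digest(genesis, "alice->bob:5", nonce) == digest
```

The asymmetry is the point of the sketch: the search loop is expensive, while any node can verify the winning nonce with one hash, and changing the transactions changes the digest.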

Thirdly, the mining network is a set of globally distributed nodes that each compete to find the solution to the hashing function, thereby officially recording the new block of transactions (Nakamoto 2009). One only requires a computer with good internet access and enough space to store the 150GB blockchain dataset to become a node in the network [3]. The node that first solves the computationally intensive cryptography problem processes the new block, appends it to the updated blockchain, earns the fees associated with the transactions recorded in the new block and also earns the newly minted Bitcoin. The latter compensation method aligns the incentives of the miners and simultaneously forms an ingenious solution for controlling inflation, from the perspective of Bitcoin as a currency system. The PoW process can monitor the network capacity, known as the Hash Rate, and scale the degree of computational difficulty. This ensures blocks are not created too quickly, which would result in new Bitcoin entering circulation too fast as well. It is this feature that I view as akin to central banks controlling the money supply of their currencies. I find this structure a compelling reason to view Bitcoin as a currency rather than a speculative asset. See Berentsen and Schar (2018) for a comprehensive and approachable background to cryptocurrencies, as it uses many diagrams to develop a more concrete understanding.

[3] Download Bitcoin’s core protocol here: https://bitcoin.org/en/download.
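The difficulty scaling described above can be sketched in a few lines of Python. The constants below (a ten-minute target, a 2016-block adjustment window and a factor-of-four bound on each adjustment) are background facts about the Bitcoin protocol rather than details taken from this paper, and the `retarget` function is a hypothetical simplification that works with a difficulty number instead of the protocol's numeric target.

```python
TARGET_BLOCK_SECONDS = 600   # roughly one block every ten minutes
RETARGET_INTERVAL = 2016     # blocks between difficulty adjustments
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL


def retarget(old_difficulty: float, actual_timespan: float) -> float:
    """Scale difficulty so the next window should again take about two weeks."""
    # Each adjustment is bounded to a factor of four in either direction.
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN / 4), EXPECTED_TIMESPAN * 4)
    return old_difficulty * EXPECTED_TIMESPAN / clamped


# If the last window was mined twice as fast as intended, difficulty doubles.
doubled = retarget(100.0, EXPECTED_TIMESPAN / 2)
```

When hash rate joins the network, blocks arrive faster than the target, the measured timespan shrinks and difficulty rises, which is the mechanism that keeps new Bitcoin from entering circulation too quickly.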

The most relevant features of Bitcoin are: price volatility, availability of data on the public ledger and the limited supply of Bitcoin in circulation. Figure One plots two measures of Bitcoin’s price volatility throughout the sample period of July 2017 through June 2018. As seen below, the volatility peaks in the December through January period, which corresponds to the meteoric rise and fall of Bitcoin’s price. There are many instances where the daily volatility exceeds $500 and occasionally $1000. During the most chaotic period, this daily volatility often exceeded 10% of the lowest price of the day. The price volatility is treated more rigorously in Section 3.3, but the primary takeaway is that Bitcoin’s price has been subject to tremendous volatility.

Blockexplorers allow one to search the blockchain instantly for each address and those related by transaction history, as well as their balance history. The pseudo-anonymity feature arises because each address is publicly visible but there is no direct connection to who owns the address. Software like blockexplorers imbues the community with a higher degree of connectivity than in traditional banking settings, which supports my argument that the actions of large players can propagate faster and induce a higher degree of price volatility. The website Bitinfocharts.com uses software to query addresses in the blockchain by balance size and provides an income distribution that is constantly updated as blocks are added.

There are many other publicly available data sources covering characteristics of Bitcoin relevant to other literatures, and this ought to motivate more academic interest. For example, there are data on the aggregate mining capacity of Bitcoin and its distribution sub-divided by mining pool [4]. These data could be relevant for an empirical industrial organization study of the mining network, its degree of competitiveness and the return on investing in mining technology. Many other applications to political economy, economic development, international finance and monetary policy can be explored via the available data.

Moreover, Bitcoin was designed such that the maximum amount that can enter the money supply via mining is 21,000,000 BTC. There is already over 17,500,000 BTC in circulation, implying that over time relatively little new Bitcoin will enter the money supply with each new block successfully appended (Nakamoto 2009, Blockchain.info). Since the income distribution is heavily skewed towards the top, relatively little Bitcoin enters the circulating supply. Further, Bitcoin costs much more than the prices faced by early adopters, which implies the barriers to moving up the distribution are quite high. These features support the argument that the actions of the early adopters, who have primarily become the modern large holders, bear significant weight in influencing Bitcoin’s price fluctuations. As an example, amidst the price decline of spring 2018 there were concerns about a large holder in Japan taking 16,000 BTC out of cold storage [5]. Roughly, this is currency that was effectively taken out of circulation in Bitcoin’s early stages and is now being reintroduced, increasing the money supply by a non-trivial quantity and likely resulting in negative pressure on its price.

[4] https://www.blockchain.com/en/pools
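The 21,000,000 BTC ceiling mentioned above follows from the protocol's block-subsidy schedule, which is background knowledge rather than something derived in this paper: the subsidy started at 50 BTC per block and halves every 210,000 blocks. A minimal arithmetic sketch, with hypothetical names and amounts counted in satoshi (one hundred-millionth of a Bitcoin):

```python
BLOCKS_PER_HALVING = 210_000
SATOSHI_PER_BTC = 100_000_000
INITIAL_SUBSIDY_SATOSHI = 50 * SATOSHI_PER_BTC


def total_supply_satoshi() -> int:
    """Sum the block subsidy over every halving era until it rounds down to zero."""
    total = 0
    subsidy = INITIAL_SUBSIDY_SATOSHI
    while subsidy > 0:
        total += BLOCKS_PER_HALVING * subsidy
        subsidy //= 2  # integer halving, so the subsidy eventually reaches zero
    return total


max_supply_btc = total_supply_satoshi() / SATOSHI_PER_BTC  # just under 21 million
```

Because each era adds half as much as the one before, most of the eventual supply is issued early, which is consistent with the observation that little new Bitcoin enters circulation with each additional block.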

2.2 Literature Review

Yermack (2013) provides a preliminary academic analysis of whether Bitcoin should be viewed as an asset or a currency. His work also gives perspective on how far cryptocurrencies have developed in modern markets. Yermack concluded that Bitcoin failed to satisfy the three criteria of currencies: medium of exchange, unit of account and store of value. This analysis was based on a recent hack of the most prominent exchange, Mt. Gox, and the observation that most other exchanges experienced low volume and liquidity. Berentsen and Schar (2018) provide an updated outlook and risk analysis, concluding that, despite high price volatility, cryptocurrencies are a new financial asset class and that the Blockchain technology will likely have future implications beyond Bitcoin. Together, these contrasting papers highlight the progression cryptocurrencies have made towards mainstream adoption. While not every criterion of being a currency has been entirely met, Bitcoin, and the cryptocurrency ecosystem as a whole, have made immense strides forward.

[5] https://cryptoslate.com/tokyo-whale-prepares-8000-btc-dump-as-crypto-bloodbath-continues/

As a cash system, Bitcoin and other decentralized payment networks have many implications for central banking and monetary economics. Chiu and Koeppl (2017) develop a general equilibrium monetary model of Bitcoin and find that the resource-intensive Proof-of-Work process yields a welfare loss of consumption. The authors argue that a prominent alternative, the Proof-of-Stake (PoS) process, could be implemented by Bitcoin to utilize resources more efficiently. Indeed, the annual amount of electricity consumed by the Bitcoin network is comparable to that of Denmark and larger than that of Greece [6]. This is due to the computationally difficult task miners undertake when finding the solution to the hashing function. Specialized mining rigs, known as Application-Specific Integrated Circuits (ASICs), have been developed to reduce the immense energy consumption of the mining process. Alternatively, PoS proposes that the weight given to a participant in validating transactions be proportional to their holdings. Proponents of PoW argue that its energy consumption is a trade-off worth accepting, as PoS concentrates the consensus mechanism at the top of the income distribution. Consequently, under PoS smaller users and adopters must trust these large holders to maintain the network, whereas PoW relies on competition despite its high energy consumption. Moreover, Raskin and Yermack (2016) study the implications of cryptocurrencies from the perspective of payments systems and conclude that the technology is yet to be fully developed for other uses. To summarize, central banks seem to be exploring cryptocurrencies cautiously, and future developments are required to appreciate the true impact cryptocurrencies will have on the monetary system.

[6] https://arstechnica.com/tech-policy/2017/12/bitcoins-insane-energy-consumption-explained/
Budish (2018) develops conditions to study the mining incentives and the susceptibility of the Bitcoin network to a majority attack. This occurs when more than half of the network’s computing power is controlled by one group and an alternate version of the Blockchain, with false transactions, is accepted by consensus as correct. This allows for double-spending, as multiple recipients will be waiting for the same funds to arrive. The double-spending problem is a serious drawback of the decentralized network: such an attack happened in May 2018 to Bitcoin Gold, the 26th largest cryptocurrency at the time [7]. Other smaller cryptocurrencies that have experienced a majority attack in the past year are Verge and Krypton. With that said, Bitcoin has a much deeper mining network and estimates suggest that one would need over $6.8 billion in mining equipment and would have to pay for 93 million kWh of electricity per day [8]. Together, these immense costs align the incentives of the attacker to maintain the network and preserve the value of Bitcoin, rather than manipulating the blockchain to steal Bitcoin and cause the community to devalue it.

Furthermore, a key feature of the decentralized mining network is the consensus mechanism, where nodes of the network agree on a common version of the blockchain and compete to update that copy. Satoshi describes the mechanism’s design as follows: “nodes can leave and rejoin the network at will, accepting the proof-of-work chain as proof of what happened while they were gone. They vote with their CPU power, expressing their acceptance of valid blocks by working on extending them and rejecting invalid blocks by refusing to work on them. Any needed rules and incentives can be enforced with this consensus mechanism” (Nakamoto 2009). Catalini and Gans (2018) view this consensus mechanism as a new market design solution, as it facilitates transactions without a central platform operator. The authors explain that the costs of verification and networking are reduced by this market design and conclude that Bitcoin has the opportunity to advance the current approaches to data, privacy and ownership. I argue that the consensus mechanism, and the associated culture of information transmission, amplify the activity signals of the big players and their propagation across the financial markets.

[7] http://fortune.com/2018/05/29/bitcoin-gold-hack/
[8] https://gobitcoin.io/tools/cost-51-attack/

Moreover, Athey et al. (2016) study the market for Bitcoin, its price determination and usage from the consumers’ perspective. The authors conduct an empirical analysis of the blockchain, using heuristics to match multiple addresses to the same wallet. Further, the authors develop a model of the Bitcoin/USD exchange rate, equivalent to price, as proportional to the ratio of volume to effective supply in circulation. The price implied by their model matches some of the overarching trends of Bitcoin’s actual price but does not capture any of the short-term volatility, as shown in Figure Two. Two key implications of their results are:

• Big players with dormant addresses [9] could introduce a large quantity of Bitcoin into the existing circulating supply. This would dramatically reduce velocity and increase supply, putting downward pressure on the price. This mechanism is consistent with both channels described above, as merely a transfer from a dormant address could indicate the circulating supply will soon increase.

• One must incorporate other features into their model to account for the short-term volatility in Bitcoin’s price.
[9] A dormant address is an address that contains a large quantity of Bitcoin but has no recent transaction history. The owner of the address could be holding long-term or have lost access, implying those Bitcoin are effectively removed from the money supply, which has the opposite effect to that described.

Together, these observations motivate studying high-frequency data and formulating a hypothesis, such as the activity of the big players, to capture the high degree of variability.

Zimmerman (2018) is a recent working paper also aimed at addressing Bitcoin’s short-term price volatility. Zimmerman studies the fees paid by users for each transaction and posits “a speculator with a stronger signal about future price will pay a higher fee in order to trade more rapidly on the private information. This means that more extreme signals are incorporated into the price more quickly, causing price volatility” (Zimmerman 2018). The author develops a theoretical model with speculators and transactors, yielding the empirical prediction: greater (less) demand for blockchain space reduces (increases) blockchain capacity and raises (lowers) fees for transactions to be confirmed on the blockchain, causing price volatility to increase (decrease). In the data, this would be observed as periods in which blocks use the most space being associated with high levels of price volatility. The preliminary results are, ironically, the opposite: the coefficient on block size is both negative and statistically significant [10]. With that said, Zimmerman uses quite a long sample, July 2010 through March 2018, during which Bitcoin evolved enormously, and this likely introduces noise into his estimates.

Zimmerman and I have consistent mechanisms, as the large players are likely to pay relatively much higher fees for their transactions. This is because fees are priced in levels and do not scale with the size of the transaction. Hence, as transactions and holdings increase, fees are a relatively smaller share of the user’s portfolio. Thus, if large holders’ activity is driving the price volatility and they are also likely to be paying relatively higher fees, then Zimmerman’s model may capture the same causal channel. Since blocks are processed and appended to the blockchain approximately every ten minutes, the daily average used in Zimmerman’s model is likely a noisy measure that obscures the results too. Two innovations I employ to improve precision are using hourly data, which does not suppress the short-term volatility like daily price measures do, and studying a more relevant sample period.

[10] That is, blocks with a larger size or higher usage were actually associated with lower price volatility.

The baseline empirical model employed in Zimmerman (2018) is given by:

$$\mathrm{PriceVol}_t = \alpha + \beta\,\mathrm{Blocksize}_t + \epsilon_t \tag{1}$$

This research project was originally motivated by extending the analysis of Gandal et al. (2018), aiming to address a related question using modern data. The authors examine the cryptocurrency exchange Mt. Gox, which experienced a hack and subsequent decline in 2014. The exchange’s data was dumped, yielding a matched dataset of 18 million buy and sell transactions occurring between April 2011 and November 2013. This dataset includes trading activity that is settled at the exchange level rather than being mined and recorded on the blockchain, as well as User ID numbers. The latter is crucial, as this is a feature of the exchange and allows the authors to match transactions to the same actor [11]. The authors aim to study the suspicious trading activity of two players and its association with Bitcoin’s price movements. The paper finds a large increase in the price of Bitcoin on the days each player was active. The average daily price volatility was approximately 1% if neither player was active and increased by 4% for each player, a result that is both economically and statistically significant (Gandal et al. 2018).

[11] Using only blockchain data, in order to link transactions one must employ heuristics, such as those used in Athey et al. (2016). Further, these heuristics use probabilistic matching, whereas matching on User ID results in near-perfect matches.

Two things to bear in mind are that the Mt. Gox exchange comprised 80% of the total Bitcoin transaction volume and that the total market capitalization of Bitcoin was around $1 billion. Together, these features imply large actions on Mt. Gox could materialize directly into price changes. In the modern Bitcoin market, there are many exchanges and no exchange is responsible for a large share of the global volume. Moreover, due to exchange arbitrage, the relative pressure from all other exchanges prevents any single exchange from deviating far from the global average. Consequently, I was motivated to look at the trading activity of the largest players in the aggregate, to operationalize the long-held theory that large players influence Bitcoin’s price volatility.

2.3 Research Question

The primary research question I study is: is there a high degree of association between Bitcoin’s price volatility and the activity of big players? The income distribution for Bitcoin is highly skewed, with less than 0.1% of all addresses containing over 20% of the existing money supply. Consequently, the largest holders may influence the price in two ways: directly, by putting upward or downward pressure via large trading positions, and indirectly, by signalling to the market that a large quantity of Bitcoin has been moved. In the latter case, this could be moved from cold storage or another dormant wallet, thereby increasing the circulating supply of Bitcoin and creating downward pressure on the price.

This paper makes three contributions to the cryptocurrency literature. First, the dataset of transactions for the wealthiest addresses is publicly accessible and readily available for related analysis. Second, I have extended the analysis of Gandal et al. (2018) to more recent data and utilized additional modelling specifications, namely multiple time frequencies and dummy variables for many actors. Lastly, I have progressed the empirical literature on cryptocurrencies by providing a roadmap for fruitful future research projects.

3 Dataset

3.1 Price Data

Taking an alternative approach to Zimmerman (2018) and Gandal et al. (2018) by using higher-frequency data, I developed an hourly time series of the price of Bitcoin in U.S. dollars. Hourly exchange data is available on https://www.cryptodatadownload.com/ and I computed a simple average of the hourly high and low prices across four major exchanges: Kraken, Bitfinex, Coinbase and Bitstamp [12]. These exchanges accounted for roughly 60% of the total Bitcoin volume during the sample period [13]. Further, due to the deep and liquid Bitcoin market, the arbitrage opportunities across exchanges are relatively low and hence these measures are likely quite representative of the hourly high and low prices of Bitcoin.

[12] That is, the highest and lowest price on each exchange $e$ in period $h$, $\mathrm{PriceHigh}_{e,h}$ and $\mathrm{PriceLow}_{e,h}$.
[13] https://data.bitcoinity.org/markets/volume/
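The cross-exchange averaging described in Section 3.1 might look like the sketch below. The data layout (a dictionary of exchange name to hourly `(high, low)` pairs), the function name and the prices are all made up for illustration; the actual files from the data provider would need to be parsed into this shape first, and how hours missing on some exchange were handled in the paper is not stated, so the sketch simply averages over the exchanges that report a given hour.

```python
from statistics import mean
from typing import Dict, Tuple

Candle = Tuple[float, float]  # (high, low) in USD for one hour


def average_across_exchanges(candles: Dict[str, Dict[str, Candle]]) -> Dict[str, Candle]:
    """Simple average of hourly highs and lows over the exchanges reporting each hour."""
    hours = sorted({hour for per_hour in candles.values() for hour in per_hour})
    averaged = {}
    for hour in hours:
        quotes = [per_hour[hour] for per_hour in candles.values() if hour in per_hour]
        highs = [high for high, _ in quotes]
        lows = [low for _, low in quotes]
        averaged[hour] = (mean(highs), mean(lows))
    return averaged


# Illustrative numbers only, not observed prices.
candles = {
    "Kraken":   {"2018-01-01T00": (13800.0, 13600.0)},
    "Bitfinex": {"2018-01-01T00": (13850.0, 13650.0)},
    "Coinbase": {"2018-01-01T00": (13900.0, 13700.0)},
    "Bitstamp": {"2018-01-01T00": (13850.0, 13650.0)},
}
hourly = average_across_exchanges(candles)
```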
3.2 Income Distribution and Transaction Data

The blockchain contains the complete record of balances and transactions between all public addresses. Consequently, one can study the distribution of Bitcoin across addresses, as well as track the activity and balances of addresses across time. Figure Three shows the income distribution for the 73 largest holders included in the empirical analysis.

Note that users can make a new address for each transaction, so the bottom of the distribution is likely over-stated. Further, the income distribution changes over time, as there are very low barriers to transferring large quantities of Bitcoin. All income distribution data is as of July 1st, 2018, specifically block number 530063. In section 5.2, I discuss the concept of a Continuous Income Distribution that links together addresses that transacted a majority of their balances, to build a much richer transaction history. For example, the sixth richest address, 183hmJGRuTEi2YDCWy5iozY8rZtFwVgahM, received 85,000 BTC or approximately $500,000,000 on July 1st, 2018 and has no previous transaction history. The address that sent the transaction, 1KAt6STtisWMMVo5XGdos9P7DBNNsFfjx7, has a much richer transaction history and would be more representative in the analysis. This example highlights that the results may be sensitive to the income distribution, as it changes over time and can suppress previous activity. The sample period used in this analysis would include the former address, while future work would observe an income distribution containing the latter address or potentially subsequent transactions.

Moreover, using a blockexplorer, the individual transaction data was extracted for each address in the blockchain containing over 10,000 BTC in balance. Each address corresponds to a position in the distribution, and the $N$th richest address corresponds to a dummy variable defined as:



$$\mathrm{Player}_{N,h} = \begin{cases} 1, & \text{if a transaction occurs in hour } h \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

Transactions recorded were limited to a 10 BTC minimum [14], as signals from big players likely need to be of moderate size. Outlier addresses with large quantities of transactions [15] were excluded, as there were too many observations concentrated on a few players. Further work on handling the highly active large players, namely exchanges, ought to be undertaken.
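To make the construction of the activity dummy in equation (2) concrete, here is a minimal sketch that turns one address's transaction list into an hourly 0/1 series, applying the 10 BTC minimum. The input format, the `activity_dummy` name and the sample transactions are assumptions for illustration; the paper does not describe the extraction code.

```python
from datetime import datetime
from typing import List, Tuple

MIN_TRANSFER_BTC = 10.0  # transactions below this size are ignored


def activity_dummy(
    transactions: List[Tuple[datetime, float]], hours: List[datetime]
) -> List[int]:
    """Return 1 for each hour containing a qualifying transaction, else 0."""
    active_hours = {
        timestamp.replace(minute=0, second=0, microsecond=0)
        for timestamp, amount_btc in transactions
        if amount_btc >= MIN_TRANSFER_BTC
    }
    return [1 if hour in active_hours else 0 for hour in hours]


# Hypothetical address with one large and one small transfer.
hours = [datetime(2018, 1, 1, h) for h in range(4)]
transactions = [
    (datetime(2018, 1, 1, 1, 30), 250.0),  # counts: at least 10 BTC
    (datetime(2018, 1, 1, 2, 5), 3.0),     # ignored: below the minimum
]
player = activity_dummy(transactions, hours)
```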

3.3 Final Dataset and Summary Statistics

The resulting dataset contains hourly observations for 73 active addresses, accounting for approximately twelve percent of all Bitcoin in circulation [16], across the sample period of July 1st, 2017 through June 30th, 2018. There are 120 addresses with balances over 10,000 BTC, but the excluded addresses were not active over the sample period. Many addresses acquired Bitcoin in its early stages and have remained dormant for five years or longer. As previously argued, this feature reduces the circulating supply and likely amplifies the effects of the actions of the large players. The key variable of interest, price volatility, was computed in two ways and in both levels and percent. The intra-hourly measure, equation (3), captures the direct effect on price volatility from the actions of the big players, while the inter-hourly measure, equation (4), aims to capture the indirect market signalling effect of the big players’ activity.

[14] As of July 2018, this was worth approximately $65,000 USD.
[15] Typically exceeding 300, but some addresses had over 10,000 transactions.
[16] According to bitinfocharts.com; this figure is based on the entire supply. The addresses contain a much larger share of the circulating supply when lost Bitcoin and dormant wallets are excluded.

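$$\mathrm{PriceVol1}_h = \frac{\mathrm{PriceHigh}_h - \mathrm{PriceLow}_h}{\mathrm{PriceLow}_h} \tag{3}$$

$$\mathrm{PriceVol2}_h = \frac{\mathrm{PriceHigh}_h - \mathrm{PriceHigh}_{h-1}}{\mathrm{PriceHigh}_{h-1}} \tag{4}$$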
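The two percentage volatility measures in equations (3) and (4) can be computed from the hourly high and low series as in the sketch below. The function names and the sample prices are illustrative assumptions; the first inter-hourly value is left undefined (`None`) because there is no preceding hour.

```python
from typing import List, Optional


def intra_hour_volatility(high: List[float], low: List[float]) -> List[float]:
    """Equation (3): range within the hour, relative to the hour's low."""
    return [(h - l) / l for h, l in zip(high, low)]


def inter_hour_volatility(high: List[float]) -> List[Optional[float]]:
    """Equation (4): change in the hourly high relative to the previous hour's high."""
    changes: List[Optional[float]] = [None]  # no previous hour for the first observation
    for previous, current in zip(high, high[1:]):
        changes.append((current - previous) / previous)
    return changes


# Made-up hourly highs and lows in USD.
high = [100.0, 110.0, 99.0]
low = [80.0, 100.0, 90.0]
vol1 = intra_hour_volatility(high, low)
vol2 = inter_hour_volatility(high)
```

Dropping the denominators gives the corresponding measures in levels.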

Table One summarizes the activity for each of the big players, as well as their balances and other characteristics. As seen below, it is not the largest holders but rather the medium ones that are the most active. Interestingly, several addresses with similar balances tend to act in the same fashion, indicating they could belong to the same holder. Further analysis akin to Athey et al. (2016) could be conducted to match some addresses to the same owner.

Figure 1: Two measures, introduced in Section 3.3, of the hourly price volatility in $USD of Bitcoin through the sample period of July 2017 through June 2018. The numbers correspond to the hourly observation at the beginning of each month, so the hourly observation number 4321 corresponds to December 1st.

voltime.png

Figure 2: Plots the price of Bitcoin from 2012-2015 and two results of the velocity models from Athey et al. (2016).

AtheyVol.png

Figure 3: The distribution of Bitcoin across the 73 richest addresses, as of July 1st, 2018.

incomedist.png

Figure 4: Table containing the position, address name, number of transactions in the sample period, and balance in Bitcoin and USD, as of July 1st, 2018.

table.png
4 Empirical Model and Results

4.1 Specification

The baseline model studied was inspired by Gandal et al. (2018), whose model specification is given by:

$$\mathrm{RateChange}_t = \beta_0 + \beta_1\,\mathrm{Markus}_t + \beta_2\,\mathrm{Willy}_t + \delta\,\mathrm{Controls}_t + \eta_t \tag{5}$$

Their model includes controls for hacking events, which were not statistically significant and

are less relevant in the recent data. Both Zimmerman (2018) and this model relate a measure

of price volatility to a constant, capturing average price volatility, and a key variable that

encapsulates their respective empirical hypothesis. These models highlight the relatively

simple econometric techniques employed in the cryptocurrency literature. The estimated

coefficients in equation (5) represent the additional price volatility on the days when the

respective player was active.

The baseline empirical model studied is given by:

\[ \mathit{PriceVolJ}_h = \gamma_0 + \sum_{k=1}^{73} \gamma_k \mathit{Player}_{k,h} + \zeta_h \tag{6} \]

where γ0 represents the average hourly price volatility, conditional on none of the big players being active. Under the empirical hypothesis, γ0 should be positive and relatively low. Further, each γk represents the differential hourly price volatility in the hours when player k is active; likewise, each should be positive and significant. As evident in Figure Three, the income distribution has a group structure, with clusters of addresses at different balance levels. Equation (7) aims to capture the association at the group level. The mutually exclusive group cut-offs at the 6th, 25th, 45th and 73rd addresses were selected based on similar balance size, and robustness checks around other cut-offs did not meaningfully change the results.

\[ \mathit{PriceVolJ}_h = \theta_0 + \theta_1 \mathit{Top6}_h + \theta_2 \mathit{Top25}_h + \theta_3 \mathit{Top45}_h + \theta_4 \mathit{Top73}_h + \nu_h \tag{7} \]
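
To make the group regressors in equation (7) concrete, the sketch below builds the four group dummies from hourly address activity. It assumes, as one reading of the mutually exclusive cut-offs, that the groups cover ranks 1 to 6, 7 to 25, 26 to 45 and 46 to 73, and that a group counts as active in an hour if any of its addresses transacted. The names `GROUPS` and `group_dummies` and the activity data are hypothetical.

```python
from typing import Dict, List, Set

# Assumed rank ranges for the four mutually exclusive groups in equation (7).
GROUPS: Dict[str, range] = {
    "Top6": range(1, 7),
    "Top25": range(7, 26),
    "Top45": range(26, 46),
    "Top73": range(46, 74),
}


def group_dummies(active_ranks_by_hour: List[Set[int]]) -> List[Dict[str, int]]:
    """For each hour, set a group's dummy to 1 if any address in it was active."""
    rows = []
    for active in active_ranks_by_hour:
        rows.append({name: int(any(rank in active for rank in ranks))
                     for name, ranks in GROUPS.items()})
    return rows


# Hypothetical activity: ranks of the addresses that transacted in each hour.
hours = [set(), {3}, {24, 50}, {62}]
dummies = group_dummies(hours)
```

Each row of `dummies` would then serve as one hourly observation of the right-hand-side variables, with the constant picking up the hours in which no group is active.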

4.2 Results

Figure Five contains the regression results for the hourly individual address data, across four measures of price volatility. For the sake of brevity, I have omitted addresses whose coefficients were not statistically significant. The general result is that addresses that are significant tend to be significant across all measures of volatility. Secondly, the intra-hourly measure in levels, shown in the second and seventh columns, has the most statistically significant coefficients. The signs of the coefficients are stable across the two specifications, further supporting the reliability of the results. Lastly, the magnitudes of the intra-hourly volatility coefficients tend to be larger than the inter-hourly ones, suggesting that prices tended to fluctuate significantly within an hour but to trend in the same direction over time. This is because the latter measure compares high prices across consecutive hours, and lower volatility implies the high prices tended to move rather smoothly over time.

Figure 5: Regression estimates for the hourly frequency data, at the individual address level.

hrlyind.png

Regarding the signs of the coefficients, surprisingly the largest addresses all have negative coefficients. Generally, most addresses have negative coefficients of moderate size. Interestingly, some addresses have large positive coefficients, namely 24, 27, 40, 41, 50 and, most notably, 62. The constant in the first specification is 128, implying that the hourly difference between the high and low price across the prominent exchanges averaged $128, conditional on none of the large players being active. Under the empirical hypothesis, this can be viewed as the average price fluctuation absent any signals from the large players’ trading activity. Considering that the price in levels varied immensely across the sample and averaged approximately $10,000, this equates to around 1% and is in line with the estimates in Gandel et al. (2018). Each coefficient is interpreted as the differential price volatility, conditional on the hours the address was active. For example, the coefficient on address 50 implies that, in the hours it was active, the high and low price differed by $460 on average, an additional $330 compared to the baseline. Similarly, the coefficient in the eighth column, 0.0497, implies that price volatility in the hours the 50th address was active was on average 5 percentage points higher than the baseline average of 1.5% of the hourly low price.
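
The interpretation of the constant and of a dummy coefficient can be illustrated with a small Python sketch. It relies on a standard property of OLS: with a constant and a single 0/1 dummy, the intercept equals the mean of the dependent variable over inactive observations and the slope equals the difference in group means. With 73 dummies the coefficients are partial effects and need not equal simple mean differences, so this is a simplification. The hourly ranges are made up so that they reproduce the $128 and $330 figures quoted in the text.

```python
from statistics import mean
from typing import List, Tuple


def dummy_regression(y: List[float], d: List[int]) -> Tuple[float, float]:
    """OLS of y on a constant and one 0/1 dummy, via group means.

    Requires at least one active and one inactive observation.
    """
    inactive = [v for v, flag in zip(y, d) if flag == 0]
    active = [v for v, flag in zip(y, d) if flag == 1]
    intercept = mean(inactive)          # average volatility when inactive
    slope = mean(active) - intercept    # additional volatility when active
    return intercept, slope


# Made-up hourly high-low ranges in USD and a hypothetical activity dummy.
ranges = [120.0, 136.0, 450.0, 466.0]
active_flag = [0, 0, 1, 1]

intercept, slope = dummy_regression(ranges, active_flag)
implied_active_range = intercept + slope   # range in hours the address is active
baseline_share = intercept / 10_000.0      # relative to a price of about $10,000
```

Here the implied range in active hours is the sum of the constant and the coefficient, which is how the figure of roughly $460 for address 50 arises.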

Figure Six plots all 73 coefficients, with colours corresponding to the level of statistical

significance. Green implies the coefficient is significant at the 1% level, yellow at 5% and red

at 10%. When visualizing the results as a whole, there are not many large, green, positive

bars. This is a further indication that perhaps only a small subset of the largest players’ activity

is associated with a high degree of price volatility.

Figure Seven contains the regression output corresponding to equation (7), the group specification. The same themes highlighted above are broadly present. The richest addresses maintain negative coefficients and the middle groups tend to have the largest positive coefficients. My interpretation is that the largest addresses have consistently negative coefficients due to lower price elasticity: because these addresses are the wealthiest, they are the least sensitive to price changes and are associated with lower price volatility.

4.3 Robustness

For several robustness checks, I aggregated the transaction and price data to the 6-hourly and daily time frequencies. Each address dummy variable now equals one if the address was active during any of the hourly observations within each six-hour observation. The daily transaction data was defined the same way, and the price volatility measures were simply averaged for each new observation. Figure Eight contains the individual address estimates at the 6-hourly frequency. Moreover, Figures Nine and Ten contain the group estimates at the 6-hourly and daily frequencies, respectively. The main observations are broadly consistent across these alternate specifications. The coefficients tend to be smaller in magnitude, while maintaining sign and degree of statistical significance. One interesting observation is that address 62 is highly positive and significant at the hourly frequency but less so when the data are aggregated over time, and this holds across all measures of price volatility.
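
The aggregation rule described above can be sketched as follows. The function name `aggregate` and the sample series are hypothetical, and dropping an incomplete trailing block is an assumption made here for simplicity rather than a choice documented in the thesis.

```python
from typing import List, Tuple


def aggregate(active: List[int], vol: List[float],
              window: int = 6) -> Tuple[List[int], List[float]]:
    """Collapse hourly series into consecutive blocks of `window` hours.

    The dummy is 1 if the address was active in any hour of the block;
    the volatility measure is the simple average over the block.
    """
    agg_active: List[int] = []
    agg_vol: List[float] = []
    for start in range(0, len(active) - window + 1, window):
        block_active = active[start:start + window]
        block_vol = vol[start:start + window]
        agg_active.append(int(any(block_active)))
        agg_vol.append(sum(block_vol) / window)
    return agg_active, agg_vol


# Twelve made-up hourly observations for one address.
hourly_active = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
hourly_vol = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]

six_hour_active, six_hour_vol = aggregate(hourly_active, hourly_vol)
```

Setting `window=24` gives the daily version. Averaging smooths out hourly spikes, which is one reason the aggregated coefficients could be smaller in magnitude.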

Another robustness check I conducted used alternate daily price data [17] to compare the magnitudes to the aggregated data. Most coefficients maintained sign and significance, but gained magnitude. I interpreted this as evidence that averaging the volatility measures suppresses the price volatility.

[17] https://www.coindesk.com/price/

Figure 6: A bar graph plotting the coefficients corresponding to columns two and seven of
Figure Five. The dependent variable is PriceVol1, the intra-hourly volatility measure in levels.

coefficients.png

Figure 7: The regression output corresponding to equation (7), aggregated into groups at
the hourly time frequency.

hourly.png

Figure 8: Regression estimates for the 6-hourly frequency data, at the individual address
level.

6hrindividual.png

Figure 9: The regression output corresponding to equation (7), aggregated into groups at
the 6-hourly time frequency.

6hrly.png

Figure 10: The group regression output aggregated at the daily time frequency.

daily.png

5 Discussion

5.1 Limitations and Shortcomings

A common theme across time frequencies is the trade-off between the frequency of observations and the level of aggregation of addresses. For example, one can only obtain group estimates when the data is aggregated to the daily level. As most addresses are active throughout a significant portion of the sample period, the data is not granular enough at the daily level. Including all 73 addresses with only 365 daily observations induces multicollinearity across regressors and yields little statistical significance. When extending similar analysis to a much broader set of addresses, one should pay attention to the ratio of the average number of observations per address to the total number of time observations. On a related note, careful attention should be exercised when aggregating the price data. The trade-off is between accurately capturing the short-term volatility representative of the time frame and allowing time for the transaction activity signals to dissipate. Further analysis employing alternate measures of price volatility, perhaps using the logarithmic approach in Zimmerman (2018), ought to be conducted too.
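
One simple diagnostic for the granularity problem is to check how often each address dummy equals one. A dummy that is almost always one (or almost always zero) is nearly collinear with the constant and carries little identifying variation. The sketch below is only illustrative: the function names, the 5% tolerance and the activity data are hypothetical, and it does not replace a full multicollinearity diagnostic.

```python
from typing import Dict, List


def active_share(dummy: List[int]) -> float:
    """Fraction of observations in which the address was active."""
    return sum(dummy) / len(dummy)


def near_constant(dummies: Dict[str, List[int]], tol: float = 0.05) -> List[str]:
    """Return the addresses whose dummy is within `tol` of always 0 or always 1."""
    flagged = []
    for name, dummy in dummies.items():
        share = active_share(dummy)
        if share <= tol or share >= 1.0 - tol:
            flagged.append(name)
    return flagged


# Hypothetical daily activity over 40 days for three addresses.
daily = {
    "address_a": [1] * 40,           # active every day
    "address_b": [1, 0] * 20,        # active every other day
    "address_c": [0] * 39 + [1],     # active on a single day
}

flagged = near_constant(daily)
```

At the daily frequency many of the 73 addresses would resemble `address_a`, which is consistent with the loss of significance described above.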

Regarding the transactions data, each dummy variable was coded to include any transaction over 10 BTC. Consequently, the current specification does not account for varying degrees of activity. This is relevant because a larger transaction has a larger effect on the price, both directly and indirectly in the form of a larger signal. Yet this model would view a common transaction of 15 BTC, or approximately $100,000, as equivalent to the transaction of over 34,000 BTC, roughly equal to $250,000,000, by 1MAhRt279uYmVC1dUxKR6dWwEULBJT34Nh on May 18th. On a similar note, multiple transactions within the same hourly period are regarded together as one single observation. Although an actor may have a reason to send more than one large transaction, this feature of the data is also unaccounted for under the current specification. Further work is required to develop transaction activity measures that account for these features of the data.
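
As a sketch of what richer activity measures could look like, the following Python example computes, for a single address, the existing dummy alongside two hypothetical alternatives: the number of qualifying transactions per hour and the total volume per hour. The 10 BTC threshold comes from the text; the function name, the data layout and the sample transactions are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

THRESHOLD_BTC = 10.0  # transactions above this size count as activity


def hourly_measures(transactions: List[Tuple[int, float]]) -> Dict[int, Dict[str, float]]:
    """Summarize one address's transactions, given as (hour_index, amount_btc)."""
    out: Dict[int, Dict[str, float]] = defaultdict(
        lambda: {"dummy": 0, "count": 0, "volume": 0.0})
    for hour, amount in transactions:
        if amount > THRESHOLD_BTC:
            measures = out[hour]
            measures["dummy"] = 1          # current specification
            measures["count"] += 1         # hypothetical: number of transactions
            measures["volume"] += amount   # hypothetical: total BTC moved
    return dict(out)


# Made-up transactions: two in hour 0, one below the threshold in hour 1.
txs = [(0, 15.0), (0, 34000.0), (1, 5.0), (2, 15.0)]
measures = hourly_measures(txs)
```

Under the dummy coding, hours 0 and 2 are indistinguishable, while the count and volume measures separate them.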

Unlike Gandel et al. (2018), who had access to the user IDs in the Mt. Gox exchange dataset, there is no clear way here to link multiple addresses to an individual. As a result, the empirical analysis was conducted under the assumption that each address is not directly connected to another. When studying a much broader set of addresses, using heuristics as in Athey et al. (2016) to match addresses is vital. Further, one should consider that including many addresses can result in both sides of a transaction being included in the data, yielding another source of multicollinearity. This was likely not a significant problem in this analysis, as only 73 addresses were included and most addresses transacted with a much broader set of excluded addresses.

Lastly, an important set of transactions not observed in this analysis are Over-The-Counter (OTC) transactions. These transactions involve two or more parties privately agreeing to transfer Bitcoin for fiat currencies or other cryptocurrencies [18]. OTC transactions occur off the blockchain and beyond the view of the econometrician. This feature of the data is akin to traditional financial markets, particularly derivatives markets. The prime consequence of interest is poor price discovery, which likely adds to the overall price volatility.

[18] Bitcoin is often paired with the USD, EUR, JPY, KRW and CHF fiat currencies, as well as those of many other countries. It is also the most common currency pair for over 1000 alternate cryptocurrencies, along with Ethereum (ETH).

Further thought should be given to the income distribution and how it varies over time. Using the snapshot as of Block 560003 is relatively arbitrary, and further study of the changes in the income distribution is paramount. In particular, linking addresses that previously held large quantities of Bitcoin to form a Continuous Income Distribution would be a major contribution to the cryptocurrency literature. For example, address 1MAhRt279uYmVC1... received a large balance through a series of transactions via one-off addresses, thereby obfuscating the data [19]. Consequently, the continuous distribution would include one address with a linked transaction history, to better capture the effect this actor had on the price volatility of Bitcoin. In order to fully appreciate this concept, consider the following explanation: “We define an electronic coin as a chain of digital signatures. Each owner transfers the coin to the next by digitally signing a hash of the previous transaction and the public key of the next owner and adding these to the end of the coin.” (Nakamoto 2009). This quote highlights that bitcoins are not physical, discrete objects but rather a chain of agreements on the balance of an address. With this formulation in mind, linking addresses to obtain a complete, time-continuous income distribution is feasible and would better capture the true effects of the actions of the large players on Bitcoin’s price volatility.

[19] 1MAhRt... received 34,000 BTC from 1NRDQ8..., which received a nearly identical quantity from 1GTYSS..., which had only one other transaction, receiving the 34,000 BTC from 1CiAzy.... The last address held a balance for the majority of the sample period, so creating a continuous income distribution would involve linking these addresses as one.
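
The linking idea can be sketched in Python using the chain described in footnote 19. The address labels are the truncated forms given in the footnote and are used only as placeholder strings; the `funded_by` mapping, the function names and the balance histories are hypothetical simplifications, with the 34,000 BTC figure taken from the footnote.

```python
from typing import Dict, List


def link_chain(start: str, funded_by: Dict[str, str]) -> List[str]:
    """Follow pass-through transfers back from `start`, newest address first."""
    chain = [start]
    current = start
    # Stop at an address with no recorded funder, or if a cycle would form.
    while current in funded_by and funded_by[current] not in chain:
        current = funded_by[current]
        chain.append(current)
    return chain


def linked_balance(histories: Dict[str, List[float]], chain: List[str]) -> List[float]:
    """Sum the per-period balances of all linked addresses."""
    periods = len(histories[chain[0]])
    return [sum(histories[address][t] for address in chain) for t in range(periods)]


# Who funded whom, using the truncated labels from footnote 19.
funded_by = {
    "1MAhRt...": "1NRDQ8...",
    "1NRDQ8...": "1GTYSS...",
    "1GTYSS...": "1CiAzy...",
}

# Hypothetical balances over four periods as the coins move along the chain.
histories = {
    "1CiAzy...": [34000.0, 0.0, 0.0, 0.0],
    "1GTYSS...": [0.0, 34000.0, 0.0, 0.0],
    "1NRDQ8...": [0.0, 0.0, 34000.0, 0.0],
    "1MAhRt...": [0.0, 0.0, 0.0, 34000.0],
}

chain = link_chain("1MAhRt...", funded_by)
combined = linked_balance(histories, chain)
```

Viewed individually, each address appears to hold the balance for a single period; once linked, the holder appears continuously in the distribution.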

5.2 Extensions and Roadmap Ahead

There are several ways to extend the empirical analysis with respect to the data. First, a parallel analysis of the most active accounts ought to be conducted. These accounts were viewed as outliers, since they often had hundreds or thousands of times as many observations per address. Consequently, many addresses that were significant when the outliers were excluded were no longer statistically distinct from zero once the outliers were included. Many of the most active addresses belong to exchanges or other Bitcoin companies, such as gambling websites and mining pools. News about events may materialize in the blockchain, such as a cryptocurrency exchange sending a large quantity of Bitcoin out of cold storage to help fulfill orders to leave the exchange, implying money could be leaving the aggregate cryptocurrency markets and resulting in lower price volatility.

Moreover, developing a model to study the dynamics of the income distribution and enriching the set of addresses used in the analysis would be a very meaningful extension of the data. Gandel et al. (2018) use dummy variables for important events, such as DDoS attacks on the Mt. Gox exchange, as control variables in their analysis [20]. While these controls weren’t significant in their analysis and are less relevant in the modern ecosystem, finding data on other events or characteristics of Bitcoin to control for would help to improve the empirical model.

[20] Essentially attacks via the internet in which the exchanges were overloaded with information and forced temporarily offline, leading to short-term illiquidity.

A different type of extension involves applying this procedure to other cryptocurrencies that have their own blockchains. An important distinction exists between currencies and tokens, where the former possess their own blockchains and the latter are built upon other blockchains. The procedure of studying the income distribution and the association between price volatility and the activity of the big players relies solely on the blockchain technology. Consequently, using Bitinfocharts.com, one can find the income distribution and transaction data for many other large cryptocurrencies, including Bitcoin Cash, Litecoin, Dash, Dogecoin and many others. Future work could compute the Gini coefficient for each cryptocurrency and draw inferences based on the dynamics of each income distribution. Further, comparing the association between price volatility and the activity of big players across cryptocurrencies may yield a range of magnitudes that are both economically significant and reasonable in size.
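
For reference, a Gini coefficient over address balances can be computed with the standard sorted-rank formula, as in the minimal sketch below. The balances are made up; applying it to a real cryptocurrency would require the full list of address balances, not only the richest addresses.

```python
from typing import List


def gini(balances: List[float]) -> float:
    """Gini coefficient of non-negative balances.

    Uses G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with the
    balances x_i sorted in ascending order and i running from 1 to n.
    """
    xs = sorted(balances)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive balance")
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


equal_holdings = gini([1.0, 1.0, 1.0, 1.0])          # perfectly equal
concentrated_holdings = gini([0.0, 0.0, 0.0, 1.0])   # one address holds everything
```

A value of zero indicates perfectly equal holdings, and with n addresses the maximum is (n - 1) / n, reached when a single address holds the entire supply.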

6 Conclusion

This paper presents an empirical procedure, leveraging the institutional structure of novel

Blockchain technology, for studying the association between Bitcoin’s price volatility and the

activity of the largest holders. I found robust evidence for the association of price volatility

with several key players, notably below the top of the distribution. More generally, I aimed

to motivate studies employing more sophisticated econometric techniques and broader

sets of addresses. Combining these results with alternative cryptocurrencies will yield a

range of reasonable magnitudes for empirical results and a better understanding of Bitcoin’s

price volatility.

Bitcoin’s supporters have described it as the next generation’s major innovation, equivalent in magnitude to the Internet [21]. The cryptocurrency space has been the subject of a fast-growing academic literature, formalizing and operationalizing theories that have long existed on platforms such as Twitter, Medium and Reddit. While progress has been made, technological advances and regulatory changes still outpace academic research. The second objective of this paper was to pave a roadmap for future work. In the presence of many open questions and an abundance of data, the cryptocurrency ecosystem is ripe for further academic study. Future contributions, such as a dynamic income distribution, will have profound implications for the long-debated question of the extent of the Whales’ ability to influence Bitcoin’s price.

[21] https://medium.com/@andrewcretin/its-2018-blockchain-is-on-it-s-way-to-become-the-new-internet-7055ed6851e
7 References

7.1 Literature

Athey, Susan and Parashkevov, Ivo and Sarukkai, Vishnu and Xia, Jing, Bitcoin Pricing, Adoption, and Usage: Theory and Evidence (August 2016). Stanford University Graduate School of Business Research Paper No. 16-42.

Berentsen, Aleksander and Schar, Fabian, A Short Introduction to the World of Cryptocurrencies (February 2018). Federal Reserve of St. Louis Review, Vol. 100, 1-16.

Catalini, Christian and Gans, Joshua S., Some Simple Economics of the Blockchain (September 2017). Rotman School of Management Working Paper No. 2874598; MIT Sloan Research Paper No. 5191-16.

Chiu, Jonathan and Koeppl, Thorsten V., The Economics of Cryptocurrencies – Bitcoin and Beyond (September 2017). Queen’s University Working Paper No. 1389.

Gandel, Neil and Hamrick, JT and Moore, Tyler and Oberman, Tali, Price Manipulation in the Bitcoin Ecosystem (January 2018). Journal of Monetary Economics, Vol. 95, 86-96.

Nakamoto, Satoshi, Bitcoin: a peer-to-peer electronic cash system (March 2008). https://www.bitcoin.org/bitcoin.pdf

Raskin, Max and Yermack, David, Digital Currencies, Decentralized Ledgers, and the Future of Central Banking (May 2016). NBER Working Paper Series 22238.

Yermack, David, Is Bitcoin a Real Currency? An Economic Appraisal (December 2013). NBER Working Paper Series 19747.

Zimmerman, Peter, Blockchain and Price Volatility (June 2018). Working Paper, Said Business School, University of Oxford.

7.2 Data Sources

Blockchain.info - Block explorer used to verify accuracy of the transaction data

Bitinfocharts.com - Income distribution and transaction data

Coinmarketcap.com - Volume data and used to verify accuracy of price data

CryptoDataDownload.com - All hourly price data from major exchanges
