First, we get the data through an API.

The code uses `pickle` to serialize the downloaded data and save it to a file, which prevents the script from re-downloading the same data each time it runs. The function returns the data as a pandas DataFrame.
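The caching idea can be sketched as follows. This is a minimal illustration, not the notebook's original code: `load_cached` and its `download` argument are hypothetical names, and `download` stands in for whatever API call actually fetches the data.

```python
import pickle
from pathlib import Path

import pandas as pd


def load_cached(cache_path, download):
    """Return a DataFrame from cache_path, calling download() only on a cache miss.

    `download` is a hypothetical zero-argument callable that returns a DataFrame.
    """
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Cache hit: read the pickled DataFrame back from disk.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    # Cache miss: fetch the data, then save it for the next run.
    df = download()
    with cache_path.open("wb") as f:
        pickle.dump(df, f)
    return df
```

Passing the download step in as a callable keeps the cache logic independent of any particular API client. Only unpickle files you created yourself, since `pickle.load` can execute arbitrary code from untrusted input.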

The dataset contains some glitches, and we don't want them to affect the results of our analysis, so we import data from other exchanges and use it to fill in the problem spots.
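To compare exchanges and cover for glitches in any one of them, it helps to place their prices side by side. The sketch below assumes each exchange's data is a DataFrame indexed by date with a shared price column; `merge_by_column` and the labels passed to it are hypothetical names.

```python
import pandas as pd


def merge_by_column(frames, labels, column):
    """Build one DataFrame with one column per label, taken from frames[i][column].

    Rows are aligned on the index, so dates missing from one exchange become NaN.
    """
    return pd.DataFrame({label: df[column] for label, df in zip(labels, frames)})
```

For example, `merge_by_column([df_a, df_b], ["exchange_a", "exchange_b"], "price")` would give a two-column table in which each row holds the prices reported by both exchanges for that date.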

After filling the gaps in the dataset, we plot the data first. The data shown below come from four different exchanges.

The goal is to remove all the zeros from the dataset to preserve the precision of the analysis, since the bitcoin price is never 0.
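One common way to do this in pandas is to replace zeros with `NaN`, so that later aggregations skip them instead of treating them as real prices. The column names and values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical prices from two exchanges; 0.0 marks a bad data point.
prices = pd.DataFrame(
    {
        "exchange_a": [950.0, 0.0, 970.0],
        "exchange_b": [955.0, 962.0, 0.0],
    }
)

# Replace every zero with NaN so it is ignored by mean(), corr(), and plots.
cleaned = prices.replace(0, np.nan)
```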

After removing all the zeros from the dataset, we calculate the average price for each cryptocurrency for later use.
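A minimal sketch of the averaging step, assuming the table holds one column per exchange and one row per date (this layout and the column names are assumptions). `mean(axis=1)` skips `NaN` values by default, so a missing price on one exchange does not drag the average down.

```python
import numpy as np
import pandas as pd

# Hypothetical cleaned prices; NaN marks a removed zero.
prices = pd.DataFrame(
    {
        "exchange_a": [950.0, np.nan, 970.0],
        "exchange_b": [954.0, 962.0, np.nan],
    }
)

# Row-wise mean across exchanges, ignoring NaN values.
prices["avg_price"] = prices.mean(axis=1)
```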

Our ultimate goal is to find the relationships between the different currencies and then decide what to do next based on the results.

We now have crypto-btc and btc-usd prices, so we can derive crypto-usd prices, which are a more convenient way to understand the data.
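The conversion is a multiplication: the price of a coin in BTC times the price of BTC in USD gives the price of the coin in USD. The sketch below assumes both inputs are pandas Series indexed by date; `to_usd` is a hypothetical helper name.

```python
import pandas as pd


def to_usd(crypto_btc, btc_usd):
    """Convert a coin's BTC price series to USD using the BTC-USD price series.

    pandas aligns the two Series on their index before multiplying, so dates
    present in only one of them yield NaN.
    """
    return crypto_btc * btc_usd
```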

Next, we want to dig deeper into how the cryptocurrencies in the dataset relate to each other.

It's time to run a correlation analysis across the cryptocurrencies.
Similar fluctuations appear along the timeline, so we can use `corr()` in pandas, which computes the Pearson correlation coefficient between each pair of columns in the DataFrame.

Computing correlations directly on a non-stationary time series can give biased correlation values. We work around this with the `pct_change()` method, which converts each cell in the DataFrame from an absolute price to a daily percentage return.
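The two steps can be sketched as follows, using a small made-up price table (the coin names and values are illustrative only). The first row of `pct_change()` is always `NaN` because there is no previous day to compare with, so it is dropped before computing correlations.

```python
import pandas as pd

# Hypothetical daily prices for three coins.
prices = pd.DataFrame(
    {
        "coin_a": [10.0, 11.0, 10.0, 12.0],
        "coin_b": [20.0, 22.0, 20.0, 24.0],
        "coin_c": [5.0, 4.0, 5.0, 4.0],
    }
)

# Convert absolute prices to daily percentage returns.
returns = prices.pct_change().dropna()

# Pairwise Pearson correlation between the return columns.
corr = returns.corr(method="pearson")
```

Here `corr` is a symmetric 3x3 table with ones on the diagonal. Because `coin_b` is always exactly twice `coin_a`, their returns move together and their correlation is 1.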

Many market analyses discuss the relationships between different cryptocurrencies, but some of them offer no data to support their conclusions. Here we use data and visualization tools to draw some straightforward insights from the raw market data. The process above can be considered an exploratory data analysis (EDA).

The correlations between the cryptocurrencies grew stronger from 2016 to 2019. Possible reasons include:

- growing attention to cryptocurrencies and blockchain
- hedge funds also have some impact on the crypto ...
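One way to check how the correlation changes over time is to compute it separately for each calendar year. The sketch below assumes a price DataFrame with a `DatetimeIndex` and one column per coin; `yearly_correlation` is a hypothetical helper, not part of pandas.

```python
import pandas as pd


def yearly_correlation(prices, col_a, col_b):
    """Pearson correlation of daily returns between two columns, per calendar year."""
    returns = prices[[col_a, col_b]].pct_change().dropna()
    by_year = {
        year: group[col_a].corr(group[col_b])
        for year, group in returns.groupby(returns.index.year)
    }
    # One correlation value per year, indexed by the year number.
    return pd.Series(by_year)
```

Comparing the resulting values year by year shows whether two coins are moving together more closely over time.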

- blockchain mining datasets
- stocks and commodities, to see the correlations
- train an ML model to predict the price (CNN, RNN ...)
- trading bot, chatbot
- is quant investment making money? (based on the historical data)