Add dashboard post #62
2264833 to
ea311e1
Compare
@jamesob @moneyball @harding - would love your thoughts about this if you have time to review.
I like it! I've added a few more nits inline. Feel free to ignore if you disagree.
|
|
||
| The dashboard is live and is showing these stats (and more) at [dashboard.bitcoinops.org](https://dashboard.bitcoinops.org). | ||
|
|
||
| The dataset used for the dashboard is also saved as a directory of JSON files, updated nightly. You can download the zipped dataset from our [S3 bucket](http://dashboard.dataset.s3.us-east-2.amazonaws.com/backups/bitcoinops-dataset.tar.gz). If you just want the JSON file for a specific block, you can get it at: `http://dashboard.dataset.s3.us-east-2.amazonaws.com/blocks/BLOCK_NUMBER.json` If you'd rather get the data from your own full node (it might take a few days), you can use the code with instructions at [github.com/bitcoinops/btc-dashboard](https://github.com/bitcoinops/btc-dashboard). |
nit: s/ If you'd rather get the data from/. If you'd rather recreate the data on/
Nit on the nit: s/the data on/the data using/
| An [example JSON file with an explanation of added fields](https://github.com/bitcoinops/btc-dashboard/blob/master/STATS_TRACKED.md) is available in the same repository. | ||
|
|
||
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at existing tools for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. |
s/existing tools/an existing tool/
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at existing tools for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. | ||
|
|
||
| After exploring these options, I decided I can probably get most of the stats needed for the dashboard through RPCs from bitcoind. I wrote some simple code (using btcd's RPC client) to get the stats I needed from each transaction with the `getblock` and `getrawtransaction` RPCs. This code worked pretty well for smaller blocks, but when I tried using it to get the stats of a more recent block - which often have over a thousand transactions - I quickly noticed a problem. My code was calling `getrawtransaction` for every single transaction input in the block. Asking bitcoind to do thousands of RPCs for every single block quickly becomes unsustainable, and was taking a very long amount of time. |
tense: I decided I can -> I decided I could
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at existing tools for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. | ||
|
|
||
| After exploring these options, I decided I can probably get most of the stats needed for the dashboard through RPCs from bitcoind. I wrote some simple code (using btcd's RPC client) to get the stats I needed from each transaction with the `getblock` and `getrawtransaction` RPCs. This code worked pretty well for smaller blocks, but when I tried using it to get the stats of a more recent block - which often have over a thousand transactions - I quickly noticed a problem. My code was calling `getrawtransaction` for every single transaction input in the block. Asking bitcoind to do thousands of RPCs for every single block quickly becomes unsustainable, and was taking a very long amount of time. |
s/very long amount of time/very long time/
| {:.post-meta} | ||
| *by [Marcin Jachymiak](https://github.com/marcinja)<br>Intern at Bitcoin Optech* | ||
|
|
||
| {% include articles/dashboard-writeup.md hlevel='##' %} |
No need to put this in a separate include file I think. The only reason we did that for the Xapo consolidation article was so it could be included in a newsletter.
Just put the markdown directly in this file.
| ``` | ||
| block, err := rpcclient.GetBlock(blockHash) | ||
| for _, tx := range block.Transactions { | ||
| for _, input := range tx.TxIn { |
mixture of tabs and spaces is breaking alignment here. Just use spaces.
| ``` | ||
|
|
||
|
|
||
| I then started looking at the `getblockstats` RPC for some additional stats while I figure out a way around this problem. [`getblockstats` is an RPC in `bitcoind`](https://github.com/bitcoin/bitcoin/pull/10757) that gives several useful stats for any given block. For an idea as to what it tells us about a block, the dashboard uses it to show evolving fee-rates, changes in the size of the UTXO set, and the number of inputs, outputs, and transactions for every block. |
s/For an idea as to what it tells us about a block/For example/
| ``` | ||
|
|
||
|
|
||
| I then started looking at the `getblockstats` RPC for some additional stats while I figure out a way around this problem. [`getblockstats` is an RPC in `bitcoind`](https://github.com/bitcoin/bitcoin/pull/10757) that gives several useful stats for any given block. For an idea as to what it tells us about a block, the dashboard uses it to show evolving fee-rates, changes in the size of the UTXO set, and the number of inputs, outputs, and transactions for every block. |
I suggest you remove the backticks around `bitcoind`, since you haven't styled it that way elsewhere in the document.
| ``` | ||
|
|
||
|
|
||
| I then started looking at the `getblockstats` RPC for some additional stats while I figure out a way around this problem. [`getblockstats` is an RPC in `bitcoind`](https://github.com/bitcoin/bitcoin/pull/10757) that gives several useful stats for any given block. For an idea as to what it tells us about a block, the dashboard uses it to show evolving fee-rates, changes in the size of the UTXO set, and the number of inputs, outputs, and transactions for every block. |
s/several useful stats for any/useful statistics/ (to avoid the repetition of 'stats' in the sentence).
| The patch to `getblockstats` with these extra stats (and some more) is [publicly available](https://github.com/bitcoinops/bitcoin/tree/expand-getblockstats). I also [patched the btcd RPC client](https://github.com/bitcoinops/btcd/tree/dashboard-rpc) for convenience to allow usage of `getblockstats` in the code I wrote. | ||
|
|
||
| ### Using bitcoind RPC efficiently | ||
| Using the `getblockstats` RPC to get data from the entire history of Bitcoin can still be pretty slow! Running the code to backfill data from all blocks on my desktop, it would have taken weeks to get all the data. After profiling bitcoind while it responded to `getblockstats` RPCs, we saw that as expected most of its time was spent retrieving transactions from the `tx_index`, so we configured bitcoind to better handle all these database reads: |
I'd remove the backticks from tx_index here.
harding
left a comment
Left a few comments, but it's all nitpicky stuff. Good post!
| This summer I was an intern for [Bitcoin Optech](https://bitcoinops.org), working on a Bitcoin metrics dashboard. In this post I'll be describing the purpose of the dashboard and how it was implemented. | ||
|
|
||
| The goal of the Optech dashboard is to show a variety of metrics of how effectively blockspace is being used. Important metrics are those that show activity like: | ||
| - [batching](https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#Payment_batching) (combining the outputs of what would otherwise be many different transactions into a single transaction), |
I have two minor formatting suggestions for this list:
- Provide a brief description of segwit adoption so all of the list entries have both a term and a definition of that term
- Use the format `term: definition`. This allows you to remove the parentheses. E.g.:
  - [Batching](https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#Payment_batching): combining the outputs...
| - SegWit adoption. | ||
|
|
||
|
|
||
| The dashboard is live and is showing these stats (and more) at [dashboard.bitcoinops.org](https://dashboard.bitcoinops.org). |
I suggest also removing the parentheses here: "...showing these stats and more at..."
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at existing tools for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. | ||
|
|
||
| After exploring these options, I decided I can probably get most of the stats needed for the dashboard through RPCs from bitcoind. I wrote some simple code (using btcd's RPC client) to get the stats I needed from each transaction with the `getblock` and `getrawtransaction` RPCs. This code worked pretty well for smaller blocks, but when I tried using it to get the stats of a more recent block - which often have over a thousand transactions - I quickly noticed a problem. My code was calling `getrawtransaction` for every single transaction input in the block. Asking bitcoind to do thousands of RPCs for every single block quickly becomes unsustainable, and was taking a very long amount of time. |
I suggest that you mention somewhere in this paragraph that getblockstats is a new RPC in Bitcoin Core expected to be released in Bitcoin Core 0.17 so that people aren't surprised when they can't find it on their 0.16.x or earlier nodes.
|
|
||
| I patched `getblockstats` in bitcoind to get the following stats: | ||
| - number of transactions that signal opt-in replace-by-fee (RBF) | ||
|  |
I think this will format nicer if you separate the images from text by a newline and indent the image tag by four spaces to indicate it's still part of the preceding bullet. E.g.,
- number of transactions that signal opt-in replace-by-fee (RBF)
| - number of transactions that signal opt-in replace-by-fee (RBF) | ||
|  | ||
|
|
||
| - number of transactions (and inputs/outputs where it makes sense) that spend/create the different kinds of SegWit outputs: P2SH-Nested and Native (Bech32) variants of P2WPKH and P2WSH |
Suggest changing "spend/create" to "spend or create". (I'm a chronic abuser of the forward slash, so I admit this is hypocritical 'do as I say not as I do' advice.)
| ### Databases | ||
| What do we do with all the data from `getblockstats`? | ||
|
|
||
| First, we derive some extra stats like "percentage of X that are Y" from the stats "X" and "Y" for the sake of convenience (the bitcoind patch doesn't include these to give users the option to not have them and because they are trivial to compute). Then we store the result as a JSON file, and in a database. This is done in a program that uses the modified btcd RPC client described above. The graphs in the dashboard are created by Grafana which makes queries into this database. |
I know "Grafana" is a program of some sort, but that's not clear from the context here---someone reading this could assume it's a person. I suggest either saying the "Grafana graph creation package" (or whatever) or just linking to its project page somewhere.
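
The "percentage of X that are Y" derivation described in the Databases paragraph could look roughly like the sketch below. The function name `percentage` and the zero-total behaviour are assumptions for illustration; the PR does not show how the dashboard code handles empty totals.

```go
package main

import "fmt"

// percentage returns what share of total is made up by part, as a percent.
// Returning 0 for an empty total is an assumption made for this sketch,
// chosen so that a block with no matching items does not produce NaN.
func percentage(part, total int64) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(part) / float64(total)
}

func main() {
	// Illustrative numbers only, not taken from the dataset.
	fmt.Printf("%.1f%%\n", percentage(250, 1000))
}
```

Keeping such derived values out of the bitcoind patch, as the post describes, means they can be recomputed cheaply from the raw counts whenever needed.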
ea311e1 to
daa9240
Compare
jnewbery
left a comment
A bunch more nits for your consideration :)
| The goal of the Optech dashboard is to show a variety of metrics of how effectively blockspace is being used. Important metrics are those that show activity like: | ||
| - [batching](https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#Payment_batching): combining the outputs of what would otherwise be many different transactions into a single transaction | ||
| - [consolidations](https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#Consolidation): combining many UTXOs in a single UTXO during low fee period, to avoid paying a higher fee in the future | ||
| - dust creation: creation of UTXOs that cost more to spend than they have value, at different fee-rates |
Since you've included links for the previous two, how about https://bitcoin.stackexchange.com/questions/10986/what-is-meant-by-bitcoin-dust for dust and https://bitcoincore.org/en/2016/01/26/segwit-benefits/ or https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#P2SH-wrapped_segwit for segwit.
| - [batching](https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#Payment_batching): combining the outputs of what would otherwise be many different transactions into a single transaction | ||
| - [consolidations](https://en.bitcoin.it/wiki/Techniques_to_reduce_transaction_fees#Consolidation): combining many UTXOs in a single UTXO during low fee period, to avoid paying a higher fee in the future | ||
| - dust creation: creation of UTXOs that cost more to spend than they have value, at different fee-rates | ||
| - SegWit adoption: creation of and spending from SegWit outputs |
'segwit' (or 'Segwit' at the start of a sentence) seems the more common style, including on this site.
|
|
||
|  | ||
|
|
||
| - number of transactions (and inputs/outputs where it makes sense) that spend or create the different kinds of SegWit outputs: P2SH-Nested and Native (Bech32) variants of P2WPKH and P2WSH |
SegWit -> segwit
|
|
||
| The dashboard is live and is showing these stats and more at [dashboard.bitcoinops.org](https://dashboard.bitcoinops.org). | ||
|
|
||
| The dataset used for the dashboard is also saved as a directory of JSON files, updated nightly. You can download the zipped dataset from our [S3 bucket](http://dashboard.dataset.s3.us-east-2.amazonaws.com/backups/bitcoinops-dataset.tar.gz). If you just want the JSON file for a specific block, you can get it at: `http://dashboard.dataset.s3.us-east-2.amazonaws.com/blocks/BLOCK_NUMBER.json` If you'd rather recreate the data using your own full node (it might take a few days), you can use the code with instructions at [github.com/bitcoinops/btc-dashboard](https://github.com/bitcoinops/btc-dashboard). |
missing . after the JSON URL. I'd also change recreate -> regenerate (sorry - I know I suggested recreate earlier)
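
As a small illustration of the per-block URL pattern quoted in the post, the sketch below fills in `BLOCK_NUMBER`. It assumes `BLOCK_NUMBER` means the block height, and `blockURL` is a hypothetical helper name, not something from the btc-dashboard repository.

```go
package main

import "fmt"

// datasetBase is the bucket prefix quoted in the post.
const datasetBase = "http://dashboard.dataset.s3.us-east-2.amazonaws.com"

// blockURL builds the per-block JSON URL for a block height.
// Treating BLOCK_NUMBER as the height is an assumption of this sketch.
func blockURL(height int) string {
	return fmt.Sprintf("%s/blocks/%d.json", datasetBase, height)
}

func main() {
	fmt.Println(blockURL(500000))
}
```

The resulting string can be passed to any HTTP client; the sketch stops short of fetching so that it runs without network access.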
| An [example JSON file with an explanation of added fields](https://github.com/bitcoinops/btc-dashboard/blob/master/STATS_TRACKED.md) is available in the same repository. | ||
|
|
||
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at an existing tool for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. |
remove 'like' from 'like BlockSci'. Also make it a link to the BlockSci repo.
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at an existing tool for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. | ||
|
|
||
| After exploring these options, I decided I could probably get most of the stats needed for the dashboard through RPCs from bitcoind. I wrote some simple code (using btcd's RPC client) to get the stats I needed from each transaction with the `getblock` and `getrawtransaction` RPCs. This code worked pretty well for smaller blocks, but when I tried using it to get the stats of a more recent block - which often have over a thousand transactions - I quickly noticed a problem. My code was calling `getrawtransaction` for every single transaction input in the block. Asking bitcoind to do thousands of RPCs for every single block quickly becomes unsustainable, and was taking a very long time. |
You've only mentioned one option above (BlockSci). these options -> that option?
| ## Implementation | ||
| The rest of this post will be about my experience building the dashboard. To start, I looked at an existing tool for blockchain analysis, like BlockSci. Although BlockSci works pretty well, it has a long setup time and I was also looking to analyze blocks as they were being confirmed. In my experience using BlockSci, I found it very easy to write queries similar to [those in the demo they produced](https://citp.github.io/BlockSci/demo.html), which took just a few minutes to return a result. The main downside with using BlockSci is that its parser takes a long time to pre-process blockchain data, but once that is done it performs quite well. | ||
|
|
||
| After exploring these options, I decided I could probably get most of the stats needed for the dashboard through RPCs from bitcoind. I wrote some simple code (using btcd's RPC client) to get the stats I needed from each transaction with the `getblock` and `getrawtransaction` RPCs. This code worked pretty well for smaller blocks, but when I tried using it to get the stats of a more recent block - which often have over a thousand transactions - I quickly noticed a problem. My code was calling `getrawtransaction` for every single transaction input in the block. Asking bitcoind to do thousands of RPCs for every single block quickly becomes unsustainable, and was taking a very long time. |
becomes -> became to avoid jumping tenses.
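
To make the cost described in that paragraph concrete, here is a rough sketch that counts how many `getrawtransaction` lookups a one-call-per-input approach would need for a block. The `tx` type and `prevoutLookups` function are stand-ins invented for this example (not btcd types), and skipping the coinbase transaction is an assumption, since its input has no previous output to look up.

```go
package main

import "fmt"

// tx is a stand-in for a decoded transaction; only the fields
// needed to count lookups are modelled here.
type tx struct {
	numInputs int
	coinbase  bool
}

// prevoutLookups counts the getrawtransaction calls required if
// every non-coinbase input triggers one lookup of its previous output.
func prevoutLookups(txs []tx) int {
	n := 0
	for _, t := range txs {
		if t.coinbase {
			continue // no previous output to fetch
		}
		n += t.numInputs
	}
	return n
}

func main() {
	// Illustrative block: one coinbase plus two ordinary transactions.
	block := []tx{{numInputs: 1, coinbase: true}, {numInputs: 2}, {numInputs: 3}}
	fmt.Println(prevoutLookups(block))
}
```

With over a thousand transactions per block and often several inputs each, the count grows into the thousands, which matches the slowdown the post describes.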
| } | ||
| ``` | ||
|
|
||
| I then started looking at the `getblockstats` RPC for some additional stats while I figure out a way around this problem. [`getblockstats` is a new RPC in bitcoind](https://github.com/bitcoin/bitcoin/pull/10757) expected to be released in Bitcoin Core 0.17. It gives several useful statistics for any given block. For example, the dashboard uses it to show evolving fee-rates, changes in the size of the UTXO set, and the number of inputs, outputs, and transactions for every block. |
I'd remove "expected to be released" here
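
For readers unfamiliar with `getblockstats`, the sketch below builds the JSON-RPC request body such a call might use. The `rpcRequest` struct, the `getBlockStatsRequest` helper, and the `id` value are assumptions made for illustration; the post itself goes through a patched btcd RPC client instead. The stat names `txs`, `ins`, and `outs` are taken from Bitcoin Core's `getblockstats` help text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcRequest mirrors the usual shape of a bitcoind JSON-RPC request.
type rpcRequest struct {
	JSONRPC string        `json:"jsonrpc"`
	ID      string        `json:"id"`
	Method  string        `json:"method"`
	Params  []interface{} `json:"params"`
}

// getBlockStatsRequest encodes a getblockstats call for a block height,
// restricted to the named stats. (Hypothetical helper for this sketch.)
func getBlockStatsRequest(height int, stats []string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "1.0",
		ID:      "dashboard",
		Method:  "getblockstats",
		Params:  []interface{}{height, stats},
	})
}

func main() {
	body, err := getBlockStatsRequest(500000, []string{"txs", "ins", "outs"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Requesting only the stats a graph needs keeps the response small, although the node still has to process the whole block to compute them.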
| The patch to `getblockstats` with these extra stats (and some more) is [publicly available](https://github.com/bitcoinops/bitcoin/tree/expand-getblockstats). I also [patched the btcd RPC client](https://github.com/bitcoinops/btcd/tree/dashboard-rpc) for convenience to allow usage of `getblockstats` in the code I wrote. | ||
|
|
||
| ### Using bitcoind RPC efficiently | ||
| Using the `getblockstats` RPC to get data from the entire history of Bitcoin can still be pretty slow! Running the code to backfill data from all blocks on my desktop, it would have taken weeks to get all the data. After profiling bitcoind while it responded to `getblockstats` RPCs, we saw that as expected most of its time was spent retrieving transactions from the tx_index, so we configured bitcoind to better handle all these database reads: |
remove 'as expected'
I think this is pretty much good to go (modulo any nits that Marcin wants to take). We should include a link to it in next week's newsletter and merge at the same time.
tACK Fantastic write up! I learned a lot from it. Great work this summer.
Agreed!
28cd0bd to
d96c171
Compare
Thanks! And thanks for all the reviews. Should I change the date on the post then?
Good idea. Newsletter goes out on Tuesday (Sept 4).
d96c171 to
e0094be
Compare
Merged in #63. Thanks Marcin!
No description provided.