Tracing Transactions

This guide will cover the following topics:

  • basic EVM tracing with JS

  • filtered EVM tracing with JS

  • JSON-RPC debug_trace* endpoints

Basic EVM Tracing with JS

Tracing a transaction means requesting a BlockX node to re-execute the desired transaction with varying degrees of data collection.

Re-executing a transaction has a few prerequisites. All historical state accessed by the transaction must be available, including:

  • Balance, nonce, bytecode, and storage of both the recipient as well as all internally invoked contracts

  • Block metadata referenced during execution of the outer as well as all internally created transactions

  • Intermediate state generated by all preceding transactions contained in the same block as well as the one being traced

This means the synchronization and pruning configuration of a node limits which transactions it can trace.

  • Archive nodes: retain all historical data back to genesis, can trace arbitrary transactions at any point in the history of the chain.

  • Fully synced nodes: transactions within a recent range (depending on how much history is stored) are accessible.

  • Light synced nodes: these nodes retrieve data on demand, so in theory they can trace transactions for which all required historical state is readily available in the network (however, data availability cannot be reasonably assumed).

Basic Traces

The simplest type of transaction trace that Geth can generate is a raw EVM opcode trace. For every VM instruction the transaction executes, a structured log entry is emitted, containing all contextual metadata deemed useful. This includes:

  • program counter

  • opcode name & cost

  • remaining gas

  • execution depth

  • occurred errors

as well as (optionally) the execution stack, execution memory, and contract storage.

The entire output of a raw EVM opcode trace is a JSON object having a few metadata fields: consumed gas, failure status, return value, and a list of opcode entries:

{
  "gas":         25523,
  "failed":      false,
  "returnValue": "",
  "structLogs":  []
}

An example log for a single opcode entry has the following format:

{
  "pc":      48,
  "op":      "DIV",
  "gasCost": 5,
  "gas":     64532,
  "depth":   1,
  "error":   null,
  "stack": [
    "00000000000000000000000000000000000000000000000000000000ffffffff",
    "0000000100000000000000000000000000000000000000000000000000000000",
    "2df07fbaabbe40e3244445af30759352e348ec8bebd4dd75467a9f29ec55d98d"
  ],
  "memory": [
    "0000000000000000000000000000000000000000000000000000000000000000",
    "0000000000000000000000000000000000000000000000000000000000000000",
    "0000000000000000000000000000000000000000000000000000000000000060"
  ],
  "storage": {
  }
}
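Because a raw trace is plain JSON, it can be post-processed with a few lines of JavaScript. The following is a minimal sketch that counts how often each opcode occurs and how much gas it cost in total. The sample trace and the function name summarizeOpcodes are assumptions made for illustration; only the field names (structLogs, op, gasCost) come from the format shown above.

```javascript
// Sample trace in the format shown above; the values are illustrative only.
const trace = {
  gas: 25523,
  failed: false,
  returnValue: "",
  structLogs: [
    { pc: 0, op: "PUSH1", gasCost: 3, gas: 64540, depth: 1 },
    { pc: 2, op: "PUSH1", gasCost: 3, gas: 64537, depth: 1 },
    { pc: 48, op: "DIV", gasCost: 5, gas: 64532, depth: 1 },
  ],
};

// Hypothetical helper: aggregate occurrence count and gas cost per opcode name.
function summarizeOpcodes(trace) {
  const summary = {};
  for (const entry of trace.structLogs) {
    if (!summary[entry.op]) {
      summary[entry.op] = { count: 0, gasCost: 0 };
    }
    summary[entry.op].count += 1;
    summary[entry.op].gasCost += entry.gasCost;
  }
  return summary;
}

const opcodeSummary = summarizeOpcodes(trace);
console.log(opcodeSummary);
```

This kind of processing happens outside the node, after the whole trace has been transferred.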

Limits of Basic Traces

Although the raw opcode traces generated above are useful, having an individual log entry for every single opcode is too low level for most use cases, and requires developers to create additional tools to post-process the traces. Additionally, a single opcode trace can easily be hundreds of megabytes, making it very resource intensive to extract from the node and process externally.

To avoid these issues, Geth supports running custom JavaScript tracers within the BlockX (or any EVM-compatible) node, which have full access to the EVM stack, memory, and contract storage. This means developers only have to gather the data they actually need, and can do any processing at the source.

Filtered EVM Tracing with JS

Basic traces can include the complete state of the EVM at every point in the transaction's execution, which takes a large amount of space. Usually, developers are only interested in a small subset of this information, which can be obtained by specifying a JavaScript filter.

Running a Simple Trace

In this section, debug.traceTransaction is invoked from within the Geth console. It can also be invoked from outside the node using JSON-RPC (e.g. using curl), as seen in the following section. Using debug.traceTransaction as shown here requires maintaining a node.

1. Create a file, filterTrace_1.js, with this content:

tracer = function(tx) {
  return debug.traceTransaction(tx, {tracer:
      '{' +
        'retVal: [],' +
        'step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},' +
        'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
        'result: function(ctx,db) {return this.retVal}' +
      '}'
  }) // return debug.traceTransaction ...
}   // tracer = function ...

2. Run the JavaScript console.

3. Get a hash of a recent transaction.

4. Run this command to run the script:

loadScript("filterTrace_1.js")

5. Run the tracer from the script:

tracer("<hash of transaction>")

The bottom of the output looks similar to:

"3366:POP", "3367:JUMP", "1355:JUMPDEST", "1356:PUSH1", "1358:MLOAD", "1359:DUP1", "1360:DUP3", "1361:ISZERO", "1362:ISZERO",
"1363:ISZERO", "1364:ISZERO", "1365:DUP2", "1366:MSTORE", "1367:PUSH1", "1369:ADD", "1370:SWAP2", "1371:POP", "1372:POP", "1373:PUSH1",
"1375:MLOAD", "1376:DUP1", "1377:SWAP2", "1378:SUB", "1379:SWAP1", "1380:RETURN", ...

6. Run this command to get a more readable output with each string on its own line:

console.log(JSON.stringify(tracer("<hash of transaction>"), null, 2))

The extra arguments to JSON.stringify (null and 2) format the output with indentation; see the documentation of JSON.stringify for details. If we just return the output, we get \n for newlines, which is why we need to use console.log.
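The effect of the indentation argument can be checked outside the node with a short snippet. This is only a sketch: the sample array stands in for the tracer output and is not taken from a real transaction.

```javascript
// Stand-in for the array returned by tracer("<hash of transaction>").
const sampleOutput = ["1378:SUB", "1379:SWAP1", "1380:RETURN"];

// The third argument (2) indents each element and puts it on its own line.
const pretty = JSON.stringify(sampleOutput, null, 2);

// console.log prints the newline characters as actual line breaks.
console.log(pretty);
```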

How Does it Work?

We call the same debug.traceTransaction function used for basic traces, but with a new parameter, tracer. This parameter is a string, which is the JavaScript object we use. In the case of the trace above, it is:

{
   retVal: [],
   step: function(log,db) {this.retVal.push(log.getPC() + ":" + log.op.toString())},
   fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},
   result: function(ctx,db) {return this.retVal}
}

This object has to have three member functions:

  • step, called for each opcode

  • fault, called if there is a problem in the execution

  • result, called to produce the results that are returned by debug.traceTransaction after the execution is done

It can have additional members. In this case, we use retVal to store the list of strings that we'll return in result.

The step function here adds to retVal: the program counter, and the name of the opcode there. Then, in result, we return this list to be sent to the caller.
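To make the call sequence concrete, the tracer object can be exercised locally with mock data. The sketch below is not how the node is implemented; mockLog and runTracer are hypothetical helpers that only imitate the parts of the interface used by this tracer (log.getPC() and log.op.toString()).

```javascript
// The tracer object from the text, written as a plain object instead of a string.
const pcTracer = {
  retVal: [],
  step: function (log, db) { this.retVal.push(log.getPC() + ":" + log.op.toString()); },
  fault: function (log, db) { this.retVal.push("FAULT: " + JSON.stringify(log)); },
  result: function (ctx, db) { return this.retVal; },
};

// Hypothetical stand-in for the log object the node passes to step().
function mockLog(pc, name) {
  return {
    getPC: function () { return pc; },
    op: { toString: function () { return name; } },
  };
}

// Hypothetical driver: call step() once per opcode, then result() once at the end.
function runTracer(tracer, logs) {
  for (const log of logs) {
    tracer.step(log, null);
  }
  return tracer.result({}, null);
}

const pcTrace = runTracer(pcTracer, [
  mockLog(0, "PUSH1"),
  mockLog(2, "PUSH1"),
  mockLog(4, "MSTORE"),
]);
console.log(pcTrace);
```

Testing a tracer this way before pasting it into the tracer string can catch syntax errors early, since errors inside the string are harder to locate.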

Actual Filtering

For actual filtered tracing, we need an if statement to only log relevant information. For example, if we are interested in the transaction's interaction with storage, we might use:

tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'step: function(log,db) {' +
         '   if(log.op.toNumber() == 0x54) ' +
         '     this.retVal.push(log.getPC() + ": SLOAD");' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE");' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
}   // tracer = function ...

The step function here looks at the opcode number of the op, and only pushes an entry if the opcode is SLOAD or SSTORE. We could have used log.op.toString() instead, but it is faster to compare numbers than strings.

The output looks similar to this:

[
  "5921: SLOAD",
  .
  .
  .
  "2413: SSTORE",
  "2420: SLOAD",
  "2475: SSTORE",
  "6094: SSTORE"
]
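When more than a couple of opcodes are of interest, the chain of if statements can be replaced by a lookup table. The following sketch shows that variant and runs it against mock data outside the node. The names member, mockStorageLog, and the sample steps are assumptions made for illustration; the opcode numbers 0x54 (SLOAD) and 0x55 (SSTORE) are the ones used above.

```javascript
const storageTracer = {
  retVal: [],
  // Opcode numbers of interest, mapped to the label to record.
  names: { 0x54: "SLOAD", 0x55: "SSTORE" },
  step: function (log, db) {
    const name = this.names[log.op.toNumber()];
    if (name !== undefined) {
      this.retVal.push(log.getPC() + ": " + name);
    }
  },
  fault: function (log, db) { this.retVal.push("FAULT: " + JSON.stringify(log)); },
  result: function (ctx, db) { return this.retVal; },
};

// Hypothetical stand-in for the log object, exposing only what step() uses.
function mockStorageLog(pc, opcode) {
  return {
    getPC: function () { return pc; },
    op: { toNumber: function () { return opcode; } },
  };
}

const mockSteps = [
  mockStorageLog(10, 0x60), // PUSH1, ignored by the filter
  mockStorageLog(12, 0x54), // SLOAD, recorded
  mockStorageLog(13, 0x01), // ADD, ignored by the filter
  mockStorageLog(15, 0x55), // SSTORE, recorded
];
mockSteps.forEach(function (log) { storageTracer.step(log, null); });

const storageTrace = storageTracer.result({}, null);
console.log(storageTrace);
```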

Stack Information

The trace above tells us the program counter and whether the program read from storage or wrote to it. To know more, you can use the log.stack.peek function to peek into the stack. log.stack.peek(0) is the stack top, log.stack.peek(1) is the entry below it, etc.

The values returned by log.stack.peek are Go big.Int objects. By default they are converted to JavaScript floating point numbers, so you need toString(16) to get them as hexadecimals, which is how we normally represent 256-bit values such as storage cells and their content.

tracer = function(tx) {
      return debug.traceTransaction(tx, {tracer:
      '{' +
         'retVal: [],' +
         'step: function(log,db) {' +
         '   if(log.op.toNumber() == 0x54) ' +
         '     this.retVal.push(log.getPC() + ": SLOAD " + ' +
         '        log.stack.peek(0).toString(16));' +
         '   if(log.op.toNumber() == 0x55) ' +
         '     this.retVal.push(log.getPC() + ": SSTORE " +' +
         '        log.stack.peek(0).toString(16) + " <- " +' +
         '        log.stack.peek(1).toString(16));' +
         '},' +
         'fault: function(log,db) {this.retVal.push("FAULT: " + JSON.stringify(log))},' +
         'result: function(ctx,db) {return this.retVal}' +
      '}'
      }) // return debug.traceTransaction ...
}   // tracer = function ...

The output looks similar to this:

[
  "5921: SLOAD 0",
  .
  .
  .
  "2413: SSTORE 3f0af0a7a3ed17f5ba6a93e0a2a05e766ed67bf82195d2dd15feead3749a575d <- fb8629ad13d9a12456",
  "2420: SLOAD cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870",
  "2475: SSTORE cc39b177dd3a7f50d4c09527584048378a692aed24d31d2eabeddb7f3c041870 <- 358c3de691bd19",
  "6094: SSTORE 0 <- 1"
]
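The hexadecimal strings in this output have their leading zeros stripped, so values such as "0" and "fb8629ad13d9a12456" have different lengths. If fixed-width 32-byte words are easier to compare, the strings can be padded after the trace has been retrieved. This is a minimal sketch intended for post-processing in a recent JavaScript runtime such as Node.js, not inside the tracer itself; toWord is a hypothetical helper name.

```javascript
// Left-pad a hex string (without 0x prefix) to a full 32-byte word (64 hex digits).
function toWord(hex) {
  return "0x" + hex.padStart(64, "0");
}

console.log(toWord("0"));
console.log(toWord("fb8629ad13d9a12456"));

// The padded form still parses to the same numeric value.
const paddedValue = BigInt(toWord("fb8629ad13d9a12456"));
const originalValue = BigInt("0xfb8629ad13d9a12456");
console.log(paddedValue === originalValue);
```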

There are several other facets of filtered EVM tracing, including:

  • determining operation results

  • dealing with calls between contracts

  • accessing memory

  • using the db parameter to know the state of the chain at the time of execution

JSON-RPC debug_trace* Endpoints

BlockX supports the following debug_trace* JSON-RPC methods, which follow Geth's debug API.

debug_traceTransaction

The traceTransaction debugging method attempts to run the transaction in the exact same manner as it was executed on the network. It replays any transactions that were executed prior to this one before finally attempting to execute the transaction that corresponds to the given hash.

Parameters:

  • transaction hash

  • trace configuration

# Request
curl -X POST --data '{"jsonrpc":"2.0","method":"debug_traceTransaction","params":[<transaction hash>, {"tracer": "{data: [], fault: function(log) {}, step: function(log) { if(log.op.toString() == \"CALL\") this.data.push(log.stack.peek(0)); }, result: function() { return this.data; }}"}],"id":1}' -H "Content-Type: application/json" https://eth.bd.blockxnet.com:8545

# Result
{"jsonrpc":"2.0","id":1,"result":[{"result":["68410", "51470"]}]}
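Escaping the quotes inside the tracer string by hand is error-prone. One option is to build the request body with JSON.stringify, which escapes the tracer automatically. The following is a sketch under stated assumptions: buildTraceRequest is a hypothetical helper, and the all-zero hash is a placeholder that does not refer to a real transaction.

```javascript
// The same tracer as in the curl request above, written without manual escaping.
const callTracer =
  "{data: [], fault: function(log) {}, " +
  'step: function(log) { if(log.op.toString() == "CALL") this.data.push(log.stack.peek(0)); }, ' +
  "result: function() { return this.data; }}";

// Hypothetical helper: build the JSON-RPC request body for debug_traceTransaction.
function buildTraceRequest(txHash, tracer) {
  return JSON.stringify({
    jsonrpc: "2.0",
    method: "debug_traceTransaction",
    params: [txHash, { tracer: tracer }],
    id: 1,
  });
}

// Placeholder hash for illustration only.
const requestBody = buildTraceRequest("0x" + "00".repeat(32), callTracer);
console.log(requestBody);
```

The resulting string can be passed to curl with --data, or sent as the body of an HTTP POST request with the Content-Type header set to application/json.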

debug_traceBlockByHash

The traceBlockByHash endpoint accepts a block hash, and will replay the block that is already present in the database.

Parameters:

  • block hash

  • trace configuration

# Request
curl -X POST --data '{"jsonrpc":"2.0","method":"debug_traceBlockByHash","params":[<block hash>, {"tracer": "{data: [], fault: function(log) {}, step: function(log) { if(log.op.toString() == \"CALL\") this.data.push(log.stack.peek(0)); }, result: function() { return this.data; }}"}],"id":1}' -H "Content-Type: application/json" https://eth.bd.blockxnet.com:8545

# Result
{"jsonrpc":"2.0","id":1,"result":[{"result":["68410", "51470"]}]}
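In the response above, the outer result is an array and each element carries the tracer's own return value in its result field. The following sketch unpacks that shape. It assumes one array element per transaction in the block, in block order, which should be verified against the node being queried; collectResults is a hypothetical helper.

```javascript
// The sample response shown above, as a JavaScript object.
const blockTraceResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: [{ result: ["68410", "51470"] }],
};

// Hypothetical helper: pair each entry's tracer output with its position in the array.
function collectResults(response) {
  return response.result.map(function (txTrace, index) {
    return { txIndex: index, values: txTrace.result };
  });
}

const perTransaction = collectResults(blockTraceResponse);
console.log(JSON.stringify(perTransaction));
```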
