Postmortems

This section of the documentation keeps a record of issues that have occurred on the network since its inception:

Postmortems / Issues
- xAPIC
- Secpk-Verification Bloat (Shade Airdrop)
- Earn Contract Exploit
- Testnet Halt 95

SNIP-20 leaks

Here is the blogpost regarding the Spicy printF leaks and SNIP-20 padding issues revealed by Andrew Miller in research about the network and reference implementation code.

xApic

Secpk-Verifications Bloat

On the afternoon of Feb 21, 2022, the community started seeing degraded network performance stemming from the launch of the Shade airdrop. The degradation had multiple causes, and many lessons were learned from this stress test. Some of these findings are documented here to educate network participants.

The community came together quickly to solve as many problems as possible. The incident response can be found here: https://forum.scrt.network/t/discuss-network-issues-w-shade-airdrop-2-21-22/5475

What happened?

On Feb 21, 2022 at around 11pm UTC the Shade protocol airdrop was launched, drawing a lot of attention to Secret Network.

As a part of their airdrop mechanism, Shade heavily utilized secp256k1 signature verification in their contracts, which is very computationally expensive.

These transactions caused blocks to slow down because of the time required to compute each block, which filled up the mempool and delayed the execution of transactions. A further effect of blocks taking a long time to compute was that queries slowed down as well, because at that time a node could not both compute a block and serve a request.

The network was clogged with transactions for multiple days because of the high demand and slow block times. After the Shade airdrop application was rate limited, the network picked back up and the rest of the applications were no longer affected.

Why Did This Happen?

The abnormal behavior was mostly due to nodes running an outdated WebAssembly engine, which did not handle long computations efficiently. In addition, gas calculations did not account for this inefficiency, which further compounded the issue.

To put it simply, these secp256k1 verifications took more computation power than they were paying for in gas. A full block was therefore several orders of magnitude more expensive to compute, so validators could not finish within the normal 6-second time frame and block times grew longer and longer. Longer block times mean fewer transactions per second, which left the network too slow to handle the transaction load.
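The relationship between underpriced gas and block time can be sketched as follows. This is only an illustrative model: the function names and every number used with it are hypothetical and were not measured on the network.

```rust
/// Time in milliseconds needed to execute a block that is filled to its gas
/// limit with identical transactions. `gas_per_tx` must be non-zero.
fn full_block_compute_ms(block_gas_limit: u64, gas_per_tx: u64, compute_ms_per_tx: u64) -> u64 {
    // Gas only bounds how many transactions fit into a block...
    let txs_per_block = block_gas_limit / gas_per_tx;
    // ...while the real cost is the time each transaction takes to run.
    txs_per_block * compute_ms_per_tx
}

/// A block cannot be produced faster than it can be computed, so the
/// effective block time is the larger of the target and the compute time.
fn effective_block_time_ms(target_ms: u64, compute_ms: u64) -> u64 {
    target_ms.max(compute_ms)
}
```

With a hypothetical 10,000,000 gas limit and 100,000 gas per transaction, a block holds 100 transactions. At 10 ms each, the block computes in 1 second and the 6-second target holds. At 500 ms each for the same gas price, the block needs 50 seconds, which is the effect described above.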

A secondary reason validator nodes could not meet the computational demand was related not to hardware but to peering. Validators talk to their peers to pass consensus information along. Validators can set persistent peers, but the network can cut off certain validators completely when it is under stress. During the chain congestion, only certain groups of validators were talking to each other. Once 67% of voting power was found, the block would be signed, leaving the rest of the validators unable to sign at all. This created very spotty signing patterns: some nodes signed every block and others signed none, even though both had finished computing before the block was committed.
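A minimal sketch of why well-connected validators end up signing while others are left out, assuming votes are counted in arrival order and a block commits as soon as more than two thirds of the voting power has been seen. The function name and the simplified commit rule are assumptions for illustration, not the consensus engine's actual code.

```rust
/// Returns the validators whose votes are counted when the block commits,
/// or `None` if the quorum is never reached with the votes provided.
fn signers_at_commit<'a>(
    votes_in_arrival_order: &[(&'a str, u64)],
    total_power: u64,
) -> Option<Vec<&'a str>> {
    let mut accumulated: u64 = 0;
    let mut signers = Vec::new();
    for &(validator, power) in votes_in_arrival_order {
        signers.push(validator);
        accumulated += power;
        // Strictly more than 2/3 of the total voting power commits the block;
        // votes that arrive later are not part of this commit.
        if accumulated * 3 > total_power * 2 {
            return Some(signers);
        }
    }
    None
}
```

In this model, a validator whose vote reaches its peers late is left out of the commit even if it finished computing the block in time, which matches the spotty signing patterns described above.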

Quick note: The Shinobi protocol testnet, launched slightly before the Shade airdrop, also caused a significant slowdown of the network; blocks were full from just a few of its transactions at the same time. It was later recognised that this was caused by the same problem with secp256k1 verifications, which Shinobi uses for light-client verification of cross-chain Bitcoin transfers.

What has been done

- Firstly, a small upgrade was released that significantly improved query node performance. This upgrade allows nodes to serve many more requests and lessens the impact of long block computations. This will help services like Keplr stay available during network-wide events. Reference: https://github.com/scrtlabs/SecretNetwork/releases/tag/v1.2.5
- Computationally expensive functions like secp256k1 verification were changed to be exposed to contracts as APIs (instead of being executed inside the contract), which made them much more efficient. New APIs were released during Shockwave Alpha which brought 500x improvements to these transactions. Reference: https://github.com/scrtlabs/SecretNetwork/releases/tag/v1.3.0
- Seeds were introduced to solve the validator peering issues. Validators can consult the validator documentation to add these seeds to their peering list.
- We are also replacing our WASM engine with a newer, more performant one. This item is still on the roadmap and will help with the long-term scalability of Secret Network.
- Lastly, we will also re-evaluate gas calculation and pricing and try to adjust gas to more accurately reflect the computational cost of each contract. This was a huge lesson learned: gas needs to equal the computational cost, or nodes will not be able to handle the load.

For some extra information on lessons learned from this event you can read this blog: https://scrt.network/blog/scrt-labs-update-scaling-secrets

Earn Contract Exploit

September 2021 Earn Contract Exploit

Chain id: secret-2
Date: 13/09/2021 3am UTC
Related issues: https://forum.scrt.network/t/earn-contract-exploit-post-mortem/4426

Description

Hi everyone,

A couple of weeks ago, a vulnerability in the SecretSwap Earn contracts (also known as the SPY contracts) was discovered and exploited. As far as we know, this is the first Rust/WASM-based contract exploit case, which is interesting in and of itself, and specifically, the first one on Secret Network where the interactions with said contract were all private (more on this below).

At this point, it’s important to clarify that Secret Network was in no way exploited, neither were the bridges, and that all funds are safe (with the exception of some minor network-upgrade related cases we are actively resolving, accounting for ~$50K). Like in any other major smart-contract chain, including ETH, BSC, and others, smart contract-related vulnerabilities are a potential risk. All we can do is mitigate the risk (and improve on our best practices in doing so), but it cannot be completely eliminated. In this case, the vulnerability, as is described below in a quite technical manner, was not an easy one to uncover and was quite sophisticated.

What Happened

The exploit took advantage of a missing input integrity check in the SPY contracts’ (=reward pools’ contracts) deposit function to arbitrarily generate rights to withdraw assets from the SPY contracts. We’ll go over a normal flow of a deposit to a SPY contract, and then how it was exploited. Keep in mind that there are 5 types of contracts involved here:

- Secret Tokens - contracts such as sSCRT, sETH, sXMR, etc.
- Swap Pairs - which handle trading between pairs of Secret Tokens.
- LP Tokens - which represent a liquidity provider’s portion of the liquidity pools in the Swap Pairs.
- SPY contracts - which allow users to deposit LP tokens in exchange for accumulating SEFI rewards.
- The Master Contract - which orchestrates the allocation and minting of $SEFI to the SPY contracts.

A valid flow of depositing assets in SPY contracts works like this:

Alice has eligible LP tokens. Alice executes a send transaction to the SPY contract, with a deposit inner message:

export const DepositRewards = async (params: {
  secretjs: AsyncSender;
  recipient: string;
  address: string;
  amount: string;
  fee?: StdFee;
}): Promise<string> => {
  const tx = await Snip20Send({
    msg: 'eyJkZXBvc2l0Ijp7fX0K', // '{"deposit":{}}' -> base64
    ...params,
  });
  console.log(tx)
  return 'yooyoo';
};

The Receive handler of the SPY contract receives the message above with the LP funds amount, which is then parsed and handled as described below.

The integrity of the received assets (Alice’s locked assets) relies on the integrity of the LP token; we have to trust the LP token to provide an accurate amount of received tokens, i.e. we trust the amount that is received in the receive call. The LP contract constructs the receive message with the correct information here (full code section):

fn try_add_receiver_api_callback<S: ReadonlyStorage>(
    messages: &mut Vec<CosmosMsg>,
    storage: &S,
    recipient: &HumanAddr,
    msg: Option<Binary>,
    sender: HumanAddr,
    from: HumanAddr,
    amount: Uint128,
) -> StdResult<()> {
    let receiver_hash = get_receiver_hash(storage, recipient);
    if let Some(receiver_hash) = receiver_hash {
        let receiver_hash = receiver_hash?;
        let receiver_msg = Snip20ReceiveMsg::new(sender, from, amount, msg);
        let callback_msg = receiver_msg.into_cosmos_msg(receiver_hash, recipient.clone())?;

        messages.push(callback_msg);
    }
    Ok(())
}

Upon receive, the SPY contract first needs to get the amount of rewards that the Master contract has allocated to it so far. This information needs to be collected before the other state changes occur (either a deposit or a redeem). Therefore, the SPY contract calls the Master contract with update_allocation. Since there is no ability to call an external contract function inline from another contract function, the SPY contract also provides a callback message (in this case called hook) that the Master, in turn, will send back to the same SPY contract, to proceed with the deposit operation.

Building the hook message in the SPY contract:

update_allocation(
    env,
    config,
    Some(to_binary(&LPStakingHookMsg::Deposit {
        from,
        amount: Uint128(amount),
    })?),
)

Wrapping the hook with update_allocation:

fn update_allocation(env: Env, config: Config, hook: Option<Binary>) -> StdResult<HandleResponse> {
    Ok(HandleResponse {
        messages: vec![WasmMsg::Execute {
            contract_addr: config.master.address,
            callback_code_hash: config.master.contract_hash,
            msg: to_binary(&MasterHandleMsg::UpdateAllocation {
                spy_addr: env.contract.address,
                spy_hash: env.contract_code_hash,
                hook,
            })?,
            send: vec![],
        }
        .into()],
        log: vec![],
        data: None,
    })
}

That message is handled by the Master contract, and then the hook is sent back to the SPY contract like this:

// Notify to the spy contract on the new allocation
messages.push(
    WasmMsg::Execute {
        contract_addr: spy_address.clone(),
        callback_code_hash: spy_hash,
        msg: to_binary(&LPStakingHandleMsg::NotifyAllocation {
            amount: Uint128(rewards),
            hook,
        })?,
        send: vec![],
    }
    .into(),
);

The Master contract calls the SPY with notify_allocation, which contains hook. The SPY contract proceeds to finish the deposit operation. Note that the amount argument here originally came from the receive function, and is therefore trusted and assumed to be valid.

Exploit Flow

- Bob (attacker) executes a transaction that calls the Master contract directly with update_allocation, with an inner deposit message (as hook). Note that update_allocation requires no permissions and can be called by anyone.
- The Master contract calls notify_allocation on the SPY contract with the provided hook.
- Since notify_allocation relies on the data originally coming from a receive message, there are no further integrity checks on amount, and the hook is interpreted as a valid deposit message.
- In the SPY contract, the deposit_hook function is called with the parameters from the deposit, and increments Bob’s balance in the SPY contract: user.locked += amount;

Bob’s deposit message is successfully processed and he is given a right to withdraw funds equivalent to the amount he provided i.e. Bob can withdraw assets that were not deposited by him.
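The flow above can be modelled in a few lines of plain Rust. This is a simplified sketch and not the contracts' actual code: the `Spy`, `Master` and `Hook` types and both `update_allocation_*` functions are hypothetical stand-ins, and the guarded variant shows one possible integrity check, not necessarily the fix that was deployed.

```rust
use std::collections::HashMap;

/// Hook message that the Master contract forwards back to the SPY contract.
#[derive(Clone, Debug, PartialEq)]
enum Hook {
    Deposit { from: String, amount: u128 },
}

/// Simplified stand-in for a SPY (reward pool) contract.
struct Spy {
    address: String,
    master: String,
    locked: HashMap<String, u128>,
}

impl Spy {
    /// Applies the hook sent back by the Master contract.
    fn notify_allocation(&mut self, sender: &str, hook: Option<Hook>) -> Result<(), String> {
        if sender != self.master.as_str() {
            return Err("only the Master contract may notify".to_string());
        }
        if let Some(Hook::Deposit { from, amount }) = hook {
            // Same effect as `user.locked += amount` in the exploit flow.
            *self.locked.entry(from).or_insert(0) += amount;
        }
        Ok(())
    }
}

/// Simplified stand-in for the Master contract.
struct Master {
    address: String,
}

impl Master {
    /// Vulnerable variant: forwards the hook regardless of who supplied it.
    fn update_allocation_unchecked(
        &self,
        _sender: &str,
        spy: &mut Spy,
        hook: Option<Hook>,
    ) -> Result<(), String> {
        spy.notify_allocation(&self.address, hook)
    }

    /// Guarded variant: only the SPY contract itself may attach a hook.
    fn update_allocation_checked(
        &self,
        sender: &str,
        spy: &mut Spy,
        hook: Option<Hook>,
    ) -> Result<(), String> {
        if hook.is_some() && sender != spy.address.as_str() {
            return Err("hook rejected: caller is not the SPY contract".to_string());
        }
        spy.notify_allocation(&self.address, hook)
    }
}
```

In the unchecked variant, a call from Bob with a forged `Hook::Deposit` credits his balance although no LP tokens were received. In the checked variant, the same call is rejected, while a hook supplied by the SPY contract still goes through.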

Resolution

As soon as the exploit became known, the entire Enigma team, many of the network’s validators and other members of the community, such as the Secret Foundation, committee members and leads, bridge operators and many others, came together to devise an action plan. Despite the many difficulties in coordinating so many actors in a decentralized ecosystem across many time zones, we were all able to coordinate a network upgrade that corrected the situation. While not an easy decision, given the funds at stake, this course of action was accepted by the majority of validators in the network.

In addition, to prevent funds from flowing out of the network, we communicated with all bridge operators and exchanges to ensure withdrawals outside of the network are temporarily disabled. This again required the interaction of many parties in the community and outside of it, and we are grateful for everyone who participated and assisted.

In particular, I’d like to also use this opportunity to thank my own team (Enigma), for staying up for 40+ hours while ensuring the vulnerability is found and patched, and for taking a leading part in coordinating all the different parties until a successful resolution.

Currently, everything in the network and all of its applications (including SecretSwap and the ETH/BSC bridges) are back to normal activity. We expect the Monero bridge to activate shortly as well, and we can say that the new Earn contracts, which would require migrating liquidity from the old (vulnerable) Earn contracts, are coming soon (next week at the latest). Given the privacy features of the network, it’s not possible to easily withdraw unclaimed SEFI from the old rewards contracts. This means that quite a lot of SEFI will in fact be burned. In addition, no new SEFI has been minted in the past few weeks, reducing the effective SEFI supply. Some of that supply will be reintroduced as compensation for liquidity providers who stayed and will migrate to the new contracts, in the form of accelerated rewards in the first few days.

Conclusion And Next Steps

There was a very sophisticated vulnerability in a Secret Contract. The network was never compromised, nor were any of the bridges. Nevertheless, in a collective action, the community came together and performed a network upgrade that ensured funds’ safety.

At this point, everything is back to normal operation, with the exception of the new, patched, Earn contracts (and by extension – governance) that will be re-introduced in the next week or so. These will require users to migrate, and will initially over-compensate LP’s unclaimed rewards loss. At the same time, a substantial amount of SEFI were effectively burned, thus reducing its overall supply.

Hope this clarifies the situation. We would like to remind everyone that we have a very generous bug/exploit bounty program, and that we always recommend taking a responsible disclosure course of action (we will make it worthwhile). For those who are interested please e-mail us at info (at) enigma (dot) co.

Best, Guy Enigma CEO

Testnet Halt 95

March 2020 Testnet Halt 95

Chain id: enigma-testnet
Date: 16/03/2020 3am UTC
Related issues: https://github.com/scrtlabs/SecretNetwork/issues/95

Description

On 15 Mar 2020, around 9pm UTC, the following param-change proposal was submitted:

At around 3am UTC the following night the proposal was accepted, and as a result the network halted with the following error:

Mar 16 05:02:52 ip-172-31-44-28 enigmad[20612]: I[2020-03-16|05:02:52.767] Executed block                               module=state height=171146 validTxs=0 invalidTxs=0
Mar 16 05:02:52 ip-172-31-44-28 enigmad[20612]: I[2020-03-16|05:02:52.780] Committed state                              module=state height=171146 txs=0 appHash=FB4739B6F0D4FED77D431922E340B95B8144BF37483D3C1225431311A5BB229D
Mar 16 05:02:58 ip-172-31-44-28 enigmad[20612]: E[2020-03-16|05:02:58.075] CONSENSUS FAILURE!!!                         module=consensus err="negative coin amount" stack="goroutine 1419 [running]:\nruntime/debug.Stack(0xc0012c2ce0, 0x1115f40, 0x16671b0)\n\t/usr/lib/go-1.13/src/runtime/debug/stack.go:24 +0x9d\ngithub.com/tendermint/tendermint/consensus.(*State).receiveRoutine.func2(0xc002741c00, 0x149b478)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:613 +0x57\npanic(0x1115f40, 0x16671b0)\n\t/usr/lib/go-1.13/src/runtime/panic.go:679 +0x1b2\ngithub.com/cosmos/cosmos-sdk/types.DecCoins.Sub(...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.38.1/types/dec_coin.go:307\ngithub.com/cosmos/cosmos-sdk/x/distribution/keeper.Keeper.AllocateTokens(0x169cf60, 0xc000eab5b0, 0xc000ab8460, 0xc000ab8460, 0x169cf60, 0xc000eab5e0, 0x169cfa0, 0xc000eab630, 0xc000e9fbe0, 0xc, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.38.1/x/distribution/keeper/allocation.go:66 +0x1589\ngithub.com/cosmos/cosmos-sdk/x/distribution.BeginBlocker(0x16ad9a0, 0xc000034048, 0x16c2100, 0xc005382800, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.38.1/x/distribution/abci.go:26 +0x2e6\ngithub.com/cosmos/cosmos-sdk/x/distribution.AppModule.BeginBlock(...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.38.1/x/distribution/module.go:147\ngithub.com/cosmos/cosmos-sdk/types/module.(*Manager).BeginBlock(0xc000ab9030, 0x16ad9a0, 0xc000034048, 0x16c2100, 0xc005382800, 0xa, 0x0, 0x0, 0x0, 0x0, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.38.1/types/module/module.go:297 +0x1ca\ngithub.com/enigmampc/SecretNetwork.(*EnigmaChainApp).BeginBlocker(...)\n\t/home/assafmo/workspace/enigmachain/app.go:391\ngithub.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).BeginBlock(0xc000d197c0, 0xc004071c00, 0x20, 0x20, 0xa, 0x0, 
0x0, 0x0, 0x0, 0x0, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.38.1/baseapp/abci.go:136 +0x469\ngithub.com/tendermint/tendermint/abci/client.(*localClient).BeginBlockSync(0xc000e7cd20, 0xc004071c00, 0x20, 0x20, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/abci/client/local_client.go:231 +0x101\ngithub.com/tendermint/tendermint/proxy.(*appConnConsensus).BeginBlockSync(0xc000eab1b0, 0xc004071c00, 0x20, 0x20, 0xa, 0x0, 0x0, 0x0, 0x0, 0x0, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/proxy/app_conn.go:69 +0x6b\ngithub.com/tendermint/tendermint/state.execBlockOnProxyApp(0x16ae3a0, 0xc0027fa9a0, 0x16bb520, 0xc000eab1b0, 0xc003dc3a40, 0x16c4080, 0xc000011138, 0x6, 0xc00272a9a0, 0xe)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/state/execution.go:280 +0x3e1\ngithub.com/tendermint/tendermint/state.(*BlockExecutor).ApplyBlock(0xc0000fce00, 0xa, 0x0, 0xc00272a980, 0x6, 0xc00272a9a0, 0xe, 0x29c8a, 0xc0038a1660, 0x20, ...)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/state/execution.go:131 +0x17a\ngithub.com/tendermint/tendermint/consensus.(*State).finalizeCommit(0xc002741c00, 0x29c8b)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:1431 +0x8f5\ngithub.com/tendermint/tendermint/consensus.(*State).tryFinalizeCommit(0xc002741c00, 0x29c8b)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:1350 +0x383\ngithub.com/tendermint/tendermint/consensus.(*State).enterCommit.func1(0xc002741c00, 0x0, 0x29c8b)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:1285 +0x90\ngithub.com/tendermint/tendermint/consensus.(*State).enterCommit(0xc002741c00, 0x29c8b, 
0x0)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:1322 +0x61a\ngithub.com/tendermint/tendermint/consensus.(*State).addVote(0xc002741c00, 0xc005087c20, 0xc003fd0b70, 0x28, 0xc0012c9a38, 0xd31f92, 0x0)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:1819 +0xa39\ngithub.com/tendermint/tendermint/consensus.(*State).tryAddVote(0xc002741c00, 0xc005087c20, 0xc003fd0b70, 0x28, 0xf136b9f2d600ff82, 0x108, 0x100)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:1642 +0x59\ngithub.com/tendermint/tendermint/consensus.(*State).handleMsg(0xc002741c00, 0x168b640, 0xc0028f4538, 0xc003fd0b70, 0x28)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:709 +0x252\ngithub.com/tendermint/tendermint/consensus.(*State).receiveRoutine(0xc002741c00, 0x0)\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:644 +0x6eb\ncreated by github.com/tendermint/tendermint/consensus.(*State).OnStart\n\t/home/assafmo/workspace/go/pkg/mod/github.com/tendermint/tendermint@v0.33.0/consensus/state.go:335 +0x13a\n"
Mar 16 05:36:48 ip-172-31-44-28 enigmad[20612]: E[2020-03-16|05:36:48.395] Connection failed @ sendRoutine              module=p2p peer=842822b88cb5762ba99cee50a37156c6d0a6c452@149.248.55.89:60440 conn=MConn{149.248.55.89:60440} err="pong timeout"
Mar 16 05:36:48 ip-172-31-44-28 enigmad[20612]: E[2020-03-16|05:36:48.402] Stopping peer for error                      module=p2p peer="Peer{MConn{149.248.55.89:60440} 842822b88cb5762ba99cee50a37156c6d0a6c452 in}" err="pong timeout"

When the vote passed, the distribution module parameters changed to:

community_tax: "0.020000000000000000"
base_proposer_reward: "0.999000000000000000"
bonus_proposer_reward: "0.040000000000000000"
withdraw_addr_enabled: true

The problem occurred because the sum of baseproposerreward and bonusproposerreward can't be greater than 1, and here 0.999 + 0.04 = 1.039 > 1. This results in miscalculations of the rewards and fees.
The cause is a bug in Cosmos SDK in the parameter value validation, causing the proposal to pass despite being invalid. More on that here: https://github.com/cosmos/cosmos-sdk/issues/5808
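The missing check can be sketched as follows. This is a minimal illustration of the constraint stated above, not the Cosmos SDK's actual validation code: `parse_dec` and `validate_proposer_rewards` are hypothetical helpers, and fixed-point integers with 18 decimal places are used so the comparison is exact.

```rust
/// 1.0 expressed with 18 decimal places, the precision of the parameters above.
const ONE: u128 = 1_000_000_000_000_000_000;

/// Parses a non-negative decimal string such as "0.999000000000000000"
/// into an integer scaled by 10^18.
fn parse_dec(s: &str) -> Result<u128, String> {
    let (int_part, frac_part) = match s.split_once('.') {
        Some((i, f)) => (i, f),
        None => (s, ""),
    };
    if frac_part.len() > 18 || !frac_part.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("invalid fractional part: {}", s));
    }
    let int: u128 = int_part
        .parse()
        .map_err(|_| format!("invalid integer part: {}", s))?;
    // Right-pad the fraction with zeros so it always has 18 digits.
    let frac: u128 = format!("{:0<18}", frac_part)
        .parse()
        .map_err(|_| format!("invalid fractional part: {}", s))?;
    int.checked_mul(ONE)
        .and_then(|v| v.checked_add(frac))
        .ok_or_else(|| format!("value out of range: {}", s))
}

/// Rejects parameter sets whose base and bonus proposer rewards sum to more than 1.
fn validate_proposer_rewards(base: &str, bonus: &str) -> Result<(), String> {
    let total = parse_dec(base)?
        .checked_add(parse_dec(bonus)?)
        .ok_or_else(|| "sum out of range".to_string())?;
    if total > ONE {
        return Err(format!(
            "base + bonus proposer reward exceeds 1: {} + {}",
            base, bonus
        ));
    }
    Ok(())
}
```

Applied to the values from the accepted proposal ("0.999000000000000000" and "0.040000000000000000"), the check returns an error, while the restored values ("0.010000000000000000" and "0.040000000000000000") pass.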

Additional Notes

Another invalid proposal was in its voting period at the time, and by itself would have caused the network to halt as well:

"changes": [
  {
    "subspace": "distribution",
    "key": "bonusproposerreward",
    "value": "\"0.999000000000000000\""
  }
]

Action Items

https://github.com/scrtlabs/SecretNetwork/issues/95
https://github.com/scrtlabs/SecretNetwork/issues/97
https://github.com/scrtlabs/SecretNetwork/issues/104

Recovery Process

Logged in to the testnet bootstrap machine.
Exported state from the last "rounded" block height:

enigmad export --for-zero-height --height=170000 > state_export.json

Removed all references to proposal ids 4 and 5 in:

"gov":{
    "deposits":[...],
    "proposals":[...],
    "votes":[...],
    ...
}

Made sure the distribution parameters still make sense:

"distribution":{
    "params":{
            "base_proposer_reward":"0.010000000000000000",
            "bonus_proposer_reward":"0.040000000000000000",
            "community_tax":"0.020000000000000000",
            "withdraw_addr_enabled":true
    },
    ...
}

Erased the coins in possession of the gov ModuleAccount:

"auth":{
    "accounts":[
        {
            "type":"cosmos-sdk/ModuleAccount",
            "value":{
                "coins":[
                    <subtracted amount here>
                ],
                "name":"gov"
            }
        },
        ...
    ],
    ...
}

"Refund" coins to the account that deposited to these proposals in the first place, i.e. added to the account's balance in:

"app_state":{
  "auth":{
     "accounts":[
        {
           "value":{
              "coins":[
                 {
                    "amount":"<added to this amount>"
                 }
              ]
           }
        }
     ]
  }
}

A problem occurred with staking, described at: https://github.com/cosmos/cosmos-sdk/issues/5818

Changed the following:

"distribution":{
  "delegator_starting_infos":[
    {
      ...,
      "starting_info":{
        ...,
        "stake":"999990000.000000000000000000"
      },
      ...
    },
    ...
  ],
  ...
}

To this:

"distribution":{
  "delegator_starting_infos":[
    {
      ...,
      "starting_info":{
        ...,
        "stake":"990000000.000000000000000000"
      },
      ...
    },
    ...
  ],
  ...
}

Then a problem occurred with the compute module:

panic: create wasm contract failed: Wasm Error: Filesystem error: File exists (os error 17)

This was fixed by deleting the .enigmad/.compute directory.

Reset state:

enigmad unsafe-reset-all

Restarted the node.
Mar 16 05:36:48 ip-172-31-44-28 enigmad[20612]: E[2020-03-16|05:36:48.395] Connection failed @ sendRoutine              module=p2p peer=842822b88cb5762ba99cee50a37156c6d0a6c452@149.248.55.89:60440 conn=MConn{149.248.55.89:60440} err="pong timeout"
Mar 16 05:36:48 ip-172-31-44-28 enigmad[20612]: E[2020-03-16|05:36:48.402] Stopping peer for error                      module=p2p peer="Peer{MConn{149.248.55.89:60440} 842822b88cb5762ba99cee50a37156c6d0a6c452 in}" err="pong timeout"