Creation of a formal verification system from scratch. Part 1: Character VM in PHP and Python

Formal verification is the verification of one program or algorithm with the help of another.

This is one of the most powerful methods that allows you to find all the vulnerabilities in the program or prove that they do not exist.

A more detailed description of formal verification can be seen on the example of solving the problem of Wolf, Goat, and cabbage in my previous article.

In this article, I move from formal problem verification to programs and describe how
how you can convert them into systems of formal rules automatically.

To do this, I wrote my analogue of a virtual machine, on symbolic principles.

It parses the program code and translates it into a system of equations (SMT), which can already be solved programmatically.

Since information about symbolic calculations is presented on the Internet rather fragmentarily,
I will briefly describe what it is.

Symbolic calculations are a way to simultaneously execute a program on a wide range of data and are the main tool for formal program verification.

For example, we can set input conditions where the first argument can take any positive value, the second can be negative, the third can be zero, and the output argument is, for example, 42.

Symbolic calculations in one run will give us an answer whether it is possible for us to obtain the desired result and an example of a set of such input parameters. Or proof that there are no such parameters.

Moreover, we can set the input arguments in general as all possible, and choose only the output, for example, the administrator password.

In this case, we will find all the vulnerabilities of the program, or we will get proof that the admin password is safe.

It can be seen that the classical execution of a program with specific input data is a special case of a symbolic one.

Therefore, my character VM can also work in the emulation mode of a standard virtual machine.

In the comments to the previous article, one can also find fair criticism of formal verification with a discussion of its weaknesses.

The main problems are:

  1. Combinatorial explosion, since formal verification eventually rests on P=NP
  2. Handling calls to the file system, networks, and other external storage is more difficult to verify
  3. Bugs in the specification, when the customer or programmer conceived one thing, but did not accurately describe it in the TOR.

As a result, the program will be verified and comply with the specification, but will do something completely different from what the creators expected from it.

Since in this article I mainly consider the application of formal verification in practice, I will not beat my forehead against the wall for now, and I will choose a system where the algorithmic complexity and the number of external calls are minimal.

Since smart contracts fit these requirements in the best way, the choice fell on RIDE contracts from the Waves platform: they are not Turing-complete, and their maximum complexity is artificially limited.

But we will consider them exclusively from the technical side.

In addition to everything, for any contracts, formal verification will be especially in demand: as a rule, it is impossible to correct a contract error after it has been launched.
And the price of such errors can be very high, since quite large amounts of funds are often stored on smart contracts.

My character VM is written in PHP and Python, and uses Microsoft Research's Z3Prover to solve the resulting SMT formulas.

At its core is a powerful multi-transactional search, which
allows you to find solutions or vulnerabilities, even if it requires many transactions.
Even Mythrill, one of the most powerful character frameworks for finding Ethereum vulnerabilities, only added this capability a few months ago.

But it is worth noting that ether contracts are more complex and have Turing-completeness.

PHP translates the source code of the RIDE smart contract into a python script, in which the program is presented as a Z3 SMT-compatible system of contract states and their transition conditions:

Creation of a formal verification system from scratch. Part 1: Character VM in PHP and Python

Now I will describe what is happening inside, in more detail.

But first, a few words about the RIDE smart contract language.

It is a functional and expression-based programming language that is lazy by design.
RIDE runs in isolation inside the blockchain, and can extract and write information from the storage associated with the user's wallet.

A RIDE contract can be attached to each wallet, and the result of execution will be only TRUE or FALSE.

TRUE means that the smart contract allows the transaction, and FALSE that it prohibits it.
A simple example: the script can prohibit the transfer if the wallet balance is less than 100.

As an example, I will take the same Wolf, Goat, and Cabbage, but already presented in the form of a smart contract.

The user will not be able to withdraw money from the wallet on which the contract is deployed until he sends everyone to the other side.

#ИзвлСкаСм ΠΏΠΎΠ»ΠΎΠΆΠ΅Π½ΠΈΠ΅ всСх ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ΠΎΠ² ΠΈΠ· Π±Π»ΠΎΠΊΡ‡Π΅ΠΉΠ½Π°
let contract = tx.sender
let human= extract(getInteger(contract,"human"))
let wolf= extract(getInteger(contract,"wolf"))
let goat= extract(getInteger(contract,"goat"))
let cabbage= extract(getInteger(contract,"cabbage"))

#Π­Ρ‚ΠΎ Ρ‚Π°ΠΊ называСмая Π΄Π°Ρ‚Π°-транзакция, Π² ΠΊΠΎΡ‚ΠΎΡ€ΠΎΠΉ ΠΏΠΎΠ»ΡŒΠ·ΠΎΠ²Π°Ρ‚Π΅Π»ΡŒ присылаСт Π½ΠΎΠ²Ρ‹Π΅ 4 ΠΏΠ΅Ρ€Π΅ΠΌΠ΅Π½Π½Ρ‹Π΅.
#ΠšΠΎΠ½Ρ‚Ρ€Π°ΠΊΡ‚ Ρ€Π°Π·Ρ€Π΅ΡˆΠΈΡ‚ Π΅Ρ‘ Ρ‚ΠΎΠ»ΡŒΠΊΠΎ Π² случаС Ссли всС ΠΎΠ±ΡŠΠ΅ΠΊΡ‚Ρ‹ останутся Π² сохранности.
match tx {
case t:DataTransaction =>
   #ИзвлСкаСм Π±ΡƒΠ΄ΡƒΡ‰Π΅Π΅ ΠΏΠΎΠ»ΠΎΠΆΠ΅Π½ΠΈΠ΅ всСх ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ΠΎΠ² ΠΈΠ· Ρ‚Ρ€Π°Π½Π·Π°ΠΊΡ†ΠΈΠΈ
   let newHuman= extract(getInteger(t.data,"human")) 
   let newWolf= extract(getInteger(t.data,"wolf"))
   let newGoat= extract(getInteger(t.data,"goat"))
   let newCabbage= extract(getInteger(t.data,"cabbage"))
   
   #0 ΠΎΠ±ΠΎΠ·Π½Π°Ρ‡Π°Π΅Ρ‚, Ρ‡Ρ‚ΠΎ ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ Π½Π° Π»Π΅Π²ΠΎΠΌ Π±Π΅Ρ€Π΅Π³Ρƒ, Π° 1 Ρ‡Ρ‚ΠΎ Π½Π° ΠΏΡ€Π°Π²ΠΎΠΌ
   let humanSide= human == 0 || human == 1
   let wolfSide= wolf == 0 || wolf == 1
   let goatSide= goat == 0 || goat == 1
   let cabbageSide= cabbage == 0 || cabbage == 1
   let side= humanSide && wolfSide && goatSide && cabbageSide

   #Π‘ΡƒΠ΄ΡƒΡ‚ Ρ€Π°Π·Ρ€Π΅ΡˆΠ΅Π½Ρ‹ Ρ‚ΠΎΠ»ΡŒΠΊΠΎ Ρ‚Π΅ Ρ‚Ρ€Π°Π½Π·Π°ΠΊΡ†ΠΈΠΈ, Π³Π΄Π΅ с ΠΊΠΎΠ·ΠΎΠΉ Π½ΠΈΠΊΠΎΠ³ΠΎ Π½Π΅Ρ‚ Π² отсутствии Ρ„Π΅Ρ€ΠΌΠ΅Ρ€Π°.
   let safeAlone= newGoat != newWolf && newGoat != newCabbage
   let safe= safeAlone || newGoat == newHuman
   let humanTravel= human != newHuman 

   #Бпособы ΠΏΡƒΡ‚Π΅ΡˆΠ΅ΡΡ‚Π²ΠΈΡ Ρ„Π΅Ρ€ΠΌΠ΅Ρ€Π° Ρ‚ΡƒΠ΄Π° ΠΈ ΠΎΠ±Ρ€Π°Ρ‚Π½ΠΎ, с ΠΊΠ΅ΠΌ-Ρ‚ΠΎ Π»ΠΈΠ±ΠΎ Π² ΠΎΠ΄ΠΈΠ½ΠΎΡ‡ΠΊΡƒ.
   let t1= humanTravel && newWolf == wolf + 1 && newGoat == goat && newCabbage == cabbage 
   let t2= humanTravel && newWolf == wolf && newGoat == goat + 1 && newCabbage == cabbage
   let t3= humanTravel && newWolf == wolf && newGoat == goat && newCabbage == cabbage + 1
   let t4= humanTravel && newWolf == wolf - 1 && newGoat == goat && newCabbage == cabbage
   let t5= humanTravel && newWolf == wolf && newGoat == goat - 1 && newCabbage == cabbage
   let t6= humanTravel && newWolf == wolf && newGoat == goat && newCabbage == cabbage - 1
   let t7= humanTravel && newWolf == wolf && newGoat == goat && newCabbage == cabbage
   let objectTravel = t1 || t2 || t3 || t4 || t5 || t6 || t7
   
   #ПослСдняя строка Π² Ρ€Π°Π·Π΄Π΅Π»Π΅ Ρ‚Ρ€Π°Π½Π·Π°ΠΊΡ†ΠΈΠΈ описываСт Ρ€Π°Π·Ρ€Π΅ΡˆΠ°ΡŽΡ‰Π΅Π΅ Ρ‚Ρ€Π°Π½Π·Π°ΠΊΡ†ΠΈΡŽ условиС.
   #ΠŸΠ΅Ρ€Π΅ΠΌΠ΅Π½Π½Ρ‹Π΅ Ρ‚Ρ€Π°Π½Π·Π°ΠΊΡ†ΠΈΠΈ Π΄ΠΎΠ»ΠΆΠ½Ρ‹ ΠΈΠΌΠ΅Ρ‚ΡŒ значСния 1 ΠΈΠ»ΠΈ 0, всС ΠΎΠ±ΡŠΠ΅ΠΊΡ‚Ρ‹ Π΄ΠΎΠ»ΠΆΠ½Ρ‹
   #Π±Ρ‹Ρ‚ΡŒ Π² бСзопасности, Π° Ρ„Π΅Ρ€ΠΌΠ΅Ρ€ Π΄ΠΎΠ»ΠΆΠ΅Π½ ΠΏΠ΅Ρ€Π΅ΠΏΠ»Ρ‹Π²Π°Ρ‚ΡŒ Ρ€Π΅ΠΊΡƒ Π² ΠΎΠ΄ΠΈΠ½ΠΎΡ‡ΠΊΡƒ 
   #ΠΈΠ»ΠΈ с ΠΊΠ΅ΠΌ-Ρ‚ΠΎ Π½Π° ΠΊΠ°ΠΆΠ΄ΠΎΠΌ ΡˆΠ°Π³Ρƒ
   side && safe && humanTravel && objectTravel
case s:TransferTransaction =>
   #Вранзакция Π²Ρ‹Π²ΠΎΠ΄Π° срСдств Ρ€Π°Π·Ρ€Π΅ΡˆΠ΅Π½Π° Ρ‚ΠΎΠ»ΡŒΠΊΠΎ Π² случаС Ссли всС ΠΏΠ΅Ρ€Π΅ΠΏΠ»Ρ‹Π»ΠΈ Π½Π° Π΄Ρ€ΡƒΠ³ΠΎΠΉ Π±Π΅Ρ€Π΅Π³
   human == 1 && wolf == 1 && goat == 1 && cabbage == 1

#ВсС ΠΏΡ€ΠΎΡ‡ΠΈΠ΅ Ρ‚ΠΈΠΏΡ‹ Ρ‚Ρ€Π°Π½Π·Π°ΠΊΡ†ΠΈΠΉ Π·Π°ΠΏΡ€Π΅Ρ‰Π΅Π½Ρ‹
case _ => false

}

PHP first extracts all variables from the smart contract in the form of their keys and the corresponding boolean expression.

cabbage: extract ( getInteger ( contract , "cabbage" ) )
goat: extract ( getInteger ( contract , "goat" ) )
human: extract ( getInteger ( contract , "human" ) )
wolf: extract ( getInteger ( contract , "wolf" ) )
fState: human== 1 && wolf== 1 && goat== 1 && cabbage== 1
fState: 
wolf: 
goat: 
cabbage: 
cabbageSide: cabbage== 0 || cabbage== 1
human: extract ( getInteger ( contract , "human" ) )
newGoat: extract ( getInteger ( t.data , "goat" ) )
newHuman: extract ( getInteger ( t.data , "human" ) )
goatSide: goat== 0 || goat== 1
humanSide: human== 0 || human== 1
t7: humanTravel && newWolf== wolf && newGoat== goat && newCabbage== cabbage
t3: humanTravel && newWolf== wolf && newGoat== goat && newCabbage== cabbage + 1
t6: humanTravel && newWolf== wolf && newGoat== goat && newCabbage== cabbage - 1
t2: humanTravel && newWolf== wolf && newGoat== goat + 1 && newCabbage== cabbage
t5: humanTravel && newWolf== wolf && newGoat== goat - 1 && newCabbage== cabbage
t1: humanTravel && newWolf== wolf + 1 && newGoat== goat && newCabbage== cabbage
t4: humanTravel && newWolf== wolf - 1 && newGoat== goat && newCabbage== cabbage
safeAlone: newGoat != newWolf && newGoat != newCabbage
wolfSide: wolf== 0 || wolf== 1
humanTravel: human != newHuman
side: humanSide && wolfSide && goatSide && cabbageSide
safe: safeAlone || newGoat== newHuman
objectTravel: t1 || t2 || t3 || t4 || t5 || t6 || t7

PHP then converts them into a Z3Prover SMT compatible python system description.
The data is wrapped in a loop, where the storage variables receive index i, the transaction variables index i + 1, and the variables with expressions set the rules for transition from the previous state to the next.

This is the very heart of our virtual machine, which provides a multi-transactional search engine.

fState:  And( And( And( human[Steps]  ==  1 , wolf[Steps]  ==  1 )  , goat[Steps]  ==  1 )  , cabbage[Steps]  ==  1 )  
final:  fState[Steps] 
fState:   
wolf:   
goat:   
cabbage:   
cabbageSide:  Or( cabbage[i]  ==  0 , cabbage[i]  ==  1 )  
goatSide:  Or( goat[i]  ==  0 , goat[i]  ==  1 )  
humanSide:  Or( human[i]  ==  0 , human[i]  ==  1 )  
t7:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] )  , goat[i+1]  ==  goat[i] )  , cabbage  ==  cabbage[i] )  
t3:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] )  , goat[i+1]  ==  goat[i] )  , cabbage  ==  cabbage[i] + 1 )  
t6:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] )  , goat[i+1]  ==  goat[i] )  , cabbage  ==  cabbage[i] - 1 )  
t2:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] )  , goat[i+1]  ==  goat[i] + 1 )  , cabbage  ==  cabbage[i] )  
t5:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] )  , goat[i+1]  ==  goat[i] - 1 )  , cabbage  ==  cabbage[i] )  
t1:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] + 1 )  , goat[i+1]  ==  goat[i] )  , cabbage  ==  cabbage[i] )  
t4:  And( And( And( humanTravel[i] , wolf  ==  wolf[i] - 1 )  , goat[i+1]  ==  goat[i] )  , cabbage  ==  cabbage[i] )  
safeAlone:  And( goat[i+1] != wolf , goat[i+1] != cabbage )  
wolfSide:  Or( wolf[i]  ==  0 , wolf[i]  ==  1 )  
humanTravel:  human[i] != human[i+1] 
side:  And( And( And( humanSide[i] , wolfSide[i] )  , goatSide[i] )  , cabbageSide[i] )  
safe:  Or( safeAlone[i] , goat[i+1]  ==  human[i+1] )  
objectTravel:  Or( Or( Or( Or( Or( Or( t1[i] , t2[i] )  , t3[i] )  , t4[i] )  , t5[i] )  , t6[i] )  , t7[i] )  
data:  And( And( And( side[i] , safe[i] )  , humanTravel[i] )  , objectTravel[i] )  

The conditions are sorted and inserted into a script template designed to describe the SMT system in python.

Empty template


import json

from z3 import *

s = Solver()

  
  
    
Steps=7
Num= Steps+1

$code$



#template, only start rest
s.add(data + start)

#template
s.add(final)




ind = 0

f = open("/var/www/html/all/bin/python/log.txt", "a")



while s.check() == sat:
  ind = ind +1
  

  print ind
  m = s.model()
  print m

  print "traversing model..." 
  #for d in m.decls():
	#print "%s = %s" % (d.name(), m[d])

  
 
  f.write(str(m))
  f.write("nn")
  exit()
  #s.add(Or(goat[0] != s.model()[data[0]] )) # prevent next model from using the same assignment as a previous model



print "Total solution number: "
print ind  

f.close()
 


For the last state of the whole chain, the rules that are set in the transfer transaction section are applied.

This means that Z3Prover will look for exactly such sets of states that will ultimately allow you to withdraw funds from the contract.

As a result, we automatically get a fully functional SMT model of our contract.
You can see that it is very similar to the model from my previous article, which I compiled by hand.

Completed Template


import json

from z3 import *

s = Solver()

  
  
    
Steps=7
Num= Steps+1

human = [ Int('human_%i' % (i + 1)) for i in range(Num) ]
wolf = [ Int('wolf_%i' % (i + 1)) for i in range(Num) ]
goat = [ Int('goat_%i' % (i + 1)) for i in range(Num) ]
cabbage = [ Int('cabbage_%i' % (i + 1)) for i in range(Num) ]
nothing= [  And( human[i] == human[i+1], wolf[i] == wolf[i+1], goat[i] == goat[i+1], cabbage[i] == cabbage[i+1] )   for i in range(Num-1) ]


start= [ human[0] == 1, wolf[0] == 0, goat[0] == 1, cabbage[0] == 0 ]

safeAlone= [  And( goat[i+1] != wolf[i+1] , goat[i+1] != cabbage[i+1] )   for i in range(Num-1) ]
safe= [  Or( safeAlone[i] , goat[i+1]  ==  human[i+1] )   for i in range(Num-1) ]
humanTravel= [  human[i] != human[i+1]  for i in range(Num-1) ]
cabbageSide= [  Or( cabbage[i]  ==  0 , cabbage[i]  ==  1 )   for i in range(Num-1) ]
goatSide= [  Or( goat[i]  ==  0 , goat[i]  ==  1 )   for i in range(Num-1) ]
humanSide= [  Or( human[i]  ==  0 , human[i]  ==  1 )   for i in range(Num-1) ]
t7= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] )  , goat[i+1]  ==  goat[i] )  , cabbage[i+1]  ==  cabbage[i] )   for i in range(Num-1) ]
t3= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] )  , goat[i+1]  ==  goat[i] )  , cabbage[i+1]  ==  cabbage[i] + 1 )   for i in range(Num-1) ]
t6= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] )  , goat[i+1]  ==  goat[i] )  , cabbage[i+1]  ==  cabbage[i] - 1 )   for i in range(Num-1) ]
t2= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] )  , goat[i+1]  ==  goat[i] + 1 )  , cabbage[i+1]  ==  cabbage[i] )   for i in range(Num-1) ]
t5= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] )  , goat[i+1]  ==  goat[i] - 1 )  , cabbage[i+1]  ==  cabbage[i] )   for i in range(Num-1) ]
t1= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] + 1 )  , goat[i+1]  ==  goat[i] )  , cabbage[i+1]  ==  cabbage[i] )   for i in range(Num-1) ]
t4= [  And( And( And( humanTravel[i] , wolf[i+1]  ==  wolf[i] - 1 )  , goat[i+1]  ==  goat[i] )  , cabbage[i+1]  ==  cabbage[i] )   for i in range(Num-1) ]
wolfSide= [  Or( wolf[i]  ==  0 , wolf[i]  ==  1 )   for i in range(Num-1) ]
side= [  And( And( And( humanSide[i] , wolfSide[i] )  , goatSide[i] )  , cabbageSide[i] )   for i in range(Num-1) ]
objectTravel= [  Or( Or( Or( Or( Or( Or( t1[i] , t2[i] )  , t3[i] )  , t4[i] )  , t5[i] )  , t6[i] )  , t7[i] )   for i in range(Num-1) ]
data= [ Or(  And( And( And( side[i] , safe[i] )  , humanTravel[i] )  , objectTravel[i] )   , nothing[i]) for i in range(Num-1) ]


fState=  And( And( And( human[Steps]  ==  1 , wolf[Steps]  ==  1 )  , goat[Steps]  ==  1 )  , cabbage[Steps]  ==  1 )  
final=  fState 




#template, only start rest
s.add(data + start)

#template
s.add(final)




ind = 0

f = open("/var/www/html/all/bin/python/log.txt", "a")



while s.check() == sat:
  ind = ind +1
  

  print ind
  m = s.model()
  print m

  print "traversing model..." 
  #for d in m.decls():
	#print "%s = %s" % (d.name(), m[d])

  
 
  f.write(str(m))
  f.write("nn")
  exit()
  #s.add(Or(goat[0] != s.model()[data[0]] )) # prevent next model from using the same assignment as a previous model



print "Total solution number: "
print ind  

f.close()
 


After launch, Z3Prover solves the smart contract and displays a chain of transactions for us, which will allow us to withdraw funds:

Winning transaction chain found:
Data transaction: human= 0, wolf= 0, goat= 1, cabbage= 0
Data transaction: human= 1, wolf= 0, goat= 1, cabbage= 1
Data transaction: human= 0, wolf= 0, goat= 0, cabbage= 1
Data transaction: human= 1, wolf= 1, goat= 0, cabbage= 1
Data transaction: human= 0, wolf= 1, goat= 0, cabbage= 1
Data transaction: human= 1, wolf= 1, goat= 1, cabbage= 1
Data transaction: human= 1, wolf= 1, goat= 1, cabbage= 1
Transfer transaction

In addition to the crossing contract, you can experiment with your own contracts or try this simple example, which is solved in 2 transactions.

let contract = tx.sender
let a= extract(getInteger(contract,"a"))
let b= extract(getInteger(contract,"b"))
let c= extract(getInteger(contract,"c"))
let d= extract(getInteger(contract,"d"))

match tx {
case t:DataTransaction =>
let na= extract(getInteger(t.data,"a")) 
let nb= extract(getInteger(t.data,"b"))
let nc= extract(getInteger(t.data,"c"))
let nd= extract(getInteger(t.data,"d"))
   
   nd == 0 || a == 100 - 5
case s:TransferTransaction =>
   ( a + b - c ) * d == 12

case _ => true

}

Since this is the very first version, the syntax is very limited and there may be bugs.
In the next articles, I plan to cover the further development of the VM, and show how you can create formally verified smart contracts with it, and not just solve them.

The character virtual machine is available at http://2.59.42.98/hyperbox/
After bringing the source code of the character VM in order and adding comments there, I plan to put it on github in free access.

Source: habr.com

Add a comment