Contracts
Contracts in Solidity are similar to classes in object-oriented languages. They contain persistent data in state variables and functions that can modify these variables. Calling a function on a different contract (instance) will perform an EVM function call and thus switch the context such that state variables of the calling contract are inaccessible.
Creating Contracts
Contracts can be created “from outside” via Ethereum transactions or from within Solidity contracts.
IDEs, such as Remix, make the creation process seamless using UI elements.
Creating contracts programmatically on Ethereum is best done using the JavaScript API web3.js. It has a function called web3.eth.Contract to facilitate contract creation.
When a contract is created, its constructor (a function declared with the constructor keyword) is executed once.
A constructor is optional. Only one constructor is allowed, which means overloading is not supported.
After the constructor has executed, the final code of the contract is deployed to the blockchain. This code includes all public and external functions and all functions that are reachable from there through function calls. The deployed code does not include the constructor code or internal functions only called from the constructor.
Internally, constructor arguments are passed ABI encoded after the code of the contract itself, but you do not have to care about this if you use web3.js.
If a contract wants to create another contract, the source code (and the binary) of the created contract has to be known to the creator. This means that cyclic creation dependencies are impossible.
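As a minimal sketch of creation from within Solidity, assuming compiler version 0.5.x: the contract names Child and Factory are hypothetical and only illustrate that the creator needs the full source of the created contract.

```solidity
pragma solidity >=0.5.0 <0.6.0;

// Hypothetical contract that gets created by another contract.
contract Child {
    address public creator;
    uint public value;

    // Executed exactly once, when the contract is created.
    constructor(uint _value) public {
        creator = msg.sender;
        value = _value;
    }
}

// The source of Child must be known here, otherwise `new Child` does not compile.
contract Factory {
    function create(uint _value) public returns (Child) {
        // Creates a new contract; inside Child's constructor, msg.sender is this Factory.
        return new Child(_value);
    }
}
```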
Visibility and Getters
Solidity knows two kinds of function calls: internal ones that do not create an actual EVM call (also called a “message call”) and external ones that do. Because of this, there are four types of visibility for functions and state variables.
Functions have to be specified as being external, public, internal or private. For state variables, external is not possible.
external: External functions are part of the contract interface, which means they can be called from other contracts and via transactions. An external function f cannot be called internally (i.e. f() does not work, but this.f() works). External functions are sometimes more efficient when they receive large arrays of data.
public: Public functions are part of the contract interface and can be either called internally or via messages. For public state variables, an automatic getter function (see below) is generated.
internal: Those functions and state variables can only be accessed internally (i.e. from within the current contract or contracts deriving from it), without using this.
private: Private functions and state variables are only visible for the contract they are defined in and not in derived contracts.
Everything that is inside a contract is visible to all observers external to the blockchain. Making something private only prevents other contracts from accessing and modifying the information, but it will still be visible to the whole world outside of the blockchain.
The visibility specifier is given after the type for state variables and between parameter list and return parameter list for functions.
In the following example, D can call c.getData() to retrieve the value of data in state storage, but is not able to call f. Contract E is derived from C and, thus, can call compute.
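A sketch of contracts C, D and E matching that description, assuming compiler version 0.5.x. The two calls that the visibility rules forbid are left as comments so that the rest compiles.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    uint private data;

    function f(uint a) private pure returns (uint b) { return a + 1; }
    function setData(uint a) public { data = a; }
    function getData() public view returns (uint) { return data; }
    function compute(uint a, uint b) internal pure returns (uint) { return a + b; }
}

contract D {
    function readData() public returns (uint local) {
        C c = new C();
        // local = c.f(7);          // does not compile: f is private
        c.setData(3);
        local = c.getData();
        // local = c.compute(3, 5); // does not compile: compute is internal
    }
}

contract E is C {
    function g() public pure returns (uint) {
        // Internal members of the base contract are accessible from derived contracts.
        return compute(3, 5);
    }
}
```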
Getter Functions
The compiler automatically creates getter functions for all public state variables. For the contract given below, the compiler will generate a function called data that does not take any arguments and returns a uint, the value of the state variable data. State variables can be initialized when they are declared.
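A minimal sketch of such a contract, assuming compiler version 0.5.x. The second contract, Caller, is only there to show the generated getter being used from outside.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    // The compiler generates `function data() external view returns (uint)`.
    uint public data = 42;
}

contract Caller {
    C c = new C();

    function f() public view returns (uint) {
        // Calls the generated getter through an external call.
        return c.data();
    }
}
```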
The getter functions have external visibility. If the symbol is accessed internally (i.e. without this.), it evaluates to a state variable. If it is accessed externally (i.e. with this.), it evaluates to a function.
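The difference between the two forms of access can be sketched as follows (assuming compiler version 0.5.x; the function name x is arbitrary):

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    uint public data;

    function x() public returns (uint) {
        data = 3;            // internal access: plain state variable
        return this.data();  // external access: call to the getter function
    }
}
```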
If you have a public state variable of array type, then you can only retrieve single elements of the array via the generated getter function. This mechanism exists to avoid high gas costs when returning an entire array. You can use arguments to specify which individual element to return, for example data(0). If you want to return an entire array in one call, then you need to write a function, for example:
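A sketch of such a function, assuming compiler version 0.5.x. The contract name ArrayContract and the helper push are illustrative only.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract ArrayContract {
    // Generated getter: myArray(uint i) returns a single element.
    uint[] public myArray;

    // Helper so that the array can be filled.
    function push(uint value) public {
        myArray.push(value);
    }

    // Returns the entire array in one call.
    function getArray() public view returns (uint[] memory) {
        return myArray;
    }
}
```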
Now you can use getArray() to retrieve the entire array, instead of myArray(i), which returns a single element per call.
The next example is more complex:
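A sketch of a more complex public state variable, assuming compiler version 0.5.x: a nested mapping that ends in an array of structs, where the struct itself contains a mapping.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Complex {
    struct Data {
        uint a;
        bytes3 b;
        mapping (uint => uint) map;
    }
    // The getter takes one argument per mapping key and array index.
    mapping (uint => mapping(bool => Data[])) public data;
}
```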
It generates a function of the following form. The mapping in the struct is omitted because there is no good way to provide the key for the mapping:
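The generated getter corresponds roughly to the hand-written function below. This is a sketch assuming compiler version 0.5.x; the state variable is renamed to store (a hypothetical name) and kept non-public so that the explicit function data does not clash with a generated one.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract ComplexManual {
    struct Data {
        uint a;
        bytes3 b;
        mapping (uint => uint) map;
    }
    // Non-public, so no getter is generated for it.
    mapping (uint => mapping(bool => Data[])) store;

    // Hand-written counterpart of the generated getter.
    // The mapping member `map` is not part of the return values.
    function data(uint arg1, bool arg2, uint arg3) public view returns (uint a, bytes3 b) {
        a = store[arg1][arg2][arg3].a;
        b = store[arg1][arg2][arg3].b;
    }
}
```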
Function Modifiers
Modifiers can be used to easily change the behaviour of functions. For example, they can automatically check a condition prior to executing the function. Modifiers are inheritable properties of contracts and may be overridden by derived contracts.
Multiple modifiers are applied to a function by specifying them in a whitespace-separated list and are evaluated in the order presented.
In an earlier version of Solidity, return statements in functions having modifiers behaved differently.
Explicit returns from a modifier or function body only leave the current modifier or function body. Return variables are assigned and control flow continues after the “_” in the preceding modifier.
Arbitrary expressions are allowed for modifier arguments and in this context, all symbols visible from the function are visible in the modifier. Symbols introduced in the modifier are not visible in the function (as they might change by overriding).
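The following sketch, assuming compiler version 0.5.x, shows a modifier without arguments, a modifier with an argument, and a derived contract that uses both. The contract names are illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract owned {
    address payable owner;

    constructor() public { owner = msg.sender; }

    // The body of the modified function is inserted where `_;` appears.
    modifier onlyOwner {
        require(msg.sender == owner, "Only owner can call this function.");
        _;
    }
}

contract priced {
    // Modifiers can receive arguments.
    modifier costs(uint price) {
        if (msg.value >= price) {
            _;
        }
    }
}

contract Register is priced, owned {
    mapping (address => bool) registeredAddresses;
    uint price;

    constructor(uint initialPrice) public { price = initialPrice; }

    // `payable` is required here, otherwise the function rejects all Ether sent to it.
    function register() public payable costs(price) {
        registeredAddresses[msg.sender] = true;
    }

    // The inherited modifier is applied like a local one.
    function changePrice(uint _price) public onlyOwner {
        price = _price;
    }
}
```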
Constant State Variables
State variables can be declared as constant. In this case, they have to be assigned from an expression which is a constant at compile time. Any expression that accesses storage, blockchain data (e.g. now, address(this).balance or block.number) or execution data (msg.value or gasleft()) or makes calls to external contracts is disallowed. Expressions that might have a side-effect on memory allocation are allowed, but those that might have a side-effect on other memory objects are not. The built-in functions keccak256, sha256, ripemd160, ecrecover, addmod and mulmod are allowed (even though, with the exception of keccak256, they do call external contracts).
The reason behind allowing side-effects on the memory allocator is that it should be possible to construct complex objects like e.g. lookup-tables. This feature is not yet fully usable.
The compiler does not reserve a storage slot for these variables, and every occurrence is replaced by the respective constant expression (which might be computed to a single value by the optimizer).
Not all types for constants are implemented at this time. The only supported types are value types and strings.
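A short sketch of constant state variables of the supported kinds, assuming compiler version 0.5.x:

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    // Value type, computed at compile time.
    uint constant x = 32**22 + 8;
    // String constant.
    string constant text = "abc";
    // keccak256 is one of the built-in functions allowed in constant expressions.
    bytes32 constant myHash = keccak256("abc");
}
```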
Functions
Function Parameters and Return Variables
As in JavaScript, functions may take parameters as input. Unlike in JavaScript and C, functions may also return an arbitrary number of values as output.
Function Parameters
Function parameters are declared the same way as variables, and the name of unused parameters can be omitted.
For example, if you want your contract to accept one kind of external call with two integers, you would use something like:
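A sketch, assuming compiler version 0.5.x; the names Simple, taker and sum are illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Simple {
    uint sum;

    // Accepts an external call with two integers.
    function taker(uint _a, uint _b) public {
        sum = _a + _b;
    }
}
```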
Function parameters can be used as any other local variable and they can also be assigned to.
An external function cannot accept a multi-dimensional array as an input parameter. This functionality is possible if you enable the new experimental ABIEncoderV2 feature by adding pragma experimental ABIEncoderV2; to your source file.
An internal function can accept a multi-dimensional array without enabling the feature.
Return Variables
Function return variables are declared with the same syntax after the returns keyword.
For example, suppose you want to return two results: the sum and the product of two integers passed as function parameters, then you use something like:
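A sketch with named return variables, assuming compiler version 0.5.x; the names are illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Simple {
    function arithmetic(uint _a, uint _b)
        public
        pure
        returns (uint o_sum, uint o_product)
    {
        // Return variables start at their default value (0) and are assigned here.
        o_sum = _a + _b;
        o_product = _a * _b;
    }
}
```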
The names of return variables can be omitted. Return variables can be used as any other local variable and they are initialized with their default value and have that value unless explicitly set.
You can either explicitly assign to return variables and then leave the function using return;, or you can provide return values (either a single or multiple ones) directly with the return statement:
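The same computation written with a direct return statement (a sketch, assuming compiler version 0.5.x):

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Simple {
    function arithmetic(uint _a, uint _b)
        public
        pure
        returns (uint o_sum, uint o_product)
    {
        // Both return values are provided in a single statement.
        return (_a + _b, _a * _b);
    }
}
```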
This form is equivalent to first assigning values to the return variables and then using return; to leave the function.
You cannot return some types from non-internal functions, notably multi-dimensional dynamic arrays and structs. If you enable the new experimental ABIEncoderV2 feature by adding pragma experimental ABIEncoderV2; to your source file then more types are available, but mapping types are still limited to inside a single contract and you cannot transfer them.
Returning Multiple Values
When a function has multiple return types, the statement return (v0, v1, ..., vn) can be used to return multiple values. The number of components must be the same as the number of return types.
View Functions
Functions can be declared view, in which case they promise not to modify the state.
If the compiler’s EVM target is Byzantium or newer (default), the opcode STATICCALL is used for view functions, which enforces the state to stay unmodified as part of the EVM execution. For library view functions DELEGATECALL is used, because there is no combined DELEGATECALL and STATICCALL. This means library view functions do not have run-time checks that prevent state modifications. This should not impact security negatively because library code is usually known at compile-time and the static checker performs compile-time checks.
The following statements are considered modifying the state:
Writing to state variables.
Emitting events.
Using selfdestruct.
Sending Ether via calls.
Calling any function not marked view or pure.
Using low-level calls.
Using inline assembly that contains certain opcodes.
constant on functions used to be an alias to view, but this was dropped in version 0.5.0.
Getter methods are automatically marked view.
Prior to version 0.5.0, the compiler did not use the STATICCALL opcode for view functions. This enabled state modifications in view functions through the use of invalid explicit type conversions. By using STATICCALL for view functions, modifications to the state are prevented on the level of the EVM.
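A minimal sketch of a view function, assuming compiler version 0.5.x. It reads blockchain data through now, so it may be view but could not be pure.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    // Reads the block timestamp (`now`) but does not modify any state.
    function f(uint a, uint b) public view returns (uint) {
        return a * (b + 42) + now;
    }
}
```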
Pure Functions
Functions can be declared pure, in which case they promise not to read from or modify the state.
If the compiler’s EVM target is Byzantium or newer (default), the opcode STATICCALL is used, which does not guarantee that the state is not read, but at least that it is not modified.
In addition to the list of state modifying statements explained above, the following are considered reading from the state:
Reading from state variables.
Accessing address(this).balance or <address>.balance.
Accessing any of the members of block, tx, msg (with the exception of msg.sig and msg.data).
Calling any function not marked pure.
Using inline assembly that contains certain opcodes.
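A minimal sketch of a pure function, assuming compiler version 0.5.x. It depends only on its arguments and neither reads nor modifies the state.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    // Uses only its arguments: no state access, no block or msg members.
    function f(uint a, uint b) public pure returns (uint) {
        return a * (b + 42);
    }
}
```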
Pure functions are able to use the revert() and require() functions to revert potential state changes when an error occurs.
Reverting a state change is not considered a “state modification”, as only changes to the state made previously in code that did not have the view or pure restriction are reverted and that code has the option to catch the revert and not pass it on.
This behaviour is also in line with the STATICCALL opcode.
It is not possible to prevent functions from reading the state at the level of the EVM; it is only possible to prevent them from writing to the state (i.e. only view can be enforced at the EVM level, pure cannot).
Prior to version 0.5.0, the compiler did not use the STATICCALL opcode for pure functions. This enabled state modifications in pure functions through the use of invalid explicit type conversions. By using STATICCALL for pure functions, modifications to the state are prevented on the level of the EVM.
Prior to version 0.4.17 the compiler did not enforce that pure is not reading the state. It is a compile-time type check, which can be circumvented by doing invalid explicit conversions between contract types, because the compiler can verify that the type of the contract does not do state-changing operations, but it cannot check that the contract that will be called at runtime is actually of that type.
Fallback Function
A contract can have exactly one unnamed function. This function cannot have arguments, cannot return anything and has to have external visibility. It is executed on a call to the contract if none of the other functions match the given function identifier (or if no data was supplied at all).
Furthermore, this function is executed whenever the contract receives plain Ether (without data). Additionally, in order to receive Ether, the fallback function must be marked payable. If no such function exists, the contract cannot receive Ether through regular transactions.
In the worst case, the fallback function can only rely on 2300 gas being available (for example when send or transfer is used), leaving little room to perform other operations except basic logging. The following operations will consume more gas than the 2300 gas stipend:
Writing to storage
Creating a contract
Calling an external function which consumes a large amount of gas
Sending Ether
Like any function, the fallback function can execute complex operations as long as there is enough gas passed on to it.
Even though the fallback function cannot have arguments, one can still use msg.data to retrieve any payload supplied with the call.
The fallback function is also executed if the caller meant to call a function that is not available. If you want to implement the fallback function only to receive ether, you should add a check like require(msg.data.length == 0) to prevent invalid calls.
Contracts that receive Ether directly (without a function call, i.e. using send or transfer) but do not define a fallback function throw an exception, sending back the Ether (this was different before Solidity v0.4.0). So if you want your contract to receive Ether, you have to implement a payable fallback function.
A contract without a payable fallback function can receive Ether as a recipient of a coinbase transaction (aka miner block reward) or as a destination of a selfdestruct.
A contract cannot react to such Ether transfers and thus also cannot reject them. This is a design choice of the EVM and Solidity cannot work around it.
It also means that address(this).balance can be higher than the sum of some manual accounting implemented in a contract (i.e. having a counter updated in the fallback function).
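The behaviour of payable and non-payable fallback functions can be sketched as follows, assuming compiler version 0.5.x. The contract names Test, Sink and Caller and the signature nonExistingFunction() are illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Test {
    uint x;

    // Executed for every message sent to this contract, as there is no other function.
    // It is not payable, so sending Ether to this contract causes an exception.
    function() external { x = 1; }
}

// Keeps all Ether sent to it; there is no way to get it back.
contract Sink {
    function() external payable { }
}

contract Caller {
    function callTest(Test test) public returns (bool) {
        // No function matches this signature, so the fallback function runs and sets test.x to 1.
        (bool success,) = address(test).call(abi.encodeWithSignature("nonExistingFunction()"));
        require(success);

        // address(test) is not payable because Test has no payable fallback function,
        // so an explicit conversion is needed before send can be used at all.
        address payable testPayable = address(uint160(address(test)));

        // Sending Ether to Test fails, so send returns false here.
        return testPayable.send(2 ether);
    }
}
```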
Function Overloading
A contract can have multiple functions of the same name but with different parameter types. This process is called “overloading” and also applies to inherited functions. The following example shows overloading of the function f in the scope of contract A.
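A sketch of two overloads of f in contract A, assuming compiler version 0.5.x:

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract A {
    function f(uint _in) public pure returns (uint out) {
        out = _in;
    }

    // Same name, different parameter list.
    function f(uint _in, bool _really) public pure returns (uint out) {
        if (_really)
            out = _in;
    }
}
```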
Overloaded functions are also present in the external interface. It is an error if two externally visible functions differ by their Solidity types but not by their external types.
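A sketch of two such overloads of f, one taking a contract type B and one taking address, assuming compiler version 0.5.x. The second overload is left as a comment so that the block compiles; enabling it produces the error described above.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract B {
}

contract A {
    function f(B _in) public pure returns (B out) {
        out = _in;
    }

    // Enabling this overload makes compilation fail: in the external
    // interface, a parameter of contract type B is encoded as an address.
    // function f(address _in) public pure returns (address out) {
    //     out = _in;
    // }
}
```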
Both f function overloads above end up accepting the address type for the ABI although they are considered different inside Solidity.
Overload resolution and Argument matching
Overloaded functions are selected by matching the function declarations in the current scope to the arguments supplied in the function call. Functions are selected as overload candidates if all arguments can be implicitly converted to the expected types. If there is not exactly one candidate, resolution fails.
Return parameters are not taken into account for overload resolution.
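A sketch with overloads for uint8 and uint256, assuming compiler version 0.5.x. The helper g is hypothetical and only demonstrates a call that resolves unambiguously.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract A {
    function f(uint8 _in) public pure returns (uint8 out) {
        out = _in;
    }

    function f(uint256 _in) public pure returns (uint256 out) {
        out = _in;
    }

    function g() public pure returns (uint256) {
        // 256 does not fit into uint8, so only f(uint256) is a candidate.
        return f(256);
        // f(50) would not compile: 50 converts implicitly to both uint8 and uint256.
    }
}
```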
Calling f(50) would create a type error since 50 can be implicitly converted both to uint8 and uint256 types. On the other hand, f(256) would resolve to the f(uint256) overload as 256 cannot be implicitly converted to uint8.
Events
Solidity events give an abstraction on top of the EVM’s logging functionality. Applications can subscribe and listen to these events through the RPC interface of an Ethereum client.
Events are inheritable members of contracts. When you call them, they cause the arguments to be stored in the transaction’s log - a special data structure in the blockchain. These logs are associated with the address of the contract, are incorporated into the blockchain, and stay there as long as a block is accessible (forever as of the Frontier and Homestead releases, but this might change with Serenity). The log and its event data are not accessible from within contracts (not even from the contract that created them).
It is possible to request a simple payment verification (SPV) for logs, so if an external entity supplies a contract with such a verification, it can check that the log actually exists inside the blockchain. You have to supply block headers because the contract can only see the last 256 block hashes.
You can add the attribute indexed to up to three parameters, which adds them to a special data structure known as “topics” instead of the data part of the log. If you use arrays (including string and bytes) as indexed arguments, their Keccak-256 hash is stored as a topic instead, because a topic can only hold a single word (32 bytes).
All parameters without the indexed attribute are ABI-encoded into the data part of the log.
Topics allow you to search for events, for example when filtering a sequence of blocks for certain events. You can also filter events by the address of the contract that emitted the event.
For example, the code below uses the web3.js subscribe("logs") method to filter logs that match a topic with a certain address value:
The hash of the signature of the event is one of the topics, except if you declared the event with the anonymous specifier. This means that it is not possible to filter for specific anonymous events by name.
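On the Solidity side, declaring and emitting an event can be sketched as follows, assuming compiler version 0.5.x. The contract name ClientReceipt is illustrative; the event Deposit has two indexed parameters and one data parameter.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract ClientReceipt {
    // _from and _id are stored as topics, _value goes into the data part of the log.
    event Deposit(
        address indexed _from,
        bytes32 indexed _id,
        uint _value
    );

    function deposit(bytes32 _id) public payable {
        // Events are emitted using `emit`, followed by the event name and the arguments.
        // Any such invocation can be detected through the RPC interface by filtering for Deposit.
        emit Deposit(msg.sender, _id, msg.value);
    }
}
```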
The use in the JavaScript API is as follows:
The output of the above looks like the following (trimmed):
Low-Level Interface to Logs
It is also possible to access the low-level interface to the logging mechanism via the functions log0, log1, log2, log3 and log4. logi takes i + 1 parameters of type bytes32, where the first argument will be used for the data part of the log and the others as topics. The event call above can be performed in the same way with log3, passing keccak256("Deposit(address,bytes32,uint256)"), the hash of the signature of the event, as the first topic.
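A sketch of the equivalent log3 call, assuming compiler version 0.5.x. The first topic is computed with keccak256 directly instead of being written out as a hexadecimal literal, and the value of _id is an arbitrary placeholder.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract C {
    function f() public payable {
        uint256 _id = 0x420042; // arbitrary placeholder value
        log3(
            bytes32(msg.value),                             // data part of the log
            keccak256("Deposit(address,bytes32,uint256)"),  // topic: hash of the event signature
            bytes32(uint256(msg.sender)),                   // topic: _from
            bytes32(_id)                                    // topic: _id
        );
    }
}
```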
Inheritance
Solidity supports multiple inheritance including polymorphism.
All function calls are virtual, which means that the most derived function is called, except when the contract name is explicitly given or the super keyword is used.
When a contract inherits from other contracts, only a single contract is created on the blockchain, and the code from all the base contracts is compiled into the created contract.
The general inheritance system is very similar to Python’s, especially concerning multiple inheritance, but there are also some differences.
Details are given in the following example.
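A shortened sketch of such an inheritance hierarchy, assuming compiler version 0.5.x. The contracts owned and mortal follow the names used in this section; NameReg and named are hypothetical and only serve to show overriding and the explicit call mortal.kill().

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract owned {
    address payable owner;
    constructor() public { owner = msg.sender; }
}

// Use `is` to derive from another contract. Derived contracts can access all
// non-private members, including internal functions and state variables.
contract mortal is owned {
    function kill() public {
        if (msg.sender == owner) selfdestruct(owner);
    }
}

// Abstract contract, only used to make the interface known to the compiler.
// Hypothetical registry, not part of the text above.
contract NameReg {
    function register(bytes32 name) public;
    function unregister() public;
}

// Multiple inheritance is possible. `owned` is also a base of `mortal`,
// yet there is only a single instance of `owned`.
contract named is owned, mortal {
    NameReg registry;

    constructor(NameReg _registry, bytes32 name) public {
        registry = _registry;
        registry.register(name);
    }

    // Overrides mortal.kill: same name and same parameter types.
    function kill() public {
        if (msg.sender == owner) {
            registry.unregister();
            // A specific overridden function can still be called explicitly.
            mortal.kill();
        }
    }
}
```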
Note that above, we call mortal.kill() to “forward” the destruction request. The way this is done is problematic, as seen in the following example:
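A sketch of the problematic pattern, assuming compiler version 0.5.x. Both Base1 and Base2 forward to mortal.kill() by name.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract owned {
    address payable owner;
    constructor() public { owner = msg.sender; }
}

contract mortal is owned {
    function kill() public {
        if (msg.sender == owner) selfdestruct(owner);
    }
}

contract Base1 is mortal {
    function kill() public { /* do cleanup 1 */ mortal.kill(); }
}

contract Base2 is mortal {
    function kill() public { /* do cleanup 2 */ mortal.kill(); }
}

// Final.kill() resolves to Base2.kill, which jumps straight to mortal.kill
// and never runs the cleanup in Base1.kill.
contract Final is Base1, Base2 {
}
```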
A call to Final.kill() will call Base2.kill as the most derived override, but this function will bypass Base1.kill, basically because it does not even know about Base1. The way around this is to use super:
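The same hierarchy with super instead of the explicit contract name (a sketch, assuming compiler version 0.5.x):

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract owned {
    address payable owner;
    constructor() public { owner = msg.sender; }
}

contract mortal is owned {
    function kill() public {
        if (msg.sender == owner) selfdestruct(owner);
    }
}

contract Base1 is mortal {
    function kill() public { /* do cleanup 1 */ super.kill(); }
}

contract Base2 is mortal {
    function kill() public { /* do cleanup 2 */ super.kill(); }
}

// Final.kill() now runs Base2.kill, then Base1.kill, then mortal.kill.
contract Final is Base1, Base2 {
}
```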
If Base2 calls a function of super, it does not simply call this function on one of its base contracts. Rather, it calls this function on the next base contract in the final inheritance graph, so it will call Base1.kill() (note that the final inheritance sequence is, starting with the most derived contract: Final, Base2, Base1, mortal, owned). The actual function that is called when using super is not known in the context of the class where it is used, although its type is known. This is similar for ordinary virtual method lookup.
Constructors
A constructor is an optional function declared with the constructor keyword which is executed upon contract creation, and where you can run contract initialisation code.
Before the constructor code is executed, state variables are initialised to their specified value if you initialise them inline, or zero if you do not.
After the constructor has run, the final code of the contract is deployed to the blockchain. The deployment of the code costs additional gas linear to the length of the code. This code includes all functions that are part of the public interface and all functions that are reachable from there through function calls. It does not include the constructor code or internal functions that are only called from the constructor.
Constructor functions can be either public or internal. If there is no constructor, the contract will assume the default constructor, which is equivalent to constructor() public {}. For example:
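A sketch of a public and an internal constructor, assuming compiler version 0.5.x:

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract A {
    uint public a;

    // Internal constructor: A cannot be deployed on its own.
    constructor(uint _a) internal {
        a = _a;
    }
}

contract B is A(1) {
    // Public constructor: B can be deployed.
    constructor() public {}
}
```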
A constructor set as internal causes the contract to be marked as abstract.
Prior to version 0.4.22, constructors were defined as functions with the same name as the contract. This syntax was deprecated and is not allowed anymore in version 0.5.0.
Arguments for Base Constructors
The constructors of all the base contracts will be called following the linearization rules explained below. If the base constructors have arguments, derived contracts need to specify all of them. This can be done in two ways:
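Both ways can be sketched as follows, assuming compiler version 0.5.x. The names Derived1 and Derived2 are illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Base {
    uint x;
    constructor(uint _x) public { x = _x; }
}

// Argument given directly in the inheritance list.
contract Derived1 is Base(7) {
    constructor() public {}
}

// Argument given modifier-style in the derived constructor.
contract Derived2 is Base {
    constructor(uint _y) Base(_y * _y) public {}
}
```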
One way is directly in the inheritance list (is Base(7)). The other is in the way a modifier is invoked as part of the derived constructor (Base(_y * _y)). The first way to do it is more convenient if the constructor argument is a constant and defines the behaviour of the contract or describes it. The second way has to be used if the constructor arguments of the base depend on those of the derived contract. Arguments have to be given either in the inheritance list or in modifier-style in the derived constructor. Specifying arguments in both places is an error.
If a derived contract does not specify the arguments to all of its base contracts’ constructors, it will be abstract.
Multiple Inheritance and Linearization
Languages that allow multiple inheritance have to deal with several problems. One is the Diamond Problem. Solidity is similar to Python in that it uses “C3 Linearization” to force a specific order in the directed acyclic graph (DAG) of base classes. This results in the desirable property of monotonicity but disallows some inheritance graphs. In particular, the order in which the base classes are given in the is directive is important: You have to list the direct base contracts in the order from “most base-like” to “most derived”. Note that this order is the reverse of the one used in Python.
Another simplifying way to explain this is that when a function is called that is defined multiple times in different contracts, the given bases are searched from right to left (left to right in Python) in a depth-first manner, stopping at the first match. If a base contract has already been searched, it is skipped.
In the following code, Solidity will give the error “Linearization of inheritance graph impossible”.
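A sketch of the failing inheritance list, assuming compiler version 0.5.x. The failing declaration is left as a comment so that the block compiles, and the accepted order is shown next to it.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract X {}
contract A is X {}

// Does not compile: "Linearization of inheritance graph impossible".
// contract C is A, X {}

// Accepted: bases listed from "most base-like" to "most derived".
contract C is X, A {}
```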
The reason for this is that C requests X to override A (by specifying A, X in this order), but A itself requests to override X, which is a contradiction that cannot be resolved.
Inheriting Different Kinds of Members of the Same Name
When the inheritance results in a contract with a function and a modifier of the same name, it is considered as an error. This error is produced also by an event and a modifier of the same name, and a function and an event of the same name. As an exception, a state variable getter can override a public function.
Abstract Contracts
Contracts are marked as abstract when at least one of their functions lacks an implementation as in the following example (note that the function declaration header is terminated by ;):
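A minimal sketch of an abstract contract, assuming compiler version 0.5.x; the name Feline is illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Feline {
    // No body: the declaration ends with `;`, which makes Feline abstract.
    function utterance() public returns (bytes32);
}
```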
Such contracts cannot be compiled (even if they contain implemented functions alongside non-implemented functions), but they can be used as base contracts:
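Using an abstract contract as a base can be sketched as follows, assuming compiler version 0.5.x; the names Feline and Cat are illustrative.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Feline {
    function utterance() public returns (bytes32);
}

// Cat implements every unimplemented function, so it is not abstract.
contract Cat is Feline {
    function utterance() public returns (bytes32) { return "miaow"; }
}
```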
If a contract inherits from an abstract contract and does not implement all non-implemented functions by overriding, it will itself be abstract.
Note that a function without implementation is different from a Function Type even though their syntax looks very similar.
Example of function without implementation (a function declaration):
Example of a Function Type (a variable declaration, where the variable is of type function):
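The two forms side by side, as a sketch assuming compiler version 0.5.x. The names Example, foo and bar are hypothetical; they differ only so that both declarations can live in one contract.

```solidity
pragma solidity >=0.5.0 <0.6.0;

contract Example {
    // Function declaration without implementation (this makes Example abstract).
    function foo(address) external returns (address);

    // State variable named bar whose type is an external function type.
    function(address) external returns (address) bar;
}
```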
Abstract contracts decouple the definition of a contract from its implementation providing better extensibility and self-documentation and facilitating patterns like the Template method and removing code duplication. Abstract contracts are useful in the same way that defining methods in an interface is useful. It is a way for the designer of the abstract contract to say “any child of mine must implement this method”.
Interfaces
Interfaces are similar to abstract contracts, but they cannot have any functions implemented. There are further restrictions:
They cannot inherit other contracts or interfaces.
All declared functions must be external.
They cannot declare a constructor.
They cannot declare state variables.
Some of these restrictions might be lifted in the future.
Interfaces are basically limited to what the Contract ABI can represent, and the conversion between the ABI and an interface should be possible without any information loss.
Interfaces are denoted by their own keyword:
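A sketch of an interface, assuming compiler version 0.5.x. The interface Token with its types TokenType and Coin follows the names used in this section; the contracts MyToken and Wallet are hypothetical.

```solidity
pragma solidity >=0.5.0 <0.6.0;

interface Token {
    enum TokenType { Fungible, NonFungible }
    struct Coin { string obverse; string reverse; }
    function transfer(address recipient, uint amount) external;
}

// A contract inherits an interface like any other contract.
contract MyToken is Token {
    function transfer(address, uint) external {
        // Implementation omitted in this sketch.
    }
}

// Types defined inside the interface are reachable from unrelated contracts.
contract Wallet {
    Token.TokenType public kind = Token.TokenType.Fungible;
}
```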
Contracts can inherit interfaces as they would inherit other contracts.
Types defined inside interfaces and other contract-like structures can be accessed from other contracts: Token.TokenType or Token.Coin.
Libraries
Libraries are similar to contracts, but their purpose is that they are deployed only once at a specific address and their code is reused using the DELEGATECALL (CALLCODE until Homestead) feature of the EVM. This means that if library functions are called, their code is executed in the context of the calling contract, i.e. this points to the calling contract, and especially the storage from the calling contract can be accessed. As a library is an isolated piece of source code, it can only access state variables of the calling contract if they are explicitly supplied (it would have no way to name them, otherwise). Library functions can only be called directly (i.e. without the use of DELEGATECALL) if they do not modify the state (i.e. if they are view or pure functions), because libraries are assumed to be stateless. In particular, it is not possible to destroy a library.
Until version 0.4.20, it was possible to destroy libraries by circumventing Solidity’s type system. Starting from that version, libraries contain a mechanism that disallows state-modifying functions to be called directly (i.e. without DELEGATECALL).
Libraries can be seen as implicit base contracts of the contracts that use them. They will not be explicitly visible in the inheritance hierarchy, but calls to library functions look just like calls to functions of explicit base contracts (L.f() if L is the name of the library). Furthermore, internal functions of libraries are visible in all contracts, just as if the library were a base contract. Of course, calls to internal functions use the internal calling convention, which means that all internal types can be passed and types stored in memory will be passed by reference and not copied. To realize this in the EVM, code of internal library functions and all functions called from therein will at compile time be pulled into the calling contract, and a regular JUMP call will be used instead of a DELEGATECALL.
The following example illustrates how to use libraries (but be sure to check out Using For for a more advanced example of implementing a set).
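A sketch of a set library and a contract using it, assuming compiler version 0.5.x. The library functions take a storage reference to the struct as their first parameter, named self by convention.

```solidity
pragma solidity >=0.5.0 <0.6.0;

library Set {
    // The calling contract holds the data; the library only defines its layout.
    struct Data { mapping(uint => bool) flags; }

    // `self` is a storage reference, so only its storage address is passed.
    function insert(Data storage self, uint value) public returns (bool) {
        if (self.flags[value])
            return false; // already there
        self.flags[value] = true;
        return true;
    }

    function remove(Data storage self, uint value) public returns (bool) {
        if (!self.flags[value])
            return false; // not there
        self.flags[value] = false;
        return true;
    }

    function contains(Data storage self, uint value) public view returns (bool) {
        return self.flags[value];
    }
}

contract C {
    Set.Data knownValues;

    function register(uint value) public {
        // Library functions can be called without a specific instance,
        // since the "instance" is the current contract.
        require(Set.insert(knownValues, value));
    }
}
```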
Of course, you do not have to follow this way to use libraries: they can also be used without defining struct data types. Functions also work without any storage reference parameters, and they can have multiple storage reference parameters and in any position.
The calls to Set.contains, Set.insert and Set.remove are all compiled as calls (DELEGATECALL) to an external contract/library. If you use libraries, be aware that an actual external function call is performed. msg.sender, msg.value and this will retain their values in this call, though (prior to Homestead, because of the use of CALLCODE, msg.sender and msg.value changed, though).
The following example shows how to use types stored in memory and internal functions in libraries in order to implement custom types without the overhead of external function calls:
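A small sketch of this idea, assuming compiler version 0.5.x. The two-component vector type Vec2 is hypothetical and chosen only for brevity; because its functions are internal, their code is pulled into the calling contract and no external call takes place.

```solidity
pragma solidity >=0.5.0 <0.6.0;

// Hypothetical custom type with two components.
library Vec2 {
    struct Vec { uint x; uint y; }

    function make(uint x, uint y) internal pure returns (Vec memory v) {
        v.x = x;
        v.y = y;
    }

    // Memory structs are passed by reference to internal functions, not copied.
    function add(Vec memory a, Vec memory b) internal pure returns (Vec memory r) {
        r.x = a.x + b.x;
        r.y = a.y + b.y;
    }
}

contract C {
    function f() public pure returns (uint, uint) {
        Vec2.Vec memory a = Vec2.make(1, 2);
        Vec2.Vec memory b = Vec2.make(3, 4);
        Vec2.Vec memory sum = Vec2.add(a, b);
        return (sum.x, sum.y);
    }
}
```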
As the compiler cannot know where the library will be deployed, these addresses have to be filled into the final bytecode by a linker (see Using the Commandline Compiler for how to use the commandline compiler for linking). If the addresses are not given as arguments to the compiler, the compiled hex code will contain placeholders of the form __Set______ (where Set is the name of the library). The address can be filled manually by replacing all those 40 symbols by the hex encoding of the address of the library contract.
Manually linking libraries on the generated bytecode is discouraged, because it is restricted to 36 characters. You should ask the compiler to link the libraries at the time a contract is compiled by either using the --libraries option of solc or the libraries key if you use the standard-JSON interface to the compiler.
Restrictions for libraries in comparison to contracts:
No state variables
Cannot inherit nor be inherited
Cannot receive Ether
(These might be lifted at a later point.)
Call Protection For Libraries
As mentioned in the introduction, if a library’s code is executed using a CALL instead of a DELEGATECALL or CALLCODE, it will revert unless a view or pure function is called.
The EVM does not provide a direct way for a contract to detect whether it was called using CALL or not, but a contract can use the ADDRESS opcode to find out “where” it is currently running. The generated code compares this address to the address used at construction time to determine the mode of calling.
More specifically, the runtime code of a library always starts with a push instruction, which is a zero of 20 bytes at compilation time. When the deploy code runs, this constant is replaced in memory by the current address and this modified code is stored in the contract. At runtime, this causes the deploy time address to be the first constant to be pushed onto the stack and the dispatcher code compares the current address against this constant for any non-view and non-pure function.
Using For
The directive using A for B; can be used to attach library functions (from the library A) to any type (B). These functions will receive the object they are called on as their first parameter (like the self variable in Python).
The effect of using A for *; is that the functions from the library A are attached to any type.
In both situations, all functions in the library are attached, even those where the type of the first parameter does not match the type of the object. The type is checked at the point the function is called and function overload resolution is performed.
The using A for B; directive is active only within the current contract, including within all of its functions, and has no effect outside of the contract in which it is used. The directive may only be used inside a contract, not inside any of its functions.
By including a library, its data types including library functions are available without having to add further code.
Let us rewrite the set example from the Libraries section in this way:
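A sketch of the set library used through using for, assuming compiler version 0.5.x. The library itself is unchanged; only the contract C differs.

```solidity
pragma solidity >=0.5.0 <0.6.0;

library Set {
    struct Data { mapping(uint => bool) flags; }

    function insert(Data storage self, uint value) public returns (bool) {
        if (self.flags[value])
            return false; // already there
        self.flags[value] = true;
        return true;
    }

    function remove(Data storage self, uint value) public returns (bool) {
        if (!self.flags[value])
            return false; // not there
        self.flags[value] = false;
        return true;
    }

    function contains(Data storage self, uint value) public view returns (bool) {
        return self.flags[value];
    }
}

contract C {
    using Set for Set.Data; // this is the crucial change
    Set.Data knownValues;

    function register(uint value) public {
        // All variables of type Set.Data now have the library functions as members.
        // This call is identical to Set.insert(knownValues, value).
        require(knownValues.insert(value));
    }
}
```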
It is also possible to extend elementary types in that way:
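A sketch that attaches a search function to uint[], assuming compiler version 0.5.x. The library name Search and the sentinel value uint(-1) for "not found" are choices made for this illustration.

```solidity
pragma solidity >=0.5.0 <0.6.0;

library Search {
    // Returns the index of value, or uint(-1) if it is not present.
    function indexOf(uint[] storage self, uint value) public view returns (uint) {
        for (uint i = 0; i < self.length; i++)
            if (self[i] == value) return i;
        return uint(-1);
    }
}

contract C {
    using Search for uint[];
    uint[] data;

    function append(uint value) public {
        data.push(value);
    }

    function replace(uint _old, uint _new) public {
        // This performs the library function call.
        uint index = data.indexOf(_old);
        if (index == uint(-1))
            data.push(_new);
        else
            data[index] = _new;
    }
}
```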
Note that all library calls are actual EVM function calls. This means that if you pass memory or value types, a copy will be performed, even of the self variable. The only situation where no copy will be performed is when storage reference variables are used.