Side Chain Storage
Motivation
This proposal describes a Side Chain approach to storing and validating the collected data. The motivation is to increase confidence and trust in the system through external audits and by assuring the validity and immutability of the data. It also offers the opportunity to reduce the cost of token transactions.
In essence, this solution increases the value of the project by:
Increasing confidence. A new figure called the validator will be in charge of confirming that the collected data is correct and that no negligence has been committed.
Assuring immutability. The solution implies the implementation of a Merkle tree (a cryptographic data structure) that assures the immutability of already-validated data.
Reducing cost. Transactions over a private Side Chain are cheaper and faster than transactions over the public Main Chain, which opens up a whole set of cost-reduction and availability-improvement opportunities.
Proposal.
The proposal is structured in three different layers:
Physical layer. This is the tangible layer; it contains the entities that provide data to the system (vinduinos, the weather station and workers). It also holds the administrator/owner of the system.
Side Chain. This is the transactional and storage layer. It includes the smart contracts and two physical-world entities, the executor and the validator, which are treated further on in this proposal.
Main Chain. This is the trusted security layer. It maintains the real value of the tokens and the validity logs (hashes) that assure the content of the Side Chain hasn't been altered.
Physical Layer.
This layer is in charge of introducing data into the system. Every entity in this layer except the owner must be treated as a data source.
Vinduinos. Vinduinos must send the collected information directly to the Side Chain at a given frequency (this frequency will be discussed later on). Each vinduino must have its own Side Chain key pair so that it transacts under its own identity, allowing the contracts to classify the information correctly.
Weather station. It must send information to the Side Chain at a given frequency (this frequency will be discussed later on). The weather station must own a Side Chain key pair to allow its identification as a data source.
Workers. Workers have to send their task logs to the Side Chain via software (not yet specified) capable of holding wallets and of transacting with the blockchain. This software will interact with the Side Chain using the stored wallet, permitting worker identification in each transaction.
Owner. This is a special user in charge of administrating the Side Chain by:
A. Selecting validators (there might be an auto-selection method to avoid any negligence).
B. Executing Merkle root updates (and Main Chain data copies).
C. Executing Side Chain to Main Chain transfers (MTB18 tokens).
*) Both B and C are further described in the Side Chain Layer and Main Chain Layer sections.
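As an illustration of how a data source (vinduino, weather station or worker software) could authenticate its transactions, here is a minimal Python sketch. It is a stand-in only: a deployed system would sign with each device's Side Chain (Ethereum) private key, whereas this sketch uses an HMAC over a per-device secret, and all device ids and secrets shown are assumptions for illustration.

```python
import hashlib
import hmac
import json

# Stand-in for each device's Side Chain key pair: a per-device secret used
# with HMAC. A real deployment would sign with the device's Ethereum private
# key instead. Device ids and secrets here are illustrative only.
DEVICE_SECRETS = {"vinduino-01": b"demo-secret"}

def signed_reading(device_id: str, payload: dict, timestamp: int) -> dict:
    """Serialize a reading deterministically and attach the device's signature."""
    body = json.dumps({"device": device_id, "ts": timestamp, "data": payload},
                      sort_keys=True)
    sig = hmac.new(DEVICE_SECRETS[device_id], body.encode(),
                   hashlib.sha256).hexdigest()
    return {"body": body, "signature": sig}

def verify(message: dict) -> bool:
    """Recompute the signature from the body and compare in constant time."""
    device_id = json.loads(message["body"])["device"]
    expected = hmac.new(DEVICE_SECRETS[device_id], message["body"].encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, message["signature"])
```

A receiving contract (or gateway) can then reject any transaction whose signature does not match the registered identity, which is what allows the contracts to classify information per data source.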
Side Chain Layer.
This layer is responsible for validating data, storing it and offering a reduced-cost transaction option.
Validators. Validators are people in charge of confirming that the collected data is correct. They have no relation with the vineyard; they take part in validation because they receive a reward for doing it (similar to what happens with miners). The reward has yet to be determined, but it could be a quantity of MTB18* tokens based on the amount of work done (that is why the validators contract keeps a validation count for each validator).
Executor. The executor is responsible for transmitting data from the Side Chain Layer to the Main Chain Layer and for executing the Merkle root updates.
Contracts. There are 3 subgroups of smart contracts. (Important: this is just an abstract idea, not a full design of a solution. The contracts are not fully specified; only their structure has been taken into account.)
A. Data storage. These smart contracts are used to store the information. They all share a similar structure containing: a timestamp (time of the transaction), some information (the collected data) and a hash generated from the concatenation of the timestamp and the information.
B. Validity and immutability. This package contains three different contracts.
Interactors. Contains a description of each type of valid user. This contract will allow us to control permissions over the Side Chain and will also make it possible to offer users a brief interactor description and its identifier.
Validation. It contains a timestamp, a validation count and a list of the addresses that have already validated that transaction. This contract controls data validity through an external audit performed by the validators. Every X (time to be determined), validators are supposed to check that the collected data is correct and transact against this contract to communicate that everything is working OK. A validation at a given timestamp instantly validates all previous validation entries, so that: "if V1 validates at 13.59 and V2 validates at 18.00, the validation at 13.59 will have been done by V1 and V2, whereas the validation at 18.00 will have been done only by V2". Valid data (data storage contracts) is data whose timestamp is lower than that of the last stored validation carrying more than 50% of the validators' signatures (this percentage may be changed).
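A minimal Python sketch of this validation rule, using the 50% threshold stated above (function names and data shapes are assumptions, not the contract's actual interface):

```python
def validators_for(validations, entry_ts):
    """Validators backing the entry at entry_ts: per the rule above, anyone
    who validated at entry_ts or later implicitly signs this entry too."""
    return {v for (t, v) in validations if t >= entry_ts}

def latest_confirmed(validations, n_validators, threshold=0.5):
    """Latest validation timestamp backed by more than `threshold` of all validators."""
    confirmed = [t for (t, _) in validations
                 if len(validators_for(validations, t)) > threshold * n_validators]
    return max(confirmed, default=None)

def is_valid(record_ts, validations, n_validators):
    """A data record is valid if its timestamp is lower than the latest
    sufficiently signed validation entry."""
    cutoff = latest_confirmed(validations, n_validators)
    return cutoff is not None and record_ts < cutoff
```

With two validators, V1 validating at 13.59 and V2 at 18.00, the 13.59 entry carries both signatures while the 18.00 entry carries only V2's, so only records older than 13.59 count as validated.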
MerkleTree. Every X time the executor will execute this contract to calculate a Merkle root from the following values:
- The previous Merkle root (1 leaf).
- The newly appended and validated information (n leaves).
This returns a hash representing the current state of the information. If some historical values are changed and the Merkle root is recalculated, the hash will be different and the negligence will be detected. As this contract saves the historical values of the Merkle root together with their timestamps (the state at each point in time), in case of negligence it is possible, by recalculating the Merkle roots backwards, to discover at which point the negligence was committed, which prevents having to delete all the information.
(merkle-tree.py) Merkle tree example in Python.
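In the same spirit as the referenced merkle-tree.py, a minimal sketch of the root-update step could look like the following (the hashing and odd-leaf conventions are assumptions, not necessarily those of the contract):

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash each leaf, then fold the level pairwise until one root remains.
    An odd node is carried up unchanged (one common convention)."""
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(level[i] + level[i + 1]))
            else:
                nxt.append(level[i])
        level = nxt
    return level[0]

def updated_root(prev_root: bytes, new_entries: list[bytes]) -> bytes:
    """The previous root as the first leaf plus the n newly validated
    entries, as described above."""
    return merkle_root([prev_root] + new_entries)
```

Changing any historical entry and recomputing yields a different root, which is how tampering would be detected.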
C. Token MTB18. This is going to be explained in the Main Chain Layer point.
Main Chain Layer.
This layer is meant to live on the Ethereum Mainchain. Transaction costs on this network are higher than on the Side Chain, where they are negligible. However, this is the layer that holds the value of the token and, in fact, the one with the highest confidence. Consequently, a copy of the Side Chain's Merkle root history will be stored on this chain to increase immutability even further. This also opens another course of action: duplicating this data, once it is validated in the Side Chain, into an external database to protect the system in case the Side Chain fails.
MTB18. The token already implemented.
MerkleCheck. The historical copy of data from the MerkleTreeCheck contract located in the Side Chain.
LockMTB18. This contract locks MTB18 tokens in order to transfer them to the Side Chain. Whenever a user wants to move them back to the Main Chain, they just have to unlock them by burning the MTB18 on the Side Chain. For example:
"Let's suppose we are 4 friends (F1, F2, F3, F4) who bought 6 MTB18 each, and:
Action | (MC) MTB18* | (MC) LOCKMTB18* | (SC) MTB18* |
---|---|---|---|
F1 locks 6 MTB18 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 6, MTB18[F4] = 6 | MTB18Lock = 6 | MTB18[F1] = 6, MTB18[F2] = 0, MTB18[F3] = 0, MTB18[F4] = 0 |
F1 transfers 3 MTB18 to F2 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 6, MTB18[F4] = 6 | MTB18Lock = 6 | MTB18[F1] = 0, MTB18[F2] = 3, MTB18[F3] = 0, MTB18[F4] = 0 |
F1 transfers 2 MTB18 to F3 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 6, MTB18[F4] = 6 | MTB18Lock = 6 | MTB18[F1] = 0, MTB18[F2] = 3, MTB18[F3] = 2, MTB18[F4] = 0 |
F1 transfers 1 MTB18 to F4 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 6, MTB18[F4] = 6 | MTB18Lock = 6 | MTB18[F1] = 0, MTB18[F2] = 3, MTB18[F3] = 2, MTB18[F4] = 1 |
F4 transfers 1 MTB18 to F3 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 6, MTB18[F4] = 6 | MTB18Lock = 6 | MTB18[F1] = 0, MTB18[F2] = 3, MTB18[F3] = 3, MTB18[F4] = 0 |
F2 transfers 3 MTB18 to F3 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 6, MTB18[F4] = 6 | MTB18Lock = 6 | MTB18[F1] = 0, MTB18[F2] = 0, MTB18[F3] = 6, MTB18[F4] = 0 |
F3 unlocks 6 MTB18 | MTB18[F1] = 0, MTB18[F2] = 6, MTB18[F3] = 12, MTB18[F4] = 6 | MTB18Lock = 0 | MTB18[F1] = 0, MTB18[F2] = 0, MTB18[F3] = 0, MTB18[F4] = 0 |
If this were done on the Main Chain, 5 (expensive) transactions would be needed. Through the Side Chain, only 2 Main Chain transactions are needed: the lock and the unlock."
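The whole flow can be simulated with a toy Python model (class and method names are illustrative, not the contracts' actual interfaces):

```python
class TwoChainToken:
    """Toy model of the lock/unlock flow above. Locking on the Main Chain
    makes the same amount spendable on the Side Chain; unlocking burns the
    Side Chain balance and releases the locked Main Chain tokens."""

    def __init__(self, main_balances: dict):
        self.main = dict(main_balances)              # (MC) MTB18 balances
        self.locked = 0                              # (MC) LockMTB18 total
        self.side = {k: 0 for k in main_balances}    # (SC) MTB18 balances

    def lock(self, user: str, amount: int) -> None:
        assert self.main[user] >= amount
        self.main[user] -= amount
        self.locked += amount
        self.side[user] += amount

    def side_transfer(self, src: str, dst: str, amount: int) -> None:
        # Cheap Side Chain transaction; Main Chain balances stay untouched.
        assert self.side[src] >= amount
        self.side[src] -= amount
        self.side[dst] += amount

    def unlock(self, user: str, amount: int) -> None:
        # Burn on the Side Chain, release on the Main Chain.
        assert self.side[user] >= amount and self.locked >= amount
        self.side[user] -= amount
        self.locked -= amount
        self.main[user] += amount
```

Replaying the table's sequence (one lock, five Side Chain transfers, one unlock) ends with F3 holding 12 MTB18 on the Main Chain and the lock emptied, matching the final row of the table.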
Previous proposal
Currently, the data received from vinduinos and sensor stations is stored in a DB service. Vinduinos and sensor stations send the collected data to the Bravo node (10.112.48.22), where the DB is located (PostgreSQL 8.3.x server). This database is used by the website (costaflores.com) to display the data for users to check. The structure is similar to the one shown in the following figure:
With this proposal we want to change this structure, replacing the database with a Side Chain (Ethereum private network) solution. This proposal integrates the database part of the project into the blockchain and, thanks to that, the project will be based on a full-blockchain backend (not really common). Having a full-blockchain backend contributes greatly to transparency and immutability. As OpenVino is an open vineyard, transparency is crucial. The new architecture will follow this structure (an extended description of the internal Side Chain design can be found in the 'Blockchain storage proposal'):
Apart from that, this migration will also allow the future introduction of the validator figure explained in the initial 'Blockchain storage proposal'. Adding this figure will increase the value of the project by giving the opportunity to:
Prove that the data collected and stored is correct and that no negligence is committed during the sensor-to-Bravo communication.
Check that the stored data has no outliers and that everything works correctly during the grape-growing process.
Add a new feature allowing users to interact with the vineyard via an App that helps them improve their wine and blockchain knowledge.
This initial proposal only includes the part of storing the data in the blockchain. This decision was made because, after a first approach to the problem in the 'Blockchain storage proposal', we realised there may be several problems in using a blockchain as a database: it is not as optimised as an SQL DB, which could lead to slower response times, which is undesirable. So in this first step of the Side Chain development we will implement the data storage part of the Side Chain solution, and load tests will be executed to check whether the proposed system's response time is acceptable. If it is not, a second iteration will be done, adding a DB to store utility indexes in order to reduce the response time of queries and searches, together with other optimisations.
For this project to be carried out, a private Ethereum network and a database server (if the second iteration is reached) must be provided. The Raspberry Pi in charge of inserting data into the blockchain will have to own a private/public key pair. We will also need a clear specification of the data extracted by the sensors and of its treatment.
Once the first part of the proposal is finished, a blockchain database system will have been deployed, and documentation for its integration with the sensors/vinduinos and the website will be provided. Once finished and accepted, the validation part of the Side Chain proposal will begin (defined as a new project). This part might alter the first smart contracts delivered, but a full migration will be included and previous functionality won't change.
Development of this Side Chain data storage feature starts today, and the requirements are planned to be defined by tomorrow. Communication between team members might be necessary to establish quality and performance requirements.
Requirements
Now that the project has been defined, we will carry on by specifying the requirements. Some of these requirements may seem neither functional nor systemic but, from our point of view, determining the data and providing documentation for the future integration is a real need for this part of the project. These requirements are classified as Integration Requirements.
Can I write data from sensor in the vineyard (and winery) to a blockchain, such that:
The data cannot be changed in the future
Anyone can lookup the data
The data store can be distributed (decentralized), such that no single source-of-authority exists, beyond the initial sensor publishing the data
The sensor itself can sign the data it creates with a private key unknown to anyone else.
Camera images
blockchain nodes can be distributed physically around the world, but only need to be “plugged in”.
Data lookup does not have to be “real time”, but can be read by a database and indexed, hashed, etc.
We want to show that organic certification data can be placed (automatically) onto the blockchain, sensor and image data, accounting data, and read and presented online.
Show that we are organic, trustworthy, true. Compare with existing OIA forms.
sensor data
accounting data
work data
images
For each of these, prove that data is:
written directly to the blockchain
read and processed by a more sophisticated DB
decentralized by other participants (wine shops and CGV participants)
validated by customers and game players.
Steps for building this:
Construct a single-node blockchain
Write data to the blockchain from a sensor, sale, ticket, or camera.
Encrypt what is written to the blockchain from the sensor, transaction, ticket creator, camera
Read this data into Openbravo
Present this data to the world
Decentralize the blockchain to multiple local nodes
Decentralize the blockchain to many non-participant nodes
Make the nodes participants (if necessary) - validation model
wine store or CGV sensor data is there
lookups of Costaflores data
Purchases made from shop
game review of work steps
views of images
Req | P | Definition | Owner | Notes
---|---|---|---|---
1.4.1 | 1 | An Ethereum Private Network must be opened on one of the servers. | | The server is still to be decided and the specifications of this network have to be agreed by consensus.
1.4.2 | 1 | A full node must be initialized and connected to the Private Network. | |
1.4.3 | 3 | An automatic method must activate/deactivate full-node mining depending on whether there are transactions waiting to be mined or not. | |
1.4.4 | 3 | A soft node has to be initialized on the Raspberry Pi and connected to the Private Network to enable sensor/vinduino communication. | |
1.4.5 | 1 | Version and release control of the different development tools must be carried out. | | Due to the short lifetime of this technology, the versions of the tools used are relevant.
1.4.6 | 1 | The implementation must be deployed to the previously created private network. | |
1.4.7 | 1 | All the relevant data must be storable in the Side Chain with the determined frequency/granularity. | | *relevant data = data we want to show.
1.4.8 | 1 | All the relevant data must be readable from the Side Chain with the determined frequency/granularity. | | Store granularity might not be the same as read granularity.
1.4.9 | 1 | Only authorised people must be able to store data in the smart contracts. | | Identification of authorised public addresses.
1.4.10 | 1 | A way to recover previous blockchain states in case of error must be implemented. | | Only available to the admin.
1.4.11 | 1 | Stored data must be immutable; no method may change it apart from the recovery one. | | Just insert and delete methods!
1.4.12 | 2 | Written data must contain a timestamp for the future validation feature. | | Timestamps are necessary for validations to be done; they might be block based.
1.4.13 | 1 | Data contracts must contain a hash method for the future validation feature. | |
1.4.14 | 1 | The deployed contracts must be tested to check whether the desired query performance is achieved. | | Mocha testing, for functional and performance requirements.
Integration Requirements
Req | P | Definition | Owner | Notes
---|---|---|---|---
1.4.15 | 1 | The concrete data we are extracting must be specified and documentation about it must be generated. | | Something like metadata.
1.4.16 | 1 | Data granularity (time based) must be justified and made concrete. | |
1.4.17 | 1 | The data needed by the website and the app must be specified. | | Important: no changes assured after this specification.
1.4.18 | 1 | The final UML class diagram of the data has to be generated. | | This will be added to the final documentation.
1.4.19 | 2 | Optimisations must be at least documented if test validation is accomplished without them, and implemented if it is not. | | Assure scalability.
1.4.20 | 1 | The design must be implemented and uploaded to a repository. | | Using the GitFlow procedure.
1.4.21 | 2 | Documentation of the contracts and the network must be generated, both for the website developers to integrate it and for the implementation of the scripts that allow the sensors to transfer data to the Side Chain. | |
|
Webservice Requirements
Req | P | Definition | Owner | Notes
---|---|---|---|---
1.4.22 | 1 | Insert records in the "Daily Log" window: create a header and add data into the "Environment Info" tab. Also, insert data into existing records. | | Header: Date. Environment Info: Temperature (max and min) – Subsoil humidity – Solar irradiance.
1.4.23 | 1 | List the environment information and the tasks performed on a given day. | |
1.4.24 | 1 | List the environment information and the tasks performed on every day of a given month. | |
1.4.25 | 1 | List the average temperature (min and max), humidity and solar radiation of each month in a given year. | |