Streamr brings U.S. stocks to Ethereum blockchain

What are the future implications of massively distributed, decentralised databases enabled by blockchains? Will it some day be possible to have a significant share of new data generated in the world – data from IoT, social media, stock exchanges and so on – pass through a blockchain-powered system for superior persistence, security, and peer-to-peer distribution? Data could be easily available, transparent, proven, and not controlled by giant corporations.

While present-day blockchains aren’t nearly scalable enough to ingest big data, we at Streamr are working on ways to make the above vision happen sooner rather than later. Stay tuned for more information about that in the near future. Meanwhile, the concept of real-time data flowing into a decentralised system can already be easily demonstrated with the existing Streamr platform and its built-in Ethereum integration.

For this week’s demo, we decided to inject the U.S. stock market into the blockchain. Well, not the whole market just yet: we’ll start with the stocks in the S&P 500 index. For now, writing data directly into the blockchain, combined with current blockchain scalability, means that the idea can only be explored on a miniature scale, to avoid spamming the Ethereum blockchain and paying excessive gas costs.

In the demo, you can see a process that ingests streaming trade data from NASDAQ, aggregates the data into 15-minute OHLC (open, high, low, close) “bars” by stock symbol, and submits those bars to a smart contract deployed on the Rinkeby testnet. Data for all the stocks is reported in a single transaction, to avoid sending hundreds of transactions every 15 minutes.
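To make the aggregation step concrete, here is a minimal sketch of how trades might be rolled up into 15-minute OHLC bars per symbol. It is not the code running on the canvas: the trade shape (`symbol`, `price`, `time` in milliseconds), the function name `aggregateBars`, and the assumption that trades arrive in time order are all ours, chosen for illustration.

```javascript
// Length of one bar in milliseconds (15 minutes, as in the demo)
const BAR_MS = 15 * 60 * 1000

// Hypothetical trade shape: { symbol: 'AAPL', price: 150.0, time: <ms since epoch> }
// Returns a Map keyed by "SYMBOL|barStart", holding one OHLC bar per key.
// Assumes trades are passed in time order.
function aggregateBars(trades, barMs = BAR_MS) {
	const bars = new Map()
	for (const trade of trades) {
		// Align the trade timestamp to the start of its bar
		const barStart = Math.floor(trade.time / barMs) * barMs
		const key = trade.symbol + '|' + barStart
		const bar = bars.get(key)
		if (!bar) {
			// First trade in this bar sets all four values
			bars.set(key, {
				symbol: trade.symbol,
				time: barStart,
				open: trade.price,
				high: trade.price,
				low: trade.price,
				close: trade.price
			})
		} else {
			bar.high = Math.max(bar.high, trade.price)
			bar.low = Math.min(bar.low, trade.price)
			// The latest trade seen so far is the provisional close
			bar.close = trade.price
		}
	}
	return bars
}

// Example: three AAPL trades in the first bar, one in the next
const exampleBars = aggregateBars([
	{symbol: 'AAPL', price: 150.0, time: 0},
	{symbol: 'AAPL', price: 151.5, time: 60000},
	{symbol: 'AAPL', price: 149.25, time: 120000},
	{symbol: 'AAPL', price: 150.5, time: BAR_MS}
])
console.log(Array.from(exampleBars.values()))
```

Keying the map by symbol and bar start keeps every stock's bars independent, so all bars that close at the same time can be collected and reported together.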

What effectively happens is that everyone running a full Ethereum node will automatically have fast and verifiable local access to the published data points. A smart contract holds the data, and it can be queried by anyone to retrieve the events posted earlier (to display a chart, for example).

Here’s the live canvas (full screen, or open in editor):

The smart contract is at 0xcc85a5f3ecc0ae62023e35faeb9c72fea25d75d6, and its source code can be seen on the canvas inside the StockTicker contract. Here’s an example bit of web3 code for querying the Apple stock price history and watching for new events:

// Symbol to watch
var symbol = 'AAPL'

// Contract address and abi
var address = '0xcc85a5f3ecc0ae62023e35faeb9c72fea25d75d6'
var abi = [{"constant":false,"inputs":[{"name":"symbol","type":"bytes32[]"},{"name":"open","type":"int256[]"},{"name":"high","type":"int256[]"},{"name":"low","type":"int256[]"},{"name":"close","type":"int256[]"},{"name":"time","type":"uint256"}],"name":"setPrices","outputs":[],"payable":false,"type":"function"},{"inputs":[],"payable":false,"type":"constructor"},{"anonymous":false,"inputs":[{"indexed":true,"name":"symbolIdx","type":"bytes32"},{"indexed":false,"name":"symbol","type":"string"},{"indexed":false,"name":"time","type":"uint256"},{"indexed":false,"name":"open","type":"int256"},{"indexed":false,"name":"high","type":"int256"},{"indexed":false,"name":"low","type":"int256"},{"indexed":false,"name":"close","type":"int256"}],"name":"Price","type":"event"}];

web3.eth.contract(abi).at(address).Price(
	{
		// Convert symbol to hex bytes32
		symbolIdx: web3.fromUtf8(symbol).padEnd(66, '0')
	},
	{
		// Fetch complete history of events
		fromBlock: 0,
		toBlock: 'latest'
	}, 
	function(error, event) {
		// Stop on query errors before touching the event
		if (error) {
			console.error(error)
			return
		}
		console.log({
			// Convert integer BigNumbers back to double-precision primitives
			symbol: event.args.symbol,
			time: new Date(event.args.time.toNumber()),
			open: event.args.open.toNumber() / 10000,
			high: event.args.high.toNumber() / 10000,
			low: event.args.low.toNumber() / 10000,
			close: event.args.close.toNumber() / 10000
		})
	}
)

Note that we originally used a string field to index the events, but ran into this bug when trying to query the events in web3. We worked around the issue by using a bytes32 representation of the symbol string for the purposes of this demo.
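For the writing side, the following is a rough sketch of how a batch of bars could be encoded to match the parameter order of setPrices in the ABI above. It is an illustration, not the publishing code used in the demo: the helper names are hypothetical, the scale factor of 10000 is inferred from the division in the query code, and `Buffer` assumes a Node.js environment.

```javascript
// Fixed-point scale, inferred from the query code above, which divides by 10000
const PRICE_SCALE = 10000

// Encode a symbol as a 0x-prefixed hex string, right-padded with zeros to 32 bytes
function symbolToBytes32(symbol) {
	const hex = Buffer.from(symbol, 'utf8').toString('hex')
	if (hex.length > 64) {
		throw new Error('Symbol does not fit in 32 bytes: ' + symbol)
	}
	return '0x' + hex.padEnd(64, '0')
}

// Convert a decimal price to the scaled integer stored on chain
function priceToInt(price) {
	return Math.round(price * PRICE_SCALE)
}

// Pack bars into parallel arrays in the order
// setPrices(symbol[], open[], high[], low[], close[], time)
function toSetPricesArgs(bars, time) {
	return [
		bars.map((bar) => symbolToBytes32(bar.symbol)),
		bars.map((bar) => priceToInt(bar.open)),
		bars.map((bar) => priceToInt(bar.high)),
		bars.map((bar) => priceToInt(bar.low)),
		bars.map((bar) => priceToInt(bar.close)),
		time
	]
}
```

Packing every symbol into parallel arrays is what lets a whole reporting period fit into one transaction. The bytes32 value produced here has the same form as the `symbolIdx` filter built with `web3.fromUtf8(symbol).padEnd(66, '0')` in the query code.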

While this example demonstrates the idea of decentralised streaming data delivery in the future, we’re currently working with Oraclize to enable smart contracts to obtain data from Streamr securely on a per-request basis, and on many different blockchains. This will be a topic for an upcoming blog post.

Overall, we begin to see hints of how a decentralised system for storing and transmitting data might work in practice. And of course, there is much more that can be done: we have not even scratched the surface of Ethereum’s other protocols, nor have we explored Polkadot Parachains or other multi-chain scaling schemes.

Decentralised data feeds are a new and exciting frontier, and we could not be more thrilled that the state of working technology already allows us to get this far! As always, we invite anyone who is as data-crazy as we are to join us on our Slack and on Twitter.

Streamr and Ethereum, or how we saw the light

It was the summer of 2012 when we put our thinking hats on to try and solve a particular problem that was collectively driving us more than a bit mad: It was extremely difficult to build and even more so to test trading algorithms based on so-called “real world” data. United by our hard-won comfort in the adrenaline-soaked world of high-frequency trading, we thus decided to build a set of tools to make our lives easier. This was the first version of Streamr, born as a way to assemble and distribute low-latency data feeds, and to rapidly build models and algorithms which monitor and react to this data.

Fast-forward a few years, and the Internet of Things is beginning to emerge. From the earliest steps, we intuited that this was something important, that perhaps the whole world would, in its own way, be moving in the direction of trading and finance. After all, what is the point of so many sensors, if not to provide real-time data and “identify” signals on the fly? This would naturally lead to algorithms taking automatic action to grab those fleeting-but-profitable opportunities, for the benefit of both individuals and businesses. And we were far from wrong: streaming analytics is now growing into one of the hot topics in the software business.

So we built a business around real-time data and streaming analytics. We created a functional and scalable low-code SaaS platform where it is easy to develop and deploy real-time microservices. We’ve gathered up a number of good customers, and there are many fascinating use cases in development. But until now, something has been missing from the picture.

Our ultimate vision is that everyone is able to use our platform and tools, connecting real-time data sources to other popular services and platforms, generally performing magic to solve customer problems and even create totally new services. But to accomplish this, another layer is needed, which provides both community and a mechanism for trust.

By late 2016, our home town of Helsinki was a true hotbed of innovation and startup activity. When a long-time friend introduced us to some interesting characters in the blockchain community, a number of illuminating conversations transpired, and we found ourselves thinking out loud:

Hey, there’s this wonderful community of tech-minded people who have created a decentralised, trustless network where you can have immutable facts mined and set in stone (or the digital equivalent) for anyone to see. And there’s yet another protocol where you can have computer code run in the same decentralised, trustless fashion, and all it takes is something called “gas”.

Our minds were blown. It’s not that we hadn’t heard of Bitcoin, blockchain, Ethereum, and other related technologies before, but it took a bolt of lightning for the reality to fully click. We already knew how to work with real-time data, and how to make life easy for those developing algorithms on top of streaming data. But what if we applied this to a decentralised computing machine? Could we ultimately offer the world a trust-free service with easy access to real-world data, and a killer usability layer? And what, in fact, would that mean?

After sleepless days and nights racking our brains while pondering the profound and almost magical ramifications, the missing piece of the puzzle indeed fell into place for us. We could build this. And since we are as excited about Ethereum as we have ever been about anything that we have seen, we intend to go “all out”, as they say. At EDCON in Paris this year, I got on stage, connected a smart contract to a live feed from Helsinki City public transport, and deployed the contract, all in 5 minutes or so. A simple pay-by-use demo on the Ethereum platform, all using existing Streamr tools. It was fun, we got lots of positive feedback and ideas, and made some fascinating new friends.

Here’s an embedded video of the EDCON demo:

For those who didn’t make it to Paris, let me first briefly recap what the demo is about. The action takes place in Helsinki, Finland, where each tram is equipped with a GPS receiver as well as sensors which measure the vehicle speed, heading, and many other quantities. Each tram transmits its location and the sensor readings to the public transport agency’s backend every second or so. The live data feed is available at no charge over the MQTT protocol to any company or developer out there.

Now imagine that you’re in charge of the public transport in Helsinki, and a decision has been made to outsource the running of the tram network to a third party. How do you set up a smart contract which automatically incentivises the tram operator for running a tight ship (if you’ll forgive the pun)?

After some thought, we decided that it’s best to keep things simple to start with. The city pays the operator incrementally based on mileage. A smart contract keeps track of the mileage run and makes the payments as the trams happily chug along. The demo shows how such a smart contract can be set up easily, connected to a live data feed, and deployed on the chain.
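The accounting behind such a pay-by-mileage arrangement is simple enough to model off-chain. The sketch below is a plain JavaScript model of the bookkeeping, not the Solidity code of the contract used in the demo; the class name `PayByUseModel`, its fields, and the example rate are hypothetical, and only the method name `update` is borrowed from the walkthrough further down.

```javascript
// Off-chain model of pay-by-use bookkeeping (illustrative only, not the real contract)
class PayByUseModel {
	// recipient: who gets paid; weiPerUnit: price per reported unit (a traveled meter)
	constructor(recipient, weiPerUnit) {
		this.recipient = recipient
		this.weiPerUnit = BigInt(weiPerUnit)
		this.totalUnits = 0n
		this.totalPaidWei = 0n
	}

	// Report usage; returns the payment in wei owed for this report
	update(units) {
		// Only whole units are billed, since wei amounts are integers
		const wholeUnits = BigInt(Math.floor(units))
		const payment = wholeUnits * this.weiPerUnit
		this.totalUnits += wholeUnits
		this.totalPaidWei += payment
		return payment
	}
}

// Example with an assumed rate of 2000000000000 wei per meter
const model = new PayByUseModel('0xOperatorAddress', '2000000000000')
console.log(model.update(10000).toString())
```

BigInt is used because wei amounts quickly exceed the range in which double-precision numbers represent integers exactly.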

The demo touches on one important aspect of real-world data as input to a smart contract. In this use case, as in many others, the volume of streaming data is sufficiently large to overwhelm any blockchain. There’s simply too much data to feed everything into a smart contract and have it processed by e.g. the Ethereum virtual machine.

What this means is that some off-chain processing is needed. In the demo, we make use of the Streamr platform, where much of the required streaming analytics functionality is already built in. This is what we do:

  1. First we connect to the live data stream from the Helsinki transport agency (this is the MQTT feed from the trams).
  2. We visualize the live tram data on a module showing the city map with the live location of every tram. The purpose here is nothing but to make sure that the data looks sensible, and is something we can build on.
  3. Next we deploy the smart contract. Our aim is to build ready-made, reusable smart contract templates readily available in Streamr. Here, we’re using one called PayByUse. We can open the template code in a built-in Solidity editor in the Streamr front-end. The contract is initialized with information on whom to pay and how much (in wei) to pay per reported unit (traveled meter). When the process is running, a function called update will be called every once in a while to report usage to the smart contract, which calculates and makes the payments.
  4. We need to calculate the mileage run by each tram in the city. We get that done by calculating speed * delta time (equals distance, remember high-school physics?) from consecutive measurements for each tram. In the image below this is abstracted inside the ForEach module.
  5. As discussed above, we cannot feed all the real-time data into the smart contract. What we do instead is accumulate the little distance increments until a threshold (10,000 meters in this case) is reached, report that to the smart contract, and reset the accumulator back to zero. Every time the threshold is crossed, the contract calculates the agreed amount of wei and transfers it to the tram operator.
  6. Now we’re pretty much done. Let’s save the workspace, press the start button, and see what happens. And voilà: the mileage starts accumulating, and as soon as a total of 10,000 meters is reached, a call to the update function is made, and wei is transacted.
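Steps 4 and 5 can be sketched in a few lines of plain JavaScript. This is our own illustration of the logic rather than what the ForEach module does internally: the reading shape (`tramId`, `speed` in m/s, `time` in seconds) and the function name are assumptions, and `report` is a stand-in for the transaction that calls the contract’s update function.

```javascript
// Threshold from the demo: report once 10,000 meters have accumulated
const THRESHOLD_METERS = 10000

// Hypothetical factory: `report` stands in for the call to the contract's update function
function createMileageAccumulator(report, threshold = THRESHOLD_METERS) {
	// tram id -> timestamp (seconds) of that tram's previous reading
	const lastSeen = new Map()
	let accumulated = 0

	// Assumed reading shape: { tramId, speed (m/s), time (s) }
	return function onReading(reading) {
		const previous = lastSeen.get(reading.tramId)
		lastSeen.set(reading.tramId, reading.time)
		if (previous === undefined) {
			// First reading for this tram: no time interval to measure yet
			return accumulated
		}
		// distance = speed * delta time
		accumulated += reading.speed * (reading.time - previous)
		if (accumulated >= threshold) {
			// Report the accumulated distance, then start again from zero
			report(accumulated)
			accumulated = 0
		}
		return accumulated
	}
}

// Example: log each report instead of sending a transaction
const onReading = createMileageAccumulator((meters) => console.log('report', meters))
onReading({tramId: 'tram-1', speed: 8, time: 0})
onReading({tramId: 'tram-1', speed: 8, time: 1})
```

Batching in this way trades timeliness for cost: the contract sees one call per 10,000 meters across the whole fleet rather than one call per sensor reading.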

Below you can see a live Streamr Canvas (a workspace) with the above steps plus some visualisations added. It shows the actual real-time feed from the Helsinki city transport connected to a smart contract. You’re free to have a look, poke around, kick the tires. Note though that this demo connects to the Rinkeby testnet, and it doesn’t worry about things like secrecy (you’ll find the private key in there but please do not withdraw the ether, it will spoil the fun for all others😊).

You can also view the below Canvas in a new tab or in our editor (requires free signup). This environment is our experimental Ethereum-enabled environment; our production environment is here.

There are of course many simplifications here. In practice you’d want to take many other variables into account in the payment schedule: the number of passengers, keeping to the timetable, even the ride quality, and so on. In many use cases you would want to have a good look at the data provenance (here we trust the public transport authority to provide clean and truthful data). And there’s a good question of how we can prove that any off-chain processing achieves what is agreed, given that such processing by definition takes place outside the blockchain proper. These are not easy questions, but we know there’s a lot of experimentation on interesting solutions taking place out there. For the sake of the demo, we’ve skimmed over the questions; not because they are unimportant, but because not all the solutions are there quite yet.

As we see it, the EDCON demo is but a taster of what we’d like to bring to the table. These are the earliest of days, and we don’t yet know exactly where we’ll end up, what our role might ultimately be, or much else. We certainly don’t claim to have all the answers, or yet have the deep technical expertise in the space that some others do (by the way, we profusely thank those people for their public knowledge sharing!). But we’re willing to learn, exchange ideas, and partner up with those who truly do know. These are the most exciting of times, and we can’t wait to contribute our share!

We have recently set up a public Slack. Please join and let us know what you think.

EDIT: Migrated demo from Ropsten to Rinkeby testnet