Wednesday, August 29, 2012

Talk (Should Be) Cheap

When contemplating how the Internet of Things will communicate, it helps to forget everything you know about traditional networking schemes – especially wide area networking and wireless networking. In traditional wide area and wireless networking, the bandwidth or spectrum is expensive and limited; and the amount of data to be transmitted is large and always growing. While over-provisioning data paths in wiring the desktop is commonplace, this isn't usually practical in the WAN or wireless network – it’s just too expensive.

Besides cost, there's the matter of potential data loss and (in the wireless world) collisions. Traditional networking needs lots of checks and double-checks on message integrity and order to minimize costly retransmissions. These constraints led to the protocol stacks with which we are familiar today such as TCP/IP and 802.11.

In most of the Internet of Things, however, the situation is completely different. Oh, the costs of wireless and wide-area bandwidth are still high, to be sure. But the amounts of data from most devices will be almost immeasurably low and the delivery of any single "chirp" or message completely uncritical. As I keep saying, the IoT is lossy and intermittent, so the end devices will be designed to function perfectly well even if they miss sending or receiving data for a while – even a long while. It's this self-sufficiency that eliminates the criticality of any single "chirp".

It might be worthwhile at this point to contrast my view of the IoT with traditional IP. First, IP is fundamentally oriented toward large packets. With large packets, the IP overhead is a relatively small percentage of the overall transmission. But in the IoT, IP overhead is much larger than the typical payload of a chirp.
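To put rough numbers on that overhead claim, here is a minimal sketch. The 40-byte IPv6 header and 8-byte UDP header are the standard fixed sizes; the two payload sizes (1,400 bytes for a "large" packet, 4 bytes for a chirp) are assumptions picked only for illustration.

```python
IPV6_HEADER_BYTES = 40  # fixed IPv6 header size
UDP_HEADER_BYTES = 8    # UDP header size


def overhead_fraction(payload_bytes, header_bytes=IPV6_HEADER_BYTES + UDP_HEADER_BYTES):
    """Fraction of the bytes on the wire that are header rather than payload."""
    return header_bytes / (header_bytes + payload_bytes)


# Hypothetical payload sizes, chosen only for illustration.
large_packet = overhead_fraction(1400)
chirp = overhead_fraction(4)

print(f"large packet: {large_packet:.1%} overhead")
print(f"chirp:        {chirp:.1%} overhead")
```

Under these assumptions the headers are a few percent of a large packet but more than nine-tenths of a chirp-sized one.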

In addition, a significant amount of the overhead in IP is dedicated to security, encryption, and other services, none of which matter at the very edges of the Internet of Things where the simplest devices predominate (if my view of the IoT is correct).

By contrast, IoT chirps are like pollen – lightweight, broadly propagated, and with meaning only at the "interested" Integrator functions. The IoT is receiver-centric, whereas IP is sender-centric. Because IoT chirps are so small and no individual chirp is critical, we have no concern over retries and resulting broadcast storms, which are a danger in IP.

It’s true that efficient IoT propagator nodes will prune and bundle broadcasts, but seasonal or episodic broadcast storms from end devices are much less of a problem because the chirps are small and individually uncritical. Just as nature treats pollen, the IoT may treat any single chirp as truly "best effort" – so heavy broadcast storms caused by an external event will die out pretty quickly.

In my view of the IoT, this means that huge packets, security at the publisher, and assured delivery of any single message are passé. This will allow us to mirror nature with massive networks based on lightweight components. In my technical imagination, this makes the IoT more "female" (receiver-oriented) than the "male" structure of IP (sender-oriented).

But having said all that, what's the point in having an IoT if nothing ever gets through? How can we deal with the unpredictable nature of connections? The answer, perhaps surprisingly, is over-provisioning. That is, we can resend these short simple chirps over and over again as a brute force means of ensuring that some get through.
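A back-of-the-envelope sketch of why brute-force repetition works. It assumes each chirp is lost independently with the same probability, which is a simplification of my own and not something a real radio environment guarantees; the function names and the 50% loss figure are hypothetical.

```python
import math


def delivery_probability(loss_rate, repeats):
    """Chance that at least one of `repeats` independent chirps arrives."""
    return 1.0 - loss_rate ** repeats


def repeats_needed(loss_rate, target):
    """Smallest number of repeats whose delivery probability reaches `target`.

    Assumes 0 < loss_rate < 1 and 0 < target < 1.
    """
    return math.ceil(math.log(1.0 - target) / math.log(loss_rate))


# Hypothetical link that drops half of all chirps.
needed = repeats_needed(loss_rate=0.5, target=0.99)
print(needed, delivery_probability(0.5, needed))
```

Even on a link that loses every other chirp, a handful of repeats is enough for a 99% chance that at least one gets through.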

Because the chunks of data are so small, the costs of this over-provisioning at the very edge of the IoT are infinitesimal. But the benefits of this sort of scheme are huge. Since no individual message is critical, there's no need for any error-recovery or integrity-checking overhead (except for the most basic checksum to avoid a garbled message). Each message simply has an address, a short data field, and a checksum. In some ways, these messages are what IP Datagrams were meant to be. The cost and complexity burden on the end devices will be very low, as it must be in the IoT.
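As an illustration of how little a chirp needs to carry, here is one possible framing of the address, data field, and checksum. The field widths (4-byte address, 1-byte length, up to 255 data bytes, 1-byte additive checksum) and the function names are assumptions for this sketch, not a defined format.

```python
import struct


def checksum(body: bytes) -> int:
    """Most basic checksum: sum of all bytes, modulo 256."""
    return sum(body) % 256


def encode_chirp(address: int, data: bytes) -> bytes:
    """Build a frame: 4-byte address, 1-byte length, data, 1-byte checksum."""
    if len(data) > 255:
        raise ValueError("chirp data field is limited to 255 bytes")
    body = struct.pack(">IB", address, len(data)) + data
    return body + bytes([checksum(body)])


def decode_chirp(frame: bytes):
    """Return (address, data), or None if the frame is garbled."""
    if len(frame) < 6:
        return None
    body, received = frame[:-1], frame[-1]
    if checksum(body) != received:
        return None  # garbled: silently drop, another copy will come along
    address, length = struct.unpack(">IB", body[:5])
    data = body[5:]
    if len(data) != length:
        return None
    return address, data


frame = encode_chirp(0x00C0FFEE, b"\x17")  # e.g. one temperature byte
print(len(frame), decode_chirp(frame))
```

Note that the receiver never asks for a retransmission. A garbled frame is simply discarded, which is consistent with no single chirp being critical.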

The address will incorporate the "arrow" of transmission I mentioned earlier, identifying the general direction of the message: whether toward end devices or toward integrator functions. Messages moving to-or-from end devices will only need the address of the end device – where it is headed or where it is from is unimportant to the vast majority of simple end devices. They're merely broadcasting and/or listening.

So the end devices are awash in the ebb and flow of countless transmissions. But replicating this traffic willy-nilly throughout the IoT would clearly choke the network, so we must apply intelligence at levels above the individual devices. For this, we'll turn to the propagator nodes I've referenced in past posts.

Propagator nodes will use their knowledge of adjacencies to form a near-range picture of the network, locating end devices and nearby propagator nodes. The propagator nodes will intelligently package and prune the various data messages before broadcasting them to adjacent nodes. Using the simple checksum and the "arrow" of transmission (toward end devices or toward integrator functions), redundant messages will be discarded. Groups of messages that are all to be propagated via an adjacent node may be bundled into one "meta" message for efficient transmission. Arriving "meta" messages may be unpacked and re-packed.
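The pruning and bundling step might look something like the sketch below. The tuple layout, the use of the address alongside the arrow and checksum as the duplicate key, and the `next_hop_for` lookup are all assumptions made for illustration.

```python
from collections import defaultdict


def prune_and_bundle(chirps, next_hop_for):
    """Drop redundant chirps and group the rest by adjacent node.

    chirps: iterable of (arrow, address, data, checksum) tuples.
    next_hop_for: hypothetical lookup mapping (arrow, address) to the
        name of the adjacent node that should receive the chirp.
    Returns {adjacent_node: [chirps]}; each list is one "meta" message.
    """
    seen = set()
    bundles = defaultdict(list)
    for chirp in chirps:
        arrow, address, _data, check = chirp
        key = (arrow, address, check)
        if key in seen:
            continue  # redundant copy from over-provisioning: discard
        seen.add(key)
        bundles[next_hop_for(arrow, address)].append(chirp)
    return dict(bundles)


incoming = [
    ("up", 7, b"\x01", 8),
    ("up", 7, b"\x01", 8),    # duplicate of the first chirp
    ("down", 9, b"\x02", 11),
]
meta = prune_and_bundle(
    incoming, lambda arrow, address: "node-A" if arrow == "up" else "node-B"
)
print(meta)
```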

Propagator nodes will be biased to forward certain information in particular directions based on routing instructions passed down from the integrator functions interested in communicating with a particular functional or geographic neighborhood of end devices. It is the integrator functions that will dictate the overall communications flow based on their needs to get data or set parameters in a neighborhood of IoT end devices.

Discovery of new end devices, propagator nodes, and integrator functions will again be similar to my architecture for wireless mesh. When messages from-or-to new end devices appear, propagator nodes will forward those and add the addresses to their tables. Appropriate age-out algorithms will allow for pruning the tables of adjacencies for devices that go off-line or are mobile and are only passing through.
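A minimal sketch of such an adjacency table with age-out follows. The class name, the fixed maximum age, and the use of plain numeric timestamps are assumptions for illustration; a real age-out policy could be considerably more subtle.

```python
class AdjacencyTable:
    """Remembers when each address was last heard and forgets stale ones."""

    def __init__(self, max_age_seconds):
        self.max_age = max_age_seconds
        self.last_heard = {}  # address -> timestamp of most recent chirp

    def heard(self, address, now):
        """Record a chirp to or from `address`; new addresses are added."""
        self.last_heard[address] = now

    def prune(self, now):
        """Remove and return addresses silent for longer than max_age."""
        stale = [a for a, t in self.last_heard.items() if now - t > self.max_age]
        for address in stale:
            del self.last_heard[address]
        return stale


table = AdjacencyTable(max_age_seconds=60)
table.heard("sensor-a", now=0)
table.heard("sensor-b", now=50)
dropped = table.prune(now=70)  # sensor-a has been silent for 70 seconds
print(dropped, sorted(table.last_heard))
```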

One other aspect of communication to be addressed within the Internet of Things is the matter of wireless networking. It’s likely that many of the end device connections in the IoT will be wireless, using a wide variety of frequencies. This fact seems to suggest a need for something like CSMA/CA (Carrier Sense Multiple Access with Collision Avoidance), as used in 802.11 WiFi. But that's another aspect of traditional networking that we need to forget.

Again, data rates will be very small and most individual transmissions completely uncritical. Even in a location with many devices vying for airtime, the overall duty cycle will be very low. And most messages will be duplicates, from our earlier principle of over-provisioning. With that in mind, an occasional collision is of no significance. All that we must avoid is a "deadly embrace" in which multiple devices, unaware of one another's presence, continue transmitting at exactly the same time and colliding over and over.

The solution is a simple randomization of transmission times at every device, perhaps with continuously varying pauses between transmissions based on prime numbers, a hashed end device address, or some other factor that provides uniquely varying transmission events.
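One way to get uniquely varying pauses is to seed a pseudo-random generator from a hash of the device address, as in this sketch. The choice of SHA-256, the base-plus-jitter timing, and all names and values here are assumptions; a very simple device would likely use something far lighter.

```python
import hashlib
import random
from itertools import islice


def pause_generator(address: bytes, base_seconds: float, jitter_seconds: float):
    """Yield an endless series of pauses that differs from device to device.

    The generator is seeded from a hash of the device address, so two
    devices with different addresses drift apart even if they start in step.
    """
    seed = int.from_bytes(hashlib.sha256(address).digest()[:8], "big")
    rng = random.Random(seed)
    while True:
        yield base_seconds + rng.uniform(0.0, jitter_seconds)


def first_pauses(address: bytes, count: int):
    """Convenience helper: the first `count` pauses for a device."""
    return list(islice(pause_generator(address, 10.0, 5.0), count))


device_a = first_pauses(b"device-a", 5)
device_b = first_pauses(b"device-b", 5)
print(device_a)
print(device_b)
```

Two devices that happen to share an address would produce the same sequence in this sketch, so a real design would probably mix in some other local input as well.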

While the resulting communication scheme is very different from traditional networking protocols, it will be all that we need for the IoT. Providing just enough communication at very low cost and complexity will be good enough for the Internet of Things.

Many of the ideas I'm developing for the Internet of Things are inspired by the interactions of beings in nature. Next time, a look at the way aggregations of creatures become highly-functioning colonies and "SuperOrganisms" – and the lessons this provides for the IoT.

Wednesday, August 22, 2012

Organize Among Yourselves!

What's in a name? Well, if we're building an Internet of Things, naming becomes a challenge. True, there are existing ways to identify end point devices, such as MAC IDs (Media Access Control) and IPv6 addresses. I'm very familiar with these from my work with wireless mesh networking, which is why I can state with absolute confidence that they won't work for the majority of devices in the Internet of Things.

It comes down to the matters of complexity and lack of centralized control. The simplest of the billions of devices in the IoT can't be burdened with the memory demands, power requirements, and management overhead associated with a heavyweight protocol stack such as IPv6. And since these devices will come from millions of different suppliers of varying degrees of networking know-how, managing a central repository of the equivalent of MAC IDs probably won't work, either. Not to mention that connections to IoT end devices will be lossy, intermittent, and uncertain.

At this scale, only self-organization works, just as it does in nature. In my mind there are two key components of a massively scalable naming scheme: 1) non-guarantee of absolute uniqueness; and 2) derivation from environment.

I can hear the wailing now, "Non-unique addresses, is he crazy?" No, just observant. How many "John Smith"s are there in the world? Probably millions. Yet if we meet someone named John Smith, we can place him in the context of the environment: where he lives and works, who he knows, to whom he is related, etc. John Smith is not a unique name, yet we can keep things straight when communicating with the particular John Smith we are interested in.

The same can be true for the billions of devices of the three main types I identified earlier: End Devices, Propagator Nodes, and Integrator Functions. An individual device may have a relatively simple "base" address (more on that in a minute), but there may be additional context applied to the headers of data "chirps" destined to-and-from that device. These might include the addresses of the propagator node(s) to which it was first (or is now) connected. Just as with our friend John Smith, we'll be able to distinguish among similar device "base" addresses by the company they keep, where they live, and what they do.

What about those base addresses? In my mind, these base addresses for individual end devices come from multiple sources: pre-set factory identities like a model number; one or more environmental inputs such as the time of day of first operation, GPS location, supplied voltage, temperature, etc.; and perhaps the identity of any other devices or propagator nodes that the device detects. All of these inputs are then "hashed" with a simple algorithm into an address that may not be unique in the world, but is very likely to be unique from any other device in the neighborhood. "Neighborhood" here might be geographical, purpose-based, or defined by the integrator functions interested in a particular set of end devices.
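Here is a minimal sketch of deriving a base address from factory and environmental inputs, plus a rough estimate of how often two devices in one neighborhood would collide. SHA-256 stands in for the "simple algorithm" and is almost certainly heavier than a real end device would use; the 32-bit width, the field names, and the sample values are all assumptions. The collision estimate is the standard birthday approximation and assumes addresses are spread uniformly.

```python
import hashlib
import math


def base_address(model, first_power_on, location, neighbors=(), bits=32):
    """Hash factory and environmental inputs into a short base address."""
    fields = [model, first_power_on, location, *sorted(neighbors)]
    material = "|".join(fields).encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def collision_probability(devices, bits=32):
    """Birthday approximation: chance that any two of `devices` share an address."""
    pairs = devices * (devices - 1) / 2
    return 1.0 - math.exp(-pairs / 2 ** bits)


# Hypothetical inputs for two units of the same model in different places.
unit_1 = base_address("TH-100", "2012-08-22T06:15", "40.7,-74.0")
unit_2 = base_address("TH-100", "2012-08-22T06:15", "51.5,-0.1")
print(hex(unit_1), hex(unit_2))
print(collision_probability(1000))
```

With these assumptions, a neighborhood of a thousand devices has roughly a 0.01% chance of containing any duplicate address at all, which is the sense in which non-unique can still be good enough.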

Again, as with our "John Smith", even though these end device addresses may not be universally unique, they will be distinctive enough to be recognized individually in context.

The simplest propagator nodes might follow the same sorts of naming conventions. Why not, if the algorithm exists? Or one might choose traditional MAC ID and IPv6 addresses for these devices, since at least some of their communication will be via more traditional protocols and they will already have more processing power and memory than the end devices. And the same is probably true for integrator functions, again because they will be using existing protocols and will have the resources to manage the protocol stacks.

Defining a new naming scheme for the Internet of Things may seem redundant, but that's only if one is looking at the world from a network-centric perspective. When we turn the telescope around to view the world from the perspective of the end device, we can see that the billions of simple, low power, intermittently connected end points have completely different needs than does the traditional Internet. In the IoT, numbers rule – and the numbers favor the end points by multiple orders of magnitude. How those end points communicate will be the subject of the next post.

Wednesday, August 15, 2012

Not a Stack, a Crowd

What we all used to know as simple server-based computing architecture has been replaced by the glossy and over-hyped marketing term "the cloud". But from an architecture perspective, the cloud is pretty much the same IP-based networking we've been using for decades. Because the Internet backbone is (at least today) still over-provisioned, an Internet and IP-based cloud can work well, even for important transactions. (Whether that holds true for the decades ahead is a different matter, of course.)

But the Internet of Things shares only a need for wide connectivity with "the cloud". In most other important ways, it’s completely different: crowds of billions of end devices that connect intermittently at very low speeds to other machines, not to humans. In my developing picture of the IoT, this makes traditional protocol stacks irrelevant – or at the very least, overkill.

While the traditional protocols may make sense for connecting Propagator nodes and Integrator functions (see the previous blog post), the vast numerical majority of connections will be to relatively low-data-need devices such as HVAC units, air quality sensors, and street lights. This is the segment of the communications architecture that must be re-thought from the ground up, in my opinion.

Rather than treat intermittent connections, data loss, and low data rates as problems (as they would be in IP), we must embrace these as facts of life in the Internet of Things. It’s a lossy world on the IoT frontier, and that's OK – if we engineer the architecture with that in mind. Most of these end devices won't need constant check-ins with a central site to function. They'll simply keep running, functioning with or without network updates. If an update comes, fine, but there's no immediate response required.

Turning to nature, birdsong and pollen give us another picture of how the IoT devices will treat communications. Many birds sing without expecting (or waiting for) an answer. They sing "blindly" to mark territory, advertise mating availability, or signal danger – and trust in the universe to deliver the message to hearers who may act upon the message. Similarly, trees and other flowering plants broadcast pollen extremely broadly (hence, allergy season) without any feedback on whether the "message" is received. Propagated by winds, pollen may be carried hundreds or thousands of miles away from the originating source.

All of this leads to my heretical view of the very edge of the Internet of Things: it just isn't reliable when viewed from the perspective of a single message. The devices may be switched off at various times, propagation paths may be lost, etc. Yet by sending the same small data chirps over and over, eventually there is a good chance that at least a few will get through. This will mean over-transmitting on a massive scale. But because each data chirp is so small, there is virtually no net cost involved in this over-provisioning.

I believe that this is one of the key things about the Internet of Things that is completely different from the "Big I" Internet: the very small amount of data in each transmission and the lack of criticality of any single transmission. As the traditional Internet becomes clogged with ever larger real-time data streams such as those generated by video and multiplayer gaming, the IoT's growth will be at the fringes of the network with billions of low-duty-cycle, low-data-rate devices.

I believe we'll need a new architecture at the edges of the Internet of Things. In place of the traditional IP protocol stack with hierarchical layers of routing topologies, there will instead be a gigantic crowd of devices speaking and listening – each unconcerned with what's happening anywhere else in the network. Instead of rigid routing paths there will be transient clumps and aggregations of unrelated devices sharing propagation facilities. It's a truly best-effort world, and as I have said before – that will be good enough.

Conceptually, I am breaking this new architecture into the elements of Naming, Communication, and Propagation. Next blog, we'll start with the most challenging aspect of this architecture: naming those billions of IoT end points in the crowd.

Thursday, August 9, 2012

Forget Equality

The general concept of peer-to-peer networks is extremely attractive. It appeals to my philosophical leanings and to my sense of engineering elegance. The prospect of billions of devices seamlessly interacting with one another seems to allow the Internet of Things to escape the limitations of centralized command and control, instead taking full advantage of Metcalfe's Law to create more value through more interconnections.

But true peer-to-peer communication isn't perfect democracy – it's senseless cacophony. In the IoT, devices at the edge of the network have no need to be connected with other devices at the edge of the network – there is zero value in the information. These devices have simple needs to speak and hear: sharing a few bytes of data per hour on bearing temperature and fuel supply for a diesel generator, perhaps. Therefore, burdening them with protocol stacks, processing, and memory to allow true peer-to-peer networking is a complete waste of resources and creates more risk of failures, management and configuration errors, and hacking.

Having said that, there is obviously a need to transport the data destined to or originating from these edge devices. The desired breakthrough for a truly universal IoT is using increasing degrees of intelligence and networking capability to manage that transportation of data.

Conceptually, a very simple three-level model will suffice. At the edge of the network are simple Devices. They transmit or receive their small amounts of data in a variety of ways: wirelessly over any number of protocols, via power line networking, or by being directly connected to a higher level device. These edge devices simply "chirp" their bits of data or listen for chirps directed toward them (and how are these addressed, you might ask – we'll get there in a later blog post).

Note that I've said nothing about error-checking, routing, higher-level addressing or anything of the sort. That's because none of these are needed. Edge devices (Level I, if you will) are fairly mindless "worker bees" existing on a minimum of data flow. This will suffice for the overwhelming majority of devices connected to the IoT*.

The emphasis in the last sentence above is a key point. Much of what has been written about the IoT assumes an IP stack in every refrigerator, parking meter, and fluid valve. Why? It's obvious that these devices won't need the decades of built-up network protocol detritus encoded in TCP/IP. We all must free our thinking from our personal experience of the networking of computers, Smartphones and human users to address the much simpler needs of the myriad devices at the edge of the IoT.

So if the end devices aren't capable of protocol intelligence, it must reside somewhere. And the major elements of that somewhere are the Level II Propagator nodes. These are technologically a bit more like the networking equipment with which we are all familiar, but they operate in a different way. Propagators listen for data "chirping" from any device. Based on a simple set of rules regarding the "arrow" of transmission (toward devices or away from devices), propagator nodes decide how to broadcast these chirps to other propagator nodes or to the higher-level Integrator device I'll discuss in a moment.
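The arrow rule can be very small indeed. In this sketch the `Arrow` names and the split of a propagator's neighbors into "downstream" and "upstream" lists are assumptions made for illustration.

```python
from enum import Enum


class Arrow(Enum):
    TOWARD_DEVICES = 0
    TOWARD_INTEGRATORS = 1


def forward_targets(arrow, downstream, upstream):
    """Pick which neighbors should receive a chirp, based only on its arrow.

    downstream: neighbors closer to end devices.
    upstream: neighbors closer to integrator functions.
    """
    if arrow is Arrow.TOWARD_DEVICES:
        return list(downstream)
    return list(upstream)


targets = forward_targets(
    Arrow.TOWARD_INTEGRATORS,
    downstream=["valve-12", "meter-3"],
    upstream=["propagator-B"],
)
print(targets)
```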

In order to scale to the immense size of the Internet of Things, these propagator nodes must be capable of a great deal of discovery and self-organization. They will recognize other propagator nodes within range, set up simple routing tables of adjacencies, and discover likely paths to the appropriate integrators. I've solved this sort of problem before with wireless mesh networking and although the topology algorithms are complex, the amount of data exchange needed is small.
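To give a feel for "discovering likely paths", here is a deliberately simplified sketch that uses a breadth-first search over a table of adjacencies to count hops to the nearest integrator. This is a textbook technique shown from a bird's-eye view, not the mesh topology algorithm referred to above; real propagator nodes would only have their own local picture. The node names are hypothetical.

```python
from collections import deque


def hops_to_integrator(adjacency, integrators):
    """Breadth-first search outward from all integrators at once.

    adjacency: {node: [neighboring nodes]}.
    Returns {node: hop count to the nearest integrator}.
    """
    hops = {node: 0 for node in integrators}
    queue = deque(integrators)
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def next_hop_toward_integrator(node, adjacency, hops):
    """Choose the neighbor with the fewest hops to an integrator, if any."""
    candidates = [n for n in adjacency.get(node, ()) if n in hops]
    return min(candidates, key=lambda n: hops[n], default=None)


adjacency = {
    "integrator": ["p1"],
    "p1": ["integrator", "p2"],
    "p2": ["p1", "p3"],
    "p3": ["p2"],
}
hops = hops_to_integrator(adjacency, ["integrator"])
print(hops, next_hop_toward_integrator("p2", adjacency, hops))
```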

One of the important capabilities of propagator nodes will be their ability to prune and optimize broadcasts. Chirps passing from-and-to end devices may be combined with other traffic and forwarded in the general direction of their transmission "arrow". In my view of the IoT, propagators are the closest thing to the traditional idea of peer-to-peer networking, but they are providing this networking on behalf of devices and integrators at levels "above" and "below" themselves. Any of the standard networking protocols may be used, and propagator nodes will perform important translation functions between different networks (power line or Bluetooth to ZigBee or WiFi, for example).

Integrator functions are where the chirps from hundreds to millions of devices are analyzed and acted upon. Integrator functions also send their own chirps to get information or set values at devices – of course these chirps' transmission arrow is pointed toward devices. Integrator functions may also incorporate a variety of inputs, from big data to social networking trends and "Likes" to weather reports.

Integrator functions are the human interface to the IoT. As such, they will be built to reduce the unfathomably large amounts of data collected over a period of time to a simple set of alarms, exceptions, and other reports for consumption by humans. In the other direction, they will be used to manage the IoT by biasing devices to operate within certain desired parameters.
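A minimal sketch of that reduction: many readings in, only the exceptions out. The reading format, the fixed low and high limits, and the sample bearing temperatures are assumptions for illustration.

```python
def exceptions(readings, low, high):
    """Return only the (address, value) readings outside the range [low, high]."""
    return [(address, value) for address, value in readings
            if not (low <= value <= high)]


# Hypothetical bearing temperatures (degrees C) from a set of generators.
readings = [("gen-1", 71), ("gen-2", 69), ("gen-3", 118), ("gen-4", 70)]
alarms = exceptions(readings, low=40, high=95)
print(f"{len(readings)} readings, {len(alarms)} exception(s): {alarms}")
```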

Using simple concepts such as "cluster" and "avoid", integrated scheduling and decision-making processes within the integrator functions will allow much of the IoT to operate transparently and without human intervention. One integrator function might be needed for an average household, operating on a Smartphone, computer, or home entertainment device. Or the integrator function could be scaled up to a huge global enterprise, tracking and managing energy usage across a corporation, for example.

When it comes to actually packaging and delivering products, some physical devices will certainly be combinations of functions. Propagator nodes combined with one or more end devices certainly make sense, as will other combinations. But the important concept here is to replace the idea of peer-to-peer for everything with a graduated amount of networking delivered as needed and where needed. In the Internet of Things, we need a division of labor (like ant and bee colonies) so that devices with not much to say or hear receive only the amount of networking they need and no more.

Next time, we'll talk about the communications architecture for the IoT in more detail and why the needs of the crowd are different from the needs of the cloud.

*Yes, there will be a relatively small number (still billions and billions) of more-sophisticated devices connected to the IoT. But these will connect with good ol' IPv6, as the investment of a protocol stack is worth it.

Tuesday, August 7, 2012

It's Different Out Here

Through nearly my entire career, I have been trying to create highly functional systems with a bare minimum of resources, whether that is defined as power, space, wireless spectrum, money, time, or other factors. This has often led me to develop systems formed of essentially autonomous devices that were able to self-organize, manage perturbations, and tune performance to the environment. The systems have been as diverse as tactical robots, web information harvesters, and wireless mesh networks – but all shared aspects of being simultaneously independent and coordinated.

"Experts" from product managers to preachers to pundits have turned their attention recently to "The Internet of Things" (IoT). This phrase has many meanings, depending on who is doing the describing – and perhaps more importantly, the selling. Network-centric companies view the IoT as an extension of current networking protocols and practices, noting that IPv6 allows the addressing of billions and billions of devices (according to this infographic from Cisco Systems, 100 addresses for every atom of matter on earth).

Other market participants see the IoT as an extension of existing Radio Frequency Identification (RFID) applications, noting the power of the Internet of Things to locate and catalog every discrete item on earth – apparently believing that’s not only practical, but useful.

But my experience building a wide variety of "bare minimum" systems suggests that the real power of the Internet of Things will be quite different from either a traditional network-centric or universal-inventory perspective. Rather, I believe that the Internet of Things represents a completely different worldview: one where the machines take care of themselves and only trouble us for exceptions. Simple devices, speaking simply.

My vision of the IoT is absolutely required if one truly believes that the Internet of Things will reach down to billions of devices like diesel generators, soil moisture sensors, and toasters. It doesn't make economic or technical sense to add a lot of costly and finicky electronics to these devices merely to gather or impart the tiny amount of data they create or need.

This world of machine-to-machine interaction will be much more like birdsong or the interactions of social insects such as bees and ants than it will be like TCP/IP and WiFi. The overhead of traditional protocols such as IPv6 isn't necessary (or possible) when data rates are nearly immeasurably low. At the edges of the network, the vast numerical majority of devices will simply speak and listen in tiny bits of data. And they will be designed with a basic trust in an IoT universe that propagates these messages to some sort of integration point where the IoT may be interpreted for human consumption.

The Internet of Things is (and always will be) the very frontier of the network. Like every frontier in history, it will be messy, intermittent, lossy, and unpredictable. Best effort will be the rule of the day – and that will be enough!

In future blog posts, I'll explain further why the Internet of Things can be – indeed must be – completely different than the way it is currently envisioned by nearly everyone. Next time: why peer-to-peer doesn't mean equal.