IoT Project Structure

Once the initial solution design has been completed, the final task in the project initiation phase of an IoT project is usually to set up the organizational structure of the project. This is what we will discuss here.

Conway’s Law

IT is governed by numerous laws, many of which are important for the IoT – not least Moore’s and Metcalfe’s laws. A less widely known law – yet extremely important in our experience – was put forward by Melvin Conway in 1968. According to this law, the structure of a software system will reflect the structure of the organization that produced it. For example, if three development teams are working on an IT project, chances are that the resulting architecture will have three main components. Consequently, if the goal is to create an architecture with three components, the organizational structure of the project should be made up of three development teams.

With this in mind, the organizational structure of the project must take into account the results of the initial solution design as well as the results of the make-or-buy decision. This brings us back to the discussion about how to structure the top-level workstreams of an IoT solution project.

Identifying the right Work Streams

IoT Project Workstreams

The Ignite | IoT Methodology uses the concept of workstreams as the top-most project structure. This is because we believe that for a complex project with as many dependencies as an IoT project, it is not possible to come up with a complete and fully stable Work Breakdown Structure (WBS) during the Initiation Phase – even if using a more traditional waterfall approach. The concept of workstreams better reflects the dynamics of these projects. Each workstream should be mapped to a set of key deliverables (defined as work packages, according to PMI [PM1], for example) from the Initial Solution Design (Conway’s law!), and needs a team and a team/workstream manager.

Proposed IoT Project Work Streams - Overview

Although making generalizations is difficult, our analysis has shown that for many IoT projects it makes sense to define the workstreams illustrated in the figure above, i.e.:

  1. Project Management: Well defined in the PMI PMBOK [PM1], but also needs to incorporate solution architecture management across the first 6 workstreams.
  2. Cross-Cutting Tasks: Inclusion of tasks that have dependencies right across the subsequent workstreams, including security, asset lifecycle management, solution integration, and testing
  3. Solution Infrastructure and Operations: Setup and management of the hardware and software infrastructure required for developing and operating the solution, including a solution for application lifecycle management
  4. Backend Services: Implementation and operation of backend services; integration with existing applications
  5. Communication Services: Setup and management of communication infrastructure
  6. On-Asset Components: Design and manufacture or purchase of on-asset hardware and implementation of on-asset software
  7. Asset Preparation: Preparation of the asset and/or environment for the new solution

In the following, we will look at each workstream independently, taking into account key tasks and dependencies with other workstreams.

Project Management

Whether based on a waterfall or agile approach, each project needs an efficient project management process. PMI, the Project Management Institute, defines the key elements of project management. In “A Guide to the Project Management Body of Knowledge (PMBOK)” [PM1], it identifies Initiating, Planning, Executing, Monitoring and Controlling, and Closing as the five project management process groups. The same applies to an IoT solution project; for further details on standard processes, refer to the PMBOK Guide. At this point, we would like to highlight certain project management tasks that we believe to be at least partially specific to IoT. For project planning, these include the following:

  • Procurement planning: The complexity of the process of procuring the components and services required for an IoT solution should not be underestimated (for on-asset hardware and communication services, for example). These processes therefore need to be started early and planned for accordingly.
  • Resource planning: We already know that one of the key characteristics of an IoT project is the diversity of skills required, from embedded software to communications to enterprise software. This makes the resource mobilization process challenging and sufficient time and resources need to be planned for this task.
  • Quality management: The QM reports for the solution should include solution-specific reporting items such as communication service quality, average asset status, etc.
  • Solution Architecture Management: The Solution Architect who was ideally already responsible for the Initial Solution Design will be required to continuously review the detailed designs created by the individual workstreams and ensure that all of the different designs add up to one consistent solution. This requires the establishment of an architecture review process to ensure the architectural alignment of each of the different workstreams.
  • Interface Management: Efficient management of the technical interfaces (software, communication protocols, and hardware) between the different solution components and workstreams is one of the most important success factors of any distributed systems project. We highly recommend the following:
    • Establishing a dedicated role with responsibility for coordinating all of the cross-component interfaces and continuously liaising with the different workstreams.
    • Establishing a central repository for capturing all cross-component interfaces, including a technical and a functional description of each interface. However, avoid tool overkill: use a wiki as a flexible and lightweight mechanism for creating and maintaining this central interface repository. This is especially relevant given that there is a good chance that the interfaces will be highly heterogeneous, from REST and OPC to MQTT and CoAP to very device-specific protocols.
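
As an illustration of what an entry in such a repository might capture, the following minimal Python sketch models one interface record and a lookup of the workstreams affected when that interface changes. The class names InterfaceEntry and InterfaceRepository, the field names, and the lookup method are assumptions made for this sketch; a wiki page carrying the same fields would serve the purpose equally well.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InterfaceEntry:
    """One cross-component interface as it might be recorded in the repository."""
    name: str
    provider: str                # workstream that offers the interface
    consumers: list[str]         # workstreams that depend on it
    protocol: str                # e.g. "REST", "OPC", "MQTT", "CoAP"
    version: str
    functional_description: str
    technical_description: str


@dataclass
class InterfaceRepository:
    """Hypothetical in-memory stand-in for the central interface repository."""
    entries: dict[str, InterfaceEntry] = field(default_factory=dict)

    def register(self, entry: InterfaceEntry) -> None:
        # A later registration under the same name replaces the earlier one.
        self.entries[entry.name] = entry

    def affected_workstreams(self, name: str) -> set[str]:
        """Workstreams to involve when the named interface changes."""
        entry = self.entries[name]
        return {entry.provider, *entry.consumers}
```

The person in the interface coordination role could use a lookup like affected_workstreams to decide whom to involve in a review when an interface version changes.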

Cross-Cutting Tasks

True to its name, the Cross-Cutting Tasks workstream is responsible for tasks that cut across all the other workstreams, such as security, asset lifecycle management, and solution integration and testing.

Security

Given the importance of security to any IoT solution, there is a strong argument for making it a top-level workstream in its own right. However, because security impacts on each of the other workstreams, we believe that it should actually be treated as a cross-cutting workstream.

For a detailed discussion of the technical aspects of IoT security and how they concern the other workstreams – On-Asset Components, Communication Services, Backend Services, and Infrastructure – refer to the Security Technology Profile further below.

In terms of the security workstream, it is important to understand that security is not just about technologies. It also requires security policies and processes, all of which need to be defined here. In addition, the project must define a strategy for ensuring and validating the security of the solution, including:

  • Risk assessment: This should help to identify the highest-priority threats and the profiles of the most likely attackers. For an IoT solution, this obviously has to include attacks on assets and asset-related communications as well as an asset-specific impact analysis.
  • Security architecture: Specialized architectural perspective that shows how different security mechanisms will be integrated into the solution architecture.
  • Security implementation: Detailed plan including work packages for implementing the security architecture, which is usually cross-cutting and involves all other workstreams.
  • Security testing and validation: Dedicated test strategy to help identify potential flaws in the security architecture and implementation. This could require outside experts.

Asset Lifecycle Management

As per our discussion in the introduction, many IoT solutions have a huge impact on the asset’s lifecycle, touching as they do on Integrated Product/Service Design, Product Manufacturing, Service Development, Marketing & Sales, Distribution & Activation, Service Operations, Remote Condition Monitoring, Solution Support & Product Maintenance (including Remote Maintenance and Predictive Maintenance), Digital Services, and Resale & Retirement.

We recommend creating a dedicated role in the project with responsibility for looking at the solution from an Asset Lifecycle Management perspective across all workstreams, thereby ensuring that this important perspective is addressed end-to-end.

Take asset activation, for example. This important part of the lifecycle usually has an impact on the on-asset software, the communication services, and the backend services. So it makes sense to look at it from a centralized perspective in order to ensure that the best solution is implemented across all workstreams.

Solution Integration and Testing

Finally, solution integration and testing is another important cross-cutting task. The person or team responsible will be required to liaise with all the other workstreams in order to agree on key release milestones, interface versions, staging mechanisms, processes, end-to-end test plans, etc. Again, this is not too different from any large enterprise IT project, but with the added complexity of having to integrate the on-asset components into the test plan.

It is not always possible to create a test environment that fully reflects the field environment. If you have an IoT solution tasked with managing 1,000,000 devices, testing this solution with 1,000,000 test devices is not going to be feasible. Instead, pragmatic mechanisms will have to be found to allow completion of an initial integration test using a limited number of actual devices. Load generators can then be used to simulate a system with 1,000,000 devices and thereby test the scalability of the backend systems.
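
To make the idea of a load generator concrete, here is a minimal Python sketch that produces synthetic status messages for a simulated device population and pushes them into a backend entry point. The function names, the message fields, and the ingest callback are hypothetical. A realistic load test would send the messages over the solution's actual protocol and would normally run many senders concurrently, which this single-threaded sketch does not attempt.

```python
import random
import time
from typing import Callable, Iterator


def simulated_messages(device_count: int, messages_per_device: int,
                       seed: int = 0) -> Iterator[dict]:
    """Yield synthetic status messages for a population of simulated devices."""
    rng = random.Random(seed)  # seeded so that test runs are repeatable
    for sequence in range(messages_per_device):
        for device_no in range(device_count):
            yield {
                "device_id": f"sim-{device_no:07d}",
                "sequence": sequence,
                "battery_pct": rng.randint(0, 100),
            }


def run_load_test(ingest: Callable[[dict], None], device_count: int,
                  messages_per_device: int) -> float:
    """Push all simulated messages into `ingest`; return messages per second."""
    start = time.perf_counter()
    sent = 0
    for message in simulated_messages(device_count, messages_per_device):
        ingest(message)
        sent += 1
    elapsed = time.perf_counter() - start
    return sent / elapsed if elapsed > 0 else float("inf")
```

Because the messages are generated lazily, the simulated population can be scaled towards the target device count without holding all messages in memory.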

Similarly, efficient test strategies will have to be devised for communication services. If planning a global rollout, testing all locations is not always going to be possible. In this case, the test team will be required to closely work with the service provider in order to define a joint test plan.

Again, because these kinds of integration tests usually have an impact on all other workstreams, it makes sense to centralize these tasks under the Cross-Cutting Tasks workstream.

Solution Infrastructure & Operations

This workstream is responsible for setting up the overall infrastructure that will be used to build, test, and operate the solution. In particular, it includes the standard infrastructure setup tasks found in any enterprise software project. However, in addition to these standard tasks, there are a number of tasks that are specific to IoT solutions.

For a standard enterprise application, this workstream would require members of the development project and members of the operations team to work closely together in order to agree on the different environments required – usually development, testing/integration, and production. They would also need to agree on the related staging processes required for integrating multiple development updates into a new release and ultimately making the new release available in the production environment.

Configuration management and release planning, as well as steady-state and operations planning, also form part of this workstream. Tasks such as hardware sizing and acquisition, backup management, security infrastructure (firewalls, DMZ, etc.), systems monitoring, alarm management, scalability planning, availability management (primary/disaster site each with local resilience, for example), and customer helpdesk functions also need to be taken into account. If the solution is cloud-based, cloud instance health monitoring, cloud SLA management, and cloud-to-enterprise application integration need to be considered as well.

Even for a standard project, the diversity of technologies, tools, concepts, and people involved here makes for a complex scenario. For an IoT solution, you have all this, and more, due to the highly distributed nature of such solutions.

This workstream will have to ensure that Application Lifecycle Management (ALM) in all its complexity is applied consistently from the backend to the asset. This includes providing the required software update infrastructure for the assets, asset monitoring, management of errors and alarms issued by assets, etc.

Another key issue concerns the customer support infrastructure. For example, most telecommunications companies provide their support teams with specialized tools that allow them to perform basic equipment tests – such as “pinging” a DSL modem to test whether it is online, for example. This functionality is usually built into the core call center application, which provides additional information such as the customer contact history. Many IoT solutions will require similar support capabilities, which will need to be addressed in this workstream.

If the IoT solution is designed to support an existing field service team – by providing Remote Condition Monitoring (RCM) or predictive maintenance capabilities, for example – the provision of technical training to this team is also going to be very important.

Backend Services

Typical work items in the Backend Services workstream include creation of a solution for the management of assets and asset data, implementation of solution-specific processes, and integration with existing applications via EAI (Enterprise Application Integration) or SOA (Service-Oriented Architecture).

If you are using a specialized M2M/IoT application platform (for details, see the Technology Profiles section), you will have to select, acquire, and install this platform. If started from scratch, this process can take weeks if not months, especially if it is a strategic technology decision.

The goal is usually to build a dedicated tier in the backend containing a central representation of all field assets. It includes up-to-date asset status data and a detailed asset event history, as well as management functions for individual assets or groups of assets. An M2M/IoT application platform will generally provide out-of-the-box features for implementing these functions, such as a method for communicating with remote assets and managing asset data locally. These platforms often come with an administration console that provides an overview of all deployed assets and their corresponding status.

The Backend Services workstream will be required to create suitable profiles for each asset and for the devices deployed on these assets. The profiles will then need to be implemented on the platform. This can be done using the tools provided by the platform, for example.

In most cases, the administration console will not be suitable for end users, so a solution-specific UI will have to be implemented. This could be part of a specialized application in which solution-specific processes have been implemented. These processes will generally have to be integrated with a number of existing enterprise applications.

For a car sharing service, for example, the first step involves creating a central model of the remote assets (the vehicles in this case) in the backend. This is followed by the implementation of UIs and applications for car sharing customers. A mobile application then accesses the central asset repository to get data on current car locations so they can be visualized on a map. The application also needs to support a car reservation function. For the billing process, an existing application is integrated. The Web UI provides users and call center agents with a detailed car usage history. This data is pulled from events received from the car’s on-board unit, such as events that signal the beginning and end of a car rental process.
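
The step of deriving a usage history from on-board unit events can be sketched as follows. This is a minimal Python illustration only: the event schema (the fields car_id, type, and timestamp, and the type values rental_start and rental_end) is an assumption made for the example and not the format of any particular on-board unit or platform.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass
class Rental:
    car_id: str
    start: datetime
    end: datetime

    @property
    def minutes(self) -> float:
        return (self.end - self.start).total_seconds() / 60


def usage_history(events: list[dict]) -> list[Rental]:
    """Pair rental_start and rental_end events per car, in timestamp order."""
    open_rentals: dict[str, datetime] = {}
    rentals: list[Rental] = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        car_id = event["car_id"]
        if event["type"] == "rental_start":
            open_rentals[car_id] = event["timestamp"]
        elif event["type"] == "rental_end" and car_id in open_rentals:
            # Close the open rental for this car; unmatched ends are ignored.
            rentals.append(
                Rental(car_id, open_rentals.pop(car_id), event["timestamp"]))
    return rentals
```

Events are sorted before pairing because messages from remote assets may arrive out of order. Rentals that have started but not yet ended are left out of the history.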

Another key task generally involves integration of the solution with a user management system. In the simplest case, this requires a web-based solution for user registration and login. In many examples – such as the car sharing service above – this might be more complex, and involve setting up a more elaborate registration process to validate the user’s driver license and issue a chip card for opening the car, for example.

Communication Services

The structure of the workstream for setting up and managing communication services between the assets and the backend will depend on many factors, not least the required global coverage, maximum latency and bandwidth, cost, etc. The Communication Services technology profile described later provides guidelines for selecting the right technology and provider.

Based on the initial solution design, this workstream is concerned with completing a more in-depth analysis in order to determine the best possible setup for the communication service. This can involve tasks such as the following:

  • Collecting detailed requirements on scope/services, including expected traffic profiles and backend connectivity and locations
  • Analyzing hardware capabilities, available support for hardware selection based on communication services supported, and interaction with mobile network and roaming
  • Determining the regulatory situation in the launch markets, including communication regulations and vertical-specific regulations (country-specific regulations for the eCall service, for example)
  • Monitoring communication behavior and potential impact on network, such as data volumes, required resilience, and exception handling

This information will provide the basis for selecting the right technology and communication service provider. The time needed to select a provider and negotiate service contracts should not be underestimated.

During the build phase, it is important that the service provider makes a test environment available. The overall solution will then need to be integrated with the communication service provider’s infrastructure and tested. This can be done using the operator’s dedicated portal for managing connectivity, for example.

During runtime, the same portal will be used to monitor and manage the communication network, including provisioning status, device data usage, SIM status, network errors, etc.

The following interview with Stephen Blackburn, Director of Sales Engineering at Aeris, provides more valuable insights into these topics.

Jim Morrish: What are the key points on the project manager’s checklist with respect to planning and implementing the mobile communication part of his solution?

Stephen Blackburn: This project manager must have the understanding and insight to shepherd the development of a mobile communications solution from start to finish and have the scope to understand the widely differing elements that come into play. He must oversee the following considerations:

1) Go-to-market business model – how the company will charge their customer

2) Application communication call flow – how the device will interact with the network

3) Definition of expected usage patterns, including normal operating patterns and over-the-air upgrades

4) Supply Chain Planning, taking into consideration device manufacture, device provisioning in carrier network as well as testing the communications link at the end of manufacture, and channel to market goals

5) Cellular carrier selection

6) Carrier Integration (API integration)

7) End-to-end application test plan incorporating testing of the communication link across different RF conditions

8) Device Carrier Certification

9) Carrier support SLA understood and in place

10) Customer support process definition and support team training

Jim Morrish: What are the key interfaces between the mobile communication work stream and the other work streams, i.e., on-asset hardware, security, and so on?

Stephen Blackburn: The PM should be aware of how the mobile communication planning impacts other parts of the business so that these other groups are able to provide input to planning and execution as well as ensure that their own requirements are met. The key crossover areas include:

  • Regulatory approvals
  • Manufacturing and supply chain
  • Security
  • IT systems
  • Operations and Support
  • Finance

Jim Morrish: Can you give advice on solution testing, specifically the mobile communication part? How does testing change in this type of environment?

Stephen Blackburn: Comprehensive testing is key. It is not only important to test the core functionality of the product, it is also essential to ensure the solution operates correctly over a cellular network which has different characteristics to a LAN, for instance.

In addition to solution testing, if the device is going to go through carrier certification to be permitted onto a specific carrier’s network, then the carrier will want to check that your device is a “good citizen” on the network. An example of this might be the behavior of the retry algorithm used by the device if it fails to connect to the application server in your data center. At this point the reader might wonder, “Why would a carrier care?” The thing is that carriers design for network capacity, assuming user access to their network follows a random distribution. If all users tried to access the network at the same time, congestion might result and block other users’ access to service. So imagine you are planning a massive rollout of hundreds of thousands or even millions of devices. Suppose your data center experiences an outage and all of your deployed devices lose contact with the network at the same time. If they all retry immediately, fail, and keep retrying immediately, this will put a synchronized load on local towers and the carrier network infrastructure, which could impact other users if it were to cause congestion. So designing a random back-off algorithm and testing it prior to certification is a good idea.
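
One common way to build such a random back-off is a capped exponential back-off with a uniformly random delay. The Python sketch below shows this variant. It is an illustration under assumptions, not a carrier requirement or the algorithm used by any particular provider: the function name, the base delay, and the cap are hypothetical, and the values that a given carrier expects would have to be agreed during certification.

```python
from __future__ import annotations

import random


def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 3600.0,
                  rng: random.Random | None = None) -> float:
    """Return a random delay in seconds before retry number `attempt` (0-based).

    The upper bound doubles with each failed attempt until it reaches `cap_s`.
    The actual delay is drawn uniformly from [0, bound], so that devices which
    lost their connection at the same moment do not retry in lockstep.
    """
    source = rng if rng is not None else random
    # Limit the exponent so very high attempt counts cannot overflow a float.
    bound = min(cap_s, base_s * 2 ** min(attempt, 32))
    return source.uniform(0.0, bound)
```

A device would wait backoff_delay(n) seconds after its n-th consecutive failed connection attempt and reset the counter after a successful connection. Passing a seeded random.Random instance makes the behavior repeatable in lab tests.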

Solution testing usually includes lab testing in a controlled environment, but it is important to include a field testing phase to test the solution in the environment in which it will later be deployed. During the lab testing phase, engineers typically ensure that the device has a good signal level, which will provide a good quality cellular connection and achieve consistent data rates across the air interface. This is fine for basic functional testing, but negative test cases should be introduced to force the poor signal quality conditions which are sometimes present in the field and lead to low bandwidth and long latency. Applications requiring near-real-time communication may not operate correctly under these conditions, so it is critical to consider this when designing the application, when selecting the technology (for example, 2G, 3G, and LTE all have different latency characteristics), and when testing the application.

After the lab testing is completed, and assuming everything operates as expected, there will be a high level of confidence that the device will operate as expected for the majority of its operation. However, there are those 5% of test cases which cannot be contrived in a lab, and only the confluence of events found in a field environment can expose those cases. So it is important to think about all of the use cases which describe the solution’s deployment, then think of a few more, and plan a limited pilot deployment to test those use cases. Test locations should include environments with poor and good quality cellular signals. Field testing a small device population will provide an indication of how the population of devices will behave after launch. Provided you select an appropriate sample size and appropriate deployment environments, if you see a problem with 1% of your devices then, within a margin of error, you are likely to see the same problem in 1% of your devices after launch. If you can address the problem and get the problem device count to an acceptable margin, then there is a high level of confidence that after launch the percentage of devices with problems will remain manageable.
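
The phrase “within a margin of error” can be made quantitative. The Python sketch below uses the Wilson score interval, one standard way of putting bounds on a proportion observed in a sample. This choice of method is an assumption of the sketch and is not part of the interview. It also assumes that the pilot devices are a representative random sample of the later deployment.

```python
from __future__ import annotations

import math


def wilson_interval(failures: int, sample_size: int,
                    z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a failure proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    p = failures / sample_size
    denom = 1 + z * z / sample_size
    centre = (p + z * z / (2 * sample_size)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / sample_size + z * z / (4 * sample_size ** 2)) / denom
    return max(0.0, centre - half_width), min(1.0, centre + half_width)
```

For the same observed failure rate, a larger pilot yields a narrower interval. A 1% failure rate seen in a pilot of 100 devices is therefore much weaker evidence about the post-launch failure rate than 1% seen in a pilot of 1,000 devices.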

Jim Morrish: Are there any pitfalls to be avoided during solution deployment and commissioning?

Stephen Blackburn: Issues which can occur during deployment depend on both the application and the deployment model. For instance, if the device is sold to consumers for a solution which is deployed in their home, it is a certainty that a percentage of home owners will place the device in the area of the home with the worst cellular coverage (e.g., their basement). It is important to provide simple instructions which explain how to place the device. If it is within budget, it is often worth putting some indicator on the device which lets the user know when they have placed it in an area with sufficient coverage.

Other applications may require the device to be placed in an area which isn’t optimal for cell tower coverage such as a manhole. These types of applications are typically installed by field technicians and it is important to equip them with a network scanning tool which can provide signal quality measurements for the networks covering an area. The technicians can use this to determine the signal quality in the location and determine the best placement.

Jim Morrish: How about after going live – what are the key things to plan for?

Stephen Blackburn: If the solution completed a pilot field test phase, the risk of major problems after launch has been greatly reduced. However, even if during testing you are confident there are no issues, it is likely that some of the end users will experience problems, including “operator error” type problems. So it is essential to set up appropriate support processes and train the support team to handle the inevitable calls from customers.

Support personnel will generally be trained on the product features and how to operate them, but it is also very important for the support team (usually tier 2) to receive training on how to diagnose connectivity issues. This is where a rich set of diagnostic tools from the carrier, if available, becomes a huge benefit. If your tier 2 engineer can log into a portal, check whether the device in question has registered on the carrier network and started a data session, and observe its recent behavior, they can immediately focus their investigation on the root of the problem and provide quick feedback to their customer. If these tools are not available, then a call to the carrier helps but is generally a lot slower.
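
The diagnostic sequence described in the answer above (registered on the network, data session started, recent behavior) can be expressed as a simple triage function. The Python sketch below is illustrative only: the CarrierStatus fields stand in for whatever a given carrier portal actually exposes, and the result labels and the reporting-interval threshold are assumptions of the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CarrierStatus:
    """Hypothetical snapshot of what a carrier portal reports for one device."""
    registered_on_network: bool
    data_session_active: bool
    minutes_since_last_traffic: float | None  # None if no traffic was ever seen


def triage(status: CarrierStatus, expected_report_interval_min: float) -> str:
    """Narrow a connectivity complaint down to the first failing stage."""
    if not status.registered_on_network:
        return "NOT_REGISTERED"
    if not status.data_session_active:
        return "NO_DATA_SESSION"
    last_traffic = status.minutes_since_last_traffic
    if last_traffic is None or last_traffic > expected_report_interval_min:
        return "NO_RECENT_TRAFFIC"
    return "CONNECTIVITY_OK"
```

A result of CONNECTIVITY_OK suggests that the support engineer should look at product features or operator error rather than at the communication link.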

On-Asset Components

Based on the input from the Initial Solution Design phase, this workstream is concerned with finalizing the design for the on-asset hardware and software. This can involve hardware manufacturing, local communication, selection and implementation of firmware and software, as well as integration and testing.

On-Asset Hardware

Ideally, the required sensors and actuators as well as gateways and other on-asset hardware will have already been identified as part of the Initial Solution Design phase. As part of this workstream, further testing and evaluation will generally need to be carried out in order to ensure that all elements are working together properly.

Depending on the outcome of the make-or-buy decision, a sourcing process may have to be initiated for external procurement of the required hardware components. Alternatively, it may be necessary to initiate a hardware design and prototyping process, followed by a hardware manufacturing and testing process.

On-Asset Operating System and Application Container

Selection of the operating system and application platform or container usually goes hand in hand with selection of the hardware. A number of M2M/IoT hardware providers offer integrated solutions, some of which even offer a backend management service that allows for remote management and remote firmware upgrades. Although we have grouped all tasks relating to application lifecycle management under the Solution Infrastructure workstream, there is a strong dependency between these two workstreams. The same also applies to lifecycle management of the application software running on the application platform. Finally, implementation of backup and restore functionality for local data should also be addressed at this point.

On-Asset Middleware

If you are using M2M or IoT middleware, the on-asset-specific elements of this middleware will have to be pre-installed and configured.

You will also need to set up and configure the communication protocols and ensure security certificate management for this middleware. These tasks can have strong dependencies with the Cross-Cutting Tasks and Solution Infrastructure workstreams.

Similarly, local processes for distributing and updating application software will have to be tested and integrated with the backend solution provided by the Solution Infrastructure workstream.

Device Integration

In many cases, the on-asset middleware will require the implementation of hardware-specific drivers, so that data and services offered by different devices can be made available to the backend in homogeneous format. The design and implementation of these local drivers and protocols should not be underestimated.
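
The following Python sketch illustrates the idea of hardware-specific drivers that deliver data in a homogeneous format. Both devices, their payload layouts, and the common reading format are invented for the example. Real drivers would implement whatever protocol the actual device speaks.

```python
import struct
from abc import ABC, abstractmethod


class DeviceDriver(ABC):
    """Translates one device's native payload into a common reading format."""

    @abstractmethod
    def read(self, raw: bytes) -> dict:
        ...


class CsvTemperatureDriver(DeviceDriver):
    """Hypothetical device that sends ASCII text: b"<celsius>,<battery_pct>"."""

    def read(self, raw: bytes) -> dict:
        celsius, battery = raw.decode("ascii").split(",")
        return {"temperature_c": float(celsius), "battery_pct": int(battery)}


class BinaryTemperatureDriver(DeviceDriver):
    """Hypothetical device that sends 3 bytes: a big-endian signed 16-bit
    temperature in tenths of a degree Celsius, then an unsigned 8-bit
    battery percentage."""

    def read(self, raw: bytes) -> dict:
        tenths, battery = struct.unpack(">hB", raw)
        return {"temperature_c": tenths / 10, "battery_pct": battery}
```

Because both drivers return the same dictionary layout, the middleware and the backend can process readings without knowing which device produced them.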

Also, assumptions made during the initial solution design phase about the accessibility of certain interfaces may not be correct. If it turns out that the asset needs to provide a certain interface, this information will need to be passed back to the Asset Preparation workstream. For example, we saw in one of our case studies that the initial design was based on the assumption that the asset would provide access to battery health data, which did not turn out to be the case. In this particular situation, it took quite some time to upgrade the asset so that it was capable of providing this interface, which in turn had an impact on the rollout of the entire solution.

On-Asset Business Logic

Implementation of on-asset business logic is another key task of this workstream. This can range from low-level embedded software to complex business logic including local business rules and data management. Companies such as Cisco are promoting the concept of “Fog” computing with powerful gateways that can store significant amounts of data on the gateway and process significant volumes of data locally. This relates to our discussion of the functional design phase, specifically the distribution of data and logic between the on-asset software and the backend. While the functional design provides an initial proposal, the details will have to be decided in this workstream. Because data and logic distribution has an impact on the Backend Services workstream as well, close interaction is required between both workstreams. We also recommend implementing central interface management as part of the project management process (see the Project Management section).

Backend Integration

For integrating the asset with the backend, the required protocols must be available locally on the asset. For some gateways, for example, this is not always a given. Another important task involves defining the relevant authentication and authorization procedures. As a rule, user management will also need to be synchronized with a central backend function.

Asset Preparation

As discussed in the introduction, the Ignite | IoT methodology assumes that the IoT project is implemented independently of the main organization responsible for the asset the solution is built on. Because it represents the organizational interface between the asset and the IoT solution, the Asset Preparation workstream is extremely important. It should be made an explicit workstream, with a clear definition of who owns it and who has to contribute to it. Contributions generally come from both the asset team and the solution team. Overall ownership should be assigned to one of the two teams.

The Asset Preparation workstream is required to look at the entire lifecycle of the asset and the solution, including analysis, (potentially integrated) design, asset manufacturing, solution implementation, support, and operation.

During the analysis phase, the asset – and potentially its environment – will need to be analyzed in order to determine the best way of mounting the solution’s on-asset hardware components, such as antennas, sensors, beacons, gateways, etc. This will involve consideration of aspects such as access to local bus connections and power supply.

If the solution requires a specialized network, the requirements for setting up this network will have to be examined. If the solution relies on beacons (or similar) for indoor positioning, the Cartesian coordinates of the exact positions of the beacons will have to be captured.

If the asset was designed from the ground up, or modified to accommodate the IoT solution, there will need to be close alignment between both teams during the design phase as well as clearly defined technical interfaces (on the hardware and/or software level).

In terms of integration between the asset and the IoT solution, it is worth considering the asset’s EBOM (Engineering Bill of Material). An asset will generally have a full EBOM stored in a PLM (or similar) system. For an IoT solution, the BOM may well consist of different sections: the on-board hardware, the on-board software, and the backend software, for example. The EBOM for the asset should include a reference to the on-board hardware at least.

Decisions will also need to be made in relation to the asset’s manufacturing process. Will the on-asset hardware components be attached to the asset as part of the asset’s manufacturing process? Or will they be retrofitted afterwards? In either case, provisioning of the on-asset components will have to be taken into account in the Supply Chain Process. Also, special assembly skills may be required and provision will need to be made to secure these.

In terms of a retrofitting approach, the Purfresh case study provides an interesting example. In this case, a set of specialized devices (gateway and sensors) had to be retrofitted onto containers onboard cargo ships. Purfresh set up a process whereby specialized handling agents perform this task in the short timeframe in which the container ships are docked at the harbor where the solution is to be installed. Note that this process not only includes local assembly but also activation and testing.

Finally, many IoT solutions are designed to improve after-sales processes. This can involve remote condition monitoring, predictive maintenance, and even digital after-sales services (see Introduction). The Asset Preparation workstream needs to be responsible for training the organizational teams involved in the new capabilities and ensuring seamless adoption – to the point of building up new organizational units, if required.