
How we manage data in our Smart Building Platform – Digispace

– Dinesh Kumar, Software Engineering Manager

When you build a smart building platform like Digispace, the first thing you learn is that buildings are not standardised. The sensors are not standardised. The manufacturers are not standardised. And the data certainly is not standardised.
Digispace connects a wide range of devices: occupancy sensors, indoor air quality (IAQ) monitors, people flow counters, water meters, parking sensors, energy clamps, cloud-based access control systems, and traditional AV equipment such as displays, projectors, blinds, and DSPs. These devices come from different manufacturers, use different protocols, and produce data in completely different shapes.
For a while, we handled this the way most teams do at the start: we wrote bespoke integration code for each device type. It worked. But it did not scale. Every new manufacturer meant another one-off implementation. Every schema change upstream meant a code change downstream. The codebase was growing in the wrong direction — wider, not smarter.
So, we stepped back and asked a more useful question: what if the platform did not need to understand every device? What if it only needed to understand the data?

The Problem in Concrete Terms

To understand what we built, it helps to understand exactly what was breaking.
Inconsistent data formats. An occupancy sensor from Manufacturer A might report presence as a Boolean true/false. Manufacturer B might use 1/0. Manufacturer C might send a string: ‘occupied’ or ‘vacant’. Downstream, the platform needed to treat these identically, but the raw data looked completely different.

Different units of measurement. One IAQ sensor might report CO2 in parts per million (ppm). Another might report it in milligrams per cubic metre. Temperature arrives in Celsius from some devices and in Fahrenheit from others. Water consumption might come in litres or cubic metres, depending on the meter manufacturer.

Data arriving at different frequencies. Some sensors push data every second. Others every minute. Others only on state change. Building a consistent time-series view across all of them, without gaps or misleading spikes, required handling temporal normalisation as well as structural normalisation.

Schema mismatches between sources. A parking sensor payload might use a field called ‘status’. An access control event might use ‘state’. A display might use ‘power’. These are semantically the same concept – but structurally, they live in completely different places in completely different payloads.

Taken individually, each of these problems is manageable. Together, across dozens of device types, they compound quickly.
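To make the frequency problem concrete, here is a minimal sketch of one common way to align irregular readings onto a fixed time grid: each grid point takes the most recent reading, and carries it forward when nothing new has arrived. This is an illustration only, not a description of how Digispace implements temporal normalisation; the function name `resample` and the `{ t, value }` reading shape are assumptions.

```javascript
// Hypothetical sketch: align irregular readings onto a fixed time grid.
// Each grid point takes the latest reading at or before that time;
// grid points with no new reading repeat the previous value.
function resample(readings, startMs, endMs, intervalMs) {
  const sorted = [...readings].sort((a, b) => a.t - b.t);
  const out = [];
  let i = 0;
  let last = null; // nothing is known until the first reading arrives
  for (let t = startMs; t <= endMs; t += intervalMs) {
    while (i < sorted.length && sorted[i].t <= t) {
      last = sorted[i].value;
      i += 1;
    }
    out.push({ t, value: last });
  }
  return out;
}

// A state-change sensor that reported only twice in three minutes,
// resampled onto a one-minute grid.
const grid = resample(
  [{ t: 0, value: false }, { t: 90000, value: true }],
  0,
  180000,
  60000
);
```

Carry-forward suits sensors that report only on state change. For sensors that push every second, averaging within each interval may be a better fit.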

What We Built: The Digispace Driver Framework

Our answer was the Digispace Driver Framework: a normalisation layer that sits between raw device data and the platform, and that runs at both the edge layer and the front-end UI layer. The core idea is simple: instead of writing integration logic per device, we define a common set of data points – a canonical schema that represents what the platform cares about – and we write a driver that maps each device’s raw output to that schema. The platform only ever speaks to the canonical schema. The drivers handle everything else.
The framework has two key responsibilities:

  1. Normalisation. Convert incoming data — regardless of format, unit, or structure — into a standardised data point. Temperature is always Celsius. Occupancy is always boolean. CO2 is always ppm. The platform never has to handle unit conversion or field name ambiguity.
  2. Raw data preservation. We store the original, untransformed payload alongside the normalised version. This is important for debugging, for auditing, and for cases where the raw data contains information the normalised layer does not yet capture.
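The two responsibilities can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's actual code: the function names (`toBoolean`, `fahrenheitToCelsius`, `toDataPoint`), the data point shape, and the payload field names are all hypothetical.

```javascript
// Hypothetical converter: accept the three occupancy encodings
// described earlier and always return a boolean.
const toBoolean = (v) => {
  if (typeof v === "boolean") return v;
  if (v === 1 || v === 0) return v === 1;
  if (v === "occupied") return true;
  if (v === "vacant") return false;
  throw new Error(`Unrecognised occupancy value: ${String(v)}`);
};

const fahrenheitToCelsius = (f) => ((f - 32) * 5) / 9;

// Wrap a normalised value together with the untouched original payload.
function toDataPoint(type, value, unit, rawPayload) {
  return { type, value, unit, raw: rawPayload };
}

// Three manufacturers, three encodings, one canonical result.
const payloadA = { presence: true };
const payloadB = { presence: 1 };
const payloadC = { presence: "occupied" };
const points = [payloadA, payloadB, payloadC].map((p) =>
  toDataPoint("occupancy", toBoolean(p.presence), null, p)
);

// A Fahrenheit reading becomes Celsius; the raw payload is kept as-is.
const tempPoint = toDataPoint(
  "temperature",
  fahrenheitToCelsius(212),
  "celsius",
  { temp_f: 212 }
);
```

Throwing on an unrecognised value, rather than guessing, is one possible design choice: it makes a driver mismatch visible immediately instead of letting a wrong value flow downstream.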

 

How It Works Across the Stack

The framework is built in JavaScript, which was a deliberate choice – it allowed us to run the same driver logic in two very different environments without duplicating code.

At the edge layer:

Digispace deploys Node-RED on-premises in buildings where local processing is needed. The Driver Framework runs directly within Node-RED, meaning data is normalised at source – before it ever leaves the building. This reduces bandwidth, improves latency, and means the cloud platform receives clean, structured data rather than raw noise.

At the UI layer:

The same framework also runs within our React front end. For devices that push data directly to the cloud, the driver logic executes client-side, transforming payloads before they are rendered or stored. Consistency is maintained regardless of where in the stack the data enters.
The result is a single codebase, two execution environments, one canonical output.
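As a rough sketch of what "one codebase, two environments" can look like, the example below keeps the driver logic in a plain function with no environment-specific dependencies and wraps it in two thin adapters. The function names and payload shape are assumptions. The edge adapter mirrors the shape of a Node-RED Function node, which receives a `msg` object and returns it; how the shared code is actually loaded into Node-RED or React is not covered here.

```javascript
// Shared, environment-agnostic driver logic (hypothetical example).
function normaliseTemperature(payload) {
  const celsius =
    payload.unit === "F" ? ((payload.value - 32) * 5) / 9 : payload.value;
  return { type: "temperature", value: celsius, unit: "celsius", raw: payload };
}

// Edge adapter: same shape as a Node-RED Function node body, which
// receives `msg` and returns it to pass the message downstream.
function edgeHandler(msg) {
  msg.payload = normaliseTemperature(msg.payload);
  return msg;
}

// UI adapter: a plain function a React component could call
// before rendering a value.
function toDisplayText(payload) {
  const point = normaliseTemperature(payload);
  return `${point.value.toFixed(1)} °C`;
}

const edgeOut = edgeHandler({ payload: { value: 212, unit: "F" } });
const uiText = toDisplayText({ value: 21.5, unit: "C" });
```

Because both adapters call the same function, a fix to the conversion logic applies to the edge and the UI at once.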

Adding a New Device Type

This is where the framework pays for itself most clearly.
Before the framework existed, adding support for a new sensor meant touching multiple parts of the platform: the ingestion pipeline, the data model, the API layer, and often the front end too. A new manufacturer was a multi-day engineering task.
Now, adding a new device means writing a driver — a mapping file that declares how each field in the device’s payload corresponds to a canonical data point, and what transformations are needed to get there. The rest of the platform remains untouched.
A driver for a straightforward sensor can be completed in hours. Complex devices with rich payloads take longer, but the work is contained and predictable. The platform does not grow in complexity just because the building does.
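A driver of this kind might look something like the sketch below: a declarative mapping from canonical data point names to a source field path and a transform, plus one generic function that applies any driver to any payload. The driver format, the canonical name `active`, and the device field names are invented for illustration and are not the framework's real format.

```javascript
// Read a nested field using a dotted path such as "device.power".
function getPath(obj, path) {
  return path
    .split(".")
    .reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

// Hypothetical drivers: each maps a canonical data point to a source
// field and a transform. 'status' and 'power' land on the same name.
const parkingDriver = {
  active: { path: "status", transform: (v) => v === "occupied" },
};
const displayDriver = {
  active: { path: "device.power", transform: (v) => v === 1 || v === "on" },
};

// Apply any driver to any payload; the platform only sees the result.
function applyDriver(driver, payload) {
  const values = {};
  for (const [name, rule] of Object.entries(driver)) {
    values[name] = rule.transform(getPath(payload, rule.path));
  }
  return { values, raw: payload };
}

const parking = applyDriver(parkingDriver, { status: "vacant" });
const display = applyDriver(displayDriver, { device: { power: "on" } });
```

In this sketch, supporting a new device means adding one more driver object; `applyDriver` and everything downstream of it stay unchanged.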

What This Unlocked

The most immediate benefit was speed. New device integrations are faster, and they arrive already tested against the canonical schema rather than requiring platform-level validation. The second benefit was reliability. Because every data point flows through the same normalisation logic, anomalies are easier to detect. If a driver is producing unexpected output, it surfaces quickly – rather than causing subtle inconsistencies downstream that are hard to trace. The third benefit was flexibility. The raw data store means we can revisit historical data if the canonical schema evolves. We are not locked into today’s interpretation of the data – we can always go back to the source.
And the less obvious benefit: the framework made it possible to add new building types and new device categories without revisiting existing integrations. Occupancy sensors, water meters, and access control systems all coexist cleanly on the same platform because the platform only ever sees normalised data points.
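One way to picture how a misbehaving driver surfaces quickly is a check of every data point against the canonical schema. The sketch below is an assumption about what such a check could look like; the schema format and the value ranges are invented for illustration.

```javascript
// Hypothetical canonical schema: expected JavaScript type and,
// for numbers, a plausible range. The ranges are illustrative only.
const canonicalSchema = {
  occupancy: { type: "boolean" },
  temperature: { type: "number", min: -50, max: 80 }, // degrees Celsius
  co2: { type: "number", min: 0, max: 10000 }, // ppm
};

// Return a list of problems; an empty list means the point is valid.
function validatePoint(point) {
  const spec = canonicalSchema[point.type];
  if (!spec) return [`unknown data point type: ${point.type}`];
  const problems = [];
  if (typeof point.value !== spec.type) {
    problems.push(`expected ${spec.type}, got ${typeof point.value}`);
  } else if (spec.type === "number") {
    if (Number.isNaN(point.value)) problems.push("value is NaN");
    if (point.value < spec.min) problems.push("below minimum");
    if (point.value > spec.max) problems.push("above maximum");
  }
  return problems;
}

const ok = validatePoint({ type: "co2", value: 612 });
// A driver that forgot to convert the string encoding.
const wrongType = validatePoint({ type: "occupancy", value: "occupied" });
// A reading that looks like Kelvin slipping through as Celsius.
const outOfRange = validatePoint({ type: "temperature", value: 295 });
```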

Lessons Learned

A few things we would tell our past selves:
Define your canonical schema carefully and early. The temptation is to start building and define the schema as you go. Resist it. The canonical schema is a contract: everything downstream depends on it being stable and well-reasoned. We spent real time on this upfront and it saved us considerably more time later.

Preserve the raw data from day one. We added this as part of the initial design, and it has proved invaluable. When a driver has a bug, or when a device firmware update changes the payload format, having the original data means the problem is recoverable.

Edge and cloud are not as different as they look. Running the same JavaScript framework in Node-RED and React required some careful design, but the consistency it provides is worth the upfront effort. Treating both environments as first-class targets from the start avoided a divergence problem we would otherwise have had to solve later.

A normalisation layer is not a one-time project. The framework needs maintenance as device ecosystems evolve. Treat it as infrastructure, not a feature.
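The raw-data lesson is easiest to see in a recovery scenario. The sketch below assumes stored records keep the raw payload next to the normalised value, and shows how a corrected driver could rebuild the normalised values after a bug. The record shape and function names are hypothetical.

```javascript
// Hypothetical stored records: each keeps the raw payload it came from.
const stored = [
  { id: 1, raw: { temp: 68, unit: "F" }, value: 68 }, // bug: never converted
  { id: 2, raw: { temp: 21, unit: "C" }, value: 21 },
];

// Corrected driver that handles Fahrenheit properly.
function driverV2(raw) {
  return raw.unit === "F" ? ((raw.temp - 32) * 5) / 9 : raw.temp;
}

// Rebuild normalised values from raw payloads without mutating the store.
function reprocess(records, driver) {
  return records.map((r) => ({ ...r, value: driver(r.raw) }));
}

const repaired = reprocess(stored, driverV2);
```

Without the `raw` field, the first record would be indistinguishable from a genuine 68 °C reading and could not be repaired.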

Where We Are Now

Digispace currently supports occupancy sensors, IAQ monitors, people flow counters, water meters, parking sensors, cloud-based access control, energy clamps, building management systems (BMS), and a range of AV and building control devices. Each of these connects through the Driver Framework, producing a consistent, clean data layer that the platform and our partners can rely on.

The framework is not the most visible part of Digispace. It does not appear on dashboards. Partners do not interact with it directly. But it is the reason the platform can make the invisible visible – across any building, any device mix, at any scale.

That is usually how the most important infrastructure works. Quietly, consistently, and only noticed when it is not there.