National Building Control Office Open Data as an input for Building Fire Risk Prediction

The National Building Control Office (NBCO) hosts an Open Data Portal at https://data.nbco.gov.ie/.

Access to quality data on housing development, building activity and completions is a key requirement for effective decision-making and the NBCO is now making our data available to all stakeholders at the same time to support evidence-informed decision making on all forms of development.

Might this be of use as a source of quality data for the purposes of performing Dwelling and non-Dwelling Fire Risk Prediction? This blog post outlines the various datasets, and how they might be used for analysis.

Accessing the Data

There is only one dataset available through this portal at the moment, and it has the somewhat cryptic title of BuildingsCNsCCCs. This represents Building Commencement and Completion Data, dating from 2014. It’s a fairly beefy file in CSV format, measuring 104 MB at time of writing. This is certainly something that could benefit from being surfaced as an API, in a similar vein to the Valuation Office’s offering - https://opendata.valoff.ie/api/.
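For a file of this size it can be handy to check the shape before loading it into anything heavier. The following is a minimal sketch using only the Python standard library; the function name csv_shape, the local file name and the UTF-8 encoding are assumptions for illustration, not something the portal documents.

```python
import csv
import io


def csv_shape(handle):
    """Return (record_count, column_count) for CSV text read from an open handle.

    The file is streamed row by row, so memory use stays flat even for a
    file of 100 MB or more. The first row is treated as the header.
    """
    reader = csv.reader(handle)
    header = next(reader, None)
    if header is None:  # completely empty file
        return 0, 0
    records = sum(1 for _ in reader)
    return records, len(header)


# Tiny in-memory stand-in for the real download (made-up rows).
sample = io.StringIO(
    "id,proposed_use,lat,lon\n"
    "1,residential_dwellings,53.3,-6.2\n"
    "2,other,,\n"
)
print(csv_shape(sample))

# Against the real file (name and encoding assumed):
# with open("BuildingsCNsCCCs.csv", newline="", encoding="utf-8") as f:
#     print(csv_shape(f))
```

Streaming through csv.reader rather than splitting lines by hand matters here, because free-text fields such as addresses may contain quoted commas or line breaks.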

First Impressions

Size: This dataset has 94 columns and just over 100,000 records.

Summary: Due to the large size of this table, the R package summarytools was used to look at the data and get a first impression. Some interesting features of the dataset are listed below:

[Image: proposed_use.PNG]
  1. There are 95 categories for “proposed use of building”, with residential_dwellings making up the bulk of the data at 78%. It’s disappointing to see other as the second most frequent at 5%, but there is a sub-group column also.

  2. Type of construction could be a valuable data point, particularly if it is possible to isolate timber-frame construction buildings.

  3. The height field has a median of 7 (presumably metres?). There is also a column for height of top floor.

  4. It may be possible to determine change of use, from the existing to proposed columns.

  5. There are fields for the number of storeys below ground and the number of storeys above ground.

  6. Cladding System is listed.

  7. Number of rooms and number of bedrooms are listed.

  8. Heat Supply is listed.

  9. Fire engineered solution is a column.

  10. Location Data is available for 86.5% of the data (lat, lon), Eircode for 18.5%. There are some significant issues with this though, and it is a pity that the NBCO could not invest more time and effort into cleaning this data, or validating it at source.
    Looking in detail at Dublin Data, the precision of the majority of the coordinate pairs is at one or two decimal places, making it difficult to believe that this is a useful dataset unless it is extensively geocoded and validated.
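The coordinate precision issue in the last point can be checked with a short script. This is a rough sketch rather than the method used for the chart: it assumes the coordinates are read as text from columns named lat and lon (the exact headers may differ), and the helper names are made up for illustration. As a rule of thumb, one decimal place of latitude is roughly 11 km and two decimal places roughly 1.1 km, which is far too coarse to pin down a building.

```python
from collections import Counter


def decimal_places(value):
    """Count the digits after the decimal point in a coordinate string."""
    text = value.strip()
    if "." not in text:
        return 0
    return len(text.split(".", 1)[1])


def precision_profile(rows, lat_key="lat", lon_key="lon"):
    """Count records by the coarser of their lat/lon decimal-place counts.

    rows is any iterable of dicts, e.g. a csv.DictReader. Records missing
    either coordinate are counted under the key None.
    """
    profile = Counter()
    for row in rows:
        lat = (row.get(lat_key) or "").strip()
        lon = (row.get(lon_key) or "").strip()
        if not lat or not lon:
            profile[None] += 1
        else:
            profile[min(decimal_places(lat), decimal_places(lon))] += 1
    return profile


# Made-up example rows, not taken from the dataset.
rows = [
    {"lat": "53.3", "lon": "-6.26"},
    {"lat": "53.349805", "lon": "-6.26031"},
    {"lat": "", "lon": ""},
]
print(precision_profile(rows))
```

The coordinates need to be kept as strings for this to work. Parsing them to floats first would discard trailing zeros and can introduce spurious digits, either of which distorts the count.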

Location Data Quality

Similarly, looking at the quality of Eircode values, the quality has been improving steadily since 2018, but it is still not great.
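One way to put a number on Eircode quality is the share of records per year whose Eircode at least has a plausible format. The sketch below is an assumption about how that could be measured, not a description of how the chart was produced: the pattern is a commonly used format check (a routing key followed by a four-character identifier), it does not confirm that the Eircode actually exists, and the column names year and eircode are hypothetical.

```python
import re
from collections import defaultdict

# Format-only check: routing key (letter plus two digits, or the special
# case D6W), an optional space, then four characters from the Eircode
# alphabet. Passing this test does not mean the Eircode is real.
EIRCODE_RE = re.compile(r"(?:[AC-FHKNPRTV-Y][0-9]{2}|D6W) ?[0-9AC-FHKNPRTV-Y]{4}")


def is_plausible_eircode(value):
    """True if value looks like an Eircode once trimmed and upper-cased."""
    return EIRCODE_RE.fullmatch((value or "").strip().upper()) is not None


def valid_share_by_year(rows, year_key="year", eircode_key="eircode"):
    """Return {year: share of records with a plausibly formatted Eircode}."""
    totals = defaultdict(int)
    valid = defaultdict(int)
    for row in rows:
        year = row.get(year_key)
        totals[year] += 1
        if is_plausible_eircode(row.get(eircode_key)):
            valid[year] += 1
    return {year: valid[year] / totals[year] for year in totals}


# Made-up example rows, not taken from the dataset.
rows = [
    {"year": "2019", "eircode": "D02 X285"},
    {"year": "2019", "eircode": ""},
    {"year": "2015", "eircode": "n/a"},
]
print(valid_share_by_year(rows))
```

A stricter check would look each value up against an address database, which is where most of the cleaning effort would likely go.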

Eircode Quality

Verdict: Poor!

Significant effort will be required to render this dataset useful as a source of features for Fire Risk predictive modelling, given the poor quality of the location data. It is disappointing that National Agencies do not set greater store by the location quality of the data they use or, if they do and have geolocated it for internal use, that it is not being released in that format for public consumption.

