A collaborative project between water companies, WRc and AI vendors to develop a single dataset for sewer CCTV footage classification. The dataset is intended for training AI, improving the accuracy of AI software companies’ solutions and accelerating the adoption of this technology.
Collated coded CCTV survey footage from several UK water companies into a central storage location.
Spliced images from the coded CCTV survey footage using the metadata to indicate a defect.
Validated the defect images and metadata manually to ensure they were suitable for AI model training.
Sewers are critical infrastructure, and their condition must be understood if they are to be maintained and keep performing. Water companies spend over £20,000,000 per year inspecting their networks, and CCTV inspection of sewers provides a method for assessing asset condition.
To enable repairs and replacement to be completed efficiently, engineers classify the defects and features of sewers using the CCTV footage. The process is slow, labour intensive and repetitive. Accurate classification is important but can be difficult to achieve.
Artificial intelligence (AI) can be used to improve the accuracy of the classification and reduce the overall time required. AI is widely used for object and facial recognition in many sectors. AI models must be trained before they can perform object recognition, and training requires accurate data. Obtaining that data for sewer condition assessment is a significant obstacle to the adoption of this technology.
A collaborative project between water companies, WRc and AI vendors was commissioned to develop a single benchmark dataset and accessible repository for sewer CCTV footage classification. The dataset can be used for AI training purposes and will enable AI software companies to improve the accuracy of their solutions and accelerate the adoption of this technology.
Previously coded CCTV survey footage from several UK water companies was provided to WRc, which collated the footage into a central storage location. Images were spliced from the coded CCTV survey footage using the metadata. The defect images and metadata were then manually validated to ensure they were suitable for AI model training. Where the metadata for a defect was incorrect, it was corrected using a bespoke app that made correction easy while preserving the file format required for AI model training.
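The splicing step can be pictured as a lookup from each coded observation to a frame in the survey video. The following is a minimal sketch under stated assumptions, not the tooling used in the project: the record fields (`video`, `code`, `time_s`), the default frame rate of 25 frames per second and the output file naming are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One coded observation from a survey (hypothetical fields)."""
    video: str     # survey video file name
    code: str      # defect code assigned by the engineer
    time_s: float  # position of the observation in the video, in seconds


def frame_index(time_s: float, fps: float) -> int:
    """Convert a video timestamp to the nearest frame number (frame 0 is at 0 s)."""
    if time_s < 0 or fps <= 0:
        raise ValueError("time_s must be >= 0 and fps must be > 0")
    return round(time_s * fps)


def extraction_plan(observations, fps=25.0):
    """Return (video, frame index, output image name) for each observation."""
    plan = []
    for obs in observations:
        idx = frame_index(obs.time_s, fps)
        stem = obs.video.rsplit(".", 1)[0]  # drop the file extension
        plan.append((obs.video, idx, f"{stem}_{obs.code}_{idx:06d}.jpg"))
    return plan
```

A plan of this kind could then be passed to any video library to read and save the listed frames. Keeping the defect code in the file name is one simple way to keep each image tied to its metadata.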
The dataset, comprising an image library of more than 27,000 images, was hosted online by Spring. It is freely available, and usage data such as the number of downloads is recorded to help assess the value of the project.
Following discussions with seven UK water companies and four AI software providers, the main uses for which the dataset would be adopted were identified as:
i. Training AI models
ii. Validating existing AI models
iii. Evaluating existing AI models
It was noted that the file size of the image library may hinder adoption for some users. To reduce this barrier, the data within the image library can be downloaded in chunks.
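How the library is divided into chunks is not described here. As an illustration only, one simple approach is to group files in order until a size limit is reached. The function name, the `(name, size)` input format and the size limit below are assumptions.

```python
def chunk_by_size(files, max_bytes):
    """Group (name, size_in_bytes) pairs into consecutive chunks.

    Each chunk totals at most max_bytes, except that a single file larger
    than max_bytes is placed in a chunk of its own.
    """
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    chunks, current, current_size = [], [], 0
    for name, size in files:
        # Start a new chunk when adding this file would exceed the limit.
        if current and current_size + size > max_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Grouping files in their original order keeps the result predictable, so a user who has already downloaded the first few chunks can resume from the next one.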
To maximise the adoption potential of the dataset, it was noted that continued development of the library would be useful, growing the number of images for each defect code to at least 1,000.
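Progress towards that target could be tracked by counting images per defect code. The sketch below assumes the metadata can be reduced to a plain list containing one defect code per image; the function name is hypothetical, and only defect codes that already have at least one image are reported.

```python
from collections import Counter


def shortfall_by_code(codes, target=1000):
    """Return {defect code: images still needed} for codes below the target."""
    counts = Counter(codes)
    return {code: target - n for code, n in sorted(counts.items()) if n < target}
```

Codes that already meet the target are left out of the result, so an empty dictionary means every defect code present has reached it.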