Building a framework to detect faulty IoT devices using Machine Learning.

I found a dummy dataset of around 100 temperature readings on the web, and I'm expanding it as we speak to around 10,000 entries.
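One simple way to grow a small dataset like this is to resample the existing readings and add a little random noise to each copy. The snippet below is only a sketch of that idea, not necessarily how the real dataset was expanded: the function name `expand_readings`, the `jitter` value and the seed readings are all made up for illustration.

```python
import random

def expand_readings(seed_readings, target_size, jitter=0.5, rng=None):
    """Grow a small list of temperature readings to target_size entries.

    Each new entry is a randomly chosen seed reading plus Gaussian noise
    with standard deviation `jitter` (an assumed value, in degrees).
    """
    rng = rng or random.Random()
    expanded = list(seed_readings)  # keep the original readings first
    while len(expanded) < target_size:
        base = rng.choice(seed_readings)
        expanded.append(round(base + rng.gauss(0.0, jitter), 2))
    return expanded

# Hypothetical seed data standing in for the ~100 readings found online.
seed = [20.0 + 0.1 * i for i in range(100)]
big = expand_readings(seed, 10_000, rng=random.Random(42))
print(len(big))  # 10000
```

Resampling with noise keeps the overall distribution of the seed data, so it adds volume but no new information. That is one reason to stay doubtful about whether 10,000 such entries are really "enough".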


I doubt even that is enough for ML algorithms to analyze and predict stuff. There are 2 main parts on my end: data collection and faulty device prediction. I will be using an IoT simulator based on a Raspberry Pi 3 (RPi3) for data collection. I wrote a Python server script on an AWS EC2 cloud server to fetch every log from the RPi3.
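To give a feel for the data collection side, here is a minimal sketch of a log-collecting server using Python's standard `socketserver` module. It is not the actual script running on EC2. The wire format (one JSON object per line), the class names `LogHandler` and `LogServer`, the helper `send_reading` and the in-memory `records` list are all assumptions made for the example.

```python
import json
import socket
import socketserver

class LogHandler(socketserver.StreamRequestHandler):
    """Reads newline-delimited JSON readings from one client connection."""

    def handle(self):
        for raw in self.rfile:
            line = raw.decode("utf-8").strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines instead of crashing
            self.server.records.append(record)

class LogServer(socketserver.ThreadingTCPServer):
    """TCP server that keeps every received reading in memory."""
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, LogHandler)
        self.records = []  # a real script would write to a file or database

def send_reading(host, port, record):
    """Client side (what the RPi3 would run): send one reading as a JSON line."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall((json.dumps(record) + "\n").encode("utf-8"))
```

On the server you would create `LogServer(("0.0.0.0", 9000))` and call `serve_forever()` on it (port 9000 is arbitrary), while the RPi3 calls `send_reading` with the server's address for each new log entry.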


But the data collected from the RPi3 is not enough for the prediction problem. That’s where my dummy dataset comes in >.> I have split it between 2 sensors and given more faulty ‘weight’ to the 2nd sensor while modifying it, so if the ML part works right, it should predict that the 2nd sensor is broken when I provide an abnormal input such as a very low sub-zero temperature or a very high one. I have labelled each entry as either “faulty” or “ok”. This helps in training the model first. After that, I feed in data without the device status (faulty/ok) to score the model, so it has to predict the status for whatever input I give it. Even if the input is not one from that dataset of 10,000 entries, it should accurately predict whether the device is faulty or about to die. To get a basic idea of what I just said, refer to my paper
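The train-then-score flow can be sketched in plain Python. This is only an illustration under loud assumptions: the fault rates per sensor, the temperature ranges and the choice of a tiny Gaussian naive Bayes classifier over temperature are all picked for the example, and the actual project may use a different algorithm and different features.

```python
import math
import random

def make_dataset(n, rng):
    """Synthetic (sensor, temperature, label) rows; all rates are assumptions."""
    fault_rate = {1: 0.05, 2: 0.30}  # sensor 2 gets more faulty 'weight'
    rows = []
    for _ in range(n):
        sensor = rng.choice([1, 2])
        if rng.random() < fault_rate[sensor]:
            # faulty readings: far too cold or far too hot
            temp = rng.uniform(-60, -20) if rng.random() < 0.5 else rng.uniform(60, 120)
            label = "faulty"
        else:
            temp = rng.gauss(22.0, 3.0)
            label = "ok"
        rows.append((sensor, temp, label))
    return rows

def train(rows):
    """Fit one Gaussian per label over temperature, plus the label prior."""
    model = {}
    for label in ("ok", "faulty"):
        temps = [t for _, t, lab in rows if lab == label]
        mean = sum(temps) / len(temps)
        var = sum((t - mean) ** 2 for t in temps) / len(temps)
        model[label] = (mean, var, len(temps) / len(rows))
    return model

def predict(model, temp):
    """Return the label with the highest log posterior for one temperature."""
    def log_post(label):
        mean, var, prior = model[label]
        return (math.log(prior) - 0.5 * math.log(2 * math.pi * var)
                - (temp - mean) ** 2 / (2 * var))
    return max(model, key=log_post)

rng = random.Random(0)
rows = make_dataset(10_000, rng)
train_rows, test_rows = rows[:8_000], rows[8_000:]
model = train(train_rows)

# Score held-out rows without showing the model their labels.
correct = sum(predict(model, t) == lab for _, t, lab in test_rows)
print(f"held-out accuracy: {correct / len(test_rows):.3f}")

# Inputs that were never in the dataset.
for temp in (-45.0, 22.5, 95.0):
    print(temp, "->", predict(model, temp))
```

Because the model only looks at temperature here, sensor 2 ends up flagged more often simply because more of its readings are abnormal. A real version would probably add features such as the sensor ID or the recent history of readings.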