Machine Learning Tutorials from A to Z - Part 4 ( Missing Data ) - Technic Hubs

Technic Hubs aims to provide hands-on experience in Machine Learning, Augmented Reality (AR), Virtual Reality (VR), Django (Web Development), Flutter and React (App Development), and the Internet of Things (IoT), with videos in Nepali.

Monday, November 13, 2017

Machine Learning Tutorials from A to Z - Part 4 ( Missing Data )

Booom!! What the heck is going on, everybody!! Welcome to Machine Learning Tutorial Part 4, which is about taking care of missing data. Our dataset also has some missing values, which show up as nan. You can check your data.csv file, or print X in the console, since we already saved the values in X and y while importing the dataset. We need complete data for the model to work properly and to reduce errors. For any system to learn, it should have clean data.
How should we take care of missing data then? It's very simple. For the column that has a missing value, you calculate the mean of all its values except the missing one, and then replace the missing value with that mean. We could write a program to find the mean ourselves, but we have libraries to support us. For this we use the sklearn library and its preprocessing sub-library. To import this library, type:
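Besides printing X, you can also count the missing values directly. The snippet below is only a minimal sketch: the small array is a made-up stand-in for the numeric columns of X (not the real values from data.csv), and it assumes those columns are stored as floats, because np.isnan does not work on text columns.

```python
import numpy as np

# Hypothetical stand-in for the numeric columns of X (not the real data.csv values).
X = np.array([
    [44.0, 72000.0],
    [27.0, np.nan],
    [np.nan, 54000.0],
    [38.0, 61000.0],
])

# np.isnan marks every missing entry with True.
missing_mask = np.isnan(X)

# Summing the mask over axis=0 counts the missing entries in each column.
missing_per_column = missing_mask.sum(axis=0)

print(missing_mask)
print(missing_per_column)
```

If every count is zero, there is nothing to fill in and the imputation step can be skipped.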
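# Imputer exists only in scikit-learn versions before 0.22; newer releases provide sklearn.impute.SimpleImputer instead
from sklearn.preprocessing import Imputer
# Replace every NaN with the mean of its column (axis=0)
imputer = Imputer(missing_values='NaN', strategy='mean', axis=0)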

Here, Imputer is a class in the preprocessing sub-library. We make an object of the class Imputer and call it imputer. Capital 'I'mputer and small 'i'mputer are different things: Imputer is the class, while imputer is the object. Imputer takes the parameter missing_values, which we set to 'NaN' because that is how missing entries appear in our data.csv dataset. The strategy parameter is the calculation we perform, i.e. the mean. Finally, axis=1 means rows and axis=0 means columns. We use axis=0 because we have to find the mean of each column.
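If you are on a recent scikit-learn release, the same idea can be sketched with SimpleImputer from sklearn.impute. Treat this as an illustration under assumptions, not as the tutorial's original code: the array is made-up sample data rather than the real data.csv values, and the names X_filled and column_means are only for this example. The hand-computed column means from np.nanmean are printed alongside so you can compare them with what the imputer fills in.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical numeric columns (for example age and salary); not the real data.csv values.
X = np.array([
    [44.0, 72000.0],
    [27.0, np.nan],
    [np.nan, 54000.0],
    [38.0, 60000.0],
])

# SimpleImputer has no axis argument: it always works column by column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# fit learns each column's mean (ignoring NaN); transform fills the gaps with it.
X_filled = imputer.fit_transform(X)

# The same column means computed by hand, skipping the NaN entries.
column_means = np.nanmean(X, axis=0)

print(X_filled)
print(column_means)
```

Note that creating the imputer object alone does not change the data. The values are only replaced when the fitted object transforms X, which is what fit_transform does here in one step.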
