We use the alias "pd", the usual shorthand for pandas, and call its read_csv() method as in the code above. The dataset file is Data.csv.
So we have four columns: Country, Age, Salary, and Purchased. We also have ten observations (rows). Keep in mind that indexes in Python start at zero.
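As a quick check of the points above, here is a minimal sketch that loads the data and inspects it. Since Data.csv itself is not shown here, the sketch reads from an in-memory string instead; the rows in it are invented for illustration, and only the column names and the row count come from this tutorial. The names csv_text and dataset are also just assumptions for the example.

```python
import io

import pandas as pd

# Hypothetical stand-in for Data.csv: the values are made up for illustration.
# Only the four column names and the ten rows match the tutorial's description.
csv_text = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,63000,Yes
France,35,58000,Yes
Spain,31,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,67000,Yes
"""

# With the real file on disk this would be: dataset = pd.read_csv("Data.csv")
dataset = pd.read_csv(io.StringIO(csv_text))

print(dataset.shape)             # (rows, columns)
print(dataset.columns.tolist())  # the column names
print(dataset.index[0])          # label of the first row; counting starts at zero
```

The shape comes back as a (rows, columns) pair, so ten observations and four columns show up as (10, 4), and the first row has index 0 rather than 1.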
There is something very important to understand about machine learning in Python: once we have a dataset, we need to separate the matrix of features from the dependent variable vector. We are going to create the matrix of the three independent variables and simply call it "X". We also create the dependent variable vector, which is the last column with its ten observations.
Below is how to write the code.
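For the variable "X" (the independent variables), we take all the rows of the data, and -1 means we leave out the last column, so we keep only the first three columns. For the variable "y" (the dependent variable), 3 means we take only the column at index three, which is the fourth column because counting starts at zero.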
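The split described above can be sketched as follows. This assumes the selection is done with pandas' iloc indexer and that .values is used to get NumPy arrays, which fits the "-1" and "3" in the explanation but is not confirmed by this post. The sample rows are again invented so that the example runs without Data.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for Data.csv: the values are made up for illustration.
csv_text = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,63000,Yes
France,35,58000,Yes
Spain,31,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,67000,Yes
"""
dataset = pd.read_csv(io.StringIO(csv_text))

# Matrix of features: every row (:) and every column except the last one (:-1).
X = dataset.iloc[:, :-1].values

# Dependent variable vector: every row (:) and only the column at index 3.
y = dataset.iloc[:, 3].values

print(X.shape)  # ten rows, three feature columns
print(y.shape)  # ten values, one per observation
```

Writing the feature slice as :-1 instead of listing columns 0 to 2 means it still works if more feature columns are added later, as long as the dependent variable stays in the last column.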
OK, we have imported the dataset and prepared the data correctly. See you in the next tutorial.