Welcome to Vigges Developer Community - Open, Learning, Share

python - Unable to understand a part of code about Linear Regression from sentdex tutorials on machine learning

I was following sentdex's Machine Learning tutorials on YouTube. In the 5th part he does this:

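import math
import numpy as np
from sklearn import preprocessing

# Number of rows (days) to forecast: 1% of the dataset, rounded up.
forecast_out = int(math.ceil(0.01*len(df)))
print(forecast_out)

# Label for each row = value of forecast_col forecast_out rows later.
df['label'] = df[forecast_col].shift(-forecast_out)

X = np.array(df.drop(columns=['label']))
X = preprocessing.scale(X)
# Take the unlabeled tail first, then trim it off X.
X_lately = X[-forecast_out:]
X = X[:-forecast_out]

df.dropna(inplace=True)
y = np.array(df['label'])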

I got completely lost as to what he was trying to do here. With int(math.ceil(0.01*len(df))) he gets the number of days he wants to predict. After that he does df[forecast_col].shift(-forecast_out), and I couldn't follow anything from that point on.



1 Answer


There is not enough information here to be sure, but if this is a time series forecasting problem, then I think df[forecast_col].shift(-forecast_out) shifts the forecast column up by forecast_out rows (days). The label for a given day then becomes the value of that column forecast_out days later, which is the number you want to forecast. The last forecast_out rows have no future value to pull from, so their label is NaN; those are the rows that df.dropna(inplace=True) removes.
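To make the shift concrete, here is a minimal sketch on made-up data. The "Close" column, the ten prices, and forecast_out = 2 are assumptions chosen for illustration, not values from the tutorial, and the preprocessing.scale step is left out so the numbers stay readable.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 10 consecutive daily prices (not from the tutorial).
df = pd.DataFrame({"Close": [float(v) for v in range(100, 110)]})
forecast_col = "Close"

# Assumed horizon of 2 rows so the effect of the shift is easy to see.
forecast_out = 2

# Each row's label is the value of forecast_col two rows later;
# the last two rows have no future value, so their label is NaN.
df["label"] = df[forecast_col].shift(-forecast_out)

# Features are every column except the label.
X = np.array(df.drop(columns=["label"]))

# The last forecast_out rows have features but no label: keep them
# aside for predicting, and train only on the rows before them.
X_lately = X[-forecast_out:]
X = X[:-forecast_out]

# Drop the NaN-labelled rows so y lines up row for row with X.
df.dropna(inplace=True)
y = np.array(df["label"])

print(X.ravel())         # features used for training
print(y)                 # the same column, two days ahead
print(X_lately.ravel())  # most recent rows, to be predicted
```

Row i of X is paired with the value from row i + forecast_out, so a model trained on (X, y) learns to predict forecast_out days ahead, and X_lately holds the most recent days whose future values are not yet known.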

