I am trying to predict real-time electricity prices from a variety of features, using XGBoost and an LSTM on this time-series data. I first tried to build a baseline model with linear regression, but my target variable is not normally distributed, and even after several approaches such as a log transformation I could not normalize the y variable, so I moved on to XGBoost and the LSTM. For feature creation, I one-hot encoded the datetime column and added extra features such as lags of y and lags of a rolling mean of y. I also ran automated hyperparameter tuning. Here are the results on the test set:

XGBoost - RMSE: 76.047788967971, MAE: 23.974193481494265, MAPE: 1072.247447050217, R^2: -0.24848103246590503
LSTM (Sequential) - RMSE: 74.7214270995582, MAE: 17.199546833912457, MAPE: 552645178.2760035, R^2: -0.20531089121045043

Can anyone help?
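For reference, a minimal sketch of the kind of feature pipeline described above; the column names (`timestamp`, `price`), file name, and window sizes are assumptions for illustration, not the asker's actual code:

```python
import pandas as pd

# Hypothetical frame: a datetime index and the target price column.
df = pd.read_csv("prices.csv", parse_dates=["timestamp"], index_col="timestamp")

# One-hot encode calendar components of the datetime index.
df["hour"] = df.index.hour
df["dayofweek"] = df.index.dayofweek
df = pd.get_dummies(df, columns=["hour", "dayofweek"])

# Lag of y, and lag of a rolling mean of y (shifted so no future data leaks in).
df["price_lag_1"] = df["price"].shift(1)
df["price_lag_24"] = df["price"].shift(24)
df["price_rollmean_24_lag_1"] = df["price"].rolling(24).mean().shift(1)

df = df.dropna()
```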
What is the sequence of the data for the LSTM?
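For context, the `X_train_reshaped` used in the code below is presumably a 3-D array of sliding windows with shape `(samples, timesteps, features)`. A minimal sketch of one way such sequences might be built, where the window length `n_steps` is an assumption:

```python
import numpy as np

def make_sequences(features: np.ndarray, target: np.ndarray, n_steps: int = 24):
    """Turn a 2-D (samples, features) array into (samples, n_steps, features)
    windows, each paired with the target value that follows the window."""
    X, y = [], []
    for i in range(n_steps, len(features)):
        X.append(features[i - n_steps:i])
        y.append(target[i])
    return np.array(X), np.array(y)

# Hypothetical usage:
# X_train_reshaped, y_train = make_sequences(train_features, train_target)
```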
```python
import kerastuner as kt  # newer releases are published as keras_tuner
from tensorflow import keras
from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout
from keras.regularizers import l2
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping  # was missing from the original imports

def build_model(hp):
    model = Sequential()
    # You can choose a range for the number of units in your LSTM layer
    model.add(LSTM(
        units=hp.Int('units', min_value=32, max_value=512, step=32),
        return_sequences=True,
        input_shape=(X_train_reshaped.shape[1], X_train_reshaped.shape[2]),
        kernel_regularizer=l2(hp.Float('l2', min_value=1e-5, max_value=1e-2, sampling='log'))
    ))
    model.add(Dropout(hp.Float('dropout_1', min_value=0.1, max_value=0.5, step=0.1)))

    # Adding a second LSTM layer and some Dropout regularization.
    # Note: this layer needs its own hyperparameter name; reusing 'units'
    # would force it to the same value as the first layer.
    model.add(LSTM(units=hp.Int('units_2', min_value=32, max_value=512, step=32),
                   return_sequences=False))
    model.add(Dropout(hp.Float('dropout_2', min_value=0.1, max_value=0.5, step=0.1)))

    # Adding a dense hidden layer
    model.add(Dense(units=hp.Int('dense_units', min_value=32, max_value=256, step=32),
                    activation='relu'))

    # Output layer
    model.add(Dense(units=1))

    # Compile the model
    learning_rate = hp.Float('learning_rate', min_value=1e-5, max_value=1e-2, sampling='log')
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss='mean_squared_error')
    return model

# Create a new directory for the tuning session to avoid loading old results
new_directory = 'my_new_dir'
new_project_name = 'lstm_hyperparam_tuning_new'

tuner = kt.RandomSearch(
    build_model,
    objective='val_loss',
    max_trials=5,
    executions_per_trial=3,
    directory=new_directory,
    project_name=new_project_name)

# Start the hyperparameter tuning process
tuner.search(
    X_train_reshaped, y_train,  # Make sure you're using the reshaped y_train if necessary
    epochs=10,
    validation_data=(X_test_reshaped, y_test),  # Again, use reshaped y_test if needed
    callbacks=[EarlyStopping(monitor='val_loss', patience=5)]
)

# Retrieve the best model.
best_model = tuner.get_best_models(num_models=1)[0]

# Summary of the best model
best_model.summary()
```
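To reproduce the metrics quoted in the question, the best model could be evaluated along these lines. This is a sketch, assuming `y_test` is on the original price scale; note that MAPE divides by the actual price, so prices at or near zero make it explode, which likely explains the enormous MAPE values reported:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_pred = best_model.predict(X_test_reshaped).ravel()

rmse = np.sqrt(mean_squared_error(y_test, y_pred))
mae = mean_absolute_error(y_test, y_pred)
# MAPE is undefined/unstable when actual prices are near zero,
# which can produce astronomical values like those quoted above.
mape = np.mean(np.abs((y_test - y_pred) / y_test)) * 100
r2 = r2_score(y_test, y_pred)

print(f"RMSE: {rmse:.3f}, MAE: {mae:.3f}, MAPE: {mape:.3f}, R^2: {r2:.3f}")
```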