Skip to content

DQN can't find a good policy #11

@frostyduck

Description

@frostyduck

According your advice, I switched to Stable-Baselines instead of openAI baseline in the Kundur system training.

def main(learning_rate, env):
    tf.reset_default_graph()  
    graph = tf.get_default_graph()

    model = DQN(CustomDQNPolicy, env, learning_rate=learning_rate, verbose=0)
    callback = SaveOnBestTrainingRewardCallback(check_freq=1000, storedData=storedData)
    time_steps = 900000
    model.learn(total_timesteps=int(time_steps), callback=callback)

    print("Saving final model to: " + savedModel + "/" + model_name + "_lr_%s_90w.pkl" % (str(learning_rate)))
    model.save(savedModel + "/" + model_name + "_lr_%s_90w.pkl" % (str(learning_rate)))

However after 900000 steps of training DQN agent cannot find a good policy. Please see average reward progress plot

https://www.dropbox.com/preview/DQN_adaptivenose.png?role=personal

I used the following env settings

case_files_array.append(folder_dir +'/testData/Kundur-2area/kunder_2area_ver30.raw')
case_files_array.append(folder_dir+'/testData/Kundur-2area/kunder_2area.dyr')
dyn_config_file = folder_dir+'/testData/Kundur-2area/json/kundur2area_dyn_config.json'
rl_config_file = folder_dir+'/testData/Kundur-2area/json/kundur2area_RL_config_multiStepObsv.json'

Mu suggestion is that in the baseline scenario kunder_2area_ver30.raw (without system loading), short circuit might not lead to loss of stability during the simulation. Therefore, (perhaps) DQN agent finds a "no action" policy, that so as not to receive the actionPenalty = 2.0. Because according the reward progress plot, during training agent cannot find a policy better than mean reward 603.05. When testing, mean_reward = 603.05 means "no action" policy (please see figure bellow)

https://www.dropbox.com/preview/no%20actions%20case.png?role=personal

However it's only my suggestion, I can wrong. I thought to try scenarios with increasing load in order to get for sure loss of stability during simulation.

Originally posted by @frostyduck in #9 (comment)

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions