Sep 30, 2024
354 words
Model Based RL

python cs285/scripts/run_hw4.py -cfg experiments/mpc/halfcheetah_0_iter_layer_1_size_32.yaml
python cs285/scripts/run_hw4.py -cfg experiments/mpc/halfcheetah_0_iter_layer_1_size_16.yaml
python cs285/scripts/run_hw4.py -cfg experiments/mpc/halfcheetah_0_iter_layer_2_size_16.yaml
Get predictions
pred_obs_deltas_normalized = self.dynamics_modelsi
pred_obs_deltas = pred_obs_deltas_normalized * self.obs_delta_std + self.obs_delta_mean
pred_next_obs = obs + pred_obs_deltas
Action Selection
rewards = np.array([

Nov 29, 2023
452 words
Offline RL

Note: All YAML files are in the Git repo: https://github.com/jimchen2/cs285-reinforcement-learning python cs285/scripts/run_hw5_explore.py \ python cs285/scripts/run_hw5_explore.py \ python cs285/scripts/run_hw5_explore.py \ The Random Network Distillation algorithm encourages exploration by training another neural network to approximate the output

Nov 24, 2023
521 words
Q Learning and SAC

Compute action and use epsilon greedy
action = torch.tensor(random.randint(0, self.num_actions - 1))
action = self.critic(observation).argmax(dim=1)
Step environment
Add data to replay buffer
replay_buffer.insert(...)
Sample from replay buffer
batch = replay_buffer.sample(config["batch_size"])
Train agent, we update the

Nov 20, 2023
872 words
Policy Gradients

There are two kinds of estimators for policy gradients: full-trajectory and "reward-to-go". We run the two configs on CartPole with different parameters. Specifically, rtg means reward-to-go and na means normalizing the advantages.
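As a rough illustration of the difference between the two estimators, here is a minimal sketch in plain Python. The function names, the discount factor, and the toy rewards are assumptions made for this example and are not taken from the homework code. The sketch also normalizes the returns directly, as a stand-in for advantages computed without a baseline.

```python
from statistics import fmean, pstdev


def full_trajectory_returns(rewards, gamma):
    """Every timestep is weighted by the discounted return of the whole trajectory."""
    total = sum(gamma ** t * r for t, r in enumerate(rewards))
    return [total] * len(rewards)


def reward_to_go(rewards, gamma):
    """Timestep t is weighted only by the discounted rewards from t onward."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the sum already computed for t + 1.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit standard deviation (the "na" option)."""
    mean = fmean(values)
    std = pstdev(values)
    return [(v - mean) / (std + eps) for v in values]


# Hypothetical toy trajectory: three steps, reward 1.0 each.
rewards = [1.0, 1.0, 1.0]
full = full_trajectory_returns(rewards, gamma=0.5)  # [1.75, 1.75, 1.75]
rtg = reward_to_go(rewards, gamma=0.5)              # [1.75, 1.5, 1.0]
rtg_normalized = normalize(rtg)
```

With reward-to-go, later actions are no longer credited with rewards that were collected before they were taken, which typically lowers the variance of the gradient estimate.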

Nov 18, 2023
371 words
Imitation Learning with DAgger

We run imitation learning and DAgger based on expert policies. In this experiment the expert policy is sampled directly from a trained neural network, so DAgger differs from real-world applications in that it

Jun 01, 2023
1594 words
Berkeley in My Eyes

Downtown Berkeley is really ugly and there are many homeless people. They yell, beg for money, shout profanity, and sometimes run around nude. I could smell the scent when walking past them, like they hadn't

It has been a while since I posted, and class starts tomorrow. Today is Martin Luther King Day. Everybody takes a day off. Today is a fine day, the sun shining bright and clear

Jan 12, 2023
784 words
Visiting Berkeley High School

Today I went to BHS for a visit; I mean, I basically sneaked into the campus for a whole morning. Now I am going to talk about the impressions I had of that school. I

When I applied, I wasn't very serious; I just wanted to take a chance. I started the form on Oct 7th. At that time, I was still hesitating about whether to go to an exchange