Today I decided to play around with the BSDs. I have used many Linux distros and found them pretty much alike, with no big differences except for the package managers. Anyways, I hopped on

Recently, the idea of self-hosting was so intriguing that I decided to code a full-stack video platform. There are many object storage providers, including Amazon AWS, Akamai, DigitalOcean, Cloudflare, and Alibaba Cloud. Nearly all of

Mar 09, 2024
1968 words
Booting LineageOS (Finally)

I've failed countless times while experimenting with different phones and operating systems. Unlike computers, which share a universal architecture, each phone has its own unique structure. This makes installing operating systems on them exceedingly challenging. Furthermore,

Mar 04, 2024
134537 words
Random Writings in 2024

There is the BlockSite Chrome extension, which helps you block websites. But it is ineffective because: 1. Guest Mode has no extensions. 2. I can easily use another browser, like Firefox, Opera, Pale Moon, Slim, links2, etc. By changing

Mar 03, 2024
2398 words
Self Hosting and Consumer Apps

Cloud Tour Video: LLM Setup: Recently, the idea of self-hosting has fascinated me. This is because I find I can self-host pretty much any website easily, including forums, video platforms, and blogs. The only thing is the

Nov 29, 2023
452 words
Offline RL

Note: All YAML files are in the git repo: https://github.com/jimchen2/cs285-reinforcement-learning python cs285/scripts/run_hw5_explore.py \ python cs285/scripts/run_hw5_explore.py \ python cs285/scripts/run_hw5_explore.py \ The Random Network Distillation algorithm encourages exploration by training another neural network to approximate the output

Nov 24, 2023
521 words
Q Learning and SAC

Compute the action using epsilon-greedy: action = torch.tensor(random.randint(0, self.num_actions - 1)) or action = self.critic(observation).argmax(dim=1). Step the environment. Add data to the replay buffer: replay_buffer.insert(...). Sample from the replay buffer: batch = replay_buffer.sample(config["batch_size"]). Train agent, we update the

Nov 20, 2023
872 words
Policy Gradients

There are two kinds of estimators for policy gradients: full trajectory and "reward-to-go". We run the two configs on CartPole with different parameters. Specifically, rtg means reward-to-go, and na means normalizing the advantages.

Nov 18, 2023
371 words
Imitation Learning with DAgger

We run imitation learning and DAgger based on expert policies. In this experiment the expert policy is sampled directly from a trained neural network, so DAgger differs from real-world applications in that it

Oct 27, 2023
679 words
Visiting Ocean Beach

Today I visited Ocean Beach. I have been in a very bad mood and feeling distorted these days. I couldn't get any work done and kept thinking about living in remote, isolated regions. I hopped on a BART
