Cloud Tour Video:
LLM Setup:
Recently, the idea of self-hosting fascinates me. This is because I find I can self-host pretty much all websites easily, including forums, video platforms, blogs. The only thing is the cost, which, actually isn't that much if hosting small instances.
For example, purchasing 1 TB of object storage is only 15 dollars one month on Cloudflare R2. Akamai Linode offers this at 20 dollars a month for 1 TB object storage. AWS offers similar pricing around 10 dollars to 20 dollars (it has so many S3 models). Notably, Dropbox even has a cheaper price of $9.99 a month for 2 TB. There are also cheaper object storages out there, except their performance and stability are not known. But Dropbox is proprietary, and I have no idea what's going on, and Dropbox points my files to other files in the cloud with the same Hash values, and Dropbox GUI can sometimes be non responsive, which is annoying.
When I was a child, I didn't really know and think about such things, with evil Windows making everything foreign and mysterious for me. I didn't know what was going on in those websites, and that made me succumb to them. But over-relying is not a good practice, and I shall learn to be independent and strong. That is what I dislike about Chinese tech companies, and also some Google and Microsoft services. They do create awesome available services, only for the individual to be used at the cost of losing one's own mind and freedom in the digital Internet.
The sort of over-reliance on manipulating, proprietary, and buggy apps makes me feel uncomfortable. It is this discomfort that drives the urge to go self-host. My website has few unique visitors and few comments so far.
Also I am hosting platforms primarily for my personal utility like FreshRSS or Task Manager or Metabase, not for other people to look or join.
Invidious/Piped
There are loads of instances out there, and the one I use most frequently is yewtu.be. Invidious doesn't store videos on its own servers; it traffics the YouTube videos. But Invidious frequently encounters bugs, which, of course, is because to YouTube's desire for monopoly.
Mastodon
Mastodon is like a Twitter alternative. Nitter stopped functioning, so it probably died. But there are lots of other small forums good for a niche community.
Video-Hosting
There are lots of video hosting platforms out there, thousands of millions, judging by the large amounts of adult websites. Using Peertube or MediaCMS or YouPHPTube are some mature hosting platforms, but they require login to post videos by default. Anyways, raw coding is quite easy with the MERN stack and object storage.
Hosting MongoDB
MongoDB Atlas is easy to use, however, self hosting one is far easier to manage and control.
- Hook up a ec2 instance, expose port 27017
- Install the MongoDB Docker Image, configure admin password
- Start Docker Image
Then access it with :27017/test?authSource=admin
Portainer: Managing Containers
So like I quickly hooked up 20 apps on a ec2, and some of them are my own, and a web interface, in this case, greatly helped me mitigate the complexities.
Reverse Proxy
I personally preferring configuring Nginx myself so I usually run docker with external ports and then map those ports to websites. The Nginx configuration files might also be causing some troubles as some point.
API vs Web Interface
I found myself appreciating API much more than the actual web interface. I mean, using an API provides minimal distraction and are generally more efficient, like you can tune it for your specific needs(like the different endpoints).
Also I found myself not willing to pay for services(like I am stopping paying ChatGPT), but more like paying and using the API service. The ChatGPT web interface frequently freeze and goes into bugs.
Security
Basically expose only the 443 and 80 port and the ssh port (I also exposed port 27017). Exporting too many port is generally fine too but like that's not the default on AWS so I guess I should follow the default.
So Why Not Go Back to Popular Platforms?
There are multiple reasons, and I thought I already stated them clearly, but just to clarify once again:
Lack of customization
I don't get to customize the homepage with HTML and CSS! Someone designed the ugly crap! I don't get to customize my contents! I have to follow its term of service, which makes me very uncomfortable.
Proprietary
I don't have any idea what's going on in the backend! I run into an error in the platform, then there is no solution at all, which makes me panic! Imagine having a bug and not being able to debug it, and the only way is to beg a proprietary company's programmers to solve it for you!
There isn't an API to upload stuff. I must crawl my way through a proprietary buggy web interface daily only to upload and engage in discussions.
Vulgar eye-catching stuff
There are many of those channels with a million subscribers that are not "misleading".
Few views/comments as well
Yeah, like I once opened a Bilibili account and posted my programming content to it, ended up getting around 0-50 views with no comments at all, which is sad. I got angry. I deleted and left the shit damn platform.
Data Concerns
As we all know, privacy is a huge concern, especially for these video platforms. You use a phone number to register, and those Chinese platforms practically know who you are. They aren't E2E encrypted, and everything is in public.
Can I use my own database for the social media? I can rent a Neon PostgreSQL database provide them with the url, and they can store my data inside my database, so I can be in control. It's certainly plausible since in mastodon you can bring your own instance and database and federate with them.
These days I export my social media data (most of them allow exporting) every month to my S3 bucket.
No Rights for Users
Some platforms have absolutely no rights for unsigned-in users, like Twitter, Instagram, or Bilibili. Without an account, you can't view or see other people's posts. They even restrict those privacy frontends like Invidious (dead) and Nitter (dead) and Bibliogram (also dead).
BaiduNetDisk requires downloading their shithole client to download any files(which in turn not only bloats the system but always shows a floating icon, which I find annoying). It simply doesn't respect the user. BaiduNetDisk also restricts the free tier downloading speed to 50 KB per second, absolutely ridiculous and egregious behavior from the company. I thought the network speed depends on my Internet's bandwidth and the company's server's bandwidth, and it lacks transparency and respect for its users.
WeChat's public account (just a blogging platform) has no web interfaces and no external links. It only allow commenting to a very small amount of accounts. Bilibili's blog also doesn't permit external links. I consider this very hostile.
Requires "Clever" Javascript Tricks
You need to use clever Javascript tricks on some websites to do certain stuff. Sometimes you need to switch to mobile sites on browser console or whatever. These websites usually tell users misleading stuff.
I dislike CSDN because it hides the copy paste functionality with some stupid Javascript intentionally, forcing users to log in. These acts doesn't benefit the company but forces the users to pay or log in. Copying code can be done easily from the browser debug console anyways. They simply disregard the need to benefit the users, instead, focusing on maintaining a dictatorship and autonomy, without actual benefits or superior quality, which makes me annoyed.
I don't use windows not because it is proprietary. I still use Google search. I dislike Windows because it isn't ideal for development and lacks a good package manager, hides too much things through GUI. You will no longer have any happiness left if you use Windows.
Poor Performance
Don't use the argument "It is a global and popular video platform, so it must have good performances". Wrong!
There are those bloated trackers. Maybe a person doesn't want 3rd party plugins or those cookie sessions, and wants a faster load. The point is these platforms treat users like dumb idiots without giving them options. It has great power over its users, powers that it shouldn't have in the first place.
Brain-Poisoning
Social media can be largely filled with spam and propaganda, with a recommendation system that constantly feeds you with the "hype" videos that ain't gonna make you better in computer science.
20240718: Migrating and Living on AWS
Basically on AWS they let you manage all your stuff in a single console. Also migrating to AWS gives a experience of working in an enterprise literally with so many configuration options like scaling things up and down.(using aws cloud utils to quickly configure the ram, disk usage). It was awesome.
How much money did I waste?
I wasted a lot of money. At least 11 month of Proton Mail Subscription, 40 dollars in PikaPod, 10-20 dollars in Runpod and VastAI, 25 dollars transferring the domain to AWS.
LLM
Use Amazon Bedrock for the most powerful LLM Claude-Sonnet-3.5, as of summer 2024, then set up a frontend like lobe-chat with docker.
S3 Backup
Google Drive, or any kind of consumer apps, lack the extent of customizability and control.
Create a script to backup Github to S3 everyday(just shallow clone for the most recent repos is ok) Call the Github API with curl -s -H "Authorization: token $TOKEN" "https://api.github.com/user/repos?per_page=100"
then clone and zip it to S3.
You would also want to backup documents(probably stored in a NoSQL like MongoDB) to AWS, it's simple, just periodically dump the MongoDB database zip it then upload to AWS.
SES
AWS SES is simple yet really powerful and customizable email sending service.
When receiving emails they go automatically into your S3 bucket, making backups easy and manageable!
There is a workmail solution for SES which can enable IMAP so you can literally use your mail in any client including Gmail.
Storage Remotes
Obviously one misconfig can delete all my files, that's why beside AWS S3 I am using Backblaze B2 for its cheap pricing! I manually backup(basically sync) the S3 to B2 once in a while, and B2 is also S3-compatible.
IAM Policies
Set up extensive IAM policies for each services to avoid using root too much, which can cause unintended destructions!
Secrets Manager
Manage all api keys in AWS Secrets. Store them here (backed by AWS's nosql key value database).
Please don't connect to MongoDB on a http network(it will simply get blocked!)
Docker on EC2
Of course, we cannot forget docker on ec2. You can easily hookup with dockers and configure PWA for Android apps after dockerizing your favorite apps, like FreshRSS or RSSHub or Markdown Parser or StirlingPDF, anyways, like so convenient!
CloudWatch
After all of these you would wanna check out CloudWatch for some monitoring! Add the websites endpoints and check if there are any errors easily!
Other Services I played with (but didn't use extensively)
I also played with Lambda(serverless), api gateways, basically a gateway for lambda or s3 bucket, Sagemaker (though I prefer hooking up Jupyter manually on EC2), Glacier(it became S3-glacier these days and considered legacy, basically a vault for colder storage), marketplace(buy premium for other apps on AWS).
Disaster Recovery
I have Backblaze for my S3 data storage and I also have Gmail as my backup mail, so almost nothing goes wrong.
Self Hosting Proxy Server
Hook up a Shadowsocks server on a cloud vps and then copy the configuration to Clash proxy locally, there you go.
Expose the outbound ports, and start a docker container with the shadowsocks.
proxies:
- name: us-east-1-t4g-medium
type: ss
server: 3.228.73.172
port: 8388
cipher: aes-256-gcm
password: [hidden]
udp: true
Self Hosting Remote Storage Manager
Just found this project on Github [https://github.com/alist-org/alist], it is so freaking good! Like, literally, has a beautiful interface with efficient management and streaming and control of all the remote data(s3, google drive, photo)
Filebrowser and Filestash are both not ideal in my opinion.
Self Hosting LLM frontend
This is so cool!
This allows you to
- Bypass Regional Restrictions
- No rate limiting(far less rate limiting on API)
- Don't have to rely on buggy ChatGPT frontend(or any other frontend)
- Manage "on demand" usage instead of subscription fee
I honestly don't understand why ChatGPT frontend constantly crashes(come on, it's using NextJS and is a large company, so I thought extensive testing is going on there, but it still constantly crash and frequently runs into bugs, not sure if it is intentional or not)
Self Host Markdown Parser(file manager)
I don't like Joplin because it doesn't have a web interface. I mean, come on, why won't it have a web interface? Moreover it's syncing runs frequently into bugs.
So I wrote a markdown parser(with file management) myself. This allows me to edit seamless stuff from mobile to laptop and easily copy paste screenshots in blogs(like basically when I ctrl v in the browser it gets automatically uploaded to my S3 bucket with a url returned).
Conclusion
Self hosting offers a premium experience of everything with the perfect control (even control over pricing as you can choose the VPS and buy the capacities on the cloud).
I am running 20 premium services on my cloud with only 4 GB of RAM, and everything works seamlessly.
It offers the user perfect control--that's exactly what I aim for. Like, control and convenience and not having to deal with companies(which becomes very inconvenient).
I can literally live inside the cloud and not get out. After all with Openai and Gemini's generous free tier I am essentially only paying AWS, and nothing else. It allows me to manage my service in one place and focus on better stuff.
I also utilize Google, I mean, the Colab and the Drive and Gmail and Photo and Android, they are quite basic and convenient, and don't offer distractions or ads or forced subscriptions. I would like that too. I am not turning myself against popular services but rather focusing on control and convenience. The whole point of self hosting is the free legit premium experience I am having!
Self hosting made my life look hopeful amidst the mental health crisis surrounding me lately. Anyways, these are just some of the many useful stuff.