Browse In PDF

Yes, it's stupid.

https://browseinpdf.com

It's also quite a lot of fun.

Making this, I learned a LOT about NGINX.

Because it's an HTTP proxy, the target URL has to be carried percent-encoded inside the otherwise unencoded request URL. By default, when using proxy_pass (passing to an internal server), URLs get decoded. Not what I wanted.
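To make the "encoded URL within an unencoded URL" idea concrete, here is a minimal Python sketch of how a link to the proxy could be built. The helper name browse_url and the assumption that links look like https://browseinpdf.com/browse/<encoded target> are mine, inferred from the /browse/ location, not something the site documents.

```python
from urllib.parse import quote, unquote

# Hypothetical helper: the real site may build its links differently.
def browse_url(target: str, base: str = "https://browseinpdf.com/browse/") -> str:
    # safe="" also encodes ":", "/", "?" and "=", so the whole target URL
    # stays a single path segment of the outer URL.
    return base + quote(target, safe="")

target = "https://news.ycombinator.com/item?id=1"
encoded = browse_url(target)
print(encoded)
# https://browseinpdf.com/browse/https%3A%2F%2Fnews.ycombinator.com%2Fitem%3Fid%3D1

# Decoding the path part gives the target back, which is what the
# backend needs to do once it receives the still-encoded value.
print(unquote(encoded.removeprefix("https://browseinpdf.com/browse/")))
# https://news.ycombinator.com/item?id=1
```

If the path were decoded before reaching the backend, the slashes and the question mark of the target would blend into the outer URL, which is why the encoding has to survive the proxy hop.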

Also, these PDF files are quite heavy to generate, and people would only really want to see a few sites (google.com, news.ycombinator.com, etc.), so it makes sense to cache the generated responses.

My cache config:

proxy_cache_path /tmp/nginxcache levels=1:2 keys_zone=cache:10m max_size=2g inactive=60m use_temp_path=off;

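This creates a cache in /tmp/nginxcache which has 2 "levels" of folders, so the files are spread over subdirectories instead of piling up in one large directory. Fairly standard stuff. The key zone is named "cache" and gets 10 MB of shared memory, the cache is capped at 2 GB on disk, and entries that nobody requests for 60 minutes are removed.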
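For the curious, the two folder levels can be sketched in Python. NGINX names each cache file after the MD5 hash of the cache key, and with levels=1:2 it uses the last hex character as the first directory and the two characters before it as the second. The function name cache_file_path and the example key are illustrative assumptions; the real key depends on the proxy_cache_key setting in effect.

```python
import hashlib

# Illustrative helper, not part of NGINX: shows where an entry would land on disk.
def cache_file_path(cache_key: str, root: str = "/tmp/nginxcache") -> str:
    digest = hashlib.md5(cache_key.encode()).hexdigest()
    level1 = digest[-1]     # levels=1:2 -> one character, taken from the end
    level2 = digest[-3:-1]  # ...then the two characters before it
    return f"{root}/{level1}/{level2}/{digest}"

# The key below is a made-up example, not necessarily what this config produces.
print(cache_file_path("http127.0.0.1:8004/browse/https%3A%2F%2Fgoogle.com"))
```

This can be handy when you want to find or delete the file for a single cached page by hand.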

Location config:

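location /browse/ {
    # Reset the URI to the raw, still-encoded request URI.
    rewrite ^ $request_uri;
    # Strip the /browse/ prefix; "break" stops rewrite processing here.
    rewrite ^/browse/(.*) $1 break;
    # Only reached if the rewrite above did not match.
    return 400;
    proxy_cache cache;

    proxy_cache_methods GET;
    proxy_cache_valid 200 10h;
    proxy_cache_valid 301 302 1h;
    proxy_cache_valid any 1h;
    # Cache even when the upstream sends Cache-Control headers.
    proxy_ignore_headers Cache-Control;
    proxy_redirect off;
    # With a variable in proxy_pass, the URI is passed on as is.
    proxy_pass http://127.0.0.1:8004/$uri;
}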
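How the proxy_cache_valid lines interact with inactive=60m from the cache path is easy to miss, so here is a rough Python model of it. It is a simplification under my own assumptions: the function names are made up, it ignores revalidation and any upstream headers other than Cache-Control, and the exact boundary behaviour in NGINX may differ.

```python
from datetime import timedelta

# Mirrors the proxy_cache_valid lines in the location block.
VALIDITY = {
    200: timedelta(hours=10),
    301: timedelta(hours=1),
    302: timedelta(hours=1),
}
ANY = timedelta(hours=1)           # proxy_cache_valid any 1h
INACTIVE = timedelta(minutes=60)   # inactive=60m in proxy_cache_path

def validity(status: int) -> timedelta:
    """How long a response with this status counts as fresh."""
    return VALIDITY.get(status, ANY)

def still_served_from_cache(status: int, age: timedelta, idle: timedelta) -> bool:
    """age: time since the response was cached; idle: time since its last request."""
    return age < validity(status) and idle < INACTIVE

# A popular page cached 2 hours ago and requested 10 minutes ago is still cached.
print(still_served_from_cache(200, timedelta(hours=2), timedelta(minutes=10)))  # True
# The same page left alone for 90 minutes has been evicted despite the 10h validity.
print(still_served_from_cache(200, timedelta(hours=2), timedelta(minutes=90)))  # False
```

In other words, the 10 hour validity only helps pages that are requested at least about once an hour, which fits the "few popular sites" use case.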