Cgar — guard file downloads with reCaptcha
Cgar (reCaptcha Guarded Archive Retriever) is a simple service that retrieves files from blob stores (e.g., S3) and serves them to browsers once a reCaptcha is passed. On the initial GET request for a file, an HTML page with a reCaptcha is shown. If it is passed, the file is served as a download attachment in response to the POST request from the Go button. Find it on GitHub at https://github.com/bettercallshao/cgar.
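The GET/POST flow described above can be sketched as follows. This is only an illustration, not Cgar's actual code: `handle_download`, `recaptcha_passed`, and `post_form` are hypothetical names, and re-showing the page after a failed check is an assumption. The `siteverify` URL and the `g-recaptcha-response` form field are the ones reCaptcha V2 uses.

```python
import json
import urllib.parse
import urllib.request

# Google's server-side verification endpoint for reCaptcha tokens.
VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def post_form(url, fields):
    """POST form-encoded fields and return the decoded JSON body."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(url, data=data, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def recaptcha_passed(secret_key, token, post=post_form):
    """Return True only if Google reports the token as valid."""
    if not token:
        return False  # the form was submitted without solving the challenge
    result = post(VERIFY_URL, {"secret": secret_key, "response": token})
    return bool(result.get("success"))


def handle_download(method, form, secret_key, post=post_form):
    """Decide what to do for a request to a download page (hypothetical helper).

    GET  -> show the HTML page with the reCaptcha widget
    POST -> serve the file only if the submitted token verifies
    """
    if method == "GET":
        return "show_captcha_page"
    if method == "POST":
        token = form.get("g-recaptcha-response", "")
        if recaptcha_passed(secret_key, token, post):
            return "serve_attachment"
        return "show_captcha_page"  # assumption: failed checks get the page again
    return "method_not_allowed"
```

Passing `post` as a parameter keeps the decision logic testable without contacting Google; a real service would call it with the default.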
Motivation
The motivation came from my job hunt. I wanted my resume to be publicly accessible by recruiters, but since it has my phone number on it, I wanted to protect it from web crawlers. I made the first version on a VM on Digital Ocean as part of a larger deployment with the file stored on the file system. When I migrated to Kubernetes, I also modernized it to be more cloud-native.
Configuration
A single YAML file is used to configure the behavior of Cgar:
    RECAPTCHA_SITE_KEY: "..."
    RECAPTCHA_SECRET_KEY: "..."
    downloads:
      - id: "example"
        bucket: "..."
        key: "..."
        accessKeyId: "..."
        secretAccessKey: "..."
        endpoint: "..."
- Sign up for reCaptcha from http://www.google.com/recaptcha/admin. Cgar only supports V2 at the moment. Then RECAPTCHA_SITE_KEY and RECAPTCHA_SECRET_KEY should be available.
- Upload the target file to an S3-compatible blob store. Only S3 protocol is supported at the moment, while most cloud providers offer S3 compatibility.
- Create a new item in the downloads section of the config file, and fill in bucket, key (i.e., path in bucket), accessKeyId, secretAccessKey, and endpoint with target file information.
- Name the target by assigning a unique value to the id field. Let's assume we name it example.
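Before deploying, it can help to check that the config file has every field the steps above mention. The sketch below is a hypothetical helper, not part of Cgar; it assumes the YAML has already been parsed into a dictionary (for example with a YAML library) and that all listed fields are mandatory, which the source does not state explicitly.

```python
# Fields each entry under "downloads" is expected to carry (taken from the steps above).
REQUIRED_FIELDS = ("id", "bucket", "key", "accessKeyId", "secretAccessKey", "endpoint")


def validate_config(config):
    """Return a list of problems; an empty list means the config looks complete."""
    problems = []
    for name in ("RECAPTCHA_SITE_KEY", "RECAPTCHA_SECRET_KEY"):
        if not config.get(name):
            problems.append(f"missing {name}")

    downloads = config.get("downloads") or []
    if not downloads:
        problems.append("downloads is empty")

    seen_ids = set()
    for index, item in enumerate(downloads):
        for field in REQUIRED_FIELDS:
            if not item.get(field):
                problems.append(f"downloads[{index}] is missing {field}")
        item_id = item.get("id")
        if item_id in seen_ids:
            # ids name the download page, so they must be unique
            problems.append(f"duplicate id: {item_id}")
        if item_id:
            seen_ids.add(item_id)
    return problems
```

An empty result only means the file is structurally complete; it does not prove that the keys or the endpoint are valid.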
Deploy locally with docker-compose
- Clone the repo.
- Create the config file as ./secret/config.yaml.
- Run docker-compose up, which will build the docker image and run the service.
  - A restart is required for reconfiguration.
- Two endpoints should be available:
  - localhost:8080/ready: The readiness check, which is always available.
  - localhost:8080/download/example: The download page for example, which should produce a download attachment after the reCaptcha is passed, given the configuration is correct.
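A quick way to confirm that both endpoints respond after `docker-compose up` is a small smoke check like the one below. It is a sketch under assumptions: `endpoint_urls` and `is_up` are hypothetical helpers, the base URL assumes the local port 8080 from the list above, and a plain GET on the download page only shows that the reCaptcha page is served, not that the download itself works.

```python
import urllib.error
import urllib.request


def endpoint_urls(download_id, base="http://localhost:8080"):
    """Build the two URLs described above for a given download id."""
    return {
        "ready": f"{base}/ready",
        "download": f"{base}/download/{download_id}",
    }


def is_up(url, opener=urllib.request.urlopen):
    """Return True if a GET to url answers with HTTP 200, False on any error."""
    try:
        with opener(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # connection refused, timeout, or a non-2xx response
        return False
```

For example, `all(is_up(u) for u in endpoint_urls("example").values())` should be true once the service has started with a correct configuration.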
Deploy to Kubernetes
- Optionally create a new namespace:
      kubectl create ns cgar
- Create a config map called cgar-config with config.yaml as the key and our config file as the value:
      kubectl -n cgar create cm cgar-config --from-file=config.yaml=./secret/config.yaml
- Create specification cgar.yaml for the Cgar deployment and service:
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: cgar
      spec:
        selector:
          matchLabels:
            app: cgar
        strategy:
          type: Recreate
        template:
          metadata:
            labels:
              app: cgar
          spec:
            containers:
              - name: cgar
                image: bettercallshao/cgar:latest
                volumeMounts:
                  - name: secret
                    mountPath: "/secret"
                env:
                  - name: CONFIG_PATH
                    value: "/secret/config.yaml"
                  - name: PORT
                    value: "80"
                ports:
                  - containerPort: 80
                readinessProbe:
                  httpGet:
                    path: /ready
                    port: 80
                  initialDelaySeconds: 5
                  periodSeconds: 5
                livenessProbe:
                  httpGet:
                    path: /ready
                    port: 80
                  initialDelaySeconds: 5
                  periodSeconds: 60
            volumes:
              - name: secret
                configMap:
                  name: cgar-config
      ---
      apiVersion: v1
      kind: Service
      metadata:
        name: cgar-service
      spec:
        type: ClusterIP
        ports:
          - port: 80
            targetPort: 80
            protocol: TCP
        selector:
          app: cgar
  Note:
  - Prebuilt images are available on DockerHub at bettercallshao/cgar:latest.
  - The config map cgar-config is treated as a volume and mounted in the container as /secret. The config could have been a k8s secret as well instead of a config map.
  - The CONFIG_PATH environment variable is set to /secret/config.yaml to point to the config.yaml we mounted from the config map.
  - The PORT environment variable is set to 80, which is optional.
  - Probes are set up to monitor the /ready endpoint for health. The service will crash if the blob config is incorrect; it relies on k8s to restart it.
- Create the specified resources:
      kubectl -n cgar create -f cgar.yaml
For the practical deployment on bettercallshao.com, Cgar is used in conjunction with Minio. To see how Minio and the other components are deployed, refer to the (less well-documented) repo https://github.com/bettercallshao/bcs-lke.