Cgar — guard file downloads with reCaptcha

Cgar (reCaptcha Guarded Archive Retriever) is a simple service that retrieves files from blob stores (e.g., S3) and serves them to browsers once a reCaptcha is passed. The initial GET request for a file returns an HTML page with a reCaptcha. Once the reCaptcha is passed, the Go button sends a POST request and the file is served as a download attachment. Find it on GitHub at https://github.com/bettercallshao/cgar.
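
To make "once a reCaptcha is passed" concrete, here is a rough sketch of how a reCaptcha V2 token is generally checked against Google's siteverify endpoint. It is not taken from Cgar's source, and the function names verify_token and token_accepted are made up for illustration; Cgar's own implementation may look quite different.

```shell
#!/bin/sh
# Sketch only: not Cgar's actual code.
# verify_token SECRET TOKEN
#   SECRET: the RECAPTCHA_SECRET_KEY value from the config
#   TOKEN:  the g-recaptcha-response form field submitted with the POST
# Sends both values to Google and prints the JSON reply.
verify_token() {
  curl -s https://www.google.com/recaptcha/api/siteverify \
    --data-urlencode "secret=$1" \
    --data-urlencode "response=$2"
}

# token_accepted reads the JSON reply on stdin and succeeds only if it
# contains "success": true.
token_accepted() {
  grep -q '"success"[[:space:]]*:[[:space:]]*true'
}
```

A caller could then write `verify_token "$SECRET" "$TOKEN" | token_accepted` and serve the file only when that pipeline succeeds.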

Motivation

The motivation came from my job hunt. I wanted my resume to be publicly accessible to recruiters, but since it includes my phone number, I wanted to protect it from web crawlers. I built the first version on a Digital Ocean VM as part of a larger deployment, with the file stored on the local file system. When I migrated to Kubernetes, I also modernized it to be more cloud-native.

Configuration

A single YAML file is used to configure the behavior of Cgar:

RECAPTCHA_SITE_KEY: "..."
RECAPTCHA_SECRET_KEY: "..."
downloads:
  - id: "example"
    bucket: "..."
    key: "..."
    accessKeyId: "..."
    secretAccessKey: "..."
    endpoint: "..."

  1. Sign up for reCaptcha at http://www.google.com/recaptcha/admin to obtain RECAPTCHA_SITE_KEY and RECAPTCHA_SECRET_KEY. Cgar only supports V2 at the moment.

  2. Upload the target file to an S3-compatible blob store. Only the S3 protocol is supported at the moment, but most cloud providers offer S3-compatible storage.

  3. Create a new item in the downloads section of the config file, and fill in bucket, key (i.e., path in bucket), accessKeyId, secretAccessKey, and endpoint with target file information.

  4. Name the target by assigning a unique value to the id field. Let's assume we name it example.
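
Before deploying, it can help to sanity-check the file against the steps above. The following is a minimal sketch, assuming the config follows the layout shown earlier (top-level keys at column 0, one `- id:` line per download). The helper names missing_keys and duplicate_ids are hypothetical and not part of Cgar, and the checks are plain text matching rather than real YAML parsing.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Cgar.

# missing_keys FILE: print each required top-level key absent from FILE.
missing_keys() {
  for key in RECAPTCHA_SITE_KEY RECAPTCHA_SECRET_KEY downloads; do
    grep -q "^${key}:" "$1" || echo "$key"
  done
}

# duplicate_ids FILE: print every download id that appears more than once.
# Matches lines such as `  - id: "example"` (quotes optional).
duplicate_ids() {
  sed -n 's/^[[:space:]]*-[[:space:]]*id:[[:space:]]*"\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" |
    sort | uniq -d
}
```

Running `missing_keys ./secret/config.yaml` and `duplicate_ids ./secret/config.yaml` should both print nothing for a config that follows the steps above.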

Deploy locally with docker-compose

  1. Clone the repo
  2. Create the config file as ./secret/config.yaml
  3. Run docker-compose up, which will build the docker image and run the service.
    • A restart is required to pick up config changes.
    • Two endpoints should be available:
      • localhost:8080/ready: The readiness check, which is always available.
      • localhost:8080/download/example: The download page for example, which should produce a download attachment after the reCaptcha is passed, provided the configuration is correct.
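
A quick way to confirm both endpoints respond is to print their HTTP status codes with curl. This is a small sketch; the BASE_URL variable and the http_status and report helpers are names chosen here for illustration, and the exact status codes Cgar returns are not documented above, so the script only prints them.

```shell
#!/bin/sh
# Illustrative helpers, not part of Cgar.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# http_status URL: print only the HTTP status code of a GET request.
http_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# report PATH: print the path alongside the status code it returned.
report() {
  printf '%s -> HTTP %s\n' "$1" "$(http_status "$BASE_URL$1")"
}
```

With the service running, `report /ready` and `report /download/example` each print one line. The download itself still has to be tested in a browser, since the reCaptcha cannot be passed from the command line.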

Deploy to Kubernetes

  1. Optionally create a new namespace:

    kubectl create ns cgar
    
  2. Create a config map called cgar-config with config.yaml as the key and our config file as the value:

    kubectl -n cgar create cm cgar-config --from-file=config.yaml=./secret/config.yaml
    
  3. Create a specification file, cgar.yaml, for the Cgar deployment and service:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: cgar
    spec:
      selector:
        matchLabels:
          app: cgar
      strategy:
        type: Recreate
      template:
        metadata:
          labels:
            app: cgar
        spec:
          containers:
          - name: cgar
            image: bettercallshao/cgar:latest
            volumeMounts:
            - name: secret
              mountPath: "/secret"
            env:
            - name: CONFIG_PATH
              value: "/secret/config.yaml"
            - name: PORT
              value: "80"
            ports:
            - containerPort: 80
            readinessProbe:
              httpGet:
                path: /ready
                port: 80
              initialDelaySeconds: 5
              periodSeconds: 5
            livenessProbe:
              httpGet:
                path: /ready
                port: 80
              initialDelaySeconds: 5
              periodSeconds: 60
          volumes:
          - name: secret
            configMap:
              name: cgar-config
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: cgar-service
    spec:
      type: ClusterIP
      ports:
        - port: 80
          targetPort: 80
          protocol: TCP
      selector:
        app: cgar
    

    Note:

    • Prebuilt images are available on DockerHub at bettercallshao/cgar:latest.
    • The config map cgar-config is treated as a volume and mounted in the container at /secret. The config could equally be stored as a k8s secret instead of a config map.
    • The CONFIG_PATH environment variable is set to /secret/config.yaml to point to the config.yaml we mounted from the config map.
    • Setting the PORT environment variable (here to 80) is optional.
    • Probes are set up to monitor the /ready endpoint for health. The service crashes if the blob config is incorrect and relies on k8s to restart it.
  4. Create the specified resources:

    kubectl -n cgar create -f cgar.yaml
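
Once the resources are created, the rollout can be checked from the command line. The sketch below assumes the cgar namespace and the resource names from the specification above; the NAMESPACE variable and the verify_rollout and forward_service wrappers are illustrative names, not something Cgar provides.

```shell
#!/bin/sh
# Illustrative wrappers around standard kubectl commands.
NAMESPACE="${NAMESPACE:-cgar}"

# verify_rollout: wait for the deployment to become ready, then list its pods.
verify_rollout() {
  kubectl -n "$NAMESPACE" rollout status deployment/cgar --timeout=120s &&
    kubectl -n "$NAMESPACE" get pods -l app=cgar
}

# forward_service: expose service port 80 on local port 8080 (runs until interrupted).
forward_service() {
  kubectl -n "$NAMESPACE" port-forward service/cgar-service 8080:80
}
```

While `forward_service` is running, localhost:8080/download/example should show the same download page as in the docker-compose setup.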
    

The production deployment on bettercallshao.com uses Minio as the blob store. To see how Minio and other components are deployed, refer to the (less well-documented) repo https://github.com/bettercallshao/bcs-lke.