Traefik + Let's Encrypt + CloudFlare

A little troubleshooting log/guide that may help you

Why I am writing this post

Hi. I'm a data scientist and I am currently developing a platform for MLOps using Docker Compose. This is to say, now that I am playing the role of an engineer, there are a lot of concepts that I am learning fast, and it can feel overwhelming and even scary. If you're new to deploying applications with CloudFlare and configuring a reverse-proxy with Traefik, I wrote this guide for you.

The goal of this post is also to show you how to get that sweet green lock in the browser by having Traefik fetch Let's Encrypt (LE) certificates for you, and how to place your application behind a CloudFlare proxy. As a preview, here are some pain points that I faced as a beginner engineer.

  • CloudFlare as your DNS server.
  • CloudFlare flexible, full and strict modes.
    • Watch out for https redirects in Traefik.
  • Exposing your server in CloudFlare: Development mode and temporarily disabling CloudFlare to bypass its proxy.
    • Traefik configuration to fetch Let's Encrypt.

This post is not supposed to be a complete tutorial on Docker Compose, Traefik, CloudFlare and Let's Encrypt - there are already plenty of resources out there for that purpose. Rather, it is almost a personal log of the obstacles I encountered, which I hope helps other people overcome them.

What are CloudFlare, Traefik and Let's Encrypt?

CloudFlare (CF) is mainly a DNS server with extra features - these extra features are attributed to CloudFlare's (reverse-)proxy functions, which you can enable and disable whenever you want.

For example, you set your DNS records to point your domain and subdomains to the IP of the server where your application is running. When you set these records to 'proxied' (i.e. orange cloud) you benefit from the proxy functions that CloudFlare carries out, such as concealing the real IP address of your server and DDoS protection.

Traefik is a reverse-proxy that sits at the edge of your application on the server. Wait, what? Another "proxy"?! Well yes, but Traefik is configured in your docker-compose application and has no overlap in functionality with CloudFlare. Traefik intercepts requests to a given route, say a-route.your-domain.com, matches them against the rules you have set, and forwards them to the corresponding service running in Compose.

Let's Encrypt (LE) is a Certificate Authority (CA) that signs your certificates, vouching that they are genuine, so that the connection between the clients and your server can be encrypted and trusted. The best part is that by using LE, you are taking advantage of the ACME protocol, which provides automatic renewal of your certificates. You could use a self-signed certificate, but there are disadvantages to that. You can use Let's Encrypt from Traefik to minimize setup effort.

Setting up Traefik

Setting up proxy entrypoints

It's best to test things locally as much as you can before deploying to a live server. Your setup should have two entrypoints, which I have chosen to name web and websecure rather than http and https, so that it is clear which fields are arbitrary names given by us.

services:
  traefik:
    command:
      - '--entrypoints.web.address=:80'
      - '--entrypoints.websecure.address=:443'
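
For context, here is a minimal sketch of how those two flags might sit in a fuller service definition. The image tag, the Docker provider flags, the published ports and the socket mount are my assumptions and are not taken from the original setup, so adapt them to your own stack.

```yaml
services:
  traefik:
    # Assumption: any Traefik v2 image tag; pin the one you actually use
    image: traefik:v2.10
    command:
      # Assumption: services are discovered through the Docker provider
      - '--providers.docker=true'
      # Only containers labelled traefik.enable=true get routed
      - '--providers.docker.exposedbydefault=false'
      - '--entrypoints.web.address=:80'
      - '--entrypoints.websecure.address=:443'
    ports:
      # Publish both entrypoints on the host
      - '80:80'
      - '443:443'
    volumes:
      # Read-only access to the Docker socket so Traefik can see the containers
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

Without the published ports, the entrypoints exist inside the container but are not reachable from outside the host.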

The entrypoints are part of the static configuration of Traefik, i.e. they are read once at startup. This setup is less than ideal as it allows for unencrypted http connections. In Traefik, a redirection to https can be done in the following way.

services:
  traefik:
    command:
      - '--entrypoints.web.address=:80'
      - '--entrypoints.web.http.redirections.entryPoint.to=websecure'
      - '--entrypoints.web.http.redirections.entryPoint.scheme=https'
      - '--entrypoints.websecure.address=:443'
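
If you prefer a configuration file over CLI flags, the same entrypoints and redirection can be sketched in Traefik's YAML file format. The file name traefik.yml and the way it gets mounted into the container are assumptions on my part; the keys mirror the flags one to one.

```yaml
# traefik.yml (hypothetical file name), equivalent to the CLI flags
entryPoints:
  web:
    address: ':80'
    http:
      redirections:
        entryPoint:
          # Send everything arriving on web to websecure over https
          to: websecure
          scheme: https
  websecure:
    address: ':443'
```

Use either the flags or the file, not both, to keep a single source of truth.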

One of the options available from CloudFlare is none other than HTTPS redirection, so requests are already redirected to https at CloudFlare's proxy. The redirection configured in Traefik (the 'origin server' in CloudFlare's terminology) acts as a failsafe should you disable CF's proxy.

Setting up Let's Encrypt (from Traefik)

This step is entirely optional if you're just developing on your machine. TLS can be enabled without LE, in which case Traefik falls back to its own self-signed default certificate.

Of course, what you want in production are certificates signed by a CA. Under the hood, Traefik uses lego, a Let's Encrypt/ACME client, to connect to LE servers and fetch certificates. You only need to supply Traefik with an email and some options and Traefik will handle the rest for you. Here's an example of a setup.

services:
  traefik:
    command:
      - '--certificatesresolvers.le.acme.email=${ACME_EMAIL}'
      - '--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json'
      # The TLS challenge proves control of the domain by answering the CA over TLS on port 443
      - '--certificatesresolvers.le.acme.tlschallenge=true'
      # Optionally use the staging server to prevent exhausting rate limits
      - '--certificatesresolvers.le.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory'
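
One practical detail about the storage flag: the path lives inside the container, so the certificates are lost when the container is re-created unless that path is backed by a volume. A minimal sketch, assuming a host directory named ./letsencrypt (a hypothetical name, not from the original setup):

```yaml
services:
  traefik:
    command:
      - '--certificatesresolvers.le.acme.storage=/letsencrypt/acme.json'
    volumes:
      # Hypothetical host directory; keeps acme.json across container re-creation
      - ./letsencrypt:/letsencrypt
```

Persisting the file also helps you stay within LE's rate limits, since Traefik reuses the stored certificates instead of requesting new ones on every restart.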

The bare minimum consists of the email and a challenge, here the TLS challenge; the storage flag sets where the issued certificates are saved. The TLS challenge is the simplest one to set up, as the only thing to do is to enable a boolean flag. However, CF's proxy does not work with the TLS challenge, so either the DNS challenge or the HTTP challenge must be configured if you want to keep the edge proxy enabled.

CloudFlare explains why in its own words: "As a note, the default method used for ACME authentication by the Let's Encrypt client utilizes the DVSNI method. This will fail for a domain which has Cloudflare enabled as we terminate SSL (TLS) at our edge and the ACME server will never see the certificate the client presents at the origin. Using alternate ACME validation methods, such as DNS or HTTP will complete successfully when Cloudflare is enabled."

Note that challenges are mutually exclusive per certificate resolver, and each router can only use one certificate resolver at a time. Effectively, a router can only use one type of ACME challenge at a time. Each of the snippets below illustrates choosing just one ACME challenge for the le certificate resolver that is being configured.

Setting up Let's Encrypt with HTTP challenge

services:
  traefik:
    command:
      - '--certificatesresolvers.le.acme.httpchallenge=true'
      - '--certificatesresolvers.le.acme.httpchallenge.entrypoint=web'

Setting up Let's Encrypt with DNS challenge

services:
  traefik:
    command:
      - '--certificatesresolvers.le.acme.dnschallenge=true'
      - '--certificatesresolvers.le.acme.dnschallenge.provider=cloudflare'
    environment:
      # you may choose to use secrets instead of environment variables like this
      - CF_API_EMAIL=${CF_API_EMAIL}
      - CF_API_KEY=${CF_API_KEY}

In this example, the cloudflare provider is being used because that's where the DNS records are set up - i.e. the nameservers of the domain are pointing to CloudFlare. If you are using another DNS provider, then you must set the provider name and the environment variables specific to it.
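
As an alternative to the account-wide key, lego's cloudflare provider also accepts a scoped API token through the CF_DNS_API_TOKEN variable. The sketch below assumes you have created such a token with permission to edit DNS records for your zone; I did not use this variant in my own setup, so treat it as a starting point.

```yaml
services:
  traefik:
    command:
      - '--certificatesresolvers.le.acme.dnschallenge=true'
      - '--certificatesresolvers.le.acme.dnschallenge.provider=cloudflare'
    environment:
      # Assumption: a scoped token replaces CF_API_EMAIL and CF_API_KEY
      - CF_DNS_API_TOKEN=${CF_DNS_API_TOKEN}
```

A scoped token limits the damage if it leaks, since it can only touch DNS records rather than the whole account.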

Enable the use of Let's Encrypt in a router

Refer to the section "Using the certificate resolver"; that blog post is clearer than I am on how to use Traefik :)
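
For a quick idea of what that looks like, here is a minimal sketch of a service whose router uses the le resolver. The whoami service, its image and the host name are placeholders I chose for illustration; only the resolver name le and the websecure entrypoint come from the snippets above.

```yaml
services:
  whoami:
    # Placeholder service that just echoes the request
    image: traefik/whoami
    labels:
      # Needed only if the Docker provider does not expose containers by default
      - 'traefik.enable=true'
      # Hypothetical host name; replace with your own route
      - 'traefik.http.routers.whoami.rule=Host(`a-service.your-domain.com`)'
      - 'traefik.http.routers.whoami.entrypoints=websecure'
      # Ask the le resolver for a certificate covering this router's host
      - 'traefik.http.routers.whoami.tls.certresolver=le'
```

Traefik derives the domain for the certificate from the Host rule, so no extra domain list is needed for a single host name.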

Setting up CloudFlare

DNS records and subdomains

You only need one A record for the root of the platform you're developing. For any subdomain, you can set CNAMEs that point to that A record. The A record should point at the IP where Traefik is being served. Traefik will handle routing to the correct machines itself (i.e. load balancing), should you have your application running in Swarm mode.

For example, DNS records on CloudFlare could look like this:

[Image: screenshot of the DNS records in the CloudFlare dashboard]

As a general rule, you only need to set A records (@ and www) that point to the real IP of your server. If the services you defined in Traefik follow the template a-service.yourdomain.com, then you only need to set CNAME records with the Name a-service pointing to the Content yourdomain.com.

Note the orange clouds: they indicate that requests to the specified (sub)domain are being proxied by CloudFlare's edge proxy. They can be disabled one by one, but that is not relevant to what I cover in this post.

CloudFlare options

In general, and from my experience, the default configurations work well. I only call your attention to CF's encryption modes (which I mention in the Troubleshooting section) and the HSTS option, which could be problematic in your case.

Tips

A very useful tip is to enable CF's "Development Mode", so that web requests bypass any cache that CF may have created and you can check your changes without interference from a cache.

If you have set up LE only with the TLS challenge, use the option "Pause CloudFlare on Site" before running docker-compose up on your server. This lets Let's Encrypt reach your server directly instead of CF's proxy. Related to this is the use of a DNS client (or dnschecker.org) to check which IPs your names resolve to. Using dog (a CLI DNS client), you should get a record pointing to the real IP of your live server:

$ dog some-domain.com
A some-domain.com. 38s   xx.yyy.xx.yyy

Once the TLS challenge is complete and the certificates are issued, you should be able to check that

  • the certificates are stored in the acme.json file
  • in the browser, on the page of the service served at the route you configured, clicking the lock icon in the address bar and opening "certificates" shows that they are in fact issued by Let's Encrypt. This means you set up Traefik + LE correctly!

Once everything is working as it should, you can click on the option "Enable CloudFlare on site" and let CF hide your IP again. The output of the DNS check should then look something like this:

$ dog auth.some-domain.com
A auth.some-domain.com. 5m00s   104.21.28.234
A auth.some-domain.com. 5m00s   172.67.147.200

Those IPs belong to CF's proxy, and the output confirms that CF is enabled again.

Some troubleshooting

The list below describes issues that I have experienced with CloudFlare and Traefik. They may not match your case exactly, but I note them down in case you face them.

  • Browser throws ERR_TOO_MANY_REDIRECTS when accessing a service: If you deployed using CloudFlare's DNS servers and set CF to 'Flexible mode', you will run into this error. This is because you have https from the client to the CF proxy, and http between the CF proxy and the server (forced by CF). As Traefik was itself configured with an http-to-https redirection, every request CF sends over http is answered with a redirect to https, which CF again fetches over http, causing an infinite loop of redirects. To solve this issue, set CF to Full encryption mode; even without LE (using Traefik's self-signed certificate) your website/application is then accessible.

  • Browser throws ERR_SSL_VERSION_OR_CIPHER_MISMATCH when accessing a service: I ran into this error because the "edge certificates" (the certificates sitting at CloudFlare's proxy) did not have the same coverage for the routes I was setting up with Traefik.

    • I was setting up routes with the structure service.platform.your-domain.com. This is called a 4th level subdomain.
    • The "edge" certificates provided by CloudFlare (the "edge" being CF's proxy) only provide SSL up to 3rd level subdomains, specifically, to your-domain.com and *.your-domain.com.
    • Thus the names covered by the LE certificates fetched by Traefik did not match the names "made available" by CloudFlare, and the error above was produced.
    • The only solution I found is to upgrade to an Enterprise subscription to CloudFlare 💸, as this allows you to issue edge certificates for any level of subdomains you want.

Closing words

I hope that you have found at least one solution to your problems if you're working with this particular set of tools (Docker Compose, Traefik, CloudFlare and Let's Encrypt). I may update this post as I add more functionalities to the platform I am developing.
