This is a simple, compact and lightweight implementation of a URL shortener using Python, Flask and a PostgreSQL database. It is packaged in a Docker container with the uWSGI server, which makes it easy to use and easy to deploy. The implementation can be used either as a standalone application or as a service behind a larger server (e.g., a load balancer).
A live version can be found at go.sbo.sk.
There are, in general, three methods to install / deploy the service. Each of these methods needs further configuration, which is described in the section Configuration.
To install the service using this method, you must have Docker installed. The installation steps can be found here.
Create a Docker Compose file named docker-compose.yaml that uses the image jefinko/url-shortener:latest:
services:
  url-shortener:
    image: jefinko/url-shortener:latest
    restart: unless-stopped
    environment:
      PY_LOGGING: "WARNING"
    ports:
      - "8000:8000"
    volumes:
      - ./.env:/url-shortener/.env
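The service will be accessible on port 8000. To change this, change the line - "8000:8000" to
- "<desired port>:8000"
The first number is the port published on the host; the second is the port the service listens on inside the container.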
The service logs its activity. The default log level is WARNING. To change this, change the line PY_LOGGING: "WARNING" to
PY_LOGGING: "<desired level>"
You can find other available levels here.
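As an illustration of how a level name such as "WARNING" relates to Python's standard logging levels, here is a minimal sketch. It assumes the service passes PY_LOGGING to Python's logging module; the helper name resolve_log_level is hypothetical and not part of the project.

```python
import logging
import os

# Standard level names defined by Python's logging module
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_log_level(name: str, default: int = logging.WARNING) -> int:
    """Translate a level name (case-insensitive) into its numeric value.

    Unknown names fall back to the default instead of raising.
    """
    return LEVELS.get(name.strip().upper(), default)


if __name__ == "__main__":
    # PY_LOGGING is the variable from docker-compose.yaml; how the service
    # itself reads it is an assumption of this sketch.
    level = resolve_log_level(os.environ.get("PY_LOGGING", "WARNING"))
    logging.basicConfig(level=level)
    logging.getLogger("example").warning("level in effect: %s", logging.getLevelName(level))
```

Lower levels such as DEBUG produce more output; higher levels such as ERROR produce less.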
Run the service with
docker compose up
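Once the container is up, a quick way to confirm that something is listening on the published port is a plain TCP connection attempt. The sketch below uses only the standard library; the function name port_is_open is hypothetical, and the host and port assume the default mapping shown above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False


if __name__ == "__main__":
    # 127.0.0.1:8000 assumes the default "8000:8000" port mapping
    state = "reachable" if port_is_open("127.0.0.1", 8000) else "not reachable"
    print(f"url-shortener is {state}")
```

This only checks that the port accepts connections; it does not verify that the application behind it is healthy.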
To install the service using this method, you must have Docker installed.
First, download the repository from GitHub and build the Docker image yourself:
# clone the repository from GitHub
git clone https://github.com/jozef-sabo/url-shortener.git
# change directory to the cloned folder
cd url-shortener
# build the image, Docker build tag will be "url-shortener"
docker build . -t url-shortener
These steps create a new Docker image tagged url-shortener.
Warning
The Docker image build compiles its own Python interpreter and the psycopg2 library.
Be aware that this uses a fair amount of data (around 1 GB) and takes some time (around 15 minutes)!
After the build, follow the steps in the part Docker and DockerHub, using the tag url-shortener as the image:
services:
  url-shortener:
    image: url-shortener
    ...
To install the service using this method, you must have Python, version 3.11 or later, installed.
First, download the repository from GitHub and create a Python virtual environment:
# clone the repository from GitHub
git clone https://github.com/jozef-sabo/url-shortener.git
# change directory to the cloned folder
cd url-shortener
# create a virtual environment in the folder "venv"
python -m venv venv
# use the newly created Python virtual environment
source ./venv/bin/activate
# install the requirements
python -m pip install -r requirements.txt
Then, change directory to src and run app.py:
# change directory to src
cd src
# run app.py file
python app.py
The service runs on port 8000. To change this, edit the app.run call in app.py as follows:
app.run(host="0.0.0.0", port=<desired_port>, debug=False, load_dotenv=True)
Warning
Be aware that this method uses neither uWSGI nor any other production WSGI server. This is not recommended, as it can be a potential security risk!
To configure the service, there are three main files (two for the standalone installation, which does not use uWSGI) in which you can change the variables to the desired values.
The config.toml file configures the app itself. Each variable has its own description, which helps to understand it.
The uwsgi.ini file configures the uWSGI server (only applicable in Docker installations). Changing this configuration is recommended for experienced users only.
A regular user may be interested in two of the variables:
# number of processes on which the service runs (not to be confused with threads)
processes = 4
# the local address and port of the service
http = 127.0.0.1:8000
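To double-check these two values before mounting the file, the standard configparser module can read them. This is only a sketch: it assumes the file starts with the usual [uwsgi] section header, and the helper name read_uwsgi_settings is hypothetical.

```python
import configparser

# Sample content mirroring the two variables above; the [uwsgi] header
# is the conventional section name for uWSGI ini files.
SAMPLE = """
[uwsgi]
# number of processes on which the service runs
processes = 4
# the local address and port of the service
http = 127.0.0.1:8000
"""


def read_uwsgi_settings(text: str) -> dict:
    """Extract the process count and the HTTP bind address from ini text."""
    # uWSGI files may repeat keys and use '%', so relax strictness and
    # switch interpolation off.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    section = parser["uwsgi"]
    host, _, port = section["http"].rpartition(":")
    return {
        "processes": section.getint("processes"),
        "host": host,
        "port": int(port),
    }


if __name__ == "__main__":
    print(read_uwsgi_settings(SAMPLE))
```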
Caution
The .env file mainly contains credentials, which must stay secret.
Never allow any third party or user of the service to access this file. Otherwise, it is a security risk!
The file must contain at least these variables:
# secret value which Python uses for session-related functions
SECRET_KEY=""
# string in the format of "dbname=name user=user password=password host=host_ip port=port"
# where `dbname` is the name of the database used, `user` is the name of the user accessing the database,
# `password` is the secret key to access the database, `host` is the IP address or domain-name-like address of the host
# `port` is usually 5432 for the PostgreSQL database
DB_STRING=""
# secret value used to create custom links
ADMIN_PASS=""
# secret value used with reCAPTCHA feature enabled
RECAPTCHA_SECRET_KEY=""
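One possible way to produce such a file is sketched below: random values are generated with Python's secrets module, and DB_STRING is composed in the format described in the comments above. The function names build_db_string and render_env, and the example database values, are hypothetical.

```python
import secrets


def build_db_string(dbname: str, user: str, password: str, host: str, port: int = 5432) -> str:
    """Compose DB_STRING in the "dbname=... user=... password=... host=... port=..." format.

    Values containing spaces would need quoting; this sketch does not handle that.
    """
    return f"dbname={dbname} user={user} password={password} host={host} port={port}"


def render_env(db_string: str, recaptcha_secret: str = "") -> str:
    """Return the text of a .env file with freshly generated secrets."""
    lines = [
        f'SECRET_KEY="{secrets.token_hex(32)}"',      # 64 random hex characters
        f'DB_STRING="{db_string}"',
        f'ADMIN_PASS="{secrets.token_urlsafe(24)}"',  # random URL-safe string
        f'RECAPTCHA_SECRET_KEY="{recaptcha_secret}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values only; replace them with your own database settings
    db = build_db_string("shortener", "shortener_user", "change-me", "127.0.0.1")
    print(render_env(db), end="")
```

Redirecting the output to .env and restricting the file permissions (for example, readable by the owner only) keeps the credentials out of reach of other users.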
To use the configuration files, they need to be bound to the (existing) files inside the application, depending on the installation method. For Docker-based installations, add the following lines under the volumes section of the docker-compose.yaml file:
- <path to the folder>/config.toml:/url-shortener/config.toml
- <path to the folder>/uwsgi.ini:/url-shortener/uwsgi.ini
- <path to the folder>/.env:/url-shortener/.env
For the standalone installation, edit the existing files and create the .env file.
The application depends on a PostgreSQL database service (accessed through the psycopg2 Python library). The database must be set up so that it is accessible from the app.py file or, in Docker installations, from the uWSGI service.
The default initial schema can be found in the utils/database.sql file and must be applied to the database before the application is first used.
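The schema can be applied with any PostgreSQL client. As one option, here is a minimal sketch using psycopg2, which the project already depends on. It assumes the schema file can be executed as a single script and that DB_STRING from .env is a valid connection string; the function names load_schema and apply_schema are hypothetical.

```python
from pathlib import Path


def load_schema(path: str = "utils/database.sql") -> str:
    """Read the schema file from the repository."""
    return Path(path).read_text(encoding="utf-8")


def apply_schema(db_string: str, schema_sql: str) -> None:
    """Execute the schema against the database described by db_string."""
    import psycopg2  # imported here so load_schema works without the driver

    conn = psycopg2.connect(db_string)
    try:
        with conn:  # commits on success, rolls back on an exception
            with conn.cursor() as cur:
                cur.execute(schema_sql)
    finally:
        conn.close()


if __name__ == "__main__":
    import os

    # DB_STRING is expected in the environment, e.g. exported from .env
    apply_schema(os.environ["DB_STRING"], load_schema())
```

Run it once from the repository root, before the first start of the service.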
The software is written with extensibility and modularity in mind. Because of that, it also contains some features that can be turned on or off depending on your needs. Each of these features has a so-called feature switch, whose state can be set at deployment.
From here on, features are labeled with (feature).
Disabled by default; see the [network.proxy] section in the config.
The service can be accessed directly as a web server by publishing port 8000 of the container as port 80 on the host.
This is not recommended in practice, because it exposes the service to many security risks.
To minimize those risks, a reverse proxy can be placed in front of the service.
One commonly used option is nginx, an HTTP server that can act as a reverse proxy.
To set up the service using this method, you must have nginx installed.
First, create a new site within nginx. Replace <your_domain> with your own domain, e.g., example.com.
# create empty nginx configuration file
touch /etc/nginx/sites-available/<your_domain>
# then, create the directories that will hold the site files and logs
mkdir /var/www/<your_domain>
mkdir /var/www/<your_domain>/public_html
mkdir /var/www/<your_domain>/log
Then, put the following configuration into the newly created file:
server {
    # service files will be stored here
    root /var/www/<your_domain>/public_html;
    # use port 80 to access the site (IPv4 and IPv6)
    listen 80;
    listen [::]:80;
    # domain
    server_name <your_domain> www.<your_domain>;
    # logs
    error_log /var/www/<your_domain>/log/error.log;
    access_log /var/www/<your_domain>/log/access.log;
    # these files are searched for first in order to serve the index page
    index index.html index.htm index.nginx-debian.html;
    # the configuration itself
    location / {
        # address and port where the service is running
        # most probably the IP will be 127.0.0.1 and the port 8000
        proxy_pass http://<ip_address>:<port>/;
        proxy_set_header Host $http_host;
        proxy_set_header Upgrade $http_upgrade;
        # the following lines inform Flask about the user's IP address, etc.
        # they also need to be enabled in the service config later on
        # x-for
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        # x-proto
        proxy_set_header X-Forwarded-Proto $scheme;
        # x-host
        proxy_set_header X-Forwarded-Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
After inserting the contents into the file and replacing the placeholders with the correct values, the site must be enabled and the nginx service restarted:
# create a symlink in the sites-enabled folder so that nginx picks up the site
ln -s /etc/nginx/sites-available/<your_domain> /etc/nginx/sites-enabled/<your_domain>
# restart nginx service
systemctl restart nginx
Now, the site should be enabled and the reverse proxy should work.
However, the service cannot yet recognize the users' IP addresses, because every request appears to come from the proxy.
To solve this problem, the reverse proxy feature must be enabled in the service config.
To do this, change the following lines in config.toml:
[network.proxy]
# tell the service it is running behind the reverse proxy
enabled = true
# select which headers are sent to the service from nginx
x_for = true
x_proto = true
x_host = true
x_port = false
x_prefix = false
The entries set to true were already set up in the nginx config.
To enable more entries, add the corresponding lines to the nginx configuration file.
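The switch names above match the parameters of Werkzeug's ProxyFix middleware (x_for, x_proto, x_host, x_port, x_prefix), which is a common way to make a Flask app trust forwarded headers. Whether this service uses ProxyFix internally is an assumption; the sketch below only shows how such switches could be mapped onto it. The function names proxy_fix_kwargs and wrap_app are hypothetical.

```python
def proxy_fix_kwargs(proxy_cfg: dict) -> dict:
    """Map the boolean switches to ProxyFix hop counts.

    ProxyFix expects the number of trusted proxies per header; with a
    single nginx in front, an enabled header means 1 and a disabled one 0.
    """
    names = ("x_for", "x_proto", "x_host", "x_port", "x_prefix")
    return {name: int(bool(proxy_cfg.get(name, False))) for name in names}


def wrap_app(app, proxy_cfg: dict):
    """Wrap a Flask app in ProxyFix when the feature switch is on."""
    if not proxy_cfg.get("enabled", False):
        return app
    # Werkzeug is installed together with Flask
    from werkzeug.middleware.proxy_fix import ProxyFix

    app.wsgi_app = ProxyFix(app.wsgi_app, **proxy_fix_kwargs(proxy_cfg))
    return app


if __name__ == "__main__":
    cfg = {"enabled": True, "x_for": True, "x_proto": True,
           "x_host": True, "x_port": False, "x_prefix": False}
    print(proxy_fix_kwargs(cfg))
```

Only headers that nginx actually sets should be enabled; trusting a header that the proxy does not overwrite lets clients spoof it.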
Disabled by default; see the [recaptcha] section in the config.
The implementation is written as a REST API. This is useful for machine-to-machine communication, as it avoids complicated representations and data parsing, but it brings security risks. API calls can be automated and issued by bots in very high numbers. Such behavior can easily flood the inbound traffic and fill the disk space with a growing database. To protect against such actions, Google's reCAPTCHA can be used.
Warning
Be aware that fees may apply!
It is important to set up the reCAPTCHA service correctly, because high traffic increases usage and, with it, the possible cost. Check the pricing page here.
To use the reCAPTCHA service, you must register the shortener service in Google's Console.
Register for the service here by clicking the Get Started button and filling out the required fields.
It is important to fill in the correct domain (this will be checked on Google's side) and select the Score based (v3) option as the reCAPTCHA type (v2 will not work).
Google Cloud then returns two keys, each of a different type:
- site key: public, embedded in the website
- secret key: private, accessible only on the backend, must not be published
With the two keys retrieved from the Google Console, the configs need to be filled in.
Edit the [recaptcha] section of config.toml and add the retrieved site key:
# tell the service to serve the page with reCAPTCHA fields
enabled = true
# minimal score to be accepted as legitimate user,
# more on that at https://developers.google.com/recaptcha/docs/v3#interpreting_the_score
minimal_score = 0.5
# if true, url-shortener sends the requesting user's IP to reCAPTCHA service
verify_ip = true
# site key retrieved from Google
site_key = "<site_key>"
If verify_ip is set to true, url-shortener sends the requesting user's IP address to the reCAPTCHA service, which verifies that it matches the IP address seen when the user pressed the button on the website (that is, the user IP on the frontend is the same as the user IP on the backend).
It is therefore important that IP address parsing is set up correctly, especially when the service is behind a reverse proxy.
With an incorrect setup, reCAPTCHA will return a negative outcome.
Then, edit the .env file and add the following line with the retrieved secret key:
RECAPTCHA_SECRET_KEY="<secret_key>"
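For orientation, the server-side check that these settings feed into can be sketched as follows. This is not the project's actual code: it assumes the classic siteverify endpoint documented for reCAPTCHA v3, uses only the standard library, and the function names verify_token and is_legitimate are hypothetical.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

VERIFY_URL = "https://www.google.com/recaptcha/api/siteverify"


def verify_token(secret_key: str, token: str, remote_ip: Optional[str] = None,
                 timeout: float = 5.0) -> dict:
    """Send the token received from the browser to Google and return the JSON reply."""
    fields = {"secret": secret_key, "response": token}
    if remote_ip:
        # Corresponds to verify_ip = true in config.toml
        fields["remoteip"] = remote_ip
    data = urllib.parse.urlencode(fields).encode("utf-8")
    with urllib.request.urlopen(VERIFY_URL, data=data, timeout=timeout) as resp:
        return json.load(resp)


def is_legitimate(result: dict, minimal_score: float = 0.5) -> bool:
    """Accept the request only if verification succeeded and the score is high enough."""
    return bool(result.get("success")) and float(result.get("score", 0.0)) >= minimal_score


if __name__ == "__main__":
    # Example reply shape; a real one comes from verify_token(...)
    print(is_legitimate({"success": True, "score": 0.9}, minimal_score=0.5))
```

The minimal_score parameter plays the role of the config value of the same name: replies scoring below it are treated as likely bots.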
After that, when you access the index page of the site, a small reCAPTCHA badge should appear in the bottom-right corner.