Short URL Service Design V2


Building a short URL service on your own custom domain is fun!

You can save the web pages you like for quick reference later; for example, bosong.link/puppy links to my favorite puppy video. You can also shorten a URL and send it to your friends or publish it on the internet.

Sounds exciting, right? I’ll walk you through the whole process of building this service, including the problems you might encounter and how to solve them.

Links to the source code on GitHub and the image on Docker Hub are provided at the end of this page, under the MIT license.

User journey

There are two user journeys: link creation and link redirection.

Link creation

When a user lands on the home page (bosong.link) or on a short link that does not exist in the database (bosong.link/link-does-not-exist), the user is redirected to the link creation page, which is the home page.

Link redirection

When a user lands on a short link that exists in the database (http://bosong.link/test, for example), the user is redirected to the original URL.
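
The two journeys boil down to a single lookup. The following is a minimal sketch, assuming an in-memory Map stands in for the database; the function name resolveJourney, the prefill field, and the sample entry are made up for illustration and are not taken from the project’s source.

```javascript
// Hypothetical stand-in for the database: short link -> original URL.
const links = new Map([
  ["test", "https://example.com/some/long/page"],
]);

// Decide which user journey applies to a request path such as "/test".
function resolveJourney(path, store) {
  const shortLink = path.replace(/^\/+/, ""); // drop leading slashes
  if (shortLink !== "" && store.has(shortLink)) {
    return { journey: "redirection", target: store.get(shortLink) };
  }
  // Home page or unknown short link: show the creation page.
  return { journey: "creation", prefill: shortLink };
}
```

With this sketch, resolveJourney("/test", links) takes the redirection journey, while "/" and "/link-does-not-exist" both end up on the creation page.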

System overview

There are three components in the short link service: the frontend code (written in TypeScript with the Angular framework), the backend code (written in JavaScript on Node.js with the Express framework), and the database (MongoDB).

As you might have noticed, this project uses the MEAN stack, which stands for MongoDB, Express, Angular, and Node.js. An alternative stack is LAMP, which stands for Linux, Apache, MySQL, and PHP. MEAN is chosen because it is relatively new and popular, and because JavaScript/TypeScript is used in both the frontend and the backend, which makes the code relatively easy to maintain: you don’t need to learn two languages. (reference)

Frontend code

The frontend code (also called the client) runs in the browser. It responds directly to user actions, sends requests to the backend server, and renders the result on the page or, in our case, redirects the browser to the target URL.

The Angular framework is used to build a single-page application, where only one full page load is needed (the first request). All subsequent requests are sent to the server as asynchronous calls. The frontend receives the response and updates the page and, sometimes, the URL in the browser address bar as well.

Single-page application vs Multi-page application

Before single-page applications, the web page was reloaded every time the URL in the browser address bar changed. This is called a multi-page application.

For example, in the multi-page application era, when you searched for puppy at www.google.com/search?q=puppy , the full page was sent to your browser. When you then searched for kitty, the browser requested www.google.com/search?q=kitty , and the server sent the full page to your client again.

This solution works but is not efficient: components shared by the two pages, such as the nav bar at the top, are sent by the server twice, although they could have been cached in the browser to reduce network traffic and latency.

The single-page application solves this problem by requesting the full page only once, when the user lands on the website, and then requesting only the data that needs to change in the following requests.

For example, in a single-page application, when the user switches to the kitty search result page, only the search result content, which sits under the shared nav bar in the screenshot below, is requested. The shared nav bar is cached in the browser and reused. Less data is sent over the network, so latency is reduced.

The following screenshot marks the shared component on the Google search result page.

Single-page application explained in depth

In a multi-page application, when a user requests a URI, for example this page, https://www.bo-song.com/short-url-service-design-v2/ , the server analyzes the request, assembles the page, and sends the whole page to the client.

In a single-page application, on the other hand, when a user requests a URI on the website for the first time, the server always sends back the home page (HTML, JS, and other static files) along with the URI parameter. The browser figures out what data it needs and sends requests to the server. Unlike the requests in a multi-page application, these requests do not ask for a page (HTML) but for the data the page needs (for example, in JSON format). The browser then assembles the page, renders it, and presents it to the user.
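
The difference is easier to see in code. Below is a minimal sketch, assuming a hypothetical JSON payload of the form { results: [...] } and made-up markup. It only illustrates that the shared shell is loaded once, while the content area is rebuilt from data on every request.

```javascript
// Shared shell, downloaded once on the first page load (hypothetical markup).
const shell = { navBar: "<nav>Search</nav>", content: "" };

// Turn a JSON payload into the HTML for the content area only.
// Note: no HTML escaping is done here, to keep the sketch short.
function renderResults(payload) {
  const items = payload.results.map((title) => `<li>${title}</li>`);
  return `<ul>${items.join("")}</ul>`;
}

// Swap the content area and return the full page the user would see.
function showPage(payload) {
  shell.content = renderResults(payload);
  return shell.navBar + shell.content;
}

// Simulate two searches: only the content area changes between them.
const puppyPage = showPage({ results: ["Puppy video", "Puppy photos"] });
const kittyPage = showPage({ results: ["Kitty video"] });
```

Both pages start with the same nav bar markup; only the list that follows it is rebuilt from the data returned by the server.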

TypeScript, which the Angular framework uses, is a superset of JavaScript that adds type checking at compile time. This helps developers write less error-prone code and find bugs as early as compile time.

Under the hood, the TypeScript compiler compiles TypeScript files to JavaScript, and the browser runs the resulting JavaScript.

Backend code

Backend code runs on your server. It serves two purposes.

  1. Serve the static files (HTML, CSS, JS, etc.).
  2. Provide APIs for the client to call.

The backend could be written in PHP (Apache), JS (Node.js), Java (Tomcat), Python (Django), or even C/C++.

In this shortlink service, it’s written in JavaScript, using the Node.js runtime and the Express framework.

Runtime system and framework

Most programming languages have some form of runtime system that provides an environment in which programs run. This environment may address a number of issues, including the management of application memory, how the program accesses variables, mechanisms for passing parameters between procedures, interfacing with the operating system, and so on.

For Java, the runtime system is the JVM. For JavaScript running in Chrome, the browser is the runtime, and it uses the V8 engine to execute the code. For JavaScript running on a server, the runtime system is Node.js, which is also built on V8.

You might be familiar with the document and window objects from writing frontend JS code. These objects are provided by the browser, not by the JavaScript engine itself. In Node.js there are no such objects; instead, filesystem and network APIs are provided.
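
One way to see this difference is to check which globals exist. The sketch below is only an illustration: describeRuntime is a made-up helper, and it accepts a fake global object so that it can be tried out without switching environments.

```javascript
// Report which kind of runtime a global object belongs to.
// Pass a fake object to experiment; defaults to the real global object.
function describeRuntime(globalObject = globalThis) {
  const hasDom =
    typeof globalObject.window !== "undefined" &&
    typeof globalObject.document !== "undefined";
  if (hasDom) {
    return "browser";
  }
  // Node.js exposes a global "process" object with version information.
  const proc = globalObject.process;
  if (proc && proc.versions && proc.versions.node) {
    return "node";
  }
  return "unknown";
}
```

Called with no argument, the function inspects the environment it is actually running in.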

A framework writes the common boilerplate components for you so that you can focus on the real business logic. Some frameworks, such as Angular, also guide developers in organizing their code into components.

For example, the Express framework hides the complexity of listening on a port, matching a URI with the correct HTTP method, intercepting a request, and so on. From a developer’s perspective, you only need to write the following code to start a server, listen on port 3000, and return “Hello world!” for all URIs. Express, that is, the framework, does all the dirty and detailed work for you.

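import express from "express";
const app = express();
const port = 3000;
// Answer every GET request, whatever the path, with the same text.
// "*" is the Express 4 wildcard; Express 5 needs a named one such as "/*splat".
app.get("*", function (req, res) {
  res.send("Hello world!");
});
// Start listening for incoming connections on the chosen port.
app.listen(port);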
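
To get a feel for what a framework takes off your hands, here is a toy router in plain JavaScript. It is not how Express is implemented; createRouter, dispatch, and the route path are invented for this sketch, which only shows the idea of matching a method and a path to a handler.

```javascript
// A toy router: maps "METHOD path" pairs to handler functions.
function createRouter() {
  const routes = new Map();
  return {
    // Register a handler for GET requests on an exact path.
    get(path, handler) {
      routes.set(`GET ${path}`, handler);
    },
    // Find the handler for a request and return its response.
    dispatch(method, path) {
      const handler = routes.get(`${method} ${path}`);
      if (!handler) {
        return { status: 404, body: "Not found" };
      }
      return { status: 200, body: handler() };
    },
  };
}

const router = createRouter();
router.get("/hello", () => "Hello world!");
```

A real framework adds much more on top of this, such as path patterns, middleware, and the actual network handling.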

Database

The database stores the data for your application. MongoDB is used in this project for its scalability. The collection is quite simple: each document contains a short_link field and an original_link field.

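import mongoose from "mongoose";

// Each document maps one short link to the URL it redirects to.
const LinkSchema = new mongoose.Schema({
  short_link: {
    type: String,
    required: true,
    index: true,
    unique: true, // no two records may share a short link
  },
  original_link: {
    type: String,
    required: true,
  },
});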
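
To make the intent of the required and unique options concrete, here is a minimal in-memory sketch in plain JavaScript. It does not use Mongoose or MongoDB; createLinkCollection and its methods are hypothetical names, and the checks only mimic what the schema above is meant to guarantee.

```javascript
// Hypothetical in-memory collection keyed by short_link.
function createLinkCollection() {
  const docs = new Map();
  return {
    // Insert a record, mimicking the "required" and "unique" rules.
    insert(doc) {
      for (const field of ["short_link", "original_link"]) {
        if (typeof doc[field] !== "string" || doc[field] === "") {
          throw new Error(`${field} is required`);
        }
      }
      if (docs.has(doc.short_link)) {
        throw new Error(`short_link "${doc.short_link}" already exists`);
      }
      docs.set(doc.short_link, { ...doc }); // store a copy
    },
    // Look up a record by short link; returns undefined when absent.
    find(shortLink) {
      return docs.get(shortLink);
    },
  };
}
```

Inserting the same short_link twice throws in this sketch, which is the behavior a unique index is there to enforce.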

MongoDB vs MySQL comparison

MongoDB is a NoSQL database and MySQL is a SQL (relational) database. So, what’s the difference? (reference)

Use a SQL database if you need ACID compliance (Atomicity, Consistency, Isolation, Durability) and your data is structured with a schema that rarely changes, as with bank account transactions or a stock broker’s database. A relational database is not easy to scale horizontally as the number of rows in a table grows. In other words, it’s not easy to shard and partition a relational database.

Use a NoSQL database if you store large volumes of data without a fixed structure, rely on cloud computing and storage, or follow a rapid development process such as agile.

In the shortlink case, we do not need ACID compliance, and we should focus on scalability because the shortlink volume could grow rapidly.
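
Horizontal scaling usually means spreading records across several machines by some key. The sketch below shows one common approach, hash-based sharding on the short link. It is an illustration only: hashString and shardFor are made-up helpers, and this is neither how MongoDB implements sharding nor how this project is configured.

```javascript
// Simple deterministic string hash (djb2 variant), kept in unsigned 32 bits.
function hashString(text) {
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash * 33) ^ text.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Pick the shard (0 .. shardCount - 1) that owns a given short link.
function shardFor(shortLink, shardCount) {
  return hashString(shortLink) % shardCount;
}
```

Because each short link is looked up independently by its key, the same link always lands on the same shard, and no query needs to join data across shards.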

Development workflow

Docker is used to containerize the program so that it is independent of the environment and the same image can be shipped to different machines. There are two containers in this project: one for MongoDB, and the other for the frontend and backend code.

Details

The details of the design can be found on the GitHub page.

References

Short link service V1 Design doc: https://www.bo-song.com/short-url-service/

MEAN and Docker development notes: https://www.bo-song.com/docker-and-mean-note/

--- FOLLOWING ARE THE RAW NOTES I WROTE WHEN WRITING THE SERVER ---

User journey in V1

Get link

  1. User issues a /{short_link} request to the server.
  2. Server routes every request to redirect.php for further handling.
  3. Server checks the DB. If the {short_link} exists, go to step 4; otherwise go to step 5.
  4. Server sets the 302 status code with the {original_link} in the response header. <END>
  5. Server returns the index page with the {short_link} pre-filled in the form.

Post link

  1. User enters the index page either by requesting the / page or by being redirected from a /{short_link} page where {short_link} does not exist in the DB.
  2. User enters the {short_link} and {original_link} in the form and submits it to the server.
  3. Server writes the record to the DB; the client resets the URL in the browser to /{short_link}.

Problems in V1:

  1. In the redirection stage, the server sets the response header to send the browser to the new location.
    1. Pros: simple logic, fast turnaround time.
    2. Cons: low extensibility; it is hard to add client-side behavior before redirecting, such as integrating with Google Analytics for link usage tracking.
  2. bosong.link shares a server and DB with the bo-song.com blog.
    1. Pros: saves money and resources.
    2. Cons: single point of failure.
  3. Apache mod_rewrite is used to remap short URLs to the redirect.php page.
    1. Pros: it works.
    2. Cons: the rewrite rules are hard to maintain.

Solutions

Browser redirects the page

Sample code

When a user lands on the /{short_url} page, the server always returns the single index page. The index page then parses the {short_url} and sends a request to the server asking for the long URL.
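
A minimal sketch of this client-side flow in plain JavaScript follows. The API path /api/links/, the response shape, and the function names are assumptions made for illustration; the actual project may differ.

```javascript
// Extract the short link from a browser path such as "/puppy".
function parseShortUrl(pathname) {
  return decodeURIComponent(pathname.replace(/^\/+/, ""));
}

// Build the URL of the (hypothetical) lookup endpoint.
function lookupUrl(shortUrl) {
  return "/api/links/" + encodeURIComponent(shortUrl);
}

// Decide what the index page does once the lookup response arrives.
// `response` is assumed to look like { original_link: "https://..." } or null.
function nextStep(shortUrl, response) {
  if (response && typeof response.original_link === "string") {
    return { action: "redirect", to: response.original_link };
  }
  // Unknown short link: show the creation form with the link filled in.
  return { action: "showForm", prefill: shortUrl };
}
```

In a browser, the index page would request lookupUrl(parseShortUrl(location.pathname)), parse the JSON response, and set window.location.href to the returned address when the action is "redirect". Because the redirect happens in client code, there is room to run extra logic, such as usage tracking, before leaving the page.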

We probably need to set up a new subdomain to handle the API requests, such as api.bosong.link/

Set up separate docker server

Use docker for easy migration and maintenance.

Kubernetes

Kubernetes is like Borg at Google; a Docker image is like an MPM package at Google.

https://www.digitalocean.com/community/tutorials/an-introduction-to-kubernetes

Using MEAN Stack

In V2, the shortlink service will switch to the MEAN stack and be containerized with Docker.
