Deployment into Cloud#

In the previous section, we saw examples of how to create a simple web application and an even simpler API service. However, both of these examples were only running locally. They weren’t exposed to the world, that is, to our client. Let’s dive into the deployment process, then!

Containerizing the App#

Imagine a situation in which we create a script, written in Python and dependent on a few libraries, for example our favorite ones: pandas and numpy. Because we did a good job, our colleague asks us to share the script with him, so we send the code via Slack. But then the colleague starts complaining that the code isn’t working.
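For concreteness, the shared script could be something as small as the following sketch (the file name and the data are made up for illustration):

```python
# share_me.py - a hypothetical script sent over Slack (names and data are illustrative)
import numpy as np
import pandas as pd

# a tiny stand-in for whatever data the script really processes
df = pd.DataFrame({"age": [22, 38, 26], "fare": [7.25, 71.28, 7.93]})

# works on our machine: Python 3 with pandas and numpy installed
print(np.round(df["fare"].mean(), 2))  # prints 28.82
```

On the colleague’s machine, with Python 2 and no pandas installed, the very first import already fails with an ImportError, which is exactly the situation described above.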

In this example, we can basically say that our script is an application that we want to deploy on our colleague’s local machine. And the reason why it isn’t working is simple: we have different environments. The colleague didn’t say that he only uses Python 2, and he hasn’t installed the pandas package yet. Since this is a thought experiment, he could even use a different OS, and he definitely has a different file system, so none of the absolute paths used in our script would work.

But what can we do to prevent this from happening again? As the simplest solution, we can define the Python environment using an environment.yml file for conda environment configuration, so our colleague can create a brand new environment based on our specification. Or we can share a requirements.txt file with a list of all the Python dependencies, which can be installed via pip/conda in a similar way. But a more sophisticated method is to containerize the app, which means to create an isolated environment with:

  • its own OS,

  • its own file system,

  • a single (and correct) version of Python,

  • minimum necessary requirements.
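The environment-file approach mentioned above is the lightweight alternative. A hypothetical environment.yml for the pandas/numpy script could look like this (the environment name and pinned versions are illustrative):

```yaml
name: shared-script
dependencies:
  - python=3.9
  - pandas=1.5
  - numpy=1.23
```

The colleague would then run conda env create -f environment.yml. The pip equivalent is a requirements.txt with lines such as pandas==1.5.1, installed via pip install -r requirements.txt.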

And this is something we can create using Docker. Docker uses its own nomenclature, so let’s describe the whole process a little bit.

The isolated environment is defined in a script called a Dockerfile. It contains the specification of the OS, file system, dependencies, and applications to be run. When the environment is built based on the specs in the Dockerfile, a Docker image is created. A Docker image is a logical entity that can be shared between users. A running instance of a Docker image is called a Docker container. The Docker container is the environment where our apps are actually running.

As stated before, a Dockerfile is a script that uses Docker conventions to define a Docker image. The following commands are commonly used in a Dockerfile.

  • FROM - indicates which base image to use.

  • ARG - defines variables that users can pass in at build time.

  • USER - sets the user name or user group to use when running the image.

  • COPY - copies files and directories into the environment.

  • WORKDIR - sets the working directory inside the container.

  • ENV - sets an environment variable.

  • RUN - runs a shell command.

  • EXPOSE - informs Docker that the container listens on the specified network ports at runtime; useful when testing Docker applications locally.

  • CMD - is used only once, at the end of the Dockerfile, and contains the final command to run when the container is executed.
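A small, generic Dockerfile exercising all of the commands above might look like this sketch (the base image, user name, and file names are illustrative, not the ones used in the example below):

```dockerfile
# base image with Linux and Python preinstalled
FROM python:3.9-slim

# build-time variable, overridable with: docker build --build-arg APP_VERSION=1.0 .
ARG APP_VERSION=0.1.0

# run the app as a non-root user
RUN useradd --create-home appuser
USER appuser

# working directory inside the container
WORKDIR /home/appuser/app

# install dependencies first, then copy the rest of the code
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
COPY . .

# environment variable available at runtime
ENV PORT=8000

# document the port the app listens on
EXPOSE 8000

# final command executed when the container starts (app.py is hypothetical)
CMD ["python", "app.py"]
```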

Example#

For example purposes from now on, we will use the “ds-academy-api-titanic” repository. It defines an API for the Titanic survival model. As you can see, the following Dockerfile is used.

# as the base image, we will use python:3.9.15-slim-buster, which contains a Linux OS and Python 3.9
FROM python:3.9.15-slim-buster

# we set our working directory to /app
WORKDIR /app

# we copy the requirements.txt file into our working directory inside the container
COPY requirements.txt .

# we install all the Python dependencies
RUN pip install --no-cache-dir wheel
RUN pip install --no-cache-dir -r requirements.txt

# we copy all the remaining files into our working directory inside the container
COPY . .

# we run the API server inside the docker container according to the documentation
# https://fastapi.tiangolo.com/deployment/server-workers/
CMD gunicorn api:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:$PORT

When the Dockerfile is ready, we can try to build the Docker image with the api-titanic-survival-model:latest tag. We have to run the following command from the project root directory, where the Dockerfile is present.

docker build -t api-titanic-survival-model:latest .

If the Docker image is built successfully, we can try to run the Docker container. In the command, we define the environment variable PORT, which is used inside the Docker container to run the web server on a specific port. We also use the -p argument, which lets us publish a container’s port to the host. With -p 8000:8000, we are basically telling Docker to expose container port 8000 and map it to port 8000 of our local machine. Lastly, we need to define which image we would like to run. For that purpose, we use the tag from the previous build phase.

docker run -i -e "PORT=8000" -p 8000:8000 api-titanic-survival-model:latest

If there is no problem, we should now be able to reach the endpoints of our API at http://localhost:8000; for example, FastAPI automatically serves interactive documentation at http://localhost:8000/docs.

Deployment into Cloud Service#

Now, when we know what docker is and we have our environment prepared, nothing holds us back from deployment.

Again, there are a lot of services we can use to deploy our code and expose it to the whole world. In the following example, we will deploy our app to Heroku. It is a cloud platform as a service (PaaS) that allows applications to be hosted in the cloud. With a few limitations, Heroku has been offering a free tier for smaller applications. Their pricing plan will change at the end of November 2022, so for the purposes of the DS Academy 2022, we can still benefit from the current pricing plan. Within the free plan, we are able to connect Heroku directly to our GitHub repository and deploy code from there.

The following manual describes what needs to be done to deploy the API service, not the web app.

Setup GitHub repository#

At the beginning, we need to set up a public GitHub repository and push our application scripts into it.

  1. Create public GitHub repository.

  2. Based on the example repository, create the following files needed for the deployment.

    1. api.py

    The file with the API logic: model loading, parameter definitions, user input checks, and the logic for returning the results. It is not necessary to stick to this file name, but it is referenced in other places, so if you change it, it needs to be changed accordingly elsewhere.

    2. Dockerfile

    As stated before, this is the file used to define the Docker image. It is necessary for the deployment to Heroku.

    3. requirements.txt

    The file with the list of Python dependencies. These dependencies are installed into the Docker image that is deployed to Heroku. Make sure to list everything you need! On the other hand, there is a memory limit of 512 MB per application, so list only the necessary libraries!

    4. heroku.yml

    A Heroku-specific file which is used for deployment using Docker.
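A minimal heroku.yml that tells Heroku to build the web process from our Dockerfile looks like this:

```yaml
build:
  docker:
    web: Dockerfile
```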

Setup Heroku account & app#

In the second phase, we can go to the Heroku web UI and set up the deployment process for our application.

  1. Go to the Heroku website, and create an account for free.

  2. Install the Heroku CLI and log in to your Heroku account: heroku login.

  3. In the Heroku web UI, create an app.

    1. Choose a unique name.

    2. Add your app to the Europe region.

  4. In the Heroku web UI, integrate the app with the GitHub repository.

  5. In the Heroku CLI, set the stack for the project to container: heroku stack:set container -a <name of the app>.

  6. In the Heroku web UI, for the first time, manually deploy your master branch by clicking the “Deploy” button in the corresponding application section.

  7. In the Heroku web UI, click the “View” button to open a new tab with your app running.

Now, we should be ready for the final presentation!