May 4, 2021
The HubHaus infrastructure transformation
HubHaus Coliving: Reimagined Shared Housing
https://thehubhaus.com/
- Laravel (PHP), Blade
- Nuxt.js (Frontend)
- MySQL, Redis, Redshift
The system consists of several PHP applications built on the Laravel framework, providing a REST API and server-rendered frontends using Blade templating.
In addition to the main PHP stack, the company developed a new frontend application using the Nuxt.js framework.
The applications use MySQL and Redis as databases, with Redshift acting as a data warehouse. Both the backend and frontend applications are covered by unit and integration tests.
Original state of infrastructure
The production applications run in AWS; there is no testing or development environment. The original infrastructure was built with many architectural and security anti-patterns, making it potentially insecure, unreliable and hard to scale.
The backend applications are deployed on separate EC2 instances using Envoyer. They are not dockerized and have no CI/CD pipelines. In this state, the applications are not horizontally scalable and cannot recover from disaster states on their own.
The frontend application is dockerized and deployed in ECS. It also has a CI/CD pipeline in CircleCI managing its build, testing and deployment. In its current state, however, this application is not scalable either.
The production databases are deployed in AWS, but in a different VPC than the applications that access them. Traffic between them travels over public channels, and all databases are publicly accessible, protected only by security groups.
To summarize, the way the applications are deployed makes them unscalable, and recovering from disaster states requires manual intervention. No testing environment exists, and the applications are deployed directly from the master branch with limited automated testing. Each instance an application runs on is unique and cannot simply be replaced by a new one in case of problems.
- Create isolated environments for staging and production
- Isolate non-public applications and services into private subnets
- Provide VPN access to all services in AWS
- Migrate applications to Kubernetes clusters and make them scalable
- Automate build, test and deployment of each application
- Provide robust monitoring system
Separating utilities and tools common to all environments
To correct the security and architectural defects, we created separate utility and staging VPCs in AWS. The utility VPC hosts all common tools and utilities, such as the VPN server and cluster monitoring tools, and acts as the single point of entry to the private sections of the other VPCs.
By placing all services that do not need to be publicly accessible into private subnets, we mitigate many security risks, while the VPN connection still provides access to them for internal purposes.
To improve reliability, we created a separate staging environment where new deployments can be tested before going to production. Because all resources are created and provisioned with Terraform and Ansible, the staging environment can later be easily replicated into a new production environment.
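This per-environment setup can be sketched as a reusable Terraform module; the module layout, variable names and CIDR ranges below are illustrative, not the actual configuration:

```hcl
# Each environment (staging, production) instantiates the same module,
# so replicating an environment is a matter of adding one module block.
module "staging" {
  source = "./modules/environment"

  name       = "staging"
  cidr_block = "10.1.0.0/16"

  # Non-public services live in private subnets; only load balancers
  # sit in public subnets.
  public_subnets  = ["10.1.0.0/24", "10.1.1.0/24"]
  private_subnets = ["10.1.10.0/24", "10.1.11.0/24"]

  # Peering with the utility VPC lets VPN users reach the private
  # subnets of every environment through a single point of entry.
  utility_vpc_id = module.utility.vpc_id
}
```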
Moving towards Kubernetes
We are moving all of the applications to Kubernetes to improve scalability and to make deployments easier to manage. Kubernetes nodes can also run in multiple availability zones, making the platform more resilient to outages than the original infrastructure running on EC2 instances in a single zone.
To use Kubernetes, we first had to dockerize the existing applications and deploy several tools that automate routine tasks, such as managing DNS entries (external-dns), issuing certificates (cert-manager) and cluster scaling (cluster-autoscaler). We also added log collection from the deployed applications using fluentd.
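Load-based scaling of an application then becomes a matter of attaching a HorizontalPodAutoscaler to its Deployment; the names, namespace and thresholds in this sketch are assumptions:

```yaml
# Illustrative HPA for one of the Laravel API deployments.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2        # at least one pod per availability zone
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU exceeds 70 %
```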
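With external-dns and cert-manager in place, exposing an application needs little more than annotations on its Ingress. The hostname and issuer name here are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend
  annotations:
    # cert-manager issues and renews the TLS certificate automatically
    cert-manager.io/cluster-issuer: letsencrypt-prod
    # external-dns creates the matching DNS record in Route 53
    external-dns.alpha.kubernetes.io/hostname: staging.thehubhaus.com
spec:
  tls:
    - hosts:
        - staging.thehubhaus.com
      secretName: frontend-tls
  rules:
    - host: staging.thehubhaus.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 80
```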
Monitoring Kubernetes clusters
To make potential issues easier to debug systematically, we deployed cluster monitoring based on Prometheus. The Prometheus server runs in the utility VPC, outside the Kubernetes clusters, and scrapes metrics from all environments. To visualize the collected data, we deployed Grafana and created several custom dashboards, mainly for monitoring the nodes and deployments in each cluster.
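Because the Prometheus server sits outside the clusters, each environment appears as its own scrape job discovered through the cluster's API server. A minimal fragment of such a configuration might look like this; the job name, endpoint and token path are assumptions:

```yaml
# prometheus.yml fragment -- one scrape job per environment.
scrape_configs:
  - job_name: "staging-kubernetes-nodes"
    scheme: https
    bearer_token_file: /etc/prometheus/tokens/staging
    kubernetes_sd_configs:
      - role: node
        # Reached from the utility VPC over VPC peering.
        api_server: https://staging-k8s-api.internal:6443
        bearer_token_file: /etc/prometheus/tokens/staging
```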
To further automate the deployment process, we deployed Jenkins, which, together with CircleCI, manages continuous integration and deployment. Jenkins is also used to manage data migration and obfuscation between environments.
Database migration and obfuscation
Jenkins pipelines provide a "one-click" migration of databases between environments. During the migration, sensitive data is obfuscated, so real production data does not leak into the other environments.
All of this work followed the Infrastructure as Code principle, using Terraform to create resources in AWS and Ansible to provision them. The resulting environment is therefore easy to replicate into other environments, such as production and development.
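Such a "one-click" migration job can be sketched as a declarative Jenkins pipeline; the job parameters and script names below are illustrative, not the actual job:

```groovy
pipeline {
    agent any
    parameters {
        choice(name: 'TARGET_ENV',
               choices: ['staging', 'development'],
               description: 'Environment to load the data into')
    }
    stages {
        stage('Dump production database') {
            steps {
                sh './scripts/dump-production.sh'
            }
        }
        stage('Obfuscate sensitive data') {
            steps {
                // Replace personal data in the dump with fake values
                sh './scripts/obfuscate-dump.sh'
            }
        }
        stage('Import into target environment') {
            steps {
                sh "./scripts/import-dump.sh ${params.TARGET_ENV}"
            }
        }
    }
}
```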
- Automatic build, test and deployment within minutes of making a commit
- Automatic scaling based on current load
- Automatic handling of DNS entries and SSL certificates
- Improved logging
- “One-click” database migration with data obfuscation
- Comprehensive cluster monitoring using Prometheus and Grafana
To sum up, we dramatically improved the reliability and scalability of the applications. When issues occur, we can now evaluate a wide range of metrics and logs, and we have much better visibility into the whole technology stack.
The infrastructure is easy to manage and is maintained following the Infrastructure as Code principle. Thanks to that, we can easily replicate the created environment and create additional environments with little modification.