Controlling outbound communication from your Amazon Virtual Private Cloud (Amazon VPC) to the internet is an important part of your overall preventive security controls. By limiting outbound traffic to trusted domains (a practice known as "whitelisting"), you help prevent instances from downloading malware, communicating with bot networks, or attacking internet hosts. Blocking all outbound web traffic, however, is rarely practical. You'll often need to allow access to well-known domains, for example to communicate with partners, download software updates, or reach AWS API endpoints. In this post, I'll show you how to limit outbound web connections from your VPC to the internet by using a web proxy with custom domain whitelists or DNS content filtering services.


Benefits and deliverables of the solution


This solution is based on Squid, an open source HTTP proxy. The proxy can be used by all workloads running in the VPC, such as Amazon Elastic Compute Cloud (Amazon EC2) instances and AWS Fargate tasks. The solution provides the following benefits:


  • An outbound proxy that permits connections to the whitelisted domains you define and presents customizable error messages when connections to unapproved domains are attempted
  • Optional domain content filtering based on DNS, delivered by third-party providers such as OpenDNS, Quad9, CleanBrowsing, Yandex.DNS, and others. To use this option, you must be a customer of these external services
  • Transparent handling of encryption: the solution reads the domain from the Server Name Indication (SNI) extension in TLS, so encryption in transit and end-to-end encryption are both preserved
  • An Auto Scaling group with Elastic Load Balancing (ELB) Network Load Balancers, spread across several of your existing subnets (and Availability Zones) and scaling based on CPU load
  • One Elastic IP address per proxy instance for internet connectivity. The websites you communicate with sometimes need to know your IP addresses in order to accept traffic from you. Assigning Elastic IP addresses to the proxies lets you know which IP addresses your web connections will originate from
  • Proxy access logs delivered to Amazon CloudWatch Logs
  • Proxy metrics available in Amazon CloudWatch Metrics
  • Automated deployment of the solution with AWS CloudFormation


Out of scope


This solution does not serve applications that cannot be configured to use a proxy. Deep packet inspection is also out of scope.


TLS encryption is maintained end to end, and only the SNI extension is inspected. For unencrypted traffic (HTTP), only the Host header is examined.


DNS content filtering must be delivered by a third-party provider; this solution only integrates with it.


Services used, cost, and performance


The following services are used by the solution:

  • AWS Network Load Balancers. See the Elastic Load Balancing pricing page
  • Four AWS Elastic IP addresses, which are charged when they are not in use, as described on the Elastic IP Addresses pricing page
  • AWS Secrets Manager, which stores the domain list. See the AWS Secrets Manager pricing page
  • Squid, a free and open source proxy server
  • Amazon EC2 On-Demand Instances, which run the Squid proxies. See the Amazon EC2 pricing page
  • Amazon Linux 2 and Auto Scaling groups, which are both free of charge


The Squid access log is stored in Amazon CloudWatch Logs. See the Amazon CloudWatch pricing page.



Solution architecture



As seen in Figure 1:

  • An AWS CloudFormation template deploys the solution automatically
  • The Squid access log is stored in CloudWatch Logs so that you can search and analyse it
  • AWS Secrets Manager stores the allowed (whitelisted) domains. A cron job on each Amazon EC2 proxy instance retrieves the domain list every five minutes and updates the proxy configuration if the list has changed. The values in Secrets Manager are provided by CloudFormation and can be read only by the proxy EC2 instances
  • Clients running on EC2 instances must have proxy settings that point to the Network Load Balancer. The load balancer forwards each request to the fleet of proxies in the target group
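
The five-minute refresh described above could be sketched roughly as follows. This is an illustration only, not the script the solution ships: the function names, the whitelist path passed by the caller, the comma-separated secret format, and the `systemctl restart squid` step are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch of a cron-driven whitelist refresh. All names and paths are hypothetical.

# Copy the candidate list over the current one only when the content differs.
# Prints "changed" or "unchanged" so the caller can decide whether to restart.
sync_whitelist() {
  local candidate="$1" current="$2"
  if [ -f "$current" ] && cmp -s "$candidate" "$current"; then
    echo "unchanged"
  else
    cp "$candidate" "$current"
    echo "changed"
  fi
}

# Fetch the list from Secrets Manager and restart Squid only if it changed.
# Assumption: the secret holds a comma-separated list of domains.
refresh_whitelist() {
  local secret_id="$1" current="$2" list tmp result
  list="$(aws secretsmanager get-secret-value --secret-id "$secret_id" \
            --query SecretString --output text)" || return 1
  tmp="$(mktemp)"
  printf '%s\n' "$list" | tr ',' '\n' > "$tmp"
  result="$(sync_whitelist "$tmp" "$current")"
  rm -f "$tmp"
  if [ "$result" = "changed" ]; then
    systemctl restart squid   # assumed service name
  fi
}
```

A cron entry that runs a script calling `refresh_whitelist` every five minutes would complete the loop. Comparing before copying keeps Squid from restarting, and dropping connections, when nothing has changed.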


Prerequisites


  1. You need an existing VPC with public and private subnets spread across several Availability Zones (AZs). See Default VPC Setup for an explanation of how to set up your VPC environment.
  2. You need an internet gateway with routing configured, because only traffic from a public subnet can reach the internet.


You don't need to deploy a NAT (network address translation) gateway, because the outbound proxy provides that function.


Integration with DNS services for content screening


If you need content filtering from a third-party service, such as OpenDNS or Yandex.DNS, you must first sign up and become a customer of that service. Many offer a free tier as well as paid options for more advanced features or custom categories. As the customer, you are responsible for this choice. (Learn more about the shared responsibility between AWS and the customer.)


Your DNS service provider will assign you a set of DNS IP addresses. You'll need to enter these IP addresses when you provision the stack (see Installation below).


If the DNS provider asks for them, you can give it the source IP addresses of the proxies. The stack output lists four reserved IP addresses (see Output parameters below).


Installation (one-time setup)


To run the CloudFormation template, click the Launch Stack button:




  • You must be signed in to your AWS account to launch the stack in the Region you want. The latest version of the stack is available on GitHub, where you can also contribute to the sample code.


As illustrated in Figure 2, provide the following proxy parameters:

  • Allowed domains: Enter your whitelisted domains. Use a leading dot (“.”) to include subdomains
  • Custom DNS servers (optional): List any DNS servers that the proxy should use. If you leave the default value, the Amazon DNS server is used
  • Proxy Port: Enter the listener port of the proxy
  • Instance Type: Select the EC2 instance type you want to use for the proxies. The instance type affects vertical scaling capability and the cost of the solution. See Amazon EC2 Instance Types for details
  • AMI ID to be used: This field is prepopulated with the Amazon Machine Image (AMI) ID found in AWS Systems Manager Parameter Store. By default, it points to the latest Amazon Linux 2 image. You don't need to change this value
  • SSH Key name (optional): Enter the name of the SSH key for your proxy EC2 instances. This is relevant only if you need to log in to the proxy instances for debugging. Consider using AWS Systems Manager Session Manager instead of SSH
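
To make the leading-dot convention concrete, here is a small sketch of how a hostname could be checked against a list of allowed domains. It is not the proxy's actual matching code. It assumes that an entry with a leading dot covers the domain itself and all of its subdomains, while an entry without a dot matches that exact host only; the function name `domain_allowed` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if the host matches any whitelist entry.
# Usage: domain_allowed HOST ENTRY [ENTRY...]
domain_allowed() {
  local host entry
  host="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  shift
  for entry in "$@"; do
    entry="$(printf '%s' "$entry" | tr '[:upper:]' '[:lower:]')"
    case "$entry" in
      .*)
        # Leading dot: accept the bare domain and any host ending in ".domain".
        if [ "$host" = "${entry#.}" ] || [[ "$host" == *"$entry" ]]; then
          return 0
        fi
        ;;
      *)
        # No leading dot: exact match only.
        if [ "$host" = "$entry" ]; then
          return 0
        fi
        ;;
    esac
  done
  return 1
}
```

For example, with the entry `.example.com`, both `example.com` and `www.example.com` would be accepted, while `badexample.com` would not, because it does not end in `.example.com`.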


Then, as indicated in Figure 2, supply the following network parameters:

  • VPC ID: The VPC where the solution will be deployed.
  • Public subnets: The subnets where the proxies will be deployed. Select two or three subnets.
  • Private subnets: The subnets where the Network Load Balancer will be deployed. Select two or three subnets.
  • Allowed client CIDR: The value you enter here is added to the proxy security group. By default, the private IP range 172.31.0.0/16 is allowed. The allowed block size is between a /32 netmask and a /8 netmask.
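
If you script your deployments, you may want to check the client CIDR before submitting it. The following sketch validates that a value is a dotted IPv4 address with a prefix length from /8 to /32, matching the range stated above. The function name `valid_client_cidr` is hypothetical, and the check is an approximation of whatever validation the template itself performs.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for the allowed client CIDR parameter.
# Returns 0 for an IPv4 CIDR whose prefix length is between 8 and 32.
valid_client_cidr() {
  local cidr="$1" octet prefix
  local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/([0-9]{1,2})$'
  [[ "$cidr" =~ $re ]] || return 1
  # Each octet must be in the range 0-255 (10# forces base 10 for values like 08).
  for octet in "${BASH_REMATCH[@]:1:4}"; do
    [ "$((10#$octet))" -le 255 ] || return 1
  done
  prefix="$((10#${BASH_REMATCH[5]}))"
  [ "$prefix" -ge 8 ] && [ "$prefix" -le 32 ]
}
```

With this check, the default `172.31.0.0/16` passes, while `0.0.0.0/0` is rejected because its block is larger than a /8.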



Figure 2: Launching the CloudFormation template

  • After you've entered all the proxy and network parameters, select Next. On the following wizard screens, keep the default values and select Next and then Create Stack.


Output parameters


Once the stack has finished deploying (status CREATE_COMPLETE), note down the output parameters so that you can configure your clients. Look for the following parameters in the Outputs tab of the stack:

  • The domain name of the proxy, which must be configured on the client
  • The port of the proxy, which must be configured on the client
  • The four Elastic IP addresses for the proxy instances. These are used for outbound connections to the internet
  • The CloudWatch Log Group for the access logs
  • The Security Group that the proxies belong to
  • The command to set the proxy on Linux. You can copy and paste it into your shell
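
If you prefer the command line to the console, the same outputs can be read with the AWS CLI. The sketch below wraps `aws cloudformation describe-stacks`; the function name, the stack name, and the output key used in the usage note are hypothetical, so check the Outputs tab for the real key names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one CloudFormation stack output.
# Usage: stack_output STACK_NAME OUTPUT_KEY
stack_output() {
  local stack_name="$1" output_key="$2"
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --query "Stacks[0].Outputs[?OutputKey=='${output_key}'].OutputValue" \
    --output text
}
```

For example, `stack_output my-proxy-stack ProxyDomain` would print the load balancer domain if the stack were named `my-proxy-stack` and exposed an output called `ProxyDomain` (both names are assumptions).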



Figure 3: Stack output parameters


Use the proxy


Proxy configuration is specific to each application. Most Linux applications use the environment variables http_proxy and https_proxy.

  1. Log in to a Linux EC2 instance that is allowed to use the proxy.
  2. To set the proxy variables temporarily (for the current shell session only), run the following export commands:


[ec2-user@ip-10-0-1-18 ~]$ export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>
[ec2-user@ip-10-0-1-18 ~]$ export https_proxy=$http_proxy


a. Replace <Proxy-DOMAIN> with the domain of the load balancer, which you can find in the stack output parameters.


b. Replace <Proxy-Port> with the port of your proxy, which is also listed in the stack output parameters.

  3. Next, test the connection, for example with cURL. Replace <URL> with one of your whitelisted URLs:



[ec2-user@ip-10-0-1-18 ~]$ curl -k <URL>
<!DOCTYPE html>

  4. You can add the proxy settings permanently for interactive and non-interactive shells so that you don't need to set them again when you start a new session. Run the following commands in your application shell:

               

[ec2-user@ip-10-0-1-18 ~]$ echo 'export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>' >> ~/.bashrc
[ec2-user@ip-10-0-1-18 ~]$ echo 'export https_proxy=$http_proxy' >> ~/.bashrc

[ec2-user@ip-10-0-5-18 ~]$ echo 'export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>' >> ~/.bash_profile
[ec2-user@ip-10-0-5-18 ~]$ echo 'export https_proxy=$http_proxy' >> ~/.bash_profile

           

  • Replace <Proxy-DOMAIN> with the domain of the load balancer.
  • Replace <Proxy-Port> with the port of your proxy.
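
Running the echo commands above more than once appends duplicate lines to your shell startup files. One way around that, sketched here under the assumption that you are comfortable adding a small helper function (the name `add_line_once` is hypothetical), is to append a line only when it is not already present:

```shell
#!/usr/bin/env bash
# Hypothetical helper: append LINE to FILE unless an identical line already exists.
# Usage: add_line_once LINE FILE
add_line_once() {
  local line="$1" file="$2"
  touch "$file"
  # -x matches whole lines, -F treats the text literally (no regex surprises).
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (replace the placeholders before running):
# add_line_once 'export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>' ~/.bashrc
# add_line_once 'export https_proxy=$http_proxy' ~/.bashrc
```

Because the check compares whole lines literally, changing the proxy domain or port later adds a new line rather than replacing the old one, so remove the outdated entry by hand in that case.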


Customize the "Access Denied" page


An error page is displayed when a user's access is blocked or when an internal error occurs. You can adjust the look and feel of this page (HTML or styles) as described for the Squid error_directory directive.


Use the proxy access log


The proxy access log is a useful debugging tool. It contains the client IP address, the destination domain, the port, and errors with timestamps. Squid's access logs are sent to CloudWatch. As shown in Figure 4, you can find them in the CloudWatch console under Log Groups with the prefix Proxy.



Figure 4: CloudWatch log with access group


You can use CloudWatch Logs Insights to query the access logs and visualize the results. See the following figure for an example of denied connections visualized on a timeline:



Figure 5: Access logs analysis with CloudWatch Logs Insights
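
A timeline like the one in Figure 5 could be produced with a query along the following lines. This is a sketch under two assumptions: that the access log uses Squid's native format, in which blocked requests carry the result code TCP_DENIED, and that you pass in the real log group name from the stack output. The function name `start_denied_query` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: start a Logs Insights query that counts denied requests
# in five-minute buckets over the last N hours (default 1).
# Usage: start_denied_query LOG_GROUP [HOURS]
start_denied_query() {
  local log_group="$1" hours="${2:-1}" now
  now="$(date +%s)"
  aws logs start-query \
    --log-group-name "$log_group" \
    --start-time "$(( now - hours * 3600 ))" \
    --end-time "$now" \
    --query-string 'fields @timestamp, @message
      | filter @message like /TCP_DENIED/
      | stats count(*) as denied by bin(5m)'
}
```

The command returns a query ID, which you can pass to `aws logs get-query-results --query-id` to fetch the results once the query has completed.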


Monitor your metrics with CloudWatch

The following proxy metrics are uploaded to CloudWatch Metrics in the proxy namespace every five minutes:

  • client_http.errors /sec – errors in processing client requests per second
  • client_http.hits /sec – cache hits per second
  • client_http.kbytes_in /sec – client uploaded data per second
  • client_http.kbytes_out /sec – client downloaded data per second
  • client_http.requests /sec – number of requests per second
  • server.all.errors /sec – proxy server errors per second
  • server.all.kbytes_in /sec – proxy server uploaded data per second
  • server.all.kbytes_out /sec – proxy downloaded data per second
  • server.all.requests /sec – all requests sent by proxy server per second
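
For readers curious how per-second counters like these can reach CloudWatch, here is a rough sketch. It is not the solution's own collector. It assumes that `squidclient mgr:5min` is available on the proxy and prints lines of the form `name = value/sec`, and it uses a hypothetical namespace of `Proxy`; the function names are also made up for the example.

```shell
#!/usr/bin/env bash
# Read "name = value/sec" lines on stdin and print the numeric value
# for the counter named in $1 (prints nothing if the counter is absent).
squid_rate() {
  awk -v key="$1" -F' = ' '$1 == key { sub(/\/sec$/, "", $2); print $2 }'
}

# Publish one counter to CloudWatch. Namespace and unit are assumptions;
# a kbytes counter would more naturally use the unit Kilobytes/Second.
publish_rate() {
  local metric="$1" value
  value="$(squidclient mgr:5min | squid_rate "$metric")" || return 1
  [ -n "$value" ] || return 1
  aws cloudwatch put-metric-data \
    --namespace "Proxy" \
    --metric-name "$metric" \
    --value "$value" \
    --unit "Count/Second"
}
```

Calling `publish_rate client_http.requests` from a five-minute cron job would match the reporting interval described above.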


In the figure below, you can see an example of metrics. For more information on metric use, see the Squid project information.



Figure 6: Example of CloudWatch metrics


Configure the proxy settings


From time to time, you may want to add or remove domains from the whitelist. To change your whitelisted domains, update the input values in the CloudFormation stack. This also updates the values stored in Secrets Manager. The proxies poll Secrets Manager every five minutes and apply the new list if it has changed, so your change can take up to five minutes to take effect. The change is propagated to all instances without terminating or deploying any of them.

Note that when the whitelist changes, the Squid proxy processes are restarted, which interrupts all connections passing through them at that moment.

To change other CloudFormation parameters, such as the DNS or security group settings, update the CloudFormation stack with the new values. CloudFormation then launches new instances and terminates the old ones (a rolling update).

You can change the Squid proxy configuration by editing the CloudFormation template (section AWS::CloudFormation::Init) and updating the stack. Do this only if you have advanced AWS and Squid experience.


Update the instances


To update your AMI, update the stack. If the AMI has been updated to a newer version, a rolling update replaces the EC2 instances and reinstalls the Squid software. This simplifies the rollout of security and other updates to the managed instances. If the AMI has not changed, no update is performed.

Alternatively, you can terminate an instance, and the Auto Scaling group will launch a fresh one with the latest Squid and OS updates. This approach may cause a brief service interruption for the clients served by that instance while the load balancer switches over to an active instance.
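
Terminating an instance so that the Auto Scaling group replaces it can also be done from the AWS CLI. The sketch below is one way to do it; the function name is hypothetical and the instance ID in the usage note is a placeholder.

```shell
#!/usr/bin/env bash
# Hypothetical helper: terminate one proxy instance and let its Auto Scaling
# group launch a replacement. Desired capacity is deliberately left unchanged.
# Usage: replace_proxy_instance INSTANCE_ID
replace_proxy_instance() {
  local instance_id="$1"
  aws autoscaling terminate-instance-in-auto-scaling-group \
    --instance-id "$instance_id" \
    --no-should-decrement-desired-capacity
}
```

For example, `replace_proxy_instance i-0123456789abcdef0` would terminate that instance. Replacing instances one at a time, and waiting for each new instance to become healthy, limits the interruption described above.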


Credits:
Polyana Barbosa