Shell Scripting Task: "Cost Optimization" with S3

In this project we optimized costs for a real-world scenario in which a client was using an observability stack to store metrics, traces, and logs.

Observability flow: Logstash (collects logs) → ELK Stack (stores logs and helps with log analysis, debugging, and troubleshooting)

Q. Issue faced: the ELK Stack always ran up a huge cost, so what were the alternatives?

They were not using a managed ELK Stack but a self-hosted one: ELK ran on a cluster of VMs, connected to the Elasticsearch database, which in turn was backed by storage volumes.

They had a large inflow of logs: application logs, infrastructure logs, and more. So they were paying for both compute and storage for the ELK Stack.

How We Resolved the Major Inflow of Application Logs from "Jenkins"

The client was storing Jenkins UAT, staging, pre-production, and production logs in the ELK Stack, i.e. in the Elasticsearch database and its volumes.

In the Jenkins setup for UAT and staging environments, many logs were being created, taking up a lot of space in the Volumes and ELK. Jenkins already had alerts set up via email and Slack, so if any error occurred, developers received instant notifications. Therefore, Jenkins logs were only stored for backup and storage, not for monitoring.

So, we moved the Jenkins logs from ELK Stack → to S3 buckets.

We saw an almost 50% cost reduction with this simple shell scripting technique.

Where does Jenkins store logs?

/var/lib/jenkins/jobs
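
Each build's console output is a plain file inside its job's builds directory, which is exactly the layout the script later in this write-up walks:

    /var/lib/jenkins/jobs/<job-name>/builds/<build-number>/log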

What is the solution to reduce the cost?

We will create a shell script that runs every night and uploads that day's Jenkins logs to the S3 bucket.

A Brief Overview of the Cost Optimization Project with Shell Scripting

  • Used a shell script to optimize costs.

  • Problem: Jenkins logs stored in the ELK Stack were generating high costs.

  • Existing setup: Notification system for Jenkins build failures via Gmail/Slack.

  • Logs were not used for analysis.

  • Solution: Moved Jenkins logs to an S3 bucket.

  • Implementation: A shell script runs nightly, accessing the Jenkins directory to loop over all build logs and upload them to the S3 bucket.

  • Cost-saving: the S3 bucket uses lifecycle management to move older logs to Glacier or Deep Archive, reducing costs further. Infrequently accessed logs, or logs older than about three months, transition to these cheaper storage classes, which cost even less (see the sketch after this list).
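
As a rough sketch of such a lifecycle rule (the bucket name and transition days here are illustrative assumptions, not the client's actual values), it can be applied with the AWS CLI:

    aws s3api put-bucket-lifecycle-configuration \
      --bucket jenkins-cost-optimization-amit \
      --lifecycle-configuration '{
        "Rules": [{
          "ID": "archive-old-jenkins-logs",
          "Status": "Enabled",
          "Filter": {"Prefix": ""},
          "Transitions": [
            {"Days": 90,  "StorageClass": "GLACIER"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
          ]
        }]
      }'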

Practical Steps:

  1. Create an S3 bucket, e.g. "jenkins-cost-optimization" (bucket names must be lowercase and globally unique), for example via the CLI as shown below
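
    If you prefer the CLI to the console, the bucket can be created like this (the region is an assumption; pick whichever region you use):

     aws s3 mb s3://jenkins-cost-optimization --region us-east-1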

  2. Create an EC2 Ubuntu machine "jenkins-server" in AWS, e.g. via the CLI as sketched below
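
    If you would rather do this from the CLI than the console, something like the following works (the AMI ID, key pair, and security group are placeholders to replace with your own values):

     aws ec2 run-instances \
         --image-id ami-xxxxxxxxxxxxxxxxx \
         --instance-type t2.micro \
         --key-name <your-key-pair> \
         --security-group-ids <your-security-group-id> \
         --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=jenkins-server}]'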

    Configuration on the EC2 Ubuntu LTS instance

    Install Jenkins:

    Step 1: Add the Jenkins Repository Key

    Download and save the Jenkins GPG key to your system:

     sudo wget -O /usr/share/keyrings/jenkins-keyring.asc https://pkg.jenkins.io/debian-stable/jenkins.io-2023.key
    

    Step 2: Add the Jenkins Repository

    Add the Jenkins repository to your system's sources list:

     echo "deb [signed-by=/usr/share/keyrings/jenkins-keyring.asc] https://pkg.jenkins.io/debian-stable binary/" | sudo tee /etc/apt/sources.list.d/jenkins.list > /dev/null
    

    Step 3: Update the Package List

    Update your system's package index to include the Jenkins repository:

     sudo apt-get update
    

    Step 4: Install Jenkins

    Install Jenkins using the following command:

     sudo apt-get install -y jenkins
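
    Note: Jenkins needs a Java runtime to start. If the instance does not already have one, a step like the following (assuming OpenJDK 17, which current Jenkins LTS releases support) installs it and makes sure the service is up:

     sudo apt-get install -y fontconfig openjdk-17-jre
     sudo systemctl enable --now jenkins
     sudo systemctl status jenkins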
    

    Configure AWS CLI:

     aws configure
    

    This lets the instance communicate with our AWS resources; aws configure will prompt for an access key ID, secret access key, default region, and output format.
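
    If the AWS CLI is not already installed on the machine, one simple way to get it on Ubuntu (a sketch using the distribution package; the official installer from AWS works just as well) is:

     sudo apt-get install -y awscli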

  3. Write a shell script to move your Jenkins log files to S3:

     #!/bin/bash
    
     # Variables
     JENKINS_HOME="/var/lib/jenkins"  # Replace with your Jenkins home directory
     S3_BUCKET="s3://jenkins-cost-optimization-amit"  # Replace with your S3 bucket name
     DATE=$(date +%Y-%m-%d)  # Today's date
    
     # Check if AWS CLI is installed
     if ! command -v aws &> /dev/null; then
         echo "AWS CLI is not installed. Please install it to proceed."
         exit 1
     fi
    
     # Iterate through all job directories
     for job_dir in "$JENKINS_HOME/jobs/"*/; do
         job_name=$(basename "$job_dir")
    
         # Iterate through build directories for the job
         for build_dir in "$job_dir/builds/"*/; do
             # Get build number and log file path
             build_number=$(basename "$build_dir")
             log_file="$build_dir/log"
    
             # Check if log file exists and was created today
             if [ -f "$log_file" ] && [ "$(date -r "$log_file" +%Y-%m-%d)" == "$DATE" ]; then
                 # Upload log file to S3 with the build number as the filename
                 aws s3 cp "$log_file" "$S3_BUCKET/$job_name-$build_number.log" --only-show-errors
    
                 if [ $? -eq 0 ]; then
                     echo "Uploaded: $job_name/$build_number to $S3_BUCKET/$job_name-$build_number.log"
                 else
                     echo "Failed to upload: $job_name/$build_number"
                 fi
             fi
         done
     done
    

    To execute the shell script (saved here as costopt.sh):

     sudo chmod +x costopt.sh
     ./costopt.sh
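
    To run it every night as described earlier, one simple option is a cron entry (a sketch, assuming the script is saved at /home/ubuntu/costopt.sh and that user can read /var/lib/jenkins):

     crontab -e

    Then add a line such as:

     # Upload the day's Jenkins logs at 23:30 every night and keep a local log of each run
     30 23 * * * /home/ubuntu/costopt.sh >> /home/ubuntu/costopt.log 2>&1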