Shell Scripting Task: "Cost Optimization" with S3

In this project we optimized costs for a real-world scenario in which a client was using an observability stack to store metrics, traces, and logs.

Observability flow: Logstash (collects logs) → ELK Stack (stores logs and helps with log analysis, debugging, and troubleshooting)

Q. Issue faced: the ELK Stack always ran up a huge cost, so what were the alternatives?

They were not using a managed ELK Stack but a self-hosted one: ELK ran on a cluster of VMs, connected to the Elasticsearch database, which in turn was backed by storage volumes.

They had a large inflow of logs: application logs, infrastructure logs, and more. So they were paying for both compute and storage for the ELK Stack.

How We Resolved the Major Inflow of Application Logs from "Jenkins"

The client was storing Jenkins UAT, staging, pre-production, and production logs in the ELK Stack, i.e. in the Elasticsearch database and its volumes.

In the Jenkins setup for UAT and staging environments, many logs were being created, taking up a lot of space in the Volumes and ELK. Jenkins already had alerts set up via email and Slack, so if any error occurred, developers received instant notifications. Therefore, Jenkins logs were only stored for backup and storage, not for monitoring.

So, we moved the Jenkins logs from ELK Stack → to S3 buckets.

We saw an almost 50% cost reduction with this simple shell scripting technique.

Where does Jenkins store logs?

/var/lib/jenkins/jobs
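
Each build's console output is a plain file inside its job's builds directory, which is exactly the layout the script later in this write-up walks:

    /var/lib/jenkins/jobs/<job-name>/builds/<build-number>/log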

What is the solution to reduce the cost?

We will create a shell script that runs every night and uploads that day's Jenkins logs to the S3 bucket.

A Brief Overview of the Cost Optimization Project with Shell Scripting

  • Used a shell script to optimize costs.

  • Problem: Jenkins logs stored in the ELK Stack were generating high costs.

  • Existing setup: Notification system for Jenkins build failures via Gmail/Slack.

  • Logs were not used for analysis.

  • Solution: Moved Jenkins logs to an S3 bucket.

  • Implementation: A shell script runs nightly, accessing the Jenkins directory to loop over all build logs and upload them to the S3 bucket.

  • Cost-saving: the S3 bucket uses lifecycle management to move older logs to Glacier or Deep Archive, reducing costs further. Infrequently accessed logs, or logs older than about three months, transition to these cheaper storage classes, which cost even less (see the sketch after this list).
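
As a rough sketch of such a lifecycle rule (the bucket name and transition days here are illustrative assumptions, not the client's actual values), it can be applied with the AWS CLI:

    aws s3api put-bucket-lifecycle-configuration \
      --bucket jenkins-cost-optimization-amit \
      --lifecycle-configuration '{
        "Rules": [{
          "ID": "archive-old-jenkins-logs",
          "Status": "Enabled",
          "Filter": {"Prefix": ""},
          "Transitions": [
            {"Days": 90,  "StorageClass": "GLACIER"},
            {"Days": 365, "StorageClass": "DEEP_ARCHIVE"}
          ]
        }]
      }'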

Practical Steps:

  1. Create an S3 bucket, e.g. "jenkins-cost-optimization" (bucket names must be lowercase and globally unique), for example via the CLI as shown below
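
    If you prefer the CLI to the console, the bucket can be created like this (the region is an assumption; pick whichever region you use):

     aws s3 mb s3://jenkins-cost-optimization --region us-east-1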

  2. Create an EC2 Ubuntu machine "jenkins-server" in AWS, e.g. via the CLI as sketched below
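
    If you would rather do this from the CLI than the console, something like the following works (the AMI ID, key pair, and security group are placeholders to replace with your own values):

     aws ec2 run-instances \
         --image-id ami-xxxxxxxxxxxxxxxxx \
         --instance-type t2.micro \
         --key-name <your-key-pair> \
         --security-group-ids <your-security-group-id> \
         --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=jenkins-server}]'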

    Configuration on the EC2 Ubuntu LTS instance

    Install Jenkins:

    Step 1: Add the Jenkins Repository Key

    Download and save the Jenkins GPG key to your system:

     sudo wget -O /usr/share/keyrings/jenkins-keyring.asc https://pkg.jenkins.io/debian-stable/jenkins.io-2023.key
    

    Step 2: Add the Jenkins Repository

    Add the Jenkins repository to your system's sources list:

     echo "deb [signed-by=/usr/share/keyrings/jenkins-keyring.asc] https://pkg.jenkins.io/debian-stable binary/" | sudo tee /etc/apt/sources.list.d/jenkins.list > /dev/null
    

    Step 3: Update the Package List

    Update your system's package index to include the Jenkins repository:

     sudo apt-get update
    

    Step 4: Install Jenkins

    Install Jenkins using the following command:

     sudo apt-get install -y jenkins
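
    Note: Jenkins needs a Java runtime to start. If the instance does not already have one, a step like the following (assuming OpenJDK 17, which current Jenkins LTS releases support) installs it and makes sure the service is up:

     sudo apt-get install -y fontconfig openjdk-17-jre
     sudo systemctl enable --now jenkins
     sudo systemctl status jenkins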
    

    Configure AWS CLI:

     aws configure
    

    This lets the instance communicate with our AWS resources; aws configure will prompt for an access key ID, secret access key, default region, and output format.
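
    If the AWS CLI is not already installed on the machine, one simple way to get it on Ubuntu (a sketch using the distribution package; the official installer from AWS works just as well) is:

     sudo apt-get install -y awscli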

  3. Write a shell script to move your Jenkins log files to S3:

     #!/bin/bash
    
     # Variables
     JENKINS_HOME="/var/lib/jenkins"  # Replace with your Jenkins home directory
     S3_BUCKET="s3://jenkins-cost-optimization-amit"  # Replace with your S3 bucket name
     DATE=$(date +%Y-%m-%d)  # Today's date
    
     # Check if AWS CLI is installed
     if ! command -v aws &> /dev/null; then
         echo "AWS CLI is not installed. Please install it to proceed."
         exit 1
     fi
    
     # Iterate through all job directories
     for job_dir in "$JENKINS_HOME/jobs/"*/; do
         job_name=$(basename "$job_dir")
    
         # Iterate through build directories for the job
         for build_dir in "$job_dir/builds/"*/; do
             # Get build number and log file path
             build_number=$(basename "$build_dir")
             log_file="$build_dir/log"
    
             # Check if log file exists and was created today
             if [ -f "$log_file" ] && [ "$(date -r "$log_file" +%Y-%m-%d)" == "$DATE" ]; then
                 # Upload log file to S3 with the build number as the filename
                 aws s3 cp "$log_file" "$S3_BUCKET/$job_name-$build_number.log" --only-show-errors
    
                 if [ $? -eq 0 ]; then
                     echo "Uploaded: $job_name/$build_number to $S3_BUCKET/$job_name-$build_number.log"
                 else
                     echo "Failed to upload: $job_name/$build_number"
                 fi
             fi
         done
     done
    

    To execute the shell script (saved here as costopt.sh):

     sudo chmod +x costopt.sh
     ./costopt.sh
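
    To run it every night as described earlier, one simple option is a cron entry (a sketch, assuming the script is saved at /home/ubuntu/costopt.sh and that user can read /var/lib/jenkins):

     crontab -e

    Then add a line such as:

     # Upload the day's Jenkins logs at 23:30 every night and keep a local log of each run
     30 23 * * * /home/ubuntu/costopt.sh >> /home/ubuntu/costopt.log 2>&1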