## Using GitHub to Backup icCube Reports (Linux)

In this post we explain how to use a Git repository as a backup for icCube data. It's a tool we use
internally. The post uses GitHub, but the same approach works with other Git providers.

The idea is to back up the icCube reports to the Git repository on a regular basis. Some basic knowledge of
Git is assumed. The general architecture is:

- a GitHub repository with a 'Deploy Key' to push changes,
- a clone of the GitHub repository on the server where icCube runs (we use Debian).

### 1. GitHub: Set up a Git repository

The first step is to create a private GitHub repository with a README file; in our example,
`ic3-software/dev-reports-backup`.
Later on, we will use the content of the README file to check that backups are working as expected.

![Create GitHub Repository](./images/githubCreateRepo.png)

#### Create a GitHub Deploy Key and Clone

If Git is not already available on the server running icCube, install it:

```
sudo apt update
sudo apt install git
# check git installed
git --version
````

To access the repository we use a Deploy Key, which only has access to our backup repository, so we
don't need a personal account ([GitHub doc](https://docs.github.com/en/developers/overview/managing-deploy-keys)).

First we need to create a private/public key pair on the icCube server:

````
ssh-keygen -t ed25519 -C "your_github_email@example.com"
````

This will create the keys in a local directory (usually ~/.ssh).

Add your public key (`id_ed25519.pub`) as a deploy key in GitHub, and tick 'Allow write access' so the key can push:

![Create Deploy Key](./images/githubCreateDeployKey.png)

In the server's SSH configuration file (usually ~/.ssh/config), add an alias entry for the repository:

```
Host github-ic3-dev-backup
        HostName github.com
        IdentityFile=~/.ssh/id_ed25519
````
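
Before cloning, it can be worth checking that the alias and the deploy key actually work. The sketch below is one way to do that; `check_deploy_key` is a hypothetical helper name, and the match on the words "successfully authenticated" assumes GitHub's usual greeting text. GitHub never opens a shell, so `ssh` returns a non-zero status even when authentication succeeds, which is why the helper inspects the output instead of the exit code.

```shell
#!/bin/sh
# Hypothetical helper: check that the deploy key authenticates against GitHub
# through the SSH alias defined in ~/.ssh/config.
check_deploy_key() {
    alias_name=${1:-github-ic3-dev-backup}
    # -T: do not allocate a terminal; BatchMode: never prompt for a password.
    # ssh exits non-zero even on success, hence the '|| true'.
    output=$(ssh -T -o BatchMode=yes -o ConnectTimeout=10 "git@$alias_name" 2>&1) || true
    case $output in
        *"successfully authenticated"*)
            # Assumption: GitHub's greeting contains this phrase.
            echo "deploy key OK"
            return 0
            ;;
        *)
            echo "deploy key check failed: $output" >&2
            return 1
            ;;
    esac
}

# Usage:
#   check_deploy_key github-ic3-dev-backup
```

If the check fails with a "Permission denied (publickey)" message, the usual suspects are a wrong `IdentityFile` path or a public key that was not added to the repository.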

And that's it. We can now clone the repository into a local directory:

```
mkdir -p ~/git-backup && cd ~/git-backup
git clone git@github-ic3-dev-backup:ic3-software/dev-reports-backup
````

Now, we should have a directory ~/git-backup/dev-reports-backup with the Git repository.

#### Script to Backup icCube Data to GitHub

Let's create a script that copies the reports from the icCube data directory to the Git clone and pushes them to our
GitHub repository. We need to decide which files to back up; in our case we are only interested in the reports,
but you can extend the script to back up schema definitions as well. The script (`backup-reports.sh`, stored in
`~/git-backup/` next to the clone) looks like this:

```
#!/bin/sh

IC3_SOURCE=/home/dev/icCube/9494/ic3data/docsRepository/ic3-reporting/data
GIT_HOME=/home/userName/git-backup/dev-reports-backup
GIT_TARGET=$GIT_HOME/reports

#
# Force pull (needed if you have other files in the repository, e.g. actions)
#
cd "$GIT_HOME" || exit 1
git fetch --all
git reset --hard HEAD
git merge '@{u}'

#
# sync icCube files (no trailing slash on the source: 'data' is copied as a directory)
#
rsync -r "$IC3_SOURCE" "$GIT_TARGET"

#
# if 'force' is the first argument,
#   change the first line of README.md to the current datetime (this forces a push on each call)
#
if [ "$1" = "force" ]; then
    currentDate=$(date '+%d.%m.%Y %k:%M:%S')
    tag="###  $currentDate"
    sed -i "1s/.*/$tag/" "$GIT_HOME/README.md"
fi

#
# Force add/commit/push
#
git add -A .
git commit -m "backup"
git push -f
````
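
The 'force' branch relies on `sed` rewriting only the first line of the README. To see that technique in isolation, here is a small sketch that can be run against a throw-away file; `stamp_first_line` is a hypothetical name, and the in-place `-i` flag without an argument assumes GNU sed, as shipped with Debian.

```shell
#!/bin/sh
# Sketch: replace the first line of a file with a timestamp heading,
# the same technique the backup script applies to README.md.
stamp_first_line() {
    file=$1
    stamp=$(date '+%d.%m.%Y %H:%M:%S')
    # '|' is used as the sed delimiter so that a '/' in the
    # replacement text could not break the expression.
    sed -i "1s|.*|### $stamp|" "$file"
}

# Try it on a temporary file rather than on the real README.md.
demo=$(mktemp)
printf 'old heading\nsecond line\n' > "$demo"
stamp_first_line "$demo"
cat "$demo"      # first line is now the timestamp, second line is untouched
rm -f "$demo"
```

Since the timestamp differs on every run, Git always sees a modified file, which is what guarantees a new commit.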

#### Add to Cron Jobs

Add the script to the cron table. Run a copy kept outside the Git clone: running the script directly from the
repository is dangerous, as anyone able to push to the repository could then execute commands on your dev server.

````
crontab -e

````

Add a line like this (every day at 10pm):

````
0 22 * * * /home/userName/git-backup/backup-reports.sh force
````

We could add a second line that backs up every hour, without the 'force' option:

````
30 * * * * /home/userName/git-backup/backup-reports.sh
````
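
Cron jobs run unattended, so it helps to keep a trace of each run. The following is a minimal sketch of a logging wrapper, not something the original setup includes; `run_logged` and the log file location are hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and append its output, framed by
# timestamped start/exit markers, to a log file.
run_logged() {
    log_file=$1
    shift
    status=0
    {
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] start: $*"
        "$@" || status=$?
        echo "[$(date '+%Y-%m-%d %H:%M:%S')] exit status: $status"
    } >> "$log_file" 2>&1
    # Propagate the wrapped command's exit status to the caller.
    return "$status"
}

# Usage (the log path is an assumption):
#   run_logged "$HOME/git-backup/backup.log" "$HOME/git-backup/backup-reports.sh" force
```

For plain output capture without the markers, appending `>> /home/userName/git-backup/backup.log 2>&1` to the crontab line has a similar effect.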

#### Add a GitHub Action to Check we Backup Daily

How do we know the daily backup is working?

The final task is to add an action to our GitHub repository that checks the script is still running on the server.
It compares the time of the latest Git commit with the start time of the action and fails if the difference is
8 hours or more. As with any GitHub Action, a failure triggers an email notification.

This is the rationale for the 'force' option in the shell script: it ensures at least one file changes on each run.
Keep in mind that the datetime written to README.md comes from your server, not from GitHub.
```
name: CheckBackup

on:
  workflow_dispatch:
  schedule:
    - cron: '0 1 * * *'  # every day at 1:00am (UTC)
jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
    - uses: actions/checkout@v4

    - name: Check dates difference
      run: |
        echo "GitHub last update : $(git log -1 --format=%cd)"
        echo "Now : $(date)"
        # %ct is the committer date as a Unix timestamp
        now=$(date +%s)
        backupTime=$(git log -1 --format=%ct)
        deltaHours=$(( (now - backupTime) / 3600 ))
        echo "Delta of : $deltaHours hours"
        if [ "$deltaHours" -ge 8 ]; then
             echo "failed : $deltaHours hours"
             exit 1
        else
             echo 'ok'
             exit 0
        fi
````
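
The age check in the workflow can also be tried locally before relying on the scheduled run. Here is a sketch of the same arithmetic as two small functions; `backup_age_hours` and `backup_is_fresh` are hypothetical names, and the 8-hour threshold simply mirrors the workflow.

```shell
#!/bin/sh
# Sketch of the workflow's age check as reusable functions.
# Both arguments are Unix timestamps in seconds.
backup_age_hours() {
    # $1: time of the last commit, $2: current time; integer division, rounds down
    echo $(( ($2 - $1) / 3600 ))
}

# Succeeds when the last backup is younger than the given number of hours.
# $1: commit time, $2: current time, $3: maximum age in hours
backup_is_fresh() {
    age=$(backup_age_hours "$1" "$2")
    [ "$age" -lt "$3" ]
}

# Usage inside the clone:
#   if backup_is_fresh "$(git log -1 --format=%ct)" "$(date +%s)" 8; then
#       echo ok
#   else
#       echo "backup too old"
#   fi
```

Because the division rounds down, a commit that is 7 hours and 59 minutes old still counts as fresh, while one that is exactly 8 hours old does not.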

And that's it, you are done.
