Guide: How to automate detection of secrets

In this tutorial, we expand on secret detection using Git pre-commit hooks and GitLab CI/CD pipelines.

29 Oct 2024 Matěj Smyčka DevOps


In the previous guide, we discussed how to get started with secret detection and introduced several tools for the job. If you haven't read the first guide yet, we recommend doing so first.

In this guide, we will expand our knowledge about secret detection through pre-commit hooks and Continuous Integration and Continuous Delivery (CI/CD) pipelines. The code in this guide focuses more on GitLab, but we will also cover GitHub in the last section.

Pre-commit detection

Pre-commit hooks are small scripts that run before each commit and block it unless certain conditions are met. We will use a Python tool called pre-commit, a framework for managing and maintaining pre-commit hooks in a Git repository. You can install it with the command pip install pre-commit.

Git has built-in support for hooks, and pre-commit manages them for us: each hook is defined in a .pre-commit-config.yaml file, much like a job in a pipeline definition. The hooks run locally, so you can see the results before you push the commit to the remote repository.
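
To make the idea of Git's built-in hooks concrete, here is a minimal sketch of what a hand-written hook could look like without the pre-commit framework. It assumes a script saved as .git/hooks/pre-commit; the function names and the single regular expression are hypothetical and far cruder than the rules in Gitleaks:

```python
import re
import subprocess

# Illustrative pattern only: a secret-looking name assigned a quoted string
# of 16 or more characters. Real scanners such as Gitleaks use many rules.
SECRET_PATTERN = re.compile(
    r"""(?i)\b(secret|token|password|api_key)\b\s*[:=]\s*['"][^'"]{16,}['"]"""
)


def find_secrets_in_diff(diff_text):
    """Return the added lines of a unified diff that match SECRET_PATTERN."""
    hits = []
    for line in diff_text.splitlines():
        # Only inspect added lines; skip the "+++ b/file" header lines.
        if line.startswith("+") and not line.startswith("+++"):
            if SECRET_PATTERN.search(line):
                hits.append(line[1:].strip())
    return hits


def run_hook():
    """Scan the staged changes; a non-zero return value aborts the commit."""
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = find_secrets_in_diff(diff)
    for hit in hits:
        print(f"possible secret: {hit}")
    return 1 if hits else 0
```

To act as a real hook, the script would also need a shebang line, the executable bit, and a final raise SystemExit(run_hook()). Git aborts the commit whenever the hook exits with a non-zero status. The pre-commit framework installs and maintains this kind of script for you.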

Setup

pip install pre-commit
# Make sure pip's install path (typically `~/.local/bin`) is in your `$PATH`.
# Run the next command inside your repository to register the Git hook.
pre-commit install

Adding tools

First, we will integrate two tools: Gitleaks and the Black formatter for Python.

Create a file named .pre-commit-config.yaml with the following content:

---
repos:
  - repo: https://github.com/psf/black
    rev: 23.11.0
    hooks:
      - id: black
        language_version: python3.11

  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks

Example

We have created this file, main.py:

print("a")
secret = '8dyfuiRyq=vVc3RRr_edRk-fK__JItpZ'
for i in range(10): print("x")

When you try to commit this file, you can see both tools at work: Black reformats main.py, and Gitleaks reports the hardcoded secret, so the commit is blocked. You can also run the hooks without creating a commit: pre-commit run --all-files.
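
For illustration, a version of main.py that should pass both hooks might look like the sketch below. It follows the layout Black produces and reads the secret from an environment variable instead of hardcoding it. The variable name APP_SECRET and the helper load_secret are assumptions made for this example, not part of the original file:

```python
import os


def load_secret(env=None):
    """Return the secret from the (hypothetical) APP_SECRET variable."""
    env = os.environ if env is None else env
    secret = env.get("APP_SECRET")
    if not secret:
        raise RuntimeError("APP_SECRET is not set")
    return secret


print("a")
for i in range(10):
    print("x")
```

The secret itself then lives outside the repository, for example in a CI/CD variable, so there is nothing left in the source for a scanner to find.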

CI/CD pipelines

A CI/CD pipeline is a series of steps that streamline the software delivery process. Automated pipelines can help prevent errors resulting from manual processes, allow for rapid product iterations, and provide consistent feedback during development. Each step of a CI/CD pipeline is a subset of tasks grouped into pipeline stages. Source: https://about.gitlab.com/topics/ci-cd/cicd-pipeline/

We will use the tools from our first guide, but we will integrate them into the pipelines in such a way that we can see the scan results before or after pushing the commit to our repository.

Detection in CI/CD pipelines

The standard pipeline offers an even easier setup. With this pipeline, every commit you push to the remote repository triggers an automatic run. In the event of a pipeline failure, you will receive an alert email.

For our example file, the pipeline reports that the code was not properly formatted and that a secret was found.

Setup

  • Create a file named .gitlab-ci.yml.
  • To achieve the same result as in the pre-commit pipeline, add the following content:
---
stages:
  - test

black:
  stage: test
  image: python
  script:
    - pip install black
    - black --check .

gitleaks:
  stage: test
  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]
  script: >
    gitleaks detect
    -v 
    --log-opts=$CI_COMMIT_SHA

Now, after pushing a commit with our main.py file, you can see that the pipeline is failing for the same reasons the pre-commit job was failing.

Scan results can be found in GitLab under Build > Pipelines.
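
If you prefer to post-process the findings rather than read the job log, Gitleaks can also write a report file. The sketch below assumes the job was extended with --report-format json --report-path gitleaks-report.json (the file name is an arbitrary choice) and that the report is a JSON array whose entries carry RuleID, File, and StartLine fields, as Gitleaks v8 emitted at the time of writing. Check this against the version you run:

```python
import json
from collections import defaultdict


def summarize_report(report_json):
    """Group Gitleaks findings by file as (line, rule id) pairs."""
    findings = json.loads(report_json)
    by_file = defaultdict(list)
    for finding in findings:
        # Defaults keep the sketch tolerant of fields missing in other versions.
        by_file[finding.get("File", "<unknown>")].append(
            (finding.get("StartLine", 0), finding.get("RuleID", ""))
        )
    return {path: sorted(items) for path, items in by_file.items()}


# Hand-written sample in the assumed report shape, not real scanner output.
sample = json.dumps([
    {"RuleID": "generic-api-key", "File": "main.py", "StartLine": 2},
    {"RuleID": "generic-api-key", "File": "config.py", "StartLine": 7},
])

for path, items in summarize_report(sample).items():
    for line, rule in items:
        print(f"{path}:{line} {rule}")
```

In a real job you would read the report from the file instead of the sample string, and you could keep it as a job artifact for later inspection.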

GitLab built-in CI/CD templates

Such a pipeline can also look like this:

---
include:
  - template: Security/Secret-Detection.gitlab-ci.yml

These are GitLab's built-in templates. Results from the scan are presented in the GitLab "Security Dashboard". The default ruleset used by the template is here: https://gitlab.com/gitlab-org/security-products/analyzers/secrets/-/blob/master/gitleaks.toml. The built-in template is more limited in configuration than a custom job, but you can change its behavior to some extent.

This pipeline uses Gitleaks under the hood; however, the default GitLab ruleset is smaller than the default configuration of the tool itself.

Overall, this template provides great value for no effort.

Why not use the default GitLab templates?

The default GitLab templates are great, but they are limited in configuration. You have little control over the tools running in the pipeline, and integrating them with other tools can be tricky. In practice, you will usually combine the default templates with custom pipelines.

Example of the custom pipeline

Below is an example of how you can integrate multiple tools into one pipeline. The pipeline combines:

  • Default GitLab templates
  • Gitleaks
  • TruffleHog
  • Ansible-lint check
  • Terraform format check
---
include:
  - template: Security/SAST.gitlab-ci.yml
  - template: Security/SAST-IaC.gitlab-ci.yml
  - template: Security/Dependency-Scanning.gitlab-ci.yml
  - template: Security/Secret-Detection.gitlab-ci.yml

stages:
  - test

gitleaks:
  stage: test
  allow_failure: true

  image:
    name: zricethezav/gitleaks:latest
    entrypoint: [""]
  script: >
    gitleaks detect
    -v 
    --log-opts=$CI_COMMIT_SHA

trufflehog:
  stage: test
  allow_failure: true

  image:
    name: trufflesecurity/trufflehog:latest
    entrypoint: [""]
  script: trufflehog filesystem .

ansible-lint:
  stage: test

  image: python:slim
  script:
    - pip install ansible ansible-lint
    - echo "$VAULT_KEY" > vault_key.txt
    - chmod 700 .
    - ansible-lint --version
    - ansible-lint playbooks/*.yml

terraform-validate:
  stage: test

  image:
    name: hashicorp/terraform:light
    entrypoint: [""]
  script:
    - cd terraform
    - terraform fmt -check

How pipelines work on GitHub

The pre-commit tool works exactly the same on GitHub as it does on GitLab, because it runs locally. The two services differ only in their remote CI/CD pipelines: both use structured YAML, but GitHub uses slightly different keywords and calls its pipelines "Actions".

Here is the official GitHub Action for Gitleaks:

name: gitleaks
on: [pull_request, push, workflow_dispatch]
jobs:
  scan:
    name: gitleaks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

A similar workflow for TruffleHog:

name: TruffleHog Secrets Scan
on: [pull_request]
jobs:
  TruffleHog:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: TruffleHog OSS
        uses: trufflesecurity/trufflehog@main
        with:
          path: ./
          base: ${{ github.event.repository.default_branch }}
          head: HEAD
          extra_args: --debug --only-verified

More Actions can be found on the GitHub Marketplace, a collection of community-made actions that are ready to use.

Contact

If you want to know more or need help with deployment, feel free to contact us at csirt@muni.cz.
