ARC Data Analytics Handbook

Version 0.0.2

All things data analytics at ARC Resources.

Git Workflow

Git Branching Strategy

Overview

Our general Git workflow uses three main branches:

  • develop: Development branch
  • uat: User Acceptance Testing branch
  • master: Production

Project work begins on the develop branch, with feature branches created for individual tasks or enhancements. All work is completed in feature branches and merged into develop via Pull Requests (PRs).

Branch Naming Conventions

  • Feature branches should follow the format: feature/[ADO_task_number]_[description]
    • Example: feature/46499_drilling_anomaly_xgboost_experiment
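The naming convention can be checked mechanically. The sketch below is one possible validator, not an existing team tool: the regular expression assumes the ADO task number is purely numeric and the description is lowercase snake_case, and the name parse_feature_branch is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: "feature/", a numeric ADO task number, "_", then a
# lowercase snake_case description. Adjust to match actual team practice.
FEATURE_BRANCH_RE = re.compile(
    r"^feature/(?P<task>\d+)_(?P<description>[a-z0-9]+(?:_[a-z0-9]+)*)$"
)


def parse_feature_branch(name: str) -> Optional[Tuple[int, str]]:
    """Return (task_number, description) if the name follows the convention, else None."""
    match = FEATURE_BRANCH_RE.match(name)
    if match is None:
        return None
    return int(match.group("task")), match.group("description")


print(parse_feature_branch("feature/46499_drilling_anomaly_xgboost_experiment"))
print(parse_feature_branch("feature/drilling_anomaly"))  # no task number
```

A check like this could run in a pre-push hook or a CI step, so that misnamed branches are caught before a PR is opened.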

Workflow Steps

1. Feature Development

  • Create a feature branch from the project’s develop branch.
  • Complete work and commit changes to the feature branch.
  • When ready, open a Pull Request to merge the feature branch into develop.
  • All merges into develop must be completed via PRs, with policies in place to enforce this.
  • When merging, squash commits into a single commit and delete the feature branch after merging.

2. Deploying to UAT

Once features are ready for testing, changes are moved from develop to uat.

  • All code should be reviewed to ensure it satisfies coding standards.
  • Sync develop with the latest changes from uat before merging develop into uat.
  • All merges into uat must be done via PRs.
  • At least one reviewer should be added for merging into uat.
  • Direct updates to uat are prohibited.
  • Do not delete the develop or uat branches after merging.
  • After merging, develop and uat should have similar, linear histories.
  • CI/CD Integration: Opening a PR to uat triggers automated CI pipelines that run unit and integration tests. These tests validate model training, inference, and monitoring workflows in the UAT Databricks workspace. Logs and test results are available in the PR UI. Failed tests must be resolved before merging.

3. Deploying to PRD

  • After successful UAT testing and approval, changes are merged from uat into master.
  • Sync uat with the latest changes from master before merging uat into master.
  • All merges into master must be done via PRs, with team leads and infrastructure admins as approvers.
  • Direct updates to master are prohibited.
  • Do not delete the uat or master branches after merging.
  • After merging, uat and master should have similar, linear histories.
  • CI/CD Integration: Merging into master triggers automated deployment to the production Databricks workspace. The trained model is registered as a challenger in Unity Catalog and compared against the current champion. If it passes validation, it is promoted to champion and deployed for production use.
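The champion/challenger step can be pictured as a simple comparison. The handbook does not say which validation checks the pipeline applies, so the sketch below rests on assumptions: a single validation metric where higher is better, a hypothetical min_gain threshold, and illustrative names (ModelVersion, should_promote, the model name "drilling_anomaly"). It does not call Unity Catalog or MLflow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    metric: float  # hypothetical validation metric; higher is assumed better


def should_promote(
    challenger: ModelVersion,
    champion: Optional[ModelVersion],
    min_gain: float = 0.0,
) -> bool:
    """Decide whether the challenger should replace the current champion."""
    if champion is None:
        # No champion registered yet: the first validated model takes the role.
        return True
    # Promote only if the challenger beats the champion by more than min_gain.
    return challenger.metric - champion.metric > min_gain


champion = ModelVersion("drilling_anomaly", 3, 0.81)
challenger = ModelVersion("drilling_anomaly", 4, 0.85)
print(should_promote(challenger, champion, min_gain=0.01))
```

In the actual pipeline the comparison may involve several metrics or additional validation gates; the point here is only that promotion is conditional on the challenger outperforming the champion.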

Hotfixes

  • For urgent fixes in production, create a hotfix branch from master.
  • Complete the hotfix and squash-merge it into master via a PR.
  • After deploying the hotfix to master, pull the fix back into develop and uat and sync to ensure all branches remain up to date.

Preparing Your Branch for a Pull Request

Before opening a Pull Request (to branches such as develop, uat, or master), it’s best to synchronize your branch with the latest changes from the target branch. This ensures you:

  • Resolve potential merge conflicts early.
  • Test your code against the most up-to-date version.

Databricks Git integration provides three options for this:

1. Reset

Access the Git Reset operation by selecting it from the kebab menu (⋮) in the upper-right corner of the Git operations dialog.

  • git reset replaces the contents and history of your branch with the most recent state of another branch.
  • Use this only if your local edits conflict with the upstream branch and you are fine with discarding those edits entirely.
  • Resetting discards all uncommitted and committed changes on your local branch. If you force-push afterward, those changes are lost on the remote as well.

⚠️ Because of the risk of data loss, this option is generally not recommended.

Databricks Git Reset Example

2. Merge

Access the Git Merge operation by selecting it from the kebab menu (⋮) in the upper-right corner of the Git operations dialog.

  • The merge option works like a standard git merge.
  • It combines the commit history from the target branch into your current branch.
  • The result preserves all commits from both branches, creating a “merge commit” if necessary.
  • Unlike rebasing, merging does not rewrite commit history and therefore does not require force-pushing.

👍 This makes merge the safest and most collaboration-friendly approach, and the recommended option when working in shared repositories.
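To see why a merge does not require a force push, it can help to model a branch history as a plain list of commit IDs. This is a deliberately simplified toy (real Git history is a graph, not a list), and the function name merge and the commit IDs are made up for illustration.

```python
from typing import List


def merge(current: List[str], target: List[str], merge_id: str = "merge") -> List[str]:
    """Toy merge of `target` into `current`; both share a common starting prefix."""
    incoming = [c for c in target if c not in current]
    if not incoming:
        return list(current)  # already up to date, nothing to merge
    if current == target[: len(current)]:
        return list(target)  # fast-forward: no merge commit is needed
    # Keep every existing commit, add the incoming ones, then a merge commit.
    return current + incoming + [merge_id]


feature = ["a", "b", "f1", "f2"]
develop = ["a", "b", "d1"]
merged = merge(feature, develop)
print(merged)
# The original feature commits are still the start of the merged history,
# so the remote branch can be updated with an ordinary push.
print(merged[: len(feature)] == feature)
```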

Databricks Git Merge Example

3. Rebase

Access the Git Rebase operation by selecting it from the kebab menu (⋮) in the upper-right corner of the Git operations dialog.

  • Like merging, rebasing integrates changes from one branch into another.
  • However, rebase rewrites commit history to create a linear sequence of commits:
    1. Saves your current branch’s commits temporarily.
    2. Resets your branch to the target branch.
    3. Reapplies your saved commits one by one.

This results in a cleaner, linear history — but it comes with risks:

  • Rebasing can cause issues for collaborators if they have already pulled the original branch, since history has changed.
  • Rebasing often requires a force push (git push --force) to update the remote branch.

⚠️ Use rebase only if you and your team are comfortable with rewriting history and are aligned on its usage.
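The three rebase steps can be sketched with a toy model in which a branch history is a list of commit IDs. This is a simplification for illustration only (real Git history is a graph and commit IDs are hashes); the function names and the trailing prime used to mark rewritten commits are assumptions of the sketch.

```python
from typing import List


def rebase(current: List[str], target: List[str]) -> List[str]:
    """Toy rebase of `current` onto `target`, following the three steps."""
    # 1. Save the commits that exist only on the current branch.
    saved = [c for c in current if c not in target]
    # 2. Reset the branch to the target branch.
    rebased = list(target)
    # 3. Reapply the saved commits one by one. Each replayed commit is a
    #    new commit, marked here with a trailing prime.
    for commit in saved:
        rebased.append(commit + "'")
    return rebased


def needs_force_push(old: List[str], new: List[str]) -> bool:
    """A plain push only works if the old history is a prefix of the new one."""
    return new[: len(old)] != old


feature = ["a", "b", "f1", "f2"]
develop = ["a", "b", "d1"]
rebased = rebase(feature, develop)
print(rebased)
print(needs_force_push(feature, rebased))
```

The result is linear, but f1 and f2 have been replaced by new commits, which is why collaborators who already pulled the original branch run into trouble.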

Databricks Git Rebase Example

Pull Request Checklist

  • PRs should only be submitted after the work has been completed and thoroughly tested.
  • Each PR to uat and master must include:
    • A clear description of the proposed changes.
    • A summary of improvements or issues resolved by the PR.
    • Any updates to documentation or deployment considerations.
    • Confirmation that the code follows the guidelines provided in this document.
  • For production deployments, ensure a change request is created, approved, and attached to the PR.
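The checklist lends itself to an automated pre-review check. The sketch below is hypothetical: the PullRequest record and its field names are invented for illustration and do not correspond to an Azure DevOps or Databricks API. It also assumes the documentation and deployment field must always be filled in, even if only to say that nothing changed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PullRequest:
    """Hypothetical PR record; field names are illustrative only."""
    target: str
    description: str = ""
    summary: str = ""
    docs_and_deployment_notes: str = ""
    follows_guidelines: bool = False
    change_request_id: Optional[str] = None  # approved change request, if any


def missing_items(pr: PullRequest) -> List[str]:
    """List the checklist items a PR still lacks for its target branch."""
    missing: List[str] = []
    if pr.target in ("uat", "master"):
        if not pr.description.strip():
            missing.append("description of changes")
        if not pr.summary.strip():
            missing.append("summary of improvements or issues resolved")
        if not pr.docs_and_deployment_notes.strip():
            missing.append("documentation or deployment notes")
        if not pr.follows_guidelines:
            missing.append("guideline confirmation")
    # Production deployments additionally need an approved change request.
    if pr.target == "master" and not pr.change_request_id:
        missing.append("approved change request")
    return missing


pr = PullRequest(
    target="master",
    description="Add model",
    summary="Fixes drift alerts",
    docs_and_deployment_notes="None",
    follows_guidelines=True,
)
print(missing_items(pr))
```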

Approvals & Policies

  • develop: All merges via PR, enforced by policy.

  • uat: All merges via PR, with designated approvers.

  • master: All merges via PR, with team leads and infrastructure admins as approvers.
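These policies can be summarized as a small lookup table. The sketch below is an illustration, not the actual branch-policy configuration: the role names, the "hotfix/" prefix, the rule that master needs one approval from each role, and the choice to route hotfix back-merges through develop are all assumptions.

```python
from typing import Set, Tuple

# Assumed policy table. A source ending in "/" is treated as a branch prefix.
POLICIES = {
    "develop": {"sources": ("feature/", "hotfix/"), "required_approvals": set()},
    "uat": {"sources": ("develop",), "required_approvals": {"reviewer"}},
    "master": {
        "sources": ("uat", "hotfix/"),
        "required_approvals": {"team lead", "infrastructure admin"},
    },
}


def check_merge(source: str, target: str, via_pr: bool, approvals: Set[str]) -> Tuple[bool, str]:
    """Return (allowed, reason) for merging `source` into `target`."""
    policy = POLICIES.get(target)
    if policy is None:
        return False, f"'{target}' is not a protected branch in this sketch"
    if not via_pr:
        return False, f"direct updates to {target} are not allowed; open a PR"
    source_ok = any(
        source == s or (s.endswith("/") and source.startswith(s))
        for s in policy["sources"]
    )
    if not source_ok:
        return False, f"{source} is not an expected source for {target}"
    missing = policy["required_approvals"] - approvals
    if missing:
        return False, "missing approval from: " + ", ".join(sorted(missing))
    return True, "ok"


print(check_merge("uat", "master", True, {"team lead"}))
```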

Notes

  • All merges should be synchronized to maintain a clean, linear history.

  • Feature branches are deleted after merging into develop.

  • develop, uat, and master branches are persistent and should not be deleted.

  • Hotfixes are managed via dedicated branches from master and merged back into develop and uat.

  • All merges into develop, uat, and master must follow PR-based workflows.

  • CI/CD Enforcement: Ensure pipeline YAML files (e.g., mlopstemplate-tests-ci.yml) are added to branch policies for uat to enforce validation and deployment checks.

Diagram

---
config:
  logLevel: 'debug'
  theme: 'base'
  themeVariables:
    commitLabelFontSize: '12px'
    'git0': '#00A9E0'
    'git1': '#DA291C'
    'git2': '#9EA2A2'
    'git3': '#97D700'
    'git4': '#CDEA80'
    'git5': '#CDEA80'
  gitGraph:
    rotateCommitLabel: false
    showBranches: true
    showCommitLabel: true
    mainBranchName: 'master'
---
gitGraph
  commit id: "Initial"
  branch uat order: 2
  branch develop order: 3
  checkout develop
  commit id: 'feature1'
  branch feature1 order: 3
  checkout feature1
  commit id: 'c1a'
  commit id: 'c1b'
  checkout develop
  merge feature1 tag: 'squash features (single commit)' id: 'c1'
  commit id: 'feature2'
  branch feature2 order: 3
  checkout feature2
  commit id: 'c2a'
  commit id: 'c2b'
  checkout develop
  merge feature2 tag: 'squash features' id: 'c2'
  checkout uat
  merge develop tag: 'multi commit'
  commit id: 'c1 uat'
  commit id: 'c2 uat'
  checkout master
  commit id: 'hotfix'
  branch hotfix order: 1
  checkout hotfix
  commit id: 'HF1'
  checkout master
  merge hotfix tag: 'HF1'
  checkout develop
  merge hotfix tag: 'HF1'
  checkout uat
  merge develop tag: 'HF1'
  checkout master
  merge uat tag: 'c1c2'

Last updated on 16 Sep 2025
Published on 16 Sep 2025