ARC Data Analytics Handbook

Version 0.0.2

All things data analytics at ARC Resources.

How the Dewaxing LLM Works

The Dewaxing LLM project uses large language models to automate the extraction and structuring of job summaries and run details from WellView job logs. This enables more efficient data capture, reporting, and optimization of dewaxing operations.

Model Background

Job Summary

Modeling Strategy

For dewaxing jobs in WellView, comments from the job time log are extracted and used as input for the large language model (LLM). A labeled dataset is used to create prompt instructions and provide examples for few-shot prompting. A subset of unseen labeled examples is reserved for evaluation.

Objective: The model generates the number of runs, the relevant dates, and a comprehensive job summary. The job summary covers each day of the job, and the output is post-processed to produce one summary per day.

Example Input and Output:

Log Data:

Job Id: FD1D2182A580450DAFC7607319174610 Well Name: 7GEN HZ 102 KARR 13-11-64-4 Pad Name: 16-16-064-04W6 Job Category: Workover Time Log Start Date: 2022-08-23 00:00:00 Time Log End Date: 2022-08-23 11:00:00 Comments: Well flowing through Production … (additional time log entries)

Summary Output:

  • Job Summary:
    Date: [“2022-08-31”]
    Number of Runs: 10
    Summary:
    Crew time on well: 300 minutes (11:30 to 16:30).
    4 runs with barbed spear reaching depths from 100 m to 750 m, retrieving brown medium wax (100 L flowed back per run).
    3 runs with wax knife to 750 m, retrieving brown medium wax: 2 runs at 100%, then 1 run at 10% (100 L flowed back per run).
    2 runs with gauge, no issues, to 750 m and 2763 m.
    1 run with BHBS assembly set in profile at 2763 m.
    Bumper spring preparation: Yes (BHBS installed and plunger cycling).

  • Daily Summary:
    Crew time on well: 300 minutes (11:30 to 16:30).
    4 runs with barbed spear reaching depths from 100 m to 750 m, retrieving brown medium wax (100 L flowed back per run).
    3 runs with wax knife to 750 m, retrieving brown medium wax: 2 runs at 100%, then 1 run at 10% (100 L flowed back per run).
    2 runs with gauge, no issues, to 750 m and 2763 m.
    1 run with BHBS assembly set in profile at 2763 m.
    Bumper spring preparation: Yes (BHBS installed and plunger cycling).
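The structured output above maps onto a simple record. A minimal sketch using a Python dataclass (field names follow the inference-table columns documented below; the class name itself is hypothetical):

```python
from dataclasses import dataclass, field

@dataclass
class JobSummaryOutput:
    # Dates covered by the job, e.g. ["2022-08-31"]
    Date: list[str] = field(default_factory=list)
    # Total number of runs identified in the time log
    NumberOfRuns: int = 0
    # Comprehensive summary of the whole job
    JobSummary: str = ""
    # One summary per date, derived from JobSummary in post-processing
    DailySummary: list[str] = field(default_factory=list)

out = JobSummaryOutput(
    Date=["2022-08-31"],
    NumberOfRuns=10,
    JobSummary="Crew time on well: 300 minutes (11:30 to 16:30). ...",
    DailySummary=["Crew time on well: 300 minutes (11:30 to 16:30). ..."],
)
```

In practice the output schema is enforced by the structured-output configuration in the model config YAML, not by a dataclass.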

Model Selection and Configuration

At the time of writing, the model used was “Claude Sonnet 4” with few-shot prompting and structured output.

For the most up-to-date model pipeline, refer to the main branch of the code repository:
src/data_science/models/sources/fewshot_llm_model.py

For configuration, model serving, prompts, few-shot examples, and output structures, see:
src/data_science/models/sources/job_summary_llm_config.yaml

Model Update History

| Version | Date | Updates |
| --- | --- | --- |
| 1 | | Initial release of model |

Using the Model

The model is registered in the Unity Catalog at:
prd_zone3.dewaxingllm.job_summary_llm_model

Inference results are stored at:
prd_zone3.dewaxingllm.job_summary_inference

Key Columns in the Inference Table:

| Column | Type | Description |
| --- | --- | --- |
| APIUWI | string | Unique well identifier |
| WellID | string | WellView ID for the well |
| JOBID | string | WellView ID for the job |
| JobStartDate | timestamp | Start date of the job |
| JobEndDate | timestamp | End date of the job |
| PRIMARYJOBTYPE | string | Type of the job |
| SECONDARYJOBTYPE | string | Sub type of the job |
| LogData | string | Concatenated string containing job and time log details for LLM input |
| job_summary_predicted | struct | Extracted job summary from LogData using LLM |
| Date | array | Dates, extracted from job_summary_predicted |
| NumberOfRuns | int | Number of runs, extracted from job_summary_predicted |
| JobSummary | string | Job summary, extracted from job_summary_predicted |
| DailySummary | array | Daily summary, extracted from job_summary_predicted |
| mlflow_run_id | string | Unique identifier for each MLflow run |
| model_registry_name | string | Three-level name of the registered model in Unity Catalog (catalog.schema.model) |
| model_type | string | Type of the machine learning model used |
| model_version | string | Version of the machine learning model used |
| load_datetime_utc | timestamp | Date and time of data loading in Coordinated Universal Time (UTC) |

Integration with WellView API:

  • Job summaries are sent to the summary field in the Jobs table.
  • Daily summaries are sent to the summary field in the Daily Operations table.
  • All AI-generated values are tagged with ‘AI Generated’.

Run Details

Modeling Strategy

For dewaxing jobs in WellView, comments from the job time log are extracted and used as input for the LLM. The objective is to identify and extract details from each run within a job. To achieve this, a labeled dataset is used to create prompt instructions and provide examples for few-shot prompting. A subset of unseen labeled examples is reserved for evaluation.

Data Structure and Field Descriptions:

  • SwabNo (integer): Sequential numbers beginning with 1 at the start of the job.
  • StartDate (string): Timestamp in the format YYYY-MM-DD HH:MM:SS.
  • SolventOrSteam (string; enum: “0”, “0.1”, “1”, “1.1”):
    • 0 = no steam, no solvent
    • 0.1 = solvent only
    • 1 = steam only
    • 1.1 = steam and solvent
    • 0 if not mentioned.
  • WaxProperties (string or null; enum):
    • Wax type code determined by properties mentioned in the log.
    • null if not mentioned for that run.
  • WaxPercentage (number or null):
    • Decimal value of wax percentage.
    • 0.0 if the log indicates no wax was observed.
    • null if percentage is not mentioned or cannot be inferred.
  • DepthPull (number or null):
    • Depth reached during the run in meters.
    • null if not mentioned.
  • VolFluidRec (number or null):
    • Volume in cubic meters (e.g., 0.05 for 50 L).
    • null if not mentioned.
  • TempWH (number or null):
    • Wellhead temperature in Celsius. Usually recorded only for the first swab (SwabNo 1) at the start of the log.
    • null if not mentioned.
  • Com (string or null; enum: null, “spear”, “wax knife”, “gauge ring”, “bumper spring”, “plunger”):
    • Specify the tool used only if it matches listed categories.
    • null if no tool is mentioned or does not match.
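The field constraints above can be expressed as a lightweight record with a validation check. A sketch under those constraints (the class and function names are hypothetical; the real output structure is defined in the model config YAML):

```python
from dataclasses import dataclass
from typing import Optional

# Allowed coded values, per the field descriptions above
SOLVENT_OR_STEAM = {"0", "0.1", "1", "1.1"}
TOOLS = {None, "spear", "wax knife", "gauge ring", "bumper spring", "plunger"}

@dataclass
class RunDetail:
    SwabNo: int                            # sequential, starting at 1
    StartDate: str                         # "YYYY-MM-DD HH:MM:SS"
    SolventOrSteam: str = "0"              # "0" if not mentioned
    WaxProperties: Optional[str] = None    # null if not mentioned
    WaxPercentage: Optional[float] = None  # 0.0 = no wax observed
    DepthPull: Optional[float] = None      # metres
    VolFluidRec: Optional[float] = None    # cubic metres (0.1 = 100 L)
    TempWH: Optional[float] = None         # Celsius, usually SwabNo 1 only
    Com: Optional[str] = None              # tool, from the allowed list

def validate(run: RunDetail) -> bool:
    """Check a run against the enum and sequencing constraints above."""
    return (
        run.SwabNo >= 1
        and run.SolventOrSteam in SOLVENT_OR_STEAM
        and run.Com in TOOLS
    )

example = RunDetail(SwabNo=1, StartDate="2019-12-17 09:30:00",
                    DepthPull=10.0, VolFluidRec=0.1, Com="spear")
```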

Example Input and Output:

Log Data:

Job Id: 0F95E3E0E3104E5684F6E55DD5A9D062 Well Name: 7GEN HZ KAKWA 4-30-64-4 Pad Name: 16-17-064-04W6 Job Category: Workover Time Log Start Date: 2019-12-17 00:00:00 Time Log End Date: 2019-12-18 00:00:00 Comments: Record pressures FTP=2992kPa SICP=5764kPa 09:30 - RIH with spear to 10m, flow back to P-tank 100L 09:45 - RIH with spear to 50m, flow back to P-tank 100L … (additional time log entries)

Extracted Run Details:

  • SwabNo: 1 StartDate: 2019-12-17 09:30:00 SolventOrSteam: ‘0’ WaxProperties: null WaxPercentage: null DepthPull: 10.0 VolFluidRec: 0.1 TempWH: null Com: spear, AI Generated
  • SwabNo: 2 StartDate: 2019-12-17 09:45:00 SolventOrSteam: ‘0’ WaxProperties: null WaxPercentage: null DepthPull: 50.0 VolFluidRec: 0.1 TempWH: null Com: spear, AI Generated (Additional runs follow the same structure.)

Model Selection and Configuration

At the time of writing, the model used was “Llama 4 Maverick” with few-shot prompting and structured output.

For the latest model pipeline, refer to the main branch of the code repository:
src/data_science/models/sources/fewshot_llm_model.py

For configuration, model serving, prompts, few-shot examples, and output structures, see:
src/data_science/models/sources/run_details_llm_config.yaml

Model Update History

| Version | Date | Updates |
| --- | --- | --- |
| 1 | | Initial release of model |

Model Usage

The model is registered in the Unity Catalog at:
prd_zone3.dewaxingllm.run_details_llm_model

Inference results are stored at:
prd_zone3.dewaxingllm.run_details_inference

Key Columns in the Inference Table:

| Column | Type | Description |
| --- | --- | --- |
| APIUWI | string | Unique well identifier |
| WellID | string | WellView ID for the well |
| JOBID | string | WellView ID for the job |
| JobStartDate | timestamp | Start date of the job |
| JobEndDate | timestamp | End date of the job |
| PRIMARYJOBTYPE | string | Type of the job |
| SECONDARYJOBTYPE | string | Sub type of the job |
| LogData | string | Concatenated string containing job and time log details for LLM input |
| run_details_predicted | struct | Extracted run details from LogData using LLM |
| SwabNo | bigint | Sequential run number within the job, extracted from run_details_predicted |
| StartDate | timestamp | Date the run occurred, extracted from run_details_predicted |
| SolventOrSteam | string | Coded value indicating steam and/or solvent usage, extracted from run_details_predicted; captured as PresTub column in WellView database |
| WaxProperties | string | Coded value describing wax color and hardness, extracted from run_details_predicted; captured as PresCas column in WellView database |
| WaxPercentage | double | Percentage of tool capacity filled with wax (null if unspecified, 0 if noted as empty), extracted from run_details_predicted; captured as TankGauge column in WellView database |
| DepthPull | double | Maximum depth reached during the run, extracted from run_details_predicted |
| VolFluidRec | double | Volume of fluid recovered (in cubic meters), extracted from run_details_predicted |
| TempWH | double | Wellhead temperature; entered on the first swab when mentioned, extracted from run_details_predicted |
| Com | string | Tool used for the run with the ‘AI Generated’ tag, extracted from run_details_predicted |
| mlflow_run_id | string | Unique identifier for each MLflow run |
| model_registry_name | string | Three-level name of the registered model in Unity Catalog (catalog.schema.model) |
| model_type | string | Type of the machine learning model used |
| model_version | string | Version of the machine learning model used |
| load_datetime_utc | timestamp | Date and time of data loading in Coordinated Universal Time (UTC) |

Integration with WellView API:

  • Job run details are sent to the Swab Details table.
  • All AI-generated values are tagged as ‘AI Generated’.

Pipeline Overview

Training Workflow: prd-dewaxingllm-training

The training workflow consists of several steps to ensure robust model development and deployment:

  1. Source Data Validation: Source data is tested to confirm adherence to expected schema. If the data does not meet these criteria, the pipeline will fail.

    | # | Name | Schema | Validations | Key Notes |
    | --- | --- | --- | --- | --- |
    | 1 | jobtimelog_v1 | prd_zone2.wellviewetl.jobtimelogv1 | Schema | Columns: STARTDATE, COMMENTS, ENDDATE, IDJOB |
    | 2 | job_v1 | prd_zone2.wellviewetl.jobv1 | Schema | Columns: JOBCATEGORY, ID, PRIMARYJOBTYPE, SECONDARYJOBTYPE, STARTDATE, ENDDATE |
    | 3 | wells_v1 | prd_zone2.wellviewetl.wellsv1 | Schema | Columns: PADNAME, WELLNAME, APIUWI, ID |

    For detailed validation criteria, refer to src/tests/data/conftest.yml.
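The schema check amounts to a required-columns comparison per source table. A sketch under that assumption (the actual criteria live in src/tests/data/conftest.yml; the function name is hypothetical):

```python
# Expected columns per source table, taken from the validation table above
REQUIRED_COLUMNS = {
    "prd_zone2.wellviewetl.jobtimelogv1": {"STARTDATE", "COMMENTS", "ENDDATE", "IDJOB"},
    "prd_zone2.wellviewetl.jobv1": {
        "JOBCATEGORY", "ID", "PRIMARYJOBTYPE", "SECONDARYJOBTYPE", "STARTDATE", "ENDDATE"
    },
    "prd_zone2.wellviewetl.wellsv1": {"PADNAME", "WELLNAME", "APIUWI", "ID"},
}

def missing_columns(table: str, actual_columns: set[str]) -> set[str]:
    """Required columns absent from the table; non-empty means the pipeline fails."""
    return REQUIRED_COLUMNS[table] - actual_columns
```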

  2. Feature Engineering: The feature engineering pipeline processes the entire historical dataset, consolidating time logs for each job.

  3. Model Training: Model training is executed in parallel for both job summary and run details models:

    • job_summary_training
    • run_details_training

  4. Model Validation: A set of labeled sample data is curated for both job summary and run details tasks, and stored in:

    • src/data_science/models/sources/job_summary_labeled_samples.yaml
    • src/data_science/models/sources/run_details_labeled_samples.yaml
      After training, the model is evaluated using these labeled samples. Final accuracy metrics are calculated and logged in both the metrics table and MLflow. Model validation compares these metrics against the defined criteria. If the criteria are met, the model advances to the deployment step. Validation is performed in parallel for both models:

    • Job Summary Model Validation Metrics:
      • Compares ‘NumberOfRuns’ (difference ≤ 1), ‘Date’ arrays (exact match), and ‘JobSummary’ character length (difference ≤ 500).
      • Computes per-column accuracy and overall accuracy across all columns.

    • Job Summary Model Validation Criteria:
      • Overall Accuracy: The model must achieve at least 80% overall accuracy.
      • Number of Runs Accuracy: At least 70% accuracy in correctly identifying the number of runs.
      • Date Accuracy: At least 70% accuracy in extracting relevant dates.
      • Job Summary Length Accuracy: At least 70% accuracy in generating job summaries of appropriate length.

    • Run Details Model Validation Metrics:
      • Checks for exact matches between all columns, including StartDate, SolventOrSteam, WaxProperties, WaxPercentage, DepthPull, VolFluidRec, TempWH, and Com.
      • Calculates per-column accuracy and overall accuracy across all columns.

    • Run Details Model Validation Criteria:
      • Overall Accuracy: The model must achieve at least 80% overall accuracy.
      • Start Date Accuracy: At least 70% accuracy in correctly identifying start dates.
      • Solvent or Steam Usage Accuracy: At least 70% accuracy in identifying solvent or steam usage.
      • Wax Properties Accuracy: At least 70% accuracy in identifying wax properties.
      • Wax Percentage Accuracy: At least 70% accuracy in identifying wax percentage.
      • Depth Pull Accuracy: At least 70% accuracy in identifying the depth reached during each run.
      • Recovered Fluid Volume Accuracy: At least 70% accuracy in identifying the volume of fluid recovered.
      • Wellhead Temperature Accuracy: At least 70% accuracy in identifying wellhead temperature.
      • Tool Used Accuracy: At least 70% accuracy in identifying the tool used in each run.
      • Extra Predictions: The proportion of extra (unnecessary) predictions must not exceed 10%.
      • Missing Predictions: The proportion of missing (unreported) predictions must not exceed 10%.
    • This pipeline runs in parallel for both models:
      • job_summary_model_validation
      • run_details_model_validation
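The job summary comparison rules (runs within ±1, exact date-array match, summary length within 500 characters) can be sketched as plain Python; function names are hypothetical, and the real evaluation is logged to the metrics table and MLflow:

```python
def job_summary_matches(pred: dict, label: dict) -> dict:
    """Per-column match flags for one labeled sample, per the rules above."""
    return {
        "NumberOfRuns": abs(pred["NumberOfRuns"] - label["NumberOfRuns"]) <= 1,
        "Date": pred["Date"] == label["Date"],  # exact array match
        "JobSummary": abs(len(pred["JobSummary"]) - len(label["JobSummary"])) <= 500,
    }

def accuracy(samples: list[tuple[dict, dict]]) -> dict:
    """Per-column accuracy (%) plus overall accuracy averaged across columns."""
    flags = [job_summary_matches(p, lbl) for p, lbl in samples]
    cols = list(flags[0].keys())
    per_col = {c: 100.0 * sum(f[c] for f in flags) / len(flags) for c in cols}
    per_col["overall"] = sum(per_col[c] for c in cols) / len(cols)
    return per_col
```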
  5. Model Deployment: Once training is complete, the model is assigned the “Challenger” alias. If the model passes all minimum validation criteria, it is promoted to “Champion.” This pipeline runs in parallel for both models:

    • job_summary_model_deployment
    • run_details_model_deployment
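Promotion from “Challenger” to “Champion” is gated by the validation criteria above. A minimal sketch of the gate for the job summary model (the threshold constants mirror the criteria listed earlier; the actual alias assignment is done through the MLflow registry):

```python
# Minimum accuracy thresholds (%) from the job summary validation criteria
CRITERIA = {
    "overall": 80.0,
    "NumberOfRuns": 70.0,
    "Date": 70.0,
    "JobSummary": 70.0,
}

def alias_for(metrics: dict) -> str:
    """Return the registry alias the model should carry after validation."""
    passed = all(metrics.get(name, 0.0) >= threshold
                 for name, threshold in CRITERIA.items())
    return "Champion" if passed else "Challenger"
```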

  6. Inference: Inference is performed on a small subset of data to test and create the inference table if not already available. This runs in parallel for both models:

    • job_summary_inference
    • run_details_inference

  7. Output Data Validation: Output data is validated for schema, comment structure, null percentage, value ranges, etc. If the data does not meet these criteria, the pipeline will fail.

    | # | Name | Schema | Validations | Key Notes |
    | --- | --- | --- | --- | --- |
    | 1 | input_features | prd_zone3.dewaxingllm.input_features | Schema, Allowed Values, Null Thresholds, Compound Uniqueness, Column Comments | Allowed values for job types; uniqueness on (JOBID, WELLID); all key fields non-null |
    | 2 | rundetailsmetrics | prd_zone3.dewaxingllm.run_details_metrics | Schema, Ranges, Null Thresholds, Compound Uniqueness, Column Comments | Accuracy metrics 0–100; uniqueness on (mlflow_run_id, load_datetime_utc); all columns non-null |
    | 3 | rundetailsprediction | prd_zone3.dewaxingllm.run_details_prediction | Schema, Null Thresholds, Compound Uniqueness, Column Comments | All critical fields non-null; uniqueness on (WellID, JOBID, mlflow_run_id, load_datetime_utc) |
    | 4 | rundetailsinference | prd_zone3.dewaxingllm.run_details_inference | Schema, Null Thresholds, Compound Uniqueness, Column Comments | Partial nulls allowed (e.g., WaxProperties ≤ 99%); uniqueness on (WellID, JOBID, SwabNo) |
    | 5 | jobsummarymetrics | prd_zone3.dewaxingllm.job_summary_metrics | Schema, Ranges, Null Thresholds, Compound Uniqueness, Column Comments | Accuracy fields 0–100; uniqueness on (mlflow_run_id, load_datetime_utc); all columns non-null |
    | 6 | jobsummaryprediction | prd_zone3.dewaxingllm.job_summary_prediction | Schema, Null Thresholds, Compound Uniqueness, Column Comments | No nulls in core fields; uniqueness on (WellID, JOBID, mlflow_run_id, load_datetime_utc) |
    | 7 | jobsummaryinference | prd_zone3.dewaxingllm.job_summary_inference | Schema, Ranges, Null Thresholds, Compound Uniqueness, Column Comments | Allows small null tolerance (JobSummary ≤ 10%); NumberOfRuns ≥ 0; uniqueness on (WellID, JOBID) |

    For more details, refer to src/tests/data/conftest.yml.
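The compound-uniqueness and null-threshold checks can be illustrated on plain rows (function names are hypothetical; the real tests are defined in src/tests/data/conftest.yml and run against the Delta tables):

```python
def is_compound_unique(rows: list[dict], keys: tuple[str, ...]) -> bool:
    """True if no two rows share the same combination of key-column values."""
    seen = {tuple(row[k] for k in keys) for row in rows}
    return len(seen) == len(rows)

def null_fraction(rows: list[dict], column: str) -> float:
    """Fraction of rows where the column is null, compared against a threshold."""
    return sum(row[column] is None for row in rows) / len(rows)
```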

Schedule: This pipeline is not scheduled for automatic retraining and will run on demand.

Inference Workflow: prd-dewaxingllm-inference

The inference pipeline performs the following steps to update feature and inference tables and send data to the Peloton WellView API. Tasks are parameterized to support real-time, backfill, or CI testing processes. For backfill, the time range of jobs and job IDs can be specified.

  1. Feature Engineering: Runs feature engineering for real-time or backfill processes.

  2. Inference: Executes inference for both models in parallel, with parameters defining the type of inference (CI_test, backfill, realtime) and the time range or job IDs for backfill.

    • job_summary_inference
    • run_details_inference

  3. WellView API Push: Sends data from inference tables to WellView tables. The operation type (CI_test, backfill, realtime), time range, job IDs, and model selection (jobsummary, run_details, or both) can be specified.

Rules for API Integration:

  • Job Summary:

    • Data is sent to the summary column in the Jobs (wvJob) table with the system tag ‘AI Generated’.
    • If an existing summary exceeds 100 characters and does not begin with ‘Summary generated by AI’, it will not be updated.
    • The summary will be updated if it is missing, shorter than 100 characters, or begins with ‘Summary generated by AI’.
    • All AI-generated summaries begin with ‘Summary generated by AI’.
    • If the summary exceeds WellView’s 2000-character limit, the existing human-generated summary is retained; if none exists, a default message is written: “AI-generated job summary exceeded the 2000-character limit. Please see the daily summaries for full details.”

  • Daily Summary:

    • Data is sent to the summary (summaryops) column in the Daily Operations (wvJobReport) table with the system tag ‘AI Generated’.
    • The same character limit and update rules apply as for job summaries.

  • Run Details: Data is sent to the Swab Details (wvSwabDetails) table, populating available fields:

    • SwabNo: Sequential run number (integer)
    • StartDate: Time of the run (datetime)
    • PresTub (SolventOrSteam): Solvent or steam usage (double)
    • PresCas (WaxProperties): Wax properties (double)
    • TankGauge (WaxPercentage): Wax percentage (double)
    • DepthPull: Maximum depth reached (double, meters)
    • VolFluidRec: Volume of fluid recovered (double, cubic meters)
    • TempWH: Wellhead temperature (double, °C; entered only for the first swab)
    • Com: Tool used for the run, marked as AI Generated (string, max 2000 characters)

    • If the Swabs Table (wvSwab) does not exist, a new record is created using the run start time.
    • All new records are tagged as ‘AI Generated’.
    • Existing swab records are retained, and new AI-generated records are added.
    • All Com columns include the ‘AI Generated’ tag.

  • API Integration Logs: All logs with error/success status are saved in prd_zone3.dewaxingllm.api_integration_log.

Schedule: The pipeline runs daily at 3:30 AM Mountain Time.

Monitoring Workflow: prd-dewaxingllm-monitoring

The monitoring pipeline tracks both the job summary and run details models, detecting anomalies, missing data, and performance degradation, and logging alerts for action.

  • Job Summary Monitoring:
    • Verifies new data has been added within the past week.
    • Checks for null values outside defined ranges within the past week.
    • Identifies missing records or errors between API integration logs and inference tables within the past week.
    • Uses a custom LLM to judge completeness, formatting, and relevance of a sample of data from the past week.

  • Run Details Monitoring:
    • Verifies new data has been added within the past week.
    • Checks for null values outside defined ranges within the past week.
    • Identifies missing records or errors between API integration logs and inference tables within the past week.
    • Uses a custom LLM to judge formatting and relevance of a sample of data from the past week.

  • Overall Model Monitoring and Alerts:
    • All models are monitored weekly, with results stored in prd_zone3.dewaxingllm.monitoring_summary.
    • The model_drifted_overall column flags any failed monitoring checks, with additional columns providing details on specific issues.
    • If any table has model_drifted_overall = 1, a SQL query in the Databricks workspace automatically triggers an email alert.
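The model_drifted_overall flag is effectively an OR over the individual weekly checks. A sketch (the check names in the example are assumptions; the real columns live in prd_zone3.dewaxingllm.monitoring_summary):

```python
def model_drifted_overall(checks: dict[str, bool]) -> int:
    """1 if any weekly monitoring check failed (True = failed), else 0."""
    return int(any(checks.values()))
```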

Schedule: The monitoring pipeline runs weekly on Tuesdays at 1 AM Mountain Time.