
Well-Level Allocation with Machine Learning

Background

In states like Texas, oil production is often reported only at the lease level. This makes it challenging to understand well-level production trends, which are critical for analysis, investment, and operational decision-making. Traditional approaches rely on decline curves or rigid proportional splits, which can introduce bias and limit adaptability.

Our Methodology

We propose a modern, transparent approach to well-level allocation that leverages public data and machine learning:

  1. Train on States with Well-Level Data

    In many jurisdictions outside Texas (e.g., North Dakota, New Mexico, Colorado, Oklahoma), regulators publish per-well monthly production.

    • These datasets provide the ground-truth labels needed for training.
    • Features include well attributes (completion date, lateral length, formation), historical performance, and operating status.
  2. Machine Learning Prediction

    A regression model (e.g., gradient boosting, random forest, or a neural network) is trained to predict per-well monthly oil output from the available features.

    • The model captures nonlinearities and differences across operators, basins, and well types.
    • It generalizes learned patterns to wells in Texas, where well-level oil is not reported.
  3. Constraint-Based Reconciliation

    For each lease-month in Texas:

    • The model generates a predicted oil volume for each well.
    • Predictions are then scaled proportionally so that the sum of all well estimates matches the official lease total reported to the Railroad Commission of Texas.
    • This guarantees that the well estimates always agree with the reported lease total, while still providing well-level granularity.
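
The proportional scaling step can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the production implementation: the function name reconcile_lease, the well identifiers, and the volumes are hypothetical. Clipping negative predictions to zero and falling back to an equal split when every prediction is zero are also assumptions, since the text above does not say how those cases are handled.

```python
from typing import Dict


def reconcile_lease(predictions: Dict[str, float], lease_total: float) -> Dict[str, float]:
    """Scale per-well predictions so that they sum to the reported lease total.

    `predictions` maps a (hypothetical) well identifier to the model's
    predicted oil volume for one lease-month; `lease_total` is the volume
    reported for that lease-month.
    """
    if not predictions:
        return {}

    # Assumption: a regression model can emit small negative volumes,
    # so clip them to zero before computing shares.
    clipped = {well: max(volume, 0.0) for well, volume in predictions.items()}
    predicted_sum = sum(clipped.values())

    if predicted_sum == 0.0:
        # Assumption: with no usable signal, split the lease total equally.
        share = lease_total / len(clipped)
        return {well: share for well in clipped}

    # Each well keeps its predicted share of the lease, rescaled to the reported total.
    return {well: volume * lease_total / predicted_sum for well, volume in clipped.items()}


# Hypothetical lease-month: the model predicts 2,500 bbl across three wells,
# but the reported lease total is 2,000 bbl.
allocated = reconcile_lease(
    {"well_A": 1200.0, "well_B": 800.0, "well_C": 500.0},
    lease_total=2000.0,
)
print(allocated)  # {'well_A': 960.0, 'well_B': 640.0, 'well_C': 400.0}
```

In this example each prediction is multiplied by 2,000 / 2,500 = 0.8, so the relative ranking of the wells comes entirely from the model while the lease-level sum is fixed by the reported total.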

Advantages Over Traditional Methods

  • Data-Driven: Uses real per-well training labels from other states.
  • Adaptive: Learns new allocation patterns automatically as new data is ingested.
  • Transparent: The methodology is simple to explain, with no proprietary black-box code.
  • Extensible: Can incorporate gas/water data, GOR trends, and even well test results.

Why It Matters

By combining public datasets, machine learning, and reconciliation techniques, we create a defensible, repeatable allocation framework. This approach avoids the pitfalls of proprietary black-box allocation methods and gives stakeholders confidence that the numbers are both accurate and explainable.
