Background
In states like Texas, oil production is often reported only at the lease level. This makes it challenging to understand well-level production trends, which are critical for analysis, investment, and operational decision-making. Traditional approaches rely on decline curves or rigid proportional splits, which can introduce bias and limit adaptability.
Our Methodology
We propose a modern, transparent approach to well-level allocation that leverages public data and machine learning:
Train on States with Well-Level Data
In many jurisdictions outside Texas (e.g., North Dakota, New Mexico, Colorado, Oklahoma), regulators publish per-well monthly production.
- These datasets provide the ground truth labels needed for training.
- Features include well attributes (completion date, lateral length, formation), historical performance, and operating status.
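As a rough illustration of how the attributes listed above could become model inputs, the sketch below flattens one well-month record into a numeric feature vector. All field names, the `FORMATIONS` vocabulary, and the choice of prior-month oil as the "historical performance" feature are assumptions made for this example, not a description of any regulator's schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class WellMonth:
    """One well in one production month (all field names are illustrative)."""
    completion_date: date
    lateral_length_ft: float
    formation: str
    prior_month_oil_bbl: float  # simple proxy for historical performance
    is_active: bool             # operating status
    production_month: date


# Hypothetical formation vocabulary used for one-hot encoding.
FORMATIONS = ["BAKKEN", "WOLFCAMP", "NIOBRARA"]


def months_on_production(row: WellMonth) -> int:
    """Whole calendar months between completion and the production month (never negative)."""
    delta = (row.production_month.year - row.completion_date.year) * 12
    delta += row.production_month.month - row.completion_date.month
    return max(delta, 0)


def to_features(row: WellMonth) -> List[float]:
    """Flatten a well-month record into a numeric feature vector."""
    one_hot = [1.0 if row.formation == f else 0.0 for f in FORMATIONS]
    return [
        float(months_on_production(row)),
        row.lateral_length_ft,
        row.prior_month_oil_bbl,
        1.0 if row.is_active else 0.0,
        *one_hot,
    ]


example = WellMonth(
    completion_date=date(2020, 1, 15),
    lateral_length_ft=9800.0,
    formation="BAKKEN",
    prior_month_oil_bbl=4200.0,
    is_active=True,
    production_month=date(2021, 3, 1),
)
features = to_features(example)
```

The same function would be applied to Texas wells at prediction time, so only attributes available in both the training states and Texas should be used.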
Machine Learning Prediction
A regression model (e.g., gradient boosting, a random forest, or a neural network) is trained to predict per-well monthly oil output from the available features.
- The model captures nonlinearities and differences across operators, basins, and well types.
- It generalizes learned patterns to wells in Texas, where well-level oil is not reported.
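A minimal sketch of this step, assuming scikit-learn's `GradientBoostingRegressor` as the model and purely synthetic data in place of real regulator files. The three feature columns and the made-up label relationship are placeholders chosen for illustration only.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for labeled well-months from states that publish per-well data.
# Columns: months on production, lateral length (ft), prior-month oil (bbl).
n = 500
X_train = np.column_stack([
    rng.integers(1, 120, size=n),
    rng.uniform(4000, 12000, size=n),
    rng.uniform(0, 20000, size=n),
])
# Made-up label: this month's oil is slightly below the prior month, plus noise.
y_train = 0.95 * X_train[:, 2] + rng.normal(0, 100, size=n)

model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)

# Hypothetical Texas wells on one lease, described by the same three features.
X_texas = np.array([
    [6, 10000.0, 15000.0],
    [48, 7500.0, 3000.0],
    [96, 5000.0, 800.0],
])
raw_pred = model.predict(X_texas)
# Regression outputs can be slightly negative; volumes cannot, so clip at zero.
well_pred = np.clip(raw_pred, 0.0, None)
```

In practice the model would be evaluated on held-out wells from the training states before being applied to Texas, since no Texas well-level labels exist to check against.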
Constraint-Based Reconciliation
For each lease-month in Texas:
- The model generates a predicted oil volume for each well.
- Predictions are then scaled proportionally so that the sum of all well estimates matches the official lease total reported to the Railroad Commission of Texas.
- This guarantees that the well estimates sum exactly to the reported lease total while still providing well-level granularity; the split among wells remains a model estimate.
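The proportional scaling described above amounts to multiplying each well's prediction by (reported lease total) / (sum of predictions on that lease). A minimal sketch follows. The function name `reconcile_lease`, the well identifiers, and the equal-split fallback for an all-zero lease are assumptions of this example, since the text does not specify how that edge case is handled.

```python
from typing import Dict


def reconcile_lease(predictions: Dict[str, float], lease_total: float) -> Dict[str, float]:
    """Scale per-well predictions so they sum to the reported lease total.

    Falls back to an equal split when every prediction is zero (an assumption
    of this sketch, not something the methodology text specifies).
    """
    if not predictions:
        return {}
    # Treat negative model outputs as zero before scaling.
    clipped = {well: max(p, 0.0) for well, p in predictions.items()}
    pred_sum = sum(clipped.values())
    if pred_sum == 0.0:
        share = lease_total / len(clipped)
        return {well: share for well in clipped}
    factor = lease_total / pred_sum
    return {well: p * factor for well, p in clipped.items()}


# Model predicts 1,000 bbl across three wells; the lease reported 1,200 bbl.
allocated = reconcile_lease(
    {"well_A": 600.0, "well_B": 300.0, "well_C": 100.0},
    lease_total=1200.0,
)
```

Here every prediction is scaled by 1200 / 1000 = 1.2, so the wells keep their relative shares (60%, 30%, 10%) while the allocated volumes sum to the reported total.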
Advantages Over Traditional Methods
- Data-Driven: Uses real per-well training labels from other states.
- Adaptive: Can be retrained as new data is ingested, so allocation patterns update over time.
- Transparent: The methodology is simple to explain and does not depend on proprietary black-box code.
- Extensible: Can incorporate gas/water data, GOR trends, and even well test results.
Why It Matters
By combining public datasets, machine learning, and reconciliation techniques, we create a defensible, repeatable allocation framework. This approach avoids the pitfalls of proprietary black-box allocation methods and gives stakeholders confidence that the numbers are both accurate and explainable.