Estimating oil reserves from high-resolution satellite imagery has become rather popular in our budding geospatial-analytics-from-space industry. Oil is typically stored in tanks with floating roofs, so the fill level of a tank can be estimated from the shadow cast on the inside of the tank as the roof sinks. A pretty neat idea.
How are oil tanks, filled to various degrees, detected? With sufficient training data, a neural network can probably learn to identify them. We have seen the power and flexibility of artificial intelligence algorithms in the past, when we successfully used the same neural network architecture to identify properties with pools in Australia and remote villages in Nigeria. However, training and deploying a well-performing model is expensive, both in time and money. Yes, computation on the cloud is relatively cheap, but the costs start to add up if you’re doing continental-scale feature detection.
Oil tanks are round, they look like bright disks when they are filled, and they are relatively big. Can we take these properties into account and come up with an image filtering algorithm to detect them? The max-tree [1] is a hierarchical representation structure which organizes the image information content in nested connected components based on intensity. In the max-tree, oil tanks appear as nodes which can be selected based on geometric attributes such as size and compactness.
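To make the idea of nested connected components concrete, here is a minimal sketch in plain Python. It is not an efficient max-tree construction: it simply thresholds a toy image at every grey level and labels the 4-connected components, which are the regions that the max-tree nodes represent. The function name `components_at_level` and the toy image are assumptions made for illustration.

```python
from collections import deque

def components_at_level(image, level):
    """Return the 4-connected components of pixels with value >= level.

    Each component is a frozenset of (row, col) tuples.
    """
    rows, cols = len(image), len(image[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < level or (r, c) in seen:
                continue
            # Breadth-first flood fill from an unvisited pixel above the level.
            queue = deque([(r, c)])
            seen.add((r, c))
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] >= level
                            and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(frozenset(pixels))
    return components

# Toy image: dark background (0), a grey plateau (1), two bright spots (3 and 2).
toy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 3, 1, 2, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

levels = sorted({v for row in toy for v in row})
tree_levels = {lv: components_at_level(toy, lv) for lv in levels}
for lv in levels:
    # Component sizes per level; brighter components sit inside darker ones.
    print(lv, [len(comp) for comp in tree_levels[lv]])
```

Every component found at a higher level is contained in exactly one component at the level below, and that containment relation is what gives the max-tree its tree shape.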
Oil tank detection.
Here is an example of max-tree filtering on a small panchromatic image chip from Houston, TX. We have selected the max-tree nodes with area between 100 m² and 3500 m², and compactness greater than 0.97 (where unity is the compactness of a perfect disk).
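The selection described above can be sketched naively in plain Python. Two assumptions are worth stating loudly: the post does not define compactness, so this sketch uses a moment-of-inertia definition, A² / (2πI), which equals 1 for a perfect disk; and instead of building a real max-tree it brute-forces every grey level, which is far slower than the actual filter. All function names, the synthetic scene and the 1 m pixel size are hypothetical.

```python
import math
from collections import deque

def label_components(mask):
    """4-connected components of a boolean grid, as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

def compactness(pixels):
    """Assumed definition: A^2 / (2*pi*I), which is 1 for a perfect disk."""
    area = len(pixels)
    cy = sum(p[0] for p in pixels) / area
    cx = sum(p[1] for p in pixels) / area
    # Polar moment of inertia about the centroid; area / 6 accounts for the
    # extent of each unit-square pixel.
    inertia = sum((y - cy) ** 2 + (x - cx) ** 2 for y, x in pixels) + area / 6.0
    return area * area / (2.0 * math.pi * inertia)

def attribute_filter(image, pixel_size_m, min_area_m2, max_area_m2, min_compactness):
    """Keep pixels of any threshold component that passes both attribute tests."""
    rows, cols = len(image), len(image[0])
    keep = [[False] * cols for _ in range(rows)]
    pixel_area = pixel_size_m ** 2
    for level in sorted({v for row in image for v in row}):
        mask = [[v >= level for v in row] for row in image]
        for pixels in label_components(mask):
            area_m2 = len(pixels) * pixel_area
            if (min_area_m2 <= area_m2 <= max_area_m2
                    and compactness(pixels) >= min_compactness):
                for y, x in pixels:
                    keep[y][x] = True
    return keep

def make_scene(size=60):
    """Synthetic chip: one bright disk, one square roof, one long bright strip."""
    img = [[10] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            if (y - 15) ** 2 + (x - 15) ** 2 <= 100:
                img[y][x] = 200          # disk of radius 10 pixels ("tank")
            elif 35 <= y < 50 and 5 <= x < 20:
                img[y][x] = 150          # 15 x 15 square
            elif 5 <= y < 45 and 40 <= x < 46:
                img[y][x] = 180          # 40 x 6 strip
    return img

scene = make_scene()
strict = attribute_filter(scene, 1.0, 100, 3500, 0.97)
loose = attribute_filter(scene, 1.0, 100, 3500, 0.80)
print(sum(map(sum, strict)), sum(map(sum, loose)))  # kept pixel counts
```

Under this definition a square has compactness 3/π ≈ 0.955, so the 0.97 threshold keeps only the disk, while a looser threshold such as 0.8 also admits the square. The elongated strip is rejected in both cases.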
Max-tree filtering at its finest.
Note that these operations are practically instantaneous for an image of this size. Here is the result of executing the same code on another chip from the same area:
The varying degrees of brightness and the ladders connecting the oil tanks affect the filter performance.
Note how we’re missing a number of the darker tanks. Moreover, the presence of ladders on and between tanks reduces the compactness of the corresponding nodes in the max-tree. This results in chunks of the tanks missing in the filtered image.
If we’re interested only in the larger oil tanks, all we need to do is increase the minimum acceptable area. Here is the result if we set this value to 1000 m² (i.e., a radius of approximately 18 m):
Just big oil tanks.
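The radius quoted for the larger area threshold follows from the area of a disk, A = πr². The short sketch below converts the area thresholds used in this post to radii and, assuming a hypothetical ground sample distance of 0.5 m per pixel, to pixel counts. The function names are made up for illustration.

```python
import math

def disk_radius_m(area_m2):
    """Radius of a disk with the given area, from A = pi * r^2."""
    return math.sqrt(area_m2 / math.pi)

def area_in_pixels(area_m2, pixel_size_m):
    """Number of square pixels of the given size needed to cover the area."""
    return area_m2 / pixel_size_m ** 2

for area in (100, 1000, 3500):
    print(f"{area} m^2 -> radius {disk_radius_m(area):.1f} m, "
          f"{area_in_pixels(area, 0.5):.0f} pixels at 0.5 m (assumed)")
```

An area of 1000 m² corresponds to a radius of about 17.8 m, which is where the "approximately 18 m" figure comes from.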
If we want to increase recall, we can lower the minimum compactness threshold. For example, setting the minimum compactness to 0.8:
We have found more tanks but took a hit in precision.
We have detected more oil tanks but also picked up some noise from objects that are relatively compact but are not disks: the inevitable precision/recall tradeoff. We can remove this noise in a number of ways, one being to feed these results to our crowd to weed out the false positives. You can imagine this workflow at scale: using max-tree filtering to detect oil tank candidates in a large area, then having the crowd clean up the results. It is another version of the crowd/machine combo we’ve deployed in the past.
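For readers less familiar with the two terms, here is a small sketch of how precision and recall are computed from a set of detections and a set of reference objects. The identifiers and the two detection sets are entirely hypothetical toy data, not measurements from the images in this post.

```python
def precision_recall(detected, actual):
    """Precision = TP / detections, recall = TP / actual objects."""
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical toy data: five real tanks (t*), plus noise objects (n*).
actual_tanks = {"t1", "t2", "t3", "t4", "t5"}
strict_detections = {"t1", "t2", "t3"}                    # high threshold
loose_detections = {"t1", "t2", "t3", "t4", "n1", "n2"}   # lower threshold

print(precision_recall(strict_detections, actual_tanks))
print(precision_recall(loose_detections, actual_tanks))
```

In this toy case the looser setting finds one more tank (higher recall) at the cost of two false positives (lower precision), and those false positives are what the crowd would be asked to weed out.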
Vectorizing the raster output, we obtain a GeoJSON file with the bounding boxes of the detected oil tanks.
Oil tank bounding boxes.
Having vectors makes it easier to count. There are 133 oil tanks (give or take!) in this image segment.
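The vectorization and counting steps can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code used to produce the figures: it assumes a north-up binary mask, an upper-left origin given in longitude/latitude and a square pixel size in degrees, all of which are hypothetical values here.

```python
import json
from collections import deque

def bounding_boxes(mask):
    """Return (min_row, min_col, max_row, max_col) per 4-connected component."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0, r1, c1 = min(r0, y), min(c0, x), max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1, c1))
    return boxes

def to_geojson(mask, origin_lon, origin_lat, pixel_deg):
    """Convert component bounding boxes to a GeoJSON FeatureCollection.

    The origin is the upper-left corner of the upper-left pixel; rows
    increase southwards (north-up image assumed).
    """
    features = []
    for r0, c0, r1, c1 in bounding_boxes(mask):
        west = origin_lon + c0 * pixel_deg
        east = origin_lon + (c1 + 1) * pixel_deg
        north = origin_lat - r0 * pixel_deg
        south = origin_lat - (r1 + 1) * pixel_deg
        # Closed exterior ring, counterclockwise.
        ring = [[west, south], [east, south], [east, north],
                [west, north], [west, south]]
        features.append({
            "type": "Feature",
            "properties": {},
            "geometry": {"type": "Polygon", "coordinates": [ring]},
        })
    return {"type": "FeatureCollection", "features": features}

# Toy filtered output containing two detections.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
collection = to_geojson(mask, origin_lon=-95.0, origin_lat=29.75, pixel_deg=1e-5)
print(len(collection["features"]), "detections")
print(json.dumps(collection["features"][0]["geometry"]))
```

Once the detections are features, counting them is just the length of the feature list.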
How about detecting oil tanks on an entire image strip and not just on a tiny chip?
This is the result of processing an entire WorldView-2 (WV2) strip over Cushing, Oklahoma, one of the biggest oil tank farms in the US. Processing takes about 15 minutes on an AWS EC2 r3.2xlarge instance.
Max-tree filtering for oil tank detection on the entire strip.
The orderly spots to the north and south of the image center correspond to oil tanks, while the randomly scattered spots are noise. Here is a close-up:
Pretty good recall and precision in this scene.
And another one:
This is a tough scene, yet the accuracy is acceptable.
Deploying at scale
We’ve built [a GBDX task] that takes as input a panchromatic image, and produces the bounding boxes of the detected oil tanks. The steps performed in the task are morphological operations to remove features such as dark spots, ladders and pipes (which reduce the compactness of oil tanks), followed by max-tree filtering based on area and compactness, and vectorization of the raster output.
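To illustrate why a morphological step helps before measuring compactness, here is a minimal sketch of a binary opening (erosion followed by dilation) with a square structuring element. The exact operations used inside the task are not spelled out in this post and most likely work on grey-scale data, so treat this as a simplified binary analogue. The function names and the toy mask are assumptions.

```python
def erode(mask, radius=1):
    """Keep a pixel only if its whole (2r+1) x (2r+1) neighbourhood is set."""
    rows, cols = len(mask), len(mask[0])
    return [[all(0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                 for dr in range(-radius, radius + 1)
                 for dc in range(-radius, radius + 1))
             for c in range(cols)]
            for r in range(rows)]

def dilate(mask, radius=1):
    """Set a pixel if any pixel in its (2r+1) x (2r+1) neighbourhood is set."""
    rows, cols = len(mask), len(mask[0])
    return [[any(0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                 for dr in range(-radius, radius + 1)
                 for dc in range(-radius, radius + 1))
             for c in range(cols)]
            for r in range(rows)]

def opening(mask, radius=1):
    """Erosion followed by dilation: removes structures thinner than the element."""
    return dilate(erode(mask, radius), radius)

# Toy mask: two 5 x 5 "tanks" joined by a one-pixel-wide "ladder" on row 3.
rows, cols = 7, 15
mask = [[False] * cols for _ in range(rows)]
for r in range(1, 6):
    for c in list(range(1, 6)) + list(range(9, 14)):
        mask[r][c] = True
for c in range(6, 9):
    mask[3][c] = True

opened = opening(mask, radius=1)
print(sum(map(sum, mask)), "->", sum(map(sum, opened)), "pixels")
```

The thin bridge disappears while both blocks survive intact, so each tank becomes its own compact component instead of being merged into one elongated shape.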
We have also put together a walkthrough for deploying the task over Houston, Texas. The results are shown in the slippy map below. The map was created by uploading the Houston image and the detection bounding boxes to Mapbox, and using Mapbox GL JS to display the corresponding raster and vector tilesets.
Oil tank detections in Baytown, TX, shown in green.
How can we improve accuracy? There are many avenues. An immediate step is to use Land Use Land Cover classification on the multispectral image to remove compact features on soil, water or vegetation, which cannot be oil tanks. A more futuristic approach is to combine max-tree filtering with machine learning. Stay tuned for updates!
[1] P. Salembier, A. Oliveras, and L. Garrido. Antiextensive connected operators for image and sequence processing. IEEE Transactions on Image Processing, Vol. 7, No. 4, April 1998.