Cracking open the black box of machine learning and gaining insights from geodata.

Machine learning models are sometimes referred to as “black-box” models because we cannot easily see how they arrive at their predictions. Building a model with high accuracy is the main business goal for most companies that adopt machine learning techniques, but accuracy alone is not enough: the owners also need to know how the model works. For instance, we know that distance to the city centre and distance to school might affect housing value. However, we don’t know whether those relationships are monotonic, linear or even more complex. To understand what a model delivers, we turn to other tools that help us present its outcome.


The partial dependence plot allows us to visualize the functional relationship between a feature (here, distance to the city centre by bike) and the target (housing value) in a non-transparent model such as a random forest. More specifically, the partial dependence function tells us the average marginal effect on the prediction for a given value of the feature.
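To make the idea concrete, here is a minimal sketch of how a partial dependence curve can be computed by hand: for each value on a grid, the feature of interest is overwritten in every row, the model predicts again, and the predictions are averaged. Everything in it is an assumption made for illustration, not StraTopo's model or data: the `partial_dependence` helper, the `toy_predict` stand-in (a simple linear function rather than a random forest), the column order and the sample rows are all invented.

```python
from typing import Callable, List, Sequence

Row = List[float]


def partial_dependence(
    predict: Callable[[List[Row]], List[float]],
    rows: Sequence[Row],
    feature_index: int,
    grid: Sequence[float],
) -> List[float]:
    """Average prediction when one feature is forced to each grid value."""
    averages = []
    for value in grid:
        # Copy every row, overwriting only the feature of interest.
        modified = []
        for row in rows:
            new_row = list(row)
            new_row[feature_index] = value
            modified.append(new_row)
        predictions = predict(modified)
        averages.append(sum(predictions) / len(predictions))
    return averages


# Hypothetical stand-in for a fitted model. Columns are assumed to be
# [distance_to_centre_km, distance_to_school_km]; coefficients are invented.
def toy_predict(rows: List[Row]) -> List[float]:
    return [300.0 - 10.0 * r[0] - 5.0 * r[1] for r in rows]


# Invented sample rows, not real housing data.
sample_rows = [[2.0, 1.0], [6.0, 0.5], [12.0, 3.0]]
grid_km = [0.0, 5.0, 10.0]

# One averaged prediction per grid value; plotting grid_km against
# pd_values gives the partial dependence curve for distance to centre.
pd_values = partial_dependence(toy_predict, sample_rows, 0, grid_km)
print(list(zip(grid_km, pd_values)))
```

With a real fitted model, `toy_predict` would be replaced by the model's own prediction function. Libraries such as scikit-learn also ship this computation ready-made (for example `sklearn.inspection.partial_dependence` and `PartialDependenceDisplay.from_estimator`), which is usually preferable to a hand-rolled loop.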

By integrating geodata provided by StraTopo, the machine learning-based housing valuation model explains how location attributes affect house value. 

The picture shows that housing value falls as we move further out into the suburbs of the Limburg area, until the distance to the centre reaches 7 km. The value then rises once the distance exceeds 7 km, and stabilizes after 11 km.

In conclusion, we can see that the price drops dramatically as the distance to the city centre increases, which is a common pattern. This graph has been prepared for model visualization, and StraTopo is currently working on analyzing other factors and presenting their influence.

Want to know more or integrate StraTopo’s data? Contact StraTopo via