Mapping sinkholes of Switzerland based on DEM processing
Gwenaëlle Salamin (ExoLabs) - Clémence Herny (ExoLabs) - Clotilde Marmy (ExoLabs) - Alessandro Cerioni (Canton of Geneva) - Roxane Pott (swisstopo)
Proposed by the geology domain of the Federal Office of Topography swisstopo - PROJ-DOLINES
July to January 2024 - Published in June 2025
All scripts are available on GitHub.
This work by STDL is licensed under CC BY-SA 4.0
Abstract: Sinkholes, or dolines, are key geomorphological features in Switzerland’s karst landscapes, yet their mapping remains inconsistent due to heterogeneous inventories and subjective expert interpretation. This study aimed to standardize and accelerate sinkhole cartography across diverse Swiss geological contexts by testing four automatic detection methods using LiDAR-derived digital terrain models. Each method was optimized based on ground truth data and evaluated through quantitative metrics and expert validation. While six of eleven areas of interest showed satisfactory detection of sinkhole-dense zones, smaller sinkholes were frequently missed, and overall automation was hindered by an incomplete ground truth dataset. These findings underscore the necessity of establishing a clear, standardized definition of sinkholes and improving ground truth quality before reliable automatic detection can be achieved.
1 Introduction
Sinkholes or "dolines" are funnel-shaped depressions characteristic of karstic environments. In Switzerland, they are mapped on national and geological maps by the Federal Office of Topography (swisstopo). These maps are based on the swissTLM3D and GeoCover products respectively. Unfortunately, these inventories are very heterogeneous. As there is no standard definition in Switzerland for the characterization of sinkholes, their mapping always depends on the judgement of the geologists and cartographers in charge. This is not in line with the mission of swisstopo to provide stable and precise data over the whole of Switzerland.
The establishment of an automatic detection method would help to standardize the mapping of sinkholes over the Swiss territory. Indeed, it would define sinkholes based on clear criteria and limit variability across experts to the post-processing controls.
Recent advancements in sinkhole detection methods have significantly improved the identification of these karstic landforms. In particular, the use of LiDAR-based Digital Terrain Models (DTM) has emerged as a cornerstone. Studies such as Telbisz et al. (2016)1 and Wu et al. (2016)2 demonstrated the superior resolution and reliability of LiDAR for mapping sinkholes compared to traditional methods, allowing for precise morphometric analyses even in vegetated areas.
A fundamental approach in sinkhole detection involves comparing the original digital elevation model (DEM) with its filled or smoothed version to identify depressions characteristic of karst terrain. Wu et al. (2016)2 utilized this principle in their automated delineation of sinkholes, where subtracting the filled DEM from the original effectively highlighted depressions. Bauer and Wagner (2024)3 used this method to detect sinkholes before performing morphometric analysis. Ferreira et al. (2022) extended this concept using UAV-based mapping in Brazilian karst, demonstrating that high-resolution DEMs derived from drone imagery, when compared to their filled counterparts, could efficiently reveal sinkhole boundaries. Similarly, Touya et al. (2019)4 used the difference between a smoothed DEM and its original version to detect sinkholes in French karstic plateaus, improving the generation of contour lines. This simple yet powerful method allows for precise and scalable detection, making it a valuable tool in both automated geomorphological analysis and heritage protection.
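The fill-and-subtract principle can be sketched in a few lines. The example below is a minimal illustration, assuming a priority-flood fill with 8-connectivity on a small grid stored as nested lists; the cited studies may rely on other fill implementations, and the function names `fill_depressions` and `depression_depth` are hypothetical. The depth is computed as filled minus original so that depressions appear as positive values.

```python
import heapq


def fill_depressions(dem):
    """Priority-flood fill: raise each cell to the lowest level at which it can spill to the border."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    visited = [[False] * cols for _ in range(rows)]
    heap = []
    # Border cells can always drain out of the grid, so they seed the queue.
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (filled[r][c], r, c))
                visited[r][c] = True
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                    visited[nr][nc] = True
                    # A cell lower than the current spill level is raised to it.
                    filled[nr][nc] = max(filled[nr][nc], level)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


def depression_depth(dem):
    """Depth of each cell below the filled surface (0 outside depressions)."""
    filled = fill_depressions(dem)
    return [[f - z for f, z in zip(frow, zrow)] for frow, zrow in zip(filled, dem)]


# Toy DEM: a funnel-shaped pit whose rim is lowest (4.5 m) on the bottom edge.
dem = [
    [5.0, 5.0, 5.0, 5.0, 5.0],
    [5.0, 4.0, 3.0, 4.0, 5.0],
    [5.0, 3.0, 1.0, 3.0, 5.0],
    [5.0, 4.0, 3.0, 4.0, 5.0],
    [5.0, 5.0, 5.0, 4.5, 5.0],
]
depth = depression_depth(dem)
```

Cells with a positive depth form the candidate depression; thresholding and vectorizing this raster would be the next step in a real workflow.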
A different method is used in Slovenia, Slovakia and Hungary. The method developed by Obu and Podobnikar (2013)5, which combined hydrology modelling and morphological filters, provided a robust framework for identifying karstic features. Building on this approach, Telbisz et al. (2016)1 refined the method by integrating pre-processing for noise removal in the DTM. These improvements allowed for detailed analyses of sinkhole shapes, depths, and distributions6,7. Čonč et al. (2022)8 further utilized the refined methodology for morphological analysis, demonstrating its applicability in ecological studies, such as investigating the relationship between sinkhole landscapes and wild felid habitats. This evolution highlights the efficiency and relevance of this method, enabling its application across interdisciplinary domains.
Kokalj and Hesse (2017)9 introduced advanced visualization techniques to enhance the interpretation of subtle terrain features, such as sinkholes, from airborne laser scanning (ALS) data, by emphasizing their openness and shading in three-dimensional models. Building on the strengths of ALS data, Berthe (2021)10 combined LiDAR-based imaging with semi-automated cartographic techniques to map exokarst features on the Mountain of Reims, France. This approach leveraged the detailed elevation data to derive closed contours and advanced visualization strategies to help experts identify features, making it possible to produce accurate, scalable representations of karst landscapes. Together, these studies highlight the critical role of visualization and semi-automated methods in improving the precision and accessibility of sinkhole detection and mapping. However, those methods do not yet fully address the need for automation and precision in karst geomorphology.
The goal of this project is to develop one or several methods that can automatically detect sinkholes across the various geological contexts of Switzerland. Four different methods were tested and compared. In this documentation, only a brief description of each method is provided. Please refer to the cited articles for their full description.
2 Data
The project makes use of a custom ground truth, the swissALTI3D DEM and several vector layers used to delimit coherent sinkhole areas.
2.1 Ground truth

Figure 1: Areas of interest by lithological type.
The dataset used as the ground truth (GT) was produced by swisstopo's expert for 11 areas of interest (AOI) (Fig. 1). Each sinkhole was digitized as a point placed near its center. The AOI were selected to cover each type of lithology likely to host sinkholes in Switzerland and areas where the quality of the GeoCover product is deemed acceptable. The number of GT sinkholes in each AOI is shown in Table 1.
Name | Type | # of sinkholes |
---|---|---|
Cadagno (TI) | Marls on karst | 115 |
Chasseral (BE) | Marls on karst | 259 |
Chueboden (SG) | Molasse | 110 |
Gryon (VD) | Evaporites | 16 |
Les Verrières (NE) | Marls on karst | 157 |
Piz Curtinatsch (GR) | Various sedimentary rocks with a limestone matrix | 3 |
S-Chanf (GR) | Depressions in loose rock | 29 |
Schwarzsee (FR) | Molasse | 51 |
Silberen (SZ) | Bare karst | 106 |
Sissach (BL) | Marls on karst | 42 |
Val d'Alvra (GR) | Various sedimentary rocks with a limestone matrix | 41 |
Table 1: Areas of interest.
We can split the areas of interest into two larger groups: those with standard sinkholes and those with sinkholes in bare karst. In the Jura Mountains, sinkholes are mostly circular and funnel-shaped. They are smaller than in most of the literature, as the smallest ones have an area of around 40 m². In the areas of bare karst, sinkholes are more often elliptical, and their area can go down to 20 m². We note that in these areas the expert could not objectively determine from the aerial data whether each hole was a sinkhole, and the ground truth is not exhaustive.
2.2 swissALTI3D
The product swissALTI3D is a digital elevation model without vegetation and elevated structures. It is updated globally every six years with LiDAR data. Some local updates are done automatically with the aerial imagery or manually in case of major changes. The accuracy in all three dimensions is 0.3 m for all AOI, except for Le Chasseral AOI where the accuracy is 0.5 m.
2.3 Additional datasets for pre- and post-processing
Several datasets were used for the post-processing of the detected depressions. In the case of the IGN's method, they were used in pre-processing to establish the mask for possible sinkhole areas.
- non-sedimentary zones: The non-sedimentary zones were provided by the expert in the form of polygons. He extracted this layer from GeoCover. By definition, it is not possible to have sinkholes outside of sedimentary soils.
- settlement areas: The settlement dataset is established by the Federal Office for Spatial Development and delineates in the form of polygons every area built-up for residential, work, or leisure purposes. We consider that any sinkhole in those areas would have been filled.
- water bodies: The water bodies are extracted in the form of polygons from the swissTLM3D product. This layer allows us to filter out the detected depressions that are water bodies or part of a water body.
- rivers: The river dataset is established by the Federal Office for the Environment and traces the Swiss rivers in the form of lines. A buffer of 3 m is applied to the lines, i.e. each riverbed is considered to be 6 m wide. We consider that depressions overlapping a riverbed cannot be sinkholes.
3 Method
Four methods based on the following approaches were implemented for this project:
- difference between a smoothed and the original DEM
- delineation of the bottoms of watersheds
- delineating the nested hierarchy of depressions
- stochastic determination of depressions
The area on which sinkholes are possible was always limited in pre- or post-processing based on the lithological type.
For all methods, the parameters were optimized with optuna11 based on the F2 score for the different types of lithology and for all AOI at once.
3.1 IGN's method

Figure 2: Steps of the IGN's method for sinkhole detection, image from Touya et al. (2019)
The method, which we call the IGN's method, is based on the sinkhole detection part (Fig. 2) of the process described by Touya et al. (2019)4 to generate contour lines in karstic plateaus. The creation of the mask for flat areas was done differently. In our case, the mask not only excluded areas with a steep slope, but also built-up areas, non-sedimentary zones, water bodies and rivers. An opening filter was applied to the binary mask to remove noise and ensure that potential zones have a sufficient size.
3.2 Watershed method
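The effect of the opening filter on the mask can be illustrated as follows. This is a minimal pure-Python sketch, assuming a 3 x 3 square structuring element and treating cells outside the grid as masked out; the structuring element and the helper names `erode`, `dilate` and `opening` are illustrative assumptions, not the project's implementation.

```python
NEIGHBOURHOOD = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


def erode(mask):
    """Keep a cell only if its whole 3 x 3 neighbourhood is inside the mask."""
    rows, cols = len(mask), len(mask[0])
    return [
        [
            all(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr, dc in NEIGHBOURHOOD
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def dilate(mask):
    """Set a cell if any cell of its 3 x 3 neighbourhood is inside the mask."""
    rows, cols = len(mask), len(mask[0])
    return [
        [
            any(
                0 <= r + dr < rows and 0 <= c + dc < cols and mask[r + dr][c + dc]
                for dr, dc in NEIGHBOURHOOD
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def opening(mask):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask))


# Toy mask of potential sinkhole zones: one 3 x 3 zone and one isolated noisy cell.
grid = [
    "0000000",
    "0111000",
    "0111010",
    "0111000",
    "0000000",
]
mask = [[ch == "1" for ch in row] for row in grid]
opened = opening(mask)
```

The isolated cell disappears because it cannot contain the structuring element, while the 3 x 3 zone is restored by the dilation. A larger structuring element would enforce a larger minimum zone size.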

Figure 3: Steps of the watershed method, image from Telbisz et al. (2016).
The watershed method (Fig. 3) is based on the article by Telbisz et al. (2016)1. It considers that each sinkhole can be detected as a small watershed or is at the bottom of a watershed.
The initial method was tested in ArcGIS. Here, we implemented it with the functions from WhiteboxTools12 with the following adaptations:
- the Sink function produced depressions well beyond the expected area; the FindNoFlowCells function was used instead,
- no function was already available to perform the zonal fill; we developed our own code for this part.
Post-processing as described in Section 3.5 was performed to remove as many false positives as possible.
3.3 Level-set method

Figure 4: Steps of the level-set method as defined by Wu et al. (2019).
The level-set method (Fig. 4) is based on the article by Wu et al. (2019)13. It determines the nested hierarchy of depressions and was implemented by the main author in the lidar library14. We noticed that the output regions, i.e. compound depressions, were sometimes much wider than the expected area. Therefore, we filtered the produced dataset to keep only level 1 depressions. In exceptional cases where all the level 1 depressions in a larger depression are very small, the corresponding level 2 depression is kept instead.
Post-processing as described in Section 3.5 was performed to remove as many false positives as possible.
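The selection rule on depression levels described above could look like the sketch below. It is only an illustration under stated assumptions: the record fields (`id`, `level`, `area`, `parent`), the `min_area` threshold of 20 m² and the function name `select_depressions` are hypothetical and do not reflect the actual output schema of the lidar library or the thresholds used in the project.

```python
def select_depressions(depressions, min_area=20.0):
    """Keep level 1 depressions, or their level 2 parent when all of them are very small."""
    lookup = {dep["id"]: dep for dep in depressions}
    # Group level 1 depressions by the depression that contains them.
    children_by_parent = {}
    for dep in depressions:
        if dep["level"] == 1:
            children_by_parent.setdefault(dep["parent"], []).append(dep)

    kept = []
    for parent_id, children in children_by_parent.items():
        parent = lookup.get(parent_id)
        all_tiny = all(child["area"] < min_area for child in children)
        if parent is not None and parent["level"] == 2 and all_tiny:
            # Exceptional case: fall back to the enclosing level 2 depression.
            kept.append(parent["id"])
        else:
            kept.extend(child["id"] for child in children)
    return sorted(kept)


# Hypothetical records: areas in square metres, parent is the enclosing depression id.
depressions = [
    {"id": 1, "level": 1, "area": 150.0, "parent": 10},
    {"id": 2, "level": 1, "area": 90.0, "parent": 10},
    {"id": 3, "level": 1, "area": 5.0, "parent": 11},
    {"id": 4, "level": 1, "area": 8.0, "parent": 11},
    {"id": 5, "level": 1, "area": 60.0, "parent": None},
    {"id": 10, "level": 2, "area": 900.0, "parent": None},
    {"id": 11, "level": 2, "area": 120.0, "parent": None},
]
selected = select_depressions(depressions)
```

In this toy case, depressions 1, 2 and 5 are kept as level 1 features, while the two very small depressions 3 and 4 are replaced by their level 2 parent 11.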
3.4 Stochastic depression analysis
WhiteboxTools12 has a function to perform a stochastic analysis of depressions within a DEM. This simple approach was tested and the results were post-processed as described in Section 3.5.
As this method is memory-intensive and very time-consuming, we did not optimize the parameters for the AOI of marls on karst or for all AOI at once.
3.5 Post-processing
To fit the methods to the Swiss geological landscape, we added the following post-processing steps:
- simplify depression geometry with the Visvalingam-Whyatt algorithm
- remove depressions outside of sedimentary soils
- remove depressions in settlement areas
- remove depressions intersecting too much with a lake
- remove depressions intersecting too much with a river, except for the following river types:
  - mountain rivers with a low flow rate in the Jura
  - alpine rivers with a low flow rate
- remove depressions based on:
  - minimum and maximum area
  - minimum compactness
  - minimum and maximum depth
  - minimum diameter
  - maximum standard deviation in elevation
The zonal statistics of the depression, namely depth, standard deviation in elevation, diameter, area, and compactness, were determined after the geometry simplification.
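The attribute-based filter at the end of the list can be sketched as follows. This is a minimal illustration, assuming that compactness is measured with the Polsby-Popper ratio 4πA/P² (1 for a circle, close to 0 for elongated shapes); the report does not state which compactness definition was used, and every threshold value in `THRESHOLDS` is a hypothetical placeholder rather than an optimized parameter from the project.

```python
import math
from dataclasses import dataclass


@dataclass
class Depression:
    area: float           # m2, after geometry simplification
    perimeter: float      # m
    depth: float          # m
    diameter: float       # m
    elevation_std: float  # m, standard deviation of the elevation inside the polygon

    @property
    def compactness(self):
        # Polsby-Popper ratio (assumed definition): 1 for a circle.
        return 4 * math.pi * self.area / self.perimeter ** 2


# Hypothetical thresholds, for illustration only.
THRESHOLDS = {
    "min_area": 20.0,
    "max_area": 5000.0,
    "min_compactness": 0.3,
    "min_depth": 0.5,
    "max_depth": 30.0,
    "min_diameter": 4.0,
    "max_elevation_std": 5.0,
}


def is_candidate(dep, t=THRESHOLDS):
    """Return True if the depression passes every attribute filter."""
    return (
        t["min_area"] <= dep.area <= t["max_area"]
        and dep.compactness >= t["min_compactness"]
        and t["min_depth"] <= dep.depth <= t["max_depth"]
        and dep.diameter >= t["min_diameter"]
        and dep.elevation_std <= t["max_elevation_std"]
    )


# A circular depression of radius 5 m and an elongated ditch-like depression.
round_dep = Depression(area=math.pi * 25, perimeter=2 * math.pi * 5,
                       depth=1.5, diameter=10.0, elevation_std=0.4)
ditch = Depression(area=60.0, perimeter=80.0,
                   depth=1.0, diameter=38.0, elevation_std=0.3)
candidates = [dep for dep in (round_dep, ditch) if is_candidate(dep)]
```

Only the circular depression is retained here: the ditch fails the compactness test, which is the kind of elongated feature (road embankment, stream bed) the filter is meant to discard.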
3.6 Metrics
To evaluate the results, the reliability of the detections, namely the precision P, and their exhaustiveness, namely the recall R, were computed for each area and globally. They were then combined to obtain the F2 score, which weights recall higher than precision. The respective formulas are presented below:
- \(P = \frac{TP}{TP + FP}\)
- \(R = \frac{TP}{TP + FN}\)
- \(F2 = 5 \frac{P\;\cdot\;R}{4P\; +\; R}\)
with:
- TP: true positives of the class, i.e. a detection overlapping a label
- FN: false negative of the class, i.e. a missed label
- FP: false positive of the class, i.e. an excess detection
The F2 score was used as the discriminating factor when comparing two methods. It was selected because we know the ground truth to be precise and 50% to 70% exhaustive. The best results would then have a high recall but not necessarily a high precision.
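The three metrics defined above can be computed from the counts with a short helper. The counts in the example are illustrative values chosen for the demonstration, not figures taken from the project; the function name `precision_recall_f2` is likewise an assumption.

```python
def precision_recall_f2(tp, fp, fn):
    """Compute precision, recall and F2 score from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = 4 * precision + recall
    # F2 weights recall higher than precision.
    f2 = 5 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f2


# Illustrative counts: 15 labels found, 14 excess detections, 1 missed label.
precision, recall, f2 = precision_recall_f2(tp=15, fp=14, fn=1)
print(f"P = {precision:.3f}, R = {recall:.3f}, F2 = {f2:.3f}")
```

With these counts, roughly half of the detections are excess ones, yet the F2 score stays close to the recall. This is the intended behaviour when the ground truth is known to be incomplete and part of the apparent false positives may be real sinkholes.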
4 Bare karst
There is only one AOI for bare karst. It is at Silberen (SZ) and contains 106 sinkholes.
4.1 Results
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.11 | 0.61 | 0.32 |
Watershed | 0.10 | 0.53 | 0.28 |
Level-set | 0.08 | 0.72 | 0.27 |
Stochastic | 0.06 | 0.57 | 0.22 |
Table 2: Metrics for each method over the bare karst area with the results for the best F2 score in bold.
All the methods yielded a precision of 0.11 or lower (Tab. 2), which is very low. The recall varies more than the precision, between 0.53 and 0.72. The IGN's method provided the best results with an F2 score of 0.32, thanks to having the best precision. Since the precision is extremely low anyway, we could instead choose the method with the best recall, which is the level-set method. However, checking the results would then take substantially more time, as there would be roughly 1000 detections against 600 for the IGN's method. These two methods have an acceptable recall, but all the other metrics are too low.

Figure 5: Detected sinkholes from the level-set method and GT sinkholes over an area of bare karst with several fissures.
In the end, the results of the level-set method were provided to the expert for a visual inspection. He reported that the sinkholes are poorly mapped and that the results would not be usable in this form. Many small sinkholes are not recognised and several sinkholes are often grouped into large incomprehensible geometries. In addition, the typical distribution of sinkholes in lines along fractures or fissures is not mapped well enough (Fig. 5).

Figure 6: Dolines produced by the level-set method over the bare karst area at Silberen, represented through their centroid and tagged as TP, FP, or FN after a comparison with the ground truth.
When visualising the whole area, we note that the detections are spread everywhere, except for the Rätschtal in the north-west of the area (Fig. 6). The density of detections is neither higher nor lower in the zones where ground truth was produced.
4.2 Discussion
No correspondence can be observed between the ground truth and the detections. Either the digitized sinkholes were not evenly distributed and largely underestimate the total number of elements, or the results are completely wrong. The visualization of the results leads us to think that both possibilities are true.
The sinkholes in the bare karst are very numerous and difficult to define, and therefore very difficult to map. The region is so dense in depressions that it would be very time-consuming for someone to annotate every single one over such a large area. A smaller area of interest should perhaps be defined on such a complex topography.
If not all detections are sinkholes, better criteria than the ones defined in this project should be determined. Currently, it is not clear when a depression is a sinkhole or not, leading to the detection of a very high number of depressions that were not labelled as sinkholes by the expert.
Among false positives, a variety of shapes and depths was observed. There is no obvious reason why some GT sinkholes are missed while so many depressions are detected. We note however that a large part of the false negatives are very small sinkholes. As mentioned in the expert's review, very small sinkholes are not easily detected. However, detecting smaller objects would probably further deteriorate the precision based on the current ground truth.
5 Depressions in loose rock
There is only one AOI for depressions in loose rock. It is at S-Chanf (GR) and contains 29 sinkholes.
5.1 Results
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.75 | 0.62 | 0.64 |
Watershed | 0.70 | 0.66 | 0.66 |
Level-set | 0.69 | 0.62 | 0.63 |
Stochastic | 0.58 | 0.62 | 0.61 |
Table 3: Metrics for each method over the area of depressions in loose rock with the results for the best F2 score in bold.
The detection quality is similar across the methods, with the F2 score varying by 0.05 points (Tab. 3). The watershed method is the best, as it has the highest recall and F2 score. The IGN's method has the highest precision. The results of the IGN's method were provided to the expert for a visual inspection, as its F2 score is close to the best one and it produces fewer absurd false positives, as visible in Figure 7.
Based on the metrics, we would qualify the result as average.

Figure 7: Sinkholes detected through the IGN's method and the watershed method in the south-west of the AOI in S-Chanf.
The expert was quite satisfied with the results, as numerous sinkholes were correctly recognized. However, numerous other sinkholes were missed. In addition, he found a few false positives that are genuinely not sinkholes, as opposed to sinkholes missing from the ground truth. These results could be used to create a map showing where sinkholes are generally located.

Figure 8: Sinkholes produced by the IGN's method over the DLR area at S-Chanf, represented through their centroid and tagged as TP, FP, or FN after a comparison with the ground truth.
Most of the detected sinkholes are on the east bank of the river Inn (Fig. 8), like the GT sinkholes. In the upper half of the area, the false positives as well as the false negatives are located above steep round depressions. In the lower half of the area, they are located in a sparse forest with a very rough terrain (Fig. 7), making depressions and sinkholes hard to delimit and distinguish.
5.2 Discussion
As mentioned by the expert, the general position of sinkholes is correct. However, some large sinkholes are missed even though their form is easily visible in the terrain.
Concerning false positive detections, we could distinguish between two types:
- sinkholes that were not digitized and that are mostly in the upper half of the AOI,
- depressions in a rough terrain that are not sinkholes, mostly in the lower half of the AOI.
It should be noted that the method chosen here is not the best one based on the F2 score, but the one that was the most convincing visually. Indeed, the watershed method had the best F2 score, but it suffers from several false positive detections due to the presence of depressions near road and rail infrastructures (Fig. 7). The level-set method has the same downside. Tackling this problem would significantly improve the results of those two methods.
6 Evaporites
There is only one AOI for evaporites. It is at Gryon (VD) and contains 16 sinkholes.
6.1 Results
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.52 | 0.94 | 0.81 |
Watershed | 0.52 | 0.69 | 0.65 |
Level-set | 0.44 | 0.75 | 0.66 |
Stochastic | 0.55 | 0.69 | 0.60 |
Table 4: Metrics for each method over the area of evaporites with the results for the best F2 score in bold.
The IGN's method is significantly better than the others for the area at Gryon (Tab. 4). Its precision is among the highest, equal to that of the watershed method, and its recall is markedly higher than that of all other methods, with a difference of 0.19 points relative to the second-best method, the level-set method.
The metrics for the IGN's method are good. A precision of 0.5 is just acceptable, as not all the sinkholes were digitized in the ground truth. The F2 scores of the other methods are mediocre, as their recall is at the limit of acceptance.

Figure 9: Sinkholes detected through the IGN's method in the area of evaporites at Gryon.
The expert recognized that the majority of the detected sinkholes are indeed sinkholes. Even though the recall is high, he spotted some missed elements, in particular among very small sinkholes. He also reported that some shapes are rather peculiar, diverging often from the usual convex elliptical shape (Fig. 9).

Figure 10: Sinkholes produced by the IGN's method over the evaporites area at Gryon, represented through their centroid and tagged as TP, FP, or FN after a comparison with the ground truth.
Considering the entire AOI, we note that GT sinkholes were all found at the north-west of the village of Gryon (Fig. 10). All but one were properly retrieved and the only missed sinkhole is actually encompassed in the detection of its neighbour.
There are several false positive detections in the rough forest terrain east of Gryon and several others in the village itself.
6.2 Discussion
Despite half the detections being labeled as FP, the expert did not complain about depressions wrongly detected as sinkholes. As the GT dataset is precise but not exhaustive, it would not be surprising if the detections evaluated as false positives were actually sinkholes. The precision is then likely underestimated and the metrics of the IGN's method might be the best achievable result.
However, the expert reported several missed sinkholes, meaning that in his opinion the recall is overestimated. The GT dataset should have been exhaustive, but apparently some small sinkholes were missed.
7 Marls on karst
There are four AOI for marls on karst. They are at Cadagno (TI), Chasseral (BE), Les Verrières (NE), and Sissach (BL) and contain 573 sinkholes.
7.1 Results
No optimization was performed on the parameters of the stochastic method for areas of marls on karst, because the processing time was too long.
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.61 | 0.35 | 0.39 |
Watershed | 0.39 | 0.47 | 0.45 |
Level-set | 0.38 | 0.48 | 0.45 |
Stochastic | - | - | - |
Table 5: Metrics for each method over the areas of marls on karst with the results for the best F2 score in bold.
The results obtained for the watershed and the level-set methods are comparable (Tab. 5). The two aforementioned methods demonstrated superior recall in comparison to precision, whilst the opposite is true for the IGN's method. Based on visualization of the results, the watershed method was selected as the best method. We note however that no method is really satisfactory with F2 scores lower than 0.50.

Figure 11: Metrics for each AOI of marls on karst with the watershed method.
When considering the metrics per zone, the alpine area at Cadagno performs significantly worse than the other areas of marls on karst, which lie in the Jura Mountains (Fig. 11). The regions of Le Chasseral and Sissach have a comparable F2 score, yet the former exhibits a precision higher than its recall by 0.25 points, while the latter has a recall higher than its precision by 0.45 points. The region of Les Verrières performs best, with a recall of 0.85. Its precision is however still a little low and would be expected to reach at least 0.60.
The expert assessed each region separately. He deemed the results for the region of Cadagno unusable. The smaller sinkholes are missed even though they present the characteristic funnel shape. The detections are too big, with several sinkholes grouped into one misshapen geometry. The region of Sissach was assessed as problematic due to the large quantity of false positives in quarries, streams, road embankments, and artificial fillings. The results were good for the regions of Le Chasseral and Les Verrières. The precision was satisfying, but some sinkholes were still missed.

Figure 12: Sinkholes produced by the watershed method over the areas of marls on karst, represented through their centroid and tagged as TP, FP, or FN after a comparison with the ground truth.
In the region of Cadagno, the clusters of GT sinkholes and of detected sinkholes are not in the same parts of the AOI (Fig. 12). The results are completely wrong. In the region of Chasseral, the detections reproduce the line pattern expected along geological layers, but the number of detections is insufficient and many sinkholes are missed. The same pattern along geological layers can be observed in the north-west of Les Verrières, but at this place, most of the sinkholes are detected, even some tagged as false positives as they were too small to be included in the ground truth. We also note in this region the high quantity of false positives at the east of the region, i.e. at Haut de la Côte near the steep slope of the Bois de Réserve and at Les Maulieux on the rough terrain on a moraine down the slope. The hill in the region of Sissach is strongly marked by local hydrology and the temporary and dry riverbeds lead to many false positives.
7.2 Discussion
The quality of the results varies strongly between the AOI. The F2 score seems very pessimistic for Le Chasseral and Les Verrières compared to the expert's assessment. On the other hand, it does reflect the expert's assessment for the regions of Cadagno and Sissach. In Sissach, the very low precision and the absurdity of some false positive detections negatively affected the expert's review. This region was judged worse than Le Chasseral, which has a similar F2 score but a smaller difference between precision and recall.
Even if all the AOI cover a terrain made of marls on karst, the large difference in their metrics indicates that the optimal parameter set might not be the same for all. While it worked quite well for Les Verrières and Le Chasseral, it seems less appropriate for Sissach, where the terrain is more shaped by hydrology and infrastructure, and for Cadagno, which is in an alpine setting. Based on the expert's insights, Les Verrières and Le Chasseral have the best and most exhaustive GT, as well as the highest number of sinkholes. They therefore have a large influence on the optimal parameters through their GT quality and their number of sinkholes.
The recall does not exceed 0.65, except in Les Verrières. The expert noted that the small sinkholes are missed the most. A lower threshold should perhaps be selected for the lower limit of sinkhole size. This would deteriorate the precision, but it seems that several small sinkholes were not digitized in the ground truth and therefore, the precision does not fully reflect the quality of the results.
8 Molasse
There are two AOI for molasse. They are at Chueboden (SG) and Schwarzsee (FR) and contain 161 sinkholes.
8.1 Results
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.52 | 0.70 | 0.66 |
Watershed | 0.44 | 0.76 | 0.66 |
Level-set | 0.47 | 0.72 | 0.65 |
Stochastic | 0.55 | 0.56 | 0.56 |
Table 6: Metrics for each method over the areas of molasse with the results for the best F2 score in bold.
The IGN's, watershed and level-set methods have comparable F2 scores, even though they each have a different combination of precision and recall (Tab. 6). The watershed method has the lowest precision and the highest recall. The IGN's method has the lowest recall and the highest precision of the three, even if the latter remains low. The stochastic method has the worst results, with an F2 score lower than the others by at least 0.09.
Because of its high recall and good performance in the visual assessment, the watershed method was chosen as the best method for the area of molasse.

Figure 13: Metrics for each AOI of molasse with the watershed method.
The quality of the results is comparable between the regions of Chueboden and Schwarzsee (Fig. 13). They both have a higher recall than precision, with a gap of 0.27 points between the two for the first region, against 0.41 points for the second one.
The expert reported that many sinkholes were correctly recognized and only a few of the detections were not actual sinkholes. However, like in areas of marls on karst, sinkholes with a very typical shape are also often missed. Consequently, the results are a good compilation to highlight areas with sinkholes, but a manual selection remains essential.

Figure 14: Sinkholes produced by the watershed method over the areas of molasse, represented through their centroid and tagged as TP, FP, or FN after a comparison with the ground truth.
Regarding the location of the detections, the false negatives are grouped over certain regions of the AOI, especially in the region of Chueboden (Fig. 14). On the other hand, false positives are widely spread, reaching even zones where no ground truth was produced.
The distribution of the sinkholes parallel to the rock horizons is well reproduced in the results.
8.2 Discussion
Based on the expert's review and the clustering of the ground truth on some AOI parts (Fig. 14), it seems that the precision was underestimated because not all sinkholes were digitized. The expert warned us that only 50% to 70% of the sinkholes were digitized. It seems that those were in addition not equally distributed over the AOI.
On the other hand, the recall is sufficient for this lithological type. However, once again, the expert emphasized that several small sinkholes had been overlooked. Most of the false negatives are indeed very small. It is possible that there were not enough of them present in the ground truth to enable the parameters to be adapted.
9 Various sedimentary rocks with a limestone matrix
There are two AOI for various sedimentary rocks with a limestone matrix (VSRLM). They are at Piz Curtinatsch (GR) and Val d'Alvra (GR) and contain 44 sinkholes.
9.1 Results
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.50 | 0.49 | 0.49 |
Watershed | 0.35 | 0.55 | 0.50 |
Level-set | 0.28 | 0.60 | 0.49 |
Stochastic | 0.26 | 0.38 | 0.35 |
Table 7: Metrics for each method over the areas of VSRLM with the results for the best F2 score in bold.
The stochastic method has the worst results, with an F2 score 0.14 points lower than the other methods (Tab. 7). Leaving it aside, all the methods have similarly low F2 scores, with different combinations of precision and recall. The IGN's method has the highest precision and the lowest recall. The level-set method has the lowest precision and the highest recall. Upon visualization, the watershed method was selected as the best method for the area of VSRLM.

Figure 15: Metrics for each AOI of VSRLM with the watershed method.
The area of the Piz Curtinatsch has a recall of 100%, meaning that all the GT sinkholes were detected. However, the number of detections is twice the number of labels and the precision remains at 0.5.
The performance in the Val d'Alvra is worse, especially for the recall.
The expert concluded that the results were not satisfactory for this lithological type. The produced geometries are frequently misshapen and incoherent, and most of the time, they do not cover sinkholes. Notably, several depressions within fossil rock glaciers are detected. In addition, a significant number of sinkholes were not detected.

Figure 16: Sinkholes produced by the watershed method over the areas of VSRLM, represented through their centroid and tagged as TP, FP, or FN after a comparison with the ground truth.
Because of the small number of labels and detections, it is hard to draw conclusions for the area of the Piz Curtinatsch. For the area of Val d'Alvra, we note a high number of false negatives in the northern part and of false positives in the southern part (Fig. 16). Most of the false negatives are small sinkholes or sinkholes with a very gentle curvature. The false positives are depressions in the rough terrain of the moraine walls or of the beads of rock glaciers.
9.2 Discussion
The metrics are average, but the visualization of the results (Fig. 16) shows that there is no correspondence between the ground truth and the produced geometries. The results for the other methods were similar. The tested methods are not adapted to this kind of topography.
10 All areas
Finally, even though the AOI show different characteristics, each method was also run on all AOI at once with a single set of parameters.
10.1 Results
No optimization was performed on the parameters of the stochastic method for all AOI, because the processing time was too long.
Method | Precision | Recall | F2 score |
---|---|---|---|
IGN | 0.24 | 0.45 | **0.38** |
Watershed | 0.15 | 0.49 | 0.34 |
Level-set | 0.14 | 0.45 | 0.31 |
Stochastic | - | - | - |
Table 8: Metrics for each method over all the AOI with the results for the best F2 score in bold.
Considering the final F2 score, no method achieves good quality (Tab. 8). The best method is the IGN's one, with the highest precision and F2 score. The watershed method has the highest recall, but its precision is low and it only achieves an F2 score of 0.34.

Figure 17: Metrics per AOI for all the AOI with the IGN's method.
The F2 scores are worse for all AOI with the general parameters than with the parameters specific to the lithological type, except for the AOI of Sissach. For this last region, we note an increase of 0.08 points (Fig. 11 & 17). For all the others, the F2 score drops by approximately 0.15 points (Tab. 2, 3, 4, and Fig. 11, 13, 15, and 17). In most of those AOI, the precision improved while the recall dropped compared to the results obtained with the specific parameters.
In general, the AOI with favourable metrics under the specific parameters still perform better than those with unfavourable metrics. The ranking of the AOI results therefore remains relatively unchanged.
10.2 Discussion
The use of a single set of parameters for all AOI has been tested. Simplifying the workflow in this way would significantly deteriorate the results.
Checking the metrics per AOI confirmed that there is too much diversity among the AOI of marls on karst. The global parameters performed better in the region of Sissach than the specific parameters for this lithological type. Subdividing this type and performing new specific optimizations might improve the results.
11 Discussion
Lithological type | Best precision | Best recall | Best F2 score | Selected method |
---|---|---|---|---|
Bare karst | IGN | Level-set | IGN | Level-set |
DLR | IGN | Watershed | Watershed | IGN |
Evaporites | Stochastic | IGN | IGN | IGN |
Marls on karst | IGN | Level-set | Level-set & watershed | Watershed |
Molasse | IGN | Watershed | IGN & watershed | Watershed |
VSRLM | IGN | Level-set | Watershed | Watershed |
All areas | IGN | Watershed | IGN | - |
Table 9: Best method for each lithological type based on precision, recall, or F2 score, and selected method based on metrics and a visual inspection.
After analysing the results across all lithological types and AOI, we note that the IGN's method always achieves a higher precision than the watershed and level-set methods (Tab. 9). These two methods tend to detect too many depressions, especially when infrastructure or rivers are present in the area. However, they generally have a better recall than the IGN's method. The visual inspection also showed that the results of these two methods are close and differ only slightly. This is not surprising, as they both rely on the same function for the detection of sinks.
The stochastic method has a long computation time and generally performs worse than the other methods. While its recall is on par with the other methods, its precision is lower. There are exceptions: it has the best precision and the lowest recall in the AOI with molasse, and its results are globally worse for the AOI with VSRLM.
In the end, the IGN's method has the best metrics for two lithological types out of six and was selected as the best only for the area of evaporites.
Since the ground truth was not complete and we do not have a clear baseline, it is difficult to know if the tested methods really improved the situation for the cartography of sinkholes. We can however state that the achieved results are average at the very best, and that no method stands out as the best for a particular type of lithology.
The expert informed us that all the methods taken from the literature were initially tested on karstic plateaus. This type of topography, with a less rough terrain and bigger sinkholes, is quite rare in Switzerland. The closest type in the current project would be the marls on karst areas and more specifically the AOI of Sissach. Considering that the Swiss landscape is different from the ones studied in the literature, tuning the parameters of existing algorithms is probably not a suitable solution. The development of a new method would then be necessary.
However, no amount of development and tuning will make it possible to achieve good results as long as the object is not clearly defined and digitized in a ground truth. The quality of the ground truth has a visible impact on the metrics. The tested algorithms performed better for the AOI with a good ground truth, namely Le Chasseral, Gryon, les Verrières, and S-Chanf. In many of these areas, the expert noted that very small sinkholes were not detected. However, the recall remains good, because those very small sinkholes were never digitized in the ground truth.
This project had two goals: the standardization and the acceleration of sinkhole mapping. For the first one, the expert established the ground truth. However, the limited time he could dedicate to the task and the difficulty of checking the results in the field made it impossible to obtain an exhaustive and coherent dataset. This led us to use simple algorithms that we could understand well and tune easily, but even with a good control of the methods, we could not achieve good results. Using machine learning and training a model in these conditions would probably lead to even worse results, because a model cannot learn properly from such a dataset. The standardization of the sinkhole definition cannot be delegated to a machine and needs to be carried out beforehand by an expert if we hope to achieve good results with automation.
The second goal, automation, could not succeed since no proper standardization was done beforehand. Some detections were produced and the expert judged their quality as satisfactory in some areas. Yet he still recommends a manual check to validate the results. With the lack of standardization, this manual check could nullify the small gains in time and standardization that automation brought.
12 Conclusion
Sinkholes are important landmarks in the Swiss landscape. Their cartography should be performed regularly as they evolve over time, especially around inhabited areas [2]. However, it is difficult and time-consuming. As a result, their mapping lacks consistency across the different regions of Switzerland and across the various products of swisstopo, because it cannot be performed by a single expert. The goal of this project was to standardize and accelerate the cartography of dolines thanks to automatic detection methods.
Four different methods were tested, and the parameters of each were tuned to achieve the best F2 score on the ground truth. The results were assessed through metrics and through visual inspection by the expert. The expert's review was in general in accordance with the computed metrics. The results were judged good enough to identify the zones dense in sinkholes for six out of eleven AOI. For the other ones, the results were unsatisfactory. It was noted that small sinkholes were missed in all AOI.
No satisfying automation can be achieved as long as the ground truth is not exhaustive and coherent. In this project, no clear definition of sinkholes applicable to digitization from aerial views was established, and no comprehensive ground truth could be produced. The first priority to improve the results would be to improve the ground truth. Then, the tuning of the tested methods could be performed again, but we noted that the methods tested here may not be suited to the Swiss geology. A new method may need to be developed, or deep learning could be used. Deep learning on RGB imagery has little chance of working properly, but three bands derived from DEM processing might be a good choice. For example, the sky-view factor, the openness, or the local dominance could be considered [9].
References
1. Tamás Telbisz, Tamás Látos, Márton Deák, Balázs Székely, Zsófia Koma, and Tibor Standovár. The advantage of lidar digital terrain models in doline morphometry compared to topographic map based datasets – Aggtelek karst (Hungary) as an example. Acta Carsologica, July 2016. Publisher: The Research Center of the Slovenian Academy of Sciences and Arts / Znanstvenoraziskovalni center Slovenske akademije znanosti in umetnosti (ZRC SAZU). URL: http://ojs.zrc-sazu.si/carsologica/article/view/4138 (visited on 2024-09-17), doi:10.3986/ac.v45i1.4138.
2. Qiusheng Wu, Chengbin Deng, and Zuoqi Chen. Automated delineation of karst sinkholes from LiDAR-derived digital elevation models. Geomorphology, 266:1–10, August 2016. URL: https://linkinghub.elsevier.com/retrieve/pii/S0169555X1630280X (visited on 2024-08-26), doi:10.1016/j.geomorph.2016.05.006.
3. C. Bauer and T. Wagner. From landform inventories to landscape evolution? - Karst development in the Central Styrian Karst (Austria). Geomorphology, 447:109024, February 2024. URL: https://linkinghub.elsevier.com/retrieve/pii/S0169555X23004440 (visited on 2024-05-22), doi:10.1016/j.geomorph.2023.109024.
4. Guillaume Touya, Hugo Boulze, Anouk Schleich, and Hervé Quinquenel. Contour Lines Generation in Karstic Plateaus for Topographic Maps. Proceedings of the ICA, 2:1–8, July 2019. URL: https://ica-proc.copernicus.org/articles/2/133/2019/ (visited on 2024-09-16), doi:10.5194/ica-proc-2-133-2019.
5. Jaroslav Obu and Tomaž Podobnikar. ALGORITEM ZA PREPOZNAVANJE KRAŠKIH KOTANJ NA PODLAGI DIGITALNEGA MODELA RELIEFA = Algorithm for Karst Depression Recognition Using Digital Terrain Model. Geodetski Vestnik, 57(2):260–270, 2013. URL: https://epic.awi.de/id/eprint/33195/.
6. Tamás Telbisz. Lidar-Based Morphometry of Conical Hills in Temperate Karst Areas in Slovenia. Remote Sensing, 13(14):2668, July 2021. URL: https://www.mdpi.com/2072-4292/13/14/2668 (visited on 2024-11-22), doi:10.3390/rs13142668.
7. Tamás Telbisz, László Mari, and Balázs Székely. LiDAR-Based Morphometry of Dolines in Aggtelek Karst (Hungary) and Slovak Karst (Slovakia). Remote Sensing, 16(5):737, February 2024. URL: https://www.mdpi.com/2072-4292/16/5/737 (visited on 2024-05-23), doi:10.3390/rs16050737.
8. Špela Čonč, Teresa Oliveira, Ruben Portas, Rok Černe, Mateja Breg Valjavec, and Miha Krofel. Dolines and Cats: Remote Detection of Karst Depressions and Their Application to Study Wild Felid Ecology. Remote Sensing, 14(3):656, January 2022. URL: https://www.mdpi.com/2072-4292/14/3/656 (visited on 2024-05-22), doi:10.3390/rs14030656.
9. Žiga Kokalj and Ralf Hesse. Airborne laser scanning raster data visualization: a guide to good practice. Number 14 in Prostor, kraj, čas. Založba ZRC, Ljubljana, 2017. ISBN 978-961-254-984-8.
10. Julien Berthe, Olivier Lejeune, and Nicolas Bollot. Imagerie LIDAR et cartographie semi-automatique de l’exokarst de la Montagne de Reims (Champagne, France). Karstologia, pages 41–48, 2021. URL: https://hal.science/hal-03543184/document.
11. Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2623–2631. Anchorage AK USA, July 2019. ACM. URL: https://dl.acm.org/doi/10.1145/3292500.3330701 (visited on 2024-11-29), doi:10.1145/3292500.3330701.
12. J.B. Lindsay. Whitebox GAT: A case study in geomorphometric analysis. Computers & Geosciences, 95:75–84, October 2016. URL: https://linkinghub.elsevier.com/retrieve/pii/S0098300416301820 (visited on 2024-12-10), doi:10.1016/j.cageo.2016.07.003.
13. Qiusheng Wu, Charles R. Lane, Lei Wang, Melanie K. Vanderhoof, Jay R. Christensen, and Hongxing Liu. Efficient Delineation of Nested Depression Hierarchy in Digital Elevation Models for Hydrological Analysis Using Level‐Set Method. JAWRA Journal of the American Water Resources Association, 55(2):354–368, April 2019. URL: https://onlinelibrary.wiley.com/doi/10.1111/1752-1688.12689 (visited on 2024-08-26), doi:10.1111/1752-1688.12689.
14. Qiusheng Wu. Lidar: A Python package for delineating nested surface depressions from digital elevation data. Journal of Open Source Software, 6(59):2965, March 2021. URL: https://joss.theoj.org/papers/10.21105/joss.02965 (visited on 2024-12-10), doi:10.21105/joss.02965.