DSM2DTM: Now You See It… Now You Don’t

After years of frustrating trial-and-error we have finally found a technique to automatically convert digital surface models (DSMs) – generated using optical stereo images (i.e. using photogrammetry) – to digital terrain models (DTMs).

In case you are unfamiliar with the terminology, a DTM is a digital elevation model (DEM) that represents the elevations (heights) of the “bare earth” surface (terrain), whereas a DSM includes the heights of surface features (e.g. buildings and vegetation). DSMs are the raw products of digital photogrammetry; for instance, GeoSmart’s DEMSA2 L1 product is a 2 m resolution DSM. Figure 1 shows the DEMSA2 L1 product (left) compared to the aerial photograph (right) from which it was generated. Notice that the radiometric quality of the aerial photograph is not ideal, with a noticeable difference in brightness between flight rows.

Figure 1 – DEMSA2 Level 1 raw DSM (left) and aerial photograph (right).

Although DSMs are great for some applications (e.g. telecommunication planning), many applications (e.g. construction planning) require DTMs. Some applications (e.g. 3D modelling of cities) prefer normalized DSMs (nDSMs) that only represent the heights of surface objects. nDSMs are generated by subtracting the DTM from the DSM, so one needs both a DSM and a DTM. So there is a need for methods that can convert DSMs to DTMs, which involves identifying surface features and then removing them to reveal the bare earth (terrain).
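The nDSM subtraction described above can be sketched in a few lines. This is a minimal illustration using plain Python lists so it stays dependency-free. The function name `normalized_dsm` and the tiny sample grids are made up for the example and are not part of any GeoSmart product.

```python
def normalized_dsm(dsm, dtm):
    """Subtract the DTM from the DSM cell by cell to get object heights (nDSM)."""
    same_shape = len(dsm) == len(dtm) and all(
        len(s_row) == len(t_row) for s_row, t_row in zip(dsm, dtm)
    )
    if not same_shape:
        raise ValueError("DSM and DTM must have the same shape")
    return [
        [s - t for s, t in zip(s_row, t_row)]
        for s_row, t_row in zip(dsm, dtm)
    ]


# Hypothetical 2 x 3 grids (elevations in metres); the middle column holds a building.
dsm = [[102.0, 108.0, 103.0],
       [101.0, 107.5, 102.0]]
dtm = [[102.0, 102.5, 103.0],
       [101.0, 101.5, 102.0]]

ndsm = normalized_dsm(dsm, dtm)
print(ndsm)
```

Cells that hold only bare earth come out as zero, and the building cells keep only their height above the terrain.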

Figure 2 – Conceptual difference between a digital surface model (DSM) and a digital terrain model (DTM). By Yodin, based on File:DTM DSM.png by User:MartinOver., CC BY-SA 4.0.


Although a range of DSM2DTM methods exist, they invariably make use of filtering techniques (e.g. regional statistics) to reduce the variation in a DSM, where high variation is assumed to represent surface features. However, such techniques require the operator to set the size of the targeted surface features. For instance, in an industrial area the targeted objects are large (e.g. factories), whereas in a residential area the targeted objects are small. In rural areas an individual tree is small and a hedgerow (of trees) is narrow, while forests can be very large. The size parameters therefore need to be adjusted for each type of landscape.
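To make the window-size problem concrete, here is a small sketch of one common filtering idea, a morphological opening (moving minimum followed by moving maximum), applied to a one-dimensional elevation profile. It is only an illustration of the general class of filters, not any specific product's algorithm. The profile values and the helper names `moving` and `opening` are assumptions made up for the example.

```python
def moving(profile, half_width, func):
    """Apply func (min or max) over a sliding window of 2 * half_width + 1 cells."""
    n = len(profile)
    return [
        func(profile[max(0, i - half_width): min(n, i + half_width + 1)])
        for i in range(n)
    ]


def opening(profile, half_width):
    """Morphological opening: removes bumps narrower than the window."""
    eroded = moving(profile, half_width, min)
    return moving(eroded, half_width, max)


# Hypothetical profile: flat ground at 100 m, a house 3 cells wide
# and a factory 7 cells wide.
profile = [100] * 3 + [106] * 3 + [100] * 4 + [112] * 7 + [100] * 3

small_window = opening(profile, half_width=2)  # 5-cell window
large_window = opening(profile, half_width=4)  # 9-cell window

print(small_window)
print(large_window)
```

The 5-cell window removes the house but leaves the factory untouched, while the 9-cell window removes both. The operator has to know in advance how large the biggest surface feature is.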

Another problem with filtering techniques is that they often inadvertently remove important topographic information, such as hillcrests and mountain ridges, because these are also regarded by the filters as surface features (elevation variation). Filter approaches also tend to create a range of artefacts, such as pseudo-terraces in sloped areas.

LiDAR (point cloud) tools are often used for DSM2DTM conversion, but converting LiDAR data to a DTM is much easier because – unlike in photogrammetry, where only the tops of features are recorded – there are often multiple laser “returns”, or points, per unit area (pixel). In forested areas, some of these points penetrate the tree canopies and reach the ground. These so-called “last returns” can be used to produce a DTM. There is also a range of other information (e.g. return intensity) to work with, which is not available when a DSM is produced using photogrammetry. Nevertheless, LiDAR tools can be used to identify most surface features. Unfortunately, the resulting DTMs often contain many errors, such as surface features that were missed or terrain features that were erroneously removed, which require extensive (costly) manual editing to fix.
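As a rough illustration of why last returns make the LiDAR case easier, the sketch below keeps only the last return of each pulse and records the lowest elevation per grid cell. The point tuples, the cell size and the function name `last_return_grid` are hypothetical. Real LiDAR ground classification is considerably more involved, since a last return is not guaranteed to be ground.

```python
import math


def last_return_grid(points, cell_size):
    """Lowest last-return elevation per grid cell, as {(col, row): z}.

    Each point is (x, y, z, return_number, number_of_returns).
    """
    grid = {}
    for x, y, z, return_number, number_of_returns in points:
        if return_number != number_of_returns:
            continue  # skip earlier returns (canopy tops, branches)
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        grid[cell] = min(z, grid.get(cell, z))
    return grid


# Hypothetical points: one pulse with three returns under a tree, one single
# return from dense canopy, and two single returns on open ground.
points = [
    (0.4, 0.3, 118.2, 1, 3),
    (0.4, 0.3, 109.7, 2, 3),
    (0.4, 0.3, 101.1, 3, 3),
    (1.1, 0.6, 117.5, 1, 1),
    (2.6, 0.2, 100.8, 1, 1),
    (2.9, 0.9, 100.6, 1, 1),
]

grid = last_return_grid(points, cell_size=2.0)
print(grid)
```

A photogrammetric DSM has nothing comparable to the 101.1 m return under the tree: only the canopy top is recorded.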


The solution is a combination of machine learning, object-based image analysis (OBIA), and geospatial modelling. The first step is to find sample pixels that represent the ground (bare earth). This can be very tricky because some ground objects have very similar characteristics to surface features. The commission error of ground samples (surface features that are incorrectly classified as ground) needs to be 0%, otherwise the resulting DTM will contain surface features, while the commission error of surface features (ground that is incorrectly classified as surface features) should be kept as low as possible. We make use of a combination of machine learning and expert rules to identify ground samples at locations that best represent the terrain (e.g. in valley bottoms and on crests). For instance, Figure 3 shows selected ground samples (red) overlain onto the DSM (left) and aerial photograph (right).
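The two commission errors mentioned above can be computed from a set of checked pixels as follows. This sketch only shows the bookkeeping; the labels and the function name `commission_error` are invented for the example and say nothing about how the ground samples themselves are selected.

```python
def commission_error(predicted, reference, target):
    """Fraction of pixels predicted as `target` that are not `target` in the reference."""
    claimed = [ref for pred, ref in zip(predicted, reference) if pred == target]
    if not claimed:
        return 0.0  # nothing was assigned to this class
    wrong = sum(1 for ref in claimed if ref != target)
    return wrong / len(claimed)


# Hypothetical check pixels: what each pixel really is vs. what it was classified as.
reference = ["ground", "ground", "surface", "surface", "ground", "surface"]
predicted = ["ground", "surface", "surface", "surface", "ground", "surface"]

ground_commission = commission_error(predicted, reference, "ground")
surface_commission = commission_error(predicted, reference, "surface")

print(ground_commission, surface_commission)
```

Here the classifier is deliberately cautious: every pixel it calls ground really is ground (0% commission), at the cost of one ground pixel being wrongly labelled as a surface feature (25% commission for that class).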

Figure 3 – Ground samples overlain onto the DSM (left) and aerial photograph (right).


The next step is to remove all surface features by interpolating from the ground samples. Most interpolations assume that there is no topographic variation between samples, which in reality is often not the case. For instance, the forest in the figures above is in a valley bottom, and if conventional interpolation is used the valley bottom will be flattened. This is demonstrated in Figure 4, which shows the results of inverse distance weighting (IDW, left) and spline (right) interpolation. Both methods produce noticeable artefacts, although spline has the ability to extrapolate (the interpolated values can fall outside the range of the input samples), which produces a smoother and more realistic surface in most areas. Although the artefacts at the forest are more noticeable in the spline interpolation, the resulting DTM is more accurate than the one generated using IDW, which has many flat terraces.
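The flattening effect of IDW follows directly from how it works: each estimate is a weighted average of the samples, so it can never fall below the lowest sample. The sketch below shows this on a one-dimensional profile across a forested valley. The sample positions, elevations and the function name `idw` are assumptions chosen for illustration.

```python
def idw(x, samples, power=2.0):
    """Inverse distance weighted estimate at position x from (position, elevation) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sample_x, sample_z in samples:
        distance = abs(x - sample_x)
        if distance == 0:
            return sample_z  # exactly on a sample: return its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * sample_z
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical ground samples on both sides of a forested valley (x in m, z in m).
# No samples exist between x = 10 and x = 40, where the true valley floor is lower.
samples = [(0, 120.0), (10, 110.0), (40, 110.0), (50, 120.0)]

estimates = [idw(x, samples) for x in range(10, 41, 5)]
centre = idw(25, samples)

print([round(e, 2) for e in estimates])
print(round(centre, 2))
```

Every estimate between the two valley sides stays at or above 110 m, and the centre of the valley comes out higher than the nearest samples. The valley floor under the forest is filled in rather than followed.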

Figure 4 – Interpolated result from using inverse distance weighting (left) and spline (right).


To avoid such interpolation artefacts, we developed an object-based interpolation technique that takes the topographic variation into account. In essence, the technique “senses” (infers) the ground elevations (e.g. underneath the forest) by considering the variation in the tree canopy height and the variation of the actual topography. The figure below shows the original DSM (left) and the result of our modelling (right), using exactly the same ground samples as for the IDW and spline interpolations. Magic!

Figure 5 – Original DSM (left) compared to the DTM that was produced using our object-based interpolation method (right).

The technique even automatically detects and smooths out water bodies, fixing the blunders (failures in extracting elevations) that often occur in DSMs (see top right corner). Some editing is still required here and there (e.g. some downslope edges of forests are still noticeable), but this technique effectively removes 95% of all surface features and anomalies caused by water bodies, which saves a LOT of editing time.
