Deep learning has been used in building footprint delineation, road extraction, coastline delineation, among others, and the focus is on accurate boundary delineation. Below are the main-stream directions I am aware of, several of them appear in the SOA, some of them in daily experiments. Convincingly suggesting the most optimal methods or the combination of them for an ultimate solution may come soon.

1. Use Learning Attraction Field Representation.
The method is proposed for line segments detection Learning Attraction Field Representation for Robust Line Segment Detection, which reformulate the problem as a “coupled region colouring problem” [1].
2. Use more boundary-specific loss function.
Loss functions play an essential role in machine learning, lots of loss functions have been proposed, but it is still needed to comprehensive evaluate them:
As a first attempt and for binary segmentation, one can try RMSE on distance metric. Boundary-specific loss functions are proposed in:
2.1 Boundary Loss for Remote Sensing Imagery Semantic
2.2 Boundary loss for highly unbalanced segmentation
3. Extract boundary first with a conventional edge detection algorithm, use it as a feature input for training.
This simple addition has been proposed by a colleague and he obtained an incredible improvement in IOU, from around 55% to 62% in his study case of building detection. This really calls for a comparison between all the other more complex methods: what are the REAL reasons behind the improvements? Many people get increasingly disappointed by current publications as new methods are published with improvements seemingly a matter of chance and without linking to other possibilities.
4. Binary segmentation as an edge detection problem
Current deep learning applications in remote sensing image classification is mostly with image segmentation. Vector labels are commonly rasterised for training, this does NOT have to be the case! For binary problems such as building footprint delineation, one can turn the problem back to the edge detection solutions, this opens a new door of opportunities. For example, crisp edge detection below:

Credit: figure from Huan et al., Unmixing convolusional features for crisp edge detection.