## . Problem of defects in geodetic network

The numerical elaboration of a geodetic network usually consists of two stages. At first, the initial diagnostics of data is performed, which allows detection and removal of various types of observational errors and geometric defects. Secondly, the final elaboration of the network is performed, including adjustment of observations and calculation of the coordinates of points. Defects in a geodetic network, that is, situations requiring identification and elimination, can be divided into several types, as shown in Figure 1.

A metric defect in observations (outliers) means that the absolute value of the observational correction (error estimator) exceeds a certain threshold level (depending on the adopted probabilistic model and the significance level). Statistical methods are used to identify such errors, but the correct result depends on the network's local reliability (local redundance) (Baarda, 1968; Prószyński and Kwaśniak, 2019). The reliability of a geodetic network is related to the conditioning of the normal matrix in the problem of adjustment of the geodetic network using the least-squares method. In general, the point here is that in ill-conditioned systems, detection of a possible metric defect (outlier) can be difficult or impossible. The difficulty in identifying outliers is also due to the principle of least squares because this method causes some kind of ‘spilling’ of errors over the network elements located in a specific vicinity of the outlier observation. In addition, the identification task becomes more difficult as the size of the observational system increases. Therefore, other rules for observation adjustment are formulated, and the so-called robust estimation, in contrast to the averaging feature of the least-squares method, causes some differences between observations. These replace the least squares function by an objective function with spline components (Hampel (1971); Huber (1964) and many others’ proposals) or based on regular, but non-square components of the objective function (Holland and Welsch, 1977; Kadaj, 1978, 1988). A comprehensive review of the robust estimation methods can be found in the book by Wiśniewski (2013).

The outlier phenomena may be related to the coordinates of the tie points. A straightforward method of detecting such errors was presented by Prof. Hausbrandt (Hausbrandt, 1954; Janusz, 1958). It assumes that the coordinate reference points are treated as pseudo-observations with specific mean errors, estimated on the basis of information on the accuracy of the determination of points in the network of the appropriate class. The magnitude of the correction to such pseudo-observation indicates a possible existence of an outlier. The principles of robust estimations can also be used for the same purpose, although the Hausbrandt method seems to be sufficiently effective.

This article is not a complete dissertation on the above subject. We will focus mainly on the problem of detecting configuration defects, which – as shown by practical experience – are a very common phenomenon and difficult to identify, especially in large data sets.

The geometric (topology) defect (*u* = *u*_{ref} + *u*_{obs}) can be classified into two types, namely datum defect (missing of reference elements) and configuration defects (network observational structure defects). The first type implies that a certain number *u*_{ref} of elementary data is missing to determine the position of the object (network) in the coordinate system. It may be a situation intended to perform the so-called free network adjustment (Mittermayer, 1972) and may also be the result of an unintentional mistake. A configuration defect means the lack of a certain amount (*u*_{obs}) of observations for network determination. The reason may not only be the omission of certain observations, but also numbering errors of points in measurement plans. While the reference defect is easily identifiable, as it is a matter of several parameters, the configuration defect is usually difficult to spot, especially in the case of geodetic networks of considerable size in terms of the number of points and observations. The main topic of this publication is the ability to identify network configuration defects automatically.

## . The functional and stochastic model of a geodetic network

The functional model of a geodetic network can be presented in the form of a vector equation (*col* = single-column matrix):

=*L**col*[*L*:_{i}*i*= 1, 2, . . . ,*m*] is an observational vector, with some covariance matrix*C*,=*P**C*^{−1}is the weight matrix,=*V**col*[*V*:_{i}*i*= 1, 2, . . . ,*m*] is a vector of unknown corrections,=*X**col*[*X*:_{j}*j*= 1, 2, . . . ,*n*] is a vector of unknown parameters (point coordinates),(*F*) =*X**col*[*F*(_{i}) :*X**i*= 1, 2, . . . ,*m*] is a vector function of the vector.*X*

Let *X*_{0} denote the vector of approximate coordinates – the approximation of the vector ** X**. The linearized Equation (1) has the form:

= (*A**d*/**F***d*)**X**_{(0)}= [*a*]_{ij}_{(m×n)}; (*m*≥*n*);*a*= (_{ij}*dF*)_{i}/dX_{j}_{(0)}=*x*−*X**X*_{0};=*l*−*L*(*F**X*_{0})= vector of observational corrections in the linearized system (2).*v*

The least-squares solution for Equation (2) is the minimization of the function:

From the necessary condition for the extreme, we get the normal equation:

where:If *u* = *u*_{ref} + *u*_{obs} > 0, then the normal equation is singular, generating a linear subspace of solutions. In particular, if *u*_{ref} > 0 (datum defect) and the value of the defect is known, while *u*_{obs} = 0, then it is possible to apply ‘free adjustment’ (e.g. Mittermayer (1972)) using the so-called pseudo-inversion (generalized Moore–Penrose inversion; see more details in Bjerhammar (1973); Moore (1920); Penrose (1955)). An important feature of such a situation is that the size of the defect is usually known in advance. A completely different case occurs when a defect is unknown, both in terms of its value and its situational location in the network. We only know that it exists because the matrix of the system of equations is singular. Detection and localization of such a defect are the main issues of this work.

## . Least-squares adjustment in case if a defect is known

In the case of a non-zero and known defect, an unequivocal solution of the normal equation (4) may take place after applying an additional condition (conditions) to the vector of unknowns *x*. The universal condition in applications has the following form:

Then the unequivocal solution is written in matrix notation using the so-called generalized inversion (pseudo-inversion or Moore–Penrose inversion; see Moore (1920); Penrose (1955)):

The *B*^{†} pseudo-inversion, where the solution (7) satisfies the condition (6), is defined as follows (Deutsch, 1965). Let ** M**,

**be matrices of the size (**

*N**r*×

*n*) and

*rank*(

**) =**

*M**rank*(

**) =**

*N**r*and satisfy the condition

**=**

*B*

*M*^{T}·

**. Then**

*N*In particular, if we assume ** M** =

**=**

*N***then**

*R***=**

*M*

*R**·*

^{T}**and**

*R*The above condition is fulfilled by the ** R** matrix resulting from the Cholesky–Banachiewicz factorization of the

*B*matrix. It has a ‘trapezoidal’ form (Figure 2).

Note that in the case of zero defect (*u* = 0) *B*^{†} = *B*^{−1}, then formula (9) takes the form:

Another way of determining pseudo-inversion is the spectral decomposition of the normal matrix. This matrix is symmetric and non-negative definite, so the spectral decomposition takes the form:

where0

_{1}= 0_{2}= . . . = 0_{u}= 0,= matrix of eigenvectors as an orthonormal matrix:*S**S*·^{T}=*S*,*I*Λ = diagonal matrix of eigenvalues: λ

_{1}≥ λ_{2}≥. . . ≥ λ_{r}> 0,

Thus, using formula (8) or (9), we can adjust the network if the defect value is known. Typically, this situation is related to the datum defect and is intentional in the aspect of the so-called free network adjustment, consisting in the optimal fit of the network into a figure defined by approximate coordinates with simultaneous adjustment of the observations (Mittermayer, 1972). However, a troublesome problem may be if a non-zero defect of a network is unknown and is a result of mistakes in the network adjustment process. This will be mainly a network configuration defect (*u*_{obs} > 0); however, missing observational data or errors in point numbering may also result in missing reference elements, that is, datum defect. Usually, such a situation is of an error nature, for example, as a result of accidental omission of certain observations in the data sets, necessary to obtain the effect of network determination, or as a result of errors of point identifiers in the observation plans of the geodetic network.

Similar to gross observation error, a network configuration defect will not be generally easy to identify and locate, especially in large networks with a variety of observation data. For this purpose, procedures using the exact pseudo-inversion formula (8) or (9) will not be (in principle) effective. First of all, they require knowledge of the defect value itself, not to mention its location in the network. Is it possible, then, to resolve the above issues in an ‘automatic’ manner?

The problem of detecting and locating configuration defects can be solved using the Tikhonov regularization method. This method was formulated to solve singular, linear algebraic equations (Tikhonov, 1963, 1965). It should be mentioned here that the problem of regularization also refers to the original works: Levenberg (1944); Marquardt (1963, 1970); Phillips (1962). Phillips’ work formulated the problem of regularization when solving integral equations. The work of Levenberg and the first of Marquardt's works concerned the iterative solution of the systems of non-linear equations as a method optimising the convergence of the iterative process. Another work by Marquardt (1970) already referred to the problems of regularization in the iterative process.

## . Tikhonov procedure as alternative to generalized inversion and as a method for detecting configuration defects

### General formula for Tikhonov regularization

Tikhonov regularization (Tikhonov, 1963, 1965) is based on the condition:

from where: where*α*> 0 is a ‘small’ number on which the difference between solutions (14) and (7) depends.

If *α* decreases to zero, then solution (14) approaches (7) (Bjerhammar, 1973):

*B*^{†}pseudo-inversion. This approximation is the matrix based on a theorem, compatible with (15):

(matrix ** B** and vector

**are expressed by the formulas (4) and (5)).**

*w*The *Q* and *Q _{α}* matrices are also theoretical covariance matrices of the estimated parameters (coordinates of the network points), respectively, for the exact and approximate (regularized) solution of the system, assuming that the standard deviation of the observation with a unit weight

*s*

_{0}= 1.

### Influence of the regularization parameter on results of Equation (13)

By subtracting the normal equation (4) (least squares) from the normal equation (14) (least squares with Tikhonov regularization), we will estimate the size of the difference between the solutions using the relative error (η):

where: ‖ . ‖_{E}is the vector Euclidean norm (associated with the spectral norm ‖ . ‖

_{S}of the matrix) and λ

_{r}is the smallest non-zero eigenvalue of the matrix

*B*. Then,

**=**

*Q*

*B*^{†}. We can see that the relative error depends significantly on the value of the regularization parameter in relation to the smallest, non-zero eigenvalue. In the literature, e.g. Bell and Roberts (1973), we also find the estimate of η ≤

*α*(λ

_{r}+

*α*)

^{−1}, which, including the condition

*α*≪ λ

_{r}, should not differ significantly from (20). In any case, however, the estimate using the eigenvalue λ

_{r}is practically not verifiable unless we know the spectral decomposition of the matrix

*B*. Such an operation is not necessary as it is too expensive. The value of

_{x}. From the trace

*Tr*properties of the matrix, we have:

We will now return to (20) using the last estimate:

If the relative error was to be smaller than η_{max}, the parameter *α* should be taken as

For example, let η_{max} = 0.01, μ_{x} = 0.01 m, *n* = 1000 (number of unknowns), then *α* < 0.001.

The parameter *α* should also be limited from the bottom due to the rounding error in a digital machine (Wilkinson, 1994). For adding *α* to the diagonal element of the matrix ** B** to be a significant operation, its value should exceed the rounding error level of the diagonal elements of the matrix. Taking into account (23), we will get a selection range (Kadaj, 1979):

*b*is the diagonal element of the matrix

_{ii}*B*and

*t*is the number of significant binary digits in the floating-point notation of a real number.

An estimation of the relative error of the Tikhonov regularization method concerns the case of a one-time solution according to formula (14) or, otherwise, formula (13) in relation to formulas (4) and (6). In practice, regularization is used in iterative processes that correct a one-time solution (e.g. Bakushinskii (1992); Bell and Roberts (1973); George (2010); Hanke and Groetsch (1998); Marquardt (1963); Mehsner (2013); Scherzer (1993)). In particular, this applies to non-linear least-squares tasks using the Gauss–Newton iterative procedure, and thus to the adjustment of geodetic networks also.

Regardless of whether the solution (14) is used once or is refined in the iterative process, the covariance matrix of the unknown vector determined by formula (13) is approximate. The assessment of the relative error of this matrix in terms of the relative increase of the spectral norm leads to a restriction on the parameter *α* analogous to (24), but with a ‘sharper’ upper limit (Kadaj, 1979):

Using Tikhonov's regularization, is it possible detect and locate a configuration defect in a geodetic network? The covariance matrix *Q** _{α}* expressed by formula (16) and the pseudo-inversion

*B*^{†}do not contain information that would allow for the identification and location of the defect. Such information is provided by the

*u*is the network defect. Since

*α*is a small number in relation to λ

_{r}, the approximate equality holds

*C*

_{1}≈

*C*

_{2}≈

*C*

_{3}and the dominant big value component

*u*·

*α*

^{−1}appears only in the trace of the matrix

*, or more precisely in the elements corresponding to the coordinates of undetermined points. The configuration defect is then identified in large values of the mean errors a posteriori of the respective coordinates (the diagonal elements of the*

**B**_{α}### Selecting the parameter *α* for detection and location of a configuration defect

Let the regularization parameter *α* be interpreted as
_{c} denotes a priori expected mean coordinate errors for the unknown parameters (point coordinates). In a probabilistic and statistical sense, it would also mean applying the idea of Bayesian estimation. Contrary to the commonly used Gauss–Markov model for observation adjustment, Bayesian estimation additionally uses a priori information (stochastic model) for the unknown parameters. In our case, on the one hand, it will free the task from the defect and singularity of the normal matrix, while on the other hand, it will allow to identify missing observations or those given with erroneous point numbering. In the general model of Bayesian estimation (e.g. Bossler (1972)), instead of the scalar matrix *α* · ** I** used here, a full

*covariance matrix may appear as additional a priori information about the determined unknowns.*

**C**_{X}Suppose the matrix of a system of normal equations *B*_{(n x n)} is singular due to a network defect, rank(** B**) =

*r*<

*n*. Assume, for example, the coordinates of all points determined in the network μ

_{x}= μ

_{y}= μ

*c*= 100 m, that is, with the weight

*α*= 0.0001 do not significantly affect the results of the network adjustment, we make the non-singular system of normal equations (the network becomes determinable, and the normal matrix becomes

*=*

**B**_{α}**+**

*B**α*and is significantly positively defined with the order

**I***r*=

*n*). Let us add that all eigenvalues (including zero values) of matrix

**are increased by the added parameter (λ**

*B*_{i}= λ

_{i}+

*α*;

*i*= 1, 2, . . . ,

*n*), and the inverse matrix

_{i}= 1/(λ

_{i}+

*α*).

In our case, the upper limit results from the assumption that the diagonal elements of the inverse
*u*-defect), then the diagonal elements of the matrix (assuming the use of standardized observations with the unit mean error value a priori) would be the squares of the coordinate mean errors. For example, for a horizontal geodetic network, the use of the regularization formula means that we add the equations *dx* = 0, *dy* = 0 to the existing system (in the previously adopted notations, *dx*, *dy* are some components of the vector *x*) with assumed a priori (artificially) large standard deviations (mean errors), for example, 100 m, and with appropriately small weights (as above) to reveal undetermined points with sufficiently large values of the resulting mean errors. For this purpose, we assume the condition for selecting a regularization parameter (left part as in (24)):

_{max}is the expected maximum (approximately) error of the coordinate of the determinable point in the network,

*k*is the multiple of the assumed mean error for pseudo-observation

*dx*= 0,

*dy*= 0, for example,

*k*= 10, 100, 1000, . . . (see attached numeric example) and

*t*is the number of significant binary digits of a real number.

## . Illustrative example of detection and location of configuration defects in a large traverse geodetic network

The following example illustrates the possibility of detecting configuration defects using the Tikhonov regularization. The object of the test is a relatively large geodetic network with a traverse structure (Figures 3 and 4). The network includes 6227 points with 448 reference points, 6298 angular observations, 1487 directional observations at 391 stations and 6829 length measurements.

In the professional adjustment of this network, the following standard deviations of observations were assumed: for directional observation: 28 [cc], for length measurements: constant standard deviation 0.016 m and proportional to the length of 0.0002 m/km. The average length of the side is approx. 350 m. In the resulting report, the estimated average standard deviation of the point was 0.038 m.

Artificial configuration defects were introduced in the network by excluding five angles at the vertices of the traverse marked in Figure 4. The classical least-squares adjustment algorithm signals the singularity of the normal matrix and the program is stopped. We know that the defect exists, but we are unable to locate it and quantify it. In practical situations, detecting configuration defects in such a large observation system would not be an easy task.

The Tikhonov regularization procedure becomes helpful here. In accordance with the example provided in section 4.3, we assume the parameter *α* = 1/10, 000 (according to the assumption: μ_{x} = μ_{y} = 100 m). Table 1 shows a fragment of the network adjustment report, where it is clearly visible that the resultant point error takes significantly large values just for the points for which there are deficiencies in the centre angles. These values, however, are smaller than 100 m because they occur in the vicinity of 5 points and the observations (angles and lengths) measured at the neighbouring points ‘work’ in a sense for the points with configuration defects.

##### Table 1

Table 2 shows the analogous results of the actual network adjustment (without defects). Comparing the errors of points that do not have a direct observational relationship with the ‘defect’ points in Tables 1 and 2, we find that the Tikhonov regularization practically does not change the accuracy parameters of the network, except for the defective places.

##### Table 2

[i] Designations as in Table 1.

The described method of automatic detection and localization of configuration defects, based on the Tikhonov regularization idea, was implemented in 1993 in the GEONET geodetic calculation system (www.geonet.net.pl). The test calculations presented in this paper were performed by means of the programs of this system.

## . Conclusions

The article presents the methodology of automatic detection of configuration defects in geodetic networks. This issue is important, especially in large networks, at the stage of initial measurement processing, where detection of possible data defects is usually a very labour-intensive task. Computer programs for adjusting geodetic networks ‘recognise’ the geometric structure of the network based on the identifiers (numbers) of points in the observation plans, in the list of approximate coordinates and in the list of tie point coordinates. In case of errors in point identifiers, the network structure is incorrectly defined. In particular, errors in identifiers may result in situations similar to the case of missing data.

Both the theoretical basis and numerous practical tests performed with the GEONET system programs show that the described method based on the Tikhonov regularization idea allows to automatically detect possible configuration defects in the geodetic network, interpreted by the computer program as gaps in data sets. In ‘raw’ observational data, as practice teaches, this will often be the result of errors in point identifiers.

When choosing a regularization parameter, we follow its pseudo-probabilistic (pseudo-Bayesian) interpretation. In this interpretation, the regularization parameter means very small value of the weight of the coordinate of the determined point (e.g. 0.0001), which corresponds to ‘excessively’ large value of the standard deviation of the coordinate of the determined point (e.g. 100 m). Such an assumption does not significantly affect the results of aligning the part of the network that is free from configuration defects, but allows to identify defect sites by showing large values of diagonal elements of the inverse of the regularized normal matrix.

The problem of configuration defects may occur in any real observation system, including hybrid networks (satellite-terrestrial, integrated). The given example of the classical network is only an illustration of the problem and its solution using the Tikhonov regularization.

Regardless of the possibility of applying the Tikhonov regularization, there are other methods supporting the identification of configuration defects in the geodetic network. One of them, implemented in the GEONET system, consists in the analysis of the topological structure of the network in terms of counting the number of independent observations that mark each point of the network in relation to other points of this network. However, it is easy to give examples of when such an approach will not work. It may be, for example, a network fragment not related to the rest of the network and not having tie points, but meeting the condition of correct (relative) determinability in relation to other points of this fragment.