Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (2024)

Astronomical time series often have non-uniform sampling in time, or irregular cadences, with long gaps separating clusters of observations. Some of these data sets are also explicitly non-Gaussian with respect to the expected model fit, or the simple mean. The standard Lomb–Scargle periodogram is based on the least squares solution for a set of test periods and, therefore, is easily corrupted by a subset of statistical outliers or an intrinsically non-Gaussian population. It can produce completely misleading results for heavy-tailed distribution of residuals. We propose a robust 1-norm periodogram technique, which is based on the principles of robust statistical estimation. This technique can be implemented in weighted or unweighted options. The method is described in detail and compared with the classical least squares periodogram on a set of astrometric VLBI measurements of the ICRF quasar IERS B0642+449. It is uniformly applied to a collection of 259 ICRF3 quasars each with more than 200 epoch VLBI measurements, resulting in a list of 49 objects with quasi-periodic position changes above the 3σ level, which warrant further investigation.

1.Introduction

The Lomb–Scargle periodogram calculation is a powerful technique designed to reveal and characterize the periodic components in observational data sequences, which finds a wide scope of applications. For a review of its properties and underlying assumptions from the user's perspective, see VanderPlas (2018). The need for this technique arises from the character of astronomical data (observational measurements), which are practically never evenly sampled in time. This makes the standard Fourier power spectrum analysis inapplicable for astronomical time series. Detection of orbiting exoplanets from precision radial velocities of host stars is one of the well-known use cases for the Least-Squares (LS) periodogram method (Hara & Ford 2023). The periodic component of the measured radial velocity sequence is caused by the reflex orbital motion of the host star orbiting the system's barycentre. The period of the main sinusoidal mode in the computed periodogram in this case estimates the orbital period of the planet, which often cannot be directly observed.

The periodogram method finds a somewhat less known application in precision astrometry of celestial bodies' positions. Binary stars with unresolved or dim companions have periodic signals, which are the harmonics of the orbital frequency, in either of the sky coordinates referenced to a fixed celestial frame. Given a significantly long and precise cadence of position measurements covering at least one orbital period, the more general approach is to directly fit a set of Kepler elements of the emerging explicitly nonlinear 2D model, which proves a daunting and ambiguous task in the presence of even a small admixture of statistical outliers (Goldin & Makarov 2006). A robust and reliable periodogram decomposition is a welcome alternative when a large amount of observational data has to be processed with a low output of true positives. The need for a resilient periodogram algorithm, which can produce meaningful results outside of the normal distribution of data points, also emerges in the interpretation of high quality photometric time series. Magnetically active stars, for example, often manifest complex structures of signals in their light curves with periodic modulation mixed with stochastic, unpredictable bursts of radiation (Makarov & Goldin 2017).

Our main goal for this study is to develop and test a modification of the classical 2-norm periodogram algorithm (also known as the Lomb–Scargle periodogram) based on the principles of robust statistical estimation. This algorithm is intended to be used for processing of a massive data base that includes single-epoch astrometric measurements of thousands of radio-emitting quasars with the geodetic Very Long Baseline Interferometry (VLBI) world-wide facility. The system of accurate positions of these sources constitutes the fundamental International Celestial Reference Frame (ICRF3, Charlot et al. 2020), which underpins all other derivative celestial and geodetic reference frames. The astrometric stability of the most frequently observed quasars is of crucial importance for the overall accuracy and stability of ICRF3. We therefore develop a method to determine if some of the ICRF3 sources manifest periodic signals in their celestial positions, which could emerge from dual orbiting black holes in their centers, as well as a number of other effects in the extended structures and jets (Makarov et al. 2012).

The need for robust statistical estimation techniques generally arises in astronomical data processing when the available data are ridden with a large fraction of outliers outside of the commonly assumed Gaussian distribution of errors, representing a heavy-tailed sample distribution. Examples of critically important applications can be found in the mutual orientation alignment of different celestial reference frames (Malkin 2021; Frouard 2023; Lambert & Malkin 2023), where common object show a high rate of position offsets with extremely low formal probabilities.

2.The Least-squares (2-norm) Periodogram

In the most general setup of the problem, our task is to mathematically analyze a given time series (observations) d(ti ), where periodic sinusoidal signals may be hidden. The data is discretized on a sequence of specific times of measurement ti , i = 1, 2,..., N. When the measurements are taken on a regular, equally spaced grid with time step Δt, the problem is solved by the direct Fourier transform and subsequent computation of the Fourier power spectrum. The spectrum is quantified on a grid of angular frequencies fk = 2 π/(kΔt), k = 2, 3,..., N, where the highest non-degenerate frequency f2 = πt is the angular Nyquist frequency. This set of frequencies is complete, because all other signals within the functional space spanned by the Fourier basis functions are not independent. In other words, the fitting function

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (1)

is the exact and unique representation of any sequence di with a zero mean. This is no longer true if the cadence {ti } is irregular. The Fourier harmonics are not orthogonal if sampled on an irregular cadence. In principle, this difficulty can be bypassed by constructing an ad hoc orthogonal basis from the Fourier harmonics using, for example, the Gram-Schmidt process, but the practical value of such representation is dubious, because the emerging fitting functions do not find a simple interpretation. However, we can disregard the issue of nonorthogonality and seek a solution to Equation (1) for a chosen fk . Under this generalization, the classical periodogram analysis is equivalent to the LS fitting of model (1) for a grid of trial periods pk = 2 π/fk (Scargle 1982). The emerging LS problems, for a specific pk , can be written as

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (2)

where the design matrix A has two columns with calculated sequences of Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (3) and Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (4) values, and the number of its rows is equal to the number of data points N. The right-hand part vector d is the vector of centralized observations, and the vector of unknowns x comprises the two coefficients ck and sk . Standard LS algorithms can quickly solve this system to obtain the solution vector

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (5)

Any standard LS algorithms can be used to solve this system. This has to be done for each trial period pk ≡ 2π/fk separately, including the setup of the design matrix A . In the example used in our paper, the number of trial periods Np = 1000, but in other applications, it can be up to O(105). This is not a problem for modern computers and LS algorithms, but in the 1970-ies, when the periodogram method started to attract astronomers' attention, the speed of computation was a crucial consideration. This motivated Lomb (1976) to propose a modification where the design matrix A is orthogonalized by introducing a phase shift τk in each of the fitting functions in (1), to the effect that the normal matrix A T A becomes diagonal. This modification has little practical advantage now but it brings in additional restrictions precluding necessary extensions of the model, as we will now discuss. Lomb's modification is therefore not recommended. A more accommodating and rigorous way of orthogonalization, if such a technical action is deemed desirable, was proposed by Ferraz-Mello (1981). Scargle (1982) also mentions that the spectral power of a pure noise data has a more predictable statistical distribution, but this argument is only valid for Gaussian noise with equal variances, which is never the case in practice.

3.Extended Periodogram Models

Astronomical data series d often include non-periodic components. Using the archetypical example of detection of exoplanet signals in precision radial velocity measurements of host stars, the expected additional terms include a constant offset (the systemic radial velocity of the exoplanet system) and possibly a linear trend from perspective acceleration or a distant binary companion. It is not recommended to estimate these terms separately and prior to the periodogram solution and subtract them from the original data, which, unfortunately, is often done in practice. The reason why this pre-processing leads to an error in the periodogram is that these terms are not orthogonal to the fitted sine functions on a non-uniform cadence of data points. Unlike the regular Fourier transform, the trial frequencies are not integer multiples (harmonics) of the time interval. If a sinusoidal signal is present in the data, its estimated amplitude or power will be affected by the biased estimate of the constant term. The only correct and consistent way of dealing with additional terms is to include them in the fitting model (Ferraz-Mello 1981; Cumming et al. 1999). For example, the fitting model suitable for exoplanet detection can be

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (6)

for each trial period pk . Note that a separate constant term x0 and a linear slope x1 are obtained for each trial period, and they are not equal for different trial periods. The variation of these terms with the trial period reflects the error introduced into the periodogram by subtracting the common terms a priori.

The periodogram estimate can be the amplitude of the fitted sinusoid

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (7)

or the power

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (8)

As discussed above, we are using the amplitude periodogram in this paper, which has a more intuitive interpretation as the amplitude of the periodic signal in the same units as the measurements. The linear condition equations still take the form (2), but the design matrix A now has four columns, and vector x includes four unknowns x0, x1, x2, and x3. The solution vector Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (9) is obtained from the LS solution, Equation (3). If a significant signal a(pk ) is detected, the corresponding values x0(pk ), x1(pk ) provide the best estimates of the constant term and the linear trend.

4.Statistical Uncertainties

What is the confidence level of a detected signal in the LS periodogram? This is an estimate of crucial importance, because the probability of the null hypothesis (that the detected feature is just a random fluke), also known as the false alarm probability (FAP), determines if we can believe the result. Traditionally, a high formal confidence is desired in astronomical applications such as detection of exoplanet signals, the recommended value being 0.997 (the 3σ level for a normal distribution). A robust method of estimating the FAP is the bootstrap simulation, which is also extendable to non-Gaussian distributions of measurement error. This is the method of "last resort" when the signal-to-noise ratio (S/N) of the detected signal leaves room for a catastrophic false positive. It is computationally expensive, however, and requires a sufficiently large number of data points. Monte Carlo methods, which are also computationally expensive, can be efficient when the statistical distribution of the observational noise is known. A random number generator is used to construct a sequence of synthetic measurements on the given sequence of times ti , then a periodogram solution is obtained for each realization of noise, and the signal amplitude Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (10) is computed. Repeating this process multiple times (O(103) is usually required for an accurate estimation) allows us to estimate the cumulative distribution function (CDF) of the posterior distribution of the periodogram amplitude at any trial period, and hence, the p-value of the null hypothesis (or FAP).

Here we describe a computationally efficient and direct method of confidence estimation for LS periodograms in the extended form (Equation (4)). Noting that the standard periodogram solution obtained from the LS adjustment per Equation (3) is already based on the assumption that the measurement error is normally distributed, a direct computation of the periodogram CDF can be performed. If the covariance matrix C d E[ d · d T ] is known or assumed, the corresponding covariance of the solution vector is

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (11)

where Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (12). It is often assumed that the measurements d are statistically independent, in which case the matrix C d is diagonal. If the errors also have the same variance σ2, this equation further simplifies to C x = σ2 C .

The solution covariance C x is a 4 × 4 symmetric matrix, which is computed for each trial period pk . We are mostly interested in the a(pk ) statistics per Equation (5). The two involved statistics x2 and x3, in accordance with the assumed normal distribution Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (13) for each data point, are binormal variates, whose covariance matrix C a is the corresponding 2 × 2 block of C x . The estimated vector Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (14) can be standardized to obtain

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (15)

so that Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (16) is a binormal uncorrelated variate of unit variance. This is equivalent to determining the error ellipse for binormal variates. The components of Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (17) can be interpreted as the upper and lower S/N of the given periodogram result. Consequently, the quadratic form

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (18)

is a χ2-distributed variate with 2 degrees of freedom. The corresponding confidence of rejecting the null hypothesis can be computed from the CDF of the distribution χ2[2] for each periodogram value. For a graphical representation of perodogram results, it is convenient to compare the confidence levels to specific points, which correspond to the ±1σ, ±2σ, and ±3σ intervals of the normal distribution, which have the cumulative probabilities of 0.683, 0.955, and 0.997, respectively. The corresponding levels of ψ (computed as CDFRobust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (19)) are 2.296, 6.180, and 11.829. Periodogram amplitudes with ψ-values above 11.829 can then be regarded as highly confident positive detections at a confidence level above 0.997.

In the exoplanet detection literature, an alternative method of FAP-estimation is often used, developed by Baluev (2008). It is also based on the assumption that the signal contains only a finite set of model (base) functions, and the random component of the data vector is pure Gaussian noise with the known standard deviations σi . The statistical significance of a single periodogram value can then be naturally estimated from the properly normalized difference of the reduced χ2 statistic of residuals with and without the corresponding harmonic terms (e.g., Equation (6) in Cumming et al. 1999), which follows the F-distribution (or beta-distribution if two or more specific periodogram frequencies are considered). However, while we have in practice a large number of periodogram value realizations, only the maximum value and the corresponding trial period are of interest. Even in the absence of detectable signal, given a large number of trials, it is probable that the highest significance value exceeds the threshold confidence level. This probability can be estimated within the extreme value statistic of an F-distributed hom*oscedastic random process assuming that the periodogram values are independent. This method is vulnerable to aliasing, which is caused by the limited spectral window of the given data series. The adjacent periodogram values are not independent, and periodogram features become increasingly wider "window functions" toward the longest trial periods. Non-uniform cadences with long gaps can also generate aliasing, spectral leakage, and spurious periodogram peaks. The extreme-value distribution fitting method (Süveges 2014) is more general for non-Gaussian processes, but it still refers to the null hypothesis of uncorrelated white noise in the data, which is inaccurate for the specific applications in this paper, or the spectroscopic detection of exoplanets (Makarov et al. 2009).

5.To Weight or not to Weight?

Astronomical time series often have unequal formal errors of individual data points. The formal error represents the expected standard deviation of the measurement, which can vary in a wide range because of observational conditions, instrument setup, etc. The LS solution in Equation (3), on the other hand, is unweighted, because it does not involve the estimated formal errors. The standard way of dealing with processing data of non-uniform precision is to use weighted LS fitting. It can be applied to LS periodogram analysis too in the framework of weighted periodogram solution. The basic equation replacing (2) becomes

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (20)

where the weight matrix Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (21). The covariance of the right-hand part is now the identity. The formal covariance of the periodogram coefficients of interest transforms from Equation (7) into

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (22)

The subsequent analysis of periodogram uncertainties is the same as described in Section 4.

We performed limited experiments using the formal weights on the example described in Section 6 to estimate the impact of this additional modification. We found rather limited changes in the computed periodogram amplitudes with the unweighted and weighted LS options. The most prominent features indicating possible signals have approximately the same shape and location. The greatest difference is found in the estimation of the 1σ and 3σ confidence intervals. The weighted covariance of the periodogram coefficients C a is generally much smaller for the weighted solution than for the unweighted solution. This is caused by a large spread of individual formal errors, and the fact that the weighted LS solution is optimal. If all the formal errors are equal, the periodogram covariances and the derived amplitudes become equivalent in the two solutions. Thus, the weighted covariance C a is the global minimum of all possible unweighted counterparts. The lower covariances result in narrower confidence intervals, and the net result is that most of the periodogram solution becomes a highly confident positive detection. This result is completely misleading for the given example, because, as we will see in the next Section, the formal errors of the data points have little bearing on the actual dispersion and statistical distribution of the data.

6.Why Do we Need Something Else?

Let us summarize the implicit assumptions involved in the LS periodogram method.

  • 1.

    The data vector is a composition of random uncorrelated noise and a single monochromatic sinusoidal signal, whose amplitude and period are to be determined.

  • 2.

    The measurement noise is Gaussian.

  • 3.

    The data sequence is centralized, i.e., has a zero mean—unless the extended version of the method is employed.

An example when the second assumption is violated can be found in (Makarov et al. 2010). The astrometric position (photocenter) of the Sun as measured by a distant observer is subject to stochastic variations caused by the presence of sunspot groups and bright plage areas on the rotating surface. Each photometric feature generates a time-variable shift of the unresolved disk on the timescale of days, which is not periodic because of the phase scrambling. The composition of such stochastic signals is an unpredictable "jitter." The measured shifts from the mean photocenter show an utterly non-Gaussian distribution because the intrinsic distributions of the sunspot sizes, lifetimes, and positions within the disk are not normal. In this case, the nominal LS periodogram, as well as the traditional FAP estimation, are likely to produce misleading and inaccurate results.

Figure 1 shows the observed time series used in this paper to illustrate the application of the proposed 1-norm periodogram analysis. It shows the high-accuracy astrometric data collected by geodetic VLBI for the ICRF3 source IERS B0642+449 for nearly 40 yr of continuous observations. Each data point corresponds to a one-day "session" with multiple delay measurements of this source together with a number of other ICRF3 sources. The observational data are represented as coordinate offsets Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (23) (left panel) and y = δobsδmean (right panel) in mas, where {αmean, δmean} are the weighted mean coordinates for this source in the equatorial coordinate system. The formal errors for each observation are shown as error bars. This enigmatic high-redshift (z = 3.41) gamma-ray blazar is obviously one of the astrometrically unstable ICRF3 sources with shifting position mostly in the R.A. component. The origin of the position variations is outside of the topic of this paper, but we briefly note the study by Xu et al. (2016), who detected a dual structure of IERS B0642+449 from a closed delay analysis of a high-intensity geodetic VLBI session. The detected separation of the dual components is approximately 0.46 mas and the position angle is 262Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (24)2. One of the interesting applications of sub-mas astrometry with VLBI is the possibility of detection of orbiting dual AGNs. With an estimated scale of 7.6 pc mas−1 at this redshift, a binary black hole with a total mass of 1010 M and a period of 40 yr may have an angular separation of about 10 μas, which may be within the reach with this type of data. Central engine binarity may be one of the explanations for the observed quasi-periodic modulation of some gamma-ray blazars' light curves (Ackermann et al. 2015).

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (25)

Both coordinate trajectories in Figure 1 appear to include long-term variations and, possibly, periodic components on the timescale of a few hundred days. Are they statistically significant? We begin with the standard LS periodogram analysis using the extended model Equation (4). The need to include the linear terms, in particular, comes from the possibility of a "secular" proper motion in the data, which is not part of the astrometric model used in the VLBI data reduction pipeline. We compute the periodogram fitting coefficients {x0,..., x3} for an exponential grid of 1000 trial periods, pk = p0 dex(q k), with the exponent step q = (3652.5 − p0)/1000 in days. The longest trial period is then 10 yr, which is practically limited by the time span of the available data.

The results are shown in the upper row plots of Figure 2 for the two coordinate components. The periodogram amplitude estimates are connected with a black line to aid the eye. The significance of each periodogram point is also computed for this unweighted solution according to Section 4. We color-coded the significance by the normalized confidence level, so that estimates below the 1σ level are marked with blue dots, and estimates above the 3σ level are marked with red dots. A large number of values appear to be highly significant with periods across the entire range, including some short periods below 100 d, which obviously cannot be physical. This result, with a jungle of sharp peaks in the short-period domain and a few prominent features in the long-period domain, is typical of LS periodograms for "noisy" data. The inference is completely false, and we will now reveal why that happens.

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (26)

The single-epoch positions measured with VLBI are two-dimensional, and each position determination {x, y} comes with a formal covariance G , which is a 2 by 2 matrix. It is convenient to consider the normalized and centralized single-epoch position offset

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (27)

because it is a scalar variate, which is expected to follow a Rayleigh distribution with scale 1, reducing the dimensionality of statistical analysis to 1. The true coordinates Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (28) are not known, but they can be separately estimated as the weighted mean position. Note that even the well-known formula for covariance G is based on the underlying assumption of normally distributed random errors. The variate D allows us to test this basic assumption. The histogram of D values computed for the example data set in Figure 1 is shown in Figure 3. For reference, the expected Rayleigh[1] distribution (normalized to the same area) is shown with the blue line. We can see that the actual distribution of astrometric offsets is very far from the expectation, and the difference cannot be fixed just by scaling the formal errors. Although the mode of the empirical distribution is approximately where it is expected to be (at 1), a long and powerful tail stretching far beyond the Rayleigh[1] curve indicates that nearly half of the available measurements have values associated with nil probabilities of occurrence within the assumed statistical model.

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (29)

The heavy-tailed nature of the data distribution invalidates the LS periodogram method. The data points with large deviations from the mean should not be called outliers in this case, because they represent a large part, if not the majority, of the population. Simple fixes such as clipping the data outside the 3σ threshold are not justified. The numerous deviant data points corrupt any LS estimation, and generate bogus signals in this periodogram analysis. Methods of robust estimation are designed to handle heavy-tailed data in a more consistent way. In particular, the 1-norm estimation seeks to minimize the sum of absolute values of residuals rather than the sum of their squares:

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (30)

This merit function diminishes the impact of large deviants and permits a meaningful solution for any intrinsically symmetric populations. It is robust with respect to the subsample containing high normalized offsets, because each individual data point has an effectively lower weight in the solution, irrespective of its value.

7.Implementation of 1-norm Periodograms

The same periodogram models (Equations (1) and (4)) can be used as in the classical LS method. The main differences in implementation are of the technical character. The main optimization problem is no longer linear, and it cannot be formalized as Equation (2). Consequently, there is no direct calculation of the associated covariance matrix of the periodogram coefficients. This can still be done numerically using Monte Carlo simulations. The solution itself is implemented with one of the existing global nonlinear optimization methods with vector-valued arguments, such as the Nelder-Mead (simplex downhill), differential evolution, or simulated annealing methods. 3 These methods are computationally much more expensive than the regular LS periodogram. However, we achieved a computing time of about 1 minute on a regular laptop for the given example with 1668 data points and 1000 trial periods for the two time series.

The resulting 1-norm periodograms for the given data sets are shown in Figure 2, lower row. They are expressed in the same values (amplitudes, per Equation (5)) and units as the LS periodograms in the upper row, so that they can be directly compared. Formal confidence levels cannot be directly computed for the 1-norm solutions, because the population distribution is non-Gaussian. We reproduce, however, the re-normalized confidence intervals 1σ and 3σ from the LS solution to emphasize the significance of the results.

Quite clearly, the robust 1-norm periodograms paint a different picture about the temporal variations of the given data. The amplitude values dropped by half or more, and most of the estimated values are now below the 1σ-interval. The largest reduction is seen in the high-frequency domain. Given the nature of the object under investigation, the low-frequency features are of special interest. We find the main features at different locations than with the LS method, and for the R.A. component, they clearly dominate the spectral power distribution. Intriguingly, there is a compact location around 1730 days with periodogram amplitudes above 3σ, which was completely insignificant in the LS solution. To test if this point is indeed associated with a high level of confidence, a much more extensive bootstrapping or Monte Carlo simulations are required.

We performed a non-parametric bootstrap by producing 100 data samples from the original time series by randomly permuting its elements but keeping the fixed cadence of epochs, thus inheriting the same distribution of the uncorrelated noise component as the original data. Then, for each data sample, we computed the LS and 1-norm periodograms. The bootstrap distribution appears reasonably symmetric. For each period, the N% confidence interval (0 ≤ N ≤ 100) is given by the non-parametric percentile bootstrap interval contained between the (N/2) th and (100 − N/2) th percentiles. For verification purposes, the periodogram analysis and the bootstrapping estimation was independently implemented by two authors using different computer languages, on the same data set. Figure 4 displays the obtained results where the grey lines represent the difference between the upper and lower bounds of the confidence intervals, respectively for N = 68.3% (dotted line), N = 95.5% (dashed line), and N = 99.7% (solid line). In the case of the 2-norm implementation, the bootstrap provides a confidence interval consistent with results shown in Figure 2. For the 1-norm spectrum, the confidence interval is lower than for the 2-norm, reflecting the robustness of the 1-norm estimation, which is less sensitive to the data points in the extended tail of the distribution. The amplitudes reach highest significant values, with a significance level well higher than 99.7%, for the right ascension suggesting that the bump observed at a period of ∼1700 days is not due to chance and the secondary peak at ∼2800 days should also be considered for further investigations. We note, however, that the bootstrap-estimated confidence only refers to the random uncorrelated noise in the data (of arbitrary PDF). The data may include a time-correlated component of physical or instrumental origin. Broad periodogram features seen in Figure 4 may require additional analysis using, e.g., structure functions of first or second order, which also incorporate periodic components with time-variable phase (Rutman 1978; Simonetti et al. 1985; Rutman & Walls 1991).

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (31)

8.A Search for Periodic Modulation in Observed Positions of ICRF3 Sources

Diurnal geodetic VLBI sessions have been regularly scheduled over nearly 40 yr, using networks of stations separated by baselines of hundreds to thousands of kilometers long. Within each daily session, a number of widely separated radio sources are observed multiple times over the course of 24 hours. The resulting data are processed in a few data analysis centers, including the U.S. Naval Observatory. In this paper, we use a global solution for two-dimensional coordinates of epoch calculated at USNO (2022a) in the standard S/X band setup. This data product includes astrometric time series from more than 6000 diurnal sessions. The total number of sources is 5153 in this data set, but here we only consider 259 of them with more than 200 single-epoch measurements.

The 1-norm periodogram computation was uniformly applied to each of the frequently observed ICRF3 sources, separately for R.A. and decl. offsets from the weighted mean positions. These mean positions are specifically computed for the given data set and the solution version, so they may slightly differ from the published ICRF3 mean positions. The purpose of this numerical experiment was to identify sources with possible sinusoidal variations in the observed positions on the sky. The results are presented in a compact form in Table 1 for 49 quasars where the formal significance criterion ψ > 11.829, from Equation (9), is triggered in either of the coordinates for at least one trial period. The table provides IERS names of the sources (which should be prepended with letter B to match Simbad identification), our computed mean R.A. and decl. coordinates in degrees, the total number of diurnal sessions, and the significant trial periods.

Table 1.ICRF3 Quasars with Quasi-periodic Signals

IERS idR.A.Decl. nobs Periods R.A.Periods Decl.
(°)(°)(days)(days)
0003-0661.5578870361−6.393148724216481760
0014+8134.285312111081.58559346491272>3652
0016+7314.940776501873.458338221312141520, 1850, >3652
0119+04120.48692373144.37353730411717>3652
0119+11520.423312688511.830670357816731670, 2640
0131-52223.2740104059−52.0010959196628830
0202+14931.210057895115.23640099859632940
0300+47045.896842583047.27118762968251030multiple 946–3652
0308-61147.4837464402−60.97751567291292800, 2070
0322+22251.403393144622.40010156058721660, >3652
0336-01954.8789074465−1.776612276124031820
0537-28684.9761728366−28.66554114735562750
0537-44184.7098398243−44.085816400523152760
0607-15792.4206230931−15.7112979835873870, 1160
0637-75298.9437829129−75.2713376525607990, 1110, 1550, 1970
0642+449101.633441636444.854608365116681690
0735+178114.530807290817.7052772667502>3652
0748+126117.716857239912.51800783589421590, >3652
0749+540118.255769037153.88323252431167>3652
0955+476149.581965182447.418845116328791320
1030+415158.265449423541.2683980499310multiple 2230–3580
1039+811161.096093947080.91095640772391240, 2060, 2770
1123+264171.473799660226.17221630923453460
1213-172183.9447990581−17.5292786545373>3652
1308+326197.619432721832.345495244621652920
1451-375223.6142072756−37.7925402884384920, 1310, 1900, 24701030, 1310, 1900, 2530
1546+027237.37265353762.61698978346191980, 2180
1548+056237.64695518075.45290231523132380, 3180
1611+343243.421101016334.213308021320583410
1617+229244.811769170022.79662526954423500
1639-062250.5090738210−6.3565819658738910
1639+230250.355114864322.95112021076192360
1739+522265.154074374952.1953909459222213502610
1846+322282.092035721932.3173899470872955, 1430, 31701370
1849+670282.316967864967.09491118944691390, 1680
2007+777301.379160504377.87867986842631325
2128-123322.8969239777−12.11799893548901890
2136+141324.755455290114.393331159511463240
2145+067327.02274445906.960723384921823460
2201+315330.812399111031.76063053969482015
2214+350334.083374587435.30393887107503430
2216-038334.7168238499−3.59357763344352420
2223-052336.4469137221−4.9503863240175822901770
2227-088337.4170180620−8.54845433018122690>3652
2229+695337.651957253869.7744658067964745, 1110, >3652
2232-488338.8051524303−48.5996651465209540
2234+282339.093628524728.482614773022651435, 22752275
2243-123341.5759665538−12.114243795510041270, 1730
2318+049350.18690246985.2305423851123024302430

Note. IERS source names in column (1) preappend letter B to the codes as given. R.A. and decl. coordinates in columns (2) and (3) are the weighted mean positions calculated from the given data, not the published ICRF3 coordinates. Periods in columns 5 and 6 are approximate local peaks in periodograms.

Download table as: ASCIITypeset image

The main result of this computation is that there seems to be no isolated single-frequency sinusoidal signals in the observational data similar to those that are found for astrometric binaries. Instead of well defined peaks in the periodograms, we find "packages" of trial periods with elevated amplitudes and significance levels. The emerging picture is more consistent with an ensemble of small vaguely periodic modes of non-commensurate frequencies. One-fifth of the detections have a rising periodogram amplitude toward the upper limit of this analysis (10 yr). The significant periods are mostly longer than 2 yr, although much shorter trial periods have been tested. Only 19% of the sample show periodic variations above the 3σ level. The typical peak amplitudes are in the range 100–200 μas.

We note that the robust periodograms were computed separately for the R.A. and decl. coordinates of epoch positions. In the case of Keplerian motion in a binary system, the detectable signal may be present in both coordinates with the same principal period (and its harmonics) but with different phase and amplitude. The probability density of the angle i between the line of sight and the vector of orbital angular momentum is proportional to Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (32). Therefore, nearly face-on projected orbits are less likely than nearly edge-on orbits. For marginally detectable trajectories, the detactable signal is mostly present in one dimension, which is uniformly distributed with respect to the local north direction. The largest extent of the projected orbit can be aligned with one of the coordinate axes with the same probability as a tilt of 45° or 135°. In the latter case, the detectable signal is split between the coordinates, and it should be harder to find it with confidence. Possible ways to deal with this problem include rotating the R.A.-decl. measurements on a grid of position angles to find a preferred direction maximizing the signal amplitude from a 1D periodogram. Technically, if a significant single-period signal is detected in both R.A. and decl. coordinates, a 2D version of robust periodogram can be implemented. The available coordinate measurements are combined in a single LS adjustment, but the number of unknown terms per trial period increases to a minimum of 8 because of the unknown phase. This may erode the confidence level of the signal, if the projected orbit is strongly elongated due to the geometric orientation or large eccentricity. Quasars IERS B1451−375, 2234+282, and 2318+049 are attractive targets for further investigation, because they show coherent periodicities in both coordinates from our results in Table 1.

9.Summary and Discussion

We have shown in this paper that the classical LS periodogram method is firmly based on strong and restrictive assumptions about the distribution of post-fit residuals (which is assumed to be Normal) and the character of physical signals in the data. It provides an optimal, unbiased, and unique solution for periodogram power or amplitude only under these conditions. Whenever the sample distribution shows significant departures from the Gaussian PDF, or more complex signals are present that are not captured in the model, the LS method becomes corrupted and can produce absolutely misleading results.

We have considered a specific observational data set for a moderately variable ICRF3 source collected over >30 yr by the global geodetic VLBI system. The distribution of astrometric positional offsets with respect to the mean position on the sky is explicitly non-Gaussian when scaled with the given 2D formal covariances or in absolute values. The normalized offsets are well represented by a log-normal distribution with a tighter mode and a heavy tail extending to high values. Nearly half of the measurements are way outside of the expected distribution. As a result, the traditional 2-norm (LS) periodogram produces a complex structure with multiple features that are formally above the 3σ confidence interval across the spectrum of trial periods. This result is completely bogus. The robust 1-norm periodogram method, when applied to the same data, produces amplitudes that are smaller by half or more. Ranked by the same previously estimated single-point confidence, the 1-norm values are all insignificant except for a single point in the R.A. component with a period of 1730 days and amplitude 72 μas, which appears to be above 3σ. Is this periodic signal real? The best way to find out is to continue taking high-precision measurements of this source with VLBI for a few yr. A stable sinusoidal signal, which could be produced by an orbiting binary black hole, for example, would emerge more strongly on the longer timescale. Alternatively, physical models could be tested, where transient periodic signals wax and wane in segments of the data due to phase scrambling. This new method provides the opportunity to more reliably and extensively search for periodic signals in non-Gaussian time series at the margin of available accuracy.

The 1-norm periodogram computation was performed for 259 ICRF3 sources with more than 200 diurnal sessions collected over nearly 40 yr. These measurements are characterized by heavy-tailed sample distributions of residuals. We identified 49 objects (19%), which have at least one statistically significant periodogram value in either coordinate component. Short periods are never found, indicating a possible physical mechanism of these signals in the transient structure of the radio-emitting sources. The signals are not consistent with clean sinusoidal variation at a specific frequency, which would emerge for an orbiting binary black hole. Rather, the pattern is that of "vague periodicity" represented by packages of sine waves with a distribution of frequencies. A possible physical model is a source that moves in loops on the sky returning to the vicinity of the initial position after some characteristic time, which may also vary with time. The estimated amplitude of these vaguely periodic excursions is 70 μas and higher. About one-fifth of the detected signals are truncated by the upper boundary of our periodograms (10 yr). Further investigation of these astrometric wobbles and continuous daily measurements will refine the models and allow us to understand the nature of the phenomenon.

Facility: VLBI - .

Software: Wolfram Mathematica.

Robust 1-norm Periodograms for Analysis of Noisy Non-Gaussian Time Series with Irregular Cadences: Application to VLBI Astrometry of Quasars (2024)
Top Articles
Latest Posts
Article information

Author: Ms. Lucile Johns

Last Updated:

Views: 5452

Rating: 4 / 5 (41 voted)

Reviews: 88% of readers found this page helpful

Author information

Name: Ms. Lucile Johns

Birthday: 1999-11-16

Address: Suite 237 56046 Walsh Coves, West Enid, VT 46557

Phone: +59115435987187

Job: Education Supervisor

Hobby: Genealogy, Stone skipping, Skydiving, Nordic skating, Couponing, Coloring, Gardening

Introduction: My name is Ms. Lucile Johns, I am a successful, friendly, friendly, homely, adventurous, handsome, delightful person who loves writing and wants to share my knowledge and understanding with you.