Project partner:
 Prof. Dr. Carsten Jentsch, TU Dortmund.
Two-year project, funded by the Deutsche Forschungsgemeinschaft (DFG) – project number 437270842.
Project aims:
Time series of counts arise in many different situations in economics and related fields. They can take various forms with respect to their dependence structure or their marginal distribution. As classical models for real-valued time series cannot preserve the discrete nature of count data, a great many models tailor-made for time series of counts have been developed. Adequate modelling of count data processes is important for forecasting, for monitoring the ongoing process to reveal structural changes as soon as possible, or simply for gaining a better understanding of the underlying count data process.
The planned research project on model diagnostics for time series of counts comprises three central steps of model building: model identification, model selection and model validation. While a large number of methods have been proposed for real-valued (continuous) time series and have long been available, corresponding approaches for discrete-valued time series are far less developed. The existing methods are scarce, and most of them are available only in rudimentary form (e.g., as heuristic application guidelines), while more rigorous, theory-based methods rely on restrictive model assumptions or focus on isolated characteristics of the process such as dispersion. Similar issues hold for goodness-of-fit tests: while numerous goodness-of-fit tests have been proposed for continuous-valued time series that can test not only for specific models but also for whole model classes, the applicability of available goodness-of-fit tests for time series of counts is restricted, e.g., to parametric assumptions.
The planned research project features two complementary lines of attack for model diagnosis in time series of counts. On the one hand, we aim to develop parametric methods for model diagnosis for time series of counts, which take into account various characteristics of the underlying distribution and/or dependence pattern. Further, diagnostic tools developed and widely applied for real-valued time series shall also be made applicable to time series of counts by using suitable parametric bootstrap implementations. On the other hand, we aim to develop goodness-of-fit tests based on joint distributions that are able to consistently distinguish between different model classes. For the implementation, but also to allow for a broader applicability of the above-mentioned diagnostic tools, suitable semiparametric bootstrap methods for time series of counts shall be developed and employed for model diagnostics. For the proposed methods, we want to investigate the performance and applicability in detail by means of elaborate comparative simulation studies and applications to real data sets relevant in economic sciences.
Project duration:
October 2020 – September 2022.
Project results:

The first parametric methods for model diagnosis to be developed focus on count time series with a Poisson marginal distribution. The aim is to develop statistical tests that use the Stein–Chen identity. As preparatory work, we wrote the following article:
Weiß, C.H., Aleksandrov, B. (2022):
Computing (Bivariate) Poisson Moments using Stein–Chen Identities.
The American Statistician 76(1), pp. 10–15.
Abstract: The (bivariate) Poisson distribution is the most common distribution for (bivariate) count random variables. The univariate Poisson distribution is characterized by the famous Stein–Chen identity. We demonstrate that this identity allows to derive even sophisticated moment expressions in such a simple way that the corresponding computations can be presented in an introductory Statistics class. Then, we newly derive different types of Stein–Chen identity for the bivariate Poisson distribution. These are shown to be very useful for computing joint moments, again in a surprisingly simple way. We also explain how to extend our results to the general multivariate case.
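The univariate Stein–Chen identity mentioned in this abstract states that E[X f(X)] = λ E[f(X+1)] holds for a Poisson(λ) variable X and suitable functions f. The following minimal sketch checks the identity numerically and uses it to recover the second moment. The function names and the truncation point of the sum are illustrative assumptions and are not taken from the article.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), evaluated in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def expectation(g, lam, upper=200):
    """Approximate E[g(X)] by truncating the Poisson sum at `upper` (assumed large enough)."""
    return sum(g(k) * poisson_pmf(k, lam) for k in range(upper + 1))

def stein_chen_sides(f, lam):
    """Return (E[X f(X)], lam * E[f(X + 1)]); both agree if X is Poisson(lam)."""
    lhs = expectation(lambda k: k * f(k), lam)
    rhs = lam * expectation(lambda k: f(k + 1), lam)
    return lhs, rhs

lam = 2.5
# Choosing f(x) = x gives E[X^2] = lam * E[X + 1] = lam^2 + lam.
lhs, rhs = stein_chen_sides(lambda x: x, lam)
print(lhs, rhs, lam**2 + lam)
```

Other choices of f, such as f(x) = x² or an indicator function, can be passed in the same way to obtain further moment relations.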
 Aleksandrov, B., Weiß, C.H., Jentsch, C. (2022):
Goodness-of-Fit Tests for Poisson Count Time Series based on the Stein–Chen Identity.
Statistica Neerlandica 76(1), pp. 35–64 (open access).
Abstract: To test the null hypothesis of a Poisson marginal distribution, test statistics based on the Stein–Chen identity are proposed. For a wide class of Poisson count time series, the asymptotic distribution of different types of Stein–Chen statistics is derived, also if multiple statistics are jointly applied. The performance of the tests is analyzed with simulations, as well as the question which Stein–Chen functions should be used for which alternative. Illustrative data examples are presented, and possible extensions of the novel Stein–Chen approach are discussed as well.
 Aleksandrov, B., Weiß, C.H., Jentsch, C., Faymonville, M. (2022):
Novel Goodness-of-Fit Tests for Binomial Count Time Series.
Statistics 56(5), pp. 957–990 (open access).
Abstract: For testing the null hypothesis of a marginal binomial distribution of bounded count data, we derive novel and flexible goodness-of-fit (GoF) tests. We propose two general approaches to construct moment-based test statistics. The first one relies on properties of higher-order factorial moments, while the second one uses a so-called Stein identity being satisfied under the null. For a broad class of stationary time series processes of bounded counts with joint bivariate binomial distributions of lagged time series values, we derive the limiting distributions of the proposed GoF-test statistics. Among others, our setup covers the binomial autoregressive model, but includes also other binomial time series obtained, e.g., by superpositioning independent binary time series. The test performance under the null and under different alternatives is investigated in simulations. Two data examples are used to illustrate the application of the novel GoF-tests in practice.
 Faymonville, M., Jentsch, C., Weiß, C.H., Aleksandrov, B. (2023):
Semiparametric estimation of INAR models using roughness penalization.
Statistical Methods and Applications 32(2), pp. 365–400 (open access).
Abstract: Popular models for time series of count data are integer-valued autoregressive (INAR) models, for which the literature mainly deals with parametric estimation. In this regard, a semiparametric estimation approach is a remarkable exception which allows for estimation of the INAR models without any parametric assumption on the innovation distribution. However, for small sample sizes, the estimation performance of this semiparametric estimation approach may be inferior. Therefore, to improve the estimation accuracy, we propose a penalized version of the semiparametric estimation approach, which exploits the fact that the innovation distribution is often considered to be smooth, i.e. two consecutive entries of the PMF differ only slightly from each other. This is the case, for example, in the frequently used INAR models with Poisson, negative binomially or geometrically distributed innovations. For the data-driven selection of the penalization parameter, we propose two algorithms and evaluate their performance. In Monte Carlo simulations, we illustrate the superiority of the proposed penalized estimation approach and argue that a combination of penalized and unpenalized estimation approaches results in overall best INAR model fits.
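To make the setting of this abstract concrete, the sketch below simulates an INAR(1) process by binomial thinning with an arbitrary innovation PMF, and measures the smoothness of a PMF by the sum of squared differences between consecutive entries. Both functions are illustrative assumptions: the roughness measure is only one plausible reading of "two consecutive entries of the PMF differ only slightly" and need not match the penalty used in the article, and none of this is the interface of the spINAR package.

```python
import random

def simulate_inar1(n, alpha, innov_pmf, seed=0, burn_in=200):
    """Simulate X_t = (binomial thinning of X_{t-1} with probability alpha) + eps_t.

    innov_pmf[k] is P(eps_t = k) on the support 0, ..., len(innov_pmf) - 1.
    """
    rng = random.Random(seed)
    support = list(range(len(innov_pmf)))
    x, path = 0, []
    for t in range(n + burn_in):
        # Each of the x previous units survives independently with probability alpha.
        survivors = sum(rng.random() < alpha for _ in range(x))
        eps = rng.choices(support, weights=innov_pmf)[0]
        x = survivors + eps
        if t >= burn_in:  # discard the burn-in phase
            path.append(x)
    return path

def roughness(pmf):
    """Sum of squared first differences of consecutive PMF entries."""
    return sum((b - a) ** 2 for a, b in zip(pmf, pmf[1:]))

series = simulate_inar1(500, alpha=0.5, innov_pmf=[0.3, 0.4, 0.3], seed=1)
print(sum(series) / len(series))       # sample mean of the simulated path
print(roughness([0.3, 0.4, 0.3]))      # a smooth PMF gives a small value
print(roughness([0.5, 0.0, 0.5, 0.0])) # a jagged PMF gives a larger value
```

In a penalized estimation scheme, a term of this kind would be weighted by a penalization parameter and subtracted from the log-likelihood, so that jagged innovation PMFs are discouraged.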
 Weiß, C.H., Puig, P., Aleksandrov, B. (2023):
Optimal Stein-type Goodness-of-Fit Tests for Count Data.
Biometrical Journal 65(2), 2200073 (open access).
Abstract: Common count distributions, such as the Poisson (binomial) distribution for unbounded (bounded) counts considered here, can be characterized by appropriate Stein identities. These identities, in turn, might be utilized to define a corresponding goodness-of-fit (GoF) test, the test statistic of which involves the computation of weighted means for a user-selected weight function f. Here, the choice of f should be done with respect to the relevant alternative scenario, as it will have great impact on the GoF-test’s performance. We derive the asymptotics of both the Poisson and binomial Stein-type GoF-statistic for general count distributions (we also briefly consider the negative-binomial case), such that the asymptotic power is easily computed for arbitrary alternatives. This allows for an efficient implementation of optimal Stein tests, that is, which are most powerful within a given class F of weight functions. The performance and application of the optimal Stein-type GoF-tests is investigated by simulations and several medical data examples.
 Weiß, C.H., Aleksandrov, B., Faymonville, M., Jentsch, C. (2023):
Partial Autocorrelation Diagnostics for Count Time Series.
Entropy 25(1), 105 (open access),
Special Issue “Discrete-Valued Time Series”.
Abstract: In a time series context, the study of the partial autocorrelation function (PACF) is helpful for model identification. Especially in the case of autoregressive (AR) models, it is widely used for order selection. During the last decades, the use of AR-type count processes, i.e., which also fulfil the Yule–Walker equations and thus provide the same PACF characterization as AR models, increased a lot. This motivates the use of the PACF test also for such count processes. By computing the sample PACF based on the raw data or the Pearson residuals, respectively, findings are usually evaluated based on well-known asymptotic results. However, the conditions for these asymptotics are generally not fulfilled for AR-type count processes, which deteriorates the performance of the PACF test in such cases. Thus, we present different implementations of the PACF test for AR-type count processes, which rely on several bootstrap schemes for count time series. We compare them in simulations with the asymptotic results, and we illustrate them with an application to a real-world data example.
 Wang, S., Weiß, C.H. (2023):
New Characterizations of the (Discrete) Lindley Distribution and their Applications.
Mathematics and Computers in Simulation 212, pp. 310–322.
Abstract: A Stein-type characterization of the Lindley distribution is derived. It is shown that if using the generalized derivative in the sense of distributions, one can choose all indicator functions as the characterization functions class. This extends some known recent results about characterizations of the Lindley distribution. In addition, a new characterization based on another independent exponential random variable is also provided. As an application of the novel results, some moment formulas related to the Lindley distribution are obtained. Furthermore, generalized method-of-moments estimators for both the discrete and continuous Lindley distribution are proposed, which lead to a notably lower bias at the cost of an only modest increase in mean squared error compared to existing estimators. It is also demonstrated how the Stein characterization might be used to construct a goodness-of-fit test with respect to the null hypothesis of the Lindley distribution. The paper concludes with an illustrative real-data example.
 Weiß, C.H. (2023):
Control Charts for Poisson Counts based on the Stein–Chen Identity.
Accepted for publication in Advanced Statistical Methods in Statistical Process Monitoring, Finance, and Environmental Science, Springer, 2023 (arXiv preprint).
Abstract: If monitoring Poisson count data for a possible mean shift (while the Poisson distribution is preserved), then the ordinary Poisson exponentially weighted moving-average (EWMA) control chart proved to be a good solution. In practice, however, mean shifts might occur in combination with further changes in the distribution family. Or due to a misspecification during Phase-I analysis, the Poisson assumption might not be appropriate at all. In such cases, the ordinary EWMA chart might not perform satisfactorily. Therefore, two novel classes of generalized EWMA charts are proposed, which utilize the so-called Stein–Chen identity and are thus sensitive to further distributional changes than just sole mean shifts. Their average run length (ARL) performance is investigated with simulations, where it becomes clear that especially the class of so-called “ABC-EWMA charts” shows an appealing ARL performance. The practical application of the novel Stein–Chen EWMA charts is illustrated with an application to count data from semiconductor manufacturing.
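As background for this abstract, the ordinary Poisson EWMA chart can be sketched as follows. This is only the baseline chart, not the Stein–Chen variants proposed in the article. The smoothing constant, the limit width and the use of symmetric asymptotic control limits are illustrative assumptions.

```python
import math

def ewma_chart(counts, mu0, smoothing=0.1, width=3.0):
    """Run an ordinary EWMA chart on counts with in-control Poisson mean mu0.

    Returns the EWMA statistics and the index of the first alarm (None if no alarm).
    """
    # Asymptotic standard deviation of the EWMA statistic; the Poisson variance equals mu0.
    sigma = math.sqrt(smoothing / (2.0 - smoothing) * mu0)
    lcl, ucl = mu0 - width * sigma, mu0 + width * sigma
    z, stats, alarm = mu0, [], None  # the chart is started at the in-control mean
    for t, x in enumerate(counts):
        z = smoothing * x + (1.0 - smoothing) * z
        stats.append(z)
        if alarm is None and not (lcl <= z <= ucl):
            alarm = t
    return stats, alarm

# Five in-control observations followed by a sustained upward shift.
stats, alarm = ewma_chart([4] * 5 + [10] * 20, mu0=4.0)
print(alarm)
```

Because such a chart smooths only the counts themselves, it reacts mainly to mean shifts, which is the limitation that motivates the generalized charts described in the abstract.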
 Faymonville, M., Riffo, J., Rieger, J., Jentsch, C. (2023):
spINAR: Semiparametric and Parametric Estimation and Bootstrapping of Integer-Valued Autoregressive (INAR) Models.
R package version 0.1.0, CRAN.
Full code available on GitHub.
Abstract: Semiparametric and parametric estimation of INAR models including a finite sample refinement (Faymonville et al. (2022), doi:10.1007/s10260-022-00655-0) for the semiparametric setting introduced in Drost et al. (2009), doi:10.1111/j.1467-9868.2008.00687.x, different procedures to bootstrap INAR data (Jentsch, C. and Weiß, C.H. (2017), doi:10.3150/18-BEJ1057) and flexible simulation of INAR data.
 Aleksandrov, B., Weiß, C.H., Nik, S., Faymonville, M., Jentsch, C. (2023):
Modelling and Diagnostic Tests for Poisson and Negative-binomial Count Time Series.
Accepted for publication in Metrika (open access).
Abstract: When modelling unbounded counts, their marginals are often assumed to follow either Poisson (Poi) or negative binomial (NB) distributions. To test such null hypotheses, we propose goodness-of-fit (GoF) tests based on statistics relying on certain moment properties. By contrast to most approaches proposed in the count-data literature so far, we do not restrict ourselves to specific low-order moments, but consider a flexible class of functions of generalized moments to construct model-diagnostic tests. These cover GoF-tests based on higher-order factorial moments, which are particularly suitable for the Poi- or NB-distribution where simple closed-form expressions for factorial moments of any order exist, but also GoF-tests relying on the respective Stein’s identity for the Poi- or NB-distribution. In the time-dependent case, under mild mixing conditions, we derive the asymptotic theory for GoF tests based on higher-order factorial moments for a wide family of stationary processes having Poi- or NB-marginals, respectively. This family also includes a type of NB-autoregressive model, where we provide clarification of some confusion caused in the literature. Additionally, for the case of independent and identically distributed counts, we prove asymptotic normality results for GoF-tests relying on a Stein identity, and we briefly discuss how its statistic might be used to define an omnibus GoF-test. The performance of the tests is investigated with simulations for both asymptotic and bootstrap implementations, also considering various alternative scenarios for power analyses. A data example of daily counts of downloads of a TeX editor is used to illustrate the application of the proposed GoF-tests.
 Weiß, C.H. (2024):
Stein EWMA Control Charts for Count Processes.
Accepted for publication in Statistical Methods and Applications in Systems Assurance & Quality, Book Series “Advanced Research in Reliability and System Assurance”, CRC Press (arXiv preprint).
Abstract: The monitoring of serially independent or autocorrelated count processes is considered, having a Poisson or (negative) binomial marginal distribution under in-control conditions. Utilizing the corresponding Stein identities, exponentially weighted moving-average (EWMA) control charts are constructed, which can be flexibly adapted to uncover zero inflation, over- or underdispersion. The proposed Stein EWMA charts’ performance is investigated by simulations, and their usefulness is demonstrated by a real-world data example from health surveillance.
 Nik, S., Weiß, C.H. (2024):
Generalized Moment Estimators based on Stein Identities.
Accepted for publication in Journal of Statistical Theory and Applications, 2024 (open access).
Abstract: For parameter estimation of continuous and discrete distributions, we propose a generalization of the method of moments (MM), where Stein identities are utilized for improved estimation performance. The construction of these Stein-type MM-estimators makes use of a weight function as implied by an appropriate form of the Stein identity. Our general approach as well as potential benefits thereof are first illustrated by the simple example of the exponential distribution. Afterward, we investigate the more sophisticated two-parameter inverse Gaussian distribution and the two-parameter negative-binomial distribution in great detail, together with illustrative real-world data examples. Given an appropriate choice of the respective weight functions, their Stein-MM estimators, which are defined by simple closed-form formulas and allow for closed-form asymptotic computations, exhibit a better performance regarding bias and mean squared error than competing estimators.
 Faymonville, M., Riffo, J., Rieger, J., Jentsch, C. (2024):
spINAR: An R Package for Semiparametric and Parametric Estimation and Bootstrapping of Integer-Valued Autoregressive (INAR) Models.
Journal of Open Source Software 9(97), 5386 (open access).
Abstract: Although the statistical literature extensively covers continuous-valued time series processes and their parametric, non-parametric and semiparametric estimation, the literature on count data time series is considerably less advanced. Among the count data time series models, the integer-valued autoregressive (INAR) model is arguably the most popular one finding applications in a wide variety of fields such as medical sciences, environmentology and economics. While many contributions have been made during the last decades, the majority of the literature focuses on parametric INAR models and estimation techniques. Our emphasis is on the complex but efficient and non-restrictive semiparametric estimation of INAR models. The appeal of this approach lies in the absence of a commitment to a parametric family of innovation distributions. In this paper, we describe the need and the features of our R package spINAR which combines semiparametric simulation, estimation and bootstrapping of INAR models also covering its parametric versions.
 Faymonville, M., Jentsch, C., Weiß, C.H. (2024):
Semiparametric goodness-of-fit testing for INAR models.
arXiv preprint.
Abstract: Among the various models designed for dependent count data, integer-valued autoregressive (INAR) processes enjoy great popularity. Typically, statistical inference for INAR models uses asymptotic theory that relies on rather stringent (parametric) assumptions on the innovations such as Poisson or negative binomial distributions. In this paper, we present a novel semiparametric goodness-of-fit test tailored for the INAR model class. Relying on the INAR-specific shape of the joint probability generating function, our approach allows for model validation of INAR models without specifying the (family of the) innovation distribution. We derive the limiting null distribution of our proposed test statistic, prove consistency under fixed alternatives and discuss its asymptotic behavior under local alternatives. By manifold Monte Carlo simulations, we illustrate the overall good performance of our testing procedure in terms of power and size properties. In particular, it turns out that the power can be considerably improved by using higher-order test statistics. We conclude the article with the application on three real-world economic data sets.
 to be continued!
One-year project, funded by the Internal Research Funding programme (Interne Forschungsförderung, IFF2018) of HSU Hamburg.
Project results:
The IFF funding of the project “Modelldiagnostik für Zähldatenzeitreihen” (Model Diagnostics for Time Series of Counts) provided bridge financing for a research associate position, which was filled with an early-career researcher. Within this funding, analytical expressions for the asymptotic distribution of the mean of squares and the variance of the Pearson residuals were derived for special types of count processes, namely the so-called Poisson INAR(1) and Poisson INARCH(1) processes, which in turn enabled novel significance tests. The performance of the developed tests was examined by simulations, and all results were summarized, together with the funded early-career researcher, in the manuscript “Testing the Dispersion Structure of Count Time Series Using Pearson Residuals”. The manuscript was accepted for publication by “AStA Advances in Statistical Analysis”.
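For orientation, the Pearson residuals of a Poisson INAR(1) model can be computed as in the following minimal sketch. It relies on the standard conditional moments of this model, namely conditional mean α·x + λ and conditional variance α(1−α)·x + λ given the previous value x, where α is the thinning parameter and λ the innovation mean. The function names are illustrative assumptions, the parameters are treated as known rather than estimated, and the test decision itself (which requires the asymptotic distribution derived in the manuscript) is not reproduced here.

```python
import math

def pearson_residuals(x, alpha, lam):
    """Standardized one-step-ahead residuals under a Poisson INAR(1) model."""
    res = []
    for prev, cur in zip(x, x[1:]):
        cond_mean = alpha * prev + lam                  # E[X_t | X_{t-1} = prev]
        cond_var = alpha * (1.0 - alpha) * prev + lam   # Var[X_t | X_{t-1} = prev]
        res.append((cur - cond_mean) / math.sqrt(cond_var))
    return res

def mean_of_squares(res):
    """Mean of the squared residuals; close to 1 if the dispersion structure is adequate."""
    return sum(r * r for r in res) / len(res)

def variance(res):
    """Empirical variance of the residuals (divisor n)."""
    m = sum(res) / len(res)
    return sum((r - m) ** 2 for r in res) / len(res)

counts = [2, 2, 4, 3, 1, 2, 5, 3]
res = pearson_residuals(counts, alpha=0.5, lam=1.0)
print(mean_of_squares(res), variance(res))
```

Values of the mean of squares clearly above or below 1 hint at over- or underdispersion relative to the fitted model.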
With the help of the IFF-funded position, a substantially extended project proposal under the same title, “Modelldiagnostik für Zähldatenzeitreihen”, was prepared, which builds on the contents of the IFF proposal in essential points. The project proposal (individual research grant, Sachbeihilfe) was approved by the Deutsche Forschungsgemeinschaft (DFG) on 20 December 2019.
The following poster gives an overview of the IFF project.
Project duration:
July 2018 – June 2019.
Publications:
 Aleksandrov, B., Weiß, C.H. (2020). Testing the dispersion structure of count time series using Pearson residuals.
AStA Advances in Statistical Analysis 104(3), pp. 325–361.
Last modified: 21 June 2024