6 Conclusion

 Measuring democracy is not a straightforward task, but it is of paramount importance. To measure it, one requires “an operational definition of democracy precise enough to classify specific countries as we observe during particular historical periods” (Pérez-Liñán, 2017, p. 86). If this definition is not well thought out, the low conceptual standard can lead to unreasonable results: “In other words, gradations of democratic quality cannot be detected” (Geissel et al., 2016, p. 572). A similar dynamic applies to aggregation. One needs a way to aggregate the characteristics of democracy that yields reasonable results (Gründler & Krieger, 2016, 2021). This remains a major point of contention, as described in Section 2 and evidenced by the current literature (Gründler & Krieger, 2021). One can employ a high-standard concept of democracy and still obtain unreasonable results because of a problematic aggregation. Conversely, one can employ a well-theorized aggregation and still obtain unreasonable results because of a poor selection of characteristics.

 Conceptualization and aggregation are, hence, central issues in this debate (Gründler & Krieger, 2021). This contribution did not attempt to solve these issues outright. It did, however, attempt to employ and improve upon novel methods of aggregation, specifically machine learning in general and SVM regression in particular. In recent years, such tools have increasingly been employed in the attempt to measure democracy (Gründler & Krieger, 2016, 2020, 2021). One could argue, however, that they are still in their infancy in this field. This contribution showed the potential benefits of such methods relative to an established aggregation method and expanded on their potential by employing broader definitions of democracy than had hitherto been used.

 To that end, the aggregation function was isolated as a factor of influence on the indicator: a prominent index, the EDI, was selected, the SVM index was calculated from the same variables, and the two were compared. This comparison indicated that two variables, the only ones that were either manually coded or objective, had a far greater influence on the resulting EDI than the other variables. This happened despite overt efforts in the aggregation process to reduce their influence (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Coppedge, Gerring, Knutsen, Lindberg, Teorell, Marquardt, et al., 2021). Observations with low scores on the other variables received a relatively high overall EDI because these two characteristics had high scores. The SVM index, on the other hand, could use all manually coded, objective, and low-level indices to calculate the result, without needing prior aggregation into mid-level indices. This does not mean that every variable had the same effect, but that no arguably arbitrary decision in that regard had to be made a priori. This further confirms that the SVM aggregation method provides a benefit: it removes the need for arbitrary decisions about the unknown relationship between characteristics and shifts that responsibility to nonlinear optimization (Gründler & Krieger, 2016, 2021).
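
 As a rough illustration of this kind of aggregation, the following minimal sketch fits an SVM regression on labeled priming observations and then scores every observation. It is not the procedure that produced the indices discussed here: the data are synthetic, the cut-offs and hyperparameters are assumptions chosen for illustration, and scikit-learn's `SVR` stands in for whichever implementation is actually used.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic stand-in for low-level indicators scaled to [0, 1]
# (rows = country-years, columns = indicators); not real data.
n_obs, n_vars = 300, 8
quality = rng.uniform(0.0, 1.0, size=n_obs)  # latent level, unknown in practice
noise = rng.normal(0.0, 0.1, size=(n_obs, n_vars))
X = np.clip(quality[:, None] + noise, 0.0, 1.0)

# Priming data: only unambiguous observations receive a label
# (1 = clear democracy, 0 = clear autocracy). Cut-offs are hypothetical.
is_democracy = quality > 0.85
is_autocracy = quality < 0.15
primed = is_democracy | is_autocracy
y_primed = is_democracy[primed].astype(float)

# The RBF kernel lets the regression learn a nonlinear aggregation
# rather than fixing weights or a functional form in advance.
model = SVR(kernel="rbf", C=1.0, epsilon=0.05, gamma="scale")
model.fit(X[primed], y_primed)

# Score all observations, including the ambiguous ones, bounded to [0, 1].
svm_index = np.clip(model.predict(X), 0.0, 1.0)
```

 In this sketch, all indicator columns enter the regression directly; no mid-level indices or weights are specified beforehand.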

 The conceptualization issue, on the other hand, is harder to improve, as it is often contextual (Bjørnskov & Rode, 2020; Gründler & Krieger, 2021; Gutmann & Voigt, 2018). There are usually two counteracting interests. If a concept is too broad, it might conceptually overlap with the object of study. Furthermore, the broader a concept is, the more likely it is that there are gaps in the data and that the coverage of the resulting index is therefore restricted in terms of the number of countries and the timespan, which, as shown in Section 2, is also a detriment. However, if the conceptualization is too narrow, the index might be unable to identify variations in the gradation of democracy. Section 5 exemplified both of these issues by comparing SVM indices built on conceptions of democracy of different breadths to each other, as well as to the existing SVMDI (Gründler & Krieger, 2021), which has a narrow conception. This comparison took the form of three cases. They exemplified why, despite an improvement in aggregation, it is paramount to apply it to high-standard input variables. Without them, no matter how effective the aggregation might be, the index might be unable to identify variations in the gradation of democracy. The SVMDI overlooks serious issues that even the SVM index with the narrowest conception of democracy was able to capture. While the objective of the SVMDI was to avoid conceptual overlap, it also often gave completely unreasonable scores. Though covering all countries in the dataset does not fall under the purview of this contribution, many are notable examples of this dynamic. Beyond the cases mentioned in Section 5, Russia in 1999, for instance, received an SVMDI score only 0.05 short of that of a perfect democracy (Gründler & Krieger, 2021). This contradicted not only established democracy indicators (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021) but also real political issues at the time (Riasanovsky, 2022), as well as the indices developed herein.

 Despite the comparatively large number of variables, the SVM index based on all internally embedded regimes was capable of identifying variations in democratic quality when, for instance, the independence of the judiciary was restricted. This was shown by the cases of Hungary and, particularly, Poland. The SVM regression was able to analyze the many characteristics effectively, assess the quality of democracy, and remain sensitive to changes in individual regimes. The fully embedded SVM index, on the other hand, fared worse. While it also produced reasonable results that were often in line with the internally embedded SVM index, the externally embedded variables played a very small role in the result, and the coverage of the index had to be restricted substantially.

 This contribution thus showed that a machine-learning-based approach to measuring democracy can be applied to broader conceptualizations of democratic regimes and still effectively identify variations in democratic quality. Furthermore, its aggregation method has proven capable of aggregating a large number of characteristics without arbitrary a priori decisions on aggregation. While it addressed the aggregation concern, this contribution also showed that one cannot escape the issue of conceptualization. It is paramount to base the decision on which characteristics to include on a solid theoretical basis. An overly narrow approach might lose its ability to identify variations, and an overly wide approach might restrict the number of observations that can be analyzed. This contribution attempted to alleviate the issue of conceptual overlap by providing future researchers with the option to focus their conceptualization on specific regimes or contexts inherent to the concept of Embedded Democracy (Merkel, 2004). This somewhat modular approach allows researchers to isolate variables that do not overlap with their object of study, while still keeping as many characteristics as possible and without having to be overly concerned with how to aggregate them.

 The field of machine learning is improving rapidly, and future research could therefore contribute by employing new or alternative techniques in the effort to measure democracy. The role of the individual regimes of democracy established by Merkel (2004) could be investigated further, a limitation that this contribution did not address. This contribution was also limited by the fact that the SVM regression treated observations individually and did not consider that each of them belongs to a specific country and time series. A number of robustness checks have not been conducted, e.g., using alternative priming data or using incorrect labels. Furthermore, while there is consensus on the level of democracy of countries at the ends of the scale, there is disagreement on exactly how democratic or autocratic they are. The index presented here is by no means a definitive measure of democracy. However, this contribution attempted to improve upon existing measures and methods and will hopefully inspire future research to better identify variations in the quality of democracy.
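
 One way the incorrect-label robustness check mentioned above could be set up is sketched below. This is only an assumption about how such a check might look, not a check that was conducted: the data are synthetic, `fit_index` and `flip_labels` are hypothetical helper names, and the flip shares and hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.svm import SVR

def fit_index(X_primed, y_primed, X_all):
    """Fit an SVM regression on priming data and score all observations."""
    model = SVR(kernel="rbf", C=1.0, epsilon=0.05, gamma="scale")
    model.fit(X_primed, y_primed)
    return np.clip(model.predict(X_all), 0.0, 1.0)

def flip_labels(y, share, rng):
    """Return a copy of binary labels with a given share flipped (0 <-> 1)."""
    y_noisy = y.copy()
    n_flip = int(round(share * len(y)))
    idx = rng.choice(len(y), size=n_flip, replace=False)
    y_noisy[idx] = 1.0 - y_noisy[idx]
    return y_noisy

rng = np.random.default_rng(1)

# Synthetic indicators in [0, 1]; not real data.
n_obs, n_vars = 300, 8
quality = rng.uniform(0.0, 1.0, size=n_obs)
X = np.clip(quality[:, None] + rng.normal(0.0, 0.1, size=(n_obs, n_vars)), 0.0, 1.0)

# Hypothetical priming set: only the clearest cases are labeled.
primed = (quality > 0.85) | (quality < 0.15)
y_primed = (quality[primed] > 0.5).astype(float)

baseline = fit_index(X[primed], y_primed, X)

# Mean absolute change in the index for each share of corrupted labels.
drift_by_share = {}
for share in (0.05, 0.10, 0.20):
    noisy = fit_index(X[primed], flip_labels(y_primed, share, rng), X)
    drift_by_share[share] = float(np.mean(np.abs(noisy - baseline)))
```

 A small drift at moderate flip shares would suggest that the index does not hinge on a handful of priming labels; what counts as small is a judgment left to the researcher.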