1 Introduction

Measuring democracy poses several major challenges. The first, and arguably the most obvious, is how broad the concept should be. This question should not be overlooked, since it is crucial for related research (Gründler & Krieger, 2021; Merkel, 2004; Pérez-Liñán, 2017, 2020). Often it requires “an operational definition of democracy precise enough to classify specific countries as we observe during particular historical periods” (Pérez-Liñán, 2017, p. 86). Minimalist, i.e. narrow, approaches focus mainly on competitive and public elections for political office. At the other end are maximalist, i.e. broad, approaches that might include socioeconomic aspects, e.g. civil liberties, inequality, rule of law, and even economic metrics. In between lies a wide gamut of variations (Gründler & Krieger, 2021; Merkel, 2004; Munck & Verkuilen, 2002; O’Donnell, 2011). Conceptually speaking, all aforementioned approaches are equally valid, since there is little in the way of an objective method to evaluate them (Gründler & Krieger, 2021; Guttman, 1994; Munck & Verkuilen, 2002). Another issue relates to the object being researched. If, for instance, the research revolves around the relationship between democracy and the economy, a definition that includes economic metrics might not be appropriate, as it could skew results. This means that the answer as to which type of conceptualization is more appropriate often depends on context (Bjørnskov & Rode, 2020; Gründler & Krieger, 2021; Gutmann & Voigt, 2018).

Secondly, there remains the question of how the different dimensions relate to each other. Here there are essentially two approaches, namely formative and reflective (Gründler & Krieger, 2021; Teorell et al., 2019). The former assumes that every condition of democracy is necessary. For instance, if election results are completely fabricated or the right to candidacy is severely restricted, the degree of suffrage amongst the citizenry is not relevant (Gründler & Krieger, 2021; Teorell et al., 2019). This in turn means that a “low score on any of the component indices thus suppresses the value of the overall index” (Teorell et al., 2019, p. 81). The latter assumes that all conditions are the result of a common factor and hence partially interchangeable. Under this assumption, the aggregation rule should be additive and not multiplicative. Unlike in the formative approach, a single low value in the reflective approach has less of an impact on the resulting index. Much like the breadth of the concept, there is no objective answer to the question of which approach is best. Teorell et al. (2019) argue that a combination of the two alleviates issues with both, though recent research indicates that this might not be the case (Gründler & Krieger, 2020). This was explored further in Section 6 of this contribution. Both approaches have their own merits; however, the distinction should not be overlooked. From the perspective of providers of democracy indices, the choice of approach directly affects the aggregation of data down the line. For users, it is important to be cognizant of the approach used, because it affects how the data are interpreted (Gründler & Krieger, 2021; Teorell et al., 2019).
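The difference between the two aggregation rules can be illustrated with a minimal sketch. It assumes component scores on a 0 to 1 scale, an arithmetic mean as the additive rule and a plain product as the multiplicative rule; the function names and the values are hypothetical and serve only as an illustration, not as the aggregation used by any of the cited indices.

```python
from math import prod


def additive_index(components):
    """Arithmetic mean: components are treated as partially interchangeable."""
    return sum(components) / len(components)


def multiplicative_index(components):
    """Product: every component is treated as a necessary condition."""
    return prod(components)


# Hypothetical component scores on a 0-1 scale (illustrative values only),
# e.g. clean elections, right to candidacy, suffrage.
balanced = [0.8, 0.8, 0.8]
one_weak = [0.1, 0.9, 1.0]

for name, scores in [("balanced", balanced), ("one_weak", one_weak)]:
    print(
        name,
        round(additive_index(scores), 3),
        round(multiplicative_index(scores), 3),
    )
```

In the second case, the single low component pulls the multiplicative index close to zero, while the additive index remains above the midpoint of the scale. This is the suppression effect described by Teorell et al. (2019).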

Lastly, it is important to be cognizant of the resulting scale type. It can be dichotomous, ordinal, graded or continuous. Each of them has strengths and weaknesses. However, dichotomous scales lose a lot of discriminating power (Elkins, 2000; Gründler & Krieger, 2021). For instance, taking the continuous scales of the Varieties of Democracy (V-Dem) Electoral Democracy Index (EDI) (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021) and the Freedom House (FH) indicator (Freedom House, 2021) for the year 2020, a fourth of the 173 countries included in both datasets fall in the middle third of both scales.
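A share of this kind could be computed roughly as follows. The sketch assumes that each index is available as a mapping from country to score and that the scale bounds are known (here 0 to 1 and 0 to 100, which are assumptions for the purpose of illustration); the country labels and scores are hypothetical and are not taken from V-Dem or Freedom House.

```python
def in_middle_third(value, low, high):
    """True if value lies in the middle third of the scale [low, high]."""
    third = (high - low) / 3
    return low + third <= value <= high - third


def share_in_both_middle_thirds(scores_a, scores_b, range_a, range_b):
    """Share of countries covered by both indices that fall in the
    middle third of both scales."""
    common = scores_a.keys() & scores_b.keys()
    if not common:
        return 0.0
    hits = sum(
        in_middle_third(scores_a[c], *range_a)
        and in_middle_third(scores_b[c], *range_b)
        for c in common
    )
    return hits / len(common)


# Hypothetical scores; "E" is covered by only one index and is ignored.
index_a = {"A": 0.9, "B": 0.5, "C": 0.4, "D": 0.1, "E": 0.6}
index_b = {"A": 95, "B": 50, "C": 20, "D": 10}

share = share_in_both_middle_thirds(index_a, index_b, (0, 1), (0, 100))
print(share)
```

A dichotomous measure would have to assign every such country to one of two classes, which is where the loss of discriminating power arises.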

All aforementioned challenges play a role when deciding how to measure democracy. Results vary depending on the scope of the conceptualization, the aggregation, and the scaling (Elkins, 2000; Gründler & Krieger, 2016, 2020, 2021; Munck & Verkuilen, 2002). However, some of these concerns can be alleviated. For instance, as just alluded to, with growing concern over hybrid regimes, i.e. “political regimes that fall ‘somewhere in between’ full democracy and overt dictatorship” (Pérez-Liñán, 2020, p. 90), dichotomous measures often lack the discriminatory power to make the necessary distinctions. There is little debate on how to categorize countries at the extremes, e.g. Switzerland or the Democratic People’s Republic of Korea in 2020. But a dichotomous scale lacks the nuance to classify borderline cases (Pérez-Liñán, 2017, 2020). Furthermore, dichotomous measures are not inherently more reliable than, for instance, graded measures (Elkins, 2000; Gründler & Krieger, 2016, 2021). Lastly, dichotomous scales overlook the process of democratization over time due to their lack of gradation (Doucouliagos & Ulubasoglu, 2007; Gründler & Krieger, 2016). Hence, the choice of scale can be justified on objective grounds.

The same cannot necessarily be said for the conceptualization and the aggregation. As mentioned previously, the former is based on non-objective criteria, and the latter depends to an extent on the former. It is important to understand the relationship between the conditions for democracy; however, that relationship depends on which conditions are chosen. Furthermore, many of the current, prominent indices choose their aggregation mechanisms rather arbitrarily (Gründler & Krieger, 2016, 2021; Munck & Verkuilen, 2002). Building on the work by Gründler and Krieger (2016, 2021), this contribution attempted to partially overcome this issue by employing novel approaches to measuring democracy based on machine learning (ML). There are multiple advantages. For one, these sorts of tools have a more pronounced ability to analyze datasets with a large number of dimensions and features. Further, they are better equipped to recognize patterns in the data and learn from them, without being explicitly programmed beforehand (Gründler & Krieger, 2016; Samuel, 1959).

Building on the aforesaid research, this contribution employed Support Vector Machines (SVM). Their practical applications can hardly be overstated. As pointed out by Gründler and Krieger (2016, p. 89), SVM have been employed with promising results in a plethora of applications in a variety of scientific fields (Cortes & Vapnik, 1995; Gualtieri, n.d.; Guyon et al., 2002; Joachims, 2002; Orrù et al., 2012). In the social sciences, however, researchers have only relatively recently begun to understand and benefit from the use of machine learning (Grimmer, 2015; Grimmer et al., 2021; Molina & Garip, 2019). In this case, SVM provided a suitable strategy to find the usually subjective and inherently unknowable aggregation function, hence avoiding arbitrary assumptions and improving the general gradation quality of the democracy indicator. Machine learning algorithms in general, and SVM in particular, are designed to solve problems where the functional form is not known and where the functional relationship between input and output is not known to researchers (Gründler & Krieger, 2016; Steinwart & Christmann, 2008).

The most appropriate aggregation function was found by transferring the problem of aggregation to the context of optimization. This required two things: input data for all observations in the dataset, and a limited set of observations with a known output, which was used to train the algorithm. To that end, a set of countries that were indisputably held to be highly democratic or highly autocratic was selected as priming data. These were distinguished from other regimes by using a set of two commonly accepted indices, namely the Unified Democracy Score (UDS) (Marquez, 2022; Pemstein et al., 2010) and the EDI (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021). Countries that belonged to the upper or lower decile of either index’s scale were used as priming data. These data were validated in two ways. First, the priming data were compared to other established indices to check for any noticeable incongruencies (Gründler & Krieger, 2016, 2021). Second, where incongruencies arose, previous research and reports were used to confirm that the priming data did not contain any regimes whose classification was evidently unreasonable. The countries in the priming data, both democracies and autocracies, were also checked for representativeness to ensure that the selection did not exclude regimes with specific characteristics, such as regime type or durability. An SVM regression algorithm was then used to obtain the aggregation function (Gründler & Krieger, 2016, 2021).
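The selection of priming data by deciles can be sketched as follows. This is a minimal illustration under stated assumptions and not the procedure actually implemented: both indices are assumed to be rescaled to a 0 to 1 range, the function name select_priming and the observation keys are hypothetical, and observations that the two indices flag in opposite directions are left unlabeled, which is a simplification introduced here. The subsequent SVM regression step is not shown.

```python
def select_priming(uds, edi, cut=0.1):
    """Label observations in the top or bottom decile of either scale.

    Both inputs map an observation key to a score assumed to lie in [0, 1].
    Returns {key: 1.0} for clear democracies and {key: 0.0} for clear
    autocracies; all other observations are omitted.
    """
    labels = {}
    for key in uds.keys() | edi.keys():
        # Collect the scores available for this observation.
        scores = [index[key] for index in (uds, edi) if key in index]
        top = any(s >= 1.0 - cut for s in scores)
        bottom = any(s <= cut for s in scores)
        if top and not bottom:
            labels[key] = 1.0
        elif bottom and not top:
            labels[key] = 0.0
        # Flagged by neither index, or in opposite directions: unlabeled.
    return labels


# Hypothetical, rescaled scores for four observations.
uds_scores = {"X": 0.95, "Y": 0.50, "Z": 0.05}
edi_scores = {"X": 0.70, "Y": 0.55, "Z": 0.20, "W": 0.03}

priming = select_priming(uds_scores, edi_scores)
print(sorted(priming.items()))
```

The labeled observations would then serve as the training set with known output, while all remaining observations receive their score from the fitted aggregation function.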

The regression was applied to four datasets. One took the same variables as an existing index, namely the EDI. This was done to isolate the aggregation functions and assess their distinct effects on the overall indices. The other three encompassed variables that related to different breadths of conceptualization of democracy, following Wolfgang Merkel’s (2004) concept of Embedded Democracy, which offers key advantages. Due to its arguably modular character, it is possible to test different changes in conceptualization without altering the core features, i.e. the Electoral Regime. It also offers the opportunity to pinpoint whether specific partial regimes play a larger role in changes in the gradation of democracies. Further, it allows for the detection of changes in specific partial regimes that might be overlooked by more minimalistic approaches. Hence, the extent to which an SVM-based democracy index could effectively measure democracy and identify its different gradations was assessed by comparing the distinct SVM indices, not only to each other, but also to the existing Support Vector Machine Democracy Index (SVMDI) by Gründler and Krieger (2021), which was described further in Section 2 and upon which this contribution attempted to improve.

Some of the data used in this contribution have already been mentioned; however, the main data used to apply the SVM and to measure democracy’s partial regimes came from the V-Dem Dataset (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021). First, the dataset encompasses 202 countries in the period from 1789 to 2020. Though not every country-year combination was used, the wide range of data was an asset. Furthermore, it encompasses 470 unique indicators, 363 of which are coded from 1900 on (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021; Coppedge, Gerring, Knutsen, Lindberg, Teorell, Marquardt, et al., 2021; Gründler & Krieger, 2021). This contribution included only a small fraction of them, i.e. only those that related to the internal partial regimes and external environment as defined by Merkel (2004), and excluded aggregated metrics to avoid the aggregation issue mentioned previously. Further, both objective data and expert-based subjective data were included, which in turn satisfied Munck and Verkuilen’s (2002) guidelines for the use of regime characteristics (Gründler & Krieger, 2021). This contribution benefited from V-Dem’s transparency regarding their data and their coding (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021; Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Coppedge, Gerring, Knutsen, Lindberg, Teorell, Marquardt, et al., 2021; Gründler & Krieger, 2021).

This contribution hence used the aforementioned tools and assets in pursuit of answering the question: To what extent can a machine-learning-based approach to measuring democracy be applied to broader conceptualizations of democratic regimes? In that effort, this contribution isolated the aggregation function to assess its potential benefits and pitfalls in relation to traditional aggregation procedures. It also built upon Gründler and Krieger’s (2016, 2021) concept of the SVMDI by expanding the defining characteristics of democracy and modularizing different areas of democratic conceptualization. This was done to isolate characteristics of democracy that can be conceptually related to other fields of study, without changing the core of the minimalistic concept. Finally, this contribution also attempted to determine whether different breadths of democratic conceptualization, when aggregated via SVM regression, can reasonably detect variations in democratic quality.