4 Priming Data

The priming data used to train the models originated from the full V-Dem dataset (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021). This dataset was not only necessary for the selection process but also contained many of the other indicators used to check for consensus and representativeness, including the UDS (Marquez, 2022; Pemstein et al., 2010), which was used to select the observations (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021; Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021). Furthermore, for the consensus check as well as for controlling for institutional characteristics, information was extracted from the DD index and further data made available by Bjørnskov and Rode (2020). For the calculation of regime length, data was collected from the datasets provided by Geddes, Wright and Frantz (2014).

In total, the priming dataset contained information on 4643 different country-year observations. It spanned from 1919 to 2020 and included 135 countries, though not every country had an observation for each year. Not all observations could be included in the datasets used to check for representativeness, as the data on institutional characteristics and regime duration were less comprehensive (Bjørnskov & Rode, 2020; Geddes et al., 2014). The Bjørnskov and Rode (2020) dataset starts in 1950 and the Geddes, Wright and Frantz (2014) dataset extends only to 2014. However, since both encompassed a substantial share of the original dataset, they were still useful in assessing the representativeness of the priming data. The validation likewise did not cover the entire timespan for all comparisons, since, as shown in Section 2, different indicators cover different timeframes.

4.1 Consensus

To check the priming data for consensus, a binary classification was created. If a country-year observation met the requirements for a democratic regime, it was given a classification of one. Otherwise, it received a classification of zero. This was possible because the observations lay at either extreme end of the spectrum, and it made overlaps or disparities with other indicators easier to identify. These classifications were then compared with the BMR Index (Boix et al., 2013, 2018), the DD Index (Bjørnskov & Rode, 2020), the LIED (Skaaning et al., 2015), the Polity V Index (Marshall & Gurr, 2020), and the Freedom House Index (Freedom House, 2021, 2022).
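The comparison logic described above can be illustrated with a minimal sketch. It is not the code used for this contribution; the toy country-years, the dictionary layout, and the function name `consensus` are hypothetical and serve only to show how agreement can be computed separately for the democratic and autocratic groups on the observations coded by both sources.

```python
# Hypothetical toy data: keys are (country, year); values are 1 = democracy, 0 = autocracy.
priming = {("A", 1950): 1, ("A", 1951): 1, ("B", 1950): 0, ("B", 1951): 0, ("C", 1950): 0}
# None marks a country-year that the comparison indicator does not code.
other = {("A", 1950): 1, ("A", 1951): 1, ("B", 1950): 0, ("B", 1951): 1, ("C", 1950): None}


def consensus(priming, other):
    """Return agreement statistics on the country-years coded by both sources."""
    # Keep only observations that the comparison indicator actually codes.
    shared = [key for key in priming if other.get(key) is not None]
    disagreements = sorted(key for key in shared if priming[key] != other[key])
    result = {"n": len(shared), "disagreements": disagreements}
    # Agreement rate within each priming label (1 = democratic, 0 = autocratic).
    for label, name in ((1, "democratic"), (0, "autocratic")):
        group = [key for key in shared if priming[key] == label]
        agree = sum(priming[key] == other[key] for key in group)
        result[name] = agree / len(group) if group else None
    return result


stats = consensus(priming, other)
print(stats)
```

Reporting the two groups separately mirrors the structure of the comparisons in this section, where agreement on the democratic end and on the autocratic end is discussed individually.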

Boix–Miller–Rosato indicator (BMR)

Because BMR is a binary indicator, comparing the priming data with it was very simple. However, there was a caveat. For the purposes of this contribution the regular BMR democracy variable was not used, but rather a variant of it with omitted data. This was due to how BMR codes data. If there is a military occupation in a particular country, the regular BMR democracy indicator maintains the regime type that preceded the occupation (Boix et al., 2013, 2018). Czechoslovakia, for instance, is coded as a democracy during its occupation by Nazi Germany between 1940 and 1944 (Boix et al., 2013, 2018; Horáková, 2003). The variant instead codes occupied countries with NA values (Boix et al., 2013, 2018; Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021). The reason that this is not done for the main indicator by Boix et al. (2013, 2018) is to avoid coding democratic transitions at the end of military occupations.

The comparison subset of the data had a total of 1991 observations, 1989 of which were congruent. Only 0.1% of the data differed. All of the differing data came from the autocratic group, meaning that the priming data and the BMR indicator concurred on all democratic regimes. The differing observations related to Uganda in 1980 and Greece in 1944. Neither discrepancy was unreasonable, though neither case is entirely uncontroversial. In 1980 Uganda suffered a military coup in May and also held a multiparty election in December (Gründler & Krieger, 2021). Greece in 1944 was under Nazi occupation until October and a civil war broke out in December (Encyclopaedia Britannica, n.d.). Hence, it is not a stretch to conclude that different indicators, with different criteria, coded these cases differently. This was borne out by inconsistent data relating to these two cases across indicators (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021).

Democracy-Dictatorship Indicator (DD)

Much like BMR, the DD Index was also fairly easy to compare. There were a total of 2173 observations. As with BMR, there was full consensus on all regimes labelled as democratic. For autocratic regimes there were 4 differing observations, or just shy of 0.2% of the data. All of these referred to Benin between 1952 and 1955. The reason for the discrepancy was fairly straightforward: it was due to the DD indicator’s concept of democracy as formulated in Section 2. Benin in that time period satisfied the indicator’s four conditions for democracy. However, it only provided male suffrage, which would severely affect its democracy score under broader concepts of democracy. Hence, the discrepancy was not unreasonable.

Lexical Index of Electoral Democracy (LIED)

Because the LIED is ordinal, the comparison was slightly less straightforward. The comparison dataset contained 2019 observations. Democracies in the priming data all fell under the highest level on the LIED scale, i.e. “regimes with minimally competitive multiparty election and full male and female suffrage for legislature and executive” (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Gründler & Krieger, 2021; Skaaning et al., 2015). This means that there was again consensus on the democratic end of the spectrum.

On the authoritarian end the picture was more complex. The overwhelming majority of instances of autocracy, i.e. 96.4% of them, fell at or below the second lowest level on the scale, namely “regimes with no-party or one-party elections” (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Gründler & Krieger, 2021; Skaaning et al., 2015). Of the 56 remaining cases, 55 did not reach the fourth lowest level, i.e. “regimes with minimally competitive multiparty elections” (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Gründler & Krieger, 2021; Skaaning et al., 2015). The remaining observation, i.e. Nicaragua in 1933, much like Benin in the DD indicator, fell short of the highest category since it had yet to grant women the right to vote.

If only the two lowest categories are considered autocracies and only the highest a democracy, the consensus rate between the priming data and the LIED indicator stood at 97.2% of all observations. Even then, the categories in between would fall short of many conceptualizations of democracy, as described in Section 2. In short, there was a high degree of reasonable agreement between the priming data and the LIED.
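The strict mapping just described can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: it assumes that LIED levels are coded as integers from 0 to 6, and the names `lied_to_binary` and `consensus_rate` as well as the toy pairs are hypothetical. Intermediate levels map to neither class and therefore count as non-consensus.

```python
LIED_MAX = 6  # assumption: LIED levels are coded as integers from 0 (lowest) to 6 (highest)


def lied_to_binary(level):
    """Map an ordinal LIED level to a binary label under the strict rule."""
    if level <= 1:          # two lowest categories count as autocracy
        return 0
    if level == LIED_MAX:   # only the highest category counts as democracy
        return 1
    return None             # intermediate categories match neither label


def consensus_rate(pairs):
    """Share of (priming_label, lied_level) pairs on which both sources agree."""
    pairs = list(pairs)
    agree = sum(lied_to_binary(level) == label for label, level in pairs)
    return agree / len(pairs)


# Hypothetical toy observations: (priming label, LIED level).
toy = [(1, 6), (1, 6), (0, 0), (0, 1), (0, 3)]
rate = consensus_rate(toy)
print(rate)
```

Treating the intermediate levels as disagreement is the conservative choice, since any looser threshold on the autocratic side could only raise the consensus rate.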

Freedom House (FH)

The priming data was compared to FH’s categorical scale, which distinguishes countries as free, partly free, and not free. The combined dataset included 1207 observations. Following the trend established by previous indicators, all democratically labeled observations in the priming data were coded as free in the FH index.

Consensus was weaker for the regimes labeled autocratic. Though none were coded as free, just shy of 10% of autocratic regimes were coded as partly free, meaning there was a consensus among that subset of over 90%. Of the overall data, around 8.7% were given a combined average of the Political Rights and Civil Liberties variables between 5 and 5.5, meaning they were on the lower end of the full partly free range, i.e. from 5.5 to 3.0. Only 1.3% of the observations ranked above 5 and none above 4. These cases were categorized as autocracies by all of the other indicators used in this section, with the sole exception of Uganda in 1980, which has already been discussed. Hence, it is reasonable to conclude that there was considerable consensus between the priming data and this indicator as well, albeit to a lesser extent than with the indicators discussed so far.

Polity Index

The Polity V index is also ordinal and hence also nuanced. In its totality, it encompassed 2032 observations. Notably, though, 112 of these observations fell under one of the “standardized authority scores” (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Marshall & Gurr, 2020). These include countries with a systemic interruption due to foreign influence, cases of interregnum or anarchy, and countries in transition. Much like all other indicators, there was overwhelming consensus on the top end of the scale. All democracies in the priming data had a score of +9 or above, with the sole exception of France in 1946 which was coded as being in transition.

Again, much like previous indices, the situation was far more complex at the bottom end of the spectrum. Nearly 7% of all observations identified as autocracies in the priming data were coded as being part of the aforementioned standardized group. A vast majority of the remaining observations, i.e. around 91%, were given a Polity rating of -6 or lower. Around another 7.5% scored between -5 and -3, which would put them in the low end of the anocracy category as established by the Polity Project. However, 1.4% scored between 0 and +4 and one observation scored a +8. This last case relates to Greece in 1944 and has been discussed before. As mentioned in Section 2, the aggregation rule of the Polity IV and V indices is methodologically problematic. By weighting all regimes equally and using an additive rule, it produces doubtful classifications at the lower end of the scale (Gründler & Krieger, 2020; Munck & Verkuilen, 2002; Teorell et al., 2019), a conclusion that is supported by other indicators (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021). Apart from the cases of Greece in 1944 and Uganda in 1980, which have already been discussed, all other regimes selected as autocracies in the priming data and with a Polity score above 0 were not coded as democratic by any other indicator discussed in this section.
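The treatment of the standardized authority scores can be made concrete with a short sketch. It assumes the Polity convention of coding foreign interruption, interregnum, and transition as -66, -77, and -88, and the Polity Project bands of -10 to -6 (autocracy), -5 to +5 (anocracy), and +6 to +10 (democracy). The function name `polity_band` and the toy scores are hypothetical, and the sketch is not the code used for the comparison above.

```python
from collections import Counter

# Standardized authority scores as coded by the Polity Project.
SPECIAL_CODES = {-66: "interruption", -77: "interregnum", -88: "transition"}


def polity_band(score):
    """Assign a Polity score to a band, setting the special codes apart."""
    if score in SPECIAL_CODES:
        return "standardized"
    if score <= -6:
        return "autocracy"
    if score <= 5:
        return "anocracy"
    return "democracy"


# Hypothetical toy scores for observations labeled autocratic in the priming data.
scores = [-9, -7, -4, 2, 8, -88, -66]
bands = Counter(polity_band(score) for score in scores)

# Shares are computed among regularly scored observations only,
# i.e. after setting aside the standardized authority scores.
regular = sum(count for band, count in bands.items() if band != "standardized")
share_autocracy = bands["autocracy"] / regular
print(bands, share_autocracy)
```

Separating the special codes first avoids treating values such as -88 as if they were points on the -10 to +10 scale.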

Summary

Table 1 - Consensus between priming data and other indicators

Notes: This table summarizes the rate of overlap between the share of country-year observations that were labeled “autocratic” or “democratic” and classified as such by alternative indices. The Polity score does not include countries that were labelled with the “standardized authority scores” (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Altman, et al., 2021; Marshall & Gurr, 2020).

Table 1 summarizes the results of this section. It shows that there was a high degree of overlap, i.e. consensus, between the priming data and other indicators. There was full consensus for all observations at the top end of the spectrum. The same could not be said for the bottom end. However, despite there not being full consensus, the rate of overlap was high for all indicators. What little disagreement existed was not unreasonable, since, as pointed out in this section, the vast majority of these discrepancies could be traced either to uncertainties on how to code transitional regimes or to methodological differences between the indicators.

4.2 Representativeness

The priming dataset was checked for representativeness in terms of distinct institutional characteristics, as well as timespan. The timespan was calculated in two ways: once as the total ultimate duration of the regime, regardless of the year to which the observation pertains, and once as the duration of the regime up to the observation year. If a regime had not ended by the time the original dataset (Geddes et al., 2014) was published, the end year was taken to be 2014, the year of publication. The objective of this section was not to check whether all regime characteristics are equally represented in the data, but rather simply to confirm whether the characteristics were heterogeneous or not.
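The two duration measures can be sketched as follows. The sketch assumes that durations are simple differences between calendar years, which is not stated explicitly in the text, and the function name `durations` and the example years are hypothetical. The censoring year of 2014 follows the rule described above.

```python
CENSOR_YEAR = 2014  # end year assumed for regimes that were still ongoing


def durations(start_year, end_year, obs_year):
    """Return (total regime duration, duration up to the observation year) in years.

    end_year is None for a regime that had not ended; it is then censored
    at CENSOR_YEAR. Durations are plain year differences (an assumption).
    """
    end = CENSOR_YEAR if end_year is None else end_year
    total = end - start_year
    at_observation = obs_year - start_year
    return total, at_observation


# A regime that started in 1950 and was still ongoing, observed in 1980.
print(durations(1950, None, 1980))
# A regime that lasted from 1960 to 1975, observed in 1970.
print(durations(1960, 1975, 1970))
```

The first measure is constant across all observations of the same regime, whereas the second grows with each observation year, which is why both are reported separately.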

Autocracies

Countries that scored below the first decile of the scale of either the EDI (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021; Teorell et al., 2019) or the UDS (Marquez, 2022; Pemstein et al., 2010) were checked for type, electoral regime, and durability.

In terms of regime types, all of them were included in the dataset, though to different extents. Royal and military dictatorships were the best represented in the priming data, together encompassing nearly 80%. Communist dictatorships comprised 15.35% and civilian autocracies came last with just over 8%. That being said, civilian autocracies comprised 101 country-year observations.

In terms of electoral systems, 65.9% of them had no elections, whereas 34.1% did. The latter figure breaks down, expressed as shares of the same total, into 21.15% with non-democratic multi-party elections, 12.69% with single-party elections, and 0.25%, i.e. 4 observations, with democratic elections. It is important to point out that in all 4 of these observations suffrage was only available to the male part of the population. This was possible because Bjørnskov and Rode (2020) distinguish between free elections and suffrage. However, this was not problematic, as many conceptions of free and fair elections consider universal suffrage a characteristic of paramount importance.

In terms of duration, summary statistics were calculated for both the overall duration of the regime and the duration up until the year of observation. For both categories the minimum duration was less than a year. For the former, the first quartile was under 23 years, the median was 44, the third quartile was under 65, and the mean was just shy of 60. For the latter, the first quartile lay under 5 years, the median was 14, the third quartile was under 35, and the mean was nearly 34. For both categories the maximum was over 250 years, i.e. 272 and 258 respectively.
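Summary statistics of this kind can be reproduced along the following lines. The durations below are hypothetical toy values, not the priming data, and the quartile method (`method="inclusive"`) is an assumption, since the interpolation rule used for the figures above is not stated. The sketch also shows how a single long-lived regime can pull the mean above the third quartile.

```python
import statistics


def summarize(durations):
    """Return minimum, quartiles, mean, and maximum of a list of durations."""
    q1, median, q3 = statistics.quantiles(durations, n=4, method="inclusive")
    return {
        "min": min(durations),
        "q1": q1,
        "median": median,
        "q3": q3,
        "mean": statistics.mean(durations),
        "max": max(durations),
    }


# Hypothetical regime durations in years, with one extreme outlier.
toy_durations = [0, 2, 5, 14, 20, 35, 258]
summary = summarize(toy_durations)
print(summary)
```

In a right-skewed distribution such as this toy example, the median is the more informative measure of a typical regime's duration, while the gap between mean and median indicates the weight of the outliers.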

Democracies

Countries that scored above the ninth decile of the scale of either the EDI (Coppedge, Gerring, Knutsen, Lindberg, Teorell, Alizada, et al., 2021; Teorell et al., 2019) or the UDS (Marquez, 2022; Pemstein et al., 2010) were checked for form of government, number of parliamentary chambers, voting system and durability.

For the form of government, the distribution was not uniform, but there was a substantial number of observations for all three options. Nearly 67% of observations were parliamentary democracies, while mixed, i.e. semi-presidential, and presidential democracies encompassed 16.9 and 16.25 percent of observations, respectively. The number of chambers was split almost evenly: nearly 49% of legislatures were unicameral, whereas just over 51% were bicameral. The voting systems, on the other hand, were more unequally distributed. While just shy of 91% of systems were proportional, only a little over 9% of them were majoritarian.

In terms of duration, democratic regimes also varied substantially, though not as extremely as autocracies. The total length varied from 21 to 143 years; the first quartile was below 68, the median was 106, the third quartile was below 128, and the mean was 101. The duration of the regime at the time of the observation ranged from 7 to 143 years. The first quartile was below 56, the median was 80, the third quartile was below 106, and the mean was 81.

Summary

All regime characteristics were to varying extents included in the priming dataset. Though said characteristics were not always equally distributed, neither the democracies nor the autocracies in the dataset were a homogeneous group, at least in the terms established here. In the one shared characteristic, i.e. duration, they differed both internally and in relation to each other. The spread of regime duration varied substantially in both, though the overall variation was larger for autocracies. That being said, while the durations of democracies were fairly evenly distributed, since mean and median have similar values for both types of length, autocracies overall tended to be shorter-lived. For autocracies, the mean for both types of duration came close to the respective third quartile. This indicated that while some outliers pulled the mean up, the vast majority of autocracies did not last long, with a duration at the time of observation under 35 years for three quarters of the data and an overall duration under 65. While these caveats are important to point out, and indicate that democratic regimes were generally more stable, they also show that neither democracies nor autocracies were homogeneous in terms of their duration.