GWS 2024: Tinker, Tailor, Soldier, Poor Man

By Jon Freitag

While we wait for the 2025 survey to run its course, there is still more to discuss from the 2024 edition. If you have yet to complete the 2025 survey, please do so! You've got until August 31st, 2025.

Today's analysis examines four wargaming traits. The aim is to reduce these to a few identifiable and meaningful wargaming profiles based upon survey responses. The statistical technique used to explore relationships and correlations between these four variables is Multiple Correspondence Analysis (MCA). MCA extends Correspondence Analysis (CA), which is typically used for two categorical variables, to more than two variables. It is a powerful exploratory tool for summarizing and visualizing relationships in datasets with several categorical variables, helping to uncover patterns and associations that might otherwise stay hidden. MCA has been used in a number of past analyses of the Great Wargaming Survey.

Very briefly, the key points of MCA are:

  • Purpose: MCA helps to detect and represent underlying structures in complex categorical data, making it easier to interpret relationships between variables and categories.
  • How it works: It transforms categorical data into a numerical format (indicator matrix), then applies dimensionality reduction (similar to Principal Component Analysis for quantitative data) to project the data into a lower-dimensional space.
  • Output: The results are often visualized as maps or plots, where similar categories and individuals are positioned close together, revealing associations and clusters.
  • Applications: Widely used in social sciences, marketing, and survey analysis to explore patterns in responses, profiles, or preferences.
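To make the "how it works" point concrete, here is a minimal sketch of MCA computed as a correspondence analysis of the indicator matrix. It assumes NumPy is available. The function names `indicator_matrix` and `mca`, the value labels, and the six toy respondents are all invented for illustration; this is not the code or the data behind the analysis in this post.

```python
import numpy as np

def indicator_matrix(records, variables):
    """One 0/1 column per (variable, value) pair that occurs in the data."""
    labels = [(var, value)
              for var in variables
              for value in sorted({rec[var] for rec in records})]
    Z = np.array([[1.0 if rec[var] == value else 0.0 for var, value in labels]
                  for rec in records])
    return Z, labels

def mca(Z, n_dims=2):
    """Correspondence analysis of the indicator matrix Z via an SVD."""
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row (respondent) masses
    c = P.sum(axis=0)                    # column (category) masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates on the first n_dims dimensions
    row_coords = (U[:, :n_dims] * s[:n_dims]) / np.sqrt(r)[:, None]
    col_coords = (Vt.T[:, :n_dims] * s[:n_dims]) / np.sqrt(c)[:, None]
    inertia = s[:n_dims] ** 2
    return row_coords, col_coords, inertia

# Toy respondents (hypothetical values, not survey data)
variables = ["CRAFTSMAN", "RESEARCH", "Collection_Size", "Game_Freq"]
records = [
    {"CRAFTSMAN": 4, "RESEARCH": 5, "Collection_Size": "large", "Game_Freq": "weekly"},
    {"CRAFTSMAN": 3, "RESEARCH": 4, "Collection_Size": "small", "Game_Freq": "monthly"},
    {"CRAFTSMAN": 2, "RESEARCH": 4, "Collection_Size": "small", "Game_Freq": "monthly"},
    {"CRAFTSMAN": 4, "RESEARCH": 5, "Collection_Size": "large", "Game_Freq": "weekly"},
    {"CRAFTSMAN": 3, "RESEARCH": 3, "Collection_Size": "medium", "Game_Freq": "rarely"},
    {"CRAFTSMAN": 2, "RESEARCH": 3, "Collection_Size": "medium", "Game_Freq": "rarely"},
]

Z, labels = indicator_matrix(records, variables)
row_coords, col_coords, inertia = mca(Z)
for (var, value), (x, y) in zip(labels, col_coords):
    print(f"{var}={value}: dim1={x:+.3f}, dim2={y:+.3f}")
```

The category coordinates in `col_coords` are what end up plotted on an MCA map: values that tend to be chosen by the same respondents land close together. Dedicated statistical packages add refinements this sketch leaves out.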

The questions pulled from the survey and used in this study are:

  1. Do you consider yourself mostly a historical, or more a fantasy / sci-fi wargamer on a scale of '0' (pure historical gamer) to '6' (purely fantasy/sci-fi gamer)?
  2. How do you rate yourself as a craftsman/woman on a scale of '1' (terrible) to '5' (great)? Variable name = CRAFTSMAN with values 1-5.
  3. On a scale of '1' (not interested) to '5' (deeply interested), how much do you research the (fictional or not) background to your game? Variable name = RESEARCH with values 1-5.
  4. How many painted figures do you have in your collection?
  5. How often do you currently game?

The variables under consideration are Craftsman, Research, Collection Size (Collection_Size), and Gaming Frequency (Game_Freq). Will any identifiable patterns emerge from these data manipulations? Well, let's see.

Historical wargamers

To begin, only survey respondents whose primary interest is historical wargaming ('0' or '1' in question 1) are included. The total number of respondents in the sample is 1,652. With that criterion set, the frequency counts of each variable and its values are illustrated in the following four bar graphs:
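The filtering and counting step behind those bar graphs might look something like the sketch below. The rows are made up, and `GENRE` is a hypothetical name for question 1 (the survey's actual variable name is not given here); only the '0' or '1' criterion comes from the text above.

```python
from collections import Counter

# Made-up responses for illustration; GENRE is a hypothetical name for question 1.
responses = [
    {"GENRE": 0, "CRAFTSMAN": 3, "RESEARCH": 5},
    {"GENRE": 1, "CRAFTSMAN": 4, "RESEARCH": 4},
    {"GENRE": 3, "CRAFTSMAN": 2, "RESEARCH": 3},
    {"GENRE": 6, "CRAFTSMAN": 5, "RESEARCH": 2},
    {"GENRE": 0, "CRAFTSMAN": 3, "RESEARCH": 4},
]

def historical_only(rows):
    """Keep respondents who answered '0' or '1' on the historical/fantasy scale."""
    return [row for row in rows if row["GENRE"] in (0, 1)]

def frequency_counts(rows, variable):
    """Count how many respondents chose each value of one variable."""
    counts = Counter(row[variable] for row in rows)
    return dict(sorted(counts.items()))

historical = historical_only(responses)
for variable in ("CRAFTSMAN", "RESEARCH"):
    print(variable, frequency_counts(historical, variable))
```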

While paired comparisons between any two variables can be useful, all four of the variables need to be considered simultaneously to extract any meaningful patterns and relationships between wargamers and their tendencies. To produce enough separation between values, outlier removal, applied iteratively, can be a useful technique. Upon first inspection (see MCA: Initial), many of the data points are compressed into the lower left of the graph. This makes it a good candidate for testing select outlier removal techniques.

In the first iteration of outlier removal, two outliers are removed. They are Research1 and Collection Size of 20,001-25,000.

After Iteration 1, values are still compacted into the lower left quadrant of the graph without much separation. In Iteration 2, three values are removed as outliers. These are Research2, Craft1, and Collection Size 0-100. Now, these three could have been removed in Iteration 1, but I kept them in to illustrate the process.

After Iteration 2 of outlier removal, the spread between values is improving but Craft5 emerges as an outlier. Iteration 3 removes Craft5.

Having completed three outlier removal iterations, the spread between values is improved with enough separation and distinction to stop and assess the results. The next step is to move on to the analysis and interpretation of the resulting graph.
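In this post the outliers were picked by inspecting each graph. For readers who would rather have a numeric rule of thumb, one possible approach (an assumption of mine, not the method used above) is to flag any value whose distance from the origin is several times the median distance. The function name `flag_outliers`, the `factor` threshold, and the coordinates are all invented for illustration.

```python
import math
from statistics import median

def flag_outliers(coords, factor=3.0):
    """Return labels whose distance from the origin exceeds factor x the median distance."""
    dist = {label: math.hypot(x, y) for label, (x, y) in coords.items()}
    cutoff = factor * median(dist.values())
    return sorted(label for label, d in dist.items() if d > cutoff)

# Invented coordinates for illustration only; NOT taken from the survey output.
coords = {
    "A1": (0.1, -0.2),
    "A2": (-0.3, 0.1),
    "A3": (0.2, 0.3),
    "B1": (-0.1, -0.1),
    "B2": (2.5, 3.0),
}

print(flag_outliers(coords))
```

Each iteration would then drop the flagged values, rerun the MCA, and check the new graph, stopping once the remaining values are spread out enough to read.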

Using the origins of both dimensions (1 and 2), four quadrants are delineated. By studying each of the four quadrants, does any underlying inference emerge to help classify each of these spaces under a meaningful grouping label?
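Splitting the graph at the origin amounts to looking at the sign of each value's coordinate on the two dimensions. A small sketch, using the usual counterclockwise quadrant numbering (I to IV) and invented coordinates; the helper names `quadrant` and `group_by_quadrant` are hypothetical, and no quadrant is tied to a profile label here:

```python
from collections import defaultdict

def quadrant(x, y):
    """Quadrant number by sign of the coordinates on dimensions 1 (x) and 2 (y)."""
    if x >= 0 and y >= 0:
        return "I"
    if x < 0 and y >= 0:
        return "II"
    if x < 0 and y < 0:
        return "III"
    return "IV"

def group_by_quadrant(coords):
    """Collect value labels under the quadrant they fall in."""
    groups = defaultdict(list)
    for label, (x, y) in coords.items():
        groups[quadrant(x, y)].append(label)
    return {q: sorted(labels) for q, labels in sorted(groups.items())}

# Invented coordinates for illustration only; NOT taken from the survey output.
coords = {
    "A1": (0.4, 0.6),
    "A2": (-0.5, 0.2),
    "B1": (-0.3, -0.7),
    "B2": (0.8, -0.1),
    "C1": (0.2, 0.1),
}

groups = group_by_quadrant(coords)
for q, labels in groups.items():
    print(q, labels)
```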

Overlaying wargaming terminology for troop experience onto each of these four quadrants seems to fit the model inference in a reasonable fashion. I use the terms Crack, Veteran, Regular, and Green to distinguish the attributes within each quadrant.

The loadings of variables and values into each quadrant present themselves counterclockwise, as shown by the green arrow:

  • Crack: Identified by high research (5) and craftsmanship (4) values, gaming more than once a week, and having massive painted armies (Poor Man!).
  • Veteran: Identified by large armies with weekly gaming. Little distinction from Regulars with respect to research values (3). No loading on craftsmanship.
  • Regular: Identified by high research values (4) and medium craftsman values (2,3). Gaming tends toward bi-monthly with painted armies of 100-500 figures.
  • Green: Identified by good-sized painted armies (501-2,500) and infrequent gaming. They are less associated with craftsmanship or research.

Interesting results and equally interesting groupings between the loadings within each quadrant. Always a surprise when data reveal their hidden, underlying tendencies. Where do I fit into this analysis? Well, I fit into the Crack classification quite closely with the exception that my craftsmanship rank is not likely up to '4' standards. I need to step up my game!

Where would you fit into this scheme, if at all?


I wonder if adding in rules source would add anything worthwhile into this analysis?
