Steam tags essentially do this now, although not very well.
https://partner.steamgames.com/doc/store/tags
It's similar! But yeah, Steam doesn't do it well. They still start from user judgements to define things like genre, and their list of items is way, way too short. They start with things like "action genre". An exploratory factor analysis would instead start from concrete items like "PC can sprint", "game features hats", "first person perspective", "player controlled driving", "future setting", "can choose PC hairstyle", etc., and there would be thousands of them. That's why data entry would be such a bitch and would need a team of people doing it, not just to input the initial list but to apply it to thousands of games. I'm basing the method loosely on the development of the five factor model of personality. That took... a while. Still, there are a couple of things that support the feasibility of my research design.
1. The initial work on the five factor model of personality was done manually, prior to widespread access to computing and before the existence of the Internet.
2. Even if we couldn't afford to hire a statistician, I ran a poll on Codex education levels a while ago and the majority had Master's degrees or better. Someone's bound to be genuinely statistically literate (not me, I'm barely literate).
3. We wouldn't need to crowdfund that much between ourselves to hire a team of Venezuelans to do the data entry. Venezuelans work for absolute peanuts farming Runescape gold.
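For anyone wondering what the analysis end would look like once the data exists, here's a minimal sketch of the first step only: turning yes/no item codings into an item-by-item correlation matrix, which is what a factor analysis then gets run on. The game names and the 0/1 values are made up for illustration, and the item names are just the examples from above. The extraction of factors itself would be done in a proper stats package and isn't shown.

```python
from math import sqrt

# Hypothetical item list (borrowed from the examples above).
ITEMS = [
    "pc_can_sprint",
    "first_person_perspective",
    "player_controlled_driving",
    "future_setting",
]

# Made-up codings: one row per game, one 0/1 value per item, in ITEMS order.
GAMES = {
    "game_a": [1, 1, 0, 1],
    "game_b": [1, 1, 0, 0],
    "game_c": [0, 0, 1, 0],
    "game_d": [0, 0, 1, 1],
    "game_e": [1, 1, 0, 1],
    "game_f": [0, 0, 1, 0],
}


def pearson(xs, ys):
    """Pearson correlation; on 0/1 data this is the phi coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # An item that every game (or no game) has carries no information.
        return 0.0
    return cov / sqrt(var_x * var_y)


def item_correlations(games):
    """Return a square matrix of correlations between item columns."""
    columns = list(zip(*games.values()))
    return [[pearson(a, b) for b in columns] for a in columns]


CORR = item_correlations(GAMES)

for name, row in zip(ITEMS, CORR):
    print(name.ljust(28), " ".join(f"{value:+.2f}" for value in row))
```

In this toy data, sprinting and first person perspective always show up together and never alongside driving, so those three would load on one factor. With thousands of items and thousands of games, the patterns would have to be found statistically rather than by eye. None of that matters until the games are actually coded, though, which is where hiring people for the data entry comes in.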
There are some issues with the data entry plan that I've yet to solve, however:
- Venezuelans speak Spanish. The only thing I can think of is to generate our list of items in Spanish, but that seems impractical. Translation software seems like a more feasible option, but it's not very reliable. Could the reliability be improved by keeping our items short? This would also limit our sample of games to those available in Spanish, but there are enough video games with Spanish localizations to still get a sample size in the thousands.
- Indians are another possibility and more likely to speak English, but I wouldn't hire an Indian to perform any task without direct supervision so I think that rules them out.
- Outsourcing the data entry to cheap overseas labor would require the workers to understand video game concepts. Mostly they won't.
- One possible solution to this would be to crowdsource the data entry like Steam is doing. Reddit has the userbase to do it. If each game got even a single respondent entering its data, we'd have our dataset, so it could work even with extremely low response rates.
- The issue with this is I don't know that people unrelated to the project would sit through that without compensation.
- Unless someone has a solution to this, we may actually have to do the data entry ourselves. I still think it's feasible. It would just take a couple of years (assuming a handful of interested people each entered a couple of games a week).
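A rough back-of-the-envelope check on that timeline, as a sketch only. The specific numbers are my own reading of the estimate above: "a handful" taken as 5 people, "a couple" as 2 games a week, "a couple of years" as 2. Swap in whatever figures seem more realistic.

```python
def games_entered(people, games_per_person_per_week, years, weeks_per_year=52):
    """Total games coded if everyone keeps up a steady weekly pace."""
    return people * games_per_person_per_week * weeks_per_year * years


def years_needed(target_games, people, games_per_person_per_week, weeks_per_year=52):
    """How long the same crew would need to reach a target sample size."""
    per_year = people * games_per_person_per_week * weeks_per_year
    return target_games / per_year


# Assumed figures, not measurements: 5 people, 2 games each per week, 2 years.
TOTAL = games_entered(people=5, games_per_person_per_week=2, years=2)
print(TOTAL)  # 1040

# Time to reach a hypothetical target of 3000 games at the same pace.
print(round(years_needed(3000, people=5, games_per_person_per_week=2), 1))
```

Under those assumptions a small crew lands at roughly a thousand games in two years, so a sample well into the thousands would need more people, a faster pace, or more time.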