More realistic player names, including international players!

Previously, player names in Basketball GM were generated based on a list of name frequencies in the US in 1990 published by the US Census Bureau. In some ways this was awesome - it was a huge list of names, so there was a lot of variety. However, the US population in 1990 does not exactly correspond to global basketball talent. There should be more African American names, and there should be international names from basketball-loving countries.

I never fixed this problem because I couldn't find any data that was nearly as good as the census data I used previously. But now I think I have a better solution: DraftExpress. DraftExpress is a website about the NBA draft. It has player profiles for basically every NBA prospect in recent history, even fringe guys like minor college players and role players in overseas leagues. That's a pretty good sample of the distribution of basketball talent, right? Maybe not perfect, but probably good enough to be better than the previous names list.

So I used my trusty wget to scrape draftexpress.com, and then I wrote a script to parse names and countries for all players in their database. After a little work to clean up the data (splitting names into first and last names while handling last names that contain spaces, like Nando De Colo; fixing typos in country names), I filtered the list of countries to get rid of those with fewer than 5 names because they would just become too repetitive. So sorry, Suriname, you and your 2 names are gone. That left me with 28,377 names from 85 countries. To generate a player, the game randomly picks a country and then randomly picks first and last names from that country.
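Here's a rough sketch in Python of the split, filter, and pick steps described above. It is not the actual scraping script or the game's code, just an illustration. The function names, the `MIN_NAMES` constant, and the `(full_name, country)` input format are all made up for this example. Two things in it are assumptions on my part: that everything after the first word counts as the last name, and that the country pick is weighted by how many names each country has.

```python
import random
from collections import defaultdict

MIN_NAMES = 5  # hypothetical constant: drop countries with fewer names than this


def split_name(full_name):
    """Split a full name into (first, last).

    Assumption: the first word is the first name and everything after it
    is the last name, so "Nando De Colo" becomes ("Nando", "De Colo").
    str.split() with no arguments also collapses stray extra spaces.
    """
    parts = full_name.split()
    if not parts:
        raise ValueError("empty name")
    return parts[0], " ".join(parts[1:])  # last is "" for one-word names


def build_name_pool(records):
    """Group names by country and drop countries with too few names.

    records is an iterable of (full_name, country) pairs.
    """
    pool = defaultdict(lambda: {"first": [], "last": []})
    for full_name, country in records:
        first, last = split_name(full_name)
        pool[country]["first"].append(first)
        pool[country]["last"].append(last)
    return {c: names for c, names in pool.items()
            if len(names["first"]) >= MIN_NAMES}


def generate_player(pool, rng=random):
    """Pick a country, then pick first and last names independently."""
    countries = list(pool)
    # Assumption: countries with more names are picked more often.
    weights = [len(pool[c]["first"]) for c in countries]
    country = rng.choices(countries, weights=weights, k=1)[0]
    first = rng.choice(pool[country]["first"])
    last = rng.choice(pool[country]["last"])
    # strip() tidies up the name when the last name is empty
    return f"{first} {last}".strip(), country
```

Calling `generate_player(build_name_pool(records))` returns a `(name, country)` pair. Because first and last names are drawn separately, you get combinations that never appeared in the source data, and a one-word name in the data can produce a player with no last name.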

Particularly cool things about the new names:

  1. No more leagues dominated entirely by people with mid-20th-century white names.
  2. Increased realism, as you see a good number of players from expected countries like Spain, Lithuania, etc.
  3. Every now and then, you'll have a Brazilian player with no last name (like Nene and many soccer stars).
  4. Rarely, there will be players from tiny countries. Like if you play 5000 seasons, you might see an Icelandic dude named Elvar Vilhjalmsson dominating shit. How cool will that be?

This is live now. Even in existing leagues, new draft prospects will be generated with this new naming method. And you can see the countries of all the players in your league by going to the Player Ratings page.