Analyzing your customer data

This is the first part in a series about analyzing your customer data.

GENDER

You can estimate the gender split of your customer base after the fact, using nothing more than the first names in your customer list. First get a names database, for example nam_dict.txt.bz2 from http://svn.php.net/viewvc/pecl/gender/trunk/data/nam_dict.txt.bz2?revision=275820&view=co then connect the data using Excel or your command line:

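# save your customer first names to ./customernames (one per line, lowercase, no whitespace)
# assumes the decompressed names database is saved as ./nam_dict
# keep only rows whose first 25 bytes are letters and spaces (drops rows with digits or punctuation)
cut -b1-25 nam_dict | grep -Ei '^[a-z ]+$' > nam_dict2
# bytes 1-2 hold the gender code, the name starts at byte 4
cut -b1-2 nam_dict2 > t1
cut -b4- nam_dict2 | tr '[:upper:]' '[:lower:]' | sed 's/[[:space:]]*$//' > t2
# join needs both inputs sorted on the name
paste t2 t1 | sort -k1,1 > nam_tsv
sort -k1,1 customernames | join - nam_tsv | cut -d' ' -f2 | sort | uniq -c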

SAMPLE OUTPUT
5711 F
5624 M
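
One thing the counts above do not show is how many customers were left out: join silently drops every first name that has no match in the names table. Here is a minimal sketch on toy data to check that. The file names (demo_customers, demo_names and so on) and the names in them are made up for illustration and do not come from nam_dict. The -v 1 option tells join to print the lines of the first file that found no partner.

```shell
# toy stand-ins for ./customernames and ./nam_tsv (hypothetical data)
printf 'anna\nzorg\njohn\nanna\n' > demo_customers
printf 'john\tM\nanna\tF\n' > demo_names

# join needs both inputs sorted on the name
sort -k1,1 demo_customers > demo_customers_sorted
sort -k1,1 demo_names > demo_names_sorted

# matched customers, counted by gender code (here: 2 F and 1 M)
join demo_customers_sorted demo_names_sorted | cut -d' ' -f2 | sort | uniq -c

# customers whose first name is not in the names table (here: zorg)
join -v 1 demo_customers_sorted demo_names_sorted | sort | uniq -c
```

If the second list is long, the M/F split only describes the part of your customer base that the names database covers.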