You may know this problem: Your boss made you build several data centers around the globe with a bunch of NetScalers in the mix to load balance services across the various locations using GSLB (Global Server Load Balancing). But when it comes to configuring a static proximity geo IP database to help with that not too easy to understand dynamic proximity feature you notice most of these databases are commercial and you are out of budget. Luckily though, there are several free versions of geo IP databases out there, which reportedly work quite well. Before using one of those, you should carefully review the license terms. Some aren’t necessarily free for commercial use…

Moreover, these free versions are very popular, well maintained and were frequently updated (I say were because with the assignment of the last IP block by RIPE earlier in 2012 there shouldn’t be too many changes to the databases anymore) . So it’s fair to say that many of our clients who are using these databases are very satisfied with them.

Why is a database with IP addresses and address blocks necessary for such a setup? GSLB responds to DNS requests for a domain name with an IP address of a member service. Which service IP is returned is dependent on the load-balancing algorithm used, for example least connection, simple round robin or more commonly used, proximity to the client (or the clients local DNS to be precise). For proximity based GSLB, when a client sends a DNS request, the system determines the best suitable site and returns its IP to the client. In the process of ascertaining that decision, either dynamic proximity logic (CTX130154) is used, static proximity (CTX130155) or a mix of both.

 

Introduction

In this article I’ll show you how to import the GeoLite®City database from MaxMind®, Inc. The nice people from MaxMind offer different versions of free IP databases, e.g. a country- based and a city-based version, where the city version claims to have an “accuracy of over 99.5% on a country level and 78% on a city level for the US within a 40 kilometer radius.” (I’d like to stress the point that neither Citrix nor me personally have any favorite free IP database; for me, MaxMind was just the first hit on an Internet search and the license allowed to use it for the purpose at hand)

While most of our customers are satisfied with the country version, on a recent project I’ve been asked to import the city-based version. To import databases the Citrix article CTX116335 is good way to start. It shows the various formats the NetScaler accepts per default.

 

Formats needed and available

From the article I learned that the following format should be correct for GeoLite-city

“IP from (decimal #)”,”IP to (decimal #)”,”Location ID”,”CSHN”, “RC”, “City”<

The issue is though, when downloading and unpacking the correct archive from MaxMind’s website it contains two CSV files with the following content:

GeoLiteCity-Blocks.csv
| locId | country | region | city | postalCode | latitude | longitude | dmaCode | areaCode

GeoLiteCity-Location.csv
| startIpNum | endIpNum | locId

What needs to be done apparently is to join both files into one. From the location file we need all three values, followed by second to fourth value from the blocks file. Knowing your way around in Excel this should be an easy task you might think, unfortunately though the files are just too big to be opened by Excel (the contain >200.000 lines). Hence, it was time to start up my database server (MySQL) and create a joined CSV file with it.
 
Joining Location and Blocks CSV files with MySQL

First step to work with the geo data with MySQL is to create the database and tables. Below example is partly from MaxMind’s webpage, changed to my needs and works in my environment.

CREATE DATABASE GeoLiteCity;
USE GeoLiteCity;
CREATE TABLE blocks (startIpNum int(10) unsigned NOT NULL,endIpNum int(10) unsigned NOT NULL,locId int(10) unsigned NOT NULL,PRIMARY KEY (endIpNum));
 
CREATE TABLE location(locId int(10) unsigned NOT NULL,country char(2) NOT NULL,region char(2) NOT NULL,city varchar(50),postalCode char(5) NOT NULL,latitude float,longitude float,dmaCode integer,areaCode integer,PRIMARY KEY (locId)); 

In the next step the two files can be loaded into the previously created tables. For this we utilize the LOAD DATA function, provide the correct path to the CSV files and load them separated by comma into the respective tables.

LOAD DATA LOCAL INFILE ‘/tmp/GeoLiteCity_20121002/GeoLiteCity-Blocks.csv’ INTO TABLE GeoLiteCity.blocks FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘\”‘ LINES TERMINATED BY ‘\n’;
LOAD DATA LOCAL INFILE ‘/tmp/GeoLiteCity_20121002/GeoLiteCity-Location.csv’ INTO TABLE GeoLiteCity.location FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘\”‘ LINES TERMINATED BY ‘\n’;

As mentioned earlier the goal is to get one joined CSV file out of the data from both tables. MySQL supports writing result sets from SQL queries into files on disk in the very same way as it allows loading data from files into tables. Below command should do the trick, however it requires certain permissions on MySQL’s site in order to work.

SELECT b.startIpNum, b.endIpNum, b.locId, l.country, l.region, l.city INTO OUTFILE ‘/tmp/out.csv’ FIELDS TERMINATED BY ‘,’ OPTIONALLY ENCLOSED BY ‘”‘ LINES TERMINATED BY ‘\n’ FROM blocks b INNER JOIN location l ON (b.locId=l.locId) ORDER BY b.startIpNum;

To get around the configuration changes needed to make the aforementioned command working you can simply put the query from above (without the outfile option of course) into a file and use standard UNIX tools to export the data.

Query to join data from both tables based on their location ID:

SELECT b.startIpNum, b.endIpNum, b.locId, l.country, l.region, l.city FROM blocks b INNER JOIN location l ON (b.locId=l.locId) ORDER BY b.startIpNum;

Once you store this command in a file, the following two commands – connected by a pipe – will create a CSV file with the fields in the correct order for NetScaler’s location database engine.

mysql GeoLiteCity –p<> < GeoLiteCity_join.sql | awk ‘{print $1″,”$2″,”$3″,”$4″,”$5″,”$6′} > /tmp/myGeoDB.csv

 

Importing the file to NetScaler

Having completedthese steps and copied the file to the NetScaler’s file system, the following commands will import the data to the NetScaler location database.

NS> add locationfile /var/GeoIP/myGeoDB.csv -format Geoip-City

After a successful import the command “show locationparameter” will show you which file is currently active, how many entries are in the database and a lot more additional information. Due to the size of the database the NetScaler GUI will probably not be able to display the location data because the Java VM has a memory limitation. If that’s the case, the following command line tool on NetScaler will help you to test the database:

# nsmap -d -t

The database imported with the steps in this article is relatively large and hence will need some system memory and NetScaler naturally will need longer to lookup entries compared to smaller IP databases. The question should be asked and of course answered by the architect of such a solution whether or not the gained accuracy is worth the performance impact (even if it’s very little) and problematic handling of the files due to the sheer size. The closer your data centers are to each other the more accurate the IP database should be. With that in mind, for most global deployments there shouldn’t be the need to use the very accurate city version.

 

This code is provided to you “as is” with no representations, warranties or conditions of any kind. You may use and distribute it at your own risk. CITRIX DISCLAIMS ALL WARRANTIES WHATSOEVER, EXPRESS, IMPLIED, WRITTEN, ORAL OR STATUTORY, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NONINFRINGEMENT. Without limiting the generality of the foregoing, you acknowledge and agree that (a) the software application may exhibit errors, design flaws or other problems, possibly resulting in loss of data or damage to property; (b) it may not be possible to make the software application fully functional; and (c) Citrix may, without notice or liability to you, cease to make available the current version and/or any future versions of the software application. In no event should the code be used to support of ultra-hazardous activities, including but not limited to life support or blasting activities. NEITHER CITRIX NOR ITS AFFILIATES OR AGENTS WILL BE LIABLE, UNDER BREACH OF CONTRACT OR ANY OTHER THEORY OF LIABILITY, FOR ANY DAMAGES WHATSOEVER ARISING FROM USE OF THE SOFTWARE APPLICATION, INCLUDING WITHOUT LIMITATION DIRECT, SPECIAL, INCIDENTAL, PUNITIVE, CONSEQUENTIAL OR OTHER DAMAGES, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. You agree to indemnify and defend Citrix against any and all claims arising from your use, modification or distribution of the code.