readme.txt
----------

These files accompany the presentation that's at
http://www.uoregon.edu/~joe/missing-half/missing-half.ppt (or .pdf)

The files provided here are as-is, with no warranty. You will need to
evaluate the suitability of these files for your particular purposes
yourself.

get-data.sh     retrieves data files via rsync from the I2 netflow
                data archive

                this sample file pulls an hour's worth of data for
                each of the router nodes; the geographic node
                structure of the file system is preserved

                after running this file, you'll have a series of
                flow-tools format data files, with files in each of
                nine geographic directories (ATLA CHIC HOUS KANS LOSA
                NEWY SALT STTLng WASH)

export.sh       uses the flow-tools flow-export command to export
                data files in CSV format

                it is assumed that this file is run from within each
                geographic node directory

                after running this file, you'll be left with
                comma-separated value (CSV) data files, one for each
                original flow-tools format file

The following text files use SAS (see http://www.sas.com/):

formats.sas     builds SAS formats, including a map of AS numbers to
                AS names, protocol numbers to protocol names, etc.

                this file should be run prior to running the
                import.sas file

                note that not all ASNs are defined in the file; you
                may need to use whois to extend the ASN definitions
                to meet your needs

import.sas      reads the CSV files into SAS, creating a SAS
                permanent data set

                we provide a sample for ATLA, but you'd want to
                tailor and run this code in each of the geographic
                node directories

analyze.sas     reads the SAS permanent data set created by
                import.sas, applies classification rules, and creates
                a new SAS permanent data set with the classification
                data

                a sample is provided for ATLA, but you'd want to
                tailor and run this code in each of the geographic
                node directories

                note that tailoring includes (if you're interested in
                a de-duped network-wide view) tweaking the interfaces
                which are deleted for each router node; to tweak them
                appropriately, you'll need to retrieve interface
                mapping information

get-nfilter.sh  shows the process of retrieving interface information

combine.sas     combines the SAS permanent data sets created for each
                geographic region

reanalyze.sas   demonstrates a different system of categorization

Questions/concerns? I'm at joe@oregon.uoregon.edu
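
For readers who want to see the shape of the export step described for
export.sh above, here is a minimal sketch. It is not the export.sh
shipped with these files. It assumes that the flow-tools data files in
a node directory have names starting with "ft-", that flow-export reads
a flow-tools file on standard input, and that "flow-export -f2" selects
ASCII CSV output (check the flow-export man page on your system). The
function name export_node_dir and the EXPORT_CMD variable are
hypothetical names introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a per-node export loop; NOT the author's export.sh.
# Assumption: flow-tools data files are named ft-* (adjust the pattern if not).
# Assumption: "flow-export -f2" writes ASCII CSV and reads its input on stdin.

# EXPORT_CMD may be overridden, e.g. for a dry run without flow-tools installed.
EXPORT_CMD="${EXPORT_CMD:-flow-export -f2}"

export_node_dir() {
    # $1: directory holding the flow-tools files for one router node
    dir="$1"
    for f in "$dir"/ft-*; do
        # Skip the unexpanded pattern (no matches) and anything not a regular file.
        [ -f "$f" ] || continue
        # Skip CSV files left behind by an earlier run.
        case "$f" in *.csv) continue ;; esac
        # Write one CSV per input file, next to the original.
        $EXPORT_CMD < "$f" > "$f.csv" || echo "export failed: $f" >&2
    done
}
```

Under those assumptions, you would define the function and then call it
once per node directory, for example "export_node_dir ATLA", which keeps
the one-CSV-per-original-file layout that import.sas expects to read.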