Playing with R

August 1, 2006

I’m doing some data analysis at the moment, and I’ve gone through several toolsets in search of a combination that gives me power, expressiveness and charting capability. The raw data comes out of our trading system’s RDB. I started off with SQL queries, but soon found that the slowness of complex queries and the arcana of SQL were holding me back. So I exported the data as CSV and set to work with Excel. I made some more progress, but didn’t want to resort to VBA for the ad hoc slice-and-dice coding.

So then I switched to processing my 300 MB CSV with Python. Progress was good for a couple of days, but then my script started to get hairier: the data extraction and cleansing logic was all mixed up with the analysis logic. And I still didn’t have any good charts.

Back to the drawing board. I vaguely remembered folk on Victor Niederhoffer’s mailing list discussing R, so I checked out the programming recommendations on the great man’s site. I downloaded R, and after 30 minutes of playing I’m very impressed: scripting, powerful maths, built-in vector and matrix data types, and flexible charting, all rolled together. I’ll be asking around on the floor to see if any of our traders are using R…