I’m keeping this blog as a sort of journal for a project involving data analysis. My starting point is Twitter data, but I don’t really know where it might end up.
I have worked with databases and reporting for a few decades. More recently, I guess in conjunction with the rise of data science, I have become interested in going deeper. I have taken a few online classes (Udacity, Coursera) and have messed with R and Python a little, both personally and for work, and so I have come to a point where I feel like embarking on a project. Now, usually a data project should start with a question. That’s what all the texts say. But I don’t have a specific question in mind at the moment. Oh, sure, there are millions of questions that one could choose from, but I want the data to show me what to ask. This is an area where I have specific experience and skill: looking at a set of something, studying it, getting familiar with it, and noticing things that are odd or different or interesting. I think I actually learned this initially from Sesame Street (“one of these things is not like the others…”). Seems silly, yes? I think not. I think it is a very good skill to have, at the 4 items level and at the millions of items level. Being able to notice subtle differences, subtle similarities, stuff that doesn’t seem right or is out of place, the absence of something: these are the triggers and indicators for action, insight, deeper knowledge.
So, here I go. I’ll be journaling things that I find, sections of code, learnings, problems, whatever. Mostly for my use and reference, but maybe there will be a reader who benefits.