I love Facebook. There, I said it. The number of friends I’ve re-established ongoing contact with, and caught back up with in real life, is more than I ever would have otherwise. And not only that, but the friends I’ve made along the way, chatted with, and generally shared life events with, all through this ubiquitous, globally accepted, easy-to-use platform… Man, I’d pay for it if it wasn’t free. Don’t tell Zuck !

Anyway, with privacy and data ownership laws being what they are these days, Facebook, along with most of the other big social media outfits, has made it possible for users to download all the data that has been collected about their activities on the platform ever since the day they joined, in an easy-to-interpret format.

So what would I want to know about my goings-on that may interest or inform me ? I guess there are the easy ones – how often do I use it, to what purpose, and how has my usage evolved over the 12 years since I joined ? I’ve typically perceived my usage to be cyclic or in phases – I’ll use it rabidly for a while then kick it to the curb and forget about it, then pick it up again. Is my self-assessment true to the data or is there some other pattern in reality ? Also, what kinds of posts do I tend towards ? Do I talk about myself (status updates), talk to and comment upon others’ posts (comments), share stuff (photos), or just talk out loud a lot (make unsolicited or non-apropos posts) ? So I downloaded my data, dove in and, in the words of my favourite Psychology Professor, “got empirical”.

Some quick notes before jumping in:

  1. The download doesn’t give you everything. It’s mostly focussed on the above sort of categories and doesn’t include the host of other activities like voting in polls; expressing interest in, creating or attending events; liking something; friending someone; adding life events, etc.
  2. There are some nulls amongst the data which aren’t explained, but they are few, especially when seen in the context of a 12-year data dump.
  3. This is my activity data so it felt a bit “meta” (or Meta) sharing it, but there’s nothing personally identifiable in it – it’s just some of the patterns associated with a random FB user who happens to be me, and while I’m aware of personal security and privacy, this data doesn’t really compromise that.

So with that out of the way, let’s dive in and get our charts on ! The analysis will consist of a question followed by my visualisation to inform the answer, followed by some quick analysis of the insights revealed.

1. How many posts have I made since day one, and what’s been the pattern to now ?

So it turns out I’ve made a total of 1,026 posts in the dataset. Pretty sure I’ve interacted a whole lot more than that but as mentioned above this pertains to those four key categories, so while not every little thing is there, it’s enough to form a representative indication of overall activity.

We can see the initial excitement when I joined up and went crazy friending-up as the network effect took hold. Then there was a period of relative silence, followed by a spike, a dip, a larger spike, a dip, and another spike. So more or less cyclic as I’d thought; it’s just the amplitude that varies, and this is further backed up by the trend line, which suggests a sinusoidal waveform. There are life events, and global or even local events, that I personally can identify here and which explain to me the reasons for the patterns shown, suggesting a correlation between FB activity and things happening in my biosphere.
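For anyone wanting to try the same thing, here is a minimal sketch of the counting step behind a chart like this. It assumes each activity record has already been boiled down to a Python `datetime`; the function name `posts_per_month` and the sample timestamps are made up for illustration and are not taken from the actual download.

```python
from collections import Counter
from datetime import datetime


def posts_per_month(timestamps):
    """Count activity records per (year, month), in chronological order."""
    counts = Counter((ts.year, ts.month) for ts in timestamps)
    return dict(sorted(counts.items()))


# Made-up sample timestamps, purely for illustration
sample = [
    datetime(2010, 3, 1, 9, 15),
    datetime(2010, 3, 20, 22, 40),
    datetime(2010, 5, 2, 13, 5),
    datetime(2011, 1, 7, 2, 30),
]

monthly = posts_per_month(sample)
for (year, month), n in monthly.items():
    print(f"{year}-{month:02d}: {n}")
```

Plotting those monthly totals against time is what would make any spikes and dips visible; months with no activity simply don’t appear in the result, so they would need filling with zeros before fitting a trend line.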

2. When do I post and to what degree ?

Dividing the day up into its diurnal phases and showing the activity as a scatter plot, with the number of events determining the size of the dots, we see that the majority of the activity is during the day, but there’s also a good amount of late night and insomniac-style early morning goings-on.
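The bucketing into phases can be sketched roughly as below. The phase names and hour boundaries in `PHASES` are my own assumptions for illustration (the chart may well have used different cut-offs), and the sample timestamps are invented.

```python
from collections import Counter
from datetime import datetime

# Assumed phase boundaries: (name, start hour inclusive, end hour exclusive)
PHASES = [
    ("early morning", 0, 6),
    ("morning", 6, 12),
    ("afternoon", 12, 18),
    ("evening", 18, 22),
    ("late night", 22, 24),
]


def phase_of(hour):
    """Map an hour of the day (0-23) to its assumed diurnal phase."""
    for name, start, end in PHASES:
        if start <= hour < end:
            return name
    raise ValueError(f"hour out of range: {hour}")


def counts_by_phase(timestamps):
    """Count activity records per diurnal phase."""
    return Counter(phase_of(ts.hour) for ts in timestamps)


# Invented sample data
sample_times = [
    datetime(2012, 7, 4, 3, 10),
    datetime(2012, 7, 4, 13, 45),
    datetime(2012, 7, 5, 14, 20),
    datetime(2012, 7, 5, 23, 55),
]
phase_counts = counts_by_phase(sample_times)
print(phase_counts)
```

In a scatter plot, each count would then drive the size of the dot for its phase.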

3. What sort of posts do I make depending on time of day ? Do I get all sharey during the day then all commenty during the night ?

Going from top to bottom following the legend, I generally make posts more than I do anything else, and do this with increasing frequency from midday up until about 8pm. I share, and update my status, to a medium level during the day, then really ramp up that activity from 7pm to midnight. Writing on people’s timelines is the least frequent and variable activity but does see a slight up-tick after 9pm.
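A breakdown like this comes down to a cross-tabulation of post type against hour of day. Here is a minimal sketch, assuming each record is a `(datetime, post_type)` pair; the type labels and sample records are hypothetical stand-ins rather than the field names in the download.

```python
from collections import Counter, defaultdict
from datetime import datetime


def type_counts_by_hour(records):
    """Build {hour: Counter({post_type: n})} from (datetime, post_type) pairs."""
    table = defaultdict(Counter)
    for ts, post_type in records:
        table[ts.hour][post_type] += 1
    return table


# Hypothetical records with made-up type labels
records = [
    (datetime(2014, 2, 1, 13, 5), "post"),
    (datetime(2014, 2, 1, 20, 30), "status update"),
    (datetime(2014, 2, 2, 20, 10), "status update"),
    (datetime(2014, 2, 2, 20, 45), "share"),
    (datetime(2014, 2, 3, 22, 15), "timeline"),
]

table = type_counts_by_hour(records)
for hour in sorted(table):
    print(hour, dict(table[hour]))
```

Each post type then becomes one series in the chart, with the hour on the horizontal axis and the count on the vertical.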

4. During any given time of day, what sort of posts do I make and in what relative quantities ?

First we can look at just the number of posts that were made, depending on the time of day (the size and position of the dots), then have a closer look at what sort of posts went on during those times, when batched up in this way (the colour of the dots):

So given those markers we can show the general timings of activities as well as drill in to show the mix of posts and the relative amounts of each, contributing to a good sense of my method, volume, and types of engagement over time.

For example, it shows that:

  • There’s not much variety in what I’m doing in the wee-small hours;
  • If I’m to write on someone’s timeline it’s more likely to be either at 10am or 11pm, and the amount of these varies throughout the day;
  • Commenting or actual posting kicks off around 12pm, and is of a high though varied volume; and
  • Status updates – in contrast to the former two – follow a fairly predictable timing, quantity, and pattern, running from midday through to midnight.

In order to show the evolution of the types and quantities of posts over time I then produced an animation, aggregated by week:

The next step for this puppy is to produce a smooth animation mimicking Hans Rosling’s brilliant TED talk about global health as it relates to demographics and economic development. Smooth visualisation animations involve the use and integration of many different tools and some pretty exotic libraries. It’s on my list though. 🙂
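Whatever tool ends up drawing the frames, the weekly aggregation feeding an animation like this can be sketched as follows. This is an illustration under my own assumptions, not necessarily how the animation above was built: records are `(datetime, post_type)` pairs, weeks are ISO weeks, and the labels and data are invented.

```python
from collections import Counter, defaultdict
from datetime import datetime


def weekly_type_counts(records):
    """Group (datetime, post_type) pairs into {(iso_year, iso_week): Counter}."""
    weekly = defaultdict(Counter)
    for ts, post_type in records:
        iso = ts.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        weekly[(iso[0], iso[1])][post_type] += 1
    return weekly


# Invented records spanning two consecutive ISO weeks
activity = [
    (datetime(2015, 6, 1, 12, 0), "comment"),
    (datetime(2015, 6, 3, 21, 0), "comment"),
    (datetime(2015, 6, 3, 21, 30), "photo"),
    (datetime(2015, 6, 8, 19, 0), "status update"),
]

weekly = weekly_type_counts(activity)
for week in sorted(weekly):
    print(week, dict(weekly[week]))
```

Each `(year, week)` key would become one frame, and smoothing would then be a matter of interpolating the counts between consecutive frames.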

Conclusion

Privacy and data ownership laws have made it a requirement for companies collecting personal data to make that data easily available to the party from which it was either knowingly or unknowingly collected. Sometimes the volume, variety, and velocity of what is collected and returned can surprise; other times it turns out to be quite minimal and pedestrian – “nothing to see here”. Current examples of the latter are Apple and Spotify – I downloaded my data from those and it was a snooze-fest. So some kudos to Facebook I suppose: even if the data is not exhaustive or complete, it’s certainly more than what I’ve seen from the others I’ve looked at to date.

And as for what it tells you ? Like all exploratory data analytics and visualisation – often more than what you thought it would, sometimes confirming, other times contradicting your understanding or expectations.

This to me is what good Data Science is about – finding the patterns, the outliers, confirming the expected, revealing the unexpected, gaining novel insights and making sound predictions. And above all making all of this easily understandable to whatever audience is necessary. There is no room for technical or academic arrogance in this field – it’s about being entrusted with massive amounts of oftentimes sensitive data and being relied upon to turn it into useable, actionable intelligence that can be used to inform decisions that advance the objective.

Bad Data Science is sometimes lazily turning loads of numbers into pie charts and a bit of Excel, in that it possibly doesn’t add to or synthesise understanding – in fact it can just add to the noise, and delay decision-making.

Anyways, that’s it for those particular strange habits. If you’re interested in what you’ve seen then feel free to leave a comment, or catch me on Twitter @andrewreardon 🙂

Cheers
