Big Data? How about Long Data…

Apologies for the hiatus.  Things (life, holidays, work, travel, library books, book club, making delicious dinners, etc.) got… hectic.  I should know better than to promise upcoming posts, but here I go anyways:  today I had a fascinating and enlightening discussion with some librarians taking a Coursera MOOC on e-learning & digital cultures.  Definitely going to brain-dump some of that discussion here later.  Very promising course.

But what have I really come here to break my long silence for? I saw an article today on the ol’ interwebs about Big Data. Really, though, it was about why we should be hyping up “long data” instead.

Arbesman prefaces his piece by acknowledging that the trend of collecting, analyzing, visualizing and thinking about “big data” brings some amount of value to our society. (Though I think it can also be used for negative purposes… just think about all the data Facebook, Amazon, Google and whoever else has on all of us. What might they do with that?) But his main point is that we’re missing an opportunity by only looking at a “snapshot” in time. Enter “long data,” which does not yet have a Wikipedia page.

By “long” data, I mean datasets that have massive historical sweep — taking you from the dawn of civilization to the present day.

He does a lovely job offering a picture of what a world of long data might look like – it might enable a much richer and deeper interpretation of how things are and have been.  It’s a very nice concept!  But I particularly like how he calls for an even more forward-thinking approach to data analysis at the end of his piece.

If we’re going to move beyond long data as a mindset, however — and treat it as a serious application — we need to connect these intellectual approaches across fields. We need to connect professional and academic disciplines, ranging from data scientists and researchers to business leaders and policy makers.

Again, I think this is a nice concept and I hope this vision is realized.

Now, I realize that this is a very vision-y piece, with a high-level call and a re-framing of something people are buzzing about at the moment. But I think there are some nitty-gritty aspects to fulfilling this vision that could have been raised. Most importantly: for data to be used effectively and to be interoperable, its metadata (information about the data you’re interested in) must be correct and relevant, and there must be enough of it.

These are exciting times in the world of data: the NSF and other federal agencies are requiring (or strongly suggesting) data management plans in grant proposals, libraries are getting into the mix and figuring out their role in the process of data curation, and individual researchers are spinning their wheels to pump out publications as fast as possible, often with little regard for data management at all. Metadata! Gotta have it. And it’s gotta be (reasonably) good.

I’m particularly interested in what librarians are going to do in this sphere moving forward. I know of data curation specialists, data visualization librarians, and other roles in the library that are starting to interface with this kind of research. It’ll be fascinating to see where this goes!

