Posts tagged as:

big data

The Humanities Take on Data Mining via Google Books

22 June 2010

The Humanities are “Going Google”, according to Marc Parry of The Chronicle, in a piece he wrote a few weeks ago.
The gist of the article is that some Humanities scholars are very interested in data mining the texts scanned in for the Google Books Project.
Why do they want to use Big Data mining techniques [...]


The Multiple Aspects of Data Science

21 June 2010

Earlier this month, Nathan Yau at FlowingData posted Mike Loukides’ analysis of data science from O’Reilly Radar. I finally found some time to read it.
I really enjoyed the post. The author titled it “What is data science?” and covered the various aspects of the new field, primarily from a commercial point of view. He examined: [...]


Beginning a Series — Reviews of Open Data Sites

29 January 2010

I will be reviewing English-language, government-sponsored open data sites as an offshoot of my doctoral work. I will begin with the “key” government sites compiled by the authors of The Guardian’s DataBlog as one of their inaugural posts.
Last week I reviewed data.gov.uk, so while I may add a bit more detail to [...]


HM Government Opens Up Government Data to the Public

21 January 2010
Data.gov.uk web site listing of all data sets

The British Government has released data sets to the public for use in either the public or private sectors at data.gov.uk.
Previously, the governments of the United States, Australia, and New Zealand had created data sites for use by the public, including commercial use. The primary idea behind the release of these data sets is [...]


Roger Magoulas Defines a Data Scientist

18 January 2010
Roger Magoulas Discusses Big Data and Data Science

Roger Magoulas, the director of market research at O’Reilly Media, defines a Data Scientist in the short video Big Data (part one), which is part of O’Reilly’s “The Future at Work” video series.

First, he sees a Data Scientist as someone with an amalgamation of skills that used to be reserved only for academic institutions [...]


“A Data Deluge Swamps Science Historians”

14 January 2010
Server cabinets at a data center

A few months ago, Robert Lee Hotz wrote an article in the Science Journal section of the Wall Street Journal in which he discussed how the data deluge is swamping scientists and researchers.
The author addressed the particular issue of how curators store data for current and future access so that other scientists may access the data [...]


Should Cloud Computing Be Called Swamp Computing?

5 January 2010
NC Swamp: Should Cloud Computing Be Called Swamp Computing?

David Talbot at Technology Review recently published an article entitled “Security in the Ether”. The author writes that the efficiencies of cloud computing are also its weaknesses. Users’ access to all of the bells and whistles a cloud offers could also enable them to attack a specific target, once they were able to get onto [...]


The Fourth Paradigm: Data-Intensive Scientific Discovery

16 December 2009
The Fourth Paradigm: Data-Intensive Scientific Discovery

John Markoff, the author of a New York Times article called “A Deluge of Data Shapes a New Era in Computing”, writes that Tony Hey, Stewart Tansley and Kristin Tolle have edited a book that discusses the “Fourth Paradigm”. The book, The Fourth Paradigm: Data-Intensive Scientific Discovery, is in honor of Jim Gray, who argued [...]


What is Taming (the) Data (Deluge)?

1 December 2009
Lasso at ready, to tame the data deluge

The data deluge refers to the increasingly large and complex data sets generated by researchers that must be managed by their creators with “industrial-scale data centres and cutting-edge networking technology” (Nature 455) in order to provide for use and re-use of the data.
The lack of standards and infrastructure to appropriately manage this (often tax-payer [...]
