Team:St Andrews/project/ethics/communication

From 2010.igem.org

Revision as of 14:43, 6 September 2010


St Andrews from East Sands

University of St Andrews iGEM 2010


Our first year at iGEM!

Communication

Premise

Real-time Internet communication is increasingly common: the so-called Facebook generation is growing up acquainted with a dizzying array of instantaneous communication methods. The inception of email was heralded as a revolution in communication, yet today email traffic is losing ground, with instant messaging and social-network messaging taking precedence. Combined with the vast quantities of blogs, forum posts, wikis and other user-generated content, the volume of publicly accessible communication is immense. From a human-practices perspective this provides a vast and frequently changing dataset which gives insight into how people communicate.

Before one can reap the benefits of access to such a great pool of data, one must meet the challenge of collecting it. The web stores exabytes of data, so collecting and parsing the entirety of what is available is simply not an option. Nor is it required: when interested in a set of related terms such as {synthetic biology, synbio, igem}, one can disregard large portions of the web. Furthermore, if one is gathering data on social communication, a number of starting points quickly become apparent. Firstly, several social networks offer a fairly standard XML-based API; secondly, virtually every so-called Web 2.0 site organises its data in some form of chronological hierarchy (be it through metadata or simply via the removal of old articles from the site's home page). These two features of the web allow us to deduce a simple algorithm for collecting data.
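To illustrate the second point, the chronological metadata exposed by most XML feeds can be used to discard stale content before any crawling takes place. The sketch below is a minimal, modern-Python illustration (not the team's actual code): the sample feed, its item titles, and the cutoff date are all invented for the example, and only the standard library is used.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from datetime import datetime, timezone

# Hypothetical RSS snippet standing in for a real feed fetched over HTTP.
SAMPLE_RSS = """<rss version="2.0"><channel>
  <item><title>iGEM jamboree announced</title>
    <pubDate>Mon, 06 Sep 2010 14:43:00 +0000</pubDate></item>
  <item><title>Old synthetic biology post</title>
    <pubDate>Wed, 01 Jan 2003 00:00:00 +0000</pubDate></item>
</channel></rss>"""

def recent_items(rss_text, cutoff):
    """Return titles of feed items published on or after the cutoff."""
    root = ET.fromstring(rss_text)
    fresh = []
    for item in root.iter("item"):
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published >= cutoff:
            fresh.append(item.findtext("title"))
    return fresh

cutoff = datetime(2010, 1, 1, tzinfo=timezone.utc)
print(recent_items(SAMPLE_RSS, cutoff))  # only the 2010 item survives
```

Filtering on `pubDate` like this is what lets a crawler "disregard large portions of the web" without downloading the pages behind every link.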

crawler(searchterm):
 links = Stack S
 S = {facebook, twitter, bbcnews, cnn, foxnews, guardian, times, nytimes .. myawesomeblog}
 while S is not empty and an arbitrary threshold has not been reached:
   crawlerparser(S, searchterm)

crawlerparser(links, searchterm):
 for each link in links:
   results = pages at link containing searchterm
   for each result in results:
     if result is old, disregard it
     else add all hyperlinks in result to S
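The pseudocode above can be sketched as a small Python program. This is an illustrative reconstruction, not the team's implementation: the `FAKE_WEB` dictionary stands in for real HTTP fetching (which would use a library such as urllib), the page names and contents are invented, and the pseudocode's staleness check is omitted in favour of a simple visited-set and page threshold.

```python
from html.parser import HTMLParser

# A stubbed "web": in a real crawler a fetch over HTTP would replace this
# dict lookup. All URLs and page bodies here are hypothetical.
FAKE_WEB = {
    "bbcnews": '<p>synthetic biology breakthrough</p><a href="blogpost">more</a>',
    "blogpost": '<p>my thoughts on synthetic biology</p><a href="oldarchive">archive</a>',
    "oldarchive": '<p>unrelated 2003 content</p>',
    "twitter": '<p>cat pictures</p><a href="bbcnews">news</a>',
}

class LinkExtractor(HTMLParser):
    """Collect the href targets of anchor tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += [value for name, value in attrs if name == "href"]

def crawler(searchterm, seeds, threshold=10):
    """Crawl outward from the seed pages, following links only out of
    pages that mention the search term, up to an arbitrary page threshold."""
    stack = list(seeds)            # the Stack S of the pseudocode
    seen, hits = set(), []
    while stack and len(seen) < threshold:
        url = stack.pop()
        if url in seen or url not in FAKE_WEB:
            continue
        seen.add(url)
        page = FAKE_WEB[url]
        if searchterm in page:
            hits.append(url)
            parser = LinkExtractor()
            parser.feed(page)
            stack.extend(parser.links)  # add all hyperlinks in a result to S
    return hits

print(crawler("synthetic biology", ["twitter", "bbcnews"]))
# → ['bbcnews', 'blogpost']
```

Note the design choice the pseudocode implies: links are only followed out of pages that match the search term, so the crawl naturally stays inside the neighbourhood of the topic rather than wandering across the whole web.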