Team:Paris Liliane Bettencourt/Project/SIP/Downloads

From 2010.igem.org

Revision as of 08:19, 27 October 2010 by Theotime (Talk | contribs)



SIP Wiki Analyser : Downloads





Team List

Wiki Data SIP Database
To read the databases, use sqlite3.
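
As an alternative to the sqlite3 command-line shell, here is a minimal sketch that inspects a database with Python's standard `sqlite3` module. The table layout is not documented on this page, so the sketch only lists the tables it finds and previews a few rows. The function names `list_tables` and `preview_table` are hypothetical helpers, not part of SIP.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables stored in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def preview_table(db_path, table, limit=5):
    """Return (column names, first `limit` rows) of one table."""
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        cur = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
        columns = [d[0] for d in cur.description]
        return columns, cur.fetchall()
    finally:
        conn.close()
```

Call `list_tables` with the path of a downloaded database file first, then pass one of the returned names to `preview_table` to see which columns it holds.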

Warning: these files were generated using "links -dump" to strip the HTML, which speeds up processing. This step is optional, because SIP removes the HTML later anyway. With links, some pages whose names contain special characters such as '(', ')' and ':' were not converted. We consider this minor because it affects only a small number of pages, but you can regenerate the database without the HTML parsing step.
You can also use html2text, but if it encounters a special character, it does not remove the HTML.
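
The HTML-stripping step can be sketched as follows, assuming the `links` browser is installed and on the PATH. The helper names `build_dump_command` and `dump_page` are hypothetical and this is not SIP's actual code. Passing the URL as a separate list element keeps characters such as '(', ')' and ':' away from the shell. Whether shell interpretation caused the failures described above is not known.

```python
import subprocess


def build_dump_command(url):
    """Build the argument list for `links -dump <url>` (no shell involved)."""
    return ["links", "-dump", url]


def dump_page(url, timeout=60):
    """Return the plain-text rendering of a page as produced by links.

    Raises subprocess.CalledProcessError if links exits with an error.
    """
    result = subprocess.run(
        build_dump_command(url),
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return result.stdout
```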
Also, some teams were missing from the databases (28 for 2009 and 10 for 2008). To build the database, I replaced the '-' character with '_', because sqlite3 does not work with this character, but I forgot to switch back to the original name when downloading the team, so the URL was wrong and the files were not downloaded. As a result, no team with a '-' in its name was downloaded. I can't regenerate the whole database, because computing it takes a lot of time, so I have only recomputed the teams with a '-' character (the dictionary table is unchanged).
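
One way to avoid this kind of bug is to keep the two names separate: the unmodified wiki name for the download URL and the sanitized name for the table. The sketch below is illustrative only. The URL template and the team name `Some-Team` are assumptions, and the functions are hypothetical rather than taken from SIP.

```python
def table_name(team):
    """Name used for the SQLite table: '-' is replaced by '_'."""
    return team.replace("-", "_")


def team_url(team, year, template="https://{year}.igem.org/Team:{team}"):
    """Download URL built from the unmodified wiki name (template is assumed)."""
    return template.format(year=year, team=team)


def plan_download(team, year):
    """Pair the original-name URL with the sanitized table name."""
    return {"url": team_url(team, year), "table": table_name(team)}
```

For example, `plan_download("Some-Team", 2009)` keeps the '-' in the URL while the table is named `Some_Team`.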

Notes about filters: no filters have been applied to these files, but you can apply whatever you want: remove common names, keep only MeSH terms, etc. Use whatever you need!
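
A filter of this kind could look like the sketch below. The schema of the databases is not described here, so it works on a plain mapping from term to count. The function `filter_terms`, the stop-word set and the allowed-term set (for instance a list of MeSH terms you supply yourself) are all hypothetical.

```python
def filter_terms(counts, stop_words=frozenset(), allowed=None):
    """Drop common words and, optionally, keep only terms from an allowed set.

    counts     -- dict mapping term -> number of occurrences
    stop_words -- lowercase terms to remove
    allowed    -- if given, lowercase terms to keep (e.g. a MeSH term list)
    """
    kept = {}
    for term, n in counts.items():
        key = term.lower()
        if key in stop_words:
            continue
        if allowed is not None and key not in allowed:
            continue
        kept[term] = n
    return kept
```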