José Moreira

590 days ago
Proactive P Notes for the day 11 Dec 2015: 
Fiona N 

 
Peter M THIS COMMAND WORKS for PMR, suggest you use different years
getpapers --query '"human genomic" AND PUB_YEAR:[2010 TO 2010]' -o genome2010 -x
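To try different years as suggested, the query can be wrapped in a loop. This is a minimal sketch that reuses the getpapers options from the command above. The year range and the `genome<year>` directory names are illustrative assumptions, and the loop only prints the commands (a dry run) so they can be checked before being pasted into a shell:

```shell
# Dry run: print one getpapers command per year instead of executing it.
# The years 2010-2012 and the genome<year> output names are assumptions.
for year in 2010 2011 2012; do
  query="\"human genomic\" AND PUB_YEAR:[$year TO $year]"
  printf "getpapers --query '%s' -o genome%s -x\n" "$query" "$year"
done
```

Each printed line has the same shape as the command above, with only the year and output directory changed.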
Antony Q Delete empty directories:
cd genome2010
find . -empty -delete
cd ..
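Because `-delete` cannot be undone, it may be worth listing what would be removed first. Note that plain `-empty` also matches empty files; adding `-type d` restricts the match to directories. A small sketch on a throwaway tree follows; the directory names are made up for the demonstration:

```shell
# Build a throwaway tree (hypothetical names) so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/paper_with_text" "$demo/paper_empty"
echo "<article/>" > "$demo/paper_with_text/fulltext.xml"

# Preview: list empty directories below $demo without deleting anything.
find "$demo" -mindepth 1 -type d -empty -print

# Delete them once the list looks right.
find "$demo" -mindepth 1 -type d -empty -delete

# Show what is left.
ls "$demo"
```

Only the directory with no contents is removed; the one holding `fulltext.xml` stays.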
Peter M then normalize:
norma -q genome2010 -i fulltext.xml -o scholarly.html --transform nlm2html
then word frequencies
ami2-word -q genome2010 -i scholarly.html --w.words wordFrequencies --w.stopwords /org/xmlcml/ami2/plugins/word/stopwords.txt
and regex
Antony Q ENA (European Nucleotide Archive) regex (from http://www.ebi.ac.uk/miriam/main/collections/MIR:00000372):
José M I often use https://regex101.com to develop and test regular expressions; it may be useful for others here, particularly regular expression newbies. For example: https://regex101.com/r/cA9aK7/1
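A pattern can also be checked quickly from the shell with `grep -E`. In the sketch below the pattern is an illustrative assumption (uppercase letters, then digits, then an optional `.version` suffix), not necessarily the one listed on the MIRIAM page, and the test strings are made up:

```shell
# Hypothetical accession-like pattern: letters, digits, optional ".version".
# This is an assumption for illustration; check the MIRIAM page for the real ENA pattern.
pattern='^[A-Z]+[0-9]+(\.[0-9]+)?$'

# Made-up test strings, one per line; grep -E keeps only the lines that match.
# LC_ALL=C makes [A-Z] mean ASCII uppercase regardless of the locale.
matches=$(printf '%s\n' 'AB123456' 'AB123456.1' 'ab123456' '123456' | LC_ALL=C grep -E "$pattern")
echo "$matches"
```

Only the first two strings are printed: the lowercase one and the digits-only one do not match the pattern.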
