
José Moreira

590 days ago
Proactive P Notes for the day 11 Dec 2015: 
 
 
 
Fiona N Tweet with the hashtag:
#CMDNAHack (please use it when tweeting!)
 
And find us on Twitter: @DNAdigest @theContentMine @linguamatics 
 
 
 
Peter M THIS COMMAND WORKS for PMR; suggest you use different years:
getpapers --query '"human genomic" AND PUB_YEAR:[2010 TO 2010]' -o genome2010 -x
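
To cover several years, the same command can be generated once per year. This is only a sketch: `getpapers_cmd` is a hypothetical helper (not part of getpapers), the year list is an example, and the loop is a dry run that prints the commands without downloading anything.

```shell
# Hypothetical helper: prints the getpapers command for one publication year.
# The query and flags are copied from the command above; only the year changes.
getpapers_cmd() {
  year="$1"
  printf "getpapers --query '\"human genomic\" AND PUB_YEAR:[%s TO %s]' -o genome%s -x\n" \
    "$year" "$year" "$year"
}

# Dry run: print one command per year (the years are an arbitrary example).
for year in 2010 2011 2012; do
  getpapers_cmd "$year"
done
```

Once the printed commands look right, a single year can be run with `eval "$(getpapers_cmd 2011)"`.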
 
Antony Q Delete empty directories:
cd genome2010
find . -type d -empty -delete
cd ..
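
Since the deletion cannot be undone, it may be worth listing the empty directories first. The sketch below tries this in a scratch directory rather than in genome2010; the directory and file names are made up for the demo, and `-empty` and `-delete` are GNU/BSD find extensions rather than POSIX.

```shell
# Scratch directory standing in for genome2010 (the layout is invented for this demo).
demo=$(mktemp -d)
mkdir -p "$demo/PMC0000001" "$demo/PMC0000002"
echo '<article/>' > "$demo/PMC0000001/fulltext.xml"

# Preview: list the empty directories without deleting anything.
find "$demo" -type d -empty -print

# Delete them; -type d restricts the match to directories, so empty files are kept.
find "$demo" -type d -empty -delete

# Show what is left.
ls "$demo"
```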
 
Peter M then normalize:
norma -q genome2010 -i fulltext.xml -o scholarly.html --transform nlm2html
 
then word frequencies:
ami2-word -q genome2010 -i scholarly.html --w.words wordFrequencies --w.stopwords /org/xmlcml/ami2/plugins/word/stopwords.txt
 
and regex
 
Antony Q ENA (European Nucleotide Archive) regex (from http://www.ebi.ac.uk/miriam/main/collections/MIR:00000372):
^[A-Z]+[0-9]+$
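
A quick way to see what this pattern accepts is to run a few strings through `grep -E`. This is a minimal sketch: `matches_ena` is a hypothetical helper, the candidate strings are invented examples rather than real accessions, and `LC_ALL=C` is set so that `[A-Z]` means ASCII capitals only.

```shell
# Pattern copied from above: one or more capital letters followed by one or more digits.
ENA_PATTERN='^[A-Z]+[0-9]+$'

# Hypothetical helper: exits 0 when the whole string matches the pattern.
matches_ena() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq "$ENA_PATTERN"
}

# Invented test strings, not real ENA accessions.
for candidate in AB123456 ab123456 AB12-3456; do
  if matches_ena "$candidate"; then
    echo "$candidate: match"
  else
    echo "$candidate: no match"
  fi
done
```

Because of the `^` and `$` anchors, the pattern only matches a string that consists of nothing but the identifier, so it will not find identifiers embedded in a sentence.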
 
 
 
José M More often than not, I use https://regex101.com to develop and test regular expressions; it might be useful for other people here, particularly regular expression newbies. For example: https://regex101.com/r/cA9aK7/1
 
