I am trying to extract the xmlValue of certain child nodes from an NCBI XML file. For some PMIDs, however, the root node <PubmedArticleSet> contains two different kinds of PubMed records, PubmedBookArticle and PubmedArticle. I would like to apply a condition: if (xmlName(fetch.pubmed) == "PubmedBookArticle"), extract certain values; else if (xmlName(fetch.pubmed) == "PubmedArticle"), extract other values. Finally, I want to build a data frame that holds both sets of values alongside their PMIDs. It seems simple, but xmlName(fetch.pubmed) throws the error: no applicable method for 'xmlName' applied to an object of class "c('XMLInternalDocument', 'XMLAbstractDocument')". Any help is appreciated, thank you.
<?xml version="1.0"?>
<!DOCTYPE PubmedArticleSet PUBLIC "-//NLM//DTD PubMedArticle, 1st January 2015//EN" "http://www.ncbi.nlm.nih.gov/corehtml/query/DTD/pubmed_150101.dtd">
<PubmedArticleSet>
<PubmedBookArticle>
<BookDocument>
<PMID Version="1">25506969</PMID>
<ArticleIdList>
<ArticleId IdType="bookaccession">NBK259188</ArticleId>
</ArticleIdList> ....
...... </BookDocument>
</PubmedBookArticle>
<PubmedArticle>
<MedlineCitation Status="Publisher" Owner="NLM">
<PMID Version="1">25013473</PMID>
<DateCreated>
<Year>2014</Year>
<Month>7</Month>
<Day>11</Day>
</DateCreated>....
....</MedlineCitation>
</PubmedArticle>
</PubmedArticleSet>
My code is below:
library(XML)
library(rentrez)
PM.ID <- c("25506969", "25032371", "24983039", "24983034", "24983032", "24983031",
           "26386083", "26273372", "26066373", "25837167",
           "25466451", "25013473")
# rentrez function to retrieve the XML file for the above PMIDs
fetch.pubmed <- entrez_fetch(db = "pubmed", id = PM.ID,
                             rettype = "xml", parsed = TRUE)
# If empty records, return NA
FindNull <- function(x, x1child) {
  res <- xpathSApply(x, x1child, xmlValue)
  if (length(res) == 0) {
    out <- NA
  } else {
    out <- res
  }
  out
}
# extract contents from xml file
xpathSApply(fetch.pubmed,"//PubmedArticle",FindNull,x1child = './/ArticleTitle')
xpathSApply(fetch.pubmed,"//PubmedBookArticle",FindNull,x1child = './/BookTitle')
How do I put the above code in a loop, so that I can retrieve values from PubmedArticle and PubmedBookArticle as and when the condition is met for each record?
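One direction that might work, as a minimal sketch rather than a confirmed solution: the error arises because fetch.pubmed is a whole document, while xmlName expects a node. Selecting the direct children of <PubmedArticleSet> gives one node per record, and xmlName can then be checked on each. The snippet below uses a small inline stand-in document instead of the real entrez_fetch result; the titles in it, the helper name first_value, and the column names are assumptions made for illustration.

```r
library(XML)

# Stand-in for the parsed entrez_fetch() result. The PMIDs come from the
# sample above; the titles are made up for illustration.
xml.text <- '<PubmedArticleSet>
  <PubmedBookArticle>
    <BookDocument>
      <PMID Version="1">25506969</PMID>
      <Book><BookTitle>Example Book</BookTitle></Book>
    </BookDocument>
  </PubmedBookArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID Version="1">25013473</PMID>
      <Article><ArticleTitle>Example Article</ArticleTitle></Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>'
doc <- xmlParse(xml.text, asText = TRUE)

# Hypothetical helper: first match of an XPath relative to a node, or NA.
# Taking the first match keeps the record's own PMID when a record
# contains further PMID elements deeper in the tree.
first_value <- function(node, path) {
  res <- xpathSApply(node, path, xmlValue)
  if (length(res) == 0) NA_character_ else res[[1]]
}

# One node per record: the direct children of the root element
records <- getNodeSet(doc, "/PubmedArticleSet/*")

rows <- lapply(records, function(node) {
  type <- xmlName(node)  # works here because 'node' is a node, not a document
  title.path <- if (type == "PubmedBookArticle") ".//BookTitle" else ".//ArticleTitle"
  data.frame(PMID  = first_value(node, ".//PMID"),
             Type  = type,
             Title = first_value(node, title.path),
             stringsAsFactors = FALSE)
})

result <- do.call(rbind, rows)
print(result)
```

With the real data, doc would be replaced by fetch.pubmed, and further fields could be added to the data.frame call in the same way, each with its own XPath per record type.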