Error when importing an "xsd" file

Hi everyone,

I am trying to import an "xsd" file in R but unfortunately I get an error.

library("XML")
library("methods")
result<- xmlParse(file= "C:/Users/LE/Data_Public_2017.xsd")
print(result)
rootnode<- xmlRoot(result)
rootsize<- xmlSize(rootnode)
print(rootsize)
print(rootnode)
data<- xmlToDataFrame("C:/Users/LE/Data_Public_2017.xsd")

The error that I get is:

Error in [<-.data.frame(*tmp*, i, names(nodes[[i]]), value = c(attribute = "", :
duplicate subscripts for columns
10.
stop("duplicate subscripts for columns")
9.
[<-.data.frame(*tmp*, i, names(nodes[[i]]), value = c(attribute = "",
attribute = "", attribute = ""))
8.
[<-(*tmp*, i, names(nodes[[i]]), value = c(attribute = "",
attribute = "", attribute = ""))
7.
fromRaggedXML2DataFrame(nodes, varNames, c(length(nfields), length(varNames)),
colClasses, stringsAsFactors)
6.
xmlToDataFrame(doc, colClasses, homogeneous, collectNames, nodes = xmlChildren(xmlRoot(doc)),
stringsAsFactors)
5.
xmlToDataFrame(doc, colClasses, homogeneous, collectNames, nodes = xmlChildren(xmlRoot(doc)),
stringsAsFactors)
4.
xmlToDataFrame(xmlParse(doc), colClasses, homogeneous, collectNames,
stringsAsFactors = stringsAsFactors)
3.
xmlToDataFrame(xmlParse(doc), colClasses, homogeneous, collectNames,
stringsAsFactors = stringsAsFactors)
2.
xmlToDataFrame("C:/Users/LE/Data_Public_2017.xsd")
1.
xmlToDataFrame("C:/Users/LE/Data_Public_2017.xsd")

Does anyone have an idea how I could import this file?

Many thanks in advance :slight_smile:

I had never even heard of one before, but this might help: XSD File (What It Is and How to Open One)

thanks, great article ...but still unable to open the file

Well, I am unlikely to be of much help but is the file publicly available? Someone here might have an idea of how to tackle it.

If you have access to Excel, you can follow the links to material that will teach you to use Excel to convert the hierarchical XML into a flat form; you could then trivially save this to CSV and load it into R as a data.frame.

The general problem that needs to be overcome in your case, it seems, is how to arrange hierarchical information in a rectangular way: what to do with duplicated subscripts for columns.

consider these constructed 'simple' examples:


xmlToDataFrame('<outer>
               <middle>
               <fact1>0</fact1>
               <fact2>a</fact2>
               </middle></outer>')

# fact1 fact2
#     0     a
xmlToDataFrame('<outer>
               <middle>
               <fact1>0</fact1>
               <fact2>a</fact2>
               </middle>
               <middle>
               <fact1>1</fact1>
               <fact2>b</fact2>
               </middle>
               </outer>')
# fact1 fact2
#    0     a
#    1     b

# complicate by declaring two fact1's  for the second row
xmlToDataFrame('<outer>
               <middle>
               <fact1>0</fact1>
               <fact2>a</fact2>
               </middle>
               <middle>
               <fact1>1</fact1>
               <fact1>2</fact1> 
               <fact2>b</fact2>
               </middle>
               </outer>')

# what would you expect to see if the second row has two 'fact1's value of 1 and 2 ?
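One possible answer, as a minimal sketch: pull the values out with XPath yourself and decide explicitly what a repeated tag should become. Here the repeated values are collapsed into one string separated by ";". The helper name `collapse_child` and the separator are assumptions made for illustration, not something the XML package prescribes.

```r
library(XML)

xml_text <- '<outer>
  <middle><fact1>0</fact1><fact2>a</fact2></middle>
  <middle><fact1>1</fact1><fact1>2</fact1><fact2>b</fact2></middle>
</outer>'

# Parse the string as XML content and select one node per future row
doc <- xmlParse(xml_text, asText = TRUE)
rows <- getNodeSet(doc, "//middle")

# Hypothetical helper: join all values of one child tag into a single string
collapse_child <- function(node, tag) {
  values <- xpathSApply(node, paste0("./", tag), xmlValue)
  paste(values, collapse = ";")
}

flat <- data.frame(
  fact1 = vapply(rows, collapse_child, character(1), tag = "fact1"),
  fact2 = vapply(rows, collapse_child, character(1), tag = "fact2"),
  stringsAsFactors = FALSE
)
print(flat)
```

Collapsing is only one choice; you could equally keep the first value, or create one row per repeated value, depending on what the data means.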

Hi everyone,

Yes, it is publicly available. I need to import the "Referenzdaten 2020 als XML-Datei" (reference data 2020 as an XML file):
https://www.nrz-hygiene.de/files/Referenzdaten/ITS/Infektionen/ITS-Infektionen_Archive/Referenzdaten_2020/ITSRefData_Public_2020.xml

Many thanks in advance.

Thanks @nirgrahamuk, but it did not work for me...

The file is publicly available: https://www.nrz-hygiene.de/files/Referenzdaten/ITS/Infektionen/ITS-Infektionen_Archive/Referenzdaten_2020/ITSRefData_Public_2020.xml

I have tried:

# Load the package required to read XML files.
library("XML")

# Also load the other required package.
library("methods")

# Give the input file name to the function.
result <- xmlParse(file = "ITSRefData_Public_2020.xml")

# Print the result.
print(result)

I took it from here:

Unfortunately it did not work. It gives me an error:

Error: XML content does not seem to be XML: ''

I have also tried:

library("XML")
library("methods")
result<- xmlParse(file= "C:/Users/E/Desktop/R/ITS-KISS/ITSRefData_Public_2020.xml")
print(result)
rootnode<- xmlRoot(result)
rootsize<- xmlSize(rootnode)
print(rootsize)
print(rootnode)
data<- xmlToDataFrame("C:/Users/E/Desktop/R/ITS-KISS/ITSRefData_Public_2020.xml")

I was able to import the data, with 11 observations of 4 variables, but the data frame is empty...

Thanks in advance.

I don't know what causes your issue; I downloaded the file you linked to, and `xmlParse` from the XML package parsed it fine...
One possible way you might have caused a problem is if you didn't simply download the file, but did some sort of incorrect copy and paste? I honestly have no idea.

Hi @nirgrahamuk

Thanks very much. Yes, you are right. I think the problem is that I copied and pasted the file incorrectly instead of downloading it, because when I open the file it says:
"This XML file does not appear to have any style information associated with it. The document tree is shown below"
and then it shows the xsd tree...

I will keep trying :wink: ... I am completely new to XML files ... thanks for the hint.

Best

Advice: don't copy and paste; right-click the XML page in your browser and choose the save option.
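Once the file is saved, it may also be worth checking that R actually finds it before parsing. As far as I know, `xmlParse()` treats a string that is not an existing file as XML content, which is one way to end up with "XML content does not seem to be XML". A minimal sketch, where `parse_saved_xml` is a hypothetical helper and the temporary file only stands in for the real download:

```r
library(XML)

# Hypothetical helper (not part of the XML package): fail early with a clear
# message if the path does not point to an existing file.
parse_saved_xml <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path, " (working directory: ", getwd(), ")")
  }
  xmlParse(file = path)
}

# Demonstration on a small temporary file rather than the real download
tmp <- tempfile(fileext = ".xml")
writeLines("<outer><middle><fact1>0</fact1></middle></outer>", tmp)

doc <- parse_saved_xml(tmp)
print(xmlName(xmlRoot(doc)))
```

With the real data you would pass the full path of the saved file, for example the path used earlier in this thread.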

Same here. I think the problem is that you need to read in the data using `read_xml` first, and then you can parse it, etc.

See R XML: How to Work With XML Files in R
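A rough sketch of that approach with the xml2 package, using a small inline document instead of the real file. The tag names `middle`, `fact1` and `fact2` are taken from the constructed examples earlier in the thread and are assumptions as far as the real data is concerned:

```r
library(xml2)

# read_xml() also accepts a file path or URL; a literal string is used here
doc <- read_xml("<outer>
  <middle><fact1>0</fact1><fact2>a</fact2></middle>
  <middle><fact1>1</fact1><fact2>b</fact2></middle>
</outer>")

# One node per future row
rows <- xml_find_all(doc, ".//middle")

# xml_find_first() returns one match per row node, so the columns line up
flat <- data.frame(
  fact1 = xml_text(xml_find_first(rows, "./fact1")),
  fact2 = xml_text(xml_find_first(rows, "./fact2")),
  stringsAsFactors = FALSE
)
print(flat)
```

For the real file you would first look at its structure (for instance with `xml_name(xml_children(doc))`) and adapt the XPath expressions accordingly.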
