Please find attached a sample Hadoop application that uses XMLInputFormat. We have also attached the input on which we tried this application.


We have also attached the complete jar file; you can run it in your VM by providing the input and output files.


To run it on other XML files, you only need to change START_TAG and END_TAG in all three classes, i.e. XML_input_driver.java, XMLInputFormat.java and XML_input_mapper.java.
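
To illustrate what START_TAG and END_TAG are for, here is a minimal stand-alone sketch of splitting a document into records between a start tag and an end tag. It is not the code from the attached XMLInputFormat (which presumably works on the input stream of a Hadoop split); the class name TagSplitSketch, the method splitRecords and the <property> tags are assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the attached XMLInputFormat.
public class TagSplitSketch {

    // Assumed tag values; replace them with the tags that wrap one record in your XML.
    static final String START_TAG = "<property>";
    static final String END_TAG = "</property>";

    // Returns every substring that begins with START_TAG and ends with the next END_TAG.
    static List<String> splitRecords(String xml) {
        List<String> records = new ArrayList<>();
        int from = 0;
        while (true) {
            int start = xml.indexOf(START_TAG, from);
            if (start < 0) {
                break; // no further record starts
            }
            int end = xml.indexOf(END_TAG, start + START_TAG.length());
            if (end < 0) {
                break; // unterminated record, ignore it
            }
            end += END_TAG.length();
            records.add(xml.substring(start, end));
            from = end;
        }
        return records;
    }

    public static void main(String[] args) {
        String xml = "<configuration>"
                + "<property><name>a</name><value>1</value></property>"
                + "<property><name>b</name><value>2</value></property>"
                + "</configuration>";
        for (String record : splitRecords(xml)) {
            System.out.println(record);
        }
    }
}
```

Each record found this way corresponds to one value that would be handed to the mapper, which is why the tags have to match the element that wraps a single record in your data.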


To print the desired tag values, change the tag names in the following lines of XML_input_mapper.java to the tag names used in your XML data:


String name = eElement.getElementsByTagName("name").item(0).getTextContent();
String value1 = eElement.getElementsByTagName("value").item(0).getTextContent();
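
For reference, here is a small self-contained sketch of the same getElementsByTagName lookup outside Hadoop, so the tag names can be tried on a single record before rebuilding the jar. The class name TagValueSketch, the method extract and the sample <property> record are assumptions for illustration; the attached mapper may build its eElement differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical illustration only; not the attached XML_input_mapper.
public class TagValueSketch {

    // Parses one XML record and returns the text of the two child tags, tab-separated.
    static String extract(String record, String nameTag, String valueTag) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(record.getBytes(StandardCharsets.UTF_8)));
            Element eElement = doc.getDocumentElement();
            // item(0) is null when the tag is absent, so a wrong tag name fails here.
            String name = eElement.getElementsByTagName(nameTag).item(0).getTextContent();
            String value = eElement.getElementsByTagName(valueTag).item(0).getTextContent();
            return name + "\t" + value;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not extract tags from record", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample record; use one record from your own input instead.
        String record = "<property><name>dfs.replication</name><value>1</value></property>";
        System.out.println(extract(record, "name", "value"));
    }
}
```

If your records use, say, <title> and <author> instead, only the two tag-name arguments change.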
 

Note: XMLInputFormat remains the same for any XML data you have; you only need to change the START_TAG and the END_TAG.


Below is the output we got:




The XML tag values are extracted in the format that is written to the context in XML_input_mapper.java; you can alter this as per your requirements.


--------------------------

Thanks and Regards,

Kiran Krishna

--------------------------