Top xml Questions

I mean 100+ MB big; such text files can push the envelope of editors.

I need to look through a large XML file, but cannot if the editor is buggy.

Any suggestions?

Answered By: Nick ( 327)

I'm assuming that you're on Windows, so I'll recommend gVim. Where Notepad++ will choke on very large files, VIM has chowed through those puppies with little problem.

010Editor on Windows will open GIANT (think 5GB) files in binary mode and allow you to edit and search the text.

Community wiki:

Suggestions are:

  • gVim (loads entire file into memory first)
  • SlickEdit
  • Emacs (has a low maximum buffer size limit if compiled in 32-bit mode)
  • Large Text File Viewer
  • PilotEdit (loads entire file into memory first)
  • HxD (a hex editor, but good for large files)

Text editors with 2GB limit: Notepad++, Jujuedit, TextPad

Chas. Owens

One mistake I see people making over and over again is trying to parse XML or HTML with a regex. Here are a few of the reasons parsing XML and HTML is hard:

People want to treat a file as a sequence of lines, but this is valid:

<tag
attr="5"
/>

People want to treat < or <tag as the start of a tag, but stuff like this exists in the wild:

<img src="imgtag.gif" alt="<img>" />

People often want to match starting tags to ending tags, but XML and HTML allow tags to contain themselves (which traditional regexes cannot handle at all):

<span id="outer"><span id="inner">foo</span></span>

People often want to match against the content of a document (such as the famous "find all phone numbers on a given page" problem), but the data may be marked up (even if it appears to be normal when viewed):

<span class="phonenum">(<span class="area code">703</span>)
<span class="prefix">348</span>-<span class="linenum">3020</span></span>

Comments may contain poorly formatted or incomplete tags:

<a href="foo">foo</a>
<!-- FIXME:
    <a href="
-->
<a href="bar">bar</a>

What other gotchas are you aware of?
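
For the marked-up phone number case above, a parser-based approach might look like the following minimal sketch. It assumes the fragment is well-formed XML (real-world HTML usually needs a dedicated HTML parser instead), and the class and method names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PhoneText {

    // Hypothetical helper: parse the markup and return the text a reader would see.
    static String visibleText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        // getTextContent() concatenates all descendant text nodes in document order.
        String text = doc.getDocumentElement().getTextContent();
        // Collapse the line break between the two source lines into one space.
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws Exception {
        String xml =
            "<span class=\"phonenum\">(<span class=\"area code\">703</span>)\n"
          + "<span class=\"prefix\">348</span>-<span class=\"linenum\">3020</span></span>";
        System.out.println(visibleText(xml)); // prints: (703) 348-3020
    }
}
```

Once the tags are gone, a simple pattern for phone numbers can run on the extracted text, which is a far easier job than matching against the raw markup.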

Answered By: bobince ( 150)

Here's some fun valid XML for you:

<!DOCTYPE x [ <!ENTITY y "a]>b"> ]>
<x>
    <a b="&y;>" />
    <![CDATA[[a>b <a>b <a]]>
    <?x <a> <!-- <b> ?> c --> d
</x>

And this little bundle of joy is valid HTML:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "" [
    <!ENTITY % e "href='hello'">
    <!ENTITY e "<a %e;>">
]>
    <p id  =  a:b center>
    <span / hello </span>
    &amp<br left>
    <!---- >t<!---> < -->
    &e link </a>

Not to mention all the browser-specific parsing for invalid constructs.

Good luck pitting regex against that!

EDIT (Jörg W Mittag): Here is another nice piece of well-formed, valid HTML 4.01:

c#, xml

The default methods for dealing with XML in C# seem incredibly crude to me, which makes me suspect that I must be missing something in my searches. Is there a simpler method of parsing XML files in C#? If so, what?

Answered By: Jon Galloway ( 108)

I'd use LINQ to XML if you're in .NET 3.5.

Julius A

Where can I get a list of the XML document escape characters?

Answered By: Welbog ( 272)

There are only five:

"   &quot;
'   &apos;
<   &lt;
>   &gt;
&   &amp;

They're easy to remember. HTML has its own set of escape codes which cover a lot more characters.
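
As a rough illustration of the table above, an escaping helper could be sketched as follows. The class and method names are hypothetical; in practice an XML library's own serializer would normally do this for you.

```java
public class XmlEscape {

    // Replace the five XML special characters with their predefined entities.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '&':  out.append("&amp;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Tom &amp; &quot;Jerry&quot; &lt;cat&apos;s&gt;
        System.out.println(escape("Tom & \"Jerry\" <cat's>"));
    }
}
```

Working character by character means each `&` is escaped exactly once, which chained string replacements only get right if `&` is handled first.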

Here's my layout code:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="">

    <TextView android:text="@string/welcome" />

    <LinearLayout android:id="@+id/LinearLayout">

            <EditText android:id="@+id/EditText" />

            <Button android:text="@string/label_submit_button" />

    </LinearLayout>

</LinearLayout>

What this looks like is on the left and what I want it to look like is on the right.

[Image: Android Layout - Actual (Left) and Desired (Right)]

The obvious answer is to set the TextView to fill_parent on height, but this leaves no room for the button or entry field. Essentially, I want the submit button and the text entry to be a fixed height at the bottom and the text view to fill the rest of the space. Similarly, in the horizontal linear layout I want the submit button to wrap its content and the text entry to fill the rest of the space.

If the first item in a linear layout is told to fill_parent, it does exactly that, leaving no room for other items. How do I get an item which is first in a linear layout to fill all the space apart from the minimum required by the rest of the items in the layout?


Relative Layouts were indeed the answer - Thank you!

    <?xml version="1.0" encoding="utf-8"?>


        android:layout_alignParentBottom="true" >




Answered By: Janusz ( 121)

I think you should try a relative layout.
If you have a relative layout that fills the whole screen you should be able to use android:layout_alignParentBottom to move the button to the bottom of the screen.

If your views at the bottom are not shown in a relative layout, then maybe the layout above them takes all the space. In this case you can put the view that should be at the bottom first in your layout file and position the rest of the layout above it with android:layout_above. This enables the bottom view to take as much space as it needs, and the rest of the layout can fill the remainder of the screen.


Is REST a better approach to doing Web Services or is SOAP? Or are they different tools for different problems? Or is it a nuanced issue - that is, is one slightly better in certain arenas than another, etc?


Now, almost three years later, I would like to ask this question again, offering a bounty to encourage an in-depth answer. I would especially appreciate information about these concepts and their relation to the PHP universe and to modern high-end web applications.

Answered By: mdhughes ( 270)

I built one of the first SOAP servers, including code generation and WSDL generation, from the original spec as it was being developed, when I was working at Hewlett-Packard. I do NOT recommend using SOAP for anything.

The acronym "SOAP" is a lie. It is not Simple, it is not Object-oriented, it defines no Access rules. It is, arguably, a Protocol. It is Don Box's worst spec ever, and that's quite a feat, as he's the man who perpetrated "COM".

There is nothing useful in SOAP that can't be done with REST for transport, and JSON, XML, or even plain text for data representation. For transport security, you can use https. For authentication, basic auth. For sessions, there's cookies. The REST version will be simpler, clearer, run faster, and use less bandwidth.

XML-RPC clearly defines the request, response, and error protocols, and there are good libraries for most languages. However, XML is heavier than you need for many tasks.


Isn't there a convenient way of getting from a java.util.Date to an XMLGregorianCalendar?

Answered By: Ben Noland ( 285)

GregorianCalendar c = new GregorianCalendar();
c.setTime(yourDate);  // yourDate is the java.util.Date to convert
XMLGregorianCalendar date2 = DatatypeFactory.newInstance().newXMLGregorianCalendar(c);
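
A self-contained sketch of the same conversion, wrapped in a helper and paired with the reverse direction, might look like this. The class and method names are illustrative only; the key step is copying the Date into the calendar with setTime before converting, since a freshly constructed GregorianCalendar holds the current time.

```java
import java.util.Date;
import java.util.GregorianCalendar;
import javax.xml.datatype.DatatypeConfigurationException;
import javax.xml.datatype.DatatypeFactory;
import javax.xml.datatype.XMLGregorianCalendar;

public class DateConversion {

    // java.util.Date -> XMLGregorianCalendar (uses the JVM's default time zone).
    static XMLGregorianCalendar toXmlCalendar(Date date) throws DatatypeConfigurationException {
        GregorianCalendar c = new GregorianCalendar();
        c.setTime(date);
        return DatatypeFactory.newInstance().newXMLGregorianCalendar(c);
    }

    // XMLGregorianCalendar -> java.util.Date.
    static Date toDate(XMLGregorianCalendar xmlCalendar) {
        return xmlCalendar.toGregorianCalendar().getTime();
    }

    public static void main(String[] args) throws Exception {
        Date now = new Date();
        XMLGregorianCalendar xml = toXmlCalendar(now);
        // toXMLFormat() renders the lexical xs:dateTime form.
        System.out.println(xml.toXMLFormat());
        System.out.println(toDate(xml));
    }
}
```
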
Steve McLeod

I have a Java String that contains XML, with no line feeds or indentation. I would like to turn it into a String with nicely formatted XML. How do I do this?

String unformattedXml = "<tag><nested>hello</nested></tag>";
String formattedXml = new [UnknownClass]().format(unformattedXml);

Note: My input is a String. My output is a String.

Answered By: Steve McLeod ( 70)

Here's an answer to my own question. I combined the answers from the various results to write a class that pretty prints XML.

No guarantees on how it responds with invalid XML or large documents.

package ecb.sdw.pretty;

import org.apache.xml.serialize.OutputFormat;
import org.apache.xml.serialize.XMLSerializer;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

/**
 * Pretty-prints xml, supplied as a string.
 * <p/>
 * eg.
 * <code>
 * String formattedXml = new XmlFormatter().format("<tag><nested>hello</nested></tag>");
 * </code>
 */
public class XmlFormatter {

    public XmlFormatter() {
    }

    public String format(String unformattedXml) {
        try {
            final Document document = parseXmlFile(unformattedXml);

            // Turn on indentation; without this the serializer keeps the input layout.
            OutputFormat format = new OutputFormat(document);
            format.setLineWidth(65);
            format.setIndenting(true);
            format.setIndent(2);
            Writer out = new StringWriter();
            XMLSerializer serializer = new XMLSerializer(out, format);
            serializer.serialize(document);

            return out.toString();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    private Document parseXmlFile(String in) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            DocumentBuilder db = dbf.newDocumentBuilder();
            InputSource is = new InputSource(new StringReader(in));
            return db.parse(is);
        } catch (ParserConfigurationException e) {
            throw new RuntimeException(e);
        } catch (SAXException e) {
            throw new RuntimeException(e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String unformattedXml =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?><QueryMessage\n" +
                        "        xmlns=\"\"\n" +
                        "        xmlns:query=\"\">\n" +
                        "    <Query>\n" +
                        "        <query:CategorySchemeWhere>\n" +
                        "   \t\t\t\t\t         <query:AgencyID>ECB\n\n\n\n</query:AgencyID>\n" +
                        "        </query:CategorySchemeWhere>\n" +
                        "    </Query>\n\n\n\n\n" +
                        "</QueryMessage>";

        System.out.println(new XmlFormatter().format(unformattedXml));
    }
}
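
As an alternative sketch that avoids the Xerces-specific org.apache.xml.serialize classes, the JDK's own javax.xml.transform API can also indent a String of XML. This is not the answer's method, and the class name is made up for illustration. The indent-amount property is an implementation-specific hint that the JDK's built-in transformer is assumed to honour; other transformers may ignore it, and the exact whitespace in the output can differ between JDK versions.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class TransformerFormatter {

    // String in, String out: run an identity transform with indentation enabled.
    static String format(String unformattedXml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Assumption: namespaced, implementation-specific property; ignored if unsupported.
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");

        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(unformattedXml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(format("<tag><nested>hello</nested></tag>"));
    }
}
```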

Saif Khan

Is there a free XML formatting (indent) tool available where I can paste an XML string and have it formatted so that I can read the XML document properly?


Edit ~ I am using XML Notepad on Windows XP.

Answered By: Guy ( 286)

I believe that Notepad++ has this feature.

Edit (for newer versions)
Install the "XML Tools" plugin (Menu Plugins, Plugin Manager)
Then run: Menu Plugins, Xml Tools, Pretty Print (XML only - with line breaks)

Original answer (for older versions of Notepad++)

Notepad++ menu: TextFX -> HTML Tidy -> Tidy: Reindent XML

This feature however wraps XMLs and that makes it look 'unclean'. To have no wrap,

  • open C:\Program Files\Notepad++\plugins\Config\tidy\TIDYCFG.INI,
  • find the entry [Tidy: Reindent XML] and add wrap:0 so that it looks like this:
[Tidy: Reindent XML]
input-xml: yes
wrap:0

How do I declare an Android UI element using XML?

Answered By: Casebash ( 332)

The Android Developer Guide has a section called Building Custom Components. Unfortunately, the discussion of XML attributes only covers declaring the control inside the layout file and not actually handling the values inside the class initialisation. The steps are as follows:

1. Declare attributes in values\attrs.xml

<?xml version="1.0" encoding="utf-8"?>
<resources>
    <declare-styleable name="MyCustomView">
        <attr name="android:text"/>
        <attr name="android:textColor"/>
        <attr name="extraInformation" format="string" />
    </declare-styleable>
</resources>

Notice the use of an unqualified name in the declare-styleable tag. Non-standard android attributes like extraInformation need to have their type declared. Tags declared in the superclass will be available in subclasses without having to be redeclared.

2. Create constructors

Since there are two constructors that use an AttributeSet for initialisation, it is convenient to create a separate initialisation method for the constructors to call.

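private void init(AttributeSet attrs) {
    TypedArray a = getContext().obtainStyledAttributes(
         attrs, R.styleable.MyCustomView);

    //Use a
    Log.i("test", a.getString(R.styleable.MyCustomView_android_text));
    Log.i("test", "" + a.getColor(
         R.styleable.MyCustomView_android_textColor, Color.BLACK));
    Log.i("test", a.getString(R.styleable.MyCustomView_extraInformation));

    //Don't forget this
    a.recycle();
}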

R.styleable.MyCustomView is an autogenerated int[] resource where each element is the ID of an attribute. Attributes are generated for each property in the XML by appending the attribute name to the element name. For example, R.styleable.MyCustomView_android_text contains the android_text attribute for MyCustomView. Attributes can then be retrieved from the TypedArray using various get functions. If the attribute is not defined in the XML, then null is returned, except when the return type is a primitive, in which case the second argument is returned.

If you don't want to retrieve all of the attributes, it is possible to create this array manually. The IDs for standard android attributes are included in android.R.attr, while attributes for this project are in R.attr.

int attrsWanted[]=new int[]{android.R.attr.text, R.attr.textColor};

Please note that you should not use anything in android.R.styleable, as per this thread it may change in the future. It is still in the documentation because being able to view all these constants in one place is useful.

3. Use it in a layout file such as layout\main.xml

Include the namespace declaration xmlns:app="" in the top level xml element.

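<com.example.MyCustomView
    android:text="Test text"
    app:extraInformation="My extra information" />

Reference the custom view using its fully qualified name; com.example.MyCustomView here is a placeholder for your own package and class.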

Android LabelView Sample

If you want a complete example, look at the android label view sample.

 TypedArray a=context.obtainStyledAttributes(attrs, R.styleable.LabelView);


<declare-styleable name="LabelView">
    <attr name="text" format="string"/>
    <attr name="textColor" format="color"/>
    <attr name="textSize" format="dimension"/>
</declare-styleable>


    app:text="Blue" app:textSize="20dp"/>

This is contained in a LinearLayout with a namespace attribute: xmlns:app=""