XML escaping and unescaping are essential when working with XML data to ensure that special characters like <, >, &, and ", which have specific meanings in XML, are handled correctly.
1. XML Escape/Unescape in Java
In Java, you can escape and unescape XML data manually, use a library such as Apache Commons Text, or let the JDK's built-in XML APIs handle it when they write and parse documents.
Using Apache Commons Text (XML Escape/Unescape)
You can use the StringEscapeUtils class from the Apache Commons Text library to escape and unescape XML strings.
First, add the Apache Commons Text dependency to your project:
xml
<!-- In Maven (pom.xml) -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-text</artifactId>
    <version>1.10.0</version> <!-- Use the latest version; 1.10.0 or later fixes CVE-2022-42889 -->
</dependency>
XML Escape (Using Apache Commons Text)
java
import org.apache.commons.text.StringEscapeUtils;

public class XMLEscapeExample {
    public static void main(String[] args) {
        String input = "<div>Hello & welcome!</div>";
        // escapeXml10 replaces <, >, &, " and ' with their XML entities
        String escaped = StringEscapeUtils.escapeXml10(input);
        System.out.println(escaped); // Output: &lt;div&gt;Hello &amp; welcome!&lt;/div&gt;
    }
}
XML Unescape (Using Apache Commons Text)
java
import org.apache.commons.text.StringEscapeUtils;

public class XMLUnescapeExample {
    public static void main(String[] args) {
        String escaped = "&lt;div&gt;Hello &amp; welcome!&lt;/div&gt;";
        // unescapeXml converts the XML entities back to the characters they stand for
        String unescaped = StringEscapeUtils.unescapeXml(escaped);
        System.out.println(unescaped); // Output: <div>Hello & welcome!</div>
    }
}
Manual XML Escape/Unescape (Without External Libraries)
If you prefer not to use external libraries, here's how you can manually handle escaping and unescaping XML data:
XML Escape (Manual Method)
java
public class XMLManualEscape {
    public static String escapeXML(String input) {
        // Escape & first so the ampersands introduced by the other entities are not escaped again
        return input.replace("&", "&amp;")
                    .replace("<", "&lt;")
                    .replace(">", "&gt;")
                    .replace("\"", "&quot;")
                    .replace("'", "&apos;");
    }

    public static void main(String[] args) {
        String input = "<div>Hello & welcome!</div>";
        String escaped = escapeXML(input);
        System.out.println(escaped); // Output: &lt;div&gt;Hello &amp; welcome!&lt;/div&gt;
    }
}
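The chained replace calls above scan the string once per character being escaped. As a rough alternative, the same five replacements can be done in a single pass with a StringBuilder. This is only a sketch: the class name XmlEscapeSketch and the method name escapeXml are made up for illustration, and it assumes a non-null input.

```java
final class XmlEscapeSketch {
    // Hypothetical helper: escapes the five predefined XML entities in one pass.
    static String escapeXml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeXml("<div>Hello & welcome!</div>"));
        // prints: &lt;div&gt;Hello &amp; welcome!&lt;/div&gt;
    }
}
```

Because every character is examined exactly once, the order of the cases does not matter here, unlike with chained replacements.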
XML Unescape (Manual Method)
java
public class XMLManualUnescape {
    public static String unescapeXML(String input) {
        // Unescape &amp; last so that text such as "&amp;lt;" is not decoded twice
        return input.replace("&lt;", "<")
                    .replace("&gt;", ">")
                    .replace("&quot;", "\"")
                    .replace("&apos;", "'")
                    .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String escaped = "&lt;div&gt;Hello &amp; welcome!&lt;/div&gt;";
        String unescaped = unescapeXML(escaped);
        System.out.println(unescaped); // Output: <div>Hello & welcome!</div>
    }
}
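One subtlety of unescaping with chained replacements is the order in which the entities are handled. The sketch below compares two hypothetical helpers, unescapeAmpFirst and unescapeAmpLast (both names invented for this illustration), on the escaped form of the literal text "&lt;". It covers only the five predefined entities, not numeric character references such as &#60;.

```java
final class UnescapeOrderSketch {
    // Replaces &amp; first: the '&' it produces is then re-read as the start of another entity.
    static String unescapeAmpFirst(String s) {
        return s.replace("&amp;", "&")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'");
    }

    // Replaces &amp; last, so each entity is decoded exactly once.
    static String unescapeAmpLast(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Escaped form of the literal text "&lt;" (an ampersand followed by "lt;").
        String escaped = "&amp;lt;";
        System.out.println(unescapeAmpFirst(escaped)); // prints: <     (decoded twice, wrong)
        System.out.println(unescapeAmpLast(escaped));  // prints: &lt;  (decoded once, correct)
    }
}
```

Escaping has the mirror-image rule: the & character has to be escaped first, otherwise the ampersands introduced by the other entities would be escaped a second time.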
2. XML Escape/Unescape in .NET
In .NET, you can use the System.Security or System.Xml namespace to escape and unescape XML strings.
Using System.Security.SecurityElement for XML Escape/Unescape
The SecurityElement.Escape method is commonly used for XML escaping in .NET. It replaces <, >, &, ", and ' with their XML entities.
XML Escape (Using SecurityElement.Escape in .NET)
csharp
using System;
using System.Security;
public class XMLEscapeExample
{
    public static void Main()
    {
        string input = "<div>Hello & welcome!</div>";
        string escaped = SecurityElement.Escape(input);
        Console.WriteLine(escaped); // Output: &lt;div&gt;Hello &amp; welcome!&lt;/div&gt;
    }
}
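XML Unescape (Using HttpUtility.HtmlDecode in .NET)
SecurityElement has no public Unescape counterpart, so to unescape XML strings you need to perform the reverse operation yourself (e.g., replacing &lt; with <).
You can use the HttpUtility.HtmlDecode method for unescaping, as shown below, or handle it manually. Note that HtmlDecode also decodes HTML-only entities such as &nbsp;, so it is broader than a strict XML unescape.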
csharp
using System;
using System.Web;
public class XMLUnescapeExample
{
    public static void Main()
    {
        string escaped = "&lt;div&gt;Hello &amp; welcome!&lt;/div&gt;";
        string unescaped = HttpUtility.HtmlDecode(escaped); // Decodes &lt;, &gt;, &amp; and &quot; (plus HTML entities)
        Console.WriteLine(unescaped); // Output: <div>Hello & welcome!</div>
    }
}
Manual XML Escape/Unescape in .NET
If you're not using SecurityElement, here's how you could manually handle XML escaping and unescaping:
XML Escape (Manual Method)
csharp
using System;
public class XMLManualEscape
{
    public static string EscapeXML(string input)
    {
        // Escape & first so the ampersands introduced by the other entities are not escaped again
        return input.Replace("&", "&amp;")
                    .Replace("<", "&lt;")
                    .Replace(">", "&gt;")
                    .Replace("\"", "&quot;")
                    .Replace("'", "&apos;");
    }

    public static void Main()
    {
        string input = "<div>Hello & welcome!</div>";
        string escaped = EscapeXML(input);
        Console.WriteLine(escaped); // Output: &lt;div&gt;Hello &amp; welcome!&lt;/div&gt;
    }
}
XML Unescape (Manual Method)
csharp
using System;
public class XMLManualUnescape
{
    public static string UnescapeXML(string input)
    {
        // Unescape &amp; last so that text such as "&amp;lt;" is not decoded twice
        return input.Replace("&lt;", "<")
                    .Replace("&gt;", ">")
                    .Replace("&quot;", "\"")
                    .Replace("&apos;", "'")
                    .Replace("&amp;", "&");
    }

    public static void Main()
    {
        string escaped = "&lt;div&gt;Hello &amp; welcome!&lt;/div&gt;";
        string unescaped = UnescapeXML(escaped);
        Console.WriteLine(unescaped); // Output: <div>Hello & welcome!</div>
    }
}
Summary of XML Escape and Unescape
Escaping XML:
Special characters in XML like <, >, &, ", and ' need to be replaced with their corresponding XML entities (&lt;, &gt;, &amp;, &quot;, and &apos;) so that a parser treats them as text rather than markup.
Common use cases: When you need to include raw text that might contain special characters within XML data (e.g., inside a <tag>).
Unescaping XML:
Unescaping XML involves converting the entities back to their original characters.
Common use cases: When you want to parse or manipulate XML data and need to decode the escape sequences to work with raw data.
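In many programs both steps happen implicitly, because XML writers escape text and XML parsers unescape it. The following sketch illustrates that round trip using only the JDK's standard XML APIs (XMLStreamWriter and DocumentBuilder). The class name XmlRoundTripSketch, the helper names write and read, and the <msg> element are assumptions made for the example, and the parser is left at its default settings, which is reasonable only for trusted input.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;
import org.xml.sax.InputSource;

final class XmlRoundTripSketch {
    // Writes text as the content of a hypothetical <msg> element; the writer escapes it.
    static String write(String text) throws Exception {
        StringWriter sink = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(sink);
        writer.writeStartElement("msg");
        writer.writeCharacters(text);
        writer.writeEndElement();
        writer.flush();
        writer.close();
        return sink.toString();
    }

    // Parses the document and returns the root element's text; the parser unescapes it.
    static String read(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement()
                .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String original = "<div>Hello & welcome!</div>";
        String xml = write(original);
        System.out.println(xml);       // serialized form, with the markup characters escaped
        System.out.println(read(xml)); // prints: <div>Hello & welcome!</div>
    }
}
```

Explicit escape and unescape calls are mainly needed when XML is assembled or inspected as plain strings rather than through such an API.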