Friday, September 20, 2024

Convert HTML Tags to Lower-case for XHTML Compliance

The XHTML specification requires all element and attribute names to be lower-case.

Otherwise your page will not validate as XHTML. If you write all of your XHTML yourself, that shouldn’t be an issue.

You simply write all tags in lower-case.

Now, imagine situations where you’re not in control of the code being written.

One situation is when you let visitors or users of the website write HTML in a text box or, even better, a rich text editor like FCKeditor or FreeTextBox. No rich text editor I know of writes flawless XHTML in all situations; correct me if I’m wrong.

So, I wrote a little static helper method in C# that converts HTML tags to lower-case.

/// <summary>
/// Convert HTML tags from upper case to lower case. This is important in order
/// to make the markup XHTML compliant. The list also includes some tags that
/// are not XHTML compliant; you can remove them if you want.
/// </summary>
private static string LowerCaseHtml(string html)
{
    string[] tags = new string[] {
    "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
    "h3", "h4", "h5", "h6", "ul", "ol", "li", "img",
    "tr", "table", "th", "td", "tbody", "thead", "tfoot",
    "input", "select", "option", "textarea", "em", "strong"
    };

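    foreach (string s in tags)
    {
      // Use the invariant culture so that "i" upper-cases to "I" on every server.
      string upper = s.ToUpperInvariant();

      // Match the whole tag name so that "<B" does not also hit "<BODY" or "<BR".
      html = html.Replace("<" + upper + ">", "<" + s + ">");   // <P>
      html = html.Replace("<" + upper + " ", "<" + s + " ");   // <P CLASS="x">
      html = html.Replace("<" + upper + "/", "<" + s + "/");   // <BR/>
      html = html.Replace("</" + upper + ">", "</" + s + ">"); // </P>
    }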

    return html;
}
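If maintaining a list of tag names is a burden, a regular expression can lower-case whatever tag name follows "<" or "</". The following is only a sketch of that alternative, not the method used in this article; the class name XhtmlTagCase and the method name LowerCaseTagNames are made up for illustration. It assumes the input is ordinary markup: a "<" directly followed by a letter inside script or CDATA content would be lower-cased too.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XhtmlTagCase
{
    // "<" or "</" followed by a tag name (a letter, then letters or digits).
    private static readonly Regex TagName =
        new Regex(@"</?[A-Za-z][A-Za-z0-9]*", RegexOptions.Compiled);

    // Lower-cases every tag name; attributes and text content are left alone.
    public static string LowerCaseTagNames(string html)
    {
        // Lower-casing the whole match is safe because "<" and "/" have no case.
        return TagName.Replace(html, m => m.Value.ToLowerInvariant());
    }
}
```

Calling XhtmlTagCase.LowerCaseTagNames("<P CLASS=\"Intro\">Hello <BR/>World</P>") returns <p CLASS="Intro">Hello <br/>World</p>. The attribute name stays in upper case because the pattern stops at the end of the tag name.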

If you also want to lower-case the HTML attributes, you can do it almost the same way as the HTML tags. I probably missed some attributes, but you can easily add them to the string array in the method below.

/// <summary>
/// Convert HTML attributes from upper case to lower case. This is important in order
/// to make the markup XHTML compliant.
/// </summary>
private static string LowerCaseAttributes(string html)
{
    string[] attributes = new string[] {
    "align", "cellspacing", "cellpadding", "valign", "border",
    "style", "alt", "title", "for", "col", "header", "clear",
    "colspan", "rows", "cols", "type", "name", "id", "target", "method"
    };

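    foreach (string s in attributes)
    {
       // The leading space keeps "ALIGN=" from matching inside "VALIGN=".
       // ToUpperInvariant avoids culture-specific casing of "i".
       html = html.Replace(" " + s.ToUpperInvariant() + "=", " " + s + "=");
    }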

    return html;
}
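A plain string replacement also touches text outside of tags, for example a sentence that happens to contain " ID=". As a rough sketch of a stricter variant, the code below only looks inside opening tags and lower-cases each attribute name while copying its value unchanged. The names XhtmlAttributeCase and LowerCaseAttributeNames are hypothetical, and the sketch assumes that attribute values do not contain a ">" character.

```csharp
using System;
using System.Text.RegularExpressions;

public static class XhtmlAttributeCase
{
    // An opening tag: "<", a letter, then everything up to the next ">".
    private static readonly Regex OpeningTag =
        new Regex(@"<[A-Za-z][^<>]*>", RegexOptions.Compiled);

    // Inside a tag: whitespace, an attribute name, and an optional
    // value that is double-quoted, single-quoted, or unquoted.
    private static readonly Regex Attribute = new Regex(
        @"(?<space>\s+)(?<name>[A-Za-z_:][-A-Za-z0-9_:.]*)" +
        @"(?<value>\s*=\s*(?:""[^""]*""|'[^']*'|[^\s""'>]+))?",
        RegexOptions.Compiled);

    // Lower-cases attribute names only; tag names, values and text are untouched.
    public static string LowerCaseAttributeNames(string html)
    {
        return OpeningTag.Replace(html, tag =>
            Attribute.Replace(tag.Value, attr =>
                attr.Groups["space"].Value
                + attr.Groups["name"].Value.ToLowerInvariant()
                + attr.Groups["value"].Value));
    }
}
```

For the input <TD ALIGN="Left" VALIGN=top NOWRAP>ID=5</TD> this returns <TD align="Left" valign=top nowrap>ID=5</TD>. Because each value is consumed together with its name, text inside a quoted value is never rescanned.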

You can use these methods when you save the input from a text box, or when you render the page. Here’s how you change the output of an ASP.NET page by overriding the Render method. To optimize performance, remove the tags you don’t need from the array.

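protected override void Render(HtmlTextWriter writer)
{
    // Render the page into an in-memory writer instead of the response.
    using (HtmlTextWriter htmlwriter = new HtmlTextWriter(new System.IO.StringWriter()))
    {
       base.Render(htmlwriter);

       // Lower-case the captured markup, then send it to the real output.
       writer.Write(LowerCaseHtml(htmlwriter.InnerWriter.ToString()));
    }
}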
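The Render override captures the markup in memory, transforms it, and then writes it to the real output. The same pattern can be tried outside ASP.NET with plain System.IO types, which also makes it easy to chain several transformations, such as the tag and attribute methods. This is a minimal sketch; BufferedOutput and WriteTransformed are illustrative names, not part of ASP.NET.

```csharp
using System;
using System.IO;

public static class BufferedOutput
{
    // Runs "render" against an in-memory buffer, applies each transform
    // in order, and writes the final markup to "destination".
    public static void WriteTransformed(
        TextWriter destination,
        Action<TextWriter> render,
        params Func<string, string>[] transforms)
    {
        using (var buffer = new StringWriter())
        {
            render(buffer);
            string markup = buffer.ToString();

            foreach (var transform in transforms)
            {
                markup = transform(markup);
            }

            destination.Write(markup);
        }
    }
}
```

In a page, the render delegate would correspond to base.Render and the transforms to LowerCaseHtml followed by LowerCaseAttributes.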

You can use this approach in conjunction with my whitespace removal method. It also uses the page’s Render method.


Mads Kristensen currently works as a Senior Developer at Traceworks, located in Copenhagen, Denmark. Mads graduated from Copenhagen Technical Academy with a multimedia degree in 2003, but has been a professional developer since 2000. His main focus is on ASP.NET, but he is also responsible for WinForms, Windows services and web services in his daily work. A true .NET developer with great passion for the simple solution.
