Speed up a slow JSF XHTML editing experience in Eclipse or IBM RAD/RSA.

If you do JSF (JavaServer Faces) development in Eclipse, IBM RAD (Rational Application Developer) or IBM RSA (Rational Software Architect), you may find that the JSF editor runs slowly and lags. This seems to be a particular problem on RAM-starved machines and/or older versions of the Eclipse/RAD IDEs. The problem (which can be intermittent) is very frustrating and can result in whole seconds going by after typing before your changes appear in the editor. It seems that the JSF code validator takes too long to re-validate the edited JSF file. At one point this got so bad for our team that many would make JSF changes in a text editor and then copy/paste the final code into the IDE.


Thankfully there is a workaround, and so that I don’t forget it if I hit this problem again I’m posting it here. The workaround (sadly not a fix) is to use a different “editor” within the same IDE. If you right-click the JSF file you want to edit and use the pop-up menu to open it with the XML Editor instead of the XHTML Editor, you will find a much faster experience. Whilst this removes some of the JSF/XHTML-specific validations, it still provides support for tags etc. and performs faster.

Should you wish to always use the XML Editor to edit XHTML files you can make this global change via the preferences. Go to General > Editors > File Associations, select the XHTML extension in the File Types list, click Add and add the XML Editor. Then in the associated editors list select the XML Editor and click the ‘Default’ button, making the XML Editor the default for all XHTML files. Once this is done you can still right-click an individual XHTML file and open it in the original XHTML editor should you want to switch back temporarily for that file.

Hopefully this will prevent you pulling your hair out in frustration when editing XHTML files.


Building a Python Flask Web UI For Raspberry Pi Sure Elec LCD

In an earlier post I outlined how I set up a Sure Electronics LCD screen with my Raspberry Pi 3 using a Python driver. Whilst updating the LCD via the command line is immensely useful, I decided to build a UI to control the LCD and send messages to it. By using a browser-based UI I could update the LCD screen from anywhere. Essentially this was a chance to play with a Python web framework and write some code!


I’ve passed the UI’s URL round my family’s devices at home and they now send me messages whilst I’m in my study working/playing.

The end result can be found in my GitHub repo.

As my driver was in Python and I’m enjoying coding in Python at the moment, I decided to use a Python web framework to serve the HTML/JavaScript UI and host RESTful services on the server side to accept LCD commands. After some reading I went with Flask. I could have gone with Django, but Flask seemed more appropriate for my needs. For a good comparison see this CodeMentor.io post. For a great tutorial on Flask check out this series by Miguel Grinberg and this great post by Scotch.io.

Building the server side was easy and logical in Flask, and I was able to get something set up in one file which served my needs. However, after reading some Flask best practices I spread my solution out into a more appropriate structure. Flask will seem familiar to web developers with experience of ASP.net MVC, Web API, Node/Express etc. You define routes to handle incoming requests. The key aspects of my solution are outlined below. I am running the Flask server directly on my Raspberry Pi and using it to serve the pages and host the services for commanding the LCD screen.

To install Flask (on a Pi) first install Python Pip (a popular Python Package Manager) via “apt-get install python-pip” or “apt-get install python3-pip” (for a Python v3 specific Pip) and then install Flask via “pip install flask”.

Flask comes with a small lightweight development server which runs your app in Dev mode and also auto restarts after code changes. I found this fast and robust enough for my needs. 

Let’s check out the main parts of the code:

run.py: This is the entry point for the app. When run it calls run on the app object. Here I have optionally passed the IP and port I want the app to listen on, which exposes the app to the internal network so I can connect from other machines.

from app import app

if __name__ == '__main__':
    # "0.0.0.0" listens on all network interfaces so other machines can connect;
    # an empty host would fall back to localhost only
    app.run(host="0.0.0.0", port=5000)

app/__init__.py & config.py: This is the app initialisation code, which points to the config.py file where config settings can be set for the app.

app/views.py: This is the heart of the app. After importing the relevant python components and instantiating the Smartie LCD driver (from previous post), the routes for the app are defined.

name = "Jeff"  # example data passed to the template

@app.route("/")
def show_homepage():
    return "Home Page!"

@app.route("/lcd")
def show_lcdpage():
    return render_template("lcd.html", name=name)

The route for root will just return the text “Home Page!” whereas the route for /lcd calls render_template to return a templated HTML page (lcd.html), passing any relevant data (e.g. “Jeff”, which is irrelevant in this example). Templates are covered shortly.

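@app.route("/lcd/clear", methods=["POST","GET"])
def display_clear():
    # the Smartie driver's clear-screen call goes here (not shown in this listing)
    return "Success"

@app.route("/lcd/displaymessage", methods=["POST"])
def display_message():
    if not request.json:
        abort(400)  # flask.abort: reject requests that carry no JSON body
    # the JSON data is passed to the Smartie driver here (not shown in this listing)
    return "Success"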

Any POST or GET on http://SERVERADDRESS:PORT/lcd/clear will result in the Smartie driver’s clear screen method being called. A POST to “/lcd/displaymessage” is validated to ensure that the request contains JSON data, and the data is then passed to the driver for display.
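Because the services are plain HTTP endpoints, they can also be driven from a script instead of the browser. Below is a minimal sketch using only the Python standard library. The helper names and the {"message": ...} payload shape are hypothetical (the JSON keys the driver expects are not shown in this post), and SERVERADDRESS is a placeholder for your Pi's address.

```python
import json
import urllib.request


def build_display_request(base_url, payload):
    """Build (but do not send) a JSON POST for the /lcd/displaymessage route."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/lcd/displaymessage",
        data=body,
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )


def build_clear_request(base_url):
    """Build (but do not send) a POST for the /lcd/clear route."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/lcd/clear",
        method="POST",
    )


if __name__ == "__main__":
    # The payload key "message" is an assumption made for illustration only.
    req = build_display_request("http://SERVERADDRESS:5000", {"message": "Dinner is ready"})
    print(req.get_method(), req.full_url)
    # To actually send it against a running server:
    # urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the sketch easy to check without a Pi on the network.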

/app/templates/lcd.html: This is the main HTML page that enables the user to type the messages to display.


The CSS and JavaScript used by this page are found in the static folder and referenced in the usual way:

<!-- CSS for our app -->         
 <link rel="stylesheet" href="/static/lcd.css"/>

<!-- JS for our app --> 
<script type="text/javascript" src="/static/lcd.js" charset="utf-8"></script>

So we need to ensure that the Flask server returns these static files, but we don’t want to have to define a specific app.route for each one, so instead we use this single route in our views.py:

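@app.route("/static/<path:filename>")
def send_file(filename):
    # serve the requested file from the app's static folder
    return send_from_directory(app.static_folder, filename)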

This basically states that any requests for a file path are sourced from the /static folder directly. So we can just place any files in the static folder that we want to be served directly (the CSS and JavaScript files in our case).

/app/static/lcd.js:  From this JavaScript code we can consume the services hosted by Flask for our application. It’s using the XMLHttpRequest object to make AJAX requests to the Flask server. The SendCommand function takes callback methods which will be called on success or error.

function SendCommand(url, httpVerb, data, successCallback, errorCallback){
  // serialise the payload (if any) to JSON
  var dataToSend = null;
  if (data != null){
      dataToSend = JSON.stringify(data);
  }
  var request = new XMLHttpRequest();
  request.open(httpVerb, url, true);
  request.setRequestHeader('Content-Type', 'application/json; charset=UTF-8');
  request.onload = function() {
      if (this.status >= 200 && this.status < 400){
          // success here
          var returnedData = null;
          if (this.response != null){
              returnedData = this.response;
          }
          successCallback(returnedData, this.status);
      } else {
          // error returned from server
          errorCallback("Error response returned from server", this.status);
      }
  };
  request.onerror = function() {
      errorCallback("Error contacting server", this.status);
  };
  if (dataToSend != null){
      request.send(dataToSend);
  } else {
      request.send();
  }
}

That’s mostly it. Run the app by running the run.py module (e.g. in Python IDLE or a terminal) and direct your browser to http://SERVERADDRESS:5000/lcd.

The code for my Python driver and this web app is available on GitHub here https://github.com/RichHewlett/smartie and https://github.com/RichHewlett/LCD-Smartie-Web.

Useful React.JS Learning Resources

Below are some links that you might find useful for learning React.js (Facebook’s successful JavaScript UI framework) and Flux. There are a lot of resources out there but here are some of the best that I have collected for members of my team.

Introductions and overviews of React.js:


Tutorials for Flux & React:


If you prefer the old-school approach of reading a book, check out React.js Essentials by Artemij Fedosejev.

NPM config for web access via a proxy

If you are using NPM to install your JavaScript modules and you are sitting behind a corporate proxy server with a strict firewall then you will likely have problems. If NPM cannot find its way out to the web you will likely get a timeout error like the one below:


npm ERR! argv "C:\\node.exe" "C:\\nodejs\\node_modules\\npm\\bin\\npm-cli.js" "install" "package1"
npm ERR! node v4.2.1
npm ERR! npm  v2.14.7
npm ERR! errno ETIMEDOUT
npm ERR! syscall connect
npm ERR! network connect ETIMEDOUT
npm ERR! network This is most likely not a problem with npm itself
npm ERR! network and is related to network connectivity.
npm ERR! network In most cases you are behind a proxy or have bad network settings.
npm ERR! network
npm ERR! network If you are behind a proxy, please make sure that the
npm ERR! network 'proxy' config is set properly.  See: 'npm help config'

To resolve this problem you need to tell NPM the address of your web proxy, including the username/password to authenticate, so that it can route outgoing HTTP requests via that proxy. NPM stores its configuration in a config file, which can be edited from the console/terminal using the “npm config” command. Use this command to set both the HTTP and HTTPS values, replacing the username/password and proxy address with your own values:

npm config set proxy http://username:password@yourproxy.yourcompany.com:8080/
npm config set https-proxy http://username:password@yourproxy.yourcompany.com:8080/
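One practical wrinkle: if the username or password contains characters that are special in a URL (such as @, : or a backslash in a domain login), they need to be percent-encoded before going into the proxy URL. The sketch below builds the two commands using the Python standard library. The function name build_proxy_url and the sample credentials are hypothetical, and it assumes your proxy accepts credentials embedded in the URL as shown above.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return a proxy URL with the credentials percent-encoded."""
    # safe="" makes quote() encode every reserved character, including "/" and ":"
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}/"


if __name__ == "__main__":
    # Sample values only; replace with your own credentials and proxy address.
    url = build_proxy_url("username", "p@ss:word", "yourproxy.yourcompany.com", 8080)
    print(f"npm config set proxy {url}")
    print(f"npm config set https-proxy {url}")
```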

To view the current proxy settings, or to check that your change worked, you can run “npm config get” (as opposed to “npm config set”) to read the settings.

npm config get proxy
npm config get https-proxy

Alternatively running only “npm config get” will show ALL NPM config settings.

Should you want to remove the npm setting you can do it like this:

npm config rm proxy
npm config rm https-proxy

For more information check out the NPM documentation here: https://docs.npmjs.com/misc/config

Setting HTTP Headers in Java Server Faces (JSF)

In my last post (Preventing Browser Caching using HTTP Headers) I discussed using HTTP headers to control browser caching of sensitive data. The examples provided in that post were all ASP.Net, so I thought I’d cover how to explicitly set your HTTP response headers when you are using the Java JSF framework.

Adding Headers via Code

You can set HTTP response headers directly in your code via the HttpServletResponse object, as below:

import javax.faces.context.ExternalContext;
import javax.faces.context.FacesContext;
import javax.servlet.http.HttpServletResponse;

// fetch the underlying servlet response from the current Faces context
ExternalContext context = FacesContext.getCurrentInstance().getExternalContext();
HttpServletResponse response = (HttpServletResponse) context.getResponse();
response.setHeader("TestHeader", "hello");

This results in the addition of this header in the HTTP Response: TestHeader: hello.

Adding Headers in your XHTML Markup

Alternatively you can set it directly on each XHTML page via an event tag, as shown in the example below:

<f:event type="preRenderView" listener="#{facesContext.externalContext.response.setHeader('TestHeader', 'hello')}" />
    <h:panelGrid columns="2">
      <h:outputText value="Name"></h:outputText>
      <h:inputText value="#{LoginBean.name}"></h:inputText>
      <h:outputText value="Password"></h:outputText>
      <h:inputSecret value="#{LoginBean.password}"></h:inputSecret>
    </h:panelGrid>
    <h:commandButton action="welcome" value="Submit" />

This again results in the addition of this header in the HTTP Response: TestHeader: hello.

Adding Headers via a custom Web Filter

In order to implement a solution across your whole web application and for the ability to set headers for different resource types (not just facelets), a web filter may be what you need. Filters intercept your requests and responses to dynamically transform them or use the information contained in them. A good guide to Filters is this official Oracle one – The Essentials of Filters, or check this one out – Servlet Filter to Set Response Headers .

Here is the source code for a simple filter that sets a custom header and can be used to explicitly set HTTP response headers as required:

package Filters; 
import java.io.IOException; 
import javax.servlet.Filter; 
import javax.servlet.FilterChain; 
import javax.servlet.FilterConfig; 
import javax.servlet.ServletException; 
import javax.servlet.ServletRequest; 
import javax.servlet.ServletResponse; 
import javax.servlet.http.HttpServletResponse;

public class HeaderFilter implements Filter {
    public void init(FilterConfig fc) throws ServletException {}

    // add the custom header, then pass the request on down the filter chain
    public void doFilter(ServletRequest req, ServletResponse res,
            FilterChain fc) throws IOException, ServletException {
        HttpServletResponse response = (HttpServletResponse) res;
        response.setHeader("TestHeader", "hello");
        fc.doFilter(req, res);
    }

    public void destroy() {}
}

Once you’ve coded your filter you need to update your web.xml file to tell the framework that you want to use it. Do this by adding a filter element containing the name and class details. You also need to add a filter-mapping element to map the filter to the requests it should handle. In my configuration for the filter above I mapped “/*”, meaning all requests go through the filter, but you can configure this to only affect certain resources or file types.


Once configured via the web.xml file, the filter sets this header on all responses: TestHeader: hello.

Preventing Browser Caching using HTTP Headers

Many developers consider the use of HTTPS on a site enough security for a user’s data, however one area often overlooked is the caching of your site’s pages by the user’s browser. By default (for performance) browsers will cache pages visited regardless of whether they are served via HTTP or HTTPS. This behaviour is not ideal for security as it allows an attacker to use the locally stored browser history and browser cache to read possibly sensitive data entered by a user during their web session. The attacker would need access to the user’s machine (either physically in the case of a shared device or remotely via remote access or malware). To avoid this scenario for your site you should consider telling the browser not to cache sensitive pages via the header values in your HTTP response. Unfortunately it’s not quite that easy, as different browsers implement different policies and treat the various cache control values in HTTP headers differently.

Taking control of caching via the use of HTTP headers

To control how the browser (and any intermediate server) caches the pages within our web application we need to change the HTTP headers to explicitly prevent caching. The minimum recommended HTTP headers to de-activate caching are:

Cache-Control: no-store
Pragma: no-cache

Below are the settings seen on many secure sites as a comparison to above and perhaps as a guide to what we should really be aiming for:

Cache-Control:max-age=0, no-cache, no-store, must-revalidate
Expires:Thu, 01 Jan 1970 00:00:00 GMT

HTTP Headers & Browser Implementation Differences:

Different web browsers implement caching in differing ways and therefore also implement various subtleties in their support for the cache controlling HTTP headers. This also means that as browsers evolve so too will their implementations related to these header values.

Pragma Header Setting

The ‘Pragma’ setting is widely used but is now outdated (a setting retained from HTTP 1.0) and actually relates to requests, not responses. Because developers have been ‘over-using’ it on responses, many browsers started to honour it to control response caching. This is why it is best included even though it has been superseded by specific HTTP 1.1 directives.

Cache-Control ‘No-Store’ & ‘No-Cache’ Header Settings

A “Cache-Control” setting of private instructs any proxies not to cache the page but still permits the browser to cache it. Changing this to no-store instructs the browser not to store the page in a local cache at all. This is the most secure setting. Again due to variances in implementation, a setting of no-cache is sometimes treated as no-store (despite actually meaning “cache, but always re-validate”). For this reason the common recommendation is to include both settings, i.e. Cache-Control: no-store, no-cache.

Expires Header Setting

This again is an old HTTP 1.0 setting that is maintained for backward compatibility. Setting this to a date in the past forces the browser to treat the data as stale, so it will not be loaded from cache but re-queried from the originating server. The data is still cached locally on disk though, so this provides only a small security benefit: it prevents an attacker simply using the browser back button to read the data, but not reading the cache on the file system. For example: Expires: Thu, 01 Jan 1970 00:00:00 GMT

Max-Age Header Setting

This is the HTTP 1.1 equivalent of the Expires header. Setting it to 0 forces the browser to re-validate with the originating server before displaying the page from cache. For example: Cache-Control: max-age=0

Must-Revalidate Header Setting

This instructs the browser that it must revalidate the page against the originating server before loading from the cache, i.e. Cache-Control: must-revalidate
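To make the combination concrete, here is a small sketch that assembles the header values described above in one place. It is written in Python purely for illustration (the ASP.net options follow later), and the function name no_cache_headers is hypothetical. The Expires value is generated from the Unix epoch rather than typed by hand, which also avoids getting the day of the week wrong.

```python
from email.utils import formatdate


def no_cache_headers():
    """Return the response headers discussed above as a name -> value mapping."""
    return {
        # HTTP 1.1 directives: do not store, and re-validate before any reuse
        "Cache-Control": "max-age=0, no-cache, no-store, must-revalidate",
        # HTTP 1.0 fallback that many browsers still honour on responses
        "Pragma": "no-cache",
        # A date in the past (the Unix epoch) so the content is treated as stale
        "Expires": formatdate(0, usegmt=True),
    }


if __name__ == "__main__":
    for name, value in no_cache_headers().items():
        print(f"{name}: {value}")
```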

Implementing the HTTP Header Options

Which pages will be affected?

Technically you only need to turn off caching on those pages where sensitive data is being collected or displayed. This needs to be balanced against the risk of accidentally not implementing the change on new pages in the future, or of the change being accidentally removed from individual pages. A review of your web application might show that the majority of pages display sensitive data and therefore a global setting would be beneficial. A global setting would also ensure that any pages added to the application in future are automatically covered by this change, reducing the impact of developers forgetting to set the values.

There is a trade-off with performance here and this must be considered in your approach. As this change affects the client caching mechanics of the site, pages will no longer be cached on the client, which will slow client response times and may also increase load on the servers. A full performance test is required following any change in this area.

Implementing in ASP.net

There are numerous options for implementing the HTTP headers in a web application. These options are outlined below with their strengths and weaknesses. ASP.net and the .Net framework provide methods to set caching controls on the Response and Cache objects. These in turn result in HTTP headers being set for the page/application’s HTTP responses. This provides a level of abstraction from the HTTP headers, but that abstraction prevents you from setting the headers exactly how you might like them for full browser compatibility. The alternative approach is to explicitly set the HTTP headers. Both options and how they can be implemented are explored below:

Using ASP.net Intrinsic Cache Settings
Declaratively Set Output Cache per ASPX Page

Using the ASPX Page object’s attributes you can declaratively set the output cache properties for the page, including the HTTP header values regarding caching. The syntax is shown in the example below:

Example ASPX page:

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="Default.aspx.cs" Inherits="CacheTestApp._Default" %>
<%@ OutputCache Duration="60" VaryByParam="None" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server"></head>
<body>
<form id="form1" runat="server">This is Page 1.</form>
</body>
</html>

Parameters can be added to the OutputCache settings via the various supported attributes. Whilst this configuration allows specific targeting of the caching solution by enabling you to define a cache setting for each separate page it has the drawback that it needs changes to be made to all pages and all user controls. In addition developers of any new pages will need to ensure that the page’s cache settings are correctly configured. Lastly this solution is not configurable should the setting need to be changed per environment or disabled for performance reasons.

Declaratively Set Output Cache Using a Global Output Cache Profile

An alternative declarative solution for configuring a page’s cache settings is to use a Cache Profile. This works by again adding an OutputCache directive to each page (and user control) but this time deferring the configuration settings to a CacheProfile in the web.config file.

Example ASPX page:

<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="Default.aspx.cs" Inherits="CacheTestApp._Default" %>
<%@ OutputCache CacheProfile="RHCacheProfile" %>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server"></head>
<body>
<form id="form1" runat="server">This is Page 1.</form>
</body>
</html>

Web.config file:

<outputCache enableOutputCache="false"/> 
<OutputCache CacheProfile=" RHCacheProfile"> 
<add name="RHCacheProfile" 

This option provides the specific targeting per page and the related drawbacks of having to make changes to every page and user control. This solution does provide the ability to centralise the cache settings in one place (minimising the impact of future changes) and enables caching to be set during installation depending on target environment via the deployment process.

Programmatically Set HTTP Headers in ASPX Pages

Output caching can also be set in code in the code behind page (or indeed anywhere where the response object can be manipulated). The code snippet below shows setting the HTTP headers indirectly via the Response.Cache object:

Response.Cache.SetMaxAge(new TimeSpan(0,0,30));

This code would need to be added to each page, so it results in duplicate code to maintain and again relies on developers remembering to add it to every new page. It results in the below headers being produced:

Cache-Control:no-cache, no-store



Programmatically Set HTTP Headers in Global ASAX File

Instead of adding the above code in each page an alternative approach is to add it to the Global ASAX file so as to apply to all requests made through the application.

void Application_BeginRequest(object sender, EventArgs e)
{
	Response.Cache.SetMaxAge(new TimeSpan(0,0,30));
}

This would apply to all pages being requested through the application. It results in the below headers being produced:

Cache-Control:no-cache, no-store



Explicitly Define HTTP Headers Outside of ASP.net Cache Settings

Explicitly Define HTTP Headers in ASPX Pages

The response object can have its HTTP headers set explicitly instead of using the ASP.net Cache object’s abstraction layer. This involves setting the headers on every page:

void Page_Load(object sender, EventArgs e)
{
	Response.AddHeader("Cache-Control", "max-age=0,no-cache,no-store,must-revalidate");
	Response.AddHeader("Pragma", "no-cache");
	Response.AddHeader("Expires", "Thu, 01 Jan 1970 00:00:00 GMT");
}

Again as a page specific approach it requires a change to be made on each page. It results in the below headers being produced:


Cache-Control:max-age=0,no-cache,no-store,must-revalidate
Pragma:no-cache
Expires:Thu, 01 Jan 1970 00:00:00 GMT


Explicitly Define HTTP Headers in Global ASAX File

To avoid having to set the header explicitly on each page the above code can be inserted into the Application_BeginRequest event within the application’s Global ASAX file:

void Application_BeginRequest(object sender, EventArgs e)
{
	Response.AddHeader("Cache-Control", "max-age=0,no-cache,no-store,must-revalidate");
	Response.AddHeader("Pragma", "no-cache");
	Response.AddHeader("Expires", "Thu, 01 Jan 1970 00:00:00 GMT");
}

Again this results in the below headers being produced:


Cache-Control:max-age=0,no-cache,no-store,must-revalidate
Pragma:no-cache
Expires:Thu, 01 Jan 1970 00:00:00 GMT


Environment Specific Settings

It’s useful to be able to set the header values via configuration settings, not least to be able to test this change in a performance test environment via before/after tests.

All of the above changes should be made configurable and be able to be triggered/tweaked via the web.config file (and therefore can be modified via deployment settings).
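As a rough illustration of the idea, the sketch below switches the headers on or off from a single setting. It is written in Python rather than against web.config, and both the function name cache_headers_for and the DISABLE_BROWSER_CACHE setting are hypothetical names chosen for this example. In an ASP.net application the equivalent flag would be read from configuration instead.

```python
import os

# The explicit header set used in the examples above
NO_CACHE_HEADERS = {
    "Cache-Control": "max-age=0,no-cache,no-store,must-revalidate",
    "Pragma": "no-cache",
    "Expires": "Thu, 01 Jan 1970 00:00:00 GMT",
}


def cache_headers_for(settings=None):
    """Return the headers to add to a response for the given settings mapping.

    DISABLE_BROWSER_CACHE is a hypothetical setting name; it defaults to
    "true" so that a missing setting fails safe (caching stays disabled).
    """
    if settings is None:
        settings = os.environ
    flag = settings.get("DISABLE_BROWSER_CACHE", "true").strip().lower()
    if flag in ("1", "true", "yes"):
        return dict(NO_CACHE_HEADERS)
    return {}


if __name__ == "__main__":
    # e.g. a performance test environment could set the flag to "false"
    print(cache_headers_for({"DISABLE_BROWSER_CACHE": "false"}))
    print(cache_headers_for({}))
```

Defaulting to the secure behaviour means a forgotten setting in a new environment leaves the headers in place rather than silently re-enabling caching.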

Upgrading MVC 3 to MVC 4 via NuGet

I had to upgrade an old ASP.NET MVC 3 project to MVC 4 yesterday and whilst searching for the official instructions I found that there is a NuGet package that does all the hard work for you.

The official instructions for upgrading are in the MVC 4 release notes here: http://www.asp.net/whitepapers/mvc4-release-notes#_Toc303253806

But Nandip Makwana has created a NuGet package that automates this process. Check it out here: https://www.nuget.org/packages/UpgradeMvc3ToMvc4

It worked great for me.

Host Static HTML or WebForms Page within MVC Site

If you need to host a static HTML page within an ASP.net MVC website or you need to mix ASP.net WebForms with an MVC website then you need to configure your routing configuration in MVC to ignore requests for those pages.

Recently I wanted to host a static HTML welcome page (e.g. hello.htm) on an MVC website. I added the HTML page to my MVC solution (setting it as the Visual Studio project’s start page) and configured my web site’s default page to be the HTML page (hello.htm). It tested ok at first but then I realised that it was only displaying the hello page first on debug because I’d set the page to be the Visual Studio project’s start-up page and I hadn’t actually configured the MVC routes correctly so it wouldn’t work once deployed.

For this to work you need to tell MVC to ignore the route if it’s for the HTML page (or ASPX page in the case of mixing WebForms and MVC). Find your routing configuration section (for MVC 4 it’s in RouteConfig.cs under the App_Start folder, for MVC 1, 2 and 3 it’s in Global.asax). Once found, use the IgnoreRoute() method to tell Routing to ignore the specific paths. I used this:

routes.IgnoreRoute("hello.htm"); //ignore the specific HTML start page
routes.IgnoreRoute(""); //to ignore any default root requests

Now MVC ignores a request to load the hello HTML page and leaves IIS to handle returning the resource and hence the page displays correctly.

Setting a Custom Domain Name on an Azure Web Site

I recently decided to add a custom domain name to a free Azure website that I use for development purposes. As the FREE Azure web site model doesn’t support custom domains (a shame but hard to complain as it’s FREE) I needed to upgrade the site to the ‘Shared’ mode. This is easily done via the Scale option in the Azure portal.

Firstly, however, I needed to move my current Azure web site to a different subscription from the one I used to set it up. The problem is that you cannot move sites between subscriptions yet (please fix this Microsoft). To get around this I needed to create a new website under the correct subscription and then publish my web site code to it. Luckily this is easy to do as it’s just a basic website, but I can imagine that it could be painful if you have a bunch of storage accounts or a database to re-create.

Using the Azure Portal, creating a new site is a simple process: click +NEW at the bottom of the portal to bring up the creation menu.


Once created, all I needed to do was download a Publish profile for the new site for Visual Studio to use (see this tutorial link for how to publish to Azure). Once downloaded I opened my VS2012 solution and brought up the Publish dialog. I pointed it to the new Publish profile file and clicked Publish. In just a few seconds I had a new Azure web site up and running with my existing MVC web application. This was very smooth, with no change to config or code required. The sheer simplicity of this impressed me as I was short on time.

Next I needed to allocate my custom domain, which as previously mentioned is not available for FREE websites, so I needed to upgrade to SHARED mode. From the Azure portal > web site configuration > scale > click SHARED (remember this model incurs a cost).


Once upgraded I could immediately select DOMAINS and set up my CNAME and A record references. For more information see this useful link (configuring a custom domain name for a Windows Azure web site). It’s worth reading the comments on the post too, as they cover issues with registering the domain without the WWW subdomain.

Once the DNS entries had propagated I had my existing site up and running under a custom domain running within a shared Azure instance, all with very little effort.

The HTML Agility Pack

For a current project I needed to perform a simple screen scrape action. The resulting solution was functional but a bit rough and ready. Luckily I stumbled upon this open-source HTML library project: The HTML Agility Pack, hosted on CodePlex at http://htmlagilitypack.codeplex.com.

It is an excellent little library that makes dealing with HTML a breeze, whether you are screen scraping or just manipulating HTML documents locally. It is very forgiving of malformed HTML documents and supports loading pages directly from the web. You can just parse the HTML or modify it, and it even supports LINQ. A key benefit of this library is that it doesn’t force you to learn a new object model but instead mirrors the System.Xml object model, which is a huge help for getting up and running quickly and makes coding against it feel natural.

Download HTML directly via a URL:

HtmlDocument htmlDoc = new HtmlDocument();
HtmlWeb webGet = new HtmlWeb();
htmlDoc = webGet.Load(url);

Or parse an HTML string:

HtmlDocument htmlDoc = new HtmlDocument();
htmlDoc.LoadHtml(htmlString); // htmlString holds the HTML markup to parse

Then you can use XPath to query the HTML document as you would an XML document:

// select a <li> where it has an element of <b> with a value of "Name:"
var nameItem = htmlDoc.DocumentNode.SelectSingleNode("//li[b='Name:']");
if (nameItem != null && nameItem.ChildNodes.Count > 1)
    name = nameItem.ChildNodes[1].InnerText;

You can download it via NuGet here: http://nuget.org/packages/HtmlAgilityPack.

For more examples of its use check out these posts: Parsing HTML Documents with the Html Agility Pack and Crawling a web sites with HtmlAgilityPack.