New Web Citizens – HTML 5, CSS 3 & IE 9

I recently attended a seminar hosted by Microsoft in Pune, India, unveiling the potential of new standards (HTML 5, CSS 3) and innovations (IE 9) that are going to change the entire web experience (for the better). The discussion, no doubt, offered a good amount of food for thought. What I intend to discuss here is not the comprehensive feature sets of the trio but the challenges we are likely to face, and a few questions raised in the seminar that should have been answered but weren’t.


Video: Introduction to Http

I have already devoted several posts to the HTTP protocol and its significance in web application development. Here is a short video talk on HTTP.


Asp.Net Http Handlers (Part 1)

In the last instalment we discussed the need for and motivation behind HTTP handlers. We also discussed the creation and deployment of HTTP handlers using the ISAPI specification. In this section we will discuss how we can implement an HTTP handler using the .NET Framework.

Before we talk about ‘how’, let’s first talk about ‘what’ and ‘why’. Although we have already discussed HTTP handlers, let us redefine them from a different perspective.

HTTP handlers are programs that are invoked by a web server to get a particular job done. Period.

Now this job can be as big as implementing a whole new programming infrastructure or as small as printing a hello world message.

Let us now also discuss what an ASP.NET HTTP handler might be. By extending the above definition we can easily deduce:

.NET HTTP handlers are .NET programs that are invoked by the web server to get some job done.

But then, we already have a facility for writing a .NET application (a .aspx page) to get a job done. So why do we need a new kind of application? Why not just use a .aspx page? Let us try to understand this with a sample application.

 

Why .net HTTP Handlers?

Let us try to answer this question by discussing a scenario:

Job Description: We need to create an image generator that can render a given text as an image on the fly. Such images can be used for generating CAPTCHAs, for drawing graphs from data in a database, or for presenting your email address as an image. The possibilities are endless.

Now that we know our job, let us look at the alternatives:

 

Create a .aspx page

 

Let us say we create an imagegen.aspx to handle the situation. A user can generate the image by invoking it with the right set of parameters, for example:

http://dev.vnc.in/imagegen.aspx?type=text2img&text=dev@vnc.in&font-size=12&color=blue

This approach will work. However, there are several problems:

  • The page needs to be incorporated into every website. A .aspx page is typically not considered a component module. While the same .aspx page can be included in several different websites, it is not a very elegant pattern.
  • Although not explicitly stated, a .aspx page is typically used for generating HTML output. This is evident from the fact that a .aspx page is designed as a web page; the code-behind is an optional entity. While it can generate an image, it is certainly not a very clean solution.
  • Implementing a complicated set of steps in a single .aspx page goes against best coding practices. Over time such pages become unmanageable.
  • Because a .aspx page can’t serve every purpose, .NET came up with other extensions such as .asmx.

While a .aspx page can be used for writing any kind of logic, using it for roles other than generating web output is neither its design goal nor good practice. This is clear from the fact that Microsoft itself came up with other extensions, such as .asmx for web services.

It appears we have two options – wait for Microsoft to come up with a solution for our problem, or come up with our own customized solution.

 

I am sure you are not going to wait for Microsoft to come up with some solution. Are you?

 

If you are thinking in terms of our own customized solution, you are thinking about an HTTP handler.

OK, so we need an HTTP handler. But then we already have an infrastructure to create and deploy an HTTP handler – the ISAPI HTTP handler. We can use it, can’t we?

 

Why not an ISAPI Http Handler?

 

Sure. ISAPI seems to be an appropriate solution for creating an HTTP handler. After all, it exists for the sole purpose of creating HTTP handlers. But then it is not really a .NET-based solution, is it?

An ISAPI handler is a Win32 DLL, typically written in C/C++ or another native programming language. ISAPI doesn’t give access to the extensive .NET API, and a .NET developer needs to master a new language to write a handler.

Of the two choices – ISAPI vs .aspx – developers often choose the easier but not so good approach: writing a .aspx page.

 

Having discarded the two alternatives, we are left with the only obvious choice – .NET-based HTTP handlers. Let us try to understand why .NET HTTP handlers make sense:

 

ASP.NET HTTP handlers are HTTP handlers written in .NET. They utilize the full capacity of the .NET API and are an easier solution for a .NET developer. They offer the best of both worlds – the power of a handler coupled with the convenience of .NET.

 

Implementing HTTP Handler in .Net

 

Having convinced ourselves about the benefit of a .Net Http Handler, let us understand the steps needed for implementing a .Net based Http Handler.

 

 

Step 1 – Creating The Handler

Creating the handler is simplicity itself. All we need is a class that implements the System.Web.IHttpHandler interface. The interface has just two members: the IsReusable property and the ProcessRequest method. The following class diagram represents our image handler and its relation to IHttpHandler.

[Figure: class diagram of the IHttpHandler implementation]

 

 

Typically an HTTP handler is created as a class library. We also need to add a reference to the System.Web assembly.

 

Let us create a class library and name it vnc.web.ImageGenerator.

[Screenshot: creating the ImageGenerator class library project]

Now add a reference to the System.Web assembly.

[Screenshot: adding the System.Web reference]

Change the project properties so that the default namespace is vnc.web instead of vnc.web.ImageGenerator.

[Screenshot: changing the default namespace in the project properties]

Now delete the default Class1.cs and add a new class, ImageGenerator.cs. Next, make the class implement the System.Web.IHttpHandler interface.

 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Web;

namespace vnc.web
{
    public class ImageGenerator: IHttpHandler
    {
        #region IHttpHandler Members

        public bool IsReusable
        {
            get { throw new NotImplementedException(); }
        }

        public void ProcessRequest(HttpContext context)
        {
            throw new NotImplementedException();
        }

        #endregion
    }
}

 

The handler is meant to serve URLs of the following form:

http://dev.vnc.in/app-name/red/black/georgia/12/vivek.jpg.image
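As a rough illustration of how a URL of this shape might be taken apart, here is a minimal sketch. The ImageRequest class and the segment order it assumes (text colour, background colour, font, size, then text plus image format) are assumptions read off the sample URL; this is not the code from the downloadable project.

```csharp
using System;
using System.Globalization;

namespace vnc.web
{
    // Hypothetical helper: the meaning and order of the path segments are
    // assumptions based on the sample URL, not the project's actual logic.
    public sealed class ImageRequest
    {
        public string TextColor { get; private set; }
        public string BackColor { get; private set; }
        public string FontName { get; private set; }
        public int FontSize { get; private set; }
        public string Text { get; private set; }
        public string Format { get; private set; }

        public static ImageRequest Parse(string path)
        {
            if (path == null) throw new ArgumentNullException("path");

            string[] parts = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length < 5)
                throw new FormatException("Expected .../color/background/font/size/text.ext.image");

            // The last segment looks like "vivek.jpg.image"; decode %20 and friends first.
            string file = Uri.UnescapeDataString(parts[parts.Length - 1]);
            const string suffix = ".image";
            if (!file.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
                throw new FormatException("The path must end in .image");
            file = file.Substring(0, file.Length - suffix.Length);

            // Split "vivek.jpg" into the text to draw and the image format.
            int dot = file.LastIndexOf('.');
            if (dot <= 0 || dot == file.Length - 1)
                throw new FormatException("Expected text.ext before .image");

            int n = parts.Length;
            return new ImageRequest
            {
                TextColor = parts[n - 5],
                BackColor = parts[n - 4],
                FontName = parts[n - 3],
                FontSize = int.Parse(parts[n - 2], CultureInfo.InvariantCulture),
                Text = file.Substring(0, dot),
                Format = file.Substring(dot + 1).ToLowerInvariant()
            };
        }
    }
}
```

Only the last five segments are read, so the same parser works whether or not an application name precedes them in the path.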

Now let us go ahead and write our business logic. The business logic for the current ImageGenerator is part of the downloadable project, so I am not reproducing it here.

Compile the project to generate the assembly. The creation of the HTTP handler is complete.
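For readers who want a feel for what such business logic could look like, here is a minimal sketch of a handler that draws a text onto a bitmap and streams it back as a JPEG. It is not the project's ImageGenerator: the class name TextImageHandler, the fixed colours, font and canvas size, and the use of a text query-string parameter are all assumptions made to keep the example short. It needs references to System.Web and System.Drawing.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Web;

namespace vnc.web
{
    // Hypothetical stand-in for the real ImageGenerator; colours, font and
    // canvas size are hard-coded here purely for illustration.
    public class TextImageHandler : IHttpHandler
    {
        // The handler keeps no per-request state, so one instance can be reused.
        public bool IsReusable
        {
            get { return true; }
        }

        public void ProcessRequest(HttpContext context)
        {
            // Assumption: the text arrives as ?text=...; fall back to a default.
            string text = context.Request.QueryString["text"] ?? "hello world";
            byte[] jpeg = Render(text);

            // The content type tells the client to treat the bytes as an image.
            context.Response.ContentType = "image/jpeg";
            context.Response.OutputStream.Write(jpeg, 0, jpeg.Length);
        }

        // Draws the text in red on a black canvas and returns the JPEG bytes.
        public static byte[] Render(string text)
        {
            using (var bitmap = new Bitmap(400, 60))
            using (var graphics = Graphics.FromImage(bitmap))
            using (var font = new Font("Georgia", 12f))
            using (var buffer = new MemoryStream())
            {
                graphics.Clear(Color.Black);
                graphics.DrawString(text, font, Brushes.Red, 10f, 20f);
                bitmap.Save(buffer, ImageFormat.Jpeg);
                return buffer.ToArray();
            }
        }
    }
}
```

Keeping the drawing in a static Render method separates it from the HTTP plumbing, so it can be exercised without a running web server.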

 

Deploying the HTTP Handler

 

Now that the HTTP Handler is ready, let us focus on the deployment steps. It includes two steps:

 

Configuring IIS Handler Mapping

 

[Screenshot: IIS handler mapping configuration]

Since our HTTP handler is not an ISAPI handler, it is not supposed to be registered directly in the IIS handler mappings the way we discussed in the previous episode, Http Handlers (although the latest IIS does have an option for this). What we need to do is map the new URL pattern to the ASP.NET engine, so that IIS passes requests for the new resource type to aspnet_isapi.dll. (Of course, the ASP.NET engine doesn’t yet know how to process such a request; we will handle that in a separate step.)

 

 

 

 

Now all requests for the .image extension will be diverted to aspnet_isapi.dll, the ASP.NET engine. However, predictably, ASP.NET has no idea how to handle such a request or to whom it should be passed. That configuration is done in web.config or machine.config, depending on the requirement.

 

Registering Our Handler with asp.net

 

To configure the ASP.NET engine to pass requests for .image resources to our handler, we need to register the handler in web.config or machine.config. The steps are:

  1. Add a test website to our solution.
  2. Add a reference from the website project to the handler. Select the vnc.web.ImageGenerator.dll assembly from the list.
  3. Add the handler to web.config. Locate the httpHandlers section and add the entry shown below. The listing has been abridged for clarity.

<httpHandlers>
    <remove verb="*" path="*.asmx"/>
    <add verb="*" path="*.asmx" validate="false" … />
    <add verb="*" path="*_AppService.axd" validate="false" type="…" />
    <add verb="GET,HEAD" path="ScriptResource.axd" type="…" />
    <add verb="*" path="*.image" type="vnc.web.ImageGenerator" />
</httpHandlers>

 

The newly added entry is the last one, for the *.image path. Now we are done and can test our application.

Try the following URL:

http://localhost:47877/HandlerDemoSite/red/black/georgia/22/Vivek%20Dutta%20Mishra.jpeg.image

You should get a JPEG image generated with the options of your choice. The given path doesn’t exist physically, and yet it returns a result.

In our next instalment, we will discuss some other advanced features of asp.net http handler.

 

Download File – ImageGeneratorSolution

Http Handlers

This is a continuation of our series on HTTP. This article discusses the motivations and design concerns behind the creation of handlers. It covers the technical details of what we discussed in the preceding article, Http Hacked.

Before we attempt to understand the concept of handlers, we need to understand how HTTP works. I would recommend going through the previous articles for this purpose:

  1. story of www – between http and html – This is the main article; it discusses the evolution of the WWW and the HTTP protocol, its motivation and the road map.
  2. Http Hacked – The most important article for understanding the current discussion. This is the second instalment of the running story on HTTP. The episode mainly discusses how HTTP was hacked for web programming.
  3. Http Through Ages – This article is a technical overview of the changes between various versions of HTTP; you may like to go through it for completeness.

Now let’s quickly recap how an HTTP server works.

 

Scenario 1. Http Server is requested to serve a static resource

By static resource we mean an HTML document, an image or downloadable content that always exists (or can exist) on the server; all the server does is pick it up and pass it to the client. The following diagram represents the scenario:

[Figure: HTTP request/response model]

Any and every Http Server is designed to serve this kind of request. As such the request is handled in strict accordance with http protocol.

 

Scenario 2. Http Server is requested to serve a dynamic resource

The situation gets trickier when you need to serve a resource that doesn’t exist but is created on demand or, more accurately, computed. An example could be a request for a list of customers. The information exists in the database. It needs to be extracted and presented in a tabulated format. So we need a program that –

  • Accepts the user’s request and relevant parameters
    • Parameter may tell to get a list of priority customer
    • Filter customers based on geographical location and so on
  • Executes relevant queries on to the database and extracts the necessary records
  • Creates an HTML page with the extracted records in a presentable format.
  • Sends the newly created html page to the user.
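The steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the Customer and CustomerPage types are hypothetical, and an in-memory list stands in for the database query.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text;

// Hypothetical record type standing in for a database row.
public sealed class Customer
{
    public string Name { get; set; }
    public string City { get; set; }
    public bool IsPriority { get; set; }
}

public static class CustomerPage
{
    // Stand-in for the database query: applies the request parameters
    // (priority flag, optional city) to an in-memory list.
    public static List<Customer> Query(IEnumerable<Customer> all, bool priorityOnly, string city)
    {
        var result = new List<Customer>();
        foreach (Customer c in all)
        {
            if (priorityOnly && !c.IsPriority) continue;
            if (city != null && !string.Equals(c.City, city, StringComparison.OrdinalIgnoreCase)) continue;
            result.Add(c);
        }
        return result;
    }

    // Builds the HTML page that is sent back to the user.
    public static string Render(IEnumerable<Customer> customers)
    {
        var html = new StringBuilder();
        html.Append("<html><body><table>");
        html.Append("<tr><th>Name</th><th>City</th></tr>");
        foreach (Customer c in customers)
        {
            // Encode the values so that data cannot break the markup.
            html.Append("<tr><td>").Append(WebUtility.HtmlEncode(c.Name))
                .Append("</td><td>").Append(WebUtility.HtmlEncode(c.City))
                .Append("</td></tr>");
        }
        html.Append("</table></body></html>");
        return html.ToString();
    }
}
```

The page never exists as a file on the server; it is computed afresh for every request from whatever the query returns.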

So we need a program. But how does this program relate to the web server? The web server needs to delegate its work to this special program, which is designed to handle the request. The overall interaction is represented in the following diagram:

 

[Figure: HTTP request/response with a handler]

 

The diagram brings out the following points clearly:

  1. The web server doesn’t really know how to use the dynamic resource (in our case customer.aspx).
  2. It relies on an external application or extension to get the computation done. These extensions are typically termed HTTP handlers.
  3. A web server can typically delegate the computation to one of many handlers.
  4. Each handler is supposed to process a particular kind of request, which can be identified by its URL format.
  5. A URL format may mean
    1. In most cases a particular extension – *.asp, *.aspx, *.php, *.jsp and so on
    2. Requests associated with a particular folder, e.g. /cgi-bin/*
    3. A particular absolute path, e.g. /log/log.app
  6. The web server checks the URL and then decides to whom it should delegate the request.
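The matching described in points 4 to 6 can be sketched as a small lookup table. This is a simplified illustration, not how IIS or any particular server implements its mappings; the HandlerMap class and the handler names passed to it are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mapping table: URL pattern -> name of the handler to delegate to.
public sealed class HandlerMap
{
    private readonly List<KeyValuePair<string, string>> _mappings =
        new List<KeyValuePair<string, string>>();

    public void Add(string pattern, string handler)
    {
        _mappings.Add(new KeyValuePair<string, string>(pattern, handler));
    }

    // Returns the first matching handler, or null when no handler claims
    // the path and the server should treat it as a static resource.
    public string Resolve(string path)
    {
        foreach (KeyValuePair<string, string> mapping in _mappings)
        {
            if (Matches(mapping.Key, path)) return mapping.Value;
        }
        return null;
    }

    private static bool Matches(string pattern, string path)
    {
        // "*.aspx" matches by extension.
        if (pattern.StartsWith("*.", StringComparison.Ordinal))
            return path.EndsWith(pattern.Substring(1), StringComparison.OrdinalIgnoreCase);

        // "/cgi-bin/*" matches everything under a folder.
        if (pattern.EndsWith("/*", StringComparison.Ordinal))
            return path.StartsWith(pattern.Substring(0, pattern.Length - 1), StringComparison.OrdinalIgnoreCase);

        // Anything else is an absolute path such as "/log/log.app".
        return string.Equals(pattern, path, StringComparison.OrdinalIgnoreCase);
    }
}
```

With mappings for "*.aspx", "/cgi-bin/*" and "/log/log.app" registered, a request for /customer.aspx resolves to the first handler, while /index.html resolves to nothing and is served as a plain file.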

 

Developing The Handlers

 

As we discussed in our previous article, Http Hacked, the need for such an extension mechanism was clearly felt. We needed a way to extend the server in different and unpredictable ways. Different vendors therefore proposed their own solutions. The solutions differed in implementation, but philosophically they shared the same working principle:

Define a set of guidelines and  API that will be used for creating a handler. The Handler will expose a certain well defined set of functionality that will be called by the web server to get the request processed.

Microsoft proposed an API layer and termed it as ISAPI – Internet Server API

Netscape, for example, proposed NSAPI – Netscape Server API.

 

Servers and handlers are supposed to be compliant with one of the available standards. For example, IIS is an ISAPI-compliant server and can work with any ISAPI-compliant handler. All ISAPI-compliant handlers are supposed to work with every ISAPI-compliant server.

 

An ISAPI handler is typically built as a Win32 DLL and written in C/C++. However, other language choices are also available.

 

Deploying the Handler

Once an HTTP handler is created, it needs to be registered with the web server and mapped to a URL pattern. The following screenshots show the registration of different ISAPI handlers with IIS 7.0.

Step 1: Start Internet Information Services (IIS) Manager from Control Panel –> Administrative Tools

 

Step 2: Navigate to the Handler Mappings section

[Screenshot: ISAPI handler configuration 1]

 

Step 3: We will see a list of existing mappings. We can modify them or add new entries

[Screenshot: ISAPI handler configuration 2]

 

Step 4: This is a sample mapping of .aspx files to aspnet_isapi.dll

 

[Screenshot: ISAPI handler configuration 3]

 

 

Summary

 

In this section we looked at the role of HTTP handlers and their deployment. In the next episode we will look at creating and deploying an HTTP handler using ASP.NET.

Http Hacked

www was born to be just another route to internet; fate made it the internet.

At its inception it was meant for publishing electronic documents, and the model was quite similar to the print industry. The concentration was on formatting the documents, and documents changed infrequently. Access might require authentication, but authentication was simple and the need for real-time data was non-existent.

As dependence on the internet grew and requirements became more and more complicated, the need for smarter communication was clear. There were seemingly two options:

  • Create a new protocol.
  • Hack an existing protocol.

And if your choice was second, your choice had to be HTTP.

 

Why hack HTTP and not just create a new protocol?

 

Here we get into guesswork, but the guesses are likely to be very close to the facts. Why guesswork? Because hacks are rarely documented and never systematic.

The first reason, which favours the use of any existing protocol and not just HTTP, is the availability of a large infrastructure that would otherwise have to be created across the globe.

The HTTP protocol emphasised presenting formatted information, something that was the common denominator of all the programming requirements. Using HTTP as the starting point made sense. The other important reasons that favoured HTTP are:

  • Presence of customizable request and response headers.
  • Output in presentable format that can support layouts, links, tables etc.
  • Availability of forms to get user input.
  • HTTP anyway needed a solution for such a dynamic requirement.

As we have already discussed, HTTP soon felt the need for a search engine, and the first proposed solution was to modify the HTTP server itself. While the solution worked, it was clearly a less than desirable design and very far from ideal. Not only did it require modification of the server, it was apparent that this kind of solution would necessitate many changes to the server for many different reasons.

What we really needed was what we now call the open-closed principle: a design that allows other applications to run alongside and support the server, thus extending its functionality without modifying the server for every business requirement.

 

Roadmap to http hack

 

Before we understand the solution, let us quickly picture how a simple HTTP communication works. We will look at four different scenarios of HTTP communication.

Scenario 1: Requesting an html page

The client sends the following request to the server.

GET /profile/vivek.html HTTP/1.1[cr][lf]
ACCEPT: */*[cr][lf]
[… more headers …]
[cr][lf]  

 

The server responds to client with a response header and the content.

HTTP/1.1 200 OK [cr][lf]
CONTENT-TYPE: text/html[cr][lf]
CONTENT-LENGTH: 2012[cr][lf]
[… more headers …]
[cr][lf]
[… actual content follows … ]

[content ends; connection closes]

 

Notice, the request was for an html page and the server responds with status code of 200, indicating that request was successful. Next it sends CONTENT-TYPE as text/html. It indicates that the generic type of output is text and the format of document is html. The header gives sufficient information so that client can save the output as a .html file and render it as an html document.
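To make the structure of the response concrete, here is a minimal sketch that splits the head of a response into its status code and headers. The ResponseHead class is a hypothetical helper for illustration only; a real client handles many more cases (repeated headers, folded lines and so on).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper that parses the status line and headers of a response.
public sealed class ResponseHead
{
    public int StatusCode { get; private set; }
    public string Reason { get; private set; }
    public Dictionary<string, string> Headers { get; private set; }

    public static ResponseHead Parse(string raw)
    {
        if (raw == null) throw new ArgumentNullException("raw");

        // The head ends at the first empty line, i.e. [cr][lf][cr][lf].
        int end = raw.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        string head = end >= 0 ? raw.Substring(0, end) : raw;
        string[] lines = head.Split(new[] { "\r\n" }, StringSplitOptions.None);

        // Status line: version, status code, reason phrase.
        string[] status = lines[0].Split(new[] { ' ' }, 3);
        if (status.Length < 2) throw new FormatException("Malformed status line");

        var result = new ResponseHead
        {
            StatusCode = int.Parse(status[1], CultureInfo.InvariantCulture),
            Reason = status.Length > 2 ? status[2] : "",
            // Header names are case-insensitive.
            Headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        };

        for (int i = 1; i < lines.Length; i++)
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0) continue; // skip anything that is not "name: value"
            string name = lines[i].Substring(0, colon).Trim();
            result.Headers[name] = lines[i].Substring(colon + 1).Trim();
        }
        return result;
    }
}
```

Because header names are compared case-insensitively, CONTENT-TYPE and Content-Type resolve to the same entry.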

Scenario 2: Requesting an image

A typical request for an image is no different from the previous one.

GET /profile/vivek.jpg HTTP/1.1[cr][lf]
ACCEPT: */*[cr][lf]
[… more headers …]
[cr][lf]

The server once again checks for the document and sends a response:

HTTP/1.1 200 Ok [cr][lf]
CONTENT-TYPE: image/jpeg [cr][lf]
CONTENT-LENGTH: 42085 [cr] [lf]
[… more headers …]
[cr][lf]
[… byte by byte content of image … ]

Notice that this time the server reports the generic content type as image. The subtype further reveals that the image is a JPEG. Thus the client can save the image in a file with the extension .jpg. The content is rendered as an image and an appropriate viewer is used.

 

Scenario 3: Requesting an executable

Once again the request takes the same form.

GET /downloads/time-piece.exe HTTP/1.1[cr][lf]
ACCEPT: */*[cr][lf]
[… more headers …]
[cr][lf]

Here we have placed a request for an executable called time-piece.exe. Server checks for the same and responds:

HTTP/1.1 200 Ok [cr][lf]
CONTENT-TYPE: application/octet-stream [cr][lf]
CONTENT-LENGTH: 47200 [cr][lf]
[… more headers …]
[cr][lf]

[… byte by byte content of the executable …]

 

As we can see, this time the MIME type reported is application/octet-stream. This is an indication to the client that the response is arbitrary binary data that is not supposed to be displayed in the browser; it is supposed to be downloaded. Browsers typically present a save option.

Let us look at a fourth scenario. Here the request is made for an image. However, the server sends headers such that the image is downloaded rather than displayed in the browser.

 

Scenario 4:  Request that allows you to download the image

GET /misc/india-map.jpg HTTP/1.1[cr][lf]
ACCEPT: */*[cr][lf]
[… more headers …]
[cr][lf]

 

This time server sends the response in the following manner.

HTTP/1.1 200 Ok [cr][lf]
CONTENT-TYPE: application/octet-stream [cr][lf]
CONTENT-LENGTH: 48029 [cr][lf]
[… more headers …]
[cr][lf]
[… actual image content …]

 

Notice that the request and response proceed in the same way as in scenario 2, except for one change. This time the server describes the content type as application/octet-stream. This is a signal to the client to save the output rather than display it in the browser.

 

So what is the bottom line?

 

Clients place requests for resources, and all requests are placed in the same way.
The server’s response typically includes a CONTENT-TYPE. The client’s reaction to a response is typically governed by the CONTENT-TYPE specified in the response header and NOT by the request headers. That also implies that things can change between the request and the response.
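The four scenarios can be condensed into a tiny decision function. This is a deliberately simplified sketch of the idea, with a hypothetical ClientReaction class; real browsers apply many more rules than the content type alone.

```csharp
using System;

// Hypothetical helper: decides what a client does with a response,
// looking only at the Content-Type header as in the four scenarios.
public static class ClientReaction
{
    public static string Describe(string contentType)
    {
        // Drop parameters such as "; charset=utf-8" and normalise the case.
        string mediaType = (contentType ?? "").Split(';')[0].Trim().ToLowerInvariant();

        if (mediaType == "text/html") return "render as a web page";
        if (mediaType.StartsWith("image/", StringComparison.Ordinal)) return "display as an image";
        if (mediaType == "application/octet-stream") return "offer to save the file";
        return "let the user decide";
    }
}
```

Note that the requested URL never enters the decision: the same /misc/india-map.jpg is displayed or saved depending solely on what the server declares.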

 

CGI – The http Hack

 

Since modifying the server was neither desirable nor practical, NCSA developed the Common Gateway Interface (CGI) specification. CGI soon became a standard way for applications to interact with the server and generate dynamic content.

CGI is a specification and not an application or a particular programming language. The specification is designed around an application that receives the request details through environment variables, reads any request body from STDIN and writes its output to STDOUT. As long as this requirement is met, any application, whether compiled, interpreted or a script, can act as a CGI program.

So how does it typically work?

  1. The client will request for the CGI application. Let us say a binary executable file. 
  2. This time server instead of providing the binary executable for download will actually execute the application on server.
  3. The CGI application will generate the output to STDOUT
  4. The output from the CGI application will be piped to HTTP response stream.
  5. Thus the client will get the output of executable rather than executable itself.

Let us understand the whole scenario in http request response format:

 

GET /whats-the-time.exe HTTP/1.1[cr][lf]
ACCEPT: */*[cr][lf]
[…other headers…]
[cr][lf]

You will notice that the request header remains unchanged. However this will work differently this time. Server will actually execute the application whats-the-time.exe on the server. This application will check the current time on the server and print it in html format on the STDOUT. Server will respond to client as:

HTTP/1.1 200 ok [cr][lf]
CONTENT-TYPE:text/html [cr][lf]
CONTENT-LENGTH:1280[cr][lf]
[… other headers …]
[cr][lf]
[… the STDOUT output of the application …]

Notice that this time we requested a .exe. However, the content type in the response indicates text/html output. The application executes on the server and generates dynamic information, which is sent to the client.
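A program like whats-the-time.exe could look roughly like the sketch below. It assumes the usual CGI convention that the program writes a Content-Type header, an empty line and then the body to STDOUT, while the server supplies the status line. The class and method names are hypothetical, and a real program would call Run from its Main method.

```csharp
using System;
using System.Globalization;

// Hypothetical CGI-style program: everything it prints to STDOUT
// is piped by the server into the HTTP response.
public static class WhatsTheTime
{
    // Builds the CGI output: header line, empty line, then the HTML body.
    public static string BuildOutput(DateTime now)
    {
        string body = "<html><body><p>Server time: "
            + now.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture)
            + "</p></body></html>";
        return "Content-Type: text/html\r\n\r\n" + body;
    }

    // Entry logic of the program; call this from Main.
    public static void Run()
    {
        // CGI passes request details in environment variables such as REQUEST_METHOD.
        string method = Environment.GetEnvironmentVariable("REQUEST_METHOD") ?? "GET";
        if (method != "GET")
        {
            Console.Out.Write("Status: 405 Method Not Allowed\r\nContent-Type: text/plain\r\n\r\nGET only");
            return;
        }
        Console.Out.Write(BuildOutput(DateTime.Now));
    }
}
```

The program knows nothing about sockets or HTTP connections; reading its environment and printing text is all it takes to produce a dynamic page.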

But how will the server know whether to execute the executable as a CGI application or to offer it as a download? To simplify the issue, a directory was designated to hold all the CGI applications. A request for a resource in the designated directory is treated as CGI; any other request is treated as before.

The first CGI application was written in the C language. The CGI directory was accordingly named cgi-bin, as the CGI application was a binary executable. Ever since, it has been a convention to name the CGI directory cgi-bin.

This kind of application opened a Pandora’s box of new possibilities, including:

  • User registration and authentication.
  • Online data manipulation, query and reporting. This is the most important aspect and is the backbone of all e-commerce, B2B and B2C applications.
  • Accessing the network resources using dynamic interface

 

Beyond CGI

 

A host of technologies evolved over time, many claiming to be superior to and more efficient than the original CGI. However, in philosophy they remain very similar to the original idea of CGI:

  1. All these technologies essentially map a request to some external application.
  2. The application processes the request, interacts with a database or other server resources and produces the output.
  3. The output is then sent to the client.

[Figure: HTTP request/response with a handler]

They do, however, differ from CGI in some subtle aspects:

  1. Most of the newer technologies use scripts rather than an executable. For efficiency, these scripts can be compiled to an intermediate code.
  2. They use multiple threads rather than multiple processes.
  3. The respective handlers are typically mapped to a particular extension rather than a particular folder.

The alternative technologies that work on these principles include ASP and ASP.NET from Microsoft, JSP and servlets from the Java world, PHP and a host of other open source technologies. CGI scripts are still written mostly in languages like Perl.

 

Client side hacks

Although the server-side hacks initially served their purpose well, the problems with this approach became more than apparent as the load on servers began to increase.

  • Every decision needs to be taken on the server (HTML doesn’t have decision-making capabilities)
    • This puts undue load on the server.
    • Round-trip communication between client and server is costly in terms of both time and bandwidth.
  • If the technology could support it, many decisions could actually be taken on the client side, which often makes more sense.
  • A client-side technology, however, can never replace server-side technology, as much of the information is available only at the server end.

The first client-side technology to revolutionize the scenario was the Java applet. A Java applet is essentially an application written in Java that is embedded in a web page in a manner quite similar to a picture. Unlike a picture, however, it is an intelligent application. The browser also needs special plug-in support to understand and execute an applet in addition to HTML.

Soon a number of new technologies offered a similar approach.

Client-side technologies typically require special browser add-ons. The intelligent objects are downloaded by the browser using standard HTTP requests and are stored on the local machine in a temporary cache. Once loaded, these applications are executed with the help of their respective add-ons. Support for these technologies typically depends on the browser and its capacity to be extended.

Popular client-side technologies include:

  • Java applets – the first of their kind
  • Scripts – JavaScript being the most popular; other scripting languages include VBScript, JScript and so on
  • Adobe Flash – Originally Macromedia Flash. Offers a rich graphical experience and continues to rule its world
  • Microsoft Silverlight – A comparatively new entrant that offers functionality similar to Flash. It is backed by the .NET programming languages, which makes it more attractive. However, being a Microsoft technology, its chances of whole-hearted support from the industry look bleak.

 

Ajax

 

A wonderful hack that combines support from both the client side and the server side and gets the best of both worlds. A complete discussion of Ajax is beyond the scope of the current discussion but certainly merits one of its own.

 

Hack and hell

 

Believe it or not, the two words often go hand in hand. A typical web application is a mix of many different technologies:

  • The HTML document
  • Server-side script merged into the HTML document. These scriptlets are processed by the server-side application to produce a clean HTML document.
  • Client-side scripts. These scripts are executed by the browser or its add-ons.
  • Client-side objects. These objects are embedded within the page and displayed. They are not HTML, yet they physically live together with it in the same page.

 

A web application is typically made up of different components developed in different technologies, and they execute in different environments and on different machines. Their development requires varied sets of expertise, and negotiating between them is often complicated. Deploying and configuring these technologies on different machines is another herculean task.

There are real differences in how different browsers interpret various client-side components such as HTML, CSS, JavaScript etc.

All these things create a real hell for web developers and confusion for end users.

We still dream of seeing a more perfect web world.