Amit's Web Proxy Project

Proxy 2

The typical interaction between a web browser and a web server occurs in two stages. First, the browser sends a request to the server, which checks that the request is for a valid document and that the browser has permission to access it. Second, the server sends a reply to the browser, which checks that the reply is valid and, if the document is executable (a Java applet, for example), that it does not violate any security restrictions as it executes. A web proxy server can sit between a browser and a server. Most web proxies are used for firewalls, caching, or filtering. The MURI Proxy both filters and modifies documents sent in a reply. In particular, our proxy modifies Java applets to restrict their behavior. This proxy is not designed to be a foolproof system to catch all hostile Java classes that are sent over the network; rather, it is a convenient tool with which we can experiment with restrictions for Java applets.

The general architecture of the MURI Proxy involves intercepting and acting on both HTTP requests and replies. A request can be handled in one of four ways:

  1. Block. A blocked request is never answered; the browser's connection is closed and the browser reports an error or uses a "broken image" icon.
  2. Answer. An answered request is handled directly by the proxy. In a sense, the proxy acts as a web server.
  3. Redirect. A redirected request is sent to a location other than the one for which it was originally intended.
  4. Forward. A forwarded request is not modified by the proxy; it is sent to the web server for which it was originally intended.

Requests that are redirected or forwarded to a remote web server will result in a reply document, which is intercepted by the proxy. A series of transformations can be applied to the document, with the result being sent to the browser. The URL and MIME type for the document are examined to determine how it is to be transformed. The web browser will not know that the document was modified by the proxy.
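
As a rough illustration of how transformations might be chosen from the URL and MIME type, here is a minimal Python sketch. It is not the proxy's actual code: the `RULES` registry, the `register`, `select_transforms`, and `apply_transforms` names, and the two sample transforms are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

# A transform takes the document body and returns the (possibly modified) body.
Transform = Callable[[bytes], bytes]

# Hypothetical registry of (URL prefix, MIME type, transform) rules.
RULES: List[Tuple[str, str, Transform]] = []


def register(url_prefix: str, mime_type: str, transform: Transform) -> None:
    """Add a rule; rules are applied in registration order."""
    RULES.append((url_prefix, mime_type, transform))


def select_transforms(url: str, mime_type: str) -> List[Transform]:
    """Return every transform whose URL prefix and MIME type match the reply."""
    return [t for prefix, mime, t in RULES
            if url.startswith(prefix) and mime == mime_type]


def apply_transforms(url: str, mime_type: str, body: bytes) -> bytes:
    """Run the matching transforms as a series, feeding each result to the next."""
    for transform in select_transforms(url, mime_type):
        body = transform(body)
    return body


# Illustrative transforms only; they stand in for real filtering modules.
register("http://", "text/html",
         lambda body: body.replace(b"<blink>", b"").replace(b"</blink>", b""))
register("http://", "text/html", bytes.strip)
```

A reply whose MIME type matches no rule passes through unchanged, which mirrors the idea that the browser cannot tell whether the proxy touched the document.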

At present, the MURI Proxy is capable of the following actions on HTTP requests:

  1. Block. The proxy blocks access to a number of sites known to serve only advertising.
  2. Answer. The proxy handles documents at http://_proxy/*. The most important is at /start/, which launches a window with a user interface applet giving the user control of and information about the proxy.
  3. Redirect. The proxy redirects requests for special Java classes (see Java class filtering, below) to a site that contains bytecode for these classes.
  4. Forward. All other requests are forwarded.
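
The four request actions above could be sketched as a single classification function. This is only an illustration under stated assumptions: the blocked host, the special class path, and the mirror site (`BLOCKED_HOSTS`, `SPECIAL_CLASSES`, `CLASS_MIRROR`) are hypothetical placeholders, and the real proxy may decide differently.

```python
from enum import Enum
from urllib.parse import urlsplit


class Action(Enum):
    BLOCK = "block"
    ANSWER = "answer"
    REDIRECT = "redirect"
    FORWARD = "forward"


# Hypothetical configuration values, for illustration only.
BLOCKED_HOSTS = {"ads.example.com"}
SPECIAL_CLASSES = {"/safe/Frame.class"}
CLASS_MIRROR = "http://classes.example.org"


def classify(url):
    """Return (action, target) for a requested URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host in BLOCKED_HOSTS:
        return Action.BLOCK, None                 # close the connection
    if host == "_proxy":
        return Action.ANSWER, parts.path          # the proxy serves this itself
    if parts.path in SPECIAL_CLASSES:
        return Action.REDIRECT, CLASS_MIRROR + parts.path
    return Action.FORWARD, url                    # pass through unmodified
```

For example, `classify("http://_proxy/start/")` yields `(Action.ANSWER, "/start/")`, so the request never leaves the proxy.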

The MURI Proxy also contains several modules for transforming documents:

  1. HTML documents containing references to an image from a blocked site are modified to omit those references.
  2. Java applets using frame windows (windows that appear outside the browser) are modified to:
    1. restrict the number of frame windows to ten; and
    2. restrict the size of frame windows to 500x400.
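
The first transformation in the list (dropping image references to blocked sites) might look roughly like the sketch below. It uses regular expressions for brevity rather than a full HTML parser, and `BLOCKED_HOSTS` and `strip_blocked_images` are hypothetical names, not taken from the proxy source. The Java applet transformation rewrites bytecode and is not sketched here.

```python
import re
from urllib.parse import urlsplit

# Hypothetical block list, for illustration only.
BLOCKED_HOSTS = {"ads.example.com"}

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
SRC_ATTR = re.compile(r"\bsrc\s*=\s*[\"']?([^\"'\s>]+)", re.IGNORECASE)


def strip_blocked_images(html: str) -> str:
    """Remove <img> tags whose src points at a blocked host."""
    def replace(match):
        tag = match.group(0)
        src = SRC_ATTR.search(tag)
        if src and (urlsplit(src.group(1)).hostname or "") in BLOCKED_HOSTS:
            return ""      # omit the reference entirely
        return tag         # keep images from other hosts and relative URLs
    return IMG_TAG.sub(replace, html)
```

Because the tag is removed before the browser sees it, the browser never issues a request for the blocked image at all.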

Download

To run the proxy demo, you need a browser (such as Netscape Navigator or Microsoft Internet Explorer) that supports web proxies and Java applets.

To run the proxy on your own machine, you will need the Python interpreter installed. (Python is available for Unix, Windows, Macintosh, OS/2, BeOS, and several other platforms.) Download the MURI proxy demo (version 2) from here. The proxy has not been tested on non-Unix systems.

To see what the proxy does, compare your browser's behavior with and without the proxy. Note: Be sure to exit and restart your browser after changing proxy settings; otherwise, documents may still be in the cache and may not be loaded through the proxy.

These pages will act differently when the proxy is being used:

  1. CNN.com and many other pages have their advertisements removed.
  2. Java applets: the size and number of windows are restricted.

You can also bring up the user interface applet, which displays information about the proxy. The menus let you control which proxy modules will be active.

Proxy 3

Version 3 of the proxy is focused less on Java applet security research and more on HTML and JavaScript filtering. I found that I liked using the proxy for my daily web surfing, but there were several problems. The main problem was that version 2 did not perform well when running multiple browser windows simultaneously. Version 3 uses an event-driven architecture that can handle hundreds of simultaneous connections. In addition, the content filtering in version 3 is more modular and can handle streaming (so that portions of the document can be filtered and sent on to the browser before the entire document is loaded). Version 3 of the proxy supports a configuration file and loadable modules.
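
To make the streaming idea concrete, here is a minimal sketch of a filter that processes a document chunk by chunk and holds back only a tag that was split across two chunks. The `StreamingFilter` class and its `feed`/`close` methods are assumptions for illustration and are not the interface used by version 3.

```python
class StreamingFilter:
    """Filter a document chunk by chunk, holding back any incomplete tag."""

    def __init__(self, transform):
        self.transform = transform  # applied to text that ends on a tag boundary
        self.pending = ""           # tail of earlier input that may be a partial tag

    def feed(self, chunk: str) -> str:
        """Accept the next chunk and return whatever can be sent on already."""
        data = self.pending + chunk
        # If the last '<' has no '>' after it, the tag is incomplete: keep it.
        cut = data.rfind("<")
        if cut != -1 and data.find(">", cut) == -1:
            self.pending = data[cut:]
            data = data[:cut]
        else:
            self.pending = ""
        return self.transform(data)

    def close(self) -> str:
        """Flush whatever is left once the server has finished sending."""
        data, self.pending = self.pending, ""
        return self.transform(data)
```

With a transform such as `lambda text: text.replace("<blink>", "")`, feeding `"Hello <bli"` returns `"Hello "` immediately, and the split tag is removed once the next chunk arrives. This simple version only suits transforms that work one tag at a time.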

Download


proxy3.tar.gz
proxy3.zip

The proxy has not been tested on non-Unix systems.

Proxy 4

I found that proxy 3 suited my needs well. However, there is a major restriction in the design: each connection between the browser and the web server is represented as a single object. I decided to separate the browser connection object from the server connection object. The design of proxy 4 allows for (a) browser connections handled directly by the proxy (no server object) for special URLs or caching, (b) multiple browser connections (usually images) serviced by the same server object, and (c) server objects with no browser object (for prefetching content). A side effect of separating the browser object from the server object is that the input and output buffers are decoupled, which makes the proxy run much more smoothly. Proxy 4 also doesn't lose data as proxy 3 does in some situations (usually when loading large pages over a fast link).
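
The separation of browser and server objects can be pictured with a small sketch. The class and method names here (`ServerConnection`, `BrowserConnection`, `attach`, `on_data`, `answer_directly`) are hypothetical and the networking is left out; the point is only to show how cases (a), (b), and (c) fall out of the decoupling.

```python
class BrowserConnection:
    """One browser request, with its own output buffer."""

    def __init__(self, url):
        self.url = url
        self.outbuf = []  # chunks waiting to be written to the browser

    def answer_directly(self, body):
        """Case (a): the proxy answers without any server object."""
        self.outbuf.append(body)


class ServerConnection:
    """One upstream fetch; may feed zero, one, or many browser connections."""

    def __init__(self, url):
        self.url = url
        self.clients = []   # attached BrowserConnection objects
        self.received = []  # chunks kept so late-joining clients can catch up

    def attach(self, client):
        """Case (b): several browser connections can share this fetch."""
        self.clients.append(client)
        client.outbuf.extend(self.received)

    def on_data(self, chunk):
        """Case (c): data is stored even when no browser is attached yet."""
        self.received.append(chunk)
        for client in self.clients:
            client.outbuf.append(chunk)
```

Each browser connection drains its own `outbuf` at its own pace, so a slow browser does not stall the read from the server.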

Proxy 4 is still in development. I've implemented basic proxying, HTTP/1.1 performance features (gzip encoding, chunked encoding, persistent connections), and asynchronous DNS lookups. However, I have not implemented content filtering, loadable modules, status URLs, or a configuration file.
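
As an illustration of one of the HTTP/1.1 features mentioned above, the following sketch decodes a body sent with chunked transfer encoding. It assumes the whole body is already in memory and ignores trailers; `decode_chunked` is a hypothetical helper, not a function from proxy 4, which would need an incremental version.

```python
def decode_chunked(data: bytes) -> bytes:
    """Decode a complete HTTP/1.1 chunked body (trailers are ignored)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line is hexadecimal and may carry ";extension" parameters.
        size_field = data[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break                      # a zero-length chunk ends the body
        body += data[pos:pos + size]
        pos += size + 2                # skip the chunk data and its trailing CRLF
    return bytes(body)
```

For example, the wire bytes `b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"` decode to `b"Wikipedia"`.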

Download


proxy4.zip

The proxy has not been tested on non-Unix systems. However, it's more likely than earlier versions to work across platforms, because it uses the cross-platform asyncore library for networking.


Last modified 15:24 Sun 26 Nov 2000, Amit Patel