Project Computing

Google Desktop Proxy


Providing access to the Google Desktop search service from remote machines

23 Oct 2004
Last updated: Kent Fitch, 01 Jan 2005 - version 0.2
incorporating Pavel Ševčík's encoding change
Since this tool was created, Google Desktop has changed a great deal and the approach taken here no longer works. This means that this code will not work as is with the current version of Google Desktop.

Background and motivation

Google Desktop allows you to search the contents of your hard drive, email and browser history.

The search interface runs in your browser. The Google Desktop software provides a web server which binds to your PC's TCP/IP loopback interface and can only be accessed by requests originating from your PC.

Hence, other machines can't send search requests to your Google Desktop search engine.

This is generally a good thing from a privacy point of view: do you really want random people looking at not just the contents of files on your PC but at emails and at web pages you've retrieved and even deleted from your browser cache?

Well, maybe. Maybe you like to share everything. If so, read on.

If Google Desktop were accessible from places other than the machine on which it was installed, some interesting possibilities would arise:

The Java program described here allows others to search your Google Desktop.

It contains some very simple restrictions which may allow you to make it harder for some people to search and retrieve content from your PC:

  1. Requests to configure Google Desktop preferences or delete content from the Google Desktop index are blocked unless they originate from the local machine
  2. By default (as installed) access is only granted to the local machine. However, a parameter file can be supplied which defines which IP addresses or IP address prefixes are allowed access. Be aware, however, that it is optimistic in the extreme to have confidence in associating IP addresses with people or organisations, and that for example, a compromised machine within a trusted IP range can easily broaden access to "the world".
  3. A log file can be specified in the parameter file (but be aware that if the log file is stored on the PC, it too will be indexed and retrievable via the proxy, as will the proxy parameter file...). The log records the date/time, the remote IP address and the start of the request so you can see which horses have bolted in vaguely which direction.

[Since this simple program was released, a more full-functioned/easier to configure alternative has appeared - see http://dnka.com/. I haven't tried it, but it looks impressive.]

How it works

This program is a very simple proxy. A browser or other program on another computer opens a connection to the proxy, which passes the request through to the Google Desktop web server running on the same machine as the proxy. Because the request now originates from the local machine, the Google Desktop web server processes it and passes the result back to the proxy, which in turn sends it back to the original requestor.
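The pass-through described above can be sketched roughly as follows. This is an illustration only, not the code in DesktopProxy.java: the class and method names are hypothetical, it handles only requests without a body (such as GET), and it assumes the Google Desktop web server closes the connection once the response is complete.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical sketch of a single request/response pass-through.
final class RelaySketch {
    private static final byte[] END = {'\r', '\n', '\r', '\n'};

    // Reads bytes up to and including the blank line that ends the request headers.
    static byte[] readRequestHead(InputStream in) throws IOException {
        ByteArrayOutputStream head = new ByteArrayOutputStream();
        int b;
        int matched = 0;
        while (matched < 4 && (b = in.read()) != -1) {
            head.write(b);
            matched = (b == END[matched]) ? matched + 1 : (b == '\r' ? 1 : 0);
        }
        return head.toByteArray();
    }

    // Copies everything from in to out until end of stream; returns the byte count.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    // Forwards one request from the remote client to the local server
    // and streams the response back (assumes the server closes when done).
    static void handle(Socket client, String localHost, int localPort) throws IOException {
        try (Socket backend = new Socket(localHost, localPort)) {
            byte[] head = readRequestHead(client.getInputStream());
            backend.getOutputStream().write(head);
            backend.getOutputStream().flush();
            copy(backend.getInputStream(), client.getOutputStream());
        }
    }
}
```

In a full proxy, handle would be called for each connection accepted on the proxy's listening port, with localHost set to 127.0.0.1 and localPort set to the Google Desktop port.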

The Google Desktop server sends back 6 types of results (maybe more?):

  1. Status and home pages: these are passed straight back, with URLs containing the Google Desktop web server's network address (127.0.0.1:4884) rewritten to the address of the proxy
  2. Search result listings: ditto.
  3. Data retrieved from the Google Desktop Cache: ditto
  4. Data (files) retrieved from the local system disk: Google Desktop opens a window containing the file. This is not appropriate when the request comes in remotely, so instead the proxy reads the file from the disk and sends it to the requestor, hopefully with the right MIME type (determined by file extension). Any URLs in the contents aren't rewritten, so unless they are absolute references or the retrieving machine shares the same address space as the source machine (eg, drive mappings), links will be broken.
  5. Redirects to "live" web servers. The proxy sends the redirect back to the requestor without contacting Google Desktop (because I think that sometimes at least Google Desktop tries to open a new browser window on the local machine).
  6. Preferences and remove-from-index functions: these are disallowed except from the loopback address.
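Two of the steps in this list, address rewriting and choosing a MIME type from the file extension, could look something like the sketch below. The class name, method names and the small extension table are assumptions made for illustration (Map.of needs Java 9 or later); the real program may do this differently.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helpers illustrating address rewriting and MIME type lookup.
final class ResponseHelpers {
    // Address of the Google Desktop web server as given in the text above.
    static final String DESKTOP_ADDRESS = "127.0.0.1:4884";

    // A deliberately tiny, illustrative extension table.
    private static final Map<String, String> MIME_BY_EXT = Map.of(
            "html", "text/html",
            "htm", "text/html",
            "txt", "text/plain",
            "pdf", "application/pdf",
            "jpg", "image/jpeg",
            "gif", "image/gif");

    // Replaces every occurrence of the Desktop address with the proxy's own address.
    static String rewriteUrls(String body, String proxyAddress) {
        return body.replace(DESKTOP_ADDRESS, proxyAddress);
    }

    // Guesses a MIME type from the file extension, falling back to a generic type.
    static String mimeTypeFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return "application/octet-stream";
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return MIME_BY_EXT.getOrDefault(ext, "application/octet-stream");
    }
}
```

For example, rewriteUrls(page, "192.168.1.23:8088") would point every link in a results page back at a proxy running on 192.168.1.23.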

Parameters

On startup the proxy will attempt to read parameters from a file called proxyParms.txt in the current directory. The name of this file may be changed with a command line parameter to the proxy.
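As a rough illustration of this startup step, the sketch below reads key=value lines with java.util.Properties. It assumes the parameter file is compatible with the Properties format and that the alternative file name is the first command line argument; neither is confirmed here, and the class and method names are hypothetical.

```java
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

// Hypothetical sketch of locating and reading the parameter file.
final class ParmsLoader {
    static final String DEFAULT_FILE = "proxyParms.txt";

    // Assumption: the first command line argument, if present, names the file.
    static String fileNameFrom(String[] args) {
        return args.length > 0 ? args[0] : DEFAULT_FILE;
    }

    // Parses key=value lines from any character source.
    static Properties parse(Reader source) throws IOException {
        Properties parms = new Properties();
        parms.load(source);
        return parms;
    }

    // Loads the parameters from the file chosen by fileNameFrom.
    static Properties load(String[] args) throws IOException {
        try (Reader r = new FileReader(fileNameFrom(args))) {
            return parse(r);
        }
    }
}
```

One caveat of this choice: Properties treats a backslash as an escape character, so a Windows path in the file would need doubled backslashes or forward slashes.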

The parameters allow specification of:

If you want anyone to be able to access the proxy, specify:

allowIPAddresses=*

If you want requests from any machines in the 192.168.1 subnet to access the proxy, specify:

allowIPAddresses=192.168.1.*

If you want requests from any machines in the 192.168.1 subnet or from the address 202.173.90.123 to access the proxy, specify:

allowIPAddresses=192.168.1.*,202.173.90.123

The loopback address 127.0.0.1 can always access the proxy and doesn't need to be specified.
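The matching rules in the examples above (a bare *, a prefix ending in *, an exact address, and the always-allowed loopback address) can be sketched as below. The class and method names are hypothetical, and this is one plausible reading of the rules rather than the program's actual code.

```java
// Hypothetical sketch of checking a remote address against allowIPAddresses.
final class AllowList {
    // remoteIp: dotted address of the caller; allowIPAddresses: the raw
    // comma-separated parameter value, or null if the parameter is absent.
    static boolean isAllowed(String remoteIp, String allowIPAddresses) {
        if (remoteIp.equals("127.0.0.1")) {
            return true; // loopback can always access the proxy
        }
        if (allowIPAddresses == null) {
            return false; // default: local machine only
        }
        for (String entry : allowIPAddresses.split(",")) {
            String pattern = entry.trim();
            if (pattern.isEmpty()) {
                continue;
            }
            if (pattern.endsWith("*")) {
                // "192.168.1.*" matches any address starting with "192.168.1."
                String prefix = pattern.substring(0, pattern.length() - 1);
                if (remoteIp.startsWith(prefix)) {
                    return true;
                }
            } else if (remoteIp.equals(pattern)) {
                return true;
            }
        }
        return false;
    }
}
```

Keeping the trailing dot in the prefix matters: with the value 192.168.1.*, the address 192.168.10.5 is rejected because it does not start with "192.168.1.".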

If you let any machines access the proxy then eventually search engines such as Google will find and index it...

Think carefully before allowing any access - as mentioned above, once any external access is allowed it is impossible to guarantee wider access is not leaked - it is like sharing a secret.

Problems, things to fix

  1. Unicode support - bytes and strings are interchanged irresponsibly (Pavel Ševčík has supplied a fix (version 0.2 of the code) which starts to address this problem)
  2. I don't have much confidence that the restrictions on access to the index deletion and preferences functions are robust - I suspect some Unicode or character escaping mechanism may defeat the rule that limits them to the loopback address
  3. Only perform URL rewriting on HTML responses
  4. URL rewriting could change the content length of the response (eg, if the proxy is listening on port 999 or 11111) - check that there is no content-length header in the response, or store and forward the response recomputing a new content length
  5. Improve the URL rewrite scanner - especially fix end of block processing (easy)
  6. After changing the parameter file you have to restart the proxy.
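For item 4, a store-and-forward fix might look like the sketch below: once the whole body has been buffered and rewritten, any Content-Length header in the response head is replaced with the new length. The class and method names are hypothetical, and this is not part of the released code.

```java
// Hypothetical sketch of recomputing Content-Length after URL rewriting.
final class ContentLengthFix {
    private static final String HEADER = "Content-Length:";

    // head: status line and header lines separated by CRLF (a trailing blank
    // line is optional); bodyLength: size in bytes of the rewritten body.
    // Returns the head with any Content-Length header replaced, ending in a blank line.
    static String withContentLength(String head, int bodyLength) {
        StringBuilder out = new StringBuilder();
        for (String line : head.split("\r\n")) {
            if (line.regionMatches(true, 0, HEADER, 0, HEADER.length())) {
                out.append("Content-Length: ").append(bodyLength);
            } else {
                out.append(line);
            }
            out.append("\r\n");
        }
        return out.append("\r\n").toString();
    }
}
```

If the response has no Content-Length header, the head is passed through unchanged, which matches the other option mentioned in item 4.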

Enhancements

Installation

Please be aware that a tool like this, combined with the search precision of Google Desktop, makes it very easy for the contents of your PC and your browsing history, emails and chat to be seen by systems and eventually persons unknown.

Do not use this tool on a system containing data that you don't want to share with the world.

Either:

  1. Save or copy/paste the Java program source found here (13 KB) to your system and compile it like this:

    javac DesktopProxy.java

    Then save or copy/paste and edit the default parameters found here (they must be saved as proxyParms.txt in the same directory, unless a run-time command argument is passed to the proxy telling it where to find the parameter file), or

  2. Download the precompiled version as a Java JAR file found here (9 KB) and then save or copy/paste and edit the default parameters found here.

Running

Open a command window and CD to the directory containing the proxy. If you compiled it yourself, run by entering this command:

java DesktopProxy

If you downloaded the JAR, run by entering this command:

java -jar DesktopProxy.jar

Now, bring up the standard Google Desktop search screen. Edit the URL to replace the port number (4884) with the port number the proxy is listening on (by default, 8088). Press "GO" (or whatever) and you should see a response from Google Desktop coming via the proxy.

To access from another machine, replace the "127.0.0.1" part of the URL with your machine's IP address, and update the proxyParms.txt allowIPAddresses parameter setting to allow access from this other machine by entering the other machine's IP address. Kill and restart the proxy. Try your machine's IP address and proxy port and Desktop Search URL from the other machine, eg:

http://192.168.1.23:8088/&s=4065774076

(The last part of that URL will vary from installation to installation.)

If it doesn't work, maybe the request is coming through a web proxy - check the desktop proxy log for a message. Note that if you open up access to a web proxy then anyone using that proxy will be able to access the Google Desktop on your machine.

Support and License

There is neither - use at your own risk however you like. Do what you want with the code.

This tool was produced as a personal project by me, Kent Fitch, and has nothing to do with Project Computing Pty Ltd.

History

Project Computing Pty Ltd ACN: 008 590 967 contact@projectComputing.com