Google Wave’s Java Bot API: Part 1

Google’s Wave has also come with a host of new development APIs for creating bots and embeddable gadgets. I have not had the chance to mess around with gadgets yet. However, I have been working on a simple bot for Google Wave (see below), written in Java. This project was started to create group management for the Fortress Forever community on Google Wave. Currently the community is extremely small; only seven people. Some of those people may not even remain active on Wave; they just wanted to see it and mess around with it. Anyway, that has not stopped this small project. It started out with simple group management, but is progressing to something bigger, badder, and better.

This is the first in a small series of small reviews about Google’s bot API. As I discover new things about the API, I will review them here. This first article will focus on the basics of the API overall. Subsequent articles will delve into more specific things. This particular article will also be focusing on the Java API, as I have not experimented with the Python version of the client library yet.

The Basics: Pros

Overall, the API is very straightforward for a platform that is as complicated as Wave. All of the complications of the Wave protocol are hidden inside a nice client library (currently Python and Java versions exist). Bots written in Java are J2EE web applications that build on the foundational concepts of servlets. The Java bot API provides a servlet class named AbstractRobotServlet.

You must extend this class and implement its processEvents(RobotMessageBundle bundle) method. This is the entry point for the bot, and will be called every time an event is raised by the Wave platform. From the RobotMessageBundle, you can get the wavelet that raised the event, as well as check some event types. Once you have the wavelet, you can do more things such as create new wavelets, append blips, edit text, and more.

The Basics: Cons

The most annoying thing I’ve found so far is that the bits of information you may want to use in order to process an event tend to be spread across the RobotMessageBundle, the Wavelet, and the Blip objects. This leads to passing around all three objects to fully process different events. This problem mostly arises as a design issue in the software, though. With proper design practices, passing around all three objects as parameters to methods can be avoided. Still, I could well see a framework built on the Wave robot library arising.

The other thing I have found to be a minor annoyance is having to increment “version numbers” in the capabilities.xml file when a new event is added to the bot. This forces Wave to update its list of known capabilities for the bot. It seems a bit obtuse compared to simply forcing a capabilities.xml update whenever a new version of the bot is deployed to Google’s App Engine, or allowing the user to manually trigger an update the next time the bot is hit from Wave. I suppose the forced update on every new deployment would eat up more of Google’s resources than necessary, but the manual trigger for an update wouldn’t be that bad.

Perhaps the versioning stuff could be abstracted by the Google Plugin for Eclipse with a checkbox on the deploy screen that gives the option to force Wave to update the bot’s capabilities cache. It’s a small change (and definitely of low priority given everything else that needs to be finished first), but it would make things even easier and more straightforward.

Google Wave Development

I recently got ahold of a Google Wave account and started experimenting with development. My first foray into Wave development was with a bot. I made a simple group management bot (it’s not actually complete yet). Currently, Wave does not have any functionality for managing groups of users. While that will change in the near future, the immediate situation required a simple group management system. Users can add themselves to the community bot, and then they will immediately be subscribed to the community’s waves. Yes, there are bots that do this and are a bit more genericized to allow for multiple groups in one bot. I did it for the learning experience. The bot itself was not that hard to code. It’s not very big–having only four source files. It was learning the API, the way the Wave robot client library works, and Google’s App Engine that took up a little bit of time.

This bot will probably become useless within a few weeks after Google implements its groups functionality, but this will work for now.

Silly Thought: Client-Server Object Bridging

This was a silly thought I had at work the other day that I thought would be cool. I’m not quite sure if it’s practical or not, but I think it’s cool. The basic idea of this system is to use a JavaScript API to create objects in the client-side script, and then have those objects transparently replicated on the server. A possible extension to this would allow objects on the server to be transparently replicated on the client as well.

At the most basic level, neither idea seems too complicated (although client -> server binding is harder than server -> client binding). If the replication were to go both ways, we would have a bridged object, because it can be accessed in either the client or server environment.

Client -> Server Binding
Using XML-RPCs (remote procedure calls), the JavaScript library will send out requests to special receivers on the server. These receivers will use platform-specific methods to decode the XML and translate it into a server-side object. The response of these receivers, then, will be an XML message confirming that the binding was successful.
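
As a rough illustration of the client half of this exchange, the sketch below builds an XML-RPC request body from a flat JavaScript object. The method name bridge.bind, the function names, and the choice to send the class name as the first parameter are all assumptions made up for this example; none of it is an existing library.

```javascript
// Hypothetical sketch: encode a flat object as an XML-RPC <struct> and wrap it
// in a call to an assumed "bridge.bind" receiver on the server.

function escapeXml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Map a JavaScript value onto one of the basic XML-RPC scalar types.
// Integers outside the 32-bit range are not handled specially here.
function encodeValue(value) {
  if (typeof value === "boolean") {
    return "<boolean>" + (value ? 1 : 0) + "</boolean>";
  }
  if (typeof value === "number") {
    return Number.isInteger(value)
      ? "<int>" + value + "</int>"
      : "<double>" + value + "</double>";
  }
  return "<string>" + escapeXml(value) + "</string>";
}

// className and fields describe the object to replicate on the server.
function buildBindRequest(className, fields) {
  var members = Object.keys(fields).map(function (key) {
    return "<member><name>" + escapeXml(key) + "</name><value>" +
      encodeValue(fields[key]) + "</value></member>";
  }).join("");
  return '<?xml version="1.0"?>' +
    "<methodCall><methodName>bridge.bind</methodName><params>" +
    "<param><value><string>" + escapeXml(className) + "</string></value></param>" +
    "<param><value><struct>" + members + "</struct></value></param>" +
    "</params></methodCall>";
}
```

The resulting string would be POSTed to the receiver (for example with XMLHttpRequest and a text/xml content type), and the receiver would answer with its own XML confirmation.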

Server -> Client Binding
This is easier than the above. Basically, this boils down to getting all of an object’s member fields and method names and sending a bunch of JavaScript code in a response, which is then injected into the DOM.
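
To make that a little more concrete, here is a minimal sketch that assumes the server side is itself written in JavaScript (the idea is meant to be platform-independent, so this is only one possible setup). The function name generateClientStub and the callServer helper that the generated stubs call are both hypothetical.

```javascript
// Hypothetical sketch: turn a server-side object into JavaScript source that
// recreates its fields on the client and stubs out its methods.
function generateClientStub(name, obj) {
  var fields = {};
  var methods = [];

  // Own, non-function properties are copied by value.
  Object.keys(obj).forEach(function (key) {
    if (typeof obj[key] !== "function") {
      fields[key] = obj[key];
    }
  });

  // Method names come from the object's class (its prototype).
  var proto = Object.getPrototypeOf(obj);
  if (proto && proto !== Object.prototype) {
    Object.getOwnPropertyNames(proto).forEach(function (key) {
      var descriptor = Object.getOwnPropertyDescriptor(proto, key);
      if (key !== "constructor" && typeof descriptor.value === "function") {
        methods.push(key);
      }
    });
  }

  var lines = ["var " + name + " = " + JSON.stringify(fields) + ";"];
  methods.forEach(function (method) {
    // Each stub forwards to an assumed callServer(objectName, method, args).
    lines.push(name + "." + method + " = function () { return callServer(" +
      JSON.stringify(name) + ", " + JSON.stringify(method) +
      ", Array.prototype.slice.call(arguments)); };");
  });
  return lines.join("\n");
}
```

The returned string is what would be sent in the response and injected into the page. Only fields that survive JSON serialization are carried over, which is one of the places where the idea gets harder than it first looks.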

Complications
The idea gets complex very quickly. For example, where does code that modifies the objects get executed? How does synchronization between client and server take place transparently in the background without having the developer have to do anything more than extend from a special object or implement some interfaces? How does this fit into the general lifecycle of a web application going from the user’s request to the server’s eventual response? Could an implementation of this concept be secure? How would the server know it’s receiving a valid request that the developer of the web application programmed in, and not a malicious JavaScript injection?

I propose here some basic answers to these fundamental questions. They are fundamental because the answers will likely decide whether this concept can work effectively, work only somewhat in a limited context, or not work well enough to be of any use at all. To start off, I think it would be prudent to begin with how these “bridged” objects fit into the general lifecycle of a web application from user request to server response.

How do bridged objects fit into the FORM request-response life cycle?
At the most basic level, a web application consists of an HTML form, some server-side code to process that form, and an HTML response. The server-side code generally generates all or most of the response, so it can include dynamic response information such as a unique order number for a user. The server-side code may use objects to help keep everything in line. These objects will be created by the developer and populated with data from the form submission. In order to make this bridged object binding idea work, we can actually re-think how forms are submitted.

Instead of relying on a normal POST request and manually-created code to decode the POST information on the server, we can instead create a server-side object that will encapsulate our data. Through the use of this binding library, then, we can submit the form information as one of these server-side objects. We could specify a special method type for submission with the form. When the page loads, the JavaScript library will take care of all the necessary things in the background. For example:

HTML:

<form action="processform.php" method="bind">
<input name="theBox" size="30" type="text" />
<input name="submit" type="submit" value="Submit" />
</form>

JavaScript:

function theForm_submit() {
     // FormObject is hypothetical; its class definition resides somewhere on the server.
     var fo = new FormObject();
     fo.setInfo(document.forms[0].theBox.value);
     // Hypothetical call that replicates the object on the server.
     fo.bind();
}

Server-side PHP (processform.php):
// getClientObject() is hypothetical; it would return the object bound by the client.
$fo = getClientObject("fo");
$info = $fo->getInfo();
// do something with $info

This woefully incomplete pseudo-code example with invalid HTML markup is one of the simpler ways I could see this idea coming into usage. Here we have declared a function in JavaScript that handles the actual binding of the object with the server. Not very transparent at all. The first goal would be to remove the explicit call to the bind() method. The second goal would be to remove the need for the JavaScript entirely, where special values specified in the form itself are automatically interpreted by the JavaScript library and bound objects are created in the background.

Whether or not this usage of this hypothetical JavaScript/server-side library is even good, or useful, is another story entirely. To determine that, we will have to investigate other parts of the web application request-response life cycle where a form is not necessarily being submitted. For example, AJAX calls triggered by button clicks could create an object and store it on the server. In the coming weeks, I hope to explore answers to those parts of this first question, and of course, answers to the remaining questions.

Concept: Simulating POST and GET Requests with JavaScript

After reading this title, you’re probably thinking “why would I EVER need to do that?” The answer is probably never in a practical situation. It could be employed in slightly less-than-ethical areas such as XSS attacks. It could be employed in quick data transfer through a multipage form. Of course, if you’re doing that you likely have a framework of some sort doing it for you in the background. Or, maybe you’re writing one of those kinds of frameworks. In my case, it’s part of a larger idea to simulate a “server” in cPHP–a (very) experimental client-side PHP->JavaScript translator/compatibility layer thing. Since cPHP itself is AJAX-based, and the JavaScript engine needs to control the entire translation and execution of the script, it needs to be able to maintain data between form submissions and provide to its PHP implementation a reliable “illusion” of a server.

There isn’t really any code in this post (that will hopefully come some day later); it’s just a few different ideas on how this could work. The first idea that crossed my mind was to allow form submissions to work as normal, as they are specified in the HTML. We could parse the query string (the part of the URL after the “?” in the address bar) to get all of the information we need. This is probably the simplest approach, but it has severe limitations. Firstly, POST requests would not work at all, as JavaScript is not aware of what is POSTed to a server. In my case, I don’t even care about the server. Secondly, the page will reload, which may require forwarding of other state information between the two pages.
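
For what it’s worth, the query string approach can be sketched in a few lines. This is only an illustration, not cPHP code: the function name parseQueryString is made up, and it leans on the standard URLSearchParams class for decoding.

```javascript
// Sketch: turn the query string of a URL into a plain object, roughly the
// way PHP would populate $_GET. Later duplicates overwrite earlier ones.
function parseQueryString(url) {
  var result = {};
  var queryStart = url.indexOf("?");
  if (queryStart === -1) {
    return result; // no query string at all
  }
  // Drop any #fragment that follows the query string.
  var hashStart = url.indexOf("#", queryStart);
  var query = url.slice(queryStart + 1, hashStart === -1 ? undefined : hashStart);
  new URLSearchParams(query).forEach(function (value, key) {
    result[key] = value;
  });
  return result;
}
```

In a browser this would be called with window.location.href after a GET form reloads the page. It does nothing for POST, which is exactly the limitation described above.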

My other idea is more complicated. The JavaScript, when loading the page, will actually inspect the form tag(s) in the document and modify them so that no request is sent. It will then find each form’s submit inputs and attach a function that “submits” the form by recording its data internally. In the case of cPHP, it would be easy from this point to populate the $_POST associative array or $_GET associative array, depending on the original method attribute specified by the form tag.
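
A minimal sketch of this second idea might look like the following. It hooks the form’s submit event rather than rewriting the submit inputs, which is a simpler stand-in for what is described above. The superglobals object and both function names are hypothetical, and the sketch records every named, enabled control, whereas a real browser would only send the submit button that was actually clicked.

```javascript
// Hypothetical stand-ins for PHP's $_GET and $_POST arrays.
var superglobals = { GET: {}, POST: {} };

// Record a form's current values under the array matching its method.
function recordSubmission(form) {
  var method = String(form.method || "get").toUpperCase() === "POST" ? "POST" : "GET";
  var data = {};
  Array.from(form.elements).forEach(function (el) {
    if (!el.name || el.disabled) {
      return; // unnamed and disabled controls are never submitted
    }
    if ((el.type === "checkbox" || el.type === "radio") && !el.checked) {
      return; // unchecked boxes are left out, as in a real submission
    }
    data[el.name] = el.value;
  });
  superglobals[method] = data;
  return method;
}

// Stop every form in the document from sending a request and record it instead.
function interceptForms(doc) {
  Array.from(doc.forms).forEach(function (form) {
    form.addEventListener("submit", function (event) {
      event.preventDefault();
      recordSubmission(form);
    });
  });
}
```

On a real page this would be wired up with interceptForms(document) once the document has loaded.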

RingMUD Being Worked on Once More

I’ve started working on RingMUD once more. This time, I’ve hammered out a to-do list that will finally put the MUD from alpha into beta. Beta, in this case, means that the base engine system is done and that the content needs to be created. Code work would continue of course during the beta phase, but the larger focus would be on developing the XML files that describe items, player classes, and the like.

RingMUD is a MUD engine, rather than a specific MUD coded in Java. That means the content should be specified outside of the code itself. The goal is to create a D&D 3.5-based system that can be used by people to design any world they want.

Here’s a general idea of the to-do list for the MUD:
1. Modify movement code to support different zones.
2. Implement serialization/saving of player characters and the world state. (IN PROGRESS)
3. Implement defining player classes, items, etc in XML files. (IN PROGRESS)
4. Convert all code documentation to Javadoc.
5. Move many options outside the hard code into a config file.
6. Implement equals methods for various classes such as Race, Body, and BodyPart.
7. Investigate display problems that some MUD clients seem to be having.

Here are some things that have already been done in a day:
1. Standardize client-server communication code instead of having methods scattered everywhere.
2. Rewrite and simplify code that parses outgoing data.
3. Clean up the logon session class. It was extremely hard to read previously.

Thermetics Reborn (Again)

So I’ve decided to once again update my site. This time, however, I installed WordPress instead of coding my own site from scratch. The old site still exists, but this will be the new one that I use. I was lazy and didn’t feel like creating something new or fixing the problems with the old site. I’ve got WordPress customized to the way that I want it so that it meets the goals for my previous layout, which was basically a blog-esque content management system.

More improvements and content will be coming in the near future.

Software Management on OpenSolaris

One of the main issues I’ve always had with Solaris is finding packages to install. Back when I used Solaris XDE 9/07, this was much worse than it is today. In OpenSolaris, there is a new technology known as the Image Packaging System (IPS). This finally brings centralized software management capability to Solaris. It’s something many Linux distributions already do, and is basically an expected feature these days.

The default OpenSolaris package repository has a fairly large number of packages, but it seems to be lacking in some areas. The number of packages in the default repository is growing, but it’s still missing things such as MPlayer and other programs. Luckily, there are other IPS repositories that help alleviate this! The following commands can be executed in a terminal to add new repositories to the package manager:

pkg set-authority -O http://blastwave.network.com:10000/ Blastwave
pkg set-authority -O http://pkg.sunfreeware.com:9000/ Companion

Now, whenever you use the

pkg search -r

command, you should be able to search from these repositories and install from them.

Path Variables:

Packages from Blastwave will install to the directory /opt/csw and entirely isolate themselves from the rest of the system. They get their own bin directory, their own etc directory, and so on. Make sure you add /opt/csw/bin to your PATH to get these programs to work properly:

export PATH=/opt/csw/bin:$PATH

You may want to put that in your .bash_profile or .bashrc file in your home directory so it gets loaded all the time.

Other Places for Packages:

Despite the addition of two more repositories, there will be times when the package image we want just doesn’t exist (Wine? Hello, where are you?). Luckily for us, OpenSolaris has pkgadd and friends for “backwards compatibility.” The pkgadd, pkgrm, and other related commands are used to install programs from .pkg files. It used to be the way Solaris handled software installation. There are some other websites that contain various software packages for installation:

http://www.sunfreepacks.com/

A small, but very nice collection of programs, including an up-to-date version of Wine for Solaris 10 and OpenSolaris (make sure to download the proper version!)

Finding Your Hard Drives in OpenSolaris

If you’re coming from a Linux or Windows system, the first thing you will notice about Solaris systems is that the disk device labeling scheme is rather omgwtf?! when you first see it. Every disk is labeled something like c0t1d0p2. AND, not to mention, there are two places where these huge lists of strange disk IDs appear: /dev/dsk and /dev/rdsk. This can be confusing to someone who doesn’t know their way around a Solaris system.

Fresh and Raw
The first thing to get straight is the difference between disk (/dev/dsk) and raw disk (/dev/rdsk). Devices in the /dev/dsk directory are things that you can mount, unmount, format, and otherwise manipulate with commands. Of course, not all of these devices are used at any given time, which is where a certain program comes in, described below.

The /dev/rdsk directory contains the “raw disks,” the operating system’s abstraction of the raw data. This is used with certain programs that need to access the raw data directly, as we will see below.

Controller What?
The next item on the agenda is to figure out the somewhat funky naming scheme of disk devices in Solaris. On Linux systems, it’s relatively simple. It’s usually something like hda1 or sdb2. The “a” or “b” in the name refers to a physical hard drive, with each number referring to a partition on that hard drive. The leading “h” or “s” reflects the driver in use: “hd” names come from the old IDE driver, while “sd” names come from the SCSI disk driver, which newer kernels also use for SATA and USB drives.

Solaris uses a significantly different naming scheme. As stated before, the names are things like c0t1d0p2. This website offers a good technical/historical explanation of what the names mean in full. I suggest reading it. Although if you’re using OpenSolaris, you should ignore the part about the “s.” Assuming you only have one hard drive, the main thing of importance to you is the partition number: the number following the “p” in the name. Here’s a summary of what that website says about the names, organized by letter:

C: This refers to the disk’s controller, the internal thing that’s controlling read and write access to the disk. Some disks will appear on different ones depending on what they are. For example, my two internal HDDs appear on c3, while my external USB HDD appears on c1.

T: This is the target. It’s related to the controller. On my system, my two internal hard drives are different targets on the c3 controller: c3t0 and c3t1.

D: This is the disk number, which refers to the actual disk. Since my targets each refer to an individual hard disk, I have c3t0d0 and c3t1d0.

P: This is the partition number and is what you should be most interested in if you’re running a multi-boot system on a single partitioned hard disk. p0 refers to the entire disk. This is a very important concept. p1 through p4 refer to the primary partitions, with p5 and higher referring to logical partitions.
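
The scheme above can be spelled out as a small parser. This is just an illustrative sketch (the function name parseDiskName is made up), and it only handles names of the exact form described here, so slice-style names ending in an “s” number are rejected.

```javascript
// Sketch: split a Solaris disk name such as "c0t1d0p2" into its parts.
// Returns null for anything that is not of the form cNtNdNpN.
function parseDiskName(name) {
  var match = /^c(\d+)t(\d+)d(\d+)p(\d+)$/.exec(name);
  if (!match) {
    return null;
  }
  var partition = Number(match[4]);
  var kind;
  if (partition === 0) {
    kind = "whole disk";          // p0 is the entire disk
  } else if (partition <= 4) {
    kind = "primary partition";   // p1 through p4
  } else {
    kind = "logical partition";   // p5 and higher
  }
  return {
    controller: Number(match[1]),
    target: Number(match[2]),
    disk: Number(match[3]),
    partition: partition,
    kind: kind
  };
}
```

For example, parseDiskName("c3t1d0p5") reports controller 3, target 1, disk 0, and a logical partition.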

How I Find Mah Disks?
This is where we introduce the wonderful program of prtpart. It was developed by the BeleniX distribution of OpenSolaris as a partition table viewer. To install this you will need to download the following files:
FSWpart.tar.gz

FSWfsmisc.tar.gz

The first file contains the package file for prtpart, while the other contains a package that adds support for a bunch of miscellaneous file systems, extending the capabilities of prtpart. Install them by changing to the directory that you downloaded them to and then running:

gunzip -c FSWpart.tar.gz | tar xvf -
pkgadd -d . FSWpart
gunzip -c FSWfsmisc.tar.gz | tar xvf -
pkgadd -d . FSWfsmisc

Now you should be ready to use the prtpart program. It must be run with root permissions, so use it with pfexec or start a root shell and execute it. Just executing prtpart without any arguments will produce output like:

Available disk devices:

/dev/rdsk/c3t0d0p0
/dev/rdsk/c3t1d0p0

Use /usr/bin/prtpart  to get partition details
Use /usr/bin/prtpart -help for usage help

This lists all the disk devices the OS currently knows about. Notice that it lists the rdsk devices. If you then execute prtpart with one of these raw devices followed by “-ldevs”, it will give you the /dev/dsk devices corresponding to that raw disk. Example:

$ pfexec prtpart /dev/rdsk/c3t0d0p0 -ldevs

Fdisk information for device /dev/rdsk/c3t0d0p0

** NOTE **
/dev/dsk/c3t0d0p0      - Physical device referring to entire physical disk
/dev/dsk/c3t0d0p1 - p4 - Physical devices referring to the 4 primary partitions
/dev/dsk/c3t0d0p5 ...  - Virtual devices referring to logical partitions

Virtual device names can be used to access EXT2 and NTFS on logical partitions

/dev/dsk/c3t0d0p1	IFS: NTFS
/dev/dsk/c3t0d0p2	Linux native
/dev/dsk/c3t0d0p3	Linux native

If you’re just trying to find your hard drives, this is one way to do it.