Algorithms and OOD (CSC 207 2014F) : Readings

An API for Ushahidi


Summary: We consider a simple API (application-programming interface) for accessing data from the Ushahidi open-source crowdmapping program. Along the way, we explore a few basic issues in object design.

Prerequisites: Java basics. Polymorphism, inheritance, and interfaces. Using Ushahidi.

Introduction

As you've discovered, Ushahidi is an interesting experiment in providing an open-source crowdmapping platform. By choosing a rich and task-independent data model and set of operations, the designers have produced an application that can be used for a wide variety of problems, from reporting on election violence to supporting emergency relief services to keeping track of all of the nearby restaurants.

As a programmer, you might ask yourself what you can do “behind the scenes” to work with Ushahidi. One possibility is to work on the Ushahidi platform itself. There's certainly a wealth of possible contributions one could make. Another would be to provide non-code support: documentation, translations, etc.

However, we're going to consider a different approach. Rather than working directly with the Ushahidi code, which is primarily PHP, we will work with data from Ushahidi servers. How are we able to do so? Like all good modern application designers, the designers of Ushahidi have provided a language-independent way to access Ushahidi data (and Ushahidi servers) on the Web. Some friendly folks have taken advantage of that service to build a set of Java classes that let you work with data on an Ushahidi server. As is conventional, we call that set of classes an API, or Application-Programming Interface, because they provide an interface (set of capabilities) that programmers can use.

In this reading, we consider some issues in the design and use of a simple Java API for Ushahidi.

Pause and Reflect

If you have not done so already, reflect on what classes and methods you would use to represent not only the data in Ushahidi, but also the basic operations a client that analyzes data could use.

Getting Started: Modeling Incidents

As you have no doubt observed, at the core of Ushahidi is something that you might call an “incident” or “report”. (There's a subtle difference between the two, but we won't worry about it.) Each time someone contributes data to Ushahidi, Ushahidi wraps that data together into a neat package.

What data? Every incident seems to have a title; a description; a year, month, day, hour, and minute that the incident was created; a longitude, latitude, and location name; one or more categories; an optional collection of media; some comments, and more. Each incident also has some hidden characteristics, including a unique numeric identifier, a mode, and notations as to whether the incident is active and verified. Each incident can have some installation-specific fields.

How do we put that together into a class?

We might start by grouping together related pieces of data and identifying classes for those groups of data. For example, all of the values that go into a date or time should probably be one class. What other groups do we have? We have all of the location information. We probably have information on each user (although users were not mentioned above). If we delve deeper, we'll note that each comment has multiple pieces of information, including the user who made the comment, the date the comment was made, and the contents of the comment.

We now have a number of groups of fields. It probably makes sense to represent each group as a separate object. Why? In part, using objects rather than individual fields simplifies the design of the incident class a bit. More importantly, the additional objects help ensure that related data are kept together. Finally, it is likely that we can use many of these classes in other situations. For example, locations might be useful in other programs.

Of course, if these objects are likely to be useful in other programs, they might already exist. So we should check the Java API to see what kinds of classes are available. Amazingly, the design of times and dates in Java was somewhat iffy up to Java 8. As a recent article suggests, “A long-standing bugbear of Java developers has been the inadequate support for the date and time use cases of ordinary developers.” But Java 8 has a variety of better-designed classes which appear in the java.time package. The class java.time.LocalDateTime seems to serve our purposes.
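As a small illustration of what java.time.LocalDateTime offers, the sketch below builds a timestamp from the year, month, day, hour, and minute that an incident carries, then reads some of those pieces back. The class name IncidentDateDemo, the helper makeDate, and the particular date are made up for this example; only the java.time calls come from the standard library.

```java
import java.time.LocalDateTime;

// Hypothetical demo class; not part of the Ushahidi API.
class IncidentDateDemo {
  // Build a date/time from the five components an incident reports.
  static LocalDateTime makeDate(int year, int month, int day,
                                int hour, int minute) {
    return LocalDateTime.of(year, month, day, hour, minute);
  }

  public static void main(String[] args) {
    LocalDateTime created = makeDate(2014, 9, 24, 13, 30);

    // Each component is available through its own getter.
    System.out.println(created.getYear());        // prints 2014
    System.out.println(created.getMonthValue());  // prints 9
    System.out.println(created.getHour());        // prints 13

    // LocalDateTime objects are immutable: plusDays returns a new
    // object and leaves the original unchanged.
    LocalDateTime later = created.plusDays(1);
    System.out.println(later.isAfter(created));   // prints true
    System.out.println(created.getDayOfMonth());  // prints 24
  }
}
```

Note that LocalDateTime carries no time zone, which is reasonable here only under the assumption that we treat the server's timestamps as local times.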

What about our other groups of data? Java doesn't seem to have locations with latitude and longitude, and the fields in Ushahidi comments are specific enough that it's almost certainly necessary to create our own class. It's tempting to set up a representation of users, or at least user names. However, Java doesn't appear to have a person or name class (possibly because different applications will store such data very differently), and it appears that the data shared by Ushahidi doesn't have that additional information. We'll just stick with a string. Finally, the information on an Ushahidi category is specific enough that we'll want to represent it ourselves.

Wow. That was a lot of quick analysis. Let's see what classes we've come up with.

  • UshahidiIncident. Our primary class. Stores all of the information on an incident or report, or at least all of the information that is available to us.
  • java.time.LocalDateTime. Java 8's new mechanism for representing dates and times.
  • UshahidiComment. A comment on an incident. Includes the author of the comment, the text, the date it was created, and so on and so forth.
  • UshahidiLocation or Location. Information about a location. Locations are complicated. For example, how close do latitude and longitude have to be in order to consider two locations the same? And if we construct a location with only a name, should the constructor try to find/guess the latitude and longitude? Since we're building this class for this particular application, and since we'll get all of the data from the server, we'll stick with a very simple representation.
  • UshahidiCategory. The categories of incidents.

We've thought about the kinds of data we're working with. And that's a reasonable first step when using objects to model data. It's almost time to think about the methods we want to include in each class. But not quite. Before looking at methods, we should consider some general design issues. In particular, we should decide whether the objects we create should be mutable or immutable. We might also consider what data should and should not be available to the client, or, more generally, whether we want to provide relatively open access to data or relatively closed access (e.g., should we only provide a few methods that give compound info as a string, or should we give access to each field?).

Our first impulse is often to make our objects mutable. But it's not clear that that's a good idea in this case. For example, if we've fetched an incident from a server and change one of its fields, what are the semantics of that change? Does it mean that we're changing the local copy or that we're expecting the local change to propagate back to the server? Either choice is likely to have some unexpected effects, as some programmers will interpret it one way and others will interpret it the other, even if we document our choice carefully. So, let's make the objects immutable.

Do we provide atomic access to information, or only compound access? Since we can't predict how clients will want to use the objects, it seems best to just give the client the ability to get each field individually, rather than to limit clients to compound data.

The analysis so far suggests that we should provide a getter for each primary field in each object. Do we need other methods? It will also be useful to have a toString method.

What next? We need ways to fetch incidents and we need ways to submit the incidents we create to a server. We might have a submit method associated with each incident. We might also provide class methods to fetch incidents. The latter seems a bit iffy ... you might want to fetch incidents from multiple servers, and that seems like an instance in which we should have an object for each server. And, if we're creating objects to fetch incidents, we should probably also create objects to send incidents.

All of this analysis leads to a fairly straightforward design for Ushahidi incidents and related classes: We will create immutable objects which provide individual access to the different pieces of data, but not much additional functionality. Do you need to see the code? Probably not; given a list of fields and a requirement to create a getter for each field, you should have no trouble envisioning the code or even writing it yourself. (You can read the documentation if you really want to see details.)
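For readers who would still like a concrete picture, here is a minimal sketch of the pattern just described: final fields set once in the constructor, one getter per field, and a toString method. The class SimpleLocation, its field names, and its constructor are assumptions made for illustration; they are not copied from the library's UshahidiLocation class.

```java
// Simplified, hypothetical stand-in for a location class. The names here
// are assumptions for illustration, not the library's actual code.
final class SimpleLocation {
  // Fields are private and final, so they cannot change after construction.
  private final String name;
  private final double latitude;
  private final double longitude;

  SimpleLocation(String name, double latitude, double longitude) {
    this.name = name;
    this.latitude = latitude;
    this.longitude = longitude;
  }

  // One getter per field; there are deliberately no setters.
  String getName() {
    return this.name;
  }

  double getLatitude() {
    return this.latitude;
  }

  double getLongitude() {
    return this.longitude;
  }

  // Compound access for clients who just want something printable.
  @Override
  public String toString() {
    return this.name + " (" + this.latitude + ", " + this.longitude + ")";
  }
}

// Hypothetical demo class showing the getters and toString in use.
class SimpleLocationDemo {
  public static void main(String[] args) {
    SimpleLocation loc = new SimpleLocation("Grinnell", 41.5, -92.75);
    System.out.println(loc.getName());      // prints Grinnell
    System.out.println(loc.getLatitude());  // prints 41.5
    System.out.println(loc);                // prints Grinnell (41.5, -92.75)
  }
}
```

Because nothing in the class can modify a field, the question of whether a local change should propagate back to the server never arises.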

A Client's Perspective

In leaving the ability to fetch incidents to a separate class, we've left ourselves to design that class, or perhaps a corresponding interface. Let's start with the interface, since we are focusing more on actions (getting data) than on representing data.

In designing the interface, we are somewhat constrained by the ways in which Ushahidi servers present data and by the ways in which we envision clients working with the data. (We can work around some limitations in how the servers present data, but not as many in how we envision using the data.)

So, what are the simplest things a client wants to do in terms of incidents? Probably find out how many there are. Since each incident has an id, it might also be useful to identify those identifiers. Unfortunately, the Ushahidi public API provides access to neither sort of data.

But that's okay; we'll just look at the data differently. In a typical information processing task that involves a collection of items, one often looks at each item in turn. That suggests one core method, nextIncident(), which returns a new incident each time it is called. Will the method ever fail? Well, at some point we'll run out of incidents. How should we handle that potential failure? We could provide a hasMoreIncidents() predicate that holds only when more incidents are available and require that the client check before calling nextIncident(). We could throw an exception when no more incidents remain. We'll do both.

Given the decision to throw exceptions when no incidents remain, we might have the following signature.

  /**
   * Get the next unseen incident.
   * 
   * @return
   *            Some incident that has not been previously returned,
   *            assuming such an incident exists.
   *
   * @throws Exception
   *             If no incidents remain.
   */
  public UshahidiIncident nextIncident()
    throws Exception;

We also need a declaration for hasMoreIncidents.

  /**
   * Determine if any unseen incidents remain.
   *
   * @return true, if incidents remain; false, otherwise.
   */
  public boolean hasMoreIncidents();
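To see how a client might use these two methods together, consider the following sketch. So that it runs without a server, it uses a hypothetical interface, TitleSource, in which plain strings stand in for UshahidiIncident objects, and a hypothetical list-backed implementation, ListTitleSource. Neither is part of the actual API; only the two method names come from the declarations above.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the two methods declared above. Titles
// (Strings) take the place of full UshahidiIncident objects.
interface TitleSource {
  boolean hasMoreIncidents();

  String nextIncident() throws Exception;
}

// A list-backed source, assumed only so the example needs no server.
class ListTitleSource implements TitleSource {
  private final Iterator<String> remaining;

  ListTitleSource(List<String> titles) {
    this.remaining = titles.iterator();
  }

  public boolean hasMoreIncidents() {
    return this.remaining.hasNext();
  }

  public String nextIncident() throws Exception {
    if (!this.remaining.hasNext()) {
      throw new Exception("No incidents remain.");
    }
    return this.remaining.next();
  }
}

class IterationDemo {
  // Visit every incident, checking the predicate before each call.
  static int countAll(TitleSource source) throws Exception {
    int count = 0;
    while (source.hasMoreIncidents()) {
      source.nextIncident();
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws Exception {
    TitleSource source =
        new ListTitleSource(Arrays.asList("Flood", "Road closed"));
    System.out.println(countAll(source));  // prints 2

    // The source is now exhausted, so a further call throws.
    try {
      source.nextIncident();
    } catch (Exception e) {
      System.out.println(e.getMessage());  // prints No incidents remain.
    }
  }
}
```

A careful client checks hasMoreIncidents before each call and never sees the exception; the exception is there for clients that forget.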

That's a start. We have two methods. What else?

We need a way to supply information on the server and connect to the server. Those could be one or two methods (probably one, since it doesn't make sense to change the server after you've connected, and it doesn't make sense to connect unless you've already specified the server). But it might be simpler to supply that information as a parameter to the constructor - after all, it doesn't make sense to grab information from the server until you have a server.

But wait! We're designing an interface. Interfaces don't have constructors. So, what should we do? We could give up on having a separate interface, but that might limit some of our other choices. We'll just have to be extra careful to remember that the corresponding class needs to connect to the server.

Some clients might want to filter information, such as getting only incidents within a certain distance of a specified location, or only recent incidents, or something similar. To support such approaches, we need a way to represent those criteria. And there are a host of possible criteria. Representing selection criteria is complicated, and probably beyond the scope of this course.

As functional programmers, we might be tempted to pass a predicate to nextIncident and hasMoreIncidents and have them return the next incident that meets the predicate or information on whether such an incident exists. Until Java 8, that required that we design an appropriate structure for predicates. However, as of Java 8, there is a Predicate interface in Java. Still, predicates are somewhat complicated to think about, so we'll make them optional.
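For the curious, the following sketch shows roughly what filtering with java.util.function.Predicate can look like. It is not the API's own filtering code: the class PredicateDemo and the method select are hypothetical, and strings stand in for incidents. The predicates are written as lambda expressions, which we have not yet covered, so treat the syntax as a preview.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical demo; Strings stand in for incidents.
class PredicateDemo {
  // Keep only the titles that satisfy the given criterion.
  static List<String> select(List<String> titles,
                             Predicate<String> criterion) {
    List<String> result = new ArrayList<>();
    for (String title : titles) {
      if (criterion.test(title)) {
        result.add(title);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> titles = Arrays.asList(
        "Flood on Main St", "Road closed", "Flood warning lifted");

    // A predicate is an object with a test method that returns a boolean.
    Predicate<String> mentionsFlood = title -> title.contains("Flood");

    // Two of the three titles mention a flood.
    System.out.println(select(titles, mentionsFlood).size());  // prints 2

    // Predicates can be combined; negate() inverts the test.
    System.out.println(select(titles, mentionsFlood.negate()));
    // prints [Road closed]
  }
}
```

One appeal of this design is that the selection criteria live with the caller, so the client interface does not need a separate method for each kind of filter.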

Are there other reasonable methods? Some clients might want to take all of the information in one fell swoop, rather than grabbing information a piece at a time. So we could provide a getIncidents method. Such a method is somewhat dangerous, particularly because we don't know how many incidents are available, but we'll include it for users who want to think about data in array form.

  /**
   * Get all of the incidents associated with this instance.
   * It is generally nicer to grab incidents one-by-one, but
   * getting all of them can be useful for certain circumstances.
   *
   * @return
   *            An array of incidents, in no specified order.
   */
  public UshahidiIncident[] getIncidents();

A Web Implementation

We have an interface. What next? We need to implement it. And the obvious implementation involves connecting to an Ushahidi server over the Web. So we'll call the class UshahidiWebClient.

/**
 * An Ushahidi client that gets data from the Web. The most typical form of
 * Ushahidi client.
 * 
 * @version 0.4.1 of 24 September 2014
 * @author Samuel A. Rebelsky
 * @author Daniel Torres
 */
public class UshahidiWebClient
    implements UshahidiClient

The details of the implementation are beyond the scope of this reading, which is intended to focus on overall design and what you can do with these various classes. However, we should note that we need to follow through on our promises from above and provide an appropriate constructor.

  /**
   * Create a new client that connects to server to obtain data.
   * 
   * @param server
   *            A string that gives the prefix of the URL, including the
   *            protocol and the hostname. For example,
   *            https://farmersmarket.crowdmap.com/
   * @exception Exception
   *                when we cannot connect to the server.
   * @pre server must be a valid URL for an Ushahidi server.
   * @post The server is available for obtaining values.
   */
  public UshahidiWebClient(String server) throws Exception

When you explore the documentation for UshahidiWebClient, you'll note that there is an additional constructor and an additional method for people who want to customize interaction with the Ushahidi server. Everyone else can just use the standard UshahidiClient methods.

A Test Harness

Aren't we done? Almost. But let's think about some good programming practice. When we're developing programs that use Ushahidi data, we want to be able to design tests. We also want to be polite users of Ushahidi servers. Hence, we should provide a way to support simulated Ushahidi servers. We're calling the simulated servers “testing clients”, although that's a bit ambiguous, since they provide code to support the testing of clients, rather than serving as clients that test (or things that test clients).

So, what is an UshahidiTestingClient? It's an object that provides an addIncident method so that we can add specific incidents to use for testing. It also implements the UshahidiClient interface, so that we can use it in place of an UshahidiWebClient during development. You will, of course, explore more of these issues in the corresponding lab.
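The idea can be sketched in a few lines. The code below is not the library's UshahidiTestingClient; it is a simplified, hypothetical version (MiniClient, MiniTestingClient, and Analysis are all invented names) in which strings stand in for incidents. What it is meant to show is that code written against the interface cannot tell a simulated source from a real one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified client interface: Strings stand in for incidents.
interface MiniClient {
  boolean hasMoreIncidents();

  String nextIncident() throws Exception;
}

// A simulated source of incidents that we fill by hand for testing.
class MiniTestingClient implements MiniClient {
  private final List<String> incidents = new ArrayList<>();
  private int position = 0;  // index of the next unseen incident

  // The extra method that a testing client adds beyond the interface.
  void addIncident(String incident) {
    this.incidents.add(incident);
  }

  public boolean hasMoreIncidents() {
    return this.position < this.incidents.size();
  }

  public String nextIncident() throws Exception {
    if (!hasMoreIncidents()) {
      throw new Exception("No incidents remain.");
    }
    return this.incidents.get(this.position++);
  }
}

class Analysis {
  // Depends only on the interface, so it works with any implementation.
  static int longestTitleLength(MiniClient client) throws Exception {
    int longest = 0;
    while (client.hasMoreIncidents()) {
      longest = Math.max(longest, client.nextIncident().length());
    }
    return longest;
  }

  public static void main(String[] args) throws Exception {
    MiniTestingClient client = new MiniTestingClient();
    client.addIncident("Flood");
    client.addIncident("Road closed");
    System.out.println(longestTitleLength(client));  // prints 11
  }
}
```

Since we chose the incidents ourselves, we know what answer the analysis should give, and we never touch a real server while developing.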

Wrapping Up

We've gone through a variety of issues. What should you take away from this discussion? There are two broad categories of issues: You should have learned a few things about the design of a simple Ushahidi Web API that you will soon use to interact with Ushahidi servers, and you might have learned a few things about designing APIs along the way.

Things you should remember about the API.

  • Individual incidents are represented by the UshahidiIncident class.
  • You get incidents from objects that implement the UshahidiClient interface.
  • Such objects provide two important methods: nextIncident and hasMoreIncidents.
  • When you want to get incidents from a Web site, you use UshahidiWebClient objects.
  • When you want to write experimental code and know exactly what incidents you'll be dealing with, you use UshahidiTestingClient objects.

What might you have learned beyond that? Here are some possibilities.

  • When developing an object with lots of fields, you should think about whether some fields can be grouped together. If so, you might make other classes for those groups of fields.
  • When appropriate, you should use classes that come as part of the standard Java API.
  • When designing objects that store data, you should consider whether or not those objects should be mutable.
  • Interfaces can help you think more about what you want to do than how you do it. Interfaces can also help you build frameworks for testing or experimentation.

I hope that you can also add many more items to this list.

Additional References

Information on the Ushahidi client Web API, which underlies our Web implementation, can be found at https://wiki.ushahidi.com/display/WIKI/Ushahidi+Public+API.

Documentation for the Simple Java Ushahidi API, which instantiates the design decisions described in this document, can be found at http://www.cs.grinnell.edu/~rebelsky/Glimmer/Ushahidi/docs/.

Source code for the Simple Java Ushahidi API can be found at https://github.com/CSG-CS2/simple-ushahidi-api.

Information on the new Predicate interface can be found at http://docs.oracle.com/javase/8/docs/api/java/util/function/Predicate.html.

Some basic information on using the new predicates can also be found in the Java Tutorials trail on Lambda Expressions at http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html, although you may just want to wait until we cover lambda expressions to consider those issues.