3.9 Running the server

The functionality of the server should be defined in one Prolog file (this file may of course load other files). Depending on the desired server setup, this `body' is wrapped into a small Prolog file that combines the body with the appropriate server interface. There are three supported server setups. For most applications we advise the multi-threaded server. Examples of this server architecture are the PlDoc documentation system and the SeRQL Semantic Web server infrastructure.

All the server setups may be wrapped in a reverse proxy to make them available from the public web-server as described in section 3.9.7.

3.9.1 Common server interface options

All the server interfaces provide http_server(:Goal, +Options) to create the server. The list of options differs, but the servers share common options:

port(?Port)
Specify the port to listen to for stand-alone servers. Port is either an integer or unbound. If unbound, it is unified to the selected free port.
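
As a minimal sketch of the unbound-port behaviour, the code below starts a server and lets it pick a free port. It assumes the multi-threaded library described in section 3.9.2 and uses http_dispatch/1 as the goal; start_server/1 is a hypothetical helper name, not part of the library.

```prolog
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).

% start_server(?Port) is a hypothetical helper.
% If Port is unbound on entry, it is unified with the free port
% selected by the server.
start_server(Port) :-
    http_server(http_dispatch, [port(Port)]).
```

Calling start_server(Port) with Port unbound should leave Port bound to an integer; calling start_server(8080) requests that specific port instead.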

3.9.2 Multi-threaded Prolog

The library(http/thread_httpd.pl) provides the infrastructure to manage multiple clients using a pool of worker threads. This realises a popular server design, also seen in Java Tomcat and Microsoft .NET. As a single persistent server process maintains communication to all clients, startup time is not an important issue and the server can easily maintain state information for all clients.

In addition to the functionality provided by the other (XPCE and inetd) servers, the threaded server can also be used to realise an HTTPS server exploiting the library(ssl) library. See option ssl(+SSLOptions) below.

http_server(:Goal, +Options)
Create the server. Options must provide the port(?Port) option to specify the port the server should listen to. If Port is unbound, an arbitrary free port is selected and Port is unified to this port number. The server consists of a small Prolog thread accepting new connections on Port and dispatching these to a pool of workers. Defined Options are:
port(?Port)
Port the server should listen to. If unbound Port is unified with the selected free port.
workers(+N)
Defines the number of worker threads in the pool. Default is to use two workers. Choosing the optimal value for best performance is a difficult task that depends on the number of CPUs in your system and on the resources required to process a request. Too many workers make your system switch between threads too often, or even swap if there is not enough memory to keep all threads in memory, while too few workers cause clients to wait unnecessarily for other clients to complete. See also http_workers/2.
timeout(+SecondsOrInfinite)
Determines the maximum period of inactivity while handling a request. If no data arrives within the specified time since the last data arrived, the connection raises an exception, the worker discards the client and returns to the pool queue for a new client. Default is infinite, which means a worker may wait forever on a client that doesn't complete its request.
keep_alive_timeout(+SecondsOrInfinite)
Maximum time to wait for new activity on Keep-Alive connections. Choosing the correct value for this parameter is hard. Disabling Keep-Alive is bad for performance if the clients request multiple documents for a single page. This may, for example, be caused by HTML frames, HTML pages with images, associated CSS files, etc. Keeping a connection open in the threaded model, however, prevents the thread servicing the client from servicing other clients. The default is 5 seconds.
local(+KBytes)
Size of the local-stack for the workers. Default is taken from the commandline option.
global(+KBytes)
Size of the global-stack for the workers. Default is taken from the commandline option.
trail(+KBytes)
Size of the trail-stack for the workers. Default is taken from the commandline option.
ssl(+SSLOptions)
Use SSL (Secure Sockets Layer) rather than plain TCP/IP. A server created this way is accessed using the https:// protocol. SSL allows for encrypted communication to prevent others from tapping the wire, as well as improved authentication of client and server. The SSLOptions option list is passed to ssl_init/3. The port option of the main option list is forwarded to the SSL layer. See the library(ssl) library for details.
http_server_property(?Port, ?Property)
True if Property is a property of the HTTP server running at Port. Defined properties are:
goal(:Goal)
Goal used to start the server. This is often http_dispatch/1.
start_time(?Time)
Time-stamp when the server was created. See format_time/3 for creating a human-readable representation.
http_workers(:Port, ?Workers)
Query or manipulate the number of workers of the server identified by Port. If Workers is unbound, it is unified with the number of running workers. If it is an integer greater than the current size of the worker pool, new workers are created with the same specification as the running workers. If the number is less than the current size of the worker pool, this predicate inserts a number of `quit' requests in the queue, discarding the excess workers as they finish their jobs (i.e. no worker is abandoned while serving a client).

This can be used to tune the number of workers for performance. Another possible application is to reduce the pool to one worker to facilitate easier debugging.
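
The sketch below illustrates both uses: it reports the start time and current pool size of a running server, and shrinks the pool to one worker for debugging. The helper names server_status/1 and debug_mode/1 are hypothetical; only http_server_property/2, http_workers/2 and format_time/3 come from the text above.

```prolog
:- use_module(library(http/thread_httpd)).

% server_status(+Port) is a hypothetical helper: print when the server
% at Port was started and how many workers it currently has.
server_status(Port) :-
    http_server_property(Port, start_time(Stamp)),
    format_time(atom(Started), '%Y-%m-%d %H:%M:%S', Stamp),
    http_workers(Port, Workers),          % Workers unbound: query
    format("Port ~w: started ~w, ~d workers~n",
           [Port, Started, Workers]).

% debug_mode(+Port) is a hypothetical helper: reduce the pool to a
% single worker. Excess workers quit once they finish their current job.
debug_mode(Port) :-
    http_workers(Port, 1).
```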

http_stop_server(+Port, +Options)
Stop the HTTP server at Port. Halting a server is done gracefully, which means that requests being processed are not abandoned. The Options list is for future refinements of this predicate such as a forced immediate abort of the server, but is currently ignored.
http_current_worker(?Port, ?ThreadID)
True if ThreadID is the identifier of a Prolog thread serving Port. This predicate is intended to allow arbitrary interaction with the worker threads for development and statistics.
http_spawn(:Goal, +Spec)
Continue handling this request in a new thread running Goal. After http_spawn/2, the worker returns to the pool to process new requests. In its simplest form, Spec is the name of a thread pool as defined by thread_pool_create/3. Alternatively it is an option list, whose options are passed to thread_create_in_pool/4 if Spec contains pool(Pool) or to thread_create/3 if the pool option is not present. If the dispatch module is used (see section 3.2), spawning is normally specified as an option to the http_handler/3 registration.

We recommend the use of thread pools. They allow registration of a set of threads using common characteristics, specify how many can be active and what to do if all threads are active. A typical application may define a small pool of threads with large stacks for computation-intensive tasks, and a large pool of threads with small stacks to serve media. The declaration could be the one below, allowing for at most 3 concurrent solvers with a maximum backlog of 5, and 30 tasks creating image thumbnails.

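:- use_module(library(thread_pool)).
:- use_module(library(http/http_dispatch)).   % provides http_handler/3

% Small pool with large stacks for computation-intensive requests.
:- thread_pool_create(compute, 3,
                      [ local(20000), global(100000), trail(50000),
                        backlog(5)
                      ]).
% Large pool with small stacks for serving media.
:- thread_pool_create(media, 30,
                      [ local(100), global(100), trail(100),
                        backlog(100)
                      ]).

% Handle each request in a thread taken from the named pool.
:- http_handler('/solve',     solve,     [spawn(compute)]).
:- http_handler('/thumbnail', thumbnail, [spawn(media)]).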
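
The declaration above refers to the handlers solve/1 and thumbnail/1 without defining them. A minimal sketch of what such handlers could look like is given below; the bodies are placeholders and purely illustrative. Each handler receives the parsed request and writes a CGI-style reply (header lines, a blank line, then the content) to the current output.

```prolog
% Placeholder handler for '/solve' (illustrative only). A real
% implementation would run the computation-intensive task here.
solve(_Request) :-
    format("Content-type: text/plain~n~n"),
    format("solution goes here~n").

% Placeholder handler for '/thumbnail' (illustrative only). A real
% implementation would emit image data with a matching Content-type.
thumbnail(_Request) :-
    format("Content-type: text/plain~n~n"),
    format("thumbnail goes here~n").
```

Because spawning is declared in the http_handler/3 options, the handler bodies themselves do not need to know in which pool they run.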

3.9.3 From an interactive Prolog session using XPCE

The library(http/xpce_httpd.pl) provides the infrastructure to manage multiple clients with an event-driven control structure. This version can be started from an interactive Prolog session, providing a comfortable infrastructure to debug the body of your server. It also allows the combination of an (XPCE-based) GUI with web technology in one application.

http_server(:Goal, +Options)
Create an instance of interactive_httpd. Options must provide the port(?Port) option to specify the port the server should listen to. If Port is unbound, an arbitrary free port is selected and Port is unified to this port number. Currently no other options are defined.

The file demo_xpce gives a typical example of this wrapper, assuming demo_body defines the predicate reply/1.

:- use_module(xpce_httpd).
:- use_module(demo_body).

server(Port) :-
        http_server(reply, [port(Port)]).

The created server opens a server socket at the selected address and waits for incoming connections. On each accepted connection it collects input until an HTTP request is complete. Then it opens an input stream on the collected data and using the output stream directed to the XPCE socket it calls http_wrapper/5. This approach is fundamentally different compared to the other approaches:

3.9.4 From (Unix) inetd

All modern Unix systems handle a large number of the services they run through the super-server inetd. This program reads /etc/inetd.conf and opens server-sockets on all ports defined in this file. As a request comes in it accepts it and starts the associated server such that standard I/O refers to the socket. This approach has several advantages:

The very small generic script for handling inetd based connections is in inetd_httpd, defining http_server/2:

http_server(:Goal, +Options)
Initialises and runs http_wrapper/5 in a loop until failure or end-of-file. This server does not support the Port option as the port is specified with the inetd configuration. The only supported option is After.

Here is the example from demo_inetd:

#!/usr/bin/pl -t main -q -f
:- use_module(demo_body).
:- use_module(inetd_httpd).

main :-
        http_server(reply, []).

With the above file installed in /home/jan/plhttp/demo_inetd, the following line in /etc/inetd.conf enables the server at port 4001 guarded by tcpwrappers. After modifying inetd.conf, send the inetd daemon the HUP signal to make it reload its configuration. For more information, please check inetd.conf(5).

4001 stream tcp nowait nobody /usr/sbin/tcpd /home/jan/plhttp/demo_inetd

3.9.5 MS-Windows

There are rumours that inetd has been ported to Windows.

3.9.6 As CGI script

To be done.

3.9.7 Using a reverse proxy

There are three options for public deployment of a service. One is to run it on a dedicated machine on port 80, the standard HTTP port. The machine may be a virtual machine running, for example, under VMware or Xen. The (virtual) machine approach isolates security threats and allows for using a standard port. The server can also be hosted on a non-standard port such as 8000 or 8080. Using non-standard ports, however, may cause problems with intermediate proxy and/or firewall policies. Isolation can be achieved using a Unix chroot environment. Another option, also recommended for Tomcat servers, is the use of Apache reverse proxies. This causes the main web-server to relay requests below a given URL location to our Prolog based server. This approach has several advantages:

Note that the proxy technology can be combined with isolation methods such as dedicated machines, virtual machines and chroot jails. The proxy can also provide load balancing.

Setting up a reverse proxy

The Apache reverse proxy setup is really simple. Ensure the modules proxy and proxy_http are loaded. Then add two simple rules to the server configuration. Below is an example that makes a PlDoc server on port 4000 available from the main Apache server at port 80.

ProxyPass        /pldoc/ http://localhost:4000/pldoc/
ProxyPassReverse /pldoc/ http://localhost:4000/pldoc/

Apache rewrites the HTTP headers passing by, but using the above rules it does not examine the content. This implies that URLs embedded in the (HTML) content must use relative addressing. If the locations on the public and Prolog server are the same (as in the example above), absolute locations without a host part may be used: /pldoc/search is ok, but http://myhost.com:4000/pldoc/search is not. If the locations on the two servers differ, locations must be relative (i.e. not start with /).
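
To make the addressing rules concrete, the sketch below generates a page containing both kinds of link, using library(http/html_write). The handler location /pldoc/index, the predicate index_page/1 and the link targets are hypothetical and chosen only to match the proxy example above.

```prolog
:- use_module(library(http/http_dispatch)).
:- use_module(library(http/html_write)).

% Hypothetical handler location, chosen to match the proxy rules above.
:- http_handler('/pldoc/index', index_page, []).

% index_page(+Request) is a hypothetical handler emitting two links.
index_page(_Request) :-
    reply_html_page(
        title('Index'),
        [ % Relative link: resolved by the browser against the page URL,
          % so it keeps working whatever prefix the proxy uses.
          p(a(href('search'), 'Search (relative link)')),
          % Absolute location without host: only correct when the public
          % and Prolog servers use the same location (/pldoc/ here).
          p(a(href('/pldoc/search'), 'Search (absolute location)'))
        ]).
```

A link that includes the internal host and port, such as http://myhost.com:4000/pldoc/search, would bypass the proxy and should not appear in generated pages.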

This problem can also be solved using the contributed Apache module proxy_html, which can be instructed to rewrite URLs embedded in HTML documents. In our experience, this is not trouble-free, as URLs can appear in many places in generated documents. JavaScript can create URLs on the fly, which makes rewriting virtually impossible.