3.9 Running the server
The functionality of the server should be defined in one Prolog file (of course this file is allowed to load other files). Depending on the desired server setup, this `body' is wrapped into a small Prolog file combining the body with the appropriate server interface. There are three supported server setups. For most applications we advise the multi-threaded server. Examples of this server architecture are the PlDoc documentation system and the SeRQL Semantic Web server infrastructure.
All the server setups may be wrapped in a reverse proxy to make them available from the public web-server as described in section 3.9.7.
- Using library(thread_httpd) for a multi-threaded server
This server exploits the multi-threaded version of SWI-Prolog, running the user's body code in parallel from a pool of worker threads. As it avoids the state engine and copying required in the event-driven server, it is generally faster and capable of handling multiple requests concurrently.
This server is harder to debug due to the involved threading, although the GUI tracer provides reasonable support for multi-threaded applications using the tspy/1 command. It can provide fast communication to multiple clients and can be used for more demanding servers.
- Using library(xpce_httpd) for an event-driven server
This approach provides a single-threaded event-driven application. The clients talk to XPCE sockets that collect an HTTP request. The server infrastructure can talk to multiple clients simultaneously, but once a request is complete the wrappers call the user's goal and block all further activity until the request is handled. Requests from multiple clients are thus fully serialised in one Prolog process.
This server setup is very suitable for debugging as well as for embedded servers in simple applications in a fairly controlled environment.
- Using library(inetd_httpd) for server-per-client
In this setup the Unix inetd user-daemon is used to initialise a server for each connection. This approach is especially suitable for servers that have a limited startup-time. In this setup a crashing client does not influence other requests.
This server is very hard to debug as the server is not connected to the user environment. It provides a robust implementation for servers that can be started quickly.
3.9.1 Common server interface options
All the server interfaces provide http_server(:Goal, +Options)
to create the server. The list of options differs, but the servers share
common options:
- port(?Port)
- Specify the port to listen to for stand-alone servers. Port is either an integer or unbound. If unbound, it is unified to the selected free port.
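The behaviour of an unbound Port can be illustrated with a minimal sketch. It assumes the multi-threaded server from library(http/thread_httpd) and uses http_dispatch/1 as the goal; the helper name start_on_free_port/1 is hypothetical and only serves this illustration.

```prolog
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).

% start_on_free_port(-Port): hypothetical helper, not part of the library.
% Port is left unbound in the option list, so http_server/2 selects a
% free port and unifies Port with the port number it picked.
start_on_free_port(Port) :-
    http_server(http_dispatch, [port(Port)]),
    format("Server listening on port ~w~n", [Port]).
```

Calling start_on_free_port(P) from the toplevel should bind P to the selected port, which can then be passed on to, for example, http_stop_server/2.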
3.9.2 Multi-threaded Prolog
The library(http/thread_httpd.pl) provides the
infrastructure to manage multiple clients using a pool of worker-threads.
This realises a popular server design, also seen in Java Tomcat and
Microsoft .NET. As a single persistent server process maintains
communication to all clients, startup time is not an important issue and
the server can easily maintain state information for all clients.
In addition to the functionality provided by the other (XPCE and
inetd) servers, the threaded server can also be used to realise an HTTPS
server exploiting the library(ssl) library. See option
ssl(+SSLOptions) below.
- http_server(:Goal, +Options)
- Create the server. Options must provide the port(?Port) option to specify the port the server should listen to. If Port is unbound, an arbitrary free port is selected and Port is unified with this port number. The server consists of a small Prolog thread accepting new connections on Port and dispatching these to a pool of workers. Defined Options are:
- port(?Port)
- Port the server should listen to. If unbound Port is unified with the selected free port.
- workers(+N)
- Defines the number of worker threads in the pool. Default is to use two workers. Choosing the optimal value for best performance is a difficult task depending on the number of CPUs in your system and how many resources are required for processing a request. Too high a number makes your system switch too often between threads or even swap if there is not enough memory to keep all threads in memory, while too low a number causes clients to wait unnecessarily for other clients to complete. See also http_workers/2.
- timeout(+SecondsOrInfinite)
- Determines the maximum period of inactivity handling a request. If no
data arrives within the specified time since the last data arrived, the
connection raises an exception, the worker discards the client and
returns to the pool-queue for a new client. Default is
infinite, making each worker wait forever for a request to complete. Without a timeout, a worker may wait forever on a client that doesn't complete its request.
- keep_alive_timeout(+SecondsOrInfinite)
- Maximum time to wait for new activity on Keep-Alive connections. Choosing the correct value for this parameter is hard. Disabling Keep-Alive is bad for performance if the clients request multiple documents for a single page. This may, for example, be caused by HTML frames, HTML pages with images, associated CSS files, etc. Keeping a connection open in the threaded model, however, prevents the thread servicing the client from servicing other clients. The default is 5 seconds.
- local(+KBytes)
- Size of the local-stack for the workers. Default is taken from the commandline option.
- global(+KBytes)
- Size of the global-stack for the workers. Default is taken from the commandline option.
- trail(+KBytes)
- Size of the trail-stack for the workers. Default is taken from the commandline option.
- ssl(+SSLOptions)
- Use SSL (Secure Socket Layer) rather than plain TCP/IP. A server created
this way is accessed using the https:// protocol. SSL allows for encrypted communication to prevent others from tapping the wire as well as improved authentication of client and server. The SSLOptions option list is passed to ssl_init/3. The port option of the main option list is forwarded to the SSL layer. See the library(ssl) library for details.
- http_server_property(?Port, ?Property)
- True if Property is a property of the HTTP server running at
Port. Defined properties are:
- goal(:Goal)
- Goal used to start the server. This is often http_dispatch/1.
- start_time(?Time)
- Time-stamp when the server was created. See format_time/3 for creating a human-readable representation.
- http_workers(:Port, ?Workers)
- Query or manipulate the number of workers of the server identified by
Port. If Workers is unbound it is unified with the
number of running workers. If it is an integer greater than the current
size of the worker pool new workers are created with the same
specification as the running workers. If the number is less than the
current size of the worker pool, this predicate inserts a number of
`quit' requests in the queue, discarding the excess workers as they
finish their jobs (i.e. no worker is abandoned while serving a client).
This can be used to tune the number of workers for performance. Another possible application is to reduce the pool to one worker to facilitate easier debugging.
- http_stop_server(+Port, +Options)
- Stop the HTTP server at Port. Halting a server is done gracefully, which means that requests being processed are not abandoned. The Options list is for future refinements of this predicate such as a forced immediate abort of the server, but is currently ignored.
- http_current_worker(?Port, ?ThreadID)
- True if ThreadID is the identifier of a Prolog thread serving Port. This predicate is intended to allow arbitrary interaction with the worker threads for development and statistics.
- http_spawn(:Goal, +Spec)
- Continue handling this request in a new thread running Goal.
After
http_spawn/2,
the worker returns to the pool to process new requests. In its simplest
form, Spec is the name of a thread pool as defined by
thread_pool_create/3.
Alternatively it is an option list, whose options are passed to thread_create_in_pool/4
if Spec contains pool(Pool) or to thread_create/3 if the pool option is not present. If the dispatch module is used (see section 3.2), spawning is normally specified as an option to the http_handler/3 registration.
We recommend the use of thread pools. They allow registration of a set of threads using common characteristics, specify how many can be active and what to do if all threads are active. A typical application may define a small pool of threads with large stacks for computation-intensive tasks, and a large pool of threads with small stacks to serve media. The declaration could be the one below, allowing for at most 3 concurrent solvers with a maximum backlog of 5, and 30 tasks creating image thumbnails.
:- use_module(library(thread_pool)).

:- thread_pool_create(compute, 3,
                      [ local(20000), global(100000), trail(50000),
                        backlog(5)
                      ]).
:- thread_pool_create(media, 30,
                      [ local(100), global(100), trail(100),
                        backlog(100)
                      ]).

:- http_handler('/solve',     solve,     [spawn(compute)]).
:- http_handler('/thumbnail', thumbnail, [spawn(media)]).
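The predicates of this section can be combined into a small server life cycle. The sketch below is an illustration only: the handler location /ping, the predicate names start/1, report/1, debug_mode/1 and stop/1, and the chosen option values are assumptions made for the example, not recommendations.

```prolog
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).

% Hypothetical handler, used only to give the server something to do.
:- http_handler('/ping', ping, []).

ping(_Request) :-
    format("Content-type: text/plain~n~n"),
    format("pong~n").

% start(?Port): create a server with four workers. Clients that stay
% silent for 60 seconds while sending a request are discarded, and an
% idle Keep-Alive connection is released after 2 seconds.
start(Port) :-
    http_server(http_dispatch,
                [ port(Port),
                  workers(4),
                  timeout(60),
                  keep_alive_timeout(2)
                ]).

% report(+Port): print the creation time and the current pool size.
report(Port) :-
    http_server_property(Port, start_time(Time)),
    http_workers(Port, Workers),
    format_time(atom(Started), '%F %T', Time),
    format("Started ~w, ~d workers~n", [Started, Workers]).

% debug_mode(+Port): shrink the pool to one worker to ease debugging.
debug_mode(Port) :-
    http_workers(Port, 1).

% stop(+Port): stop the server gracefully.
stop(Port) :-
    http_stop_server(Port, []).
```

A typical session would call start(8080), inspect the server with report(8080), optionally switch to debug_mode(8080) while tracing a handler, and finally call stop(8080).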
3.9.3 From an interactive Prolog session using XPCE
The library(http/xpce_httpd.pl) provides the
infrastructure to manage multiple clients with an event-driven
control-structure. This version can be started from an interactive
Prolog session, providing a comfortable infrastructure to debug the
body of your server. It also allows the combination of an (XPCE-based)
GUI with web-technology in one application.
- http_server(:Goal, +Options)
- Create an instance of interactive_httpd. Options must provide the port(?Port) option to specify the port the server should listen to. If Port is unbound, an arbitrary free port is selected and Port is unified with this port number. Currently no other options are defined.
The file demo_xpce gives a typical example of this
wrapper, assuming demo_body defines the predicate reply/1.
:- use_module(xpce_httpd).
:- use_module(demo_body).
server(Port) :-
    % The port is passed through the port(?Port) option of http_server/2.
    http_server(reply, [port(Port)]).
The created server opens a server socket at the selected address and waits for incoming connections. On each accepted connection it collects input until an HTTP request is complete. Then it opens an input stream on the collected data and, using the output stream directed to the XPCE socket, it calls http_wrapper/5. This approach is fundamentally different from the other approaches:
- Server can handle multiple connections
Where inetd starts a server for each client and CGI starts a server for each request, this approach starts a single server handling multiple clients.
- Requests are serialised
All calls to Goal are fully serialised; processing on behalf of a new client can only start after all previous requests are answered. This is easier and quite acceptable if the server is mostly inactive and requests do not take long to process.
- Lifetime of the server
The server lives as long as Prolog runs.
3.9.4 From (Unix) inetd
All modern Unix systems handle a large number of the services they
run through the super-server inetd. This program reads
/etc/inetd.conf and opens server-sockets on all ports
defined in this file. As a request comes in it accepts it and starts the
associated server such that standard I/O refers to the socket. This
approach has several advantages:
- Simplification of servers
Servers don't have to know about sockets and socket operations.
- Centralised authorisation
Using tcpwrappers, simple and effective firewalling of all services is realised.
- Automatic start and monitor
The inetd automatically starts the server `just-in-time' and starts additional servers or restarts a crashed server according to the specifications.
The very small generic script for handling inetd based connections is
in inetd_httpd, defining http_server/1:
- http_server(:Goal, +Options)
- Initialises and runs http_wrapper/5 in a loop until failure or end-of-file. This server does not support the Port option as the port is specified with the inetd configuration. The only supported option is After.
Here is the example from demo_inetd
#!/usr/bin/pl -t main -q -f
:- use_module(demo_body).
:- use_module(inetd_httpd).
main :-
    http_server(reply).
With the above file installed in /home/jan/plhttp/demo_inetd,
the following line in /etc/inetd.conf enables the server at port
4001 guarded by tcpwrappers. After modifying /etc/inetd.conf, send the
inetd daemon the HUP signal to make it reload its configuration.
For more information, please check inetd.conf(5).
4001 stream tcp nowait nobody /usr/sbin/tcpd /home/jan/plhttp/demo_inetd
3.9.5 MS-Windows
There are rumours that inetd has been ported to Windows.
3.9.6 As CGI script
To be done.
3.9.7 Using a reverse proxy
There are three options for public deployment of a service. One is to run it on a dedicated machine on port 80, the standard HTTP port. The machine may be a virtual machine running, for example, under VMware or Xen. The (virtual) machine approach isolates security threats and allows for using a standard port. The server can also be hosted on a non-standard port such as 8000 or 8080. Using non-standard ports, however, may cause problems with intermediate proxy and/or firewall policies. Isolation can be achieved using a Unix chroot environment. Another option, also recommended for Tomcat servers, is the use of Apache reverse proxies. This causes the main web-server to relay requests below a given URL location to our Prolog based server. This approach has several advantages:
- We can access the server on port 80, just as for a dedicated machine. We do not need a machine though and we only need access to the Apache configuration.
- As Apache is doing the front-line service, the Prolog server is normally protected from malformed HTTP requests that could result in denial of service or otherwise compromise the server. In addition, Apache can provide encodings such as compression to the outside world.
Note that the proxy technology can be combined with isolation methods such as dedicated machines, virtual machines and chroot jails. The proxy can also provide load balancing.
Setting up a reverse proxy
The Apache reverse proxy setup is really simple. Ensure the modules
proxy and proxy_http are loaded. Then add two
simple rules to the server configuration. Below is an example that makes
a PlDoc server on port 4000 available from the main Apache server at
port 80.
ProxyPass        /pldoc/ http://localhost:4000/pldoc/
ProxyPassReverse /pldoc/ http://localhost:4000/pldoc/
Apache rewrites the HTTP headers passing by, but using the above
rules it does not examine the content. This implies that URLs embedded
in the (HTML) content must use relative addressing. If the locations on
the public and Prolog server are the same (as in the example above) it
is allowed to use absolute locations. I.e. /pldoc/search is
ok, but http://myhost.com:4000/pldoc/search is not.
If the locations on the server differ, locations must be relative (i.e. not
start with /).
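The addressing rules can be illustrated with a small sketch of a Prolog server that is safe to place behind the proxy. It assumes the dispatch module and library(http/html_write); the handler locations /pldoc/index and /pldoc/search and the predicate names are hypothetical and chosen only to match the /pldoc/ location of the example rules.

```prolog
:- use_module(library(http/thread_httpd)).
:- use_module(library(http/http_dispatch)).
:- use_module(library(http/html_write)).

% Hypothetical handlers. Both live below /pldoc/, so the location is
% the same on the public server and on the Prolog server.
:- http_handler('/pldoc/index',  index_page,  []).
:- http_handler('/pldoc/search', search_page, []).

index_page(_Request) :-
    reply_html_page(
        title('Index'),
        [ h1('Index'),
          % Relative link: the browser resolves it against
          % /pldoc/index, giving /pldoc/search on either server.
          p(a(href(search), 'Search (relative link)')),
          % Absolute location without host and port: acceptable only
          % because both servers use the same /pldoc/ location.
          p(a(href('/pldoc/search'), 'Search (absolute location)'))
        ]).

search_page(_Request) :-
    reply_html_page(title('Search'), [ p('Search page placeholder') ]).

% Port 4000 matches the target of the proxy rules in the example.
server :-
    http_server(http_dispatch, [port(4000)]).
```

Neither link mentions a host name or port, so the pages should work both when fetched directly from port 4000 and when fetched through the Apache server on port 80.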
This problem can also be solved using the contributed Apache module
proxy_html that can be instructed to rewrite URLs embedded
in HTML documents. In our experience, this is not trouble-free as URLs
can appear in many places in generated documents. JavaScript can create
URLs on the fly, which makes rewriting virtually impossible.