/*  Part of SWI-Prolog

    Author:        Jan Wielemaker
    E-mail:        J.Wielemaker@vu.nl
    WWW:           http://www.swi-prolog.org
    Copyright (c)  2002-2023, University of Amsterdam
                              VU University Amsterdam
                              CWI, Amsterdam
                              SWI-Prolog Solutions b.v.
    All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    1. Redistributions of source code must retain the above copyright
       notice, this list of conditions and the following disclaimer.

    2. Redistributions in binary form must reproduce the above copyright
       notice, this list of conditions and the following disclaimer in
       the documentation and/or other materials provided with the
       distribution.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
    COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE.
*/

:- module(http_open,
          [ http_open/3,                % +URL, -Stream, +Options
            http_set_authorization/2,   % +URL, +Authorization
            http_close_keep_alive/1     % +Address
          ]).
:- autoload(library(aggregate),[aggregate_all/3]).
:- autoload(library(apply),[foldl/4,include/3]).
:- autoload(library(base64),[base64/3]).
:- autoload(library(debug),[debug/3,debugging/1]).
:- autoload(library(error),
            [ domain_error/2, must_be/2, existence_error/2, instantiation_error/1
            ]).
:- autoload(library(lists),[last/2,member/2]).
:- autoload(library(option),
            [ meta_options/3, option/2, select_option/4, merge_options/3,
              option/3, select_option/3
            ]).
:- autoload(library(readutil),[read_line_to_codes/2]).
:- autoload(library(uri),
            [ uri_resolve/3, uri_components/2, uri_data/3,
              uri_authority_components/2, uri_authority_data/3,
              uri_encoded/3, uri_query_components/2, uri_is_global/1
            ]).
:- autoload(library(http/http_header),
            [ http_parse_header/2, http_post_data/3 ]).
:- autoload(library(http/http_stream),[stream_range_open/3]).
:- if(exists_source(library(ssl))).
:- autoload(library(ssl), [ssl_upgrade_legacy_options/2]).
:- endif.
:- use_module(library(socket)).
:- multifile
    http:encoding_filter/3,           % +Encoding, +In0, -In
    http:current_transfer_encoding/1, % ?Encoding
    http:disable_encoding_filter/1,   % +ContentType
    http:http_protocol_hook/5,        % +Protocol, +Parts, +StreamPair,
                                      % -NewStreamPair, +Options
    http:open_options/2,              % +Parts, -Options
    http:write_cookies/3,             % +Out, +Parts, +Options
    http:update_cookies/3,            % +CookieLine, +Parts, +Options
    http:authenticate_client/2,       % +URL, +Action
    http:http_connection_over_proxy/6.

:- meta_predicate
    http_open(+, -, :).

:- predicate_options(http_open/3, 3,
                     [ authorization(compound),
                       final_url(-atom),
                       header(+atom, -atom),
                       headers(-list),
                       raw_headers(-list(string)),
                       connection(+atom),
                       method(oneof([delete,get,put,head,post,patch,options])),
                       size(-integer),
                       status_code(-integer),
                       output(-stream),
                       timeout(number),
                       unix_socket(+atom),
                       proxy(atom, integer),
                       proxy_authorization(compound),
                       bypass_proxy(boolean),
                       request_header(any),
                       user_agent(atom),
                       version(-compound),
        % The option below applies if library(http/http_header) is loaded
                       post(any),
        % The options below apply if library(http/http_ssl_plugin) is loaded
                       pem_password_hook(callable),
                       cacert_file(atom),
                       cert_verify_hook(callable)
                     ]).
Default value for the User-Agent header. It can be overruled using the option user_agent(Agent) of http_open/3.

user_agent('SWI-Prolog').
If false (default true), do not try to automatically authenticate the client if a 401 (Unauthorized) status code is received.
curl(1)'s option `--unix-socket`.
Connection header. Default is close. The alternative is Keep-alive. This maintains a pool of available connections as determined by keep_connection/1. The library(http/websockets) uses Keep-alive, Upgrade. Keep-alive connections can be closed explicitly using http_close_keep_alive/1. Keep-alive connections may significantly improve repetitive requests on the same server, especially if the IP route is long, HTTPS is used or the connection uses a proxy.
header(Name,Value) option. A pseudo header status_code(Code) is added to provide the HTTP status as an integer. See also raw_headers(-List), which provides the entire HTTP reply header in unparsed representation.
get (default), head, delete, post, put or patch. The head message can be used in combination with the header(Name, Value) option to access information on the resource without actually fetching the resource itself. The returned stream must be closed immediately. If post(Data) is provided, the default is post.
Content-Length in the reply header.
Major-Minor, where Major and Minor are integers representing the HTTP version in the reply header.
end. HTTP 1.1 only supports Unit = bytes. E.g., to ask for bytes 1000-1999, use the option range(bytes(1000,1999)).
raw_encoding('application/gzip') the system will not decompress the stream if it is compressed using gzip.
headers(-List).
If false (default true), do not automatically redirect if a 3XX code is received. Must be combined with status_code(Code) and one of the header options to read the redirect reply. In particular, without status_code(Code) a redirect is mapped to an exception.
infinite).
POST request on the HTTP server. Data is handed to http_post_data/3.
proxy(+Host:Port). Deprecated.
authorization option.
If true, bypass proxy hooks. Default is false.
infinite. The default value is 10.
User-Agent field of the HTTP header. Default is SWI-Prolog.
The hook http:open_options/2 can be used to provide default options based on the broken-down URL. The option status_code(-Code) is particularly useful to query REST interfaces that commonly return status codes other than 200 that need to be processed by the client code.
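As a rough illustration of the status_code(-Code) option, the sketch below opens a URL and returns the status code together with the reply body, rather than having a non-2xx reply mapped to an exception. The helper name fetch_status/3 is hypothetical and not part of the library; the URL is supplied by the caller.

```prolog
:- use_module(library(http/http_open)).

% fetch_status(+URL, -Code, -Body): hypothetical helper, not part of
% library(http/http_open).  Because status_code(Code) is passed, a
% non-2xx reply is returned to the caller instead of raising an error.
fetch_status(URL, Code, Body) :-
    setup_call_cleanup(
        http_open(URL, In, [status_code(Code)]),
        read_string(In, _Length, Body),     % read the whole reply body
        close(In)).
```

A caller would typically branch on Code, e.g. treat 404 as "not found" and anything in 200..299 as success.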
:- multifile
    socket:proxy_for_url/3.           % +URL, +Host, -ProxyList

http_open(URL, Stream, QOptions) :-
    meta_options(is_meta, QOptions, Options0),
    (   atomic(URL)
    ->  parse_url_ex(URL, Parts)
    ;   Parts = URL
    ),
    autoload_https(Parts),
    upgrade_ssl_options(Parts, Options0, Options),
    add_authorization(Parts, Options, Options1),
    findall(HostOptions, hooked_options(Parts, HostOptions), AllHostOptions),
    foldl(merge_options_rev, AllHostOptions, Options1, Options2),
    (   option(bypass_proxy(true), Options)
    ->  try_http_proxy(direct, Parts, Stream, Options2)
    ;   term_variables(Options2, Vars2),
        findall(Result-Vars2,
                try_a_proxy(Parts, Result, Options2),
                ResultList),
        last(ResultList, Status-Vars2)
    ->  (   Status = true(_Proxy, Stream)
        ->  true
        ;   throw(error(proxy_error(tried(ResultList)), _))
        )
    ;   try_http_proxy(direct, Parts, Stream, Options2)
    ).

try_a_proxy(Parts, Result, Options) :-
    parts_uri(Parts, AtomicURL),
    option(host(Host), Parts),
    (   option(unix_socket(Path), Options)
    ->  Proxy = unix_socket(Path)
    ;   (   option(proxy(ProxyHost:ProxyPort), Options)
        ;   is_list(Options),
            memberchk(proxy(ProxyHost,ProxyPort), Options)
        )
    ->  Proxy = proxy(ProxyHost, ProxyPort)
    ;   socket:proxy_for_url(AtomicURL, Host, Proxy)
    ),
    debug(http(proxy),
          'http_open: Connecting via ~w to ~w', [Proxy, AtomicURL]),
    (   catch(try_http_proxy(Proxy, Parts, Stream, Options), E, true)
    ->  (   var(E)
        ->  !, Result = true(Proxy, Stream)
        ;   Result = error(Proxy, E)
        )
    ;   Result = false(Proxy)
    ),
    debug(http(proxy), 'http_open: ~w: ~p', [Proxy, Result]).

try_http_proxy(Method, Parts, Stream, Options0) :-
    option(host(Host), Parts),
    proxy_request_uri(Method, Parts, RequestURI),
    select_option(visited(Visited0), Options0, OptionsV, []),
    Options = [visited([Parts|Visited0])|OptionsV],
    parts_scheme(Parts, Scheme),
    default_port(Scheme, DefPort),
    url_part(port(Port), Parts, DefPort),
    host_and_port(Host, DefPort, Port, HostPort),
    (   option(connection(Connection), Options0),
        keep_alive(Connection),
        get_from_pool(Host:Port, StreamPair),
        debug(http(connection), 'Trying Keep-alive to ~p using ~p',
              [ Host:Port, StreamPair ]),
        catch(send_rec_header(StreamPair, Stream, HostPort,
                              RequestURI, Parts, Options),
              Error,
              keep_alive_error(Error, StreamPair))
    ->  true
    ;   http:http_connection_over_proxy(Method, Parts, Host:Port,
                                        SocketStreamPair, Options, Options1),
        (   catch(http:http_protocol_hook(Scheme, Parts,
                                          SocketStreamPair,
                                          StreamPair, Options),
                  Error,
                  ( close(SocketStreamPair, [force(true)]),
                    throw(Error)))
        ->  true
        ;   StreamPair = SocketStreamPair
        ),
        send_rec_header(StreamPair, Stream, HostPort,
                        RequestURI, Parts, Options1)
    ),
    return_final_url(Options).

proxy_request_uri(direct, Parts, RequestURI) :-
    !,
    parts_request_uri(Parts, RequestURI).
proxy_request_uri(unix_socket(_), Parts, RequestURI) :-
    !,
    parts_request_uri(Parts, RequestURI).
proxy_request_uri(_, Parts, RequestURI) :-
    parts_uri(Parts, RequestURI).

http:http_connection_over_proxy(unix_socket(Path), _, _,
                                StreamPair, Options, Options) :-
    !,
    unix_domain_socket(Socket),
    tcp_connect(Socket, Path),
    tcp_open_socket(Socket, In, Out),
    stream_pair(StreamPair, In, Out).
http:http_connection_over_proxy(direct, _, Host:Port,
                                StreamPair, Options, Options) :-
    !,
    open_socket(Host:Port, StreamPair, Options).
http:http_connection_over_proxy(proxy(ProxyHost, ProxyPort), Parts, _,
                                StreamPair, Options, Options) :-
    \+ ( memberchk(scheme(Scheme), Parts),
         secure_scheme(Scheme)
       ),
    !,
    % We do not want any /more/ proxy after this
    open_socket(ProxyHost:ProxyPort, StreamPair,
                [bypass_proxy(true)|Options]).
http:http_connection_over_proxy(socks(SocksHost, SocksPort), _Parts, Host:Port,
                                StreamPair, Options, Options) :-
    !,
    tcp_connect(SocksHost:SocksPort, StreamPair, [bypass_proxy(true)]),
    catch(negotiate_socks_connection(Host:Port, StreamPair),
          Error,
          ( close(StreamPair, [force(true)]),
            throw(Error)
          )).
cacerts_file(File) option to a cacerts(List) option to ensure proper merging of options.

hooked_options(Parts, Options) :-
    http:open_options(Parts, Options0),
    upgrade_ssl_options(Parts, Options0, Options).

:- if(current_predicate(ssl_upgrade_legacy_options/2)).
upgrade_ssl_options(Parts, Options0, Options) :-
    requires_ssl(Parts),
    !,
    ssl_upgrade_legacy_options(Options0, Options).
:- endif.
upgrade_ssl_options(_, Options, Options).

merge_options_rev(Old, New, Merged) :-
    merge_options(New, Old, Merged).

is_meta(pem_password_hook).             % SSL plugin callbacks
is_meta(cert_verify_hook).


http:http_protocol_hook(http, _, StreamPair, StreamPair, _).

default_port(https, 443) :- !.
default_port(wss,   443) :- !.
default_port(_,     80).

host_and_port(Host, DefPort, DefPort, Host) :- !.
host_and_port(Host, _,       Port,    Host:Port).
autoload_https(Parts) :-
    requires_ssl(Parts),
    memberchk(scheme(S), Parts),
    \+ clause(http:http_protocol_hook(S, _, StreamPair, StreamPair, _),_),
    exists_source(library(http/http_ssl_plugin)),
    !,
    use_module(library(http/http_ssl_plugin)).
autoload_https(_).

requires_ssl(Parts) :-
    memberchk(scheme(S), Parts),
    secure_scheme(S).

secure_scheme(https).
secure_scheme(wss).
send_rec_header(StreamPair, Stream, Host, RequestURI, Parts, Options) :-
    (   catch(guarded_send_rec_header(StreamPair, Stream,
                                      Host, RequestURI, Parts, Options),
              E, true)
    ->  (   var(E)
        ->  (   option(output(StreamPair), Options)
            ->  true
            ;   true
            )
        ;   close(StreamPair, [force(true)]),
            throw(E)
        )
    ;   close(StreamPair, [force(true)]),
        fail
    ).

guarded_send_rec_header(StreamPair, Stream, Host, RequestURI, Parts, Options) :-
    user_agent(Agent, Options),
    method(Options, MNAME),
    http_version(Version),
    option(connection(Connection), Options, close),
    debug(http(send_request), "> ~w ~w HTTP/~w", [MNAME, RequestURI, Version]),
    debug(http(send_request), "> Host: ~w", [Host]),
    debug(http(send_request), "> User-Agent: ~w", [Agent]),
    debug(http(send_request), "> Connection: ~w", [Connection]),
    format(StreamPair,
           '~w ~w HTTP/~w\r\n\c
           Host: ~w\r\n\c
           User-Agent: ~w\r\n\c
           Connection: ~w\r\n',
           [MNAME, RequestURI, Version, Host, Agent, Connection]),
    parts_uri(Parts, URI),
    x_headers(Options, URI, StreamPair),
    write_cookies(StreamPair, Parts, Options),
    (   option(post(PostData), Options)
    ->  http_post_data(PostData, StreamPair, [])
    ;   format(StreamPair, '\r\n', [])
    ),
    flush_output(StreamPair),
                                    % read the reply header
    read_header(StreamPair, Parts, ReplyVersion, Code, Comment, Lines),
    update_cookies(Lines, Parts, Options),
    reply_header(Lines, Options),
    do_open(ReplyVersion, Code, Comment, Lines, Options, Parts, Host,
            StreamPair, Stream).
http_version('1.1') :-
    http:current_transfer_encoding(chunked),
    !.
http_version('1.0').

method(Options, MNAME) :-
    option(post(_), Options),
    !,
    option(method(M), Options, post),
    (   map_method(M, MNAME0)
    ->  MNAME = MNAME0
    ;   domain_error(method, M)
    ).
method(Options, MNAME) :-
    option(method(M), Options, get),
    (   map_method(M, MNAME0)
    ->  MNAME = MNAME0
    ;   map_method(_, M)
    ->  MNAME = M
    ;   domain_error(method, M)
    ).
METHOD keywords. Default are the official HTTP methods as defined by the various RFCs.

:- multifile
    map_method/2.

map_method(delete,  'DELETE').
map_method(get,     'GET').
map_method(head,    'HEAD').
map_method(post,    'POST').
map_method(put,     'PUT').
map_method(patch,   'PATCH').
map_method(options, 'OPTIONS').
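Since map_method/2 is declared multifile in module http_open, an application can presumably register extra method keywords by adding clauses. The sketch below adds a WebDAV-style keyword; the choice of propfind is only an illustrative assumption and is not defined by the library.

```prolog
:- use_module(library(http/http_open)).

% Extend the multifile table that maps option keywords to the
% method name written on the request line.  `propfind` is an
% illustrative addition, not one of the library defaults.
:- multifile http_open:map_method/2.

http_open:map_method(propfind, 'PROPFIND').
```

With this clause loaded, a call such as http_open(URL, In, [method(propfind)]) would be expected to write PROPFIND on the request line.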
request_header(Name=Value) options in Options.

x_headers(Options, URI, Out) :-
    x_headers_(Options, [url(URI)|Options], Out).

x_headers_([], _, _).
x_headers_([H|T], Options, Out) :-
    x_header(H, Options, Out),
    x_headers_(T, Options, Out).

x_header(request_header(Name=Value), _, Out) :-
    !,
    debug(http(send_request), "> ~w: ~w", [Name, Value]),
    format(Out, '~w: ~w\r\n', [Name, Value]).
x_header(proxy_authorization(ProxyAuthorization), Options, Out) :-
    !,
    auth_header(ProxyAuthorization, Options, 'Proxy-Authorization', Out).
x_header(authorization(Authorization), Options, Out) :-
    !,
    auth_header(Authorization, Options, 'Authorization', Out).
x_header(range(Spec), _, Out) :-
    !,
    Spec =.. [Unit, From, To],
    (   To == end
    ->  ToT = ''
    ;   must_be(integer, To),
        ToT = To
    ),
    debug(http(send_request), "> Range: ~w=~d-~w", [Unit, From, ToT]),
    format(Out, 'Range: ~w=~d-~w\r\n', [Unit, From, ToT]).
x_header(_, _, _).
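To make the range(Spec) clause above more concrete, here is a small standalone sketch that builds the same header text as an atom instead of writing it to a stream. The predicate name range_header/2 is hypothetical and not part of the library.

```prolog
:- use_module(library(error)).

% range_header(+Spec, -Header): hypothetical helper that mirrors the
% range(Spec) clause of x_header/3.  Spec is Unit(From, To), where To
% is an integer or the atom `end` (open-ended range).
range_header(Spec, Header) :-
    Spec =.. [Unit, From, To],
    must_be(integer, From),
    (   To == end
    ->  ToT = ''                    % open-ended: nothing after the dash
    ;   must_be(integer, To),
        ToT = To
    ),
    format(atom(Header), 'Range: ~w=~d-~w', [Unit, From, ToT]).
```

For example, range_header(bytes(1000,1999), H) yields H = 'Range: bytes=1000-1999', and bytes(1000,end) yields 'Range: bytes=1000-'.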
auth_header(basic(User, Password), _, Header, Out) :-
    !,
    format(codes(Codes), '~w:~w', [User, Password]),
    phrase(base64(Codes), Base64Codes),
    debug(http(send_request), "> ~w: Basic ~s", [Header, Base64Codes]),
    format(Out, '~w: Basic ~s\r\n', [Header, Base64Codes]).
auth_header(bearer(Token), _, Header, Out) :-
    !,
    debug(http(send_request), "> ~w: Bearer ~w", [Header,Token]),
    format(Out, '~w: Bearer ~w\r\n', [Header, Token]).
auth_header(Auth, Options, _, Out) :-
    option(url(URL), Options),
    add_method(Options, Options1),
    http:authenticate_client(URL, send_auth_header(Auth, Out, Options1)),
    !.
auth_header(Auth, _, _, _) :-
    domain_error(authorization, Auth).

user_agent(Agent, Options) :-
    (   option(user_agent(Agent), Options)
    ->  true
    ;   user_agent(Agent)
    ).

add_method(Options0, Options) :-
    option(method(_), Options0),
    !,
    Options = Options0.
add_method(Options0, Options) :-
    option(post(_), Options0),
    !,
    Options = [method(post)|Options0].
add_method(Options0, [method(get)|Options0]).
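The basic(User, Password) clause above encodes "User:Password" with base64. A minimal standalone sketch of that step, with the hypothetical name basic_auth_value/3, could look like this:

```prolog
:- use_module(library(base64)).

% basic_auth_value(+User, +Password, -Value): hypothetical helper that
% computes the value of an Authorization header for HTTP Basic
% authentication, following the same steps as auth_header/4.
basic_auth_value(User, Password, Value) :-
    format(codes(Codes), '~w:~w', [User, Password]),
    phrase(base64(Codes), Base64Codes),     % base64//1 encodes a code list
    format(atom(Value), 'Basic ~s', [Base64Codes]).
```

With the classic example credentials, basic_auth_value('Aladdin', 'open sesame', V) gives V = 'Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ=='.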
                                    % Redirections
do_open(_, Code, _, Lines, Options0, Parts, _, In, Stream) :-
    redirect_code(Code),
    option(redirect(true), Options0, true),
    location(Lines, RequestURI),
    !,
    debug(http(redirect), 'http_open: redirecting to ~w', [RequestURI]),
    close(In),
    parts_uri(Parts, Base),
    uri_resolve(RequestURI, Base, Redirected),
    parse_url_ex(Redirected, RedirectedParts),
    (   redirect_limit_exceeded(Options0, Max)
    ->  format(atom(Comment), 'max_redirect (~w) limit exceeded', [Max]),
        throw(error(permission_error(redirect, http, Redirected),
                    context(_, Comment)))
    ;   redirect_loop(RedirectedParts, Options0)
    ->  throw(error(permission_error(redirect, http, Redirected),
                    context(_, 'Redirection loop')))
    ;   true
    ),
    redirect_options(Parts, RedirectedParts, Options0, Options),
    http_open(RedirectedParts, Stream, Options).
                                    % Need authentication
do_open(_Version, Code, _Comment, Lines, Options0, Parts, _Host, In0, Stream) :-
    authenticate_code(Code),
    option(authenticate(true), Options0, true),
    parts_uri(Parts, URI),
    parse_headers(Lines, Headers),
    http:authenticate_client(
             URI,
             auth_reponse(Headers, Options0, Options)),
    !,
    close(In0),
    http_open(Parts, Stream, Options).
                                    % Accepted codes
do_open(Version, Code, _, Lines, Options, Parts, Host, In0, In) :-
    (   option(status_code(Code), Options),
        Lines \== []
    ->  true
    ;   successful_code(Code)
    ),
    !,
    parts_uri(Parts, URI),
    parse_headers(Lines, Headers),
    return_version(Options, Version),
    return_size(Options, Headers),
    return_fields(Options, Headers),
    return_headers(Options, [status_code(Code)|Headers]),
    consider_keep_alive(Lines, Parts, Host, In0, In1, Options),
    transfer_encoding_filter(Lines, In1, In, Options),
                                    % properly re-initialise the stream
    set_stream(In, file_name(URI)),
    set_stream(In, record_position(true)).
do_open(_, _, _, [], Options, _, _, _, _) :-
    option(connection(Connection), Options),
    keep_alive(Connection),
    !,
    throw(error(keep_alive(closed),_)).
                                    % report anything else as error
do_open(_Version, Code, Comment, _, _, Parts, _, _, _) :-
    parts_uri(Parts, URI),
    (   map_error_code(Code, Error)
    ->  Formal =.. [Error, url, URI]
    ;   Formal = existence_error(url, URI)
    ),
    throw(error(Formal, context(_, status(Code, Comment)))).


successful_code(Code) :-
    between(200, 299, Code).
redirect_limit_exceeded(Options, Max) :-
    option(visited(Visited), Options, []),
    length(Visited, N),
    option(max_redirect(Max), Options, 10),
    (Max == infinite -> fail ; N > Max).
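The limit check above compares the number of URLs already visited with max_redirect(Max), which defaults to 10, and never trips when Max is infinite. The following standalone sketch reproduces that logic under the hypothetical name limit_exceeded/2 so it can be tried in isolation:

```prolog
:- use_module(library(option)).

% limit_exceeded(+Options, -Max): hypothetical copy of the redirect
% limit test.  Succeeds if the visited(List) option holds more entries
% than max_redirect(Max) allows (default 10, never for `infinite`).
limit_exceeded(Options, Max) :-
    option(visited(Visited), Options, []),
    length(Visited, N),
    option(max_redirect(Max), Options, 10),
    Max \== infinite,
    N > Max.
```

For instance, limit_exceeded([visited([a,b,c]), max_redirect(2)], Max) succeeds with Max = 2, while the same visited list with the default limit or with max_redirect(infinite) fails.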
redirect_loop(Parts, Options) :-
    option(visited(Visited), Options, []),
    include(==(Parts), Visited, Same),
    length(Same, Count),
    Count > 2.
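Loop detection counts how often the redirect target already occurs in the visited list and reports a loop only when it was seen more than twice. A standalone sketch of the same test, using the hypothetical name loop_detected/2 and made-up Parts lists, is:

```prolog
:- use_module(library(apply)).
:- use_module(library(option)).

% loop_detected(+Parts, +Options): hypothetical copy of the loop test.
% True if Parts occurs (by ==/2) more than twice in visited(List).
loop_detected(Parts, Options) :-
    option(visited(Visited), Options, []),
    include(==(Parts), Visited, Same),
    length(Same, Count),
    Count > 2.
```

With P = [host(h), path('/a')], the goal loop_detected(P, [visited([P,x,P,P])]) succeeds, whereas a visited list containing P only twice does not count as a loop.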
method(post) and post(Data) options from the original option-list.
If we are connecting over a Unix domain socket we drop this option if the redirect host does not match the initial host.

redirect_options(Parts, RedirectedParts, Options0, Options) :-
    select_option(unix_socket(_), Options0, Options1),
    memberchk(host(Host), Parts),
    memberchk(host(RHost), RedirectedParts),
    debug(http(redirect), 'http_open: redirecting AF_UNIX ~w to ~w',
          [Host, RHost]),
    Host \== RHost,
    !,
    redirect_options(Options1, Options).
redirect_options(_, _, Options0, Options) :-
    redirect_options(Options0, Options).

redirect_options(Options0, Options) :-
    (   select_option(post(_), Options0, Options1)
    ->  true
    ;   Options1 = Options0
    ),
    (   select_option(method(Method), Options1, Options),
        \+ redirect_method(Method)
    ->  true
    ;   Options = Options1
    ).

redirect_method(delete).
redirect_method(get).
redirect_method(head).
map_error_code(401, permission_error).
map_error_code(403, permission_error).
map_error_code(404, existence_error).
map_error_code(405, permission_error).
map_error_code(407, permission_error).
map_error_code(410, existence_error).

redirect_code(301).                     % Moved Permanently
redirect_code(302).                     % Found (previously "Moved Temporary")
redirect_code(303).                     % See Other
redirect_code(307).                     % Temporary Redirect

authenticate_code(401).
open_socket(Address, StreamPair, Options) :-
    debug(http(open), 'http_open: Connecting to ~p ...', [Address]),
    tcp_connect(Address, StreamPair, Options),
    stream_pair(StreamPair, In, Out),
    debug(http(open), '\tok ~p ---> ~p', [In, Out]),
    set_stream(In, record_position(false)),
    (   option(timeout(Timeout), Options)
    ->  set_stream(In, timeout(Timeout))
    ;   true
    ).


return_version(Options, Major-Minor) :-
    option(version(Major-Minor), Options, _).

return_size(Options, Headers) :-
    (   memberchk(content_length(Size), Headers)
    ->  option(size(Size), Options, _)
    ;   true
    ).

return_fields([], _).
return_fields([header(Name, Value)|T], Headers) :-
    !,
    (   Term =.. [Name,Value],
        memberchk(Term, Headers)
    ->  true
    ;   Value = ''
    ),
    return_fields(T, Headers).
return_fields([_|T], Lines) :-
    return_fields(T, Lines).

return_headers(Options, Headers) :-
    option(headers(Headers), Options, _).
headers(-List) option. Invalid header lines are skipped, printing a warning using print_message/2.

parse_headers([], []) :- !.
parse_headers([Line|Lines], Headers) :-
    catch(http_parse_header(Line, [Header]), Error, true),
    (   var(Error)
    ->  Headers = [Header|More]
    ;   print_message(warning, Error),
        Headers = More
    ),
    parse_headers(Lines, More).
final_url(URL), unify URL with the final URL after redirections.

return_final_url(Options) :-
    option(final_url(URL), Options),
    var(URL),
    !,
    option(visited([Parts|_]), Options),
    parts_uri(Parts, URL).
return_final_url(_).
transfer_encoding_filter(Lines, In0, In, Options) :-
    transfer_encoding(Lines, Encoding),
    !,
    transfer_encoding_filter_(Encoding, In0, In, Options).
transfer_encoding_filter(Lines, In0, In, Options) :-
    content_encoding(Lines, Encoding),
    content_type(Lines, Type),
    \+ http:disable_encoding_filter(Type),
    !,
    transfer_encoding_filter_(Encoding, In0, In, Options).
transfer_encoding_filter(_, In, In, _Options).

transfer_encoding_filter_(Encoding, In0, In, Options) :-
    option(raw_encoding(Encoding), Options),
    !,
    In = In0.
transfer_encoding_filter_(Encoding, In0, In, _Options) :-
    stream_pair(In0, In1, Out),
    (   nonvar(Out)
    ->  close(Out)
    ;   true
    ),
    (   http:encoding_filter(Encoding, In1, In)
    ->  true
    ;   autoload_encoding(Encoding),
        http:encoding_filter(Encoding, In1, In)
    ->  true
    ;   domain_error(http_encoding, Encoding)
    ).

:- multifile
    autoload_encoding/1.

:- if(exists_source(library(zlib))).
autoload_encoding(gzip) :-
    use_module(library(zlib)).
:- endif.

content_type(Lines, Type) :-
    member(Line, Lines),
    phrase(field('content-type'), Line, Rest),
    !,
    atom_codes(Type, Rest).
Content-encoding as Transfer-encoding encoding for specific values of ContentType. This predicate is multifile and can thus be extended by the user.

http:disable_encoding_filter('application/x-gzip').
http:disable_encoding_filter('application/x-tar').
http:disable_encoding_filter('x-world/x-vrml').
http:disable_encoding_filter('application/zip').
http:disable_encoding_filter('application/x-gzip').
http:disable_encoding_filter('application/x-zip-compressed').
http:disable_encoding_filter('application/x-compress').
http:disable_encoding_filter('application/x-compressed').
http:disable_encoding_filter('application/x-spoon').
Transfer-encoding header.

transfer_encoding(Lines, Encoding) :-
    what_encoding(transfer_encoding, Lines, Encoding).

what_encoding(What, Lines, Encoding) :-
    member(Line, Lines),
    phrase(encoding_(What, Debug), Line, Rest),
    !,
    atom_codes(Encoding, Rest),
    debug(http(What), '~w: ~p', [Debug, Rest]).

encoding_(content_encoding, 'Content-encoding') -->
    field('content-encoding').
encoding_(transfer_encoding, 'Transfer-encoding') -->
    field('transfer-encoding').
Content-encoding header.

content_encoding(Lines, Encoding) :-
    what_encoding(content_encoding, Lines, Encoding).
Invalid reply header.

read_header(In, Parts, Major-Minor, Code, Comment, Lines) :-
    read_line_to_codes(In, Line),
    (   Line == end_of_file
    ->  parts_uri(Parts, Uri),
        existence_error(http_reply,Uri)
    ;   true
    ),
    Line \== end_of_file,
    phrase(first_line(Major-Minor, Code, Comment), Line),
    debug(http(open), 'HTTP/~d.~d ~w ~w', [Major, Minor, Code, Comment]),
    read_line_to_codes(In, Line2),
    rest_header(Line2, In, Lines),
    !,
    (   debugging(http(open))
    ->  forall(member(HL, Lines),
               debug(http(open), '~s', [HL]))
    ;   true
    ).
read_header(_, _, 1-1, 500, 'Invalid reply header', []).

rest_header([], _, []) :- !.            % blank line: end of header
rest_header(L0, In, [L0|L]) :-
    read_line_to_codes(In, L1),
    rest_header(L1, In, L).
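read_header/6 parses the first reply line into a version pair, a status code and a comment. The sketch below does the same for a single line using library(dcg/basics) rather than the module's private grammar rules; that substitution and the name status_line//3 are assumptions made for this illustration.

```prolog
:- use_module(library(dcg/basics)).

% status_line(-Version, -Code, -Comment)//: hypothetical grammar for an
% HTTP status line such as "HTTP/1.1 404 Not Found".  It uses
% integer//1, blanks//0 and remainder//1 from library(dcg/basics)
% instead of the helpers defined in http_open.
status_line(Major-Minor, Code, Comment) -->
    "HTTP/", integer(Major), ".", integer(Minor),
    blanks,
    integer(Code),
    blanks,
    remainder(Rest),
    { atom_codes(Comment, Rest) }.
```

Running phrase(status_line(V, C, Cm), `HTTP/1.1 404 Not Found`) binds V = 1-1, C = 404 and Cm = 'Not Found'.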
content_length(Lines, Length) :-
    member(Line, Lines),
    phrase(content_length(Length0), Line),
    !,
    Length = Length0.

location(Lines, RequestURI) :-
    member(Line, Lines),
    phrase(atom_field(location, RequestURI), Line),
    !.

connection(Lines, Connection) :-
    member(Line, Lines),
    phrase(atom_field(connection, Connection0), Line),
    !,
    Connection = Connection0.

first_line(Major-Minor, Code, Comment) -->
    "HTTP/", integer(Major), ".", integer(Minor),
    skip_blanks,
    integer(Code),
    skip_blanks,
    rest(Comment).

atom_field(Name, Value) -->
    field(Name),
    rest(Value).

content_length(Len) -->
    field('content-length'),
    integer(Len).

field(Name) -->
    { atom_codes(Name, Codes) },
    field_codes(Codes).

field_codes([]) -->
    ":",
    skip_blanks.
field_codes([H|T]) -->
    [C],
    { match_header_char(H, C)
    },
    field_codes(T).

match_header_char(C, C) :- !.
match_header_char(C, U) :-
    code_type(C, to_lower(U)),
    !.
match_header_char(0'_, 0'-).


skip_blanks -->
    [C],
    { code_type(C, white)
    },
    !,
    skip_blanks.
skip_blanks -->
    [].
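The field//1 grammar above matches header names case-insensitively and lets an underscore in the pattern stand for a dash in the header. The following self-contained sketch shows the idea; it compares characters through downcase_atom/2 rather than code_type/2, and the names same_header_char/2 and header_value/3 are hypothetical.

```prolog
% field(+Name)//: match "Name:" at the start of a header line,
% ignoring case and treating `_` in Name as `-`.
field(Name) -->
    { atom_codes(Name, Codes) },
    field_codes(Codes).

field_codes([]) -->
    ":",
    skip_blanks.
field_codes([H|T]) -->
    [C],
    { same_header_char(H, C) },
    field_codes(T).

% same_header_char(+PatternCode, +InputCode): hypothetical variant of
% match_header_char/2 based on downcase_atom/2.
same_header_char(C, C) :- !.
same_header_char(0'_, 0'-) :- !.
same_header_char(L, U) :-
    char_code(UChar, U),
    downcase_atom(UChar, LChar),
    char_code(LChar, L).

skip_blanks -->
    [C], { code_type(C, white) }, !,
    skip_blanks.
skip_blanks --> [].

% header_value(+Name, +LineCodes, -Value): hypothetical helper that
% returns the text after "Name:" as an atom.
header_value(Name, Line, Value) :-
    phrase(field(Name), Line, Rest),
    atom_codes(Value, Rest).
```

For example, header_value(content_length, `Content-Length: 42`, V) gives V = '42'.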
integer(Code) -->
    digit(D0),
    digits(D),
    { number_codes(Code, [D0|D])
    }.

digit(C) -->
    [C],
    { code_type(C, digit)
    }.

digits([D0|D]) -->
    digit(D0),
    !,
    digits(D).
digits([]) -->
    [].
rest(Atom) --> call(rest_(Atom)).

rest_(Atom, L, []) :-
    atom_codes(Atom, L).
raw_headers(-Headers).

reply_header(Lines, Options) :-
    option(raw_headers(Headers), Options),
    !,
    maplist(string_codes, Headers, Lines).
reply_header(_, _).


                 /*******************************
                 *   AUTHORIZATION MANAGEMENT   *
                 *******************************/
-, possibly defined authorization is cleared. For example:

?- http_set_authorization('http://www.example.com/private/',
                          basic('John', 'Secret')).

:- dynamic
    stored_authorization/2,
    cached_authorization/2.

http_set_authorization(URL, Authorization) :-
    must_be(atom, URL),
    retractall(stored_authorization(URL, _)),
    (   Authorization = (-)
    ->  true
    ;   check_authorization(Authorization),
        assert(stored_authorization(URL, Authorization))
    ),
    retractall(cached_authorization(_,_)).

check_authorization(Var) :-
    var(Var),
    !,
    instantiation_error(Var).
check_authorization(basic(User, Password)) :-
    must_be(atom, User),
    must_be(text, Password).
check_authorization(digest(User, Password)) :-
    must_be(atom, User),
    must_be(text, Password).
authorization(_, _) :-
    \+ stored_authorization(_, _),
    !,
    fail.
authorization(URL, Authorization) :-
    cached_authorization(URL, Authorization),
    !,
    Authorization \== (-).
authorization(URL, Authorization) :-
    (   stored_authorization(Prefix, Authorization),
        sub_atom(URL, 0, _, _, Prefix)
    ->  assert(cached_authorization(URL, Authorization))
    ;   assert(cached_authorization(URL, -)),
        fail
    ).

add_authorization(_, Options, Options) :-
    option(authorization(_), Options),
    !.
add_authorization(Parts, Options0, Options) :-
    url_part(user(User), Parts),
    url_part(password(Passwd), Parts),
    !,
    Options = [authorization(basic(User,Passwd))|Options0].
add_authorization(Parts, Options0, Options) :-
    stored_authorization(_, _) ->   % quick test to avoid work
    parts_uri(Parts, URL),
    authorization(URL, Auth),
    !,
    Options = [authorization(Auth)|Options0].
add_authorization(_, Options, Options).
parse_url_ex(URL, [uri(URL)|Parts]) :-
    uri_components(URL, Components),
    phrase(components(Components), Parts),
    (   option(host(_), Parts)
    ->  true
    ;   domain_error(url, URL)
    ).

components(Components) -->
    uri_scheme(Components),
    uri_path(Components),
    uri_authority(Components),
    uri_request_uri(Components).

uri_scheme(Components) -->
    { uri_data(scheme, Components, Scheme), nonvar(Scheme) },
    !,
    [ scheme(Scheme)
    ].
uri_scheme(_) --> [].

uri_path(Components) -->
    { uri_data(path, Components, Path0), nonvar(Path0),
      (   Path0 == ''
      ->  Path = (/)
      ;   Path = Path0
      )
    },
    !,
    [ path(Path)
    ].
uri_path(_) --> [].

uri_authority(Components) -->
    { uri_data(authority, Components, Auth), nonvar(Auth),
      !,
      uri_authority_components(Auth, Data)
    },
    [ authority(Auth) ],
    auth_field(user, Data),
    auth_field(password, Data),
    auth_field(host, Data),
    auth_field(port, Data).
uri_authority(_) --> [].

auth_field(Field, Data) -->
    { uri_authority_data(Field, Data, EncValue), nonvar(EncValue),
      !,
      (   atom(EncValue)
      ->  uri_encoded(query_value, Value, EncValue)
      ;   Value = EncValue
      ),
      Part =.. [Field,Value]
    },
    [ Part ].
auth_field(_, _) --> [].

uri_request_uri(Components) -->
    { uri_data(path, Components, Path0),
      uri_data(search, Components, Search),
      (   Path0 == ''
      ->  Path = (/)
      ;   Path = Path0
      ),
      uri_data(path, Components2, Path),
      uri_data(search, Components2, Search),
      uri_components(RequestURI, Components2)
    },
    [ request_uri(RequestURI)
    ].
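uri_request_uri//1 above derives the request URI (path plus query) from a full URL, substituting / for an empty path. A standalone sketch of that step with library(uri), using the hypothetical name request_uri/2, might read:

```prolog
:- use_module(library(uri)).

% request_uri(+URL, -RequestURI): hypothetical helper that keeps only
% the path and search (query) parts of URL, mapping an empty path
% to '/', following the same steps as uri_request_uri//1.
request_uri(URL, RequestURI) :-
    uri_components(URL, Components),
    uri_data(path, Components, Path0),
    uri_data(search, Components, Search),
    (   Path0 == ''
    ->  Path = (/)
    ;   Path = Path0
    ),
    uri_data(path, Components2, Path),      % fill a fresh components term
    uri_data(search, Components2, Search),
    uri_components(RequestURI, Components2).
```

For instance, request_uri('http://example.com/a?b=1', R) gives R = '/a?b=1'.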
parts_scheme(Parts, Scheme) :-
    url_part(scheme(Scheme), Parts),
    !.
parts_scheme(Parts, Scheme) :-          % compatibility with library(url)
    url_part(protocol(Scheme), Parts),
    !.
parts_scheme(_, http).

parts_authority(Parts, Auth) :-
    url_part(authority(Auth), Parts),
    !.
parts_authority(Parts, Auth) :-
    url_part(host(Host), Parts, _),
    url_part(port(Port), Parts, _),
    url_part(user(User), Parts, _),
    url_part(password(Password), Parts, _),
    uri_authority_components(Auth,
                             uri_authority(User, Password, Host, Port)).

parts_request_uri(Parts, RequestURI) :-
    option(request_uri(RequestURI), Parts),
    !.
parts_request_uri(Parts, RequestURI) :-
    url_part(path(Path), Parts, /),
    ignore(parts_search(Parts, Search)),
    uri_data(path, Data, Path),
    uri_data(search, Data, Search),
    uri_components(RequestURI, Data).

parts_search(Parts, Search) :-
    option(query_string(Search), Parts),
    !.
parts_search(Parts, Search) :-
    option(search(Fields), Parts),
    !,
    uri_query_components(Search, Fields).


parts_uri(Parts, URI) :-
    option(uri(URI), Parts),
    !.
parts_uri(Parts, URI) :-
    parts_scheme(Parts, Scheme),
    ignore(parts_authority(Parts, Auth)),
    parts_request_uri(Parts, RequestURI),
    uri_components(RequestURI, Data),
    uri_data(scheme, Data, Scheme),
    uri_data(authority, Data, Auth),
    uri_components(URI, Data).

parts_port(Parts, Port) :-
    parts_scheme(Parts, Scheme),
    default_port(Scheme, DefPort),
    url_part(port(Port), Parts, DefPort).

url_part(Part, Parts) :-
    Part =.. [Name,Value],
    Gen =.. [Name,RawValue],
    option(Gen, Parts),
    !,
    Value = RawValue.

url_part(Part, Parts, Default) :-
    Part =.. [Name,Value],
    Gen =.. [Name,RawValue],
    (   option(Gen, Parts)
    ->  Value = RawValue
    ;   Value = Default
    ).


                 /*******************************
                 *            COOKIES           *
                 *******************************/

write_cookies(Out, Parts, Options) :-
    http:write_cookies(Out, Parts, Options),
    !.
write_cookies(_, _, _).

update_cookies(_, _, _) :-
    predicate_property(http:update_cookies(_,_,_), number_of_clauses(0)),
    !.
update_cookies(Lines, Parts, Options) :-
    (   member(Line, Lines),
        phrase(atom_field('set_cookie', CookieData), Line),
        http:update_cookies(CookieData, Parts, Options),
        fail
    ;   true
    ).


                 /*******************************
                 *           OPEN ANY           *
                 *******************************/

:- multifile iostream:open_hook/6.

%   This hook handles `http` and `https` URLs for `Mode == read`.

iostream:open_hook(URL, read, Stream, Close, Options0, Options) :-
    (atom(URL) -> true ; string(URL)),
    uri_is_global(URL),
    uri_components(URL, Components),
    uri_data(scheme, Components, Scheme),
    http_scheme(Scheme),
    !,
    Options = Options0,
    Close = close(Stream),
    http_open(URL, Stream, Options0).

http_scheme(http).
http_scheme(https).


                 /*******************************
                 *          KEEP-ALIVE          *
                 *******************************/

consider_keep_alive(Lines, Parts, Host, StreamPair, In, Options) :-
    option(connection(Asked), Options),
    keep_alive(Asked),
    connection(Lines, Given),
    keep_alive(Given),
    content_length(Lines, Bytes),
    !,
    stream_pair(StreamPair, In0, _),
    connection_address(Host, Parts, HostPort),
    debug(http(connection),
          'Keep-alive to ~w (~D bytes)', [HostPort, Bytes]),
    stream_range_open(In0, In,
                      [ size(Bytes),
                        onclose(keep_alive(StreamPair, HostPort))
                      ]).
consider_keep_alive(_, _, _, Stream, Stream, _).

connection_address(Host, _, Host) :-
    Host = _:_,
    !.
connection_address(Host, Parts, Host:Port) :-
    parts_port(Parts, Port).

keep_alive(keep_alive) :- !.
keep_alive(Connection) :-
    downcase_atom(Connection, 'keep-alive').

:- public keep_alive/4.

keep_alive(StreamPair, Host, _In, 0) :-
    !,
    debug(http(connection), 'Adding connection to ~p to pool', [Host]),
    add_to_pool(Host, StreamPair).
keep_alive(StreamPair, Host, In, Left) :-
    Left < 100,
    debug(http(connection), 'Reading ~D left bytes', [Left]),
    read_incomplete(In, Left),
    add_to_pool(Host, StreamPair),
    !.
keep_alive(StreamPair, _, _, _) :-
    debug(http(connection),
          'Closing connection due to excessive unprocessed input', []),
    (   debugging(http(connection))
    ->  catch(close(StreamPair), E,
              print_message(warning, E))
    ;   close(StreamPair, [force(true)])
    ).

read_incomplete(In, Left) :-
    catch(setup_call_cleanup(
              open_null_stream(Null),
              copy_stream_data(In, Null, Left),
              close(Null)),
          _,
          fail).

:- dynamic
    connection_pool/4,              % Hash, Address, Stream, Time
    connection_gc_time/1.

add_to_pool(Address, StreamPair) :-
    keep_connection(Address),
    get_time(Now),
    term_hash(Address, Hash),
    assertz(connection_pool(Hash, Address, StreamPair, Now)).

get_from_pool(Address, StreamPair) :-
    term_hash(Address, Hash),
    retract(connection_pool(Hash, Address, StreamPair, _)).

keep_connection(Address) :-
    close_old_connections(2),
    predicate_property(connection_pool(_,_,_,_), number_of_clauses(C)),
    C =< 10,
    term_hash(Address, Hash),
    aggregate_all(count, connection_pool(Hash, Address, _, _), Count),
    Count =< 2.

close_old_connections(Timeout) :-
    get_time(Now),
    Before is Now - Timeout,
    (   connection_gc_time(GC),
        GC > Before
    ->  true
    ;   (   retractall(connection_gc_time(_)),
            asserta(connection_gc_time(Now)),
            connection_pool(Hash, Address, StreamPair, Added),
            Added < Before,
            retract(connection_pool(Hash, Address, StreamPair, Added)),
            debug(http(connection),
                  'Closing inactive keep-alive to ~p', [Address]),
            close(StreamPair, [force(true)]),
            fail
        ;   true
        )
    ).

%   http_close_keep_alive(_) closes all currently known keep-alive
%   connections.

http_close_keep_alive(Address) :-
    forall(get_from_pool(Address, StreamPair),
           close(StreamPair, [force(true)])).

keep_alive_error(error(keep_alive(closed), _), _) :-
    !,
    debug(http(connection), 'Keep-alive connection was closed', []),
    fail.
keep_alive_error(error(io_error(_,_), _), StreamPair) :-
    !,
    close(StreamPair, [force(true)]),
    debug(http(connection), 'IO error on Keep-alive connection', []),
    fail.
keep_alive_error(Error, StreamPair) :-
    close(StreamPair, [force(true)]),
    throw(Error).


                 /*******************************
                 *     HOOK DOCUMENTATION       *
                 *******************************/
    :- multifile http:open_options/2.

    http:open_options(Parts, Options) :-
        option(host(Host), Parts),
        Host \== localhost,
        Options = [proxy('proxy.local', 3128)].

This hook may return multiple solutions. The returned options are combined using merge_options/3, where earlier solutions overrule later solutions.
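The precedence rule can be illustrated with merge_options/3 alone, whose first argument wins on conflicts. The sketch below is one plausible way to fold a list of hook solutions into a single option list; `combine_hook_options/2` and `merge_later/3` are hypothetical names, and this is not necessarily how http_open/3 combines the solutions internally.

```prolog
:- use_module(library(option)).
:- use_module(library(apply)).

% combine_hook_options(+Solutions, -Options)
% Hypothetical helper.  Solutions is a list of option lists, in the
% order in which the hook produced them.
combine_hook_options(Solutions, Options) :-
    foldl(merge_later, Solutions, [], Options).

% Options collected so far are passed as the first argument of
% merge_options/3, so they overrule the options of a later solution.
merge_later(Later, SoFar, Merged) :-
    merge_options(SoFar, Later, Merged).
```

With `[[timeout(10)], [timeout(60), connection('Keep-alive')]]` as input, the result keeps `timeout(10)` from the first solution and adds `connection('Keep-alive')` from the second.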
`Cookie:` header for the current connection. Out is an open stream to the HTTP server, Parts is the broken-down request (see uri_components/2) and Options is the list of options passed to http_open/3. The predicate is called as if using ignore/1.
`Set-Cookie` field, Parts is the broken-down request (see uri_components/2) and Options is the list of options passed to http_open/3.
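To make the two cookie hooks concrete, here is a deliberately naive sketch that remembers one `name=value` pair per cookie name and host. It assumes that the cookie data arrives as an atom holding the raw `Set-Cookie` value; the `cookie_jar/3` store is hypothetical, and attributes such as `Path`, `Expires` and `Secure` are ignored. A real application would more likely load an existing cookie library such as library(http/http_cookie).

```prolog
:- use_module(library(option)).

:- dynamic cookie_jar/3.            % Host, Name, Value (hypothetical store)

:- multifile
    http:write_cookies/3,
    http:update_cookies/3.

% Store the leading name=value pair of a Set-Cookie value.
http:update_cookies(CookieData, Parts, _Options) :-
    option(host(Host), Parts),
    atomic_list_concat([Pair|_], ';', CookieData),
    sub_atom(Pair, Before, _, After, '='),
    !,
    sub_atom(Pair, 0, Before, _, Name0),
    sub_atom(Pair, _, After, 0, Value),
    normalize_space(atom(Name), Name0),
    retractall(cookie_jar(Host, Name, _)),
    assertz(cookie_jar(Host, Name, Value)).

% Emit a single Cookie: header with all pairs stored for the host.
% Fails, and thus writes nothing, if there are no cookies.
http:write_cookies(Out, Parts, _Options) :-
    option(host(Host), Parts),
    findall(Pair,
            ( cookie_jar(Host, Name, Value),
              atomic_list_concat([Name, '=', Value], Pair)
            ),
            Pairs),
    Pairs \== [],
    atomic_list_concat(Pairs, '; ', Cookie),
    format(Out, 'Cookie: ~w\r\n', [Cookie]).
```

Keying the store on the host alone is a simplification: it does not implement domain or path matching, so it is only suitable for experiments against a single server.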
HTTP client library
This library defines http_open/3, which opens a URL as a Prolog stream. The functionality of the library can be extended by loading two additional modules that act as plugins:
`https` is requested using a default SSL context. See the plugin for additional information regarding security.

`gzip` transfer encoding. This plugin is lazily loaded if a connection is opened that claims this transfer encoding.

Here is a simple example to fetch a web-page:
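A minimal sketch of such a fetch, assuming network access and a reachable server. `fetch_page/1` is a hypothetical wrapper name; it passes no options and copies the document to `user_output`.

```prolog
:- use_module(library(http/http_open)).

% fetch_page(+URL)
% Hypothetical helper: open URL, copy the reply body to the terminal
% and close the stream, also if copying raises an exception.
fetch_page(URL) :-
    setup_call_cleanup(
        http_open(URL, In, []),
        copy_stream_data(In, user_output),
        close(In)).
```

A call such as `?- fetch_page('http://www.swi-prolog.org/').` should print the page source. Wrapping the stream in setup_call_cleanup/3 is a design choice of this sketch that avoids leaking the connection when the copy fails halfway.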
The example below fetches the modification time of a web-page. Note that `Modified` is `''` (the empty atom) if the web-server does not provide a time-stamp for the resource. See also parse_time/2.

The next example uses Google search. It exploits library(uri) to manage URIs, library(sgml) to load an HTML document and library(xpath) to navigate the parsed HTML. Note that you may need to adjust the XPath queries if the data returned by Google changes (this example indeed no longer works and currently fails at the first xpath/3 call).
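Returning to the modification-time lookup described earlier, the following sketch shows one way to write it, assuming network access. It issues a `HEAD` request, asks for the `Last-Modified` header through the `header(Name, Value)` option and converts the value with parse_time/2. The split into `modified/2` and `modified_stamp/2` is a choice of this sketch, made so that the conversion can be tried without a server.

```prolog
:- use_module(library(http/http_open)).
:- use_module(library(date)).           % parse_time/2

% modified(+URL, -Stamp)
% Fails if the server does not report a modification time.
modified(URL, Stamp) :-
    http_open(URL, In,
              [ method(head),
                header(last_modified, Modified)
              ]),
    close(In),
    modified_stamp(Modified, Stamp).

% modified_stamp(+Modified, -Stamp)
% Modified is '' when the header was absent; otherwise parse it into a
% time stamp.
modified_stamp(Modified, Stamp) :-
    Modified \== '',
    parse_time(Modified, Stamp).
```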
An example query is below:
Content-Type
header. */