:- module(md_span, [
    md_span_codes/2, % +Codes, -HtmlTerms
    md_span_string/2 % +String, -HtmlTerms
]).

/** <module> Span-level Markdown parser

Parses span-level Markdown elements: emphasis,
inline code, links and others. More info:
http://daringfireball.net/projects/markdown/syntax#span
*/

:- use_module(library(dcg/basics)).
:- use_module(library(apply)).

:- use_module(md_trim).
:- use_module(md_links).
:- use_module(md_span_link).
:- use_module(md_span_decorate).
:- use_module(md_escape).
:- use_module(md_line).

%! md_span_string(+String, -HtmlTerms) is det.
%
% Same as md_span_codes/2 but takes a string
% as input.

md_span_string(String, HtmlTerms):-
    string_codes(String, Codes),
    md_span_codes(Codes, HtmlTerms).
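
% Usage sketch (an assumption about typical use, not taken from this
% file): the resulting terms can be handed to html//1 and printed
% with print_html/1, both from library(http/html_write).
%
%   ?- use_module(library(http/html_write)).
%   ?- md_span_string("Some *emphasized* text", Html),
%      phrase(html(Html), Tokens),
%      print_html(Tokens).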

%! md_span_codes(+Codes, -HtmlTerms) is det.
%
% Turns the list of codes into a structure acceptable
% by SWI-Prolog's html//1 predicate. More info:
% http://www.swi-prolog.org/pldoc/doc_for?object=html/1

md_span_codes(Codes, HtmlTerms):-
    md_span_codes(Codes, [strong, em, code, del], HtmlTerms).

% Allow is the list of span types that may still be
% recognized at the current nesting level.

md_span_codes(Codes, Allow, Out):-
    phrase(span(Spans, Allow), Codes), !,
    phrase(atomize(Out), Spans).
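
% Optimized case for normal text.
% Consumes two alphanumeric codes at once. Pairs that start
% with h or end with t are left to the rules below, presumably
% so that plain http URLs still reach the link rule.

span([Code1,Code2|Spans], Allow) -->
    [Code1,Code2],
    {
        code_type(Code1, alnum),
        code_type(Code2, alnum),
        Code1 \= 0'h,
        Code2 \= 0't
    }, !,
    span(Spans, Allow).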

% Escape sequences.
% More info:
% http://daringfireball.net/projects/markdown/syntax#backslash
% Processed before any other markup.

span([Atom|Spans], Allow) -->
    "\\", [Code],
    {
        md_escaped_code(Code),
        atom_codes(Atom, [Code])
    }, !,
    span(Spans, Allow).

% Entities. These must be left alone.
% More info:
% http://daringfireball.net/projects/markdown/syntax#autoescape

span([\[Atom]|Spans], Allow) -->
    "&", string_limit(Codes, 10), ";",
    {
        maplist(alnum, Codes),
        append([0'&|Codes], [0';], Entity),
        atom_codes(Atom, Entity)
    }, !,
    span(Spans, Allow).

% Special characters & and <.
% More info:
% http://daringfireball.net/projects/markdown/syntax#autoescape

span(['&'|Spans], Allow) -->
    "&", !, span(Spans, Allow).

% As inline HTML is allowed, < is only escaped
% when the following character is neither a letter
% nor /, or when < appears at end of stream.

span(['<'|Spans], Allow) -->
    "<", lookahead(Code),
    {
        \+ code_type(Code, alpha),
        Code \= 0'/
    }, !,
    span(Spans, Allow).

span(['<'], _) -->
    "<", eos, !.

% Line break with two or more spaces.
% More info:
% http://daringfireball.net/projects/markdown/syntax#p

span([br([])|Spans], Allow) -->
    "  ", whites, ln, !,
    span(Spans, Allow).

% Recognizes links and images.

span([Link|Spans], Allow) -->
    lookahead(Code),
    {
        % performance optimization
        (   Code = 0'[
        ;   Code = 0'!
        ;   Code = 0'<
        ;   Code = 0'h)
    },
    md_span_link(Link), !,
    span(Spans, Allow).

% Recognizes <script ... </script>.
% Protects script contents from being processed as Markdown.

span([\[String]|Spans], Allow) -->
    "<script", string(Codes), "</script>", !,
    {
        string_codes(Content, Codes),
        atomics_to_string(['<script', Content, '</script>'], String)
    },
    span(Spans, Allow).

% Prevents in-word underscores from triggering
% emphasis.

span([Code, 0'_, 0'_|Spans], Allow) -->
    [Code], "__",
    { code_type(Code, alnum) }, !,
    span(Spans, Allow).

span([Code, 0'_|Spans], Allow) -->
    [Code], "_",
    { code_type(Code, alnum) }, !,
    span(Spans, Allow).

% Recognizes text stylings like
% strong, emphasis and inline code.

span([Span|Spans], Allow) -->
    lookahead(Code),
    {
        % performance optimization
        (   Code = 0'`
        ;   Code = 0'_
        ;   Code = 0'*
        ;   Code = 0'~)
    },
    md_span_decorate(Dec, Allow), !,
    {
        Dec =.. [Name, Codes],
        (   Name = code
        ->  % inline code is kept verbatim
            string_codes(Atom, Codes),
            Span =.. [Name, Atom]
        ;   % a span type cannot be nested inside itself
            select(Name, Allow, AllowNest),
            md_span_codes(Codes, AllowNest, Nested),
            Span =.. [Name, Nested])
    },
    span(Spans, Allow).

% Any other code is passed through unchanged.

span([Code|Spans], Allow) -->
    [Code], !,
    span(Spans, Allow).

span([], _) -->
    eos.

% Collects remaining codes into strings suitable
% for SWI-Prolog's html//1.
% Text will appear as \[Text] as it can contain
% raw HTML which must not be escaped.

atomize([]) -->
    eos, !.

atomize([\[Atom]|Tokens]) -->
    [Num], { number(Num) }, !,
    text_codes(Codes),
    { string_codes(Atom, [Num|Codes]) },
    atomize(Tokens).

atomize([Token|Tokens]) -->
    [Token], atomize(Tokens).

text_codes([Code|Codes]) -->
    [Code], { number(Code) }, !,
    text_codes(Codes).

text_codes([]) --> "".
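
% Illustration (a sketch, not part of the original source): atomize//1
% groups each run of character codes into one \[String] term and
% passes all other terms through unchanged. Traced by hand, with
% em(x) standing in for an arbitrary already-parsed span:
%
%   ?- phrase(md_span:atomize(Out), [0'a, 0'b, em(x), 0'c]).
%   Out = [\["ab"], em(x), \["c"]].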

% Recognizes single symbol code of
% type alnum.

alnum(Code):-
    code_type(Code, alnum).