ietf-nntp Wildmats (was: draft posted)
Stan O Barber
sob at academ.com
Tue Nov 27 20:44:49 PST 2001
Yes, it seems fine to me per se.
I have not diffed it against -14.
Others? I'd like to get something done on this by IETF as well.
Russ Allbery wrote:
> Charles Lindsey <chl at clw.cs.man.ac.uk> writes:
>
>
>>Last thing I remember is that Clive had two texts (both should still be
>>on his website) and we had reached consensus on the shorter one, without
>>the character classes.
>>
>
> I'm including that change for reference. It seems fine to me. Does
> anyone have any objections to making the following changes in the draft?
>
> (Note that the mention of LIST NEWSGROUPS being very efficient with a
> single group name has since been removed from the draft and shouldn't be
> reintroduced by this change, and the other changes to the text elsewhere
> in the document should be reviewed against -14.)
>
> NNTP proposed text
> Section 5
> Working copy
> Last changed 2001-05-02 14:45 UTC
>
> This text consists of my updated proposal of 2001-04-30 with wildmat
> sets removed. I've diff-marked the changes in section 5 relative to the
> updated proposal. This proposal is a strict subset of the previous one
> (that is, every wildmat in the previous proposal either has the same
> meaning in this one or is not permitted).
>
>
> 5. The WILDMAT format
>
> The WILDMAT format described here is based on the version
> first developed by Rich Salz [5], which in turn was derived from
> the format used in the UNIX "find" command to articulate file names.
> It was developed to provide a uniform mechanism for matching
> patterns in the same manner that the UNIX shell matches filenames.
>
> 5.1 Wildmat syntax
>
> A wildmat is described by the following augmented BNF[9] syntax
> (note that this syntax contains ambiguities and special cases described
> at the end):
>
> wildmat = wildmat-pattern *("," ["!"] wildmat-pattern)
>
> wildmat-pattern = 1*wildmat-item
>
> wildmat-item = wildmat-exact / wildmat-wild
>
> ! wildmat-exact = %x21-29 / %x2B / %x2D-3E / %x40-5A / %x5E-7F /
> ! UTF-8-non-ascii ; exclude * , ? [ \ ]
>
> ! wildmat-wild = "*" / "?"
> -
> UTF-8-non-ascii is defined in section 13.
>
> This syntax must be interpreted subject to the following rule:
>
> - Where a wildmat-pattern is not immediately preceded by "!", it shall
> not begin with a "!".
> -
> ! NOTE: the characters \ , [ and ] are not allowed in wildmats, while *
> ! and ? are always wildcards. This should not be a problem since these
> ! characters cannot occur in newsgroup names, which is the only current
> ! use of wildmats. Backslash is commonly used to suppress the special
> ! meaning of characters and brackets to introduce sets, but there is no
> ! existing standard practice for these in wildmats and so they were omitted
> ! from this specification. A future extension to this document may provide
> ! semantics for these characters.
>
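As an illustration only, the 5.1 grammar and its "!" rule can be checked mechanically. The following is a minimal Python sketch, not part of the proposed text. It assumes the wildmat has already been decoded from UTF-8 into a Python str, and it assumes UTF-8-non-ascii (defined in section 13, not quoted here) means any code point above U+007F. The name is_valid_wildmat is hypothetical.

```python
import re

# One or more wildmat-items: wildmat-exact plus the wildcards "*" and "?".
# Excluded ASCII: space and controls, "," (0x2C), "[" "\" "]" (0x5B-0x5D).
# Assumption: UTF-8-non-ascii is treated as any code point above U+007F.
_PATTERN_RE = re.compile(r"[\x21-\x2B\x2D-\x5A\x5E-\x7F\u0080-\U0010FFFF]+")


def is_valid_wildmat(wildmat: str) -> bool:
    """Return True if the string fits the section 5.1 grammar (hypothetical helper)."""
    for index, part in enumerate(wildmat.split(",")):
        if part.startswith("!"):
            # The first wildmat-pattern can never be preceded by "!",
            # so it must not begin with one.
            if index == 0:
                return False
            # Later patterns: one leading "!" is the negation marker.
            part = part[1:]
        # What remains must be 1*wildmat-item.
        if not _PATTERN_RE.fullmatch(part):
            return False
    return True
```

Under these assumptions "a*,!*b,*c*" and "a,!!b" are accepted, while "!a", "a,,b", "a,!" and "a[bc]" are rejected. A server that rejects a wildmat this way would fall under the 501 rule in section 6, unless it chooses to place its own interpretation on it.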
> 5.2 Wildmat semantics
>
> A wildmat is tested against a string, and either matches or does not
> match. To do this, each constituent wildmat-pattern is matched against
> the string and the rightmost pattern that matches is identified. If
> that wildmat-pattern is not preceded by "!", the whole wildmat matches.
> If it is preceded by "!", or if no wildmat-pattern matches, the whole
> wildmat does not match.
>
> For example, consider the wildmat "a*,!*b,*c*":
>
> the string "aaa" matches because the rightmost match is with "a*"
> the string "abb" does not match because the rightmost match is with "*b"
> the string "ccb" matches because the rightmost match is with "*c*"
> the string "xxx" does not match because no wildmat-pattern matches
>
> A wildmat-pattern matches a string if the string can be broken into
> components, each of which matches the corresponding wildmat-item in
> the pattern; the matches must be in the same order, and the whole string
> must be used in the match. The pattern is "anchored"; that is, the first
> and last characters in the string must match the first and last item
> respectively (unless that item is an asterisk matching zero characters).
>
> A wildmat-exact matches the same character (which may be more than one
> octet in UTF-8).
>
> "?" matches exactly one character (which may be more than one octet).
>
> "*" matches zero or more characters. It can match an empty string, but
> it cannot match a subsequence of a UTF-8 sequence that is not aligned
> to the character boundaries.
> -
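To make the 5.2 rules concrete, here is a minimal Python sketch of the matching procedure, offered as an illustration rather than as reference code. It assumes the wildmat is already valid per 5.1 and that both wildmat and string have been decoded from UTF-8 into Python str values, so "?" and "*" work on whole characters and can never split a multi-octet sequence. The names pattern_matches and wildmat_matches are hypothetical.

```python
def pattern_matches(pattern: str, text: str) -> bool:
    """Anchored match of one wildmat-pattern against the whole string."""
    p = t = 0      # current positions in pattern and text
    star = -1      # position of the most recent "*" in pattern, if any
    mark = 0       # text position where that "*" started matching
    while t < len(text):
        if p < len(pattern) and pattern[p] == "*":
            # Tentatively let "*" match zero characters.
            star, mark = p, t
            p += 1
        elif p < len(pattern) and (pattern[p] == "?" or pattern[p] == text[t]):
            # "?" matches any one character; wildmat-exact matches itself.
            p += 1
            t += 1
        elif star != -1:
            # Backtrack: let the last "*" absorb one more character.
            p = star + 1
            mark += 1
            t = mark
        else:
            return False
    # Any trailing "*" items may match the empty string.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)


def wildmat_matches(wildmat: str, text: str) -> bool:
    """The rightmost matching wildmat-pattern decides the result."""
    result = False  # no pattern matches: the wildmat does not match
    for part in wildmat.split(","):
        # Assumes valid input, so a leading "!" is always the negation marker.
        negated = part.startswith("!")
        pattern = part[1:] if negated else part
        if pattern_matches(pattern, text):
            result = not negated  # later matches override earlier ones
    return result
```

With the wildmat "a*,!*b,*c*" from the example above, this sketch gives a match for "aaa" and "ccb" and no match for "abb" and "xxx". Scanning left to right and overwriting the result is one simple way to find the rightmost matching pattern; scanning right to left and stopping at the first match would be equivalent.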
> 5.3 Extensions
>
> An NNTP server or extension MAY extend the syntax or semantics of
> wildmats provided that all wildmats that meet the requirements of
> section 5.1 have the meaning ascribed to them by section 5.2.
> Future editions of this document may also extend wildmats.
>
> 5.4 Examples
>
> In these examples, $ and @ are used to represent the two octets 0xC2
> and 0xA3 respectively; $@ is thus the UTF-8 encoding for the pound
> sterling symbol, shown as # in the descriptions.
>
> Wildmat Description of strings that match
>
> abc the one string "abc"
> abc,def the two strings "abc" and "def"
> $@ the one character string "#"
> a* any string that begins with "a"
> a*b any string that begins with "a" and ends with "b"
> a*,*b any string that begins with "a" or ends with "b"
> a*,!*b any string that begins with "a" and does not end with "b"
> a*,!*b,c* any string that begins with "a" and does not end with "b", and
> any string that begins with "c" no matter what it ends with
> a*,c*,!*b any string that begins with "a" or "c" and does not end
> with "b"
> ?a* any string with "a" as its second character
> ??a* any string with "a" as its third character
> *a? any string with "a" as its penultimate character
> *a?? any string with "a" as its antepenultimate character
> -
>
> ========
>
> [The following changes also need to be made to other sections for consistency.
> In addition the formal grammar will need updating.]
>
>
> 6. Format for Keyword Descriptions
>
> [...]
>
> The name "wildmat" for a parameter indicates that it is a
> wildmat format pattern as defined in section 5. If the parameter
> does not meet the requirements of that section (for example, if
> it does not fit the grammar of 5.1) the NNTP server MAY place some
> interpretation on it (not specified by this document) or otherwise
> MUST generate a 501 response.
>
> 9.4 The LIST Keyword
>
> 9.4.1 LIST
>
> [...]
>
> If the optional wildmat parameter is specified, the list is
> limited to only those groups whose names match the wildmat. This
> will normally be very efficient if the wildmat is a simple group
> name.
>
> 9.4.2 LIST ACTIVE.TIMES
>
> LIST ACTIVE.TIMES [wildmat]
>
> [...]
>
> If the optional wildmat parameter is specified, the list is
> limited to only those groups whose names match the wildmat. This
> will normally be very efficient if the wildmat is a simple group
> name.
>
> 9.4.4 LIST DISTRIB.PATS
>
> LIST DISTRIB.PATS
>
> The distrib.pats file is maintained by some news transport
> systems to allow clients to choose a value for the
> Distribution: line in the header of a news article being
> posted. The information returned consists of lines, in no
> particular order, each of which contains three fields
> separated by colons: a weight, a wildmat (which may be a simple
> group name), and a Distribution: value, in that order.
>
> [...]
>
> 9.4.5 LIST NEWSGROUPS
>
> LIST NEWSGROUPS [wildmat]
>
> [...]
> If the information is not available, the
> server will return the 503 response. If the server does not
> recognize the command it should return a 501 response. If
> the optional wildmat parameter is specified, the list is
> limited to only those groups that match the wildmat (no
> matching is done on the group descriptions). This will
> normally be very efficient if the wildmat is a simple group
> name. If nothing is matched
> an empty list is returned, not an error.
>
> 11.4 NEWNEWS
>
> NEWNEWS wildmat date time [GMT]
>
> The message-ids of all articles added to a set of newsgroups
> since the given date-time will be listed. The set consists
> of all newsgroups whose name matches the wildmat.
> The format of the listing will be one message-id per line, as
> though text were being sent. Each message-id MUST appear only
> once in a response. The order of the response has no specific
> significance and may vary from response to response in the
> same session. Date and time are in the same format as the
> NEWGROUPS command.
>
> Note that an empty list (i.e., the text body returned by this
> command consists only of the terminating period) is a possible
> valid response, and indicates that there is currently no new
> news.
>
> Clients SHOULD make all queries in Coordinated Universal Time
> when possible.
>
>