✨ Enable parenthesized lists in search criteria [🚧 WIP: SequenceSet coercion]

This affects search, uid_search, sort, uid_sort, thread, and uid_thread.

Prior to this, sending a parenthesized list in the search criteria for
any of these commands required passing the criteria as a string. Strings
are converted to RawData and sent without validation or encoding, which
is vulnerable to injection when the input is untrusted.

With this change, arrays will only be converted into SequenceSet when
_every_ element in the array is a valid SequenceSet input.  Otherwise,
the array will be left alone, which allows us to send parenthesized
lists without using strings and RawData.

For example, some search criteria this change enables:
* `["not", %w[flagged unread]]` converts to `not (flagged unread)`.
* `["return", ["partial", 1..50]]` converts to `return (partial 1:50)`.
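
The rule above can be sketched in standalone Ruby. This is a simplified illustration, not the library's implementation: `coercible?`, `seq_entry`, `render`, `render_criteria`, and `SEQ_SET_STR` are hypothetical names, the regexp only approximates the sequence-set grammar, objects responding to `#to_sequence_set` and endless ranges are ignored, and plain strings are emitted as-is rather than encoded as atoms, quoted strings, or literals.

```ruby
require "set"

# Hypothetical, loose approximation of the IMAP sequence-set grammar.
SEQ_SET_STR = /\A(?:\d+|\*)(?::(?:\d+|\*))?(?:,(?:\d+|\*)(?::(?:\d+|\*))?)*\z/

# Inputs that stand for "*" in this sketch.
STARS = [-1, :*].freeze

# True when obj should be treated as a sequence set. An array or set
# qualifies only when it is non-empty and every member qualifies.
def coercible?(obj)
  case obj
  when *STARS, Integer, Range then true
  when String                 then SEQ_SET_STR.match?(obj)
  when Array, Set             then !obj.empty? && obj.all? { |e| coercible?(e) }
  else false
  end
end

# Formats a coercible value as sequence-set text.
def seq_entry(obj)
  case obj
  when *STARS     then "*"
  when Integer    then obj.to_s
  when Range      then "#{seq_entry(obj.begin)}:#{seq_entry(obj.end)}"
  when String     then obj
  when Array, Set then obj.map { |e| seq_entry(e) }.join(",")
  end
end

# Coercible values become sequence sets; any other array becomes a
# parenthesized list; everything else is passed through as text.
def render(arg)
  return seq_entry(arg) if coercible?(arg)
  return "(#{arg.map { |e| render(e) }.join(" ")})" if arg.is_a?(Array)
  arg.to_s
end

def render_criteria(criteria)
  criteria.map { |e| render(e) }.join(" ")
end

puts render_criteria(["not", %w[flagged unread]])              # not (flagged unread)
puts render_criteria(["return", ["partial", 1..50]])           # return (partial 1:50)
puts render_criteria(["subject", "hello", [1..5, 8, 10..-1]])  # subject hello 1:5,8,10:*
```

The key design point is the "all members" test: a single non-sequence element such as `"partial"` is enough to keep the whole array as a parenthesized list.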
nevans committed Nov 8, 2024
1 parent 8591ec0 commit b6493c5
Showing 3 changed files with 51 additions and 24 deletions.
31 changes: 15 additions & 16 deletions lib/net/imap.rb
@@ -1944,13 +1944,19 @@ def uid_expunge(uid_set)
#
# * When +criteria+ is an array, each member is a +SEARCH+ command argument:
# * Any SequenceSet sends SequenceSet#valid_string.
# +Range+, <tt>-1</tt>, and nested +Array+ elements are converted to
# SequenceSet.
# * Any +String+ is sent verbatim when it is a valid \IMAP atom,
# These types are converted to SequenceSet for validation and encoding:
# * +Set+
# * +Range+
# * <tt>-1</tt> and +:*+ -- both translate to <tt>*</tt>
# * responds to +#to_sequence_set+
# * +String+, when formatted as a \IMAP sequence-set
# * deeply nested +Array+, when all members are one of these types.
# * Any other +String+ is sent verbatim when it is a valid \IMAP atom,
# and encoded as an \IMAP quoted or literal string otherwise.
# * Any other nested +Array+ is encoded as a parenthesized list, to group
# multiple search keys (e.g., for use with +OR+ and +NOT+).
# * Any other +Integer+ (besides <tt>-1</tt>) will be sent as +#to_s+.
# * +Date+ objects will be encoded as an \IMAP date (see ::encode_date).
#
# * When +criteria+ is a string, it will be sent directly to the server
# <em>without any validation or encoding</em>. *WARNING:* This is
# vulnerable to injection attacks when external inputs are used.
@@ -1972,13 +1978,13 @@ def uid_expunge(uid_set)
# The following searches send the exact same command to the server:
#
# # criteria array, charset arg
# imap.search(%w[OR UNSEEN FLAGGED SUBJECT foo], "UTF-8")
# imap.search(["OR", "UNSEEN", %w(FLAGGED SUBJECT foo)], "UTF-8")
# # criteria string, charset arg
# imap.search("OR UNSEEN FLAGGED SUBJECT foo", "UTF-8")
# imap.search("OR UNSEEN (FLAGGED SUBJECT foo)", "UTF-8")
# # criteria array contains charset arg
# imap.search(%w[CHARSET UTF-8 OR UNSEEN FLAGGED SUBJECT foo])
# imap.search([*%w[CHARSET UTF-8], "OR", "UNSEEN", %w(FLAGGED SUBJECT foo)])
# # criteria string contains charset arg
# imap.search("CHARSET UTF-8 OR UNSEEN FLAGGED SUBJECT foo")
# imap.search("CHARSET UTF-8 OR UNSEEN (FLAGGED SUBJECT foo)")
#
# ===== Search keys
#
@@ -3191,14 +3197,7 @@ def thread_internal(cmd, algorithm, search_keys, charset)

def normalize_searching_criteria(criteria)
return RawData.new(criteria) if criteria.is_a?(String)
criteria.map do |i|
case i
when -1, Range, Array
SequenceSet.new(i)
else
i
end
end
criteria.map {|i| SequenceSet::Coercible[i] ? SequenceSet[i] : i }
end

def build_ssl_ctx(ssl)
19 changes: 16 additions & 3 deletions lib/net/imap/sequence_set.rb
@@ -276,16 +276,29 @@ class SequenceSet
# The largest possible non-zero unsigned 32-bit integer
UINT32_MAX = 2**32 - 1

REGEXP = ResponseParser::Patterns::SEQUENCE_SET_STR
private_constant :REGEXP

# represents "*" internally, to simplify sorting (etc)
STAR_INT = UINT32_MAX + 1
private_constant :STAR_INT

# valid inputs for "*"
STARS = [:*, ?*, -1].freeze
private_constant :STAR_INT, :STARS
private_constant :STARS

COERCIBLE = ->{ _1.respond_to? :to_sequence_set }
private_constant :COERCIBLE
# Matches objects which should be implicitly converted into SequenceSet
# objects. Note that the inputs are not validated, and some valid inputs
# (Enumerable other than Array or Set) will be rejected.
Coercible = ->(obj) do
case obj
when SequenceSet then true
when Integer, Range, *STARS then true
when String then REGEXP.match?(obj.b)
when Array, Set then obj.all?(Coercible) && !obj.empty?
else obj.respond_to?(:to_sequence_set)
end
end

class << self

25 changes: 20 additions & 5 deletions test/net/imap/test_imap.rb
@@ -1206,16 +1206,31 @@ def test_unselect
end

server.on "SEARCH", &search_resp
assert_equal search_result, imap.search(["subject", "hello",
[1..5, 8, 10..-1]])
server.on "UID SEARCH", &search_resp

assert_equal search_result, imap.search(
["subject", "hello", [1..5, 8, 10..-1]]
)
cmd = server.commands.pop
assert_equal ["SEARCH", "subject hello 1:5,8,10:*"], [cmd.name, cmd.args]

server.on "UID SEARCH", &search_resp
assert_equal search_result, imap.uid_search(["subject", "hello",
[1..22, 30..-1]])
assert_equal search_result, imap.uid_search(
["subject", "hello", [1..22, 30..-1]]
)
cmd = server.commands.pop
assert_equal ["UID SEARCH", "subject hello 1:22,30:*"], [cmd.name, cmd.args]

assert_equal search_result, imap.search(
"RETURN (COUNT) NOT (FLAGGED (OR SEEN ANSWERED))"
)
cmd = server.commands.pop
assert_equal "RETURN (COUNT) NOT (FLAGGED (OR SEEN ANSWERED))", cmd.args

assert_equal search_result, imap.search([
"RETURN", %w(MIN MAX COUNT), "NOT", ["FLAGGED", %w(OR SEEN ANSWERED)]
])
cmd = server.commands.pop
assert_equal "RETURN (MIN MAX COUNT) NOT (FLAGGED (OR SEEN ANSWERED))", cmd.args
end
end
