Add Array#remove #293
As an aside: The docs for Array#- say "If you need set-like behavior, see the library class Set", but I don't understand what it means by that. Do you?
Crazy idea: abuse
It probably refers to how Set#- is implemented.
Looks familiar?
Thanks for the reply, @esparta!
Ah, okay. I admit, I didn't look closely enough at Haha, I guess it looks familiar in the sense that it looks remarkably like the implementation I came up with for
So to clarify, what I find confusing about them mentioning "If you need set-like behavior, see the library class Set" in the documentation for If example in
[ 1, 1, 2, 2, 3, 3, 4, 5 ] - [ 1, 2, 4 ] #=> [ 1, 2, 3, 3, 5 ]
instead of this:
[ 1, 1, 2, 2, 3, 3, 4, 5 ] - [ 1, 2, 4 ] #=> [ 3, 3, 5 ]
, then I could see making a call-out to But since their example was:
[ 1, 1, 2, 2, 3, 3, 4, 5 ] - [ 1, 2, 4 ]
#=> [ 3, 3, 5 ]
, the behavior this would be contrasted with in
[ 1, 1, 2, 2, 3, 3, 4, 5 ].to_set - [ 1, 2, 4 ].to_set
# => #<Set: {3, 5}>
But that result is remarkably similar! The only difference is that But that difference in behavior seems due more to the inherent differences in the data structures in general ( Maybe it's just me, but the
if (RARRAY_LEN(ary1) <= SMALL_ARRAY_LEN || RARRAY_LEN(ary2) <= SMALL_ARRAY_LEN) {
    for (i=0; i<RARRAY_LEN(ary1); i++) {
        VALUE elt = rb_ary_elt(ary1, i);
        if (rb_ary_includes_by_eql(ary2, elt)) continue;
        rb_ary_push(ary3, elt);
    }
    return ary3;
}
(Sorry this went long...)
:) That's what I meant. When I looked up the implementation of
I know what you meant, and if it helps, it confused me too. It needs to be changed somehow: either made clear enough, or removed if it doesn't make sense for the current behavior. There's an ongoing effort to rewrite the Ruby core documentation; coincidentally, it's nowadays on
About the Set behavior...
[ 1, 1, 2, 2, 3, 3, 4, 5 ].to_set - [ 1, 2, 4 ].to_set
# => #<Set: {3, 5}>
It's tricky to explain without going too deep into mathematics, but I'd say the gist is: the subtraction operator is performed after applying the conversion to
if (RARRAY_LEN(ary1) <= SMALL_ARRAY_LEN || RARRAY_LEN(ary2) <= SMALL_ARRAY_LEN) {
    for (i=0; i<RARRAY_LEN(ary1); i++) {
        VALUE elt = rb_ary_elt(ary1, i);
        if (rb_ary_includes_by_eql(ary2, elt)) continue;
        rb_ary_push(ary3, elt);
    }
    return ary3;
}
IMO, it's kind of different. Both array branches (small arrays, < 16, or big ones, > 16) do the same thing: add A[i] to C if A[i] is not in B; for
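The Array#- and Set#- results quoted in this thread can be reproduced with a few lines of Ruby. This is only a sketch of the observable behavior, not of the C implementation; the helper name difference_by_eql is made up for illustration and simply mirrors the "add to C if A[i] is not in B" loop quoted above.

```ruby
require 'set'

# Hypothetical helper (not part of Ruby): a plain-Ruby rendering of the
# small-array loop quoted above. An element of a is kept unless some
# element of b is eql? to it.
def difference_by_eql(a, b)
  a.reject { |elt| b.any? { |other| other.eql?(elt) } }
end

a = [1, 1, 2, 2, 3, 3, 4, 5]
b = [1, 2, 4]

array_result = a - b                 # duplicates of surviving elements are kept
set_result   = a.to_set - b.to_set   # a Set never holds duplicates

p array_result              # prints [3, 3, 5]
p difference_by_eql(a, b)   # prints [3, 3, 5]
p set_result.to_a           # prints [3, 5]
```

In both cases every occurrence of 1, 2 and 4 is removed; the only visible difference is that the Array result still contains 3 twice.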
Looks like an error would happen with
[1, 2, 3].substract! [4]
b7b9911
to
29f0f0a
Compare
Thanks, glad I'm not the only one 😄
Wow, Burdette Lamar is doing some excellent work there! Thanks for pointing that out. I hope this part of the docs can be fixed too as part of that effort.
Not sure if I quite get your point here, but at least we agree that it is confusing. 😄 When you say the
[ 1, 1, 2, 2, 4, 5 ].to_set - [ 1, 2, 4 ].to_set
# => #<Set: {5}>
So sure, it makes sense to give that result when you explicitly convert the objects to
[ 1, 1, 2, 2, 4, 5 ] - [ 1, 2, 4 ]
#=> [ 5 ]
I think @olivierlacan said it best when he said:
and:
It occurred to me that this would be easier to think about and explain if we actually split this into 2 operations, which I'm tentatively calling:
This would fill in 3 holes in this table...
I don't love the name Downsides of name
This operation really should be the operation that bears the name So, is I still like the name |
How about And thinking about that option more, it doesn't seem as objectionable to me as it once did:
|
29f0f0a
to
066c8f5
Compare
Motivation
There are many cases when
'racecar'.chars). (If they weren't important, why aren't you using Set?)
So why isn't there an easy way to remove an element from such an array in a way that respects both the order and number (count) of each element? Why do all methods for removing elements from an array assume that you always want to remove all matching elements from the receiver, with no option to do otherwise?
I could not find any way to do this with the standard library. (Let me know if I'm missing something.) You would think this would be possible with Arrays, which do allow you to have duplicate values (as opposed to Sets, which do not).
One might even think (as I did until I tried it) that this would be the behavior that the built-in Array#- would give us, but alas it is not.
This is similar to Array#- and Array#difference, except that instead of removing all matches, it only removes as many occurrences as you actually ask it to subtract.
Name
I'm not sure of the best name for this, but for now I'm calling it subtract since it's a nice and simple name.
I would prefer to call it Array#- but that operator is obviously taken by the language itself.
subtract?
The biggest downside I can think of for subtract is that one might assume that it is completely synonymous and identical to the Array#- operator, just like Set#- is identical to Set#subtract (technically dup.subtract). (I was going to point to String#- as an example (from string/remove.rb / string/op_sub.rb), but I guess that is aliased as remove rather than subtract.)
The other downside is that Array#subtract would be inconsistent with Set#subtract in that Set#subtract modifies the receiver but Array#subtract does not (it does a dup). I would like to have both an in-place modification version and a dup version, so I was thinking subtract! and subtract. But that's not consistent with the analogs in Set: subtract (in-place modification) and - (dup). I guess that's further evidence that users should rightfully expect subtract to have the same semantics as - (the only allowed difference being in-place modification).
So as much as I like the name subtract, I think I may have reluctantly talked myself out of that name, since I value consistency of semantics with built-in method names more highly.
One thing I like about the name subtract is that it still has the connotation of the arithmetic operation of subtraction, which "represents the operation of removing objects from a collection", and I believe this implementation behaves more like the intuitive understanding of subtraction than the built-in Array#- does. If you have 3 apples (['apple']*3) and you take away 2 of them (['apple']*2), you should be left with 1 apple (['apple']*1), not 0 apples. Yet:
Wat. Therefore, I propose subtract:
Doesn't read as nicely as an operator would, but what alternative do we have? The only obvious operator, -, is taken. ... Unless we could abuse -@ somehow to our benefit?
Comparison table
A table might be helpful...
Operation | In-place modification | Copy (dup)
Set: operation to remove all occurrences of a single element/condition | delete, delete_if | reject
Set: operation to remove all occurrences of every element in object | subtract | -
Array: operation to remove all occurrences of a single element/condition | delete, delete_at, delete_if | reject
Array: operation to remove all occurrences of every element in object | delete_values (Facets) | -, difference, without (ActiveSupport)
Array: operation to remove only first/n occurrences of a single element/condition | delete_first (proposed) |
Array: operation to remove only first/n occurrences of every element in object | delete_first_each! (proposed) | delete_first_each (proposed)
I guess the analog to Set#subtract in Array is delete_values (Facets).
So I propose adding subtract as alias to delete_values and a new method, perhaps subtract_preserving_duplicates for the new operation I am proposing.
The semantics of generic operations like - are open to interpretation
The problem with generic-sounding methods like subtract is that everyone may have their own assumptions about what it will/should do.
Hash#- is a good example of this. But unlike Hash#-, which probably has dozens of ways it could behave, I can only think of 2 main plausible variations of Array#-: either remove all matches or only one of each el in other_array.
Other names considered
subtract_first_occurrence
subtract_first / remove_first
subtract_once / delete_each_once / remove_once
delete_only / remove_only / subtract_only (to contrast with the implied all (delete_all, subtract_all) implied by delete, -, etc.)
  other_array argument, and not the other_array argument itself. We would almost have to change the method signature to delete_only(*elements) instead of accepting an array/enumerable, in order for this name to make sense.
elementwise_delete_once
subtract_respecting_counts
subtract_preserving_duplicates / difference_preserving_duplicates
subtract_occurrences
subtract_n (as in, "only n occurrences of each element in other")
subtract_allowing_for_dups / diff_allowing_for_dups
delete_each_from
  delete, but delete removes all copies of el, so this name is ambiguous)
delete_each_first
  delete semantics...
delete_first_each
  delete_first, then it may make it clearer that it uses the delete_first operation for each, rather than delete for each.
remove
  - like subtract does.
  String#remove
  delete does). But as a counterexample, String#remove doesn't modify the receiver (there is a #remove! variant that does).
  String#remove, then that suffers from the same problem that the subtract <=> - analogy suffers from: String#remove removes all occurrences (gsub).
Looking at other synonyms for "subtract"...
whittle
downsize
Perhaps you can think of a better name?
How can we make the semantics clear without being wordy?
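Whatever name wins, the operation described in this proposal can be sketched in a few lines of Ruby. This is a minimal illustration and not the code in this PR: it borrows the delete_first, delete_first_each! and delete_first_each names from the "proposed" entries in the comparison table, and reopening Array is an assumption made only to keep the example short.

```ruby
# Hypothetical sketch, not the PR's actual implementation.
class Array
  # Remove only the first element equal (==) to value, in place.
  # Returns the removed element, or nil when nothing matched.
  def delete_first(value)
    i = index(value)
    delete_at(i) if i
  end

  # For each element of other, remove one matching occurrence, in place.
  def delete_first_each!(other)
    other.each { |el| delete_first(el) }
    self
  end

  # Non-destructive variant: works on a copy and leaves the receiver alone.
  def delete_first_each(other)
    dup.delete_first_each!(other)
  end
end

apples = ['apple'] * 3
p apples - (['apple'] * 2)                    # built-in Array#-: prints []
p apples.delete_first_each(['apple'] * 2)     # prints ["apple"]
p [1, 1, 2, 2, 3, 3, 4, 5].delete_first_each([1, 2, 4])   # prints [1, 2, 3, 3, 5]
p [1, 2, 3].delete_first_each!([4])           # absent elements are skipped: prints [1, 2, 3]
```

One design difference worth noting: this sketch finds matches with Array#index, which compares with ==, whereas the built-in Array#- compares elements using hash and eql?.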