[Bug probably?] `nn.Linear` should set `self.bias` to `None` when initializer is called with param `bias=False` #780
-
**Info**

Not entirely sure if this is a bug. Thought of opening up a discussion to kick off an initial conversation. It seems to me that:

```python
import mlx.core as mx
import mlx.nn as nn

linear = nn.Linear(5, 10, bias=False)
linear.bias
""" Throws error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/adi/miniconda3/envs/mlx-examples/lib/python3.11/site-packages/mlx/nn/layers/base.py", line 137, in __getattr__
    super(Module, self).__getattr__(key, val)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'super' object has no attribute '__getattr__'
"""
```

Something similar in PyTorch seems to not throw the error:

```python
>>> import torch
>>> import torch.nn as tnn
>>> tlinear = tnn.Linear(5, 10)
>>> tlinear
Linear(in_features=5, out_features=10, bias=True)
>>> tlinear.bias
Parameter containing:
tensor([-0.2372, 0.0079, 0.3954, -0.2948, -0.3942, -0.0641, -0.0758, -0.3202,
        0.2919, -0.2903], requires_grad=True)
>>> tlinear = tnn.Linear(5, 10, bias=False)
>>> tlinear.bias
## Notice no error; just empty output ##
>>> tlinear.bias == None
True
```

**Potential Fix**

Perhaps a simple:

```python
if bias:
    self.bias = mx.random.uniform(
        low=-scale,
        high=scale,
        shape=(output_dims,),
    )
else:
    self.bias = None
```
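With that change (assuming it lands in `Linear.__init__`), accessing `linear.bias` would mirror the PyTorch session above. A minimal sketch of the expected behaviour:

```python
import mlx.nn as nn

linear = nn.Linear(5, 10, bias=False)
# expected: True instead of the AttributeError shown in the Info section
print(linear.bias is None)
```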
**Why does it matter? Why do I ask?**

Or, for more context: I'm experimenting with building a

```python
class BitLinear(nn.Linear):
    def __init__(
        self,
        in_features: int,
        out_features: int,
        bias: bool = True,
        num_groups: int = 1,
        b: int = 8,
    ):
        super().__init__(in_features, out_features, bias)

    # ........... truncated ...............

    def __call__(self, x: mx.array) -> mx.array:
        x = self.norm(x)
        binarized_weights = self.binarize_weights_groupwise()
        # Perform linear transformation
        # output = mx.nn.functional.linear(x, binarized_weights, self.bias)
        if self.bias:  # this access raises the AttributeError above when bias=False
            output = mx.addmm(self.bias, x, binarized_weights.T)
        else:
            output = x @ binarized_weights.T
        # Quantize activations
        output = self.quantize_activations_groupwise(output)
        # Dequantization according to Eq.(11)
        output *= self.beta * self.gamma / self.Qb
        return output
```
-
Does something like the following work well enough for you?

```python
if "bias" in self:
    # do addmm
else:
    # matmul only
```

That said, I'm not against defaulting `bias` to `None`.
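For reference, a sketch of the `__call__` above rewritten around that membership check; `norm`, `binarize_weights_groupwise`, `quantize_activations_groupwise`, `beta`, `gamma`, and `Qb` are attributes of the (truncated) `BitLinear` class and are assumed to exist as in the original post:

```python
import mlx.core as mx

def __call__(self, x: mx.array) -> mx.array:
    x = self.norm(x)
    binarized_weights = self.binarize_weights_groupwise()
    # membership check instead of reading self.bias, so nothing is looked up
    # on a parameter that was never created when bias=False
    if "bias" in self:
        output = mx.addmm(self.bias, x, binarized_weights.T)
    else:
        output = x @ binarized_weights.T
    output = self.quantize_activations_groupwise(output)
    # dequantization scaling as in the original post
    return output * self.beta * self.gamma / self.Qb
```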
-
I would not say it is a bug but rather a choice, maybe a bad one but a choice nonetheless. Namely, we choose to not add it and then we do the check using `"bias" in self`, as is done in `nn.Linear`.

Now having said that, I see that you are indeed encountering a bug which is fixed on `main` (but not on v0.5.0). This function should be calling `__getattribute__` and not `__getattr__` on `super()`.
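For what it's worth, a minimal MLX-free sketch of that difference; the `Broken`/`Fixed` classes below are made up for illustration (the real code lives in `mlx/nn/layers/base.py`):

```python
class Base:
    pass

class Broken(Base):
    def __getattr__(self, key):
        # object defines __getattribute__ but not __getattr__, so this lookup
        # on the super() proxy itself fails and masks the real missing attribute
        return super().__getattr__(key)

class Fixed(Base):
    def __getattr__(self, key):
        # __getattribute__ exists on every object, so the caller gets a normal
        # AttributeError naming the attribute that is actually missing
        return super().__getattribute__(key)

try:
    Broken().missing
except AttributeError as e:
    print(e)  # 'super' object has no attribute '__getattr__'

try:
    Fixed().missing
except AttributeError as e:
    print(e)  # 'Fixed' object has no attribute 'missing'
```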
-
Ack. I will defer to you folks' design decision here. It feels intuitive to have `bias` be `None` when `bias=False`, though.

Ah, yes, indeed. I missed that earlier when I was scanning through `linear.py`.