Skip to content

"SUM()" on vops_float4 yeilds different results from HEAP #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
tvesely opened this issue Aug 16, 2019 · 1 comment
Closed

"SUM()" on vops_float4 yeilds different results from HEAP #18

tvesely opened this issue Aug 16, 2019 · 1 comment

Comments

@tvesely
Copy link
Contributor

tvesely commented Aug 16, 2019

We were running the example queries from your README, and noticed that heap and vops tables don't necessarily return the same results. When we ran this query, we noticed that the float4 columns didn't match with the equivalent query in heap:

-- Mixed mode: let's Postgres does group by and calculates VOPS aggregates for each group
select
    l_returnflag,
    l_linestatus,
    sum(l_quantity) as sum_qty,
    sum(l_extendedprice) as sum_base_price,
    sum(l_extendedprice*(1-l_discount)) as sum_disc_price,
    sum(l_extendedprice*(1-l_discount)*(1+l_tax)) as sum_charge,
    avg(l_quantity) as avg_qty,
    avg(l_extendedprice) as avg_price,
    avg(l_discount) as avg_disc,
    count(*) as count_order
from
    vops_lineitem_projection
where
    l_shipdate <= '1998-12-01'::date
group by
    l_returnflag,
    l_linestatus
order by
    l_returnflag,
    l_linestatus;

In a simplified example:

CREATE TABLE
pivotal=# create table vops_foo(a vops_float4);
CREATE TABLE
pivotal=# insert into foo select (i+.1258) from generate_series(1, 1000000)i;
INSERT 0 1000000
pivotal=# select populate(destination := 'vops_foo'::regclass, source := 'foo'::regclass, sort := 'a');
 populate
----------
  1000000
(1 row)

When we run sum(a) on foo, it gets a number that is significantly different than a sum on vops_foo:

pivotal=# select sum(a) from foo;
      sum
---------------
 5.0000986e+11
(1 row)
pivotal=# select sum(a) from vops_foo;
        sum
-------------------
 500000625015.1876
(1 row)

Why does this happen? Is vops_float4 incompatible with the standard float4? Is this a bug?

@knizhnik
Copy link
Contributor

knizhnik commented Aug 17, 2019

VOPS is calculating aggregates on float4/float8 ytpe using double,
while Postgres is accumulating result using numeric type.

Please also notice that result of sum of floating point values may depend on order of calculation.
You can see in in vanilla Postgres:

postgres=# select sum(x) from ff;
sum

499914.6792653507
(1 row)

postgres=# select sum(x) from ff;
sum

499914.6792653515
(1 row)

postgres=# select sum(x) from ff;
sum

499914.67926534865
(1 row)

You can notice small differences in results - it is because using of parallel plan for execution of this query:

postgres=# explain select sum(x) from ff;
QUERY PLAN

Finalize Aggregate (cost=11614.55..11614.56 rows=1 width=8)
-> Gather (cost=11614.33..11614.54 rows=2 width=8)
Workers Planned: 2
-> Partial Aggregate (cost=10614.33..10614.34 rows=1 width=8)
-> Parallel Seq Scan on ff (cost=0.00..9572.67 rows=416667 width=8)
(5 rows)

So some small difference in the results is not an error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants