SQL like joins in datalog #2523
Unanswered
sreevyshnavi02
asked this question in
Q&A
Replies: 3 comments 4 replies
-
Hi, while i'm not an expert in SQL, the standard joins, you should be able
to simulate using equijoins/antijoins and disjunction provided by datalog.
e.g.,
Res(x) :- Table1(x), Table2(x).
Res(x) :- Table1(x).
would be something like a left join.
…On Thu, Dec 26, 2024 at 3:59 AM C. SREE VYSHNAVI ***@***.***> wrote:
Does souffle support the use of joins in datalog like the joins supported
in SQL?
—
Reply to this email directly, view it on GitHub
<#2523>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAC3OOKL27DMHKF4R2JH3ND2HNWHJAVCNFSM6AAAAABUGTATKGVHI2DSMVQWIX3LMV43ERDJONRXK43TNFXW4OZXG42DIMRUGI>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Beta Was this translation helpful? Give feedback.
1 reply
-
the error is because CustomerName is ungrounded. In datalog every attribute
in an IDB relation needs to be bound to an EDB relation or a constant. So
for example if you want
values to be NULL if the don't exist in an EDB relation, you need to do
this explicitly. e.g.,
LeftJoin(CustomerID, "NULL", OrderID, OrderDate) :-
Order(CustomerID, OrderID, OrderDate).
…On Fri, Dec 27, 2024 at 12:54 PM C. SREE VYSHNAVI ***@***.***> wrote:
Hello! This is the example I was trying with datalog for left join. The
inner join works as expected but I am unable to achieve a left join where
there should be NULL for some values (OrderID and OrderDate should be NULL
for customerID 2). How can I achieve this in datalog?
*Table 1: Customer*
CustomerID,CustomerName
1,Alice
2,Bob
3,Charlie
4,David
*Table 2: Order*
OrderID,CustomerID,OrderDate
101,1,2024-01-15
102,3,2024-02-20
103,1,2024-03-05
104,4,2024-04-10
*Datalog program:*
// defining relations (tables)
.decl Customer(CustomerID:number, CustomerName:symbol)
.decl Order(OrderID:number, CustomerID:number, OrderDate:symbol)
.decl InnerJoin(CustomerID:number, CustomerName:symbol, OrderID:number, OrderDate:symbol)
.decl LeftJoin(CustomerID:number, CustomerName:symbol, OrderID:number, OrderDate:symbol)
// input the source files for both the tables
.input Customer(filename="customer.csv", delimiter=",", headers=true)
.input Order(filename="order.csv", delimiter=",", headers=true)
InnerJoin(CustomerID, CustomerName, OrderID, OrderDate) :-
Customer(CustomerID, CustomerName),
Order(OrderID, CustomerID, OrderDate).
.output InnerJoin(delimiter=",")
LeftJoin(CustomerID, CustomerName, OrderID, OrderDate) :-
Customer(CustomerID, CustomerName),
Order(OrderID, CustomerID, OrderDate).
LeftJoin(CustomerID, CustomerName, OrderID, OrderDate) :-
Order(CustomerID, OrderID, OrderDate).
.output LeftJoin(delimiter=",")
This code returns an error in the left join relation.
Error: Ungrounded variable CustomerName in file prgm.dl at line 21
LeftJoin(CustomerID, CustomerName, OrderID, OrderDate) :-
---------------------^------------------------------------
Warning: Variable CustomerName only occurs once in file prgm.dl at line 21
LeftJoin(CustomerID, CustomerName, OrderID, OrderDate) :-
---------------------^------------------------------------
—
Reply to this email directly, view it on GitHub
<#2523 (reply in thread)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAC3OONNAD2RX7ATH3CJRBL2HU5YJAVCNFSM6AAAAABUGTATKGVHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTCNRXG4YDGMY>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Beta Was this translation helpful? Give feedback.
1 reply
-
Hi ,
Im not aware of a null keyword in souffle. It could be implemented with a bit of work. The easiest is for each type to designate a null value. This in rare circumstances might be problematic E.g., “Null” or -1 is a legit value for string and int types respectively. Perhaps there is some type system magic that can be done, maybe others can chime in regarding that.
Get Outlook for iOS<https://aka.ms/o0ukef>
…________________________________
From: C. SREE VYSHNAVI ***@***.***>
Sent: Saturday, December 28, 2024 10:50:03 AM
To: souffle-lang/souffle ***@***.***>
Cc: Pavle Subotic ***@***.***>; Comment ***@***.***>
Subject: Re: [souffle-lang/souffle] SQL like joins in datalog (Discussion #2523)
Hello! As you suggested, I tried using "NULL" in the LeftJoin. That worked. However, when I modified it to use two "NULL" I am getting an error.
Error: Symbol constant (type mismatch) in file prgm.dl at line 21
LeftJoin(CustomerID, CustomerName,"NULL","NULL") :-
----------------------------------^-----------------
1 errors generated, evaluation aborted
I understand the error arises because OrderID is declared as number type but is being given the value of "NULL" here. But, is there any keyword like NULL or NA in datalog that can be used in place of a string "NULL" to indicate that there is no data in the cell?
When I declared OrderID to be of type symbol, I could avoid the type mismatch at "NULL". However, this is the output I got.
1,Alice,NULL,NULL
1,Alice,101,2024-01-15
1,Alice,103,2024-03-05
2,Bob,NULL,NULL
3,Charlie,NULL,NULL
3,Charlie,102,2024-02-20
4,David,NULL,NULL
4,David,104,2024-04-10
I wanted to perform left join with the Customer relation, hence set the OrderID and OrderDate to NULL. The expected output of this left join is -
CustomerID,CustomerName,OrderID,OrderDate
1,Alice,101,2024-01-15
1,Alice,103,2024-03-05
2,Bob,NULL,NULL
3,Charlie,102,2024-02-20
4,David,104,2024-04-10
The CustomerIDs that are not part of the Order relation only should appear with NULL for OrderID and OrderDate. How can I achieve something like this in datalog?
—
Reply to this email directly, view it on GitHub<#2523 (reply in thread)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AAC3OOJM3GCSR6VGVQPYAEL2HZX4XAVCNFSM6AAAAABUGTATKGVHI2DSMVQWIX3LMV43URDJONRXK43TNFXW4Q3PNVWWK3TUHMYTCNRYGM3TOOA>.
You are receiving this because you commented.Message ID: ***@***.***>
|
Beta Was this translation helpful? Give feedback.
2 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Does souffle support the use of joins in datalog like the joins supported in SQL?
Beta Was this translation helpful? Give feedback.
All reactions