getting_started/python-client/lesson_2.md
+13 -13 (13 additions & 13 deletions)
@@ -1,12 +1,12 @@
1
1
# Lesson 2 - Importing a CSV file into the database
2
2
3
-
> **_NOTE:_** from version 10.1.0 the cli command is `tdbpy` instead of `terminusdb`
3
+
> **_NOTE:_** from version 10.1.0 the CLI command is `tdbpy` instead of `terminusdb`
4
4
5
5
## The `tdbpy importcsv` Command
6
6
7
-
In this lesson, we will try to import a CSV file with the `tdbpy importcsv` command. It provide a very simple way to import a CSV with less manual effort. There are a few things that you can control with the `tdbpy importcsv` command, like setting the separator character, how to handle NAs and linking columns using a key columns etc. For more complicated data handling, a Python script will be needed and we will demonstrate that the next lesson: [Importing data form Python script](lesson_3.md)
7
+
In this lesson, we will import a CSV file with the `tdbpy importcsv` command. It provides a very simple way to import a CSV. There are a few things that you can control with the `tdbpy importcsv` command, such as setting the separator character, how to handle NAs, and linking columns using a key column. For more complicated data handling, a Python script is needed, and we will demonstrate that in lesson 3: [Importing data from a Python script](lesson_3.md)
8
8
9
-
To see all the available options in `tdbpy importcsv`:
9
+
The example below enables you to see all the available options in `tdbpy importcsv`:
| 004 | Fabian Dalby | Web Service Manager | IT ||
54
54
55
-
As you see there are`Employee id`which are used as a key to link the `Manager` field to the person that is the manager of that employee.
55
+
As you can see, `Employee id` is used as a key to link the `Manager` field to the person who is that employee's manager.
56
56
57
57
To link them, we must first install the `pandas` library.
58
58
```sh
@@ -68,7 +68,7 @@ Schema object EmployeesFromCSV created with Employees.csv being imported into da
68
68
Records in Employees.csv inserted as type EmployeesFromCSV into database with specified ids.
69
69
```
70
70
71
-
We have imported the CSV file with the class as `EmployeesFromCSV` you can see that there is a new class object in `schema.py` that is created as well:
71
+
We have imported the CSV file with the class as `EmployeesFromCSV`. There is a new class object in `schema.py` that was created along with the import:
In [later chapters](lesson_5.md) we will also learn how to query this data and/ or export data into CSV ([or using Singer.io to export data into other data products](https://github.com/terminusdb/terminusdb-tutorials/tree/master/google_sheets/README.md)).
89
+
In [chapter 5](lesson_5.md) we will learn how to query this data and/or export data into CSV.
getting_started/python-client/lesson_3.md
+18 -18 (18 additions & 18 deletions)
@@ -1,8 +1,8 @@
1
-
# Lesson 3 - Importing data form Python script
1
+
# Lesson 3 - Importing Data From a Python Script
2
2
3
-
> **_NOTE:_** from version 10.1.0 the cli command is `tdbpy` instead of `terminusdb`
3
+
> **_NOTE:_** from version 10.1.0 the CLI command is `tdbpy` instead of `terminusdb`
4
4
5
-
In last lesson, we have import the [Employees.csv](Employees.csv)with the `tdbpy` commmand. It auto generate the schema for you and pipe in the data form the CSV. Let's check the [schema.py](schema.py), you can see the schema that is generate from the CSV:
5
+
In the last lesson we imported [Employees.csv](Employees.csv) using the `tdbpy importcsv` command. It autogenerated the schema and piped in the data from the CSV. If we check [schema.py](schema.py), we can see the schema that was generated from the CSV:
6
6
7
7
```python
8
8
class EmployeesFromCSV(DocumentTemplate):
@@ -13,7 +13,7 @@ class EmployeesFromCSV(DocumentTemplate):
13
13
title: Optional[str]
14
14
```
15
15
16
-
It is not the same as the one we have planned in [Lesson 1](lesson_1.md):
16
+
You may have noticed that the schema is not the same as the one we talked about in [Lesson 1](lesson_1.md):
17
17
18
18
```python
19
19
class Employee(DocumentTemplate):
@@ -27,26 +27,26 @@ class Employee(DocumentTemplate):
27
27
title: str
28
28
```
29
29
30
-
because we have some data in the [Contact.csv](Contact.csv). Fetching data from different CSVs and combining them according to our schema requires more customization and it can be done by creating a Python script with the [TerminusDB Python Client](https://github.com/terminusdb/terminusdb-client-python).
30
+
This is because we had data in the [Contact.csv](Contact.csv). Fetching data from different CSVs and matching them to our schema requires a little more customization. This can be done by creating a Python script using the [TerminusDB Python Client](https://github.com/terminusdb/terminusdb-client-python).
31
31
32
32
## Creating the Python script
33
33
34
-
Let's start a new `.py` file [insert_data.py](insert_data.py). You can copy and paste the one we have in this repo or build one yourself. We will explain the one we have so you have an idea what it does.
34
+
Let's start a new `.py` file [insert_data.py](insert_data.py). You can copy and paste the one in this repo or build one yourself. We'll explain the example script so you understand what it does.
35
35
36
-
In the first half of the script, we have to manage and import the data form CSV. In Python there is the [`csv` standard library](https://docs.python.org/3/library/csv.html)where can aid reading of CSV files. Go ahead and import that:
36
+
In the first half of the script, we have to manage and import the data from the CSVs. In Python there is the [`csv` standard library](https://docs.python.org/3/library/csv.html) that helps with reading CSV files. Go ahead and import that:
37
37
38
38
```python
39
39
import csv
40
40
```
41
41
42
-
Also we need to import `WOQLClient` which is the client that communitcate with the TerminusDB/ TerminusCMS and `schema.py`:
42
+
We also need to import `WOQLClient`, which is the client that communicates with TerminusDB/TerminusCMS, and the classes from `schema.py`:
43
43
44
44
```python
45
45
from terminusdb_client import WOQLClient
46
46
from schema import *
47
47
```
48
48
49
-
At the top of the script, we prepare a few empty dictionaries to hold the data, we use dictionaries cause the keys can be the `Employees id` for easy mapping:
49
+
At the top of the script, we prepare a few empty dictionaries to hold the data. We use dictionaries because the keys can be the `Employee id` for easy mapping:
50
50
51
51
```python
52
52
employees = {}
@@ -55,7 +55,7 @@ addresses = {}
55
55
managers = {}
56
56
```
57
57
58
-
The goal is to populate the `employees` dictionaries with the `Employee` objects. To help, we also need `contact_numbers` to hold the contact numbers while reading the `Contact.csv`. The rest of the information in `Contact.csv` will be used to construct `Address` objects and stored in `addresses`. `managers` is used to store the employee id in the `Manager` column in the `Employees.csv`. We store the id at first and make the linking later cause the manager of that employee may have not been "created" yet.
58
+
The goal is to populate the `employees` dictionary with the `Employee` objects. To help, we also need `contact_numbers` to hold the contact numbers while reading `Contact.csv`. The rest of the information in `Contact.csv` will be used to construct `Address` objects, which are stored in `addresses`. `managers` is used to store the employee id found in the `Manager` column of `Employees.csv`. We store the id at first and make the link later because the manager of that employee may not have been "created" yet.
59
59
60
60
Then we go ahead and read the CSVs and do the corresponding data managing:
61
61
@@ -89,25 +89,25 @@ with open("Employees.csv") as file:
89
89
managers[row[0]] = row[4]
90
90
```
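
The reading step can also be tried in isolation. The sketch below is a minimal illustration and not the repository's `insert_data.py`: it parses an in-memory CSV with the standard `csv` module and keys each row by employee id. The column order (id, name, title, team, manager) is an assumption inferred from the `row[0]` and `row[4]` usage above, `read_rows` is a hypothetical helper, and every sample row other than 004 is made up.

```python
import csv
import io

# Hypothetical sample data; the header names and column order
# (id, name, title, team, manager) are assumptions for illustration.
SAMPLE_CSV = """Employee id,Name,Title,Team,Manager
004,Fabian Dalby,Web Service Manager,IT,
005,Alex Example,Web Developer,IT,004
"""


def read_rows(text):
    """Return (rows, managers), both keyed by employee id."""
    rows = {}
    managers = {}
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header line
    for row in reader:
        emp_id = row[0]
        rows[emp_id] = {"name": row[1], "title": row[2], "team": row[3]}
        # Keep only the manager's id for now; it may be an empty string.
        managers[emp_id] = row[4]
    return rows, managers


rows, managers = read_rows(SAMPLE_CSV)
print(rows["004"]["name"])
print(managers)
```

Keying both dictionaries by employee id is what makes the later manager lookup a plain dictionary access.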
91
91
92
-
Last, we have to make the manager links:
92
+
Finally, we have to make the manager links:
93
93
94
94
```python
95
95
for emp_id, man_id in managers.items():
96
96
if man_id:
97
97
employees[emp_id].manager = employees[man_id]
98
98
```
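
To see why the links are made in a second pass, here is a small self-contained sketch. It assumes a plain dataclass as a stand-in for the `Employee` schema class (the real one derives from `DocumentTemplate`), and the second employee is made-up sample data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Employee:
    # Stand-in for the schema class, used only for this illustration.
    name: str
    manager: Optional["Employee"] = None


# First pass: every employee object exists, but no links are set yet.
employees = {"004": Employee("Fabian Dalby"), "005": Employee("Alex Example")}
# Manager ids as read from the CSV; an empty string means "no manager".
managers = {"004": "", "005": "004"}

# Second pass: every id now resolves, whatever order the rows came in.
for emp_id, man_id in managers.items():
    if man_id:
        employees[emp_id].manager = employees[man_id]

print(employees["005"].manager.name)
```

Linking while reading would fail with a `KeyError` whenever an employee's row appears before the row of their manager.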
99
99
100
-
Now, the `employees` dictionary should be populated with the `Employee` objects that is ready to be insert into the database.
100
+
Now, the `employees` dictionary should be populated with the `Employee` objects, ready to be inserted into the database.
101
101
102
102
## Using the Python client
103
103
104
-
The next step is the insert all `Employees` into the database. But before, we need to create a client with our endpoint:
104
+
The next step is to insert all `Employee` objects into the database. But before that, we need to create a client with our endpoint:
105
105
106
106
```python
107
107
client = WOQLClient("http://127.0.0.1:6363/")
108
108
```
109
109
110
-
Then we will connect the client to our database. If you are connecting locally and use default setting, just provide the database you are connecting to:
110
+
Then we will connect the client to our database. If you are connecting locally and using the default settings, just provide the database you are connecting to:
getting_started/python-client/lesson_5.md
+14 -14 (14 additions & 14 deletions)
@@ -1,16 +1,16 @@
1
-
# Lesson 5 - Query on the database and get result back as CSV or DataFrame
1
+
# Lesson 5 - Query the database and get results back as a CSV or DataFrame
2
2
3
-
> **_NOTE:_** from version 10.1.0 the cli command is `tdbpy` instead of `terminusdb`
3
+
> **_NOTE:_** from version 10.1.0 the CLI command is `tdbpy` instead of `terminusdb`
4
4
5
-
In the past lessons, we have learnt how to build schema and import data. Now the database has all the data we wanted, what we will do after is to get information out of the data we have in the database.
5
+
In previous lessons we learnt how to build a schema and import data. The database now has all the data we wanted, so the next step is to get information out of it.
6
6
7
-
In this lesson, we will learn how to query the database, get the information that we wanted. Then export either to a CSV for later use or to import as Pandas DataFrames for further investigation.
7
+
In this lesson we will learn how to query the database, get the information we need, and export it either to a CSV or to a Pandas DataFrame.
8
8
9
-
## Query data with `tdbpy` Command
9
+
## Query Data with `tdbpy` Command
10
10
11
-
The most direct way to query on data and export it as CSV is to use the `tdbpy` command.
11
+
The most direct way to query data and export it as CSV is to use the `tdbpy` command.
12
12
13
-
If you are planning to export all documents in a particular type. You can simply use the `tdbpy exportcsv` command. Let's have a look a the command:
13
+
If you are planning to export all documents of a particular type, you can simply use the `tdbpy exportcsv` command. Let's have a look at the command:
14
14
15
15
```
16
16
$ tdbpy exportcsv --help
@@ -36,9 +36,9 @@ Now let's try to export all `Employee` to a file named `exported_employees.csv`
Inspect [exported_employees.csv](exported_employees.csv) and it look just as what we expected. All 5 employees information is there.
39
+
We'll quickly inspect [exported_employees.csv](exported_employees.csv) and can see it looks good. Information for all 5 employees is there.
40
40
41
-
But now, we want to only want the members of the IT team to be export in a CSV, we will have to do a bit of query. Let's try using the `-q`options with `tdbpy alldocs`
41
+
Say we want to export only the members of the IT team to a CSV. For that we have to write a small query. Let's try using the `-q` option with `tdbpy alldocs`:
42
42
43
43
```
44
44
$ tdbpy alldocs --type Employee -q team=it
@@ -51,22 +51,22 @@ It's a bit hard to see so we are going to export it to [a CSV](exported_it_team.
51
51
52
52
## Query data in Python script
53
53
54
-
If we want to do something more complicated, for example, see which team has longer names in average. We may have to export the result to a Pandas Dataframe and do more investigation. Let's have a look at [query_data.py](query_data.py).
54
+
If we want to do something more complicated, for example see which team has longer names on average, we can export the result to a Pandas DataFrame and do more investigation. Let's have a look at [query_data.py](query_data.py).
55
55
56
-
We can make use of the magic function `result_to_df` to convert a result json to a Pandas DataFrame:
56
+
We can make use of the magic function `result_to_df` to convert the JSON results to a Pandas DataFrame:
57
57
58
58
```python
59
59
from terminusdb_client.woqldataframe import result_to_df
60
60
```
61
61
62
-
Querying would be done by `query_document`, you will have to provide a template json that has `@type` and the specific requirement(s) (in our case, `"team": "it"` or `"team": "marketing"`).
62
+
Querying can be done with `query_document`. You will have to provide a template JSON that has `@type` and the specific requirement(s) (in our case, `"team": "it"` or `"team": "marketing"`).
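
The idea of a template can be illustrated without a running database. The sketch below is not how the client or the server implements `query_document`; it only mimics the matching rule on a list of plain dictionaries, then computes the average name length per team with the standard library instead of Pandas. `matches_template` and `average_name_length` are hypothetical helpers, and the documents are made-up sample data apart from Fabian Dalby.

```python
from collections import defaultdict
from statistics import mean

# Made-up documents shaped like the results of a document query.
docs = [
    {"@type": "Employee", "name": "Fabian Dalby", "team": "it"},
    {"@type": "Employee", "name": "Alex Example", "team": "it"},
    {"@type": "Employee", "name": "Sam Sample", "team": "marketing"},
]


def matches_template(doc, template):
    """True when every key/value pair in the template is present in the doc."""
    return all(doc.get(key) == value for key, value in template.items())


def average_name_length(documents):
    """Map each team to the mean length of its members' names."""
    lengths = defaultdict(list)
    for doc in documents:
        lengths[doc["team"]].append(len(doc["name"]))
    return {team: mean(values) for team, values in lengths.items()}


it_team = [d for d in docs if matches_template(d, {"@type": "Employee", "team": "it"})]
print(len(it_team))
print(average_name_length(docs))
```

With a DataFrame the same aggregation would typically be a group-by on the team column; the plain-Python version just keeps the example free of extra dependencies.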