Commit 221041b

Updated
0 parents  commit 221041b

4 files changed

+106
-0
lines changed

README.md

+70
@@ -0,0 +1,70 @@
<h1 align="center">Scrape Twitter JSON to HTML Table</h1><br>

<p align="center">
This script shows how to scrape your tweets by date or year using the open-source project Twint and turn them into a beautiful HTML table. No API key is required.
</p>
<br>

<p align="center">
Twint: https://github.com/twintproject/twint
</p>
<p align="center">
json2html: https://github.com/softvar/json2html
</p>
<br>

<p align="center">
Video Usage Instructions on Awesome Dev Notes YouTube: https://youtu.be/042-QIhC5ms
</p>

## Installing Twint

Make sure you have Python and pip installed.
```
pip3 install --user --upgrade git+https://github.com/twintproject/twint.git@origin/master#egg=twint
```

## Run Twint

The parameters are explained in the video and in the [Twint wiki](https://github.com/twintproject/twint/wiki).

```
twint -u androiddevnotes --since 2020-05-20 --until 2021-03-01 -o twint_androiddevnotes.json --json
```

The output file, twint_androiddevnotes.json, contains the scraped tweets as JSON data.
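
filter.py loads its input with a single `json.load` call and reads a top-level `items` list. Twint's `--json` output appears to be written as one JSON object per line, which is an assumption you should check against your own file. If that is the case, a small conversion step like the sketch below could bridge the two. The function name `jsonl_to_items` is hypothetical and not part of this repository.

```python
import json


def jsonl_to_items(text):
    """Wrap JSON Lines text in the {'items': [...]} shape that filter.py reads.

    Assumption: each non-empty line of `text` is one complete JSON object.
    """
    # Split on '\n' only, so unusual Unicode line separators inside a
    # tweet's text do not cut a JSON object in half.
    items = [json.loads(line) for line in text.split('\n') if line.strip()]
    return {'items': items}


if __name__ == '__main__':
    # Hypothetical two-tweet sample standing in for a Twint output file.
    sample = '{"id": 1, "tweet": "hello"}\n{"id": 2, "tweet": "world"}\n'
    print(json.dumps(jsonl_to_items(sample), indent=2))
```

To use it on a real file, read the file's text with `encoding='utf-8'`, pass it to the function, and write the result out with `json.dump` before running filter.py.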

## Watch the [Video]() to see how to turn the scraped Twitter JSON data into an HTML table.

The video shows how to use filter.py and jsonhtml.py.

<br>

<p align="center">
<a href="#"><img alt="Twitter Badge" src="https://badgen.net/badge/Platform/Twitter?icon=https://raw.githubusercontent.com/androiddevnotes/learn-jetpack-compose-android/master/assets/android.svg&color=3ddc84"/></a>
<a href="https://github.com/androiddevnotes"><img alt="androiddevnotes GitHub badge" src="https://badgen.net/badge/GitHub/androiddevnotes?icon=github&color=24292e"/></a>
</p>

<br>
<p align="center">
<img src="assets/twint_androiddevnotes.png" alt="twint awesomedevnotes - androiddevnotes youtube thumbnail">
</p><br>

filter.py: Keeps only the selected fields of each tweet. Edit the script to drop fields you don't need or to add ones you do.

jsonhtml.py: Converts the JSON data to an HTML table. It works without modification.
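
One way to make the field selection easier to adjust is to keep the wanted fields in a single mapping, so that adding or removing a field is a one-line change. The sketch below is not the repository's filter.py. `FIELDS` and `pick_fields` are hypothetical names, and using `.get()` (so a missing key becomes `None` instead of raising `KeyError`) is a choice made for this sketch only.

```python
# Hypothetical mapping of output key -> key in the scraped tweet.
# Edit this dict to add or remove fields.
FIELDS = {
    'created': 'created_at',
    'tweet': 'tweet',
    'link': 'link',
    'likes_count': 'likes_count',
}


def pick_fields(item, fields=FIELDS):
    """Return a new dict holding only the mapped fields of one tweet."""
    # .get() yields None for a missing key instead of raising KeyError.
    return {out_key: item.get(in_key) for out_key, in_key in fields.items()}


if __name__ == '__main__':
    # Made-up tweet used only to demonstrate the call.
    tweet = {'created_at': '2021-01-01 10:00:00', 'tweet': 'hi',
             'link': 'https://twitter.com/example/status/1', 'language': 'en'}
    print(pick_fields(tweet))
```

Fields that are absent from `FIELDS`, such as `language` in the sample, are dropped from the result.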

<br>

## :computer: Find us on

<div align="center">
<a href="https://github.com/androiddevnotes"> GitHub </a> / <a href="https://discord.gg/vBnEhuC"> Discord </a> / <a href="https://twitter.com/androiddevnotes"> Twitter </a> / <a href="https://www.instagram.com/androiddevnotes"> Instagram </a> / <a href="https://www.youtube.com/channel/UCQATLaT0xKkSm-KKVQzpu0Q"> YouTube </a> / <a href="https://medium.com/@androiddevnotes"> Medium </a>
<br><br>
<img width="320px" src="https://raw.githubusercontent.com/androiddevnotes/androiddevnotes/master/assets/androiddevnotes.png" alt="androiddevnotes logo">
</div>

assets/twint_androiddevnotes.png

101 KB

filter.py

+18
@@ -0,0 +1,18 @@
import argparse
import json

parser = argparse.ArgumentParser()
parser.add_argument('-i', required=True, help='path of input json')
parser.add_argument('-o', required=True, help='path of output json')

# Keys copied unchanged from each tweet; 'created_at' is renamed to 'created'.
KEYS = ['tweet', 'link', 'likes_count', 'retweets_count', 'replies_count',
        'quote_url', 'reply_to', 'mentions', 'urls', 'photos']


def main(i, o):
    # The input must be a JSON object with an 'items' list of tweets.
    with open(i, encoding='utf-8') as f:
        data = json.load(f)
    new_data = {'items': [
        {'created': item['created_at'], **{key: item[key] for key in KEYS}}
        for item in data['items']
    ]}
    with open(o, 'w', encoding='utf-8') as f:
        json.dump(new_data, f, indent=2)
    print('done')


if __name__ == '__main__':
    args = parser.parse_args()
    main(args.i, args.o)

jsonhtml.py

+18
@@ -0,0 +1,18 @@
import argparse
import json

from json2html import json2html

parser = argparse.ArgumentParser()
parser.add_argument('-i', required=True, help='path of input json')
parser.add_argument('-o', help='path of output html (prints to stdout if omitted)')


def main(i, o):
    with open(i, encoding='utf-8') as f:
        data = json.load(f)
    html = json2html.convert(json=data)
    if o:
        # Write the table to the requested file instead of ignoring -o.
        with open(o, 'w', encoding='utf-8') as f:
            f.write(html)
    else:
        # Print text rather than a bytes literal so the output can be redirected.
        print(html)


if __name__ == '__main__':
    args = parser.parse_args()
    main(args.i, args.o)
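
To illustrate the kind of transformation jsonhtml.py delegates to json2html, here is a standard-library-only sketch that renders a list of flat dicts as an HTML table. It is not json2html's algorithm, and its markup will differ from what json2html produces. `items_to_table` is a hypothetical helper, and it assumes the column names can be taken from the first item's keys.

```python
from html import escape


def items_to_table(items):
    """Render a list of dicts as an HTML table string.

    Assumption: column names come from the first item's keys.
    """
    if not items:
        return '<table></table>'
    columns = list(items[0])
    head = ''.join(f'<th>{escape(str(col))}</th>' for col in columns)
    rows = []
    for item in items:
        # Missing keys render as empty cells; values are HTML-escaped.
        cells = ''.join(f'<td>{escape(str(item.get(col, "")))}</td>' for col in columns)
        rows.append(f'<tr>{cells}</tr>')
    return f'<table><tr>{head}</tr>{"".join(rows)}</table>'


if __name__ == '__main__':
    # Made-up item in the shape filter.py writes under 'items'.
    print(items_to_table([{'tweet': 'hello <world>', 'likes_count': 3}]))
```

Escaping each value matters here because tweet text can contain characters such as `<` and `&` that would otherwise be read as markup.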
