Thanks, you were right :)

For anyone else who might be reading this and uses RSS:

  • I modified the parse_rss_2_0 function in the rss2schema.py file, and it all worked out well in the end :)

FYI: I found that the default RSS ingestion was not granular enough for my website (the default produced only 1 point in the db), so I further adjusted db_load.py and introduced chunking based on the article's HTML elements (<h1> to <h5>, <p>, <ul>, <table>, etc.).
This worked well: an article of approx. 3000 words ended up as 50 points in Qdrant, each with a proper schema_json, and afterwards the chat was able to pull details out of the article much better, with high precision and no errors or missing info.
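
For anyone who wants to try something similar, here is a rough sketch of what chunking by HTML elements could look like. This is not the actual change made to db_load.py; it only illustrates the idea using Python's standard html.parser. The names BlockChunker and chunk_html and the chunk fields (tag, heading, text) are made up for this example, and it assumes reasonably well-formed HTML with explicit closing tags. Embedding the chunks and writing them to Qdrant is left out.

```python
from html.parser import HTMLParser

# Block-level elements that each become one chunk (an assumption for this sketch).
HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
BLOCK_TAGS = HEADING_TAGS | {"p", "ul", "ol", "table", "pre", "blockquote"}
# Inner tags that should be separated by whitespace in the extracted text.
SEPARATOR_TAGS = {"li", "tr", "td", "th", "br"}


class BlockChunker(HTMLParser):
    """Hypothetical helper: collects one chunk per top-level block element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._stack = []    # currently open block tags
        self._buf = []      # text collected for the current chunk
        self._heading = ""  # most recent heading, kept as context

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._stack.append(tag)
        elif self._stack and tag in SEPARATOR_TAGS:
            self._buf.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self._stack and self._stack[-1] == tag:
            self._stack.pop()
            if not self._stack:  # the outermost block just closed
                self._flush(tag)

    def handle_data(self, data):
        if self._stack:  # ignore text outside block elements
            self._buf.append(data)

    def _flush(self, tag):
        text = " ".join("".join(self._buf).split())  # normalize whitespace
        self._buf = []
        if not text:
            return
        if tag in HEADING_TAGS:
            self._heading = text
        self.chunks.append({"tag": tag, "heading": self._heading, "text": text})


def chunk_html(html):
    """Return a list of chunk dicts, one per top-level block element."""
    parser = BlockChunker()
    parser.feed(html)
    parser.close()
    return parser.chunks


if __name__ == "__main__":
    sample = (
        "<h1>Title</h1><p>First <b>para</b>.</p>"
        "<h2>Details</h2><ul><li>one</li><li>two</li></ul>"
    )
    for chunk in chunk_html(sample):
        print(chunk)
```

Each chunk carries the nearest preceding heading, so a point stored from a <p> or <table> still knows which section of the article it came from. That is one possible way to keep context per chunk, not necessarily the one used here.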

Answer selected by chelseacarter29