Install the gpt4all and streamlit packages.
Place "acs2023_1yr_variables_LABEL_CONCEPT_Btables.json" in the 01_raw folder.
Place "acs2023_1yr_c_tables_with_vars_LLM_cleaned.json" in the 03_processed folder.
- Run "cleaning_script_1.py", which takes the raw B Table information as input.
- Run "Merging_B_C_tables.py", which combines the processed B Table with the pre-cleaned C Table.
- Run "llm_dev_03.py" with Streamlit: streamlit run < insert path > llm_dev_03.py [ARGUMENTS]
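Before running the steps above, it can help to confirm that both input files are in place. The following is a minimal sketch, not part of the demo itself: the helper name missing_inputs is hypothetical, and it assumes the 01_raw and 03_processed folders sit directly under the directory you run it from.

```python
from pathlib import Path

# File locations taken from the setup notes above. That both folders live
# directly under one project root is an assumption of this sketch.
REQUIRED_INPUTS = [
    Path("01_raw") / "acs2023_1yr_variables_LABEL_CONCEPT_Btables.json",
    Path("03_processed") / "acs2023_1yr_c_tables_with_vars_LLM_cleaned.json",
]


def missing_inputs(project_root="."):
    """Return the required input files that are not present under project_root."""
    root = Path(project_root)
    return [path for path in REQUIRED_INPUTS if not (root / path).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        for path in missing:
            print(f"Missing input: {path}")
    else:
        print("All input files found.")
```

If the folders live somewhere else, pass that location as project_root.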
This demo uses the Meta Llama 3 8B Instruct model, one of the best-performing open-source models (Hugging Face Meta Llama 3). It runs without a GPU, but it may run faster with one.
Both the ACS B and C tables are available for the LLM to evaluate. Because of the model's context limit, the demo passes it only 5 tables, each with the variables it covers.
The 5 tables are chosen using a fixed random seed, so the same 5 tables are selected on every run.
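The seeded selection described above can be sketched as follows. This is an illustration rather than the code in llm_dev_03.py: the function names, the seed value, and the placeholder table records are all hypothetical.

```python
import random


def select_tables(tables, k=5, seed=0):
    """Pick k table IDs reproducibly: the same seed always gives the same IDs."""
    rng = random.Random(seed)  # local generator, leaves the global random state alone
    return rng.sample(sorted(tables), k)


def build_context(tables, selected_ids):
    """Format the selected tables and their variables as plain text for a prompt."""
    lines = []
    for table_id in selected_ids:
        variables = ", ".join(tables[table_id])
        lines.append(f"{table_id}: {variables}")
    return "\n".join(lines)


# Placeholder records (hypothetical IDs and variable names, not real ACS content).
example_tables = {
    f"TABLE_{i:02d}": [f"variable_{i}_a", f"variable_{i}_b"] for i in range(12)
}

chosen = select_tables(example_tables, k=5, seed=42)
print(build_context(example_tables, chosen))
```

Sorting the IDs before sampling keeps the result independent of the order in which the tables were loaded, so only the seed determines which 5 tables are used.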
The table records currently list only the variables each table covers. They could be expanded to include each table's universe and geographic limits.
The context limit could be worked around with a vector database, which would retrieve only the tables most relevant to a given question.
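To illustrate the retrieval idea, here is a rough sketch that ranks tables by similarity to the question and keeps only the top few. It does not use a real vector database or learned embeddings: plain word counts and cosine similarity stand in for them, and every name and record below is hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase word counts; a crude stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_tables(question, tables, k=5):
    """Return the k table IDs whose variable text is most similar to the question."""
    query = bag_of_words(question)
    scored = [
        (cosine(query, bag_of_words(" ".join(variables))), table_id)
        for table_id, variables in tables.items()
    ]
    # Highest score first; ties are broken by table ID so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [table_id for _, table_id in scored[:k]]


# Placeholder records (hypothetical, not real ACS content).
example_tables = {
    "TABLE_A": ["median household income"],
    "TABLE_B": ["commute time to work"],
    "TABLE_C": ["household income by age of householder"],
}
print(top_tables("What is the household income?", example_tables, k=2))
```

With retrieval in place, the tables passed to the model would depend on the question rather than on a fixed random sample.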