The RPPL Visualizer is a data visualization and aggregation tool designed to support professional learning through interactive dashboards, filtering, and comparative analytics. It includes both a Visualizer interface for exploring survey datasets and an Admin Panel for managing user access to CSV files. This project is tailored to run within the Stronghold secure computing environment.
Demo video: `RPPLVisualizerAndConverter.mp4`
- Use your Brown credentials (or any authorized credentials) to access the Stronghold environment via Big-IP.
- Link: https://www.f5.com/trials/big-ip-virtual-edition
- After connecting through Big-IP, open Remote Desktop Connection (pre-installed on Windows) and connect to your assigned Stronghold machine.
- Log in using the same credentials.
- Visit https://app.globus.org and log in with your Globus account.
- Search for and connect to BrownU_SH_PAPAY_IMPORT (a guest collection under BrownU_SH_LOEB).

- Upload the contents of this repository (or the appropriate folder).
- The uploaded files will appear in the mapped network drive on your Stronghold machine.
- Copy the uploaded folder from the network drive to your Stronghold Desktop (or another working directory).
- Open the `RPPLVisualizer.bat` file with a text editor (e.g., Notepad).
- Run `ipconfig` in Command Prompt to find your computer’s IPv4 address.
- Replace the placeholder in the batch file with your IP:

  ```bat
  @echo off
  cd /d %~dp0
  start python libraries\server.py
  timeout /t 2
  start http://192.168.1.123:8000/pages/RPPL_LocalVisualizerCORS.html
  ```

  Replace `192.168.1.123` with the actual IPv4 address of your machine.
- Double-click `RPPLVisualizer.bat`. The server will start and begin listening on the specified IP.
- Each Stronghold user who needs access to the visualizer must have a copy of the `client/` folder. You can find this inside the main repository.
- Open the `start_visualizer.bat` file inside the `client` folder with a text editor.
- Update the line below to match the server’s IP:

  ```bat
  start http://192.168.1.123:8000/pages/RPPL_LocalVisualizerCORS.html
  ```

  Replace `192.168.1.123` with the IP address of the server machine (from the `ipconfig` command run on the server).
- Double-click `start_visualizer.bat`. The client will open in a browser window and connect to the visualizer server.
The Visualizer automatically detects the logged-in Stronghold user and displays only the datasets that user has access to, as determined by the Access Control Matrix (admin.html). It provides a secure, user-friendly interface to explore survey results with robust filtering and aggregation tools.
- **Data Source Selection**: Users can select from multiple available datasets. Each user only sees the files they’ve been granted access to via the admin panel.
- **Cascading Filters**:
  - Primary and Secondary Filters: Filter responses based on demographic or categorical values such as organization, role, and region. The secondary filter list dynamically updates based on your primary filter selection.
  - Outcome Filter: Select a particular outcome or question of interest to analyze as the target variable in the graph.
- **Aggregation Options**: Choose how data is aggregated using:
  - Mean
  - Median
  - Mode
  - Frequency

  Not all aggregation types apply to every dataset (e.g., median may not apply to categorical variables).
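The four aggregation modes can be sketched in a few lines of Python. This is an illustrative helper, not the Visualizer's actual implementation (which runs client-side); the function name `aggregate` is an assumption for this example.

```python
from collections import Counter
from statistics import mean, median, mode

def aggregate(values, how):
    """Aggregate a list of survey responses; numeric modes reject categorical data."""
    if how == "frequency":
        return dict(Counter(values))          # counts work for any value type
    numeric = [float(v) for v in values]      # raises ValueError for categorical cells
    if how == "mean":
        return mean(numeric)
    if how == "median":
        return median(numeric)
    if how == "mode":
        return mode(numeric)
    raise ValueError(f"unknown aggregation: {how}")

print(aggregate(["3", "4", "4", "5"], "mean"))   # 4.0
print(aggregate(["a", "b", "a"], "frequency"))   # {'a': 2, 'b': 1}
```

Note how the `ValueError` on `float(...)` mirrors the caveat above: median, mean, and mode simply cannot be computed for categorical variables.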
- **Within-Org and Across-Org Averages**: Graphs include two visual average lines:
  - A dashed line representing the average response for the current organization.
  - A dotted line of x’s representing the average of all other organizations with a matching questionnaire structure.

  To enable this feature, ensure that datasets belonging to the same questionnaire are stored in the `/data/` folder with the same base filename and differing numeric suffixes (e.g., `SurveyA.csv`, `SurveyA2.csv`, `SurveyA3.csv`, etc.).
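The base-filename-plus-numeric-suffix convention can be sketched as a small grouping function. This is a hypothetical illustration of the matching rule described above, not the Visualizer's actual code:

```python
import re
from collections import defaultdict

def questionnaire_groups(filenames):
    """Group CSV filenames that share a base name and differ only by a
    trailing numeric suffix (SurveyA.csv, SurveyA2.csv, ...)."""
    groups = defaultdict(list)
    for name in filenames:
        m = re.fullmatch(r"(.+?)\d*\.csv", name)  # strip any trailing digits
        if m:
            groups[m.group(1)].append(name)
    return dict(groups)

files = ["SurveyA.csv", "SurveyA2.csv", "SurveyA3.csv", "SurveyB.csv"]
print(questionnaire_groups(files))
# {'SurveyA': ['SurveyA.csv', 'SurveyA2.csv', 'SurveyA3.csv'], 'SurveyB': ['SurveyB.csv']}
```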
- **Preset Saving & Loading**:
  - Users can export their current visualization setup as a `.json` preset file.
  - Import previously saved presets to quickly reload a specific chart configuration and filters.
Demo video: `RPPLAccessControlMatrixDemo.mp4`
The Admin Panel (admin.html) provides fine-grained access control over which datasets users can see.
- **User Permissions**: Grant or revoke access to specific `.csv` data files on a per-user basis.
- **Automatic File Detection**: Any `.csv` file added to the `/data/` directory is automatically detected by the panel and added as a new permission toggle column.
- **Add & Remove Users**: Easily manage user entries by adding new usernames or removing existing ones.
Demo video: `ShowSavedPresetInDashboard.mp4`
The Dashboard allows users to consolidate multiple saved visualizations into one shareable page.
- **Graph Compilation**: Add multiple saved presets to a dashboard using the dropdown and the “Add to Dashboard” button.
- **Titles & Descriptions**: Each chart supports a customizable title and description area.
- **Image Rendering**: To preserve formatting, each chart is converted into a static image when the dashboard is refreshed.
- **Custom HTML Editing**: Click “Edit HTML” to gain full control over the dashboard layout and graph styling.
Both the Visualizer and Dashboard support local session states and preset saving.
- **Auto-Restore**: While the client is running, current graphs and settings persist automatically.
- **Manual Export/Import**: Export presets to `.json` files to save graph configurations. Re-import them anytime to reload your analysis setup.
- **Datasets**: All data is stored in standard `.csv` format inside the `/data/` folder.
- **Access Matrix**: The `access.csv` file maps users to the datasets they can access. This ensures that users can only view their authorized data.
- **Secure Comparison Logic**: Even though users only see their own data, the system can still compute average comparisons against other datasets without exposing raw records from other organizations.
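The comparison logic can be illustrated with a minimal sketch: only per-dataset aggregates ever leave the computation, never individual rows. The function and data names below are assumptions for the example, not the server's actual API:

```python
from statistics import mean

def across_org_average(own_org, all_datasets, question):
    """Average the per-dataset means of every *other* organization's data.

    Raw rows are reduced to a single mean per dataset, so nothing
    row-level about other organizations is ever exposed."""
    peer_means = [
        mean(row[question] for row in rows)
        for org, rows in all_datasets.items()
        if org != own_org and rows
    ]
    return mean(peer_means) if peer_means else None

datasets = {
    "OrgA": [{"q1": 4}, {"q1": 2}],
    "OrgB": [{"q1": 3}, {"q1": 5}],
    "OrgC": [{"q1": 1}],
}
print(across_org_average("OrgA", datasets, "q1"))  # 2.5 (mean of OrgB's 4.0 and OrgC's 1.0)
```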
This Python script is a pre-processor for raw survey exports from Qualtrics and Google Forms. Its goal is to convert messy platform-specific CSVs into clean, standardized, typed datasets ready for analysis. It removes PII, normalizes True/False items, generates unique IDs, and tags each column as categorical (c) or numeric (n).
convert_datasets() processes every .csv inside an input folder. For each file, it first detects the input format:

- If the filename contains `"Qualtrics"` → skip the first metadata row and use the second row as headers.
- If the filename contains `"GoogleForms"` → use the first row as headers, and do not include timestamps from the file.
- Everything else is treated as a generic CSV.
The converter builds a clean row shaped like:

`Unique ID | Timestamp | Q1 | Q2 | Q3 | ...`
It does the following:

- Every row gets a synthetic ID with the pattern `ID_<random six digits>`.
- If an `"End Date"` column is present → use that as the timestamp. Otherwise → fall back to the current date/time.
- For Google Forms files, only `Unique ID` is added (no timestamp column).
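The ID and timestamp rules above can be sketched as follows. This is a simplified re-statement of the behavior, not the script verbatim; the real logic lives in `generate_unique_id()` and `clean_row()` inside converter.py, and the helper name `pick_timestamp` is invented for this example:

```python
import random
from datetime import datetime

def generate_unique_id():
    """Return a synthetic ID of the form ID_<random six digits>."""
    return f"ID_{random.randint(0, 999999):06d}"

def pick_timestamp(row, is_google_forms):
    """Use the 'End Date' column when present, else fall back to now.
    Google Forms rows get no timestamp column at all."""
    if is_google_forms:
        return None
    return row.get("End Date") or datetime.now().isoformat(timespec="seconds")

print(generate_unique_id())                                     # e.g. ID_042917
print(pick_timestamp({"End Date": "2024-05-01 10:00"}, False))  # 2024-05-01 10:00
```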
Any column whose header contains substrings like:

`IP Address`, `Recipient Email`, `Recipient First Name`, `Progress`, `Location Latitude`, `Location Longitude`, `Response ID`, `Start Date`, `End Date`, etc.

is removed. This step is what de-identifies the export.
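The substring-based removal can be sketched like this. It mirrors the `remove_columns` check in `convert_datasets()`, though the exact list and code there may differ:

```python
# Assumed, illustrative subset of the substrings the script strips.
REMOVE_SUBSTRINGS = [
    "IP Address", "Recipient Email", "Recipient First Name", "Progress",
    "Location Latitude", "Location Longitude", "Response ID",
    "Start Date", "End Date",
]

def strip_pii(headers, rows):
    """Drop every column whose header contains any configured substring."""
    keep = [i for i, h in enumerate(headers)
            if not any(s in h for s in REMOVE_SUBSTRINGS)]
    return [headers[i] for i in keep], [[r[i] for i in keep] for r in rows]

headers = ["Q1", "IP Address", "Q2"]
rows = [["5", "10.0.0.1", "yes"]]
print(strip_pii(headers, rows))  # (['Q1', 'Q2'], [['5', 'yes']])
```

Because the match is a substring test, a header like `Survey IP Address (hidden)` would also be dropped, which is the intended, conservative behavior for de-identification.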
Certain questions are exported as two columns:

`... ? - True` and `... ? - False`

The script:

- Keeps only the `"True"` column.
- Renames it to the base question text (everything before `" - True"`).
- Writes `"TRUE"` if the source cell is `1`, otherwise `"FALSE"`.
This makes downstream analysis easier and keeps booleans in a single column.
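A minimal sketch of this collapse step, under the assumption that paired headers end in exactly `" - True"` / `" - False"` (the real script matches against its `paired_columns` list):

```python
def collapse_true_false(headers, row):
    """Fold each '... - True'/'... - False' column pair into one boolean column."""
    out_headers, out_row = [], []
    for h, v in zip(headers, row):
        if h.endswith(" - False"):
            continue                                  # drop the redundant False column
        if h.endswith(" - True"):
            out_headers.append(h[: -len(" - True")])  # keep the base question text
            out_row.append("TRUE" if v == "1" else "FALSE")
        else:
            out_headers.append(h)                     # ordinary columns pass through
            out_row.append(v)
    return out_headers, out_row

hs = ["Did you attend? - True", "Did you attend? - False", "Score"]
print(collapse_true_false(hs, ["1", "0", "7"]))
# (['Did you attend?', 'Score'], ['TRUE', '7'])
```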
All remaining survey columns (that aren’t metadata or special True/False pairs) are kept as-is.
Depending on the --separate flag:
- `--separate yes`: Each input CSV becomes its own file, `converted_<original_filename>.csv`.
- `--separate no` (default): All cleaned rows from all files are appended into `converted_combined.csv`.
In both cases, the result is a clean, rectangular dataset with Unique ID (and usually Timestamp) in the first columns.
Once the cleaned data is written, the script calls final_tagging(), which:

- Re-opens the output CSV.
- Reads all rows to inspect each column’s contents.
- Forces the first two headers to `c Unique ID` and `c Timestamp` (if present).
- For every other column:
  - Looks at all non-empty values in that column.
  - If every value can be parsed as a number (`float(...)` succeeds) → tags it as `n`.
  - Otherwise → tags it as `c`.

Example final header row:

`c Unique ID, c Timestamp, n Score, c Favorite Activity, n Hours Per Week`
These tags are used by older RPPL tooling and any downstream pipeline that needs to know if a variable is numeric or categorical.
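The tagging pass can be re-stated as a short sketch. This mirrors the behavior described above (with `is_numeric()` as a `float()` try) but is not the script verbatim; the helper name `tag_headers` is an assumption:

```python
def is_numeric(value):
    """A value is numeric if float() can parse it."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def tag_headers(headers, rows):
    """Prefix each header with 'n ' or 'c ' based on its column contents."""
    tagged = []
    for i, h in enumerate(headers):
        if h in ("Unique ID", "Timestamp"):
            tagged.append(f"c {h}")    # the leading ID/timestamp columns are always c
            continue
        col = [r[i] for r in rows if r[i] != ""]   # inspect only non-empty cells
        tag = "n" if col and all(is_numeric(v) for v in col) else "c"
        tagged.append(f"{tag} {h}")
    return tagged

headers = ["Unique ID", "Score", "Favorite Activity"]
rows = [["ID_000001", "3.5", "Hiking"], ["ID_000002", "4", ""]]
print(tag_headers(headers, rows))  # ['c Unique ID', 'n Score', 'c Favorite Activity']
```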
Run the converter from a terminal or command prompt:
python converter.py INPUT_FOLDER OUTPUT_FOLDER [--separate yes|no]
- `INPUT_FOLDER`: Folder containing the raw Qualtrics / Google Forms CSV exports.
- `OUTPUT_FOLDER`: Folder where the converted CSV(s) will be written.
- `--separate` (optional, default = `no`): `yes` → write one converted file per input CSV; `no` → write a single combined file called `converted_combined.csv`.
python converter.py raw_exports converted --separate no
- All `.csv` files in `raw_exports/` are read.
- A single `converted_combined.csv` appears under `converted/`.
python converter.py raw_exports converted --separate yes
- Each input becomes `converted_<original>.csv` in `converted/`.
The helper is_numeric(value) simply tries:

`float(value)`

- If it succeeds for all non-empty cells in a column → the column is tagged as `n` (numeric).
- If any non-empty cell fails numeric parsing → the column is tagged as `c` (categorical).
This heuristic is usually sufficient for survey data where Likert items are numeric and text responses are free-form strings.
You can adapt the script to your own survey exports:
Edit the remove_columns list inside convert_datasets():
- Any header containing one of these substrings will be dropped.
- Add/remove items to control which metadata fields are stripped.
Edit the paired_columns list:
- Include the exact header text from your export.
- The script will treat each `"... - True"` / `"... - False"` pair as one boolean variable.
If you want to:
- Use a different ID pattern
- Pull timestamps from another column
- Store dates in another format
you can edit:
- `generate_unique_id()`
- The timestamp section in `clean_row()`
This converter is designed to:
- Clean raw Qualtrics / Google Forms exports
- Remove identifying metadata
- Normalize True/False paired questions
- Generate unique IDs per response
- Add timestamps (where applicable)
- Tag each column as categorical (`c`) or numeric (`n`)
The resulting CSVs are ready for ingestion into RPPL Insights (RPPL Visualizer v1.0) or any other analysis pipeline that expects typed, de-identified survey data.
This project is licensed for internal use within RPPL and Brown University’s Stronghold environment.
