Skip to content

Added sanitize function to check the author's name on snow observations to insert into the database properly. #26

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

BobTorgerson
Copy link
Contributor

This PR adds the validator library to the build that escapes special characters in the author's name of a given observation. We had an issue with a user with the last name D'Angelo because the import script had not escaped those special characters.

I would recommend this be merged into the production branch.

If there is a preferred sanitization method, please feel free to modify this PR as desired to make use of that method instead.

@emiliom
Copy link
Member

emiliom commented Feb 27, 2021

I see the change from o.author_name to validator.escape(o.author_name) in src/import/index.js. In the validator library documentation link you provided, the description of the escape function is:

replace <, >, &, ', " and / with HTML entities.

I'm not sure that's what we want, though. Ultimately, we want database-safe insertions of the strings that are provided, but validator.escape would actually change the strings themselves -- granted, in a way that's fully recoverable (unescaped) when handled as HTML entities.

How about using PostgreSQL string functions instead? For example, I think replacing '${validator.escape(o.author_name)}' with quote_literal(${o.author_name}) would address the issue you've described without changing the strings themselves (I took the definition of quote_literal from https://www.postgresql.org/docs/11/functions-string.html#FUNCTIONS-STRING-OTHER). I don't remember Javascript quoting syntax, though, so this may require further tweaking.

@BobTorgerson
Copy link
Contributor Author

Hey Emilio,

You are correct in that we don't necessarily want the HTML code entities to be used in the database regardless of if we can decode it on the other end. To fix this, Bruce and I spent time today refactoring the code for importing these snow observations to rather than generating a single query via string concatenation, which lends itself to SQL injection, it now uses a parameterized query such that we pass the values to the backend PostgreSQL server to handle the validation and proper insert functionality for each of the fields. This results in the user in question George D'Angelo being passed as desired to the database server and the server ingesting that string as expected.

Please have a look when you have a chance!

Thanks,

Bob

@emiliom
Copy link
Member

emiliom commented Mar 1, 2021

That sounds and looks great! I've gone over the changes to src/import/index.js (changes to other files are more about JavaScript specifics and error catching). The only comment I have is that I noticed you removed the casting of o.timestamp.toISOString() to a TIMESTAMP; I assume PostgreSQL will do that automatically via the parameterized query as it recognizes the datetime ISO string?

@BobTorgerson
Copy link
Contributor Author

Yes, that's exactly right! It assigns the type of the value based on the database table's schema and since we are passing a correctly formatted time string, it inserts it properly in the database.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants