@rnikutta, I suggest that rather than make any assumptions about the directory path at all, we define an environment variable, SCRATCH or DLAIRFLOW_SCRATCH that must be set in a DAG (could be inside or outside of a task). I think it is better to raise an error if the environment variable is not set, although the fallback could be to use /tmp instead.
The problem with /tmp though is that some systems wipe /tmp on reboot, and I think we want something more persistent than that.
Also note, I don't want to use AIRFLOW_SCRATCH since Airflow already defines many environment variables, and I want to preclude any conflict.
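For illustration, here is a minimal sketch of the lookup described above, assuming the variable ends up being called DLAIRFLOW_SCRATCH. The function name `scratch_root` and its `fallback` parameter are hypothetical and not part of dlairflow; by default the function raises an error, and a caller has to opt in to a fallback such as /tmp explicitly.

```python
import os
from pathlib import Path

# Assumed variable name, taken from the proposal in this thread.
SCRATCH_VAR = "DLAIRFLOW_SCRATCH"


def scratch_root(environ=None, fallback=None):
    """Return the scratch directory named by DLAIRFLOW_SCRATCH.

    Hypothetical helper: raises RuntimeError when the variable is unset
    or empty, unless the caller supplies an explicit fallback path.
    """
    env = os.environ if environ is None else environ
    value = env.get(SCRATCH_VAR)
    if value:
        return Path(value)
    if fallback is not None:
        # Opt-in only: /tmp may be wiped on reboot on some systems.
        return Path(fallback)
    raise RuntimeError(
        f"{SCRATCH_VAR} is not set; define it before running the DAG."
    )
```

Passing the environment as an argument keeps the function easy to test without modifying `os.environ`; in a DAG it would simply be called as `scratch_root()`.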
Another thing I think is needed here is a way for multiple users to have separate, non-conflicting scratch directories. That can't be based on os.environ['USER'], though, because its value is always airflow. Perhaps there is a way to programmatically obtain the user as defined in the Airflow web interface.
dlairflow.util.user_scratch is specific not just to Data Lab, but to one particular Data Lab server. This can be made more portable.