Description
Currently, there are several hard-coded file paths that could be handled better. For example, preprocess.py requires the testing and training data and labels to be saved in a very rigid file structure.
Some thoughts on better handling:
- Users could pass training and testing data and labels as numpy arrays or pandas tables to preprocess.py
- train_data, train_labels, AFB, data_mins etc. should be attributes of the process() object
- process() can then be pickled/saved and passed by the user to network.nn()
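A rough sketch of what the points above could look like, not the actual preprocess.py API. The class name `Process`, the `test_fraction` and `seed` parameters, the `test_data`/`test_labels` names, and the `save`/`load` helpers are all hypothetical. `data_mins` is assumed here to mean the per-feature minima of the training set, and AFB is left out because its definition isn't given in this issue.

```python
import pickle

import numpy as np


class Process:
    """Hypothetical stand-in for process(): keeps preprocessing results as attributes."""

    def __init__(self, data, labels, test_fraction=0.2, seed=None):
        # np.asarray accepts numpy arrays, lists, and pandas tables alike.
        data = np.asarray(data, dtype=float)
        labels = np.asarray(labels)
        if len(data) != len(labels):
            raise ValueError("data and labels must have the same length")

        # Shuffle and split once; the result is frozen in the attributes below.
        rng = np.random.default_rng(seed)
        order = rng.permutation(len(data))
        n_test = int(round(test_fraction * len(data)))
        test_idx, train_idx = order[:n_test], order[n_test:]

        self.train_data = data[train_idx]
        self.train_labels = labels[train_idx]
        self.test_data = data[test_idx]
        self.test_labels = labels[test_idx]
        # Assumed meaning of data_mins: per-feature minima of the training data.
        self.data_mins = self.train_data.min(axis=0)

    def save(self, path):
        # Pickle the attribute dictionary so the saved file holds only arrays.
        with open(path, "wb") as f:
            pickle.dump(self.__dict__, f)

    @classmethod
    def load(cls, path):
        # Rebuild the object without re-running the random split.
        obj = cls.__new__(cls)
        with open(path, "rb") as f:
            obj.__dict__.update(pickle.load(f))
        return obj
```

With something like this, a user would build the object from in-memory arrays, call `save`, and later hand the loaded object to the network code, with no fixed directory layout involved.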
The only potential issue is that much of what process() generates is needed later to evaluate the network. We want to keep preprocessing and training separate, because the splitting and shuffling of training and testing data has a random element that we don't want to repeat when we resume training. In principle, the required components of process() could be made attributes of network.nn(), and network.nn() could then be pickled and saved instead of process(). In theory, pickling network.nn() could also save the tensorflow model as an attribute rather than saving it separately, but I will need to investigate this.
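One conservative way to try the alternative described above is sketched below, under the assumption that the model is still saved separately until pickling it has been investigated. `Network`, `STATE_KEYS`, `save`, and `load` are hypothetical names, `test_data`/`test_labels` are assumed attribute names, and `model` is just a placeholder object rather than a real tensorflow model.

```python
import pickle

import numpy as np


class Network:
    """Hypothetical stand-in for network.nn() that carries the preprocessing state."""

    # Attributes taken over from the preprocessing step (AFB omitted, see above).
    STATE_KEYS = ("train_data", "train_labels", "test_data", "test_labels", "data_mins")

    def __init__(self, state, model=None):
        for key in self.STATE_KEYS:
            setattr(self, key, state[key])
        # Placeholder for the trained model; it is not pickled in this sketch.
        self.model = model

    def save(self, path):
        # Only the preprocessing state is written; the model is saved elsewhere.
        state = {key: getattr(self, key) for key in self.STATE_KEYS}
        with open(path, "wb") as f:
            pickle.dump(state, f)

    @classmethod
    def load(cls, path, model=None):
        # Resuming reuses the stored split instead of reshuffling the data.
        with open(path, "rb") as f:
            return cls(pickle.load(f), model=model)
```

If pickling the model turns out to be workable, `model` could be added to the saved state; until then this keeps the two concerns apart while still avoiding a second random split.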
Needs some thought... This is closely related to #7. I think resolving these two issues would lead to a more accessible code base, but it would also constitute a breaking change.