LOG

Log of progress, meetings, but also random thoughts.
====================================================

2014-05-25
==========
* It would be interesting to write a small compability layer between the
  standard lexer/parser and the ones developed in this project.
* Also; it would be useful to generate a sample main file from BNFC to suggest
  usage of the parser.

2014-05-15
==========
* Why is the homomorphism stuff so important? It seems so obvious.

2014-05-05, before supervision
==============================
* Bug somewhere leads to more than one result. Not related to oracle though.
* Stack space overflow on large inputs

2014-04-14, after supervision
=============================
* Any merging of sections/parts of the report can be done later - just write the
  stuff for now.
* Say what you're going to say in the report!
* Any stuff that you've learnt during this should be cited - unless it is
  something that should be known by heart by any CS major.
* Testing: Remember to use the delimiters clause in BNFC grammar for lists.
* Testing: Make sure to have large enough parts to merge (like large while
  clauses)
* Measurements are more important than the positioning information! Do position
  information if there is time later.
* Remember to re-download JP:s paper from his homepage (long version)

2014-04-14, random notes
========================
* Use the Token data type, and change position to Range. IntToken will not be
  needed then. 
* Rewrite combineTokens to take range into account.

2014-04-07, after supervision
=============================
* We want to annotate InvalidTokens constructor with position information
* This can be done by newtype wrapper for (Tokens, Range), where mappend can
  depend between the two parts of the tuple (to avoid problem of always getting
  the last position at problems).
* Prioritize measurements for now, otherwise we don't know if it's incremental
  or even works as expected. Measure speed of concatenation.
* Don't care about parse errors, it's a whole project in itself. 
* Until next time: better outline of report
* Until 5/5, good draft of report done. Send PDF before the meeting (no later
  than the morning) and we will discuss it during supervision.
* Don't forget to book presentation and defense in good time. Check the time
  slots on the master thesis home page.

2014-04-02
==========
* Range in RingP doesn't seem to work. Are \n etc really captured in their
  respective tokens?
* Range instead of Size in the measure instance in the lexer could be
  problematic, since the (a,b) in a monoid can't affect the other part
  of the tuple. Currently, whenever something is wrong, the Range information
  denotes the end of the lexed string, and not the position of the error.
* Perhaps the easiest way would be to let range be a part of the IntToken?

2014-03-31, after supervision
============================
* Problem with many results was due to the chopped row being part of the closure
  (of itself), which led to some stuff ending up both on the left and the right
  hand side. Solution: only keep the leftmost element on that row. This leads to
  the other elements being re-computed, so this is not the best solution.
* The division of pair/mat-functions is now done already from "merge". This is
  consistent with the division I did for chopFirst and chopFirstRow before.
* Next week: discussion about the report
* After easter: JP on vacation, so no supervision for two weeks.

2014-03-28
==========
* Still there are too many branches in the parse results, for reasons unknown.
    - Could it be too many/too few calls to mul?
    - Could it be something in the chopping? But that shouldn't matter AT ALL!


===========
* It is important to use one/row/quad instead of One/Row/Quad because of Zero
  optimisation. Lesson learned the hard way.
* Implement generation of parser in BNFC next step?
* Still: testing time!

2014-03-17, after supervision
=============================
* chopLast is not needed since we only have to chop the first row of the second
  matrix (the first col is only zero anyway). 
* Having the merge call in mappend placed within unsafePerformIO makes sure that
  the random call is evaluated each time, since the call is now dependent on the
  parameters being merged.
* For performance: use the delimiters directive in BNF grammars (see updated
  Javalette.cf in Test/ and the C.cf in BNFC.
* Testing time! Parse two code fragments separately and then merge them.
* Until next supervision: outline of report should be done.
* There could be a bug in the code for finding Paths and rightMostOnLine, since
  we are getting a VERY large number of results (i.e. ambiguities) for even a
  small code piece (256 for Small.jl).
* TODO: The incremental parser and lexer should both be generated by BNFC, as of
  now only the lexer is generated (with --alex3inc). 

2014-03-14
==========
* Check the CNF grammar. Bad performance could be due to rule conversion being
  weird.
* Bug in XPM files is probably a bug in the new code, since it used to work with
  the general CNF parser.
* Bad performance could have something to do with the oracle forcing computation
  of all subtrees.

2014-03-07
==========
* It seems that, by looking at measureToTokens, tokens that have no meaning are
  simply discarded (AlexAccNone and AlexAccSkip). Hence it is easy to just write
  IntToken -> Maybe Token (with a dummy line position) and base conversion into
  Category and later SomeTri on those. This sure feels weird though, and might
  be too easy.

2014-03-06
==========
* After further investigation, it seems that the conversion from IntToken to
  Token has to be done before any parsing is done, since cnftables is based on
  Tokens, and not half-tokens (IntToken). This will most likely be a problem,
  and should prove interesting. Note to self: spend friday making sure you
  understand how the merging of IntToken tables is done. Perhaps one could have
  IntToken until an accept for a Token is found in the merge? But then the
  measure has to be something like Either IntToken Token I suppose.

* Measure instance is trivial though, as will the template be once the above
  mentioned issue is resolved.

2014-03-03
==========

TODO:
  * Finish the "template"
  * Write the measure instance from IntToks to SomeTri
    * The boolean flag should be picked by using unsafePerformIO . random, 
      but for now it can be constant.

Ideas for keeping track of sizes:
  * Write a ringP instance for (category,Any,Range), where range in a range of
    the input, measured in characters.
  * Later, this ranging info can be put into bnfc nonterminals, 
    perhaps using a pragma.
  * Then only, the ranging info can be used by programs which traverse the ast,
    to prune away the parts which are not relevant.

Notes before supervision 2014-03-03
===================================
* Is it reasonable to have the possibility to use LexGen code in BNFC without
  having to use an incremental parser? It would be a nice bonus, in that case.

Notes 2014-02-25
================
* IntToken -> Token thing, it's really weird to wrap around. The composition of
  monoids makes no sense to me. An IntToken has Accepts as part of its record,
  which is a list of [AlexAcc (Posn -> S.Seq Char -> Token)], so to create a
  token from an IntToken we need both positions and a char sequence. How can
  that be combined with a position and adding a char sequence as well?
  Investigation needed!

Notes after supervision 2014-02-24
==================================
* mkMat does not need Shapes as arguments, since they are already baked into the
  Mat type.
* It would be reasonable to have a chapter about dependently typed Haskell in
  the final report.

Notes before supervision 2014-02-24
===================================
* How to implement the applicative pure for Mat x y. Lifting a value to a matrix
  of arbitrary size seems...infeasible.
* Is there a simple way to find the shape of a Mat without having it passed as
  an argument in the first place?
* What is the mul function in RingP really supposed to do?

Notes after supervision 2014-02-18
==================================
* chopFirstRow has to recurse on c and d in the base case. Creating four rows
  will enable creation of a quad of everything except a b. The same goes for
  chopLastCol etc.
* For general type safety, the data type containing chop dependent data should
  be separated into two types, one for rows and one for cols.
* The change in alex template generation should be done in BNFC. Preferrably the
  changes are connected to the --cnf option. This should probably be easy, so
  best to do it as soon as possible.
* The test program in BNFC will probably have to be changed slightly
* Send the planning report to JPB before next supervision.

Notes after supervision 2014-02-11
==================================
* The Pair (se below) has to do with the balancing (and the oracle stuff)
    - NOTE to self: this is not at all obvious. Look it up further.
* The issue of Posn can be solved by combining several monoids, so it should not
  be a problem. This is something that is also implemented in Ropes in Yi. 
* TODO: implement the recursive cases for chopLast/chopFirst
    - NOTE to self: Draw matrices by hand, it makes stuff much easier.
* For more dependently typed Haskell, "hasochism" from Haskell symposium 2013.

2014-02-11
==========
* It is frustrating that my progress is insanely slow
* Exactly what is the plan?
    - Merge LexGen and ParParse
        - LexGen measure should be Token instead of IntToken
        - How to add Posn to each token while still lazy/incremental?
        - There are lots of other stuff to change in LexGen to conform it to
          BNFC generated code for future use.
    - Merge-function in Quad.hs
        - What is the point of having Pair (Mat x y a) ?

2014-02-03
==========

merge function

...Quad.hs:
 mergein :: RingP a => Bool -> SomeTri a -> Pair a -> SomeTri a -> SomeTri a

 
Further Notes 2014-01-31
========================
* It is disturbing although convenient to write this code for a specific
  language. Care must be taken to make sure it is easy to generalize.

* Eh...in what way can the CYK tables be used in the matrix stuff? Needs to be
  further examined as of now.

* Remember to print "important" MSc thesis forms on monday. Also: planning
  report should be done some time..

Notes 2014-01-31
================
* Parametrization of LexGen results were successful, it forced some new
  constraints to many functions, but looks good otherwise.
  NOTE: This was not obvious to me, but neccessary in order to make the results
        work incrementally.

* Now we more or less want to use the tables from bnfc --cnf to see what parts
  we can use to see if we have A0A1 \elem P.

* We want sigma to be measure and A0A1 \elem P as mappend.

Supervision 2014-01-27
======================
* We don't want to do measure . measure, but rather have LexGen store its result
  as FingerTree instead of Seq IntToken, so that we can measure it with our own
  measure function (matrix stuff)

* BNFC rules are generated by BNFC/Backend/Haskell/ToCNF.hs and
                                          /Haskell.hs

* For the planning report, these logs with sugar can be used, more or less.

Supervision 2014-01-14
======================

* Discussions about setup of repo, supervision, expectations etc
    - Supervision mondays
    - Approx 2 months work, 2 months report writing
    - Midway report around easter

* Guidelines for the report
    - Should be readable by other MSc students
    - Should contain everything interesting/non-obvious

* Things to think about in the actual project
    - How to handle wrong input
        - And how to know what part is bad?
        - Overlaps in matrix - decision needed!
    - How to combine lexer and parser neatly w/ Monoid & Measure
        - This is the first large challenge!
        - Once done - test it as simple as possible.

* Where to find code?
    - BNFC
    - LexGen
    - Yi

Project start 2014-01-13 - 14
========================
* Reading paper by Valiant
* Reading paper by Claessen & Bernardy
* Walkthrough of LexGen by Kristofer