|
| 1 | +- Some locations are still inaccurate. The problem seems to show up in comments that span |
| 2 | + many lines. There may be an off-by-one error or similar in |
| 3 | + `Lexer.update_content_newlines`, which is supposed to increment the lexbuf's line |
| 4 | + position for every newline encountered in some content (i.e. inside a code or math block). |
| 5 | + |
| 6 | +- Top-level errors like two nestable block elements or headings on the same line |
| 7 | + need to be handled. Currently, they parse correctly but do not emit a warning. |
| 8 | + |
| 9 | +- Repetition in the `tag_with_content` parse rule (parser.mly:207). Two productions are identical |
| 10 | + save for a newline. This is because an optional newline causes a reduce conflict due to |
| 11 | + `nestable_block_element`'s handling of whitespace. |
| 12 | + |
| 13 | +- Improve error handling inside light table cells. Currently, we cannot do much besides use |
| 14 | + Menhir's `error` token, which erases all information about the error that occurred. We |
| 15 | + have to fall back on a string of the offending token to show users what went wrong, which |
| 16 | + doesn't necessarily communicate much. |
| 17 | + |
| 18 | +- Tests. There are a few tests, like the ones which test the positions in the lexing buffer, |
| 19 | + which don't apply to the new parser. Others expect error messages which cannot be produced |
| 20 | + by the relevant parser rule. |
| 21 | + |
| 22 | +- There are likely some error cases which have not been handled. These should be trivial to fix: |
| 23 | + you should only need to add a new production to the relevant parser rule which |
| 24 | + handles the offending token. |
| 25 | + |
| 26 | +Notes for anyone working on this |
| 27 | +- Due to the nature of Menhir, this parser is difficult to work on. |
| 28 | + - Changes will have unexpected non-local consequences due to more or fewer tokens being consumed by |
| 29 | + some neighboring (in the parse tree) rule. |
| 30 | + - You need to familiarize yourself with the branch of the parse tree that you're working on |
| 31 | + (e.g. toplevel->nestable_block_element->paragraph) before you start making non-trivial changes. |
| 32 | + - Type errors will point towards unrelated sections of the parser or give you incorrect information |
| 33 | + about what has gone wrong. |
| 34 | + |
| 35 | +- If you need to emulate some sort of context like "paragraphs can't accept '|' tokens if they're inside |
| 36 | + tables", then you need to parameterize that rule by some other rule which dictates what it can accept. |
| 37 | + For example, toplevel block elements match `paragraph(any_symbol)` and tables match |
| 38 | + `paragraph(symbols_except_bar)`. |
| 39 | + |
| 40 | +- Be as specific as possible. Avoid optional tokens when possible. Prefer the non-empty |
| 41 | + list rules (`sequence_nonempty`, `sequence_separated_nonempty`) over the alternatives. |
| 42 | + Ambiguity will produce a compile-time reduce/reduce conflict if you're lucky, unexpected |
| 43 | + behavior at runtime if you're not. |
| 44 | + |
| 45 | +- Contact me on the company slack or at faycarsons23@gmail.com if you're confused about |
| 46 | + anything! |
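
The line tracking expected of `Lexer.update_content_newlines` (first item above) can be sketched roughly as follows. This is a minimal OCaml illustration built on the standard library's `Lexing.new_line`, not the actual implementation in the lexer; the `~content` argument name, the `current_line` helper, and the sample string are assumptions made for the example.

```ocaml
(* Hypothetical sketch, not the real Lexer.update_content_newlines:
   advance the lexbuf's line counter once for every newline in [content]. *)
let update_content_newlines ~content lexbuf =
  String.iter
    (function
      | '\n' -> Lexing.new_line lexbuf
      | _ -> ())
    content

(* Line number currently recorded in the lexbuf (lines start at 1). *)
let current_line lexbuf = lexbuf.Lexing.lex_curr_p.Lexing.pos_lnum

let () =
  let lexbuf = Lexing.from_string "" in
  (* The body of a code block containing three newlines. *)
  update_content_newlines ~content:"let x = 1\nlet y = 2\n\n" lexbuf;
  Printf.printf "line after content: %d\n" (current_line lexbuf)
```

A check like this, run against a multi-line comment, may help narrow down whether the suspected off-by-one comes from the newline counting itself or from how the surrounding lexer rule accounts for the same newlines.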