Version 0.60.0 Changes
In many cases, the extent of the changes and optimizations made it impossible to take the slower
route of marking code with the @Deprecated annotation and removing it in a later release. Manual
changes will be required to migrate code.
ℹ️ If you encounter difficulty in migrating some code constructs please open an issue with details of the construct to be migrated.
Major improvements include:
- New implementation of SegmentedSequence using a binary offset tree, with efficient access, storage and instantiation.
- New SequenceBuilder class, used to create segmented sequences of arbitrary content without concern for segment ordering or whether they share a common base sequence.

  A segment which cannot be converted to an offset range from the base sequence will be converted to out-of-base characters, preserving the expected character sequence result.

  The builder will optimize literal characters when they match corresponding base sequence characters, with special handling of spaces and EOL characters. This means that adding literal spaces and EOL characters instead of using a subsequence will result in them being efficiently replaced by segments from the original base sequence.

  For convenience, an instance of SequenceBuilder can be obtained from any based sequence through the BasedSequence.getBuilder() method.
- New LineAppendable implementation used for rendering text. Internally, the class builds a list of lines and keeps track of each line's prefix portion, allowing efficient access and manipulation of lines and prefixes in the rendered result.

  The generated BasedSequence result will be a SegmentedSequence with offsets into the source sequence preserved, allowing offsets in the result to be mapped to the original sequence.

  The result lines are stored as separate BasedSequences to maximize preservation of original sequence offset information when the rendering rearranges the lines of the source, as in the case of formatting with reference definition sorting or MarkdownTable sorting.
- Formatting module is now part of the core library with additional features:
  - Paragraph text wrapping to fit within the set right margin.
  - Source offset tracking from formatted to original markdown.
  - Table formatting includes sorting by columns and transpose table methods.
  - Major reorganization and code cleanup of implementation.
- Formatter implementation is now part of the core implementation in the flexmark module.
- Formatter improved with more options, including wrapping text to margins.
  - Added ability to track and map source offset(s) to their index in the formatted sequence. This feature allows editor caret position preservation across a formatting operation.
  - Offset tracking unified using TrackedOffset. It is used by MarkdownParagraph for text wrapping and MarkdownTable for table formatting, and is able to handle caret position during typing and backspace editing operations which are immediately followed by formatting of the edited source.
- Tests cleaned up to eliminate duplication and hacks.
- flexmark-test-util made reusable for other projects. Having markdown as the source code for tests is too convenient for use only in flexmark-java tests.
- Optimized SegmentedSequence implementation using binary trees for searching segments and byte efficient segment packing. Parser performance is either slightly improved or not affected, but the change allows using SegmentedSequences for collecting Formatter and HtmlRenderer output to track the source location of all text with minimal overhead and double the performance of the old implementation.
- New implementation of LineAppendable replaces LineFormattingAppendable used for text generation in rendering:
  - Uses SequenceBuilder to generate a BasedSequence result with original source offsets for those character segments which come from the source. This allows round trip source tracking from Source -> AST -> Formatted Source -> Source throughout the library.

    As an added bonus, using the appendable makes formatting to it 40% faster than the previous implementation and 160 times more efficient in memory use. For the tests below, the old implementation allocated 6GB worth of segmented sequences, the new implementation 37MB. The % overhead for the new implementation is four times greater than before, but that is after a 43 fold reduction in total overhead bytes: the old implementation needed 342MB of overhead, the new implementation 8MB.

    As a result of increased efficiency, two additional files of about 600kB each can be included in the test run and only add 0.6 sec to the formatter run time.

    Tests run on 1141 markdown files from GitHub projects and some other user samples. Largest was 256k bytes.

    | Description                | Old SegmentedSequence | New SegmentedSequence | New LineAppendable |
    |:---------------------------|----------------------:|----------------------:|-------------------:|
    | Total wall clock time      |            13.896 sec |             9.672 sec |          8.344 sec |
    | Parse time                 |             2.402 sec |             2.335 sec |          2.297 sec |
    | Formatter appendable       |             0.603 sec |             0.602 sec |          0.831 sec |
    | Formatter sequence builder |             7.264 sec |             3.109 sec |          1.772 sec |

    The overhead difference is significant. The totals are for all segmented sequences created during the test run of 1141 files. Parser statistics show requirements during parsing and formatting.

    | Description                                     | Old Parser |  Old Formatter | New Parser | New Formatter | New LineAppendable |
    |:------------------------------------------------|-----------:|---------------:|-----------:|--------------:|-------------------:|
    | Bytes for characters of all segmented sequences |    917,016 |  6,029,774,526 |    917,016 | 6,029,774,526 |         37,663,196 |
    | Bytes for overhead of all segmented sequences   |  1,845,048 | 12,060,276,408 |     93,628 |   342,351,155 |          8,204,796 |
    | Overhead %                                      |     201.2% |         200.0% |      10.2% |          5.7% |              21.8% |
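The "Overhead %" figures above can be reproduced from the two byte-count rows: each is the overhead bytes divided by the character bytes for the same column. The following is a minimal sketch of that arithmetic only. The class and method names are made up for illustration and are not part of the library.

```java
import java.util.Locale;

/** Recomputes the "Overhead %" row from the byte counts reported above. Illustrative only. */
final class OverheadStats {
    /** Overhead expressed as a percentage of the bytes used for the characters themselves. */
    static double overheadPercent(long overheadBytes, long characterBytes) {
        return 100.0 * overheadBytes / characterBytes;
    }

    /** How many times smaller {@code after} is than {@code before}. */
    static double foldReduction(long before, long after) {
        return (double) before / after;
    }

    public static void main(String[] args) {
        String[] names = {
                "Old Parser", "Old Formatter", "New Parser", "New Formatter", "New LineAppendable"
        };
        // Figures copied from the statistics above: {character bytes, overhead bytes}
        long[][] rows = {
                {917_016L, 1_845_048L},
                {6_029_774_526L, 12_060_276_408L},
                {917_016L, 93_628L},
                {6_029_774_526L, 342_351_155L},
                {37_663_196L, 8_204_796L},
        };
        for (int i = 0; i < names.length; i++) {
            double percent = overheadPercent(rows[i][1], rows[i][0]);
            System.out.printf(Locale.ROOT, "%-20s %6.1f%%%n", names[i], percent);
        }
        // Character bytes allocated by the formatter: old implementation vs. new LineAppendable
        double fold = foldReduction(6_029_774_526L, 37_663_196L);
        System.out.printf(Locale.ROOT, "Character bytes reduction: %.0f times%n", fold);
    }
}
```

The percentage is relative to each column's own character bytes, which is why the new LineAppendable can show a higher overhead percentage while using far fewer bytes in total.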
- Break: split out generic AST utilities from the flexmark-util module into separate smaller modules. com.vladsch.flexmark.util no longer contains any files, only separate utility modules, with the flexmark-utils module being an aggregate of all utilities modules, similar to flexmark-all.
  - ast/ classes to flexmark-util-ast
  - builder/ classes to flexmark-util-builder
  - collection/ classes to flexmark-util-collection
  - data/ classes to flexmark-util-data
  - dependency/ classes to flexmark-util-dependency
  - format/ classes to flexmark-util-format
  - html/ classes to flexmark-util-html
  - mappers/ classes to flexmark-util-sequence
  - options/ classes to flexmark-util-options
  - sequence/ classes to flexmark-util-sequence
  - visitor/ classes to flexmark-util-visitor
- Break: delete deprecated properties, methods and classes.
- Add: org.jetbrains:annotations:15.0 dependency to have @Nullable/@NotNull annotations added for all parameters. When using IntelliJ IDEA for development, it helps to have these annotations for analysis of potential problems and makes it easier to use the library with Kotlin.
- Break: refactor and clean up tests to eliminate duplicated code and allow easier reuse of test cases with spec example data.
- Break: move formatter tests to the flexmark-core-test module to allow sharing of formatter base classes in extensions without causing dependency cycles in the formatter module.
- Break: move formatter module into flexmark core. This module is almost always included anyway because most extensions have a dependency on formatter for their custom formatting implementations. Having it as part of the core allows relying on its functionality in all modules.
- Break: move com.vladsch.flexmark.spec and com.vladsch.flexmark.util in flexmark-test-util to com.vladsch.flexmark.test.spec and com.vladsch.flexmark.test.util respectively, to respect the naming convention between modules and their packages.
- Break: NodeVisitor implementation details have changed. If you were overriding NodeVisitor.visit(Node) in the previous version, it is now final to ensure a compile time error is generated. You will need to change your implementation. See the javadoc comment in the NodeVisitor class for instructions.

  ℹ️ com.vladsch.flexmark.util.ast.Visitor is only needed for implementation of NodeVisitor and VisitHandler. If all anonymous implementations of VisitHandler are converted to lambdas, then imports for Visitor can be eliminated.
- Fix: remove old visitor like adapters and implement ones based on generic classes not linked to flexmark AST node.
  - remove old base classes:
    - com.vladsch.flexmark.util.ast.NodeAdaptedVisitor, see javadoc for class
    - com.vladsch.flexmark.util.ast.NodeAdaptingVisitHandler
    - com.vladsch.flexmark.util.ast.NodeAdaptingVisitor

IntelliJ-IDEA migration migrate flexmark-java 0_50_x to 0_60_0.xml can be used to assist in migrating from 0.50.40 to 0.60 version of the library. It will migrate class name and package changes only. Changes to arguments and method changes have to be addressed manually.
LineFormattingAppendable is renamed to LineAppendable. Its implementation and subclasses are
similarly renamed to remove Formatting from the class name.
All formatting flags are now prefixed with F_ and, when present, select the given modification
of appended text. Previously, ALLOW_LEADING_WHITESPACE and ALLOW_LEADING_EOL were inverted,
and setting them disabled the text modification.
- ALLOW_LEADING_WHITESPACE is now F_TRIM_LEADING_WHITESPACE and has inverted meaning.
- ALLOW_LEADING_EOL is now F_TRIM_LEADING_EOL and has inverted meaning.
- CONVERT_TABS is now F_CONVERT_TABS
- COLLAPSE_WHITESPACE is now F_COLLAPSE_WHITESPACE
- TRIM_TRAILING_WHITESPACE is now F_TRIM_TRAILING_WHITESPACE
- PASS_THROUGH is now F_PASS_THROUGH
- TRIM_LEADING_WHITESPACE is now F_TRIM_LEADING_WHITESPACE
- PREFIX_PRE_FORMATTED is now F_PREFIX_PRE_FORMATTED
- FORMAT_ALL is now F_FORMAT_ALL
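Because two of the flags changed meaning as well as name, a plain rename is not enough when migrating a combined flag value. The sketch below illustrates the inversion on a small subset of the flags. The numeric bit values, the OLD_ prefix and the migrate helper are assumptions made up for this example. Real code should use the constants defined by the library rather than these stand-ins.

```java
/** Illustrates migrating a combined flag value when some flags have inverted meaning. */
final class FlagMigration {
    // Hypothetical bit values for illustration only; the library defines its own constants.
    static final int OLD_CONVERT_TABS = 1;
    static final int OLD_COLLAPSE_WHITESPACE = 1 << 1;
    static final int OLD_ALLOW_LEADING_WHITESPACE = 1 << 2;
    static final int OLD_ALLOW_LEADING_EOL = 1 << 3;

    static final int F_CONVERT_TABS = 1;
    static final int F_COLLAPSE_WHITESPACE = 1 << 1;
    static final int F_TRIM_LEADING_WHITESPACE = 1 << 2;
    static final int F_TRIM_LEADING_EOL = 1 << 3;

    /** Converts an old-style flag combination to the equivalent new-style combination. */
    static int migrate(int oldFlags) {
        int newFlags = 0;

        // Flags that were only renamed carry over unchanged.
        if ((oldFlags & OLD_CONVERT_TABS) != 0) newFlags |= F_CONVERT_TABS;
        if ((oldFlags & OLD_COLLAPSE_WHITESPACE) != 0) newFlags |= F_COLLAPSE_WHITESPACE;

        // Inverted flags: the old ALLOW_ flag switched trimming off,
        // so trimming is requested in the new style only when the old flag was absent.
        if ((oldFlags & OLD_ALLOW_LEADING_WHITESPACE) == 0) newFlags |= F_TRIM_LEADING_WHITESPACE;
        if ((oldFlags & OLD_ALLOW_LEADING_EOL) == 0) newFlags |= F_TRIM_LEADING_EOL;

        return newFlags;
    }

    public static void main(String[] args) {
        int oldFlags = OLD_CONVERT_TABS | OLD_ALLOW_LEADING_WHITESPACE;
        System.out.println("old=" + Integer.toBinaryString(oldFlags)
                + " new=" + Integer.toBinaryString(migrate(oldFlags)));
    }
}
```

Note that an old value of 0 maps to both trim flags being set, which is the case most easily missed when renaming by search and replace.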
This interface and its implementation classes were refactored and reworked for efficient
use with SequenceBuilder.
- CharPredicate class is now used to provide character sets instead of CharSequence, to provide consistent and efficient character tests. Methods with CharSequence arguments which were used for selecting character sets now take CharPredicate.

  The simplest way to change the method call is to use CharPredicate.anyOf(CharSequence) to convert a character sequence to a predicate.
- Some methods were renamed to better reflect their operation. In these cases the old name methods are deprecated and their default implementation invokes the new methods.
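To illustrate why a predicate can be a better fit than a CharSequence for describing a character set, the sketch below builds a membership set once and then answers each test with a single lookup instead of scanning the sequence every time. This is a stand-alone illustration using only the JDK. The CharTest interface and both helper methods are hypothetical stand-ins and say nothing about how the library's CharPredicate is implemented.

```java
import java.util.BitSet;

/** Stand-alone illustration of converting a character sequence to a reusable predicate. */
final class CharSetSketch {
    /** Hypothetical stand-in for a character predicate; not the library's type. */
    interface CharTest {
        boolean test(char c);
    }

    /** Builds the set once, so each later test is a bit lookup rather than a scan. */
    static CharTest anyOf(CharSequence chars) {
        BitSet set = new BitSet();
        for (int i = 0; i < chars.length(); i++) {
            set.set(chars.charAt(i));
        }
        return c -> set.get(c);
    }

    /** Counts how many leading characters of the text are accepted by the predicate. */
    static int countLeading(CharSequence text, CharTest test) {
        int i = 0;
        while (i < text.length() && test.test(text.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        CharTest whitespace = anyOf(" \t");
        System.out.println(countLeading(" \t indented", whitespace));
    }
}
```

A method that previously accepted " \t" directly would, in this style, accept the predicate built from it, which mirrors the conversion suggested above.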
The old SegmentedSequence class was renamed to SegmentedSequenceFull, which contains the old,
inefficient implementation. Using the old class is not recommended due to its inefficient and,
in some cases, buggy implementation.
The new SegmentedSequence is an abstract class with concrete implementations
SegmentedSequenceFull and SegmentedSequenceTree. The latter is an efficient implementation
using a binary search tree.
The right way to create an instance of SegmentedSequence is to use an instance of
SequenceBuilder to build a sequence, then use SequenceBuilder.toSequence(), which returns an
instance of SegmentedSequenceTree if the result requires a segmented sequence, or a subsequence
of the underlying BasedSequence if the result is a single segment.
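The general idea behind a segmented sequence can be shown with a small stand-alone sketch: the text is a list of segments, each either a range of the base text or out-of-base literal characters, and a character index is resolved by searching the cumulative segment end offsets. The sketch uses a binary search over a sorted list rather than the binary search tree mentioned above, and every class and method name in it is hypothetical. It is not the library's implementation or API.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy segmented text: base ranges plus literal text, resolved by binary search. */
final class SegmentedText {
    private static final class Segment {
        final int baseStart;  // offset into the base text, or -1 for out-of-base text
        final String literal; // non-null only when baseStart == -1
        final int length;

        Segment(int baseStart, String literal, int length) {
            this.baseStart = baseStart;
            this.literal = literal;
            this.length = length;
        }
    }

    private final String base;
    private final List<Segment> segments = new ArrayList<>();
    private final List<Integer> endOffsets = new ArrayList<>(); // cumulative end of each segment

    SegmentedText(String base) {
        this.base = base;
    }

    /** Appends the base range [start, end); segments may be added in any order. */
    SegmentedText appendRange(int start, int end) {
        return add(new Segment(start, null, end - start));
    }

    /** Appends characters which do not come from the base text. */
    SegmentedText appendLiteral(String text) {
        return add(new Segment(-1, text, text.length()));
    }

    private SegmentedText add(Segment segment) {
        if (segment.length > 0) {
            int end = length() + segment.length;
            segments.add(segment);
            endOffsets.add(end);
        }
        return this;
    }

    int length() {
        return endOffsets.isEmpty() ? 0 : endOffsets.get(endOffsets.size() - 1);
    }

    /** Finds the first segment whose cumulative end offset is greater than the index. */
    private int findSegment(int index) {
        if (index < 0 || index >= length()) throw new IndexOutOfBoundsException("index: " + index);
        int lo = 0, hi = segments.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (endOffsets.get(mid) > index) hi = mid; else lo = mid + 1;
        }
        return lo;
    }

    char charAt(int index) {
        int i = findSegment(index);
        Segment s = segments.get(i);
        int within = index - (endOffsets.get(i) - s.length);
        return s.baseStart >= 0 ? base.charAt(s.baseStart + within) : s.literal.charAt(within);
    }

    /** Returns the offset in the base text for the index, or -1 for out-of-base characters. */
    int sourceOffset(int index) {
        int i = findSegment(index);
        Segment s = segments.get(i);
        int within = index - (endOffsets.get(i) - s.length);
        return s.baseStart >= 0 ? s.baseStart + within : -1;
    }

    @Override
    public String toString() {
        StringBuilder out = new StringBuilder(length());
        for (int i = 0; i < length(); i++) out.append(charAt(i));
        return out.toString();
    }
}
```

For example, with the base text "hello world", appending the range 6 to 11, then the literal ", ", then the range 0 to 5 yields a sequence in which every character taken from the base still reports its original offset, while the literal characters report -1.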