Systems and
Formalisms Lab

Adding lookbehinds to rust-lang/regex

An annotated guide to adding captureless lookbehinds to the Rust linear-time regex engine.

In a previous blog post, Erik wrote about how he implemented the linear-time matching algorithm from [RegElk+PLDI24] in the popular regex engine RE2. In this one, we look at how to do the same thing for the official regex engine of the Rust language. Namely, we add support for unbounded captureless lookbehinds.

[RegElk+PLDI24]

First, let's look at the newly supported feature and its limitations. Lookbehinds allow a regex to make assertions about the text preceding some part of the pattern without making that text part of the match. Similarly, negative lookbehinds, which we also support, assert that something is not preceding. Consider the regex

(?<=Title:\s+)\w+

which would match the following strings (matches underlined with ~):

Title: HelloWorld
       ~~~~~~~~~~

Title: Title: foo
       ~~~~~
              ~~~

But it does not match anything in the following string, where the "Title:" prefix is absent:

HelloWorld

As seen in the example, lookbehind expressions can be unbounded. This means we do not know ahead of time how many characters a lookbehind will match. This is an improvement over many existing regex engines, which support lookbehinds only of bounded (like the ubiquitous PCRE) or even fixed (like Python's re) length. The downside is that our lookbehinds cannot contain capture groups, the feature that allows extracting the substring matched by a part of the regex pattern.

The actual implementation can be found in the PR to the rust-lang/regex repository.

Architecture of the regex crate

In the Rust ecosystem, libraries that are published are called crates. The language team maintains a handful of official crates, among which there is one called "regex" that provides exactly what you would expect: a regex matching engine.

Under the hood, the regex crate is only a thin wrapper around the much more elaborate "regex-automata" crate, which provides several different engine implementations to match regexes. Furthermore, the code for parsing regexes is split into yet another crate called "regex-syntax". In the following, we will give an overview of the architecture of both these crates.

Regex-Syntax

This crate takes care of processing the raw regex input into a representation that can be easily used by the regex compiler. The first step is to parse the string representing a regex into an AST (abstract syntax tree). At this stage, all unsupported regex features, such as backreferences, are rejected. Then, the AST is translated into a more normalized representation called the HIR (high-level intermediate representation). Each node in the HIR stores properties which help the compiler and the engines optimize or guarantee correctness of the regex matching process. An example of such a property is the length of the shortest and longest string that can be matched by the HIR node.

Regex-Automata

This crate features many different implementations of a regex engine, each with different capabilities and runtime characteristics. All engines promise to have linear runtime complexity (in the length of the haystack and size of the regex), but the constant factors differ. The main engine that we modified for our implementation is the PikeVM which performs extended NFA simulation.

However, there is one other engine in this crate that is of great importance: the meta-regex. This engine is what the wrapping top-level "regex" crate uses, and its job is to select the "best" of the other engines for performing a search. This decision procedure is based on heuristics that don't matter much for understanding our work, but one important detail is that the engine first tries to find a match without using any regex engine at all, if possible. This is the case when the regex consists only of a literal character sequence, a disjunction of such sequences, or something that can be transformed into such a disjunction. A match can then be found by a "simple" substring search.
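To make this fast path concrete, here is a sketch over a toy AST (hypothetical types, not the crate's real ones) of how one might detect that a pattern is just an alternation of literals:

```rust
// Toy AST (hypothetical; the real crate works on its own AST/HIR types).
enum Ast {
    Literal(String),
    Alternation(Vec<Ast>),
    Repetition(Box<Ast>),
}

// Returns the literals if the whole pattern is only an alternation of them;
// such a pattern can be answered with plain substring search, no engine needed.
fn as_literal_set(ast: &Ast) -> Option<Vec<&str>> {
    match ast {
        Ast::Literal(s) => Some(vec![s.as_str()]),
        Ast::Alternation(branches) => {
            let mut out = Vec::new();
            for b in branches {
                out.extend(as_literal_set(b)?);
            }
            Some(out)
        }
        _ => None, // anything else needs a real engine
    }
}

fn main() {
    // "foo|bar" qualifies for the substring-search fast path.
    let re = Ast::Alternation(vec![
        Ast::Literal("foo".into()),
        Ast::Literal("bar".into()),
    ]);
    assert_eq!(as_literal_set(&re), Some(vec!["foo", "bar"]));

    // "a*" does not.
    let re2 = Ast::Repetition(Box::new(Ast::Literal("a".into())));
    assert!(as_literal_set(&re2).is_none());
}
```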

Implementation work

Most of the initial work to get lookbehinds up and running consisted of implementing the two new NFA state types "WriteLookAround" and "CheckLookAround" (code). The first is essentially the equivalent of a "match" state but for a lookbehind, while the second instructs the regex engine to check whether a certain lookbehind matches at the current string position. We named them "*LookAround" instead of "*LookBehind" because, in principle, they are enough to implement lookaheads as well. In this section and the following sub-sections, we dive deeper into the lookbehind algorithm we used and the issues that arise from it.

Once these states were properly produced by the regex-parser and compiler pipeline, and once we told the PikeVM how to interpret these instructions, we had a basically working implementation. The changes to the code base were very similar in spirit to those made to RE2, albeit a bit more verbose, mainly due to Rust's strict type system and the way the crate is structured.

However, there were certain challenges worth looking at in detail. Some of them showed up during the initial implementation phase while others surfaced only once we started benchmarking the resulting engine.

Frontend

Graph illustrating the frontend pipeline of the regex crate.

Updated frontend pipeline: from raw input to an HIR.

To support lookbehinds, the parser had to accept them while still rejecting lookaheads. Additionally, if a lookbehind contains a capture group, the parser needs to reject it as well, since our implementation does not currently support that. The error messages indicate to the user that the syntax is recognized but not supported.

During the AST to HIR translation, the stored properties are updated to account for the different semantics of lookbehinds. For example, a lookbehind does not contribute to the length of the match, so the length properties of its HIR node are set to zero. A new property is added to the HIR node that indicates whether or not the node contains a lookbehind. This is used in places where the code has to behave differently depending on whether or not the regex contains a lookbehind.
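As a much-simplified illustration of how such properties might combine, here is a sketch with hypothetical types; the real regex-syntax properties are far richer than this:

```rust
// Hypothetical, simplified stand-in for HIR properties (not the real
// regex-syntax types): shortest/longest match lengths plus a lookbehind flag.
#[derive(Clone, Copy)]
struct Props {
    min_len: usize,
    max_len: Option<usize>, // None = unbounded
    contains_lookbehind: bool,
}

// Concatenation sums the lengths and propagates the lookbehind flag.
fn concat(a: Props, b: Props) -> Props {
    Props {
        min_len: a.min_len + b.min_len,
        max_len: a.max_len.zip(b.max_len).map(|(x, y)| x + y),
        contains_lookbehind: a.contains_lookbehind || b.contains_lookbehind,
    }
}

// A lookbehind contributes nothing to the match length, whatever it contains.
fn lookbehind(_inner: Props) -> Props {
    Props { min_len: 0, max_len: Some(0), contains_lookbehind: true }
}

fn main() {
    let title = Props { min_len: 7, max_len: None, contains_lookbehind: false }; // Title:\s+
    let word = Props { min_len: 1, max_len: None, contains_lookbehind: false }; // \w+
    let re = concat(lookbehind(title), word); // (?<=Title:\s+)\w+
    assert_eq!(re.min_len, 1); // the lookbehind's characters don't count
    assert!(re.contains_lookbehind);
}
```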

The Meta-Engine

As explained above, the meta-engine (or meta-regex) bundles all other engines into one and chooses the "most efficient" of them for any search executed. Since we only implemented lookbehind support for the PikeVM, we had to make sure the meta-engine considers only the PikeVM as soon as there are any lookbehind expressions in the regex. This includes the shortcut of skipping all engines and using a plain substring search, which we explicitly disabled, since lookbehinds are not part of the match yet influence what counts as a match.

This turned out to be easier than anticipated. The selection strategy essentially boils down to trying to build each of the engines in order of efficiency and using the first one that succeeds. Therefore, we simply had to make sure the build process of all engines other than the PikeVM fails with a proper error, whenever there is a lookbehind in the regex.

There was one caveat though: the meta-regex expected the build process of the bounded backtracker to always succeed, because a backtracking engine supports all regex features. The only reason the backtracking engine would not be used is if the configured max memory usage is not sufficient to guarantee linear runtime. Therefore, we added a manual condition that checks for lookbehinds explicitly and skips the backtracker in that case (code).

Lookbehind Threads

When we started benchmarking, it became evident that something caused matching with lookbehinds to run longer than needed. Once we figured out the cause, it wasn't too hard to fix, but let's first take a look at how we implemented the problematic part initially.

The concept of the algorithm is to run multiple NFA simulations in lockstep: one automaton for each lookbehind and one for the main regex. In practice, however, it would have taken quite a bit of refactoring to actually run multiple simulations in lockstep. So how did we do it instead? In the case of the PikeVM, running multiple NFAs is equivalent to running the single automaton that is the union of all individual sub-automata, since the PikeVM does not perform a depth-first search like the backtracking engine. Therefore, that is exactly what we did: compile the automaton for each lookbehind and for the main regex, then patch them together under one big union state from which the simulation is started.

Now for the algorithm to work correctly, it is essential that the NFAs corresponding to lookbehinds that are more deeply nested are executed first. This is because we need the results of inner lookbehinds in order to correctly determine the results of outer ones. To make sure this is the case, the live threads of the inner lookbehinds must have higher priority than the ones of the outer lookbehinds. For the particular implementation of the PikeVM we were dealing with, this means that the threads of the inner lookbehinds must be added to the ordered set of live threads first, which is easy to guarantee by ordering the corresponding sub-automaton earlier in the top-level union state.

So far so good, but now comes the problematic part: priority not only determines which threads are processed first, it also determines what match is returned when multiple are found. This is important to uphold the same match semantics as a backtracking engine. Consider, for example, the regex "a*b|a" on the haystack "aaab". At the beginning of the search, the PikeVM registers a match immediately after consuming the first character. However, because the first alternative in a regex takes priority over the second, the PikeVM cannot return that match yet. It must continue scanning the haystack until it reaches the end, at which point it registers that the first alternative actually matches as well and this is the match that is finally returned. Importantly, even on the haystack "aaaa", where the first alternative does not match, the PikeVM must still scan to the end of the string to realize that the first alternative does not match and it is ok to return the match of the second alternative; there is simply no way to know beforehand.
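This priority rule can be reproduced with a tiny hand-written matcher for exactly this regex (a hypothetical helper for illustration, not engine code):

```rust
// Hand-rolled priority semantics for "a*b|a", anchored at position 0:
// the first alternative wins whenever it matches, even though the second
// alternative matches much earlier in the scan.
fn leftmost_match(haystack: &str) -> Option<&str> {
    let bytes = haystack.as_bytes();

    // First alternative: a*b. We must scan past all the 'a's to decide it.
    let mut i = 0;
    while i < bytes.len() && bytes[i] == b'a' {
        i += 1;
    }
    if i < bytes.len() && bytes[i] == b'b' {
        return Some(&haystack[..=i]); // "a...ab"
    }

    // Second alternative: a single "a". Only taken once the first failed.
    if bytes.first() == Some(&b'a') {
        return Some(&haystack[..1]);
    }
    None
}

fn main() {
    // The engine must scan to the 'b' before committing to the full match.
    assert_eq!(leftmost_match("aaab"), Some("aaab"));
    // Only after scanning all the 'a's may it fall back to the second branch.
    assert_eq!(leftmost_match("aaaa"), Some("a"));
}
```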

What does this have to do with lookbehinds? Remember that we compile the NFA in a way where all sub-automata have higher priority than the main regex. What's more, every lookbehind sub-automaton has a ".*?" prefix, to make sure we capture all lookbehind matches during our search and not only one at the beginning of the haystack. In combination, this means that there are always live threads with higher priority than any thread that could produce a match from the main regex, which in turn means that the PikeVM scans to the end of the haystack regardless of found matches. Because this behavior arises as soon as any lookbehind is present, no matter the complexity of the lookbehind or the main regex, it was quite problematic performance-wise.

As already stated, the solution to this problem was more or less straightforward: we needed to keep the live threads of the lookbehinds in a place separate from the threads that belong to the main regex. That way, as soon as the ordered set of live threads of the main regex runs empty, we can be sure there won't be any more matches; similarly, if the highest-priority main-regex thread produces a match, it will actually be registered as such and the search can be concluded immediately.

Making this separation work required refactoring the NFA compilation a bit. We still compile all sub-automata into a single NFA struct, albeit without patching them together with a union anymore. Additionally, the NFA struct now has a field that contains the indices of all the sub-automata starting states. This is necessary because we now need to explicitly initialize the starting thread for each sub-automaton.
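A toy model of the effect (not real PikeVM code; the thread counts are made up) shows why the separation matters:

```rust
// Toy model: with a single union of threads, the ".*?"-prefixed lookbehind
// threads never die, so the scan always runs to the end of the haystack.
fn scan_with_union(haystack_len: usize) -> usize {
    haystack_len // positions scanned: all of them, no early exit possible
}

// With separate lists, the search stops as soon as the main-regex thread
// list runs empty, regardless of how many lookbehind threads remain alive.
fn scan_with_separation(haystack_len: usize, mut main_threads: usize) -> usize {
    for pos in 0..haystack_len {
        if main_threads == 0 {
            return pos; // no main threads left: no further match possible
        }
        main_threads -= 1; // toy model: one main thread dies per position
    }
    haystack_len
}

fn main() {
    // On a long haystack where the main regex dies quickly, the separated
    // version scans only a tiny prefix.
    assert_eq!(scan_with_union(1_000_000), 1_000_000);
    assert_eq!(scan_with_separation(1_000_000, 3), 3);
}
```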

Two NFAs, one with a union of lookbehinds and the main regex, the other with separate lookbehinds and the main regex.

Resulting NFA structure after separating the lookbehinds and removing the union.

It should be noted that we also considered the idea of tagging the live threads with the information of whether or not they belong to a lookbehind (either by storing this information in the thread itself or in the state of the NFA that is referenced by the thread). While this would most likely have worked too, the code for detecting whether there are still live threads from the main regex would, in comparison, be much more complicated and most likely less performant.

Match-All

The problem described above, unnecessarily scanning to the end, became apparent while benchmarking match-all searches, in which the regex engine returns all non-overlapping matches found in the string. Scanning to the end of the string regardless of found matches causes the overall search to have quadratic, instead of linear, runtime complexity in the length of the haystack. This quadratic behavior of match-all is a known existing issue, even without any lookbehinds present. However, most regexes do not trigger it, and certainly not the simple examples we used in our benchmarks.

Once the lookbehind thread separation was implemented, we were shocked to find that the search, although faster, still had quadratic complexity. After some debugging, we noticed that the problem now was that the lookbehind threads were not kept alive between the successive search iterations used to find all matches. Essentially, match-all was implemented as follows: find a match and return it, then start a completely fresh search at the position where the previous match ended. Before starting this successive search, all state accumulated during the previous search is discarded, including the lookbehind threads.

This is of course detrimental to performance because it means that now, instead of scanning until the end for every match, we have to start searching from the beginning of the haystack after each match to make sure we have up to date lookbehind threads when we reach the actual starting position of the search. This "catching up" is essential for correctness as we need to know whether a lookbehind holds or not even for the first position in the haystack that is searched.

The work done in this case is, however, wasted effort, because the information was available during the search for the previous match. Hence, the solution was to add a configuration option, activated only for match-all searches, which keeps the live threads of the lookbehinds whenever a match is found. These threads are then restored at the beginning of the next search, making the "catching up" unnecessary.

Evaluation

Firstly, in each modified module, unit tests have been added to give us assurance that the implementation is correct. Integration tests have also been added to test the entire regex pipeline (code).

The author of the regex crate has also created a benchmarking tool called rebar for comparing various regex engines. Using it, it was rather straightforward to add our modified version of the engine and compare its performance both to the original regex crate and to other engines. We benchmarked our modifications for three reasons. First, we wanted to make sure that our changes did not cause a significant slowdown for searches that do not include lookbehinds. Second, we wanted to check whether the performance of searches that do include lookbehinds is reasonable. To that end, we compared against the Python library "re". We chose this engine for its ease of use, which made the benchmark procedure more straightforward, and because it is used ubiquitously in Python programs. Importantly, however, the "re" library uses a different algorithm, a backtracker. Finally, we wanted to make sure that the search implementation with lookbehinds respects the linear time complexity constraint (benchmark definition).

The comparison between our modified version and the original crate showed no significant slowdowns, which leads us to believe that we successfully implemented lookbehinds without penalizing regexes that don't need the feature.

After having fixed the performance issues mentioned earlier, we were also successful in demonstrating linearity using the benchmarking results on searches of increasing size.

Comparing overall performance to python-re was a bit more challenging. Since the existing benchmarking suite does not contain regexes with features unsupported by the regex crate, no included benchmarks contain regexes with lookbehinds. Coming up with meaningful and representative benchmarks is challenging on its own. What's more, many regexes found in real software either don't use any of the extended regex features at all, because they target an engine that does not support them, or they use not only lookbehinds but also lookaheads and backreferences, which we did not implement support for (backreference matching is known to be NP-hard).

In the end, we decided to adopt the benchmarking strategy described in [RegexMatching+POPL24]. They used the ruleset of the network monitoring tool Snort, which contains some rules that use regexes. Out of all these regexes, only 5 fulfilled the criteria of containing lookbehinds but no other unsupported regex features. Like the authors of the paper, we used the network capture logs from the wrccdc archive as a haystack.

The results showed that we were around 2-5x slower than python-re on most benchmarks. This is certainly acceptable, given that the regexes are most likely written in a way that optimizes them for a backtracking engine, which is what python-re provides.

There were, however, two benchmarks on which our implementation performed around 60x slower. We suspected that python-re optimizes bounded lookbehinds, leading to huge speedups. This hypothesis was later affirmed when we implemented optimizations for bounded lookbehinds ourselves, which brought the performance into the same ballpark as the other benchmarks.

[RegexMatching+POPL24]

Extensions

We spent most of our time polishing the basic implementation described above. The goal was to get a working implementation of unbounded lookbehind expressions in the regex crate, and we opened a PR to get our changes merged upstream. There is, however, quite a bit more that is possible, and we decided to spend some time on the following two topics. Given the clean implementation in the PikeVM and all the required changes in the parser, compiler, etc., it was also easy to split the work between the two of us and go in different directions.

Bounded lookbehinds optimization - Marcin

To understand the optimization, we must first understand the problem of the lookbehind automata playing catch-up. A search is performed on a haystack, but more specifically on a subrange of it: the search range. For instance, as mentioned, match-all works by starting brand new searches on the same haystack, with the beginning of each new search range set to the end of the previous match. The issue is that lookbehinds must disregard the search range and always start their execution from the very beginning of the haystack, because a lookbehind's expression can potentially start its match before the start of the search range. But if we know that a lookbehind needs to start only k characters before the search range instead of at the very beginning of the haystack, we can avoid a lot of useless work. This is exactly what this optimization does. In practice, in our benchmarks this optimization yielded speedups of up to 150x. The implementation can be found here.

Computing the k

To guarantee correctness of matching, the chosen k can be overestimated but not underestimated. This is why starting at the very beginning of the haystack is always a correct approach; in that case, k = ∞.

To compute k for each lookbehind, two components are needed: the longest length of a string that can be matched by that lookbehind (called n), and the shortest length of a string that can be matched by the regex that precedes the lookbehind (called m).

Let's look at an example:

A regex with lookbehinds and their respective n and m values.

An example color-coded regex with lookbehinds and their respective n, m values.

  1. The red lookbehind has n = ∞ since a+ can match any number of characters. m = 0 since there is nothing preceding that lookbehind.

  2. The yellow lookbehind has n = 4 since it always matches exactly 4 characters. m = 0 since there is nothing preceding that lookbehind. Note that the m value is relative to the surrounding lookbehind.

  3. The brown lookbehind has n = 1 since it always matches exactly 1 character. m = 2 because that is the shortest matching length of aa.

  4. The blue lookbehind has n = 2 since it always matches exactly 2 characters. The surrounding lookbehinds do not contribute to the match length. m = 1 because that is the shortest matching length of a*a (zero repetitions of "a" followed by one "a").

Then, k is equal to the sum of all surrounding lookbehind k values, plus n, minus m.

  1. The red lookbehind has k_red = ∞ - 0 = ∞

  2. The blue lookbehind has k_blue = 2 - 1 = 1

  3. The yellow lookbehind has k_yellow = k_blue + 4 - 0 = 5

  4. The brown lookbehind has k_brown = k_blue + 1 - 2 = 0

Notice that it is possible for k to be negative, such as with aaa(?<=a). That simply means that the lookbehind automaton can start after the main automaton.

The k values are computed during the NFA compilation phase, which allows all engines to utilize them for their own optimizations. In the scope of our work, we take advantage of these k values only in the PikeVM, and m is not computed but conservatively set to zero.
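A sketch of this computation (our own illustration, not the crate's code), using `None` to encode an unbounded value:

```rust
// k = (sum of surrounding lookbehind k values) + n - m, where None stands
// for an unbounded (∞) quantity; any unbounded component makes k unbounded.
fn k_value(n: Option<i64>, m: i64, outer_k: Option<i64>) -> Option<i64> {
    match (n, outer_k) {
        (Some(n), Some(outer)) => Some(outer + n - m),
        _ => None, // unbounded lookbehind, or unbounded surrounding k
    }
}

fn main() {
    // Values from the color-coded example above.
    let k_red = k_value(None, 0, Some(0)); // n = ∞, so k = ∞
    let k_blue = k_value(Some(2), 1, Some(0)); // 0 + 2 - 1 = 1
    let k_yellow = k_value(Some(4), 0, k_blue); // 1 + 4 - 0 = 5
    let k_brown = k_value(Some(1), 2, k_blue); // 1 + 1 - 2 = 0
    assert_eq!(k_red, None);
    assert_eq!(k_blue, Some(1));
    assert_eq!(k_yellow, Some(5));
    assert_eq!(k_brown, Some(0));
}
```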

New game of catch-up

Now that we know the k value for each lookbehind, we can start each lookbehind automaton at a different position of the haystack, k characters before the start of the search range. Of course, this optimization works only for bounded lookbehinds, since any unbounded lookbehind has a value of k = ∞ and would therefore always start at the beginning of the haystack.

There is, however, one more aspect that has to be considered for the correctness of this optimization in the PikeVM. As mentioned before, it is crucial that the inner lookbehinds are executed before the outer ones, so that the side effects of WriteLookAround are visible to CheckLookAround. Since the haystack is traversed from left to right, lookbehinds with a larger k value start their execution before those with a smaller k value. Thus, to maintain correctness, we cannot allow inner lookbehinds to have k values smaller than the k of their outer lookbehinds.

The k values are computed by the NFA compiler, and the PikeVM then conservatively adjusts them to ensure correctness. The adjustment takes the maximum of the lookbehind's own k value and its outer lookbehind's k value, so that after the adjustment, an inner lookbehind never starts later than its outer lookbehinds. Thus, k_brown = max(k_blue, k_brown) = max(1, 0) = 1.
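As a sketch (hypothetical helper, not the crate's code), the adjustment is just a maximum:

```rust
// Raise an inner lookbehind's k to at least its outer lookbehind's k, so
// the inner automaton starts no later than the outer one in the haystack.
fn adjust_k(inner_k: i64, outer_k: i64) -> i64 {
    inner_k.max(outer_k)
}

fn main() {
    // k_brown = max(k_blue, k_brown) = max(1, 0) = 1
    assert_eq!(adjust_k(0, 1), 1);
    // An inner lookbehind that already starts early enough is untouched.
    assert_eq!(adjust_k(5, 1), 5);
}
```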

Bounded Backtracker - Robin

Alongside the NFA-simulating PikeVM and a couple of DFA-based strategies, the regex crate also features a backtracking engine. Now, hold on, you might say, doesn't a backtracking engine have exponential time complexity? You would be right for a standard backtracking engine, but this one is implemented with memoization; it is sometimes also called a bitstate backtracker.

Since backtracking is generally faster than NFA simulation and most backtracking engines have lookbehind support built in, I decided to try and implement lookbehind support in the backtracker of the regex crate. There were two main challenges to be solved: lookbehind reversal and memoization of lookbehind results.

Lookbehind reversal

The way a backtracking engine works can be described as "trial-and-error". It tries to match the haystack with the highest-priority variant of the given regex, that is, the leftmost alternative in a union, as many repetitions as possible for quantifiers, etc. As soon as there is a failure, it backtracks to the position in the haystack where it last made a decision and tries the next possibility in decreasing priority order.

When the backtracker reaches a position at which a lookbehind is to be checked, it hence has no idea whether the characters preceding that position match the lookbehind or not. Because we allow unbounded lookbehinds, meaning the maximum number of characters one needs to inspect for the lookbehind is not known in advance, the only possibility the backtracker has is to step through the haystack in reverse. To be able to do this, it needs to know how to match the lookbehind in reverse.

Luckily, the regex crate's compiler already supported reversed compilation of an arbitrary regex. So what I did was simply compile each lookbehind sub-expression both in the forward direction (used by the PikeVM) and in reverse (used by the backtracker). To be able to use the correct NFA states, I had to add another field to the NFA struct with the starting states of the reversed lookbehind sub-expressions.
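The idea of reversed compilation can be sketched on a toy AST (hypothetical types; the real compiler works on its own representation):

```rust
// Toy AST: reversing a concatenation reverses the order of its parts and
// reverses each part, so the backtracker can walk the haystack right-to-left.
#[derive(Debug, PartialEq)]
enum Ast {
    Char(char),
    Concat(Vec<Ast>),
}

fn reverse(ast: &Ast) -> Ast {
    match ast {
        Ast::Char(c) => Ast::Char(*c),
        Ast::Concat(parts) => Ast::Concat(parts.iter().rev().map(reverse).collect()),
    }
}

fn main() {
    // (?<=abc) checked in reverse matches "cba" while stepping backwards.
    let abc = Ast::Concat(vec![Ast::Char('a'), Ast::Char('b'), Ast::Char('c')]);
    let cba = Ast::Concat(vec![Ast::Char('c'), Ast::Char('b'), Ast::Char('a')]);
    assert_eq!(reverse(&abc), cba);
}
```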

One important detail to note is that the reverse sub-automaton for each lookbehind must not have the ".*?" prefix added, since the backtracker already knows at what position in the haystack the lookbehind should hold when it comes around to checking one.

Lookbehind memoization

The way memoization is implemented for the main regex is roughly as follows: whenever the backtracker reaches an (NFA state, haystack position) pair for the second time, it immediately abandons the current search path and backtracks to the last decision point. Why is that? Because the backtracker has the great advantage that it always explores the highest-priority match first, meaning that as soon as a match is found, it can be returned instantly. Therefore, if it reaches such a state-position pair a second time, this can only mean that the path following it will not lead to a match.
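The pruning rule can be sketched with a toy depth-first search (not the real engine; the transition structure is made up):

```rust
use std::collections::HashSet;

// Toy backtracker: abandoning a path as soon as a (state, position) pair
// repeats bounds the total work by states x positions, instead of the
// exponential number of distinct paths.
fn backtrack(
    state: usize,
    pos: usize,
    visited: &mut HashSet<(usize, usize)>,
    steps: &mut usize,
) {
    if !visited.insert((state, pos)) {
        return; // seen before: this path cannot lead to a new match
    }
    *steps += 1;
    // Toy transitions: two successors that converge on the same pairs.
    if pos < 4 {
        backtrack(state, pos + 1, visited, steps);
        backtrack(state, pos + 1, visited, steps);
    }
}

fn main() {
    let mut visited = HashSet::new();
    let mut steps = 0;
    backtrack(0, 0, &mut visited, &mut steps);
    // Without memoization there are 2^4 = 16 distinct paths to the end;
    // with it, each (state, position) pair is expanded exactly once.
    assert_eq!(steps, 5); // positions 0..=4 at state 0
}
```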

We would like to have the same kind of memoization for lookbehind exploration, but there is a problem. We will illustrate it with the following (nonsensical) example regex:

^(a|aa)(?<=aa?)b

Suppose we run it on the haystack "aab". Here is an illustration for the first execution path the backtracker will try:

Graph illustrating the path explored by the backtracking engine.

First explored path.

The left automaton is the main regex and the right one is the lookbehind. The green arrows indicate the path explored by the backtracker and the red arrows indicate a failed transition. Note that the transition from state 3 to 4 is attempted only after the entire lookbehind automaton has been explored and a match for the lookbehind has been found at position 1 in the haystack. The sequence lists the state-position pairs visited in order of execution. Once the backtracker fails at the transition from 3 to 4, it backtracks to state 0 and explores the missing path in the main regex. Here is an illustration of the second execution path:

Graph illustrating the path explored by the backtracking engine.

Second explored path.

As you can see, state 7 is reached at position 0 of the haystack, just as in the previous execution path. This means that the backtracker aborts and reports no match found, even though a match would clearly be found if it continued the execution. The blue arrow indicates that the backtracker aborted execution during the check of the lookbehind.

The reason for this wrong behaviour is that we might reach lookbehind states again even though they actually lead to a successful match of the lookbehind expression. Hence, for state-position pairs of lookbehind automata, it is important to store whether the exploration has led to a match or not. It is sufficient to store one additional bit in the memoization table that records this result.

The table is initialized with false and is only updated when an exploration has led to a positive result. This is sound because of the depth-first search order inherent to backtracking: the earliest a state-position pair can be revisited is after its exploration has terminated.

The way we set the bit works as follows: whenever the backtracker transitions to a new state, it first checks the visited table to see if it has seen the state-position pair before. If it has not, and it is currently exploring a transition belonging to a lookbehind, it pushes a note onto its DFS stack. The note contains the current state id and haystack position. Once the result of the lookbehind check is known (by running out of options to explore or by encountering a WriteLookAround state), the DFS algorithm pops the notes off the stack. For each note popped, it sets the memoization bit for the corresponding state-position pair to true, provided a WriteLookAround state has been encountered.
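A sketch of this bookkeeping (hypothetical data layout, not the actual implementation):

```rust
use std::collections::HashMap;

// Lookbehind (state, position) pairs record whether the exploration that
// first visited them ended in a WriteLookAround, so a revisit can reuse a
// positive outcome instead of wrongly pruning to "no match".
struct LookbehindMemo {
    // false (the initial value on first visit) = visited, not known successful.
    table: HashMap<(usize, usize), bool>,
}

impl LookbehindMemo {
    // Some(result) if visited before, None on the first visit.
    fn check(&mut self, state: usize, pos: usize) -> Option<bool> {
        if let Some(&r) = self.table.get(&(state, pos)) {
            return Some(r);
        }
        self.table.insert((state, pos), false);
        None
    }

    // Called while popping notes off the DFS stack, once the lookbehind's
    // result is known and a WriteLookAround was reached.
    fn mark_success(&mut self, notes: &[(usize, usize)]) {
        for &(state, pos) in notes {
            self.table.insert((state, pos), true);
        }
    }
}

fn main() {
    let mut memo = LookbehindMemo { table: HashMap::new() };
    assert_eq!(memo.check(7, 0), None); // first visit of state 7 at position 0
    memo.mark_success(&[(7, 0)]); // the exploration reached WriteLookAround
    assert_eq!(memo.check(7, 0), Some(true)); // revisit reuses the result
}
```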

Match-All

Similar to the initial implementation of the algorithm in the PikeVM, the implementation of the backtracker suffers from quadratic runtime complexity when performing a match-all search. The reason is, however, a different one. The bitstate vector is cleared whenever a new search is started (resetting both the memoization of what state-position-pairs it has visited as well as the memoized lookbehind outcome). It's not so much the information that is lost in doing so that causes the quadratic runtime, but rather the fact that this clearing operation itself takes time linear in the size of the haystack and the size of the regex. Therefore, this is actually a problem that affects the backtracker for all regexes.

My guess for why this isn't considered a problem in practice is that it only materializes when the number of matches is very close to the size of the haystack, and in that case the regex is usually so simple that it can easily be handled by the other engines, with no need to fall back to the backtracker or even the PikeVM.

I tried to fix the problem in the same spirit as for the PikeVM: simply don't reset the visited cache during a match-all search. Unfortunately, this does not work. The reason is subtle, and I couldn't find a way around it: matches are delayed by one character in the engine so that simple lookahead assertions (e.g. end-of-string) can be checked. This causes states to be visited at the end position of the match, inserting (state, position) pairs into the visited set. When the next search is started, we're in trouble because its starting position equals the end position of the preceding match. This means certain states that were visited in the preceding search should now be re-explored, but they will simply cause a failed match because they are already present in the memoization table.

One way to potentially solve this would be to clear only the bits associated with the end position of the last match instead of the entire memoization table. However, this would still cause quadratic complexity in the size of the regex. That is arguably much better than quadratic in both the size of the regex and the size of the haystack, but it is still not a satisfying solution, so I did not implement it; it would have been quite a bit of work and I am not even sure it would result in correct search semantics.

Conclusion

The feature of lookbehinds is very often absent from linear-time regex engines. In this effort, we have laid the groundwork to bring them to the Rust ecosystem by implementing lookbehinds in the regex crate. The implementation has been polished with practical optimizations, such as those for bounded lookbehinds, the prevention of quadratic runtime complexity in match-all searches, and the prevention of unnecessary lookbehind scanning to the end of the haystack. The benchmarks show reasonable and usable performance, making the implementation ready for real-world applications. Finally, solid foundations have been established in the backtracking engine for further extensions, namely lookaheads. We hope that this work will benefit the Rust community and encourage other linear-time regex engines to implement lookbehinds as well!

Appendix

benchmarks

benchmark                                     python/re            rust/regex-lookbehind  rust/regex-lookbehind-new
---------                                     ---------            ---------------------  -------------------------
lookbehind/snort/snort-0                      2.2 GB/s (1.00x)     45.0 MB/s (50.40x)     1034.7 MB/s (2.19x)
lookbehind/snort/snort-1                      204.0 MB/s (1.00x)   34.3 MB/s (5.94x)      34.1 MB/s (5.99x)
lookbehind/snort/snort-2                      107.1 MB/s (71.24x)  53.0 MB/s (143.94x)    7.5 GB/s (1.00x)
lookbehind/snort/snort-3                      100.7 MB/s (80.25x)  102.2 MB/s (79.08x)    7.9 GB/s (1.00x)
lookbehind/snort/snort-4                      2041.9 MB/s (1.00x)  45.9 MB/s (44.52x)     967.3 MB/s (2.11x)
lookbehind/snort/linear-haystack-1000         119.4 MB/s (1.00x)   27.2 MB/s (4.39x)      26.3 MB/s (4.54x)
lookbehind/snort/linear-haystack-10000        136.3 MB/s (1.00x)   27.3 MB/s (4.98x)      25.8 MB/s (5.28x)
lookbehind/snort/linear-haystack-100000       136.8 MB/s (1.00x)   27.3 MB/s (5.01x)      25.8 MB/s (5.29x)
lookbehind/snort/linear-haystack-many-1000    17.0 MB/s (1.00x)    7.8 MB/s (2.19x)       7.8 MB/s (2.18x)
lookbehind/snort/linear-haystack-many-10000   17.0 MB/s (1.00x)    7.8 MB/s (2.19x)       7.8 MB/s (2.18x)
lookbehind/snort/linear-haystack-many-100000  18.0 MB/s (1.00x)    7.8 MB/s (2.31x)       7.8 MB/s (2.31x)