Name Description Size
mod.rs This module contains types and routines for implementing determinization. In this crate, there are at least two places where we implement determinization: fully ahead-of-time compiled DFAs in the `dfa` module and lazily compiled DFAs in the `hybrid` module. The stuff in this module corresponds to the things that are in common between these implementations. There are three broad things that our implementations of determinization have in common, as defined by this module: (a) The classification of start states. That is, whether we're dealing with word boundaries, line boundaries, etc., is all the same. This also includes the look-behind assertions that are satisfied by each starting state classification. (b) The representation of DFA states as sets of NFA states, including convenience types for building these DFA states that are amenable to reusing allocations. (c) Routines for the "classical" parts of determinization: computing the epsilon closure, tracking match states (with corresponding pattern IDs, since we support multi-pattern finite automata) and, of course, computing the transition function between states for units of input. I did consider a couple of alternatives to this particular form of code reuse: 1. Don't do any code reuse. The problem here is that we *really* want both forms of determinization to do exactly identical things when it comes to their handling of NFA states. While our tests generally ensure this, the code is tricky and large enough that not reusing code is a pretty big bummer. 2. Implement all of determinization once and make it generic over fully compiled DFAs and lazily compiled DFAs. While I didn't actually try this approach, my instinct is that it would be more complex than is needed here, and the interface required would be pretty hairy. Instead, I think splitting it into logical sub-components works better. 28587
state.rs 34858
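
The "classical" parts of determinization that the mod.rs description mentions (epsilon closure, match tracking with pattern IDs, and the per-byte transition function) can be illustrated with a toy sketch. Everything below is hypothetical: the `State` enum and the `epsilon_closure`, `step`, and `matches` functions are invented for illustration and are not this crate's types or API. The sketch also leaves out start state classification, look-behind assertions, and allocation reuse, all of which the real module handles.

```rust
use std::collections::BTreeSet;

/// A toy NFA state. These variants are hypothetical and much simpler
/// than whatever the real crate uses.
#[derive(Debug)]
enum State {
    /// Consume exactly one byte, then move to `next`.
    Byte { byte: u8, next: usize },
    /// Move to every listed state without consuming input.
    Epsilon(Vec<usize>),
    /// Report a match for the given pattern ID.
    Match(usize),
}

/// Add to `set` every NFA state reachable from `start` by following
/// only epsilon transitions (including `start` itself).
fn epsilon_closure(nfa: &[State], start: usize, set: &mut BTreeSet<usize>) {
    let mut stack = vec![start];
    while let Some(id) = stack.pop() {
        if !set.insert(id) {
            continue; // already visited
        }
        if let State::Epsilon(targets) = &nfa[id] {
            stack.extend(targets.iter().copied());
        }
    }
}

/// Compute the DFA state (a set of NFA states) reached from `current`
/// after consuming `input`.
fn step(nfa: &[State], current: &BTreeSet<usize>, input: u8) -> BTreeSet<usize> {
    let mut out = BTreeSet::new();
    for &id in current {
        if let State::Byte { byte, next } = &nfa[id] {
            if *byte == input {
                epsilon_closure(nfa, *next, &mut out);
            }
        }
    }
    out
}

/// Return the pattern IDs that match in the given DFA state.
fn matches(nfa: &[State], set: &BTreeSet<usize>) -> Vec<usize> {
    set.iter()
        .filter_map(|&id| match &nfa[id] {
            State::Match(pid) => Some(*pid),
            _ => None,
        })
        .collect()
}
```

A DFA state here is just a sorted set of NFA state IDs, so two DFA states are equal exactly when their sets are equal. A fully compiled DFA would call `step` for every reachable set and every byte up front, while a lazy DFA would call it on demand during a search; sharing one implementation of these routines is what keeps the two in agreement.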