mirror of https://github.com/pdemian/human2regex.git synced 2025-06-30 18:00:17 -07:00

Patrick Demian 7f516ec33b Added lib path, reorganized, began working on tutorial proper

2020-11-12 04:56:02 -05:00

Human2Regex

Purpose

Generate regular expressions from natural language.

Instead of a convoluted mess of symbols like /([\w\.=\-]*\w+)/g, why not write:

using global matching
create a group called capture_me
    match 0+ characters or "." or "=" or "-"
    match 1+ words

Is the latter not much easier to read and debug than the former?

Running the program should result in the following output:

Your regex = /(?<capture_me>[\w\.\=\-]*\w+)/g

You can then use your regex in your language of choice, with Human2Regex validating your regex for you.
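As a minimal sketch of using the generated regex in TypeScript, the snippet below collects every value captured by the capture_me group. The function name extractTokens and the sample input are assumptions made for illustration and are not part of Human2Regex. The pattern is written with `\w+`, since JavaScript has no possessive `++` quantifier.

```typescript
// Hypothetical helper (not part of Human2Regex): returns every substring
// captured by the named group "capture_me" in the generated regex.
function extractTokens(input: string): string[] {
  // A new RegExp object is created on each call, so lastIndex starts at 0.
  const regex = /(?<capture_me>[\w\.\=\-]*\w+)/g;
  const tokens: string[] = [];
  let match: RegExpExecArray | null;

  // With the "g" flag, exec() resumes from the end of the previous match.
  while ((match = regex.exec(input)) !== null) {
    const value = match.groups ? match.groups["capture_me"] : undefined;
    if (value !== undefined) {
      tokens.push(value);
    }
  }
  return tokens;
}

console.log(extractTokens("key=value some-name"));
// logs the two tokens "key=value" and "some-name"
```

Named groups require an ES2018 or later target, so this sketch assumes the compiler is configured accordingly.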

Another example

// H2R supports // # and /**/ as comments
// A group is only captured if given a name. 
// You can use "and", "or", "not" to specify `[]` regex
// You can use "then" to combine match statements, however I find using multiple "match" statements easier to read

// exact matching means use a ^ and $ to signify the start and end of the string

using global and exact matching
create an optional group called "protocol"
    match "http"
    optionally match "s"
    match "://"
create a group called "subdomain"
    repeat
        match 1+ words
        match "."
create a group called "domain"
    match 1+ words or "_" or "-"
    match "."
    match a word
# port, but we don't care about it, so ignore it
optionally match ":" then 0+ digits
create an optional group called "path"
    repeat
        match "/"
        match 0+ words or "_" or "-"
create an optional group
    # we don't want to capture the '?', so don't name the group until afterwards
    match "?"
    create a group called "query"
        repeat
            match 1+ words or "_" or "-"
            match "="
            match 1+ words or "_" or "-"
create an optional group
    # fragment, again, we don't care, so ignore everything afterwards
    match "#"
    match 0+ anything

Running the program should result in the following output:

Your regex = /^(?<protocol>https?\:\/\/)?(?<subdomain>(\w+\.)*)?(?<domain>(?:\w+|_|\-)+\.\w+)\:?\d*(?<path>(\/(?:\w+|_|\-)*)*)?(\?(?<query>((?:\w+|_|\-)+\=(?:\w+|_|\-)+)*))?(#.*)?$/g

Which one would you rather debug?
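To show what the named groups buy you, here is a rough sketch that applies the regex above to a URL and reads the parts back by name. The parseUrl helper, the UrlParts interface, and the sample URLs are assumptions for illustration only. The g flag is dropped in this sketch because exec() on a global regex keeps state between calls.

```typescript
// Regex copied from the output above, without the "g" flag (see lead-in).
const urlRegex =
  /^(?<protocol>https?\:\/\/)?(?<subdomain>(\w+\.)*)?(?<domain>(?:\w+|_|\-)+\.\w+)\:?\d*(?<path>(\/(?:\w+|_|\-)*)*)?(\?(?<query>((?:\w+|_|\-)+\=(?:\w+|_|\-)+)*))?(#.*)?$/;

// Hypothetical result shape; groups that did not take part become "".
interface UrlParts {
  protocol: string;
  subdomain: string;
  domain: string;
  path: string;
  query: string;
}

// Hypothetical helper (not part of Human2Regex).
function parseUrl(input: string): UrlParts | null {
  const match = urlRegex.exec(input);
  if (match === null || match.groups === undefined) {
    return null;
  }
  const groups = match.groups;
  const pick = (name: string): string => groups[name] ?? "";
  return {
    protocol: pick("protocol"),
    subdomain: pick("subdomain"),
    domain: pick("domain"),
    path: pick("path"),
    query: pick("query"),
  };
}

console.log(parseUrl("https://www.example.com/path/to?key=value#frag"));
// Because of "exact matching" (^ and $), trailing text makes the match fail:
console.log(parseUrl("example.com extra")); // null
```

Note that the port and the fragment are matched but not returned, because the H2R source leaves those groups unnamed.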

Webpage

Human2Regex is hosted on GitHub Pages at https://pdemian.github.io/human2regex/

API

Human2Regex is available as an embeddable API.

The API reference is available in API.md.

Usage

Build

npm run build

Run

Point your web browser to docs/index.html

Test

npm t

Todo

Return CommonError rather than requiring the user to convert to a CommonError
Move to yarn/npm
Add more regex options such as backreferences, subroutines, lookahead/lookbehind, and more character classes (e.g., [:alpha:])