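URL Pattern API
Note: This feature is available in Web Workers.
The URL Pattern API defines a syntax for creating URL pattern matchers. These patterns can be matched against URLs or individual URL components. The URL Pattern API is used by the URLPattern interface.
Concepts and usage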
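The pattern syntax is based on the syntax of the path-to-regexp library. Patterns can contain:
- Literal strings, which are matched exactly.
- Wildcards (/posts/*), which match any character.
- Named groups (/books/:id), which extract a part of the matched URL.
- Non-capturing groups (/books{/old}?), which make part of a pattern optional or let it match multiple times.
- RegExp groups (/books/(\\d+)), which perform arbitrarily complex regex matches, subject to a few limitations. Note that the parentheses are not part of the regex; they mark their contents as a regex.
You can find details about the syntax in the Pattern syntax section below.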
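The pattern forms listed above can be tried side by side. The following is a minimal sketch, not a definitive reference: captureGroups is a hypothetical helper name, example.com is a placeholder host, and the demo assumes a runtime that exposes URLPattern globally (it is skipped otherwise).

```javascript
// Hypothetical helper: returns the pathname groups captured by
// `pathnamePattern`, or null when the URL does not match.
function captureGroups(pathnamePattern, url) {
  const pattern = new URLPattern({ pathname: pathnamePattern });
  const result = pattern.exec(url);
  return result ? { ...result.pathname.groups } : null;
}

// URLPattern is experimental, so only run the demo where it exists.
if (typeof URLPattern !== "undefined") {
  const base = "https://example.com"; // placeholder origin

  // Wildcard: captured under a numeric index
  console.log(captureGroups("/posts/*", `${base}/posts/2022/intro`)); // { '0': '2022/intro' }
  // Named group: captured under its name
  console.log(captureGroups("/books/:id", `${base}/books/123`)); // { id: '123' }
  // Non-capturing group: optional text, nothing captured
  console.log(captureGroups("/books{/old}?", `${base}/books`)); // {}
  console.log(captureGroups("/books{/old}?", `${base}/books/old`)); // {}
  // RegExp group: only digits are accepted
  console.log(captureGroups("/books/(\\d+)", `${base}/books/123`)); // { '0': '123' }
  console.log(captureGroups("/books/(\\d+)", `${base}/books/abc`)); // null
}
```

Returning a plain copy of the groups object keeps the helper's result independent of the match result it came from.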
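Interfaces
The URL Pattern API has only one related interface:
URLPattern (Experimental)
Represents a pattern that can match URLs or parts of URLs. The pattern can contain capturing groups that extract parts of the matched URL.
Pattern syntax
The pattern syntax is based on the path-to-regexp JavaScript library. This syntax is similar to the one used in Ruby on Rails or in JavaScript frameworks such as Express or Next.js.
Fixed text and capture groups
Each pattern can contain a combination of fixed text and groups. Fixed text is a sequence of characters that is matched exactly. Groups match an arbitrary string based on matching rules. Each URL part has its own default rules, which are explained below, but they can be overridden.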
// A pattern matching some fixed text
const pattern = new URLPattern({ pathname: "/books" });
console.log(pattern.test("https://example.com/books")); // true
console.log(pattern.exec("https://example.com/books").pathname.groups); // {}
// A pattern matching with a named group
const pattern = new URLPattern({ pathname: "/books/:id" });
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.exec("https://example.com/books/123").pathname.groups); // { id: '123' }
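Segment wildcard
By default, a group matching the pathname part of a URL matches all characters except the forward slash (/). In the hostname part, the group matches all characters except the dot (.). In all other parts, the group matches all characters. The segment wildcard is non-greedy, meaning it matches the shortest possible string.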
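To illustrate these per-component defaults, here is a small sketch. The helper name matches, the group names id, sub, and rest, and the example.com URLs are all assumptions made for illustration; the demo only runs where URLPattern is available globally.

```javascript
// Hypothetical helper: does `url` match a pattern built from `init`?
function matches(init, url) {
  return new URLPattern(init).test(url);
}

// URLPattern is experimental, so only run the demo where it exists.
if (typeof URLPattern !== "undefined") {
  // pathname: a group stops at "/"
  console.log(matches({ pathname: "/books/:id" }, "https://example.com/books/123")); // true
  console.log(matches({ pathname: "/books/:id" }, "https://example.com/books/123/456")); // false

  // hostname: a group stops at "."
  console.log(matches({ hostname: ":sub.example.com" }, "https://cdn.example.com/")); // true
  console.log(matches({ hostname: ":sub.example.com" }, "https://a.b.example.com/")); // false

  // hash (like the remaining components): a group matches every character
  const result = new URLPattern({ hash: ":rest" }).exec("https://example.com/#a/b.c");
  console.log(result.hash.groups.rest); // 'a/b.c'
}
```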
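Regex matchers
Instead of using the default matching rules for a group, you can supply a regex for it by enclosing the regex in parentheses. This regex defines the matching rules for the group. Below is an example of a regex matcher on a named group that restricts the group to match only when it contains one or more digits: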
const pattern = new URLPattern("/books/:id(\\d+)", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books/abc")); // false
console.log(pattern.test("https://example.com/books/")); // false
Regex matcher limitations
Some regex patterns may not work as you might expect:
- An expression starting with ^ only matches when it is used at the start of the protocol portion of the URLPattern, and it is redundant there.
// with `^` in pathname
const pattern = new URLPattern({ pathname: "(^b)" });
console.log(pattern.test("https://example.com/ba")); // false
console.log(pattern.test("https://example.com/xa")); // false
// with `^` in protocol
const pattern = new URLPattern({ protocol: "(^https?)" });
console.log(pattern.test("https://example.com/index.html")); // true
console.log(pattern.test("xhttps://example.com/index.html")); // false
// without `^` in protocol
const pattern = new URLPattern({ protocol: "(https?)" });
console.log(pattern.test("https://example.com/index.html")); // true
console.log(pattern.test("xhttps://example.com/index.html")); // false
- An expression ending with $ only matches when it is used at the end of the hash portion of the URLPattern, and it is redundant there.
// with `$` in pathname
const pattern = new URLPattern({ pathname: "(path$)" });
console.log(pattern.test("https://example.com/path")); // false
console.log(pattern.test("https://example.com/other")); // false
// with `$` in hash
const pattern = new URLPattern({ hash: "(hash$)" });
console.log(pattern.test("https://example.com/#hash")); // true
console.log(pattern.test("xhttps://example.com/#otherhash")); // false
// without `$` in hash
const pattern = new URLPattern({ hash: "(hash)" });
console.log(pattern.test("https://example.com/#hash")); // true
console.log(pattern.test("xhttps://example.com/#otherhash")); // false
- Lookaheads and lookbehinds never match any portion of the URLPattern.
// lookahead
const pattern = new URLPattern({ pathname: "(a(?=b))" });
console.log(pattern.test("https://example.com/ab")); // false
console.log(pattern.test("https://example.com/ax")); // false
// negative-lookahead
const pattern = new URLPattern({ pathname: "(a(?!b))" });
console.log(pattern.test("https://example.com/ab")); // false
console.log(pattern.test("https://example.com/ax")); // false
// lookbehind
const pattern = new URLPattern({ pathname: "((?<=b)a)" });
console.log(pattern.test("https://example.com/ba")); // false
console.log(pattern.test("https://example.com/xa")); // false
// negative-lookbehind
const pattern = new URLPattern({ pathname: "((?<!b)a)" });
console.log(pattern.test("https://example.com/ba")); // false
console.log(pattern.test("https://example.com/xa")); // false
- Parentheses inside range expressions need to be escaped in URLPattern, even though RegExp does not require it.
new URLPattern({ pathname: "([()])" }); // throws
new URLPattern({ pathname: "([\\(\\)])" }); // ok
new RegExp("[()]"); // ok
new RegExp("[\\(\\)]"); // ok
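Unnamed and named groups
Groups can be either named or unnamed. A named group is specified by prefixing the group name with a colon (:). A regex group that is not prefixed by a colon and a name is unnamed. Unnamed groups are indexed numerically in the match result, based on their order in the pattern.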
// A named group
const pattern = new URLPattern("/books/:id(\\d+)", "https://example.com");
console.log(pattern.exec("https://example.com/books/123").pathname.groups); // { id: '123' }
// An unnamed group
const pattern = new URLPattern("/books/(\\d+)", "https://example.com");
console.log(pattern.exec("https://example.com/books/123").pathname.groups); // { '0': '123' }
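Group modifiers
Groups can also have modifiers. These are specified after the group name (or after the regex, if there is one). There are three modifiers: ? makes the group optional, + makes the group repeat one or more times, and * makes the group repeat zero or more times.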
// An optional group
const pattern = new URLPattern("/books/:id?", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books")); // true
console.log(pattern.test("https://example.com/books/")); // false
console.log(pattern.test("https://example.com/books/123/456")); // false
console.log(pattern.test("https://example.com/books/123/456/789")); // false
// A repeating group with a minimum of one
const pattern = new URLPattern("/books/:id+", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books")); // false
console.log(pattern.test("https://example.com/books/")); // false
console.log(pattern.test("https://example.com/books/123/456")); // true
console.log(pattern.test("https://example.com/books/123/456/789")); // true
// A repeating group with a minimum of zero
const pattern = new URLPattern("/books/:id*", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books")); // true
console.log(pattern.test("https://example.com/books/")); // false
console.log(pattern.test("https://example.com/books/123/456")); // true
console.log(pattern.test("https://example.com/books/123/456/789")); // true
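Group delimiters
Patterns can also contain group delimiters. These are parts of a pattern surrounded by curly braces ({}). Group delimiters are not captured in the match result the way capturing groups are, but modifiers can still be applied to them, just like groups. If a group delimiter has no modifier, its contents are treated as if they were simply part of the parent pattern. Group delimiters may not contain other group delimiters, but may contain any other pattern items (capturing groups, regexes, wildcards, or fixed text).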
// A group delimiter with a ? (optional) modifier
const pattern = new URLPattern("/book{s}?", "https://example.com");
console.log(pattern.test("https://example.com/books")); // true
console.log(pattern.test("https://example.com/book")); // true
console.log(pattern.exec("https://example.com/books").pathname.groups); // {}
// A group delimiter without a modifier
const pattern = new URLPattern("/book{s}", "https://example.com");
console.log(pattern.pathname); // /books
console.log(pattern.test("https://example.com/books")); // true
console.log(pattern.test("https://example.com/book")); // false
// A group delimiter containing a capturing group
const pattern = new URLPattern({ pathname: "/blog/:id(\\d+){-:title}?" });
console.log(pattern.test("https://example.com/blog/123-my-blog")); // true
console.log(pattern.test("https://example.com/blog/123")); // true
console.log(pattern.test("https://example.com/blog/my-blog")); // false
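Automatic group prefixing in pathnames
In patterns matching the pathname part of a URL, a group is automatically given a forward slash (/) prefix if the group definition is preceded by a forward slash (/). This is useful for groups with modifiers, as it allows repeating groups to work as expected.
If you do not want automatic prefixing, you can disable it by surrounding the group with group delimiters ({}). Group delimiters do not have automatic prefixing behavior.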
// A pattern with an optional group, preceded by a slash
const pattern = new URLPattern("/books/:id?", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books")); // true
console.log(pattern.test("https://example.com/books/")); // false
// A pattern with a repeating group, preceded by a slash
const pattern = new URLPattern("/books/:id+", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books/123/456")); // true
console.log(pattern.test("https://example.com/books/123/")); // false
console.log(pattern.test("https://example.com/books/123/456/")); // false
// Segment prefixing does not occur outside of pathname patterns
const pattern = new URLPattern({ hash: "/books/:id?" });
console.log(pattern.test("https://example.com#/books/123")); // true
console.log(pattern.test("https://example.com#/books")); // false
console.log(pattern.test("https://example.com#/books/")); // true
// Disabling segment prefixing for a group using a group delimiter
const pattern = new URLPattern({ pathname: "/books/{:id}?" });
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books")); // false
console.log(pattern.test("https://example.com/books/")); // true
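Wildcard tokens
The wildcard token (*) is shorthand for an unnamed capturing group that matches all characters zero or more times. You can place it anywhere in a pattern. The wildcard is greedy, meaning it matches the longest possible string.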
// A wildcard at the end of a pattern
const pattern = new URLPattern("/books/*", "https://example.com");
console.log(pattern.test("https://example.com/books/123")); // true
console.log(pattern.test("https://example.com/books")); // false
console.log(pattern.test("https://example.com/books/")); // true
console.log(pattern.test("https://example.com/books/123/456")); // true
// A wildcard in the middle of a pattern
const pattern = new URLPattern("/*.png", "https://example.com");
console.log(pattern.test("https://example.com/image.png")); // true
console.log(pattern.test("https://example.com/image.png/123")); // false
console.log(pattern.test("https://example.com/folder/image.png")); // true
console.log(pattern.test("https://example.com/.png")); // true
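Pattern normalization
When a pattern is parsed, it is automatically normalized to a canonical form. For example, Unicode characters are percent-encoded in the pathname property, punycode encoding is used in the hostname, default port numbers are elided, paths like /foo/./bar/ are collapsed to /foo/bar, and so on. In addition, some pattern representations parse to the same underlying meaning, such as foo and {foo}. Such cases are normalized to the simplest form; here, {foo} is changed to foo.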
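The normalization steps described above can be observed through matching behavior. This is a rough sketch under stated assumptions: matchesNormalized is a hypothetical helper name, the URLs are placeholders, and the demo only runs where URLPattern is available globally.

```javascript
// Hypothetical helper: build a pattern from `init` and test it against `url`.
function matchesNormalized(init, url) {
  return new URLPattern(init).test(url);
}

// URLPattern is experimental, so only run the demo where it exists.
if (typeof URLPattern !== "undefined") {
  // Redundant braces are dropped: "/{foo}" is stored as "/foo".
  console.log(new URLPattern({ pathname: "/{foo}" }).pathname); // '/foo'

  // Dot segments in the pattern are collapsed before matching.
  console.log(matchesNormalized({ pathname: "/foo/./bar" }, "https://example.com/foo/bar")); // true

  // Non-ASCII pathname text is percent-encoded, so it matches the encoded URL.
  console.log(matchesNormalized({ pathname: "/caf\u00e9" }, "https://example.com/caf%C3%A9")); // true

  // The default port for https is elided from the pattern.
  const withPort = new URLPattern({
    protocol: "https",
    hostname: "example.com",
    port: "443",
  });
  console.log(withPort.port); // ''
  console.log(withPort.test("https://example.com/")); // true
}
```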
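Case sensitivity
The URL Pattern API treats many parts of the URL as case-sensitive by default when matching. In contrast, many client-side JavaScript frameworks use case-insensitive URL matching. An ignoreCase option is available on the URLPattern() constructor to enable case-insensitive matching if desired.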
// Case-sensitive matching by default
const pattern = new URLPattern("https://example.com/2022/feb/*");
console.log(pattern.test("https://example.com/2022/feb/xc44rsz")); // true
console.log(pattern.test("https://example.com/2022/Feb/xc44rsz")); // false
Setting the ignoreCase option to true in the constructor switches all matching operations for the given pattern to case-insensitive:
// Case-insensitive matching
const pattern = new URLPattern("https://example.com/2022/feb/*", {
ignoreCase: true,
});
console.log(pattern.test("https://example.com/2022/feb/xc44rsz")); // true
console.log(pattern.test("https://example.com/2022/Feb/xc44rsz")); // true
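Examples
Filter on a specific URL component
The following example shows how a URLPattern filters a specific URL component. When the URLPattern() constructor is called with a structured object of component patterns, any missing components default to the * wildcard value.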
// Construct a URLPattern that matches a specific domain and its subdomains.
// All other URL components default to the wildcard `*` pattern.
const pattern = new URLPattern({
hostname: "{*.}?example.com",
});
console.log(pattern.hostname); // '{*.}?example.com'
console.log(pattern.protocol); // '*'
console.log(pattern.username); // '*'
console.log(pattern.password); // '*'
console.log(pattern.pathname); // '*'
console.log(pattern.search); // '*'
console.log(pattern.hash); // '*'
console.log(pattern.test("https://example.com/foo/bar")); // true
console.log(pattern.test({ hostname: "cdn.example.com" })); // true
console.log(pattern.test("custom-protocol://example.com/other/path?q=1")); // false
// Prints `false` because the hostname component does not match
console.log(pattern.test("https://cdn-example.com/foo/bar"));
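Construct a URLPattern from a full URL string
The following example shows how to construct a URLPattern from a full URL string with patterns embedded. For example, a : can be both the URL protocol suffix, as in https:, and the beginning of a named pattern group, as in :foo. It "just works" if there is no ambiguity between the URL syntax and the pattern syntax.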
// Construct a URLPattern that matches URLs to CDN servers loading jpg images.
// URL components not explicitly specified, like search and hash here, result
// in the empty string similar to the URL() constructor.
const pattern = new URLPattern("https://cdn-*.example.com/*.jpg");
console.log(pattern.protocol); // 'https'
console.log(pattern.hostname); // 'cdn-*.example.com'
console.log(pattern.pathname); // '/*.jpg'
console.log(pattern.username); // ''
console.log(pattern.password); // ''
console.log(pattern.search); // ''
console.log(pattern.hash); // ''
// Prints `true`
console.log(
pattern.test("https://cdn-1234.example.com/product/assets/hero.jpg"),
);
// Prints `false` because the search component does not match
console.log(
pattern.test("https://cdn-1234.example.com/product/assets/hero.jpg?q=1"),
);
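Constructing a URLPattern with an ambiguous URL string
The following example shows how a URLPattern constructed from an ambiguous string favors treating characters as part of the pattern syntax. In this case, the : character could be the protocol component suffix or the prefix of a named group in the pattern. The constructor chooses to treat it as part of the pattern and therefore determines that this is a relative pathname pattern. Since there is no base URL, the relative pathname cannot be resolved, and the constructor throws an error.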
// Throws because this is interpreted as a single relative pathname pattern
// with a ":foo" named group and there is no base URL.
const pattern = new URLPattern("data:foo*");
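Escaping characters to disambiguate URLPattern constructor strings
The following example shows how an ambiguous constructor string character can be escaped so that it is treated as a URL separator instead of a pattern character. Here : is escaped as \\:.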
// Constructs a URLPattern treating the `:` as the protocol suffix.
const pattern = new URLPattern("data\\:foo*");
console.log(pattern.protocol); // 'data'
console.log(pattern.pathname); // 'foo*'
console.log(pattern.username); // ''
console.log(pattern.password); // ''
console.log(pattern.hostname); // ''
console.log(pattern.port); // ''
console.log(pattern.search); // ''
console.log(pattern.hash); // ''
console.log(pattern.test("data:foobar")); // true
Using base URLs for test() and exec()
The following example shows how test() and exec() can use base URLs.
const pattern = new URLPattern({ hostname: "example.com", pathname: "/foo/*" });
// Prints `true` as the hostname based in the dictionary `baseURL` property
// matches.
console.log(
pattern.test({
pathname: "/foo/bar",
baseURL: "https://example.com/baz",
}),
);
// Prints `true` as the hostname in the second argument base URL matches.
console.log(pattern.test("/foo/bar", "https://example.com/baz"));
// Throws because the second argument cannot be passed with a dictionary input.
try {
pattern.test({ pathname: "/foo/bar" }, "https://example.com/baz");
} catch (e) {}
// The `exec()` method takes the same arguments as `test()`.
const result = pattern.exec("/foo/bar", "https://example.com/baz");
console.log(result.pathname.input); // '/foo/bar'
console.log(result.pathname.groups[0]); // 'bar'
console.log(result.hostname.input); // 'example.com'
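Using base URLs in the URLPattern constructor
The following example shows how to construct a URLPattern using a base URL. Note that the base URL in these cases is treated strictly as a URL and cannot itself contain any pattern syntax.
In addition, since the base URL provides a value for every component, the resulting URLPattern also has a value for every component, even if it is the empty string. This means you do not get the "default to wildcard" behavior.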
const pattern1 = new URLPattern({
pathname: "/foo/*",
baseURL: "https://example.com",
});
console.log(pattern1.protocol); // 'https'
console.log(pattern1.hostname); // 'example.com'
console.log(pattern1.pathname); // '/foo/*'
console.log(pattern1.username); // ''
console.log(pattern1.password); // ''
console.log(pattern1.port); // ''
console.log(pattern1.search); // ''
console.log(pattern1.hash); // ''
// Equivalent to pattern1
const pattern2 = new URLPattern("/foo/*", "https://example.com");
// Throws because a relative constructor string must have a base URL to resolve
// against.
try {
const pattern3 = new URLPattern("/foo/*");
} catch (e) {}
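Accessing matched group values
The following example shows how input values that match pattern groups can be accessed from the exec() result object. Unnamed groups are assigned index numbers sequentially.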
const pattern = new URLPattern({ hostname: "*.example.com" });
const result = pattern.exec({ hostname: "cdn.example.com" });
console.log(result.hostname.groups[0]); // 'cdn'
console.log(result.hostname.input); // 'cdn.example.com'
console.log(result.inputs); // [{ hostname: 'cdn.example.com' }]
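Accessing matched group values using custom names
The following example shows how groups can be given custom names, which can then be used to access the matched values in the result object.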
// Construct a URLPattern using matching groups with custom names. These
// names can then be later used to access the matched values in the result
// object.
const pattern = new URLPattern({ pathname: "/:product/:user/:action" });
const result = pattern.exec({ pathname: "/store/wanderview/view" });
console.log(result.pathname.groups.product); // 'store'
console.log(result.pathname.groups.user); // 'wanderview'
console.log(result.pathname.groups.action); // 'view'
console.log(result.pathname.input); // '/store/wanderview/view'
console.log(result.inputs); // [{ pathname: '/store/wanderview/view' }]
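Named groups lend themselves to simple request dispatch. The sketch below is one possible arrangement, not part of the API: the routes table, the route function, and the handler bodies are all hypothetical, and a production router would likely construct each URLPattern once instead of on every call. The demo only runs where URLPattern is available globally.

```javascript
// Hypothetical route table: each entry pairs a pathname pattern with a
// handler that receives the captured groups.
const routes = [
  { pathname: "/books/:id(\\d+)", handler: ({ id }) => `book ${id}` },
  {
    pathname: "/:product/:user/:action",
    handler: ({ product, user, action }) => `${user} wants to ${action} ${product}`,
  },
];

// Returns the first matching handler's result, or null when nothing matches.
function route(url) {
  for (const { pathname, handler } of routes) {
    const result = new URLPattern({ pathname }).exec(url);
    if (result) return handler(result.pathname.groups);
  }
  return null;
}

// URLPattern is experimental, so only run the demo where it exists.
if (typeof URLPattern !== "undefined") {
  console.log(route("https://example.com/books/42")); // 'book 42'
  console.log(route("https://example.com/store/wanderview/view")); // 'wanderview wants to view store'
  console.log(route("https://example.com/books/abc")); // null
}
```

Listing the more specific pattern first means it wins when a URL could match more than one route.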
Custom regular expression groups
The following example shows how a matching group can use a custom regular expression.
const pattern = new URLPattern({ pathname: "/(foo|bar)" });
console.log(pattern.test({ pathname: "/foo" })); // true
console.log(pattern.test({ pathname: "/bar" })); // true
console.log(pattern.test({ pathname: "/baz" })); // false
const result = pattern.exec({ pathname: "/foo" });
console.log(result.pathname.groups[0]); // 'foo'
Named groups with a custom regular expression
The following example shows how a custom regular expression can be used with a named group.
const pattern = new URLPattern({ pathname: "/:type(foo|bar)" });
const result = pattern.exec({ pathname: "/foo" });
console.log(result.pathname.groups.type); // 'foo'
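Making matching groups optional
The following example shows how to make a matching group optional by placing a ? modifier after it. For the pathname component, this also causes any preceding / character to be treated as an optional prefix of the group.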
const pattern = new URLPattern({ pathname: "/product/(index.html)?" });
console.log(pattern.test({ pathname: "/product/index.html" })); // true
console.log(pattern.test({ pathname: "/product" })); // true
const pattern2 = new URLPattern({ pathname: "/product/:action?" });
console.log(pattern2.test({ pathname: "/product/view" })); // true
console.log(pattern2.test({ pathname: "/product" })); // true
// Wildcards can be made optional as well. This may not seem to make sense
// since they already match the empty string, but it also makes the prefix
// `/` optional in a pathname pattern.
const pattern3 = new URLPattern({ pathname: "/product/*?" });
console.log(pattern3.test({ pathname: "/product/wanderview/view" })); // true
console.log(pattern3.test({ pathname: "/product" })); // true
console.log(pattern3.test({ pathname: "/product/" })); // true
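Making matching groups repeated
The following example shows how to make a matching group repeated by placing a + modifier after it. In the pathname component, this also treats the / prefix as special: it is repeated along with the group.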
const pattern = new URLPattern({ pathname: "/product/:action+" });
const result = pattern.exec({ pathname: "/product/do/some/thing/cool" });
console.log(result.pathname.groups.action); // 'do/some/thing/cool'
console.log(pattern.test({ pathname: "/product" })); // false
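Making matching groups optional and repeated
The following example shows how to make a matching group that is both optional and repeated, by placing a * modifier after the group. Again, the pathname component treats the / prefix as special: it becomes optional and is also repeated along with the group.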
const pattern = new URLPattern({ pathname: "/product/:action*" });
const result = pattern.exec({ pathname: "/product/do/some/thing/cool" });
console.log(result.pathname.groups.action); // 'do/some/thing/cool'
console.log(pattern.test({ pathname: "/product" })); // true
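Using a custom prefix or suffix for an optional or repeated modifier
The following example shows how curly braces can be used to denote a custom prefix and/or suffix to be operated on by a subsequent ?, *, or + modifier.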
const pattern = new URLPattern({ hostname: "{:subdomain.}*example.com" });
console.log(pattern.test({ hostname: "example.com" })); // true
console.log(pattern.test({ hostname: "foo.bar.example.com" })); // true
console.log(pattern.test({ hostname: ".example.com" })); // false
const result = pattern.exec({ hostname: "foo.bar.example.com" });
console.log(result.hostname.groups.subdomain); // 'foo.bar'
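Making text optional or repeated without a matching group
The following example shows how curly braces can be used to mark fixed text as optional or repeated without using a matching group.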
const pattern = new URLPattern({ pathname: "/product{/}?" });
console.log(pattern.test({ pathname: "/product" })); // true
console.log(pattern.test({ pathname: "/product/" })); // true
const result = pattern.exec({ pathname: "/product/" });
console.log(result.pathname.groups); // {}
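Using multiple components and features at once
The following example shows how many features can be combined across multiple URL components.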
const pattern = new URLPattern({
protocol: "http{s}?",
username: ":user?",
password: ":pass?",
hostname: "{:subdomain.}*example.com",
pathname: "/product/:action*",
});
const result = pattern.exec(
"http://foo:bar@sub.example.com/product/view?q=12345",
);
console.log(result.username.groups.user); // 'foo'
console.log(result.password.groups.pass); // 'bar'
console.log(result.hostname.groups.subdomain); // 'sub'
console.log(result.pathname.groups.action); // 'view'
Specifications
Specification |
---|
URL Pattern Standard |
See also
- A polyfill of URLPattern is available on GitHub
- The pattern syntax used by URLPattern is similar to the syntax used by path-to-regexp