The rewrite Valve is a component of the Apache Tomcat server, used mainly to modify the URL of an incoming request before the server processes it

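This article examines the RewriteValve component in Apache Tomcat. It covers how to configure and use it, including the syntax and behavior of directives such as RewriteCond and RewriteRule, as a guide to implementing URL rewriting.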


"The rewrite Valve" is a component of Apache Tomcat that rewrites and redirects incoming URLs before the request is processed.

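I. Features and Purpose

  • URL rewriting and redirection: like the mod_rewrite module of Apache HTTP Server, it modifies or redirects URLs according to rules.
  • SEO-friendly URLs: cleaner URLs can help search engine optimization.
  • URL cleanup: removes unnecessary query parameters and shortens URLs.
  • Request distribution: redirect rules can send requests to different servers so that no single server is overloaded, although the valve is not a full load balancer.
  • Access control: restricts or blocks certain kinds of requests to sensitive resources.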

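II. Configuration

  1. Enable the rewrite Valve: add the following element to conf/server.xml (inside a Host), or to the context.xml of a web application:
    <Valve className="org.apache.catalina.valves.rewrite.RewriteValve" />

  2. Create the configuration file: create a file named rewrite.config in the Host configuration folder (typically conf/Catalina/localhost/) when the valve is defined in a Host, or in the web application's WEB-INF/ directory when it is defined in context.xml.
  3. Define rewrite rules using mod_rewrite-like syntax, for example:
    RewriteRule ^/old-url$ /new-url [R=301,L]

    • ^/old-url$: matches the original path.
    • /new-url: the redirect target.
    • R=301: permanent redirect.
    • L: marks this as the last rule to apply.
  4. Restart Tomcat so that the configuration takes effect.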

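III. Rule Syntax and Examples

  • RewriteRule: defines a URL rewrite rule.
  • RewriteCond: sets a condition; the rule that follows is applied only when the condition is met.
  • Common flags
    • R=301: permanent redirect.
    • L: stop processing further rules.
    • NC: case-insensitive matching.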

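IV. Example Use Cases

  • Redirecting old links: send old URLs to their new locations so that existing links keep working.
  • Mitigating SQL injection: not the preferred defense, but blocking access to specific URLs can add a layer of protection.
  • Single entry point: redirect different domains or paths to the main site.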

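V. Things to Watch For

  • The configuration file must be in the right place (the Host configuration folder, or the web application's WEB-INF/).
  • Restart Tomcat after changing the configuration.
  • Follow the rule syntax strictly to avoid infinite redirects and other unexpected behavior.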

In short, the rewrite Valve is Tomcat's URL rewriting tool. With well-chosen rules it can improve a web application's usability, security, and maintainability.

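The rewrite Valve is a component of the Apache Tomcat server, used mainly to modify the URL of an incoming request before the server processes it. A more detailed breakdown follows:

  • Features
    • URL cleanup: removes unnecessary query parameters, shortens URLs, or redirects old URLs to new ones, so that URLs are shorter and more consistent.
    • SEO: rewriting improves the structure and readability of URLs, which can help search engine optimization.
    • Request distribution: redirect rules can spread incoming requests across several servers so that no single server is overwhelmed by traffic.
    • Access control: blocks or redirects certain kinds of requests, for example requests with malicious content or requests for restricted resources.
  • How it works: the rewrite Valve is a rule-based rewriting engine. It reads rules from its configuration file and uses regular expressions (Java's regex engine, whose syntax is largely Perl-compatible) to match and rewrite request URLs.
  • Configuration
    • Enable the component: add a Valve element with the class name org.apache.catalina.valves.rewrite.RewriteValve to conf/server.xml (inside a Host) or to a web application's context.xml.
    • Create the configuration file: create rewrite.config in the Host configuration folder or in the web application's WEB-INF folder, and write the URL rewrite rules there.
    • Define rewrite rules: rules in rewrite.config use regular expressions in a syntax similar to Apache HTTPD mod_rewrite. For example, RewriteRule ^/old-url$ /new-url [R=301,L] permanently redirects every request for http://example.com/old-url to http://example.com/new-url.
  • Core directives
    • RewriteCond: defines a rule condition. One or more RewriteCond directives can precede a RewriteRule; the rule is used only if the current state of the URI matches its pattern and these conditions are met.
    • RewriteRule: the core directive of the rewriting mechanism. It can be used many times, with each instance defining one rewrite rule. The order of definition matters, because rules are applied in that order at run time.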

The rewrite valve implements URL rewrite functionality in a way that is very similar to mod_rewrite from Apache HTTP Server.
Configuration

The rewrite valve is configured as a valve using the org.apache.catalina.valves.rewrite.RewriteValve class name.

The rewrite valve can be configured as a valve added in a Host. See the virtual-server documentation for information on how to configure it. It will use a rewrite.config file containing the rewrite directives, which must be placed in the Host configuration folder.

It can also be configured in the context.xml of a webapp. The valve will then use a rewrite.config file containing the rewrite directives, which must be placed in the WEB-INF folder of the web application.
Directives

The rewrite.config file contains a list of directives which closely resemble the directives used by mod_rewrite, in particular the central RewriteRule and RewriteCond directives. Lines that start with a # character are treated as comments and will be ignored.

Note: This section is a modified version of the mod_rewrite documentation, which is Copyright 1995-2006 The Apache Software Foundation, and licensed under the Apache License, Version 2.0.
RewriteCond

Syntax: RewriteCond TestString CondPattern

The RewriteCond directive defines a rule condition. One or more RewriteCond can precede a RewriteRule directive. The following rule is then only used if both the current state of the URI matches its pattern, and if these conditions are met.

TestString is a string which can contain the following expanded constructs in addition to plain text:

RewriteRule backreferences: These are backreferences of the form $N (0 <= N <= 9), which provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions.
RewriteCond backreferences: These are backreferences of the form %N (1 <= N <= 9), which provide access to the grouped parts (again, in parentheses) of the pattern, from the last matched RewriteCond in the current set of conditions.
RewriteMap expansions: These are expansions of the form ${mapname:key|default}. See the documentation for RewriteMap for more details.
Server-Variables: These are variables of the form %{ NAME_OF_VARIABLE } where NAME_OF_VARIABLE can be a string taken from the following list:

    HTTP headers:

    HTTP_USER_AGENT
    HTTP_REFERER
    HTTP_COOKIE
    HTTP_FORWARDED
    HTTP_HOST
    HTTP_PROXY_CONNECTION
    HTTP_ACCEPT

    connection & request:

    REMOTE_ADDR
    REMOTE_HOST
    REMOTE_PORT
    REMOTE_USER
    REMOTE_IDENT
    REQUEST_METHOD
    SCRIPT_FILENAME
    REQUEST_PATH
    CONTEXT_PATH
    SERVLET_PATH
    PATH_INFO
    QUERY_STRING
    AUTH_TYPE

    server internals:

    DOCUMENT_ROOT
    SERVER_NAME
    SERVER_ADDR
    SERVER_PORT
    SERVER_PROTOCOL
    SERVER_SOFTWARE

    date and time:

    TIME_YEAR
    TIME_MON
    TIME_DAY
    TIME_HOUR
    TIME_MIN
    TIME_SEC
    TIME_WDAY
    TIME

    specials:

    THE_REQUEST
    REQUEST_URI
    REQUEST_FILENAME
    HTTPS

These variables all correspond to the similarly named HTTP MIME-headers and Servlet API methods. Most are documented elsewhere in the Manual or in the CGI specification. Those that are special to the rewrite valve include those below.

REQUEST_PATH
    Corresponds to the full path that is used for mapping.
CONTEXT_PATH
    Corresponds to the path of the mapped context.
SERVLET_PATH
    Corresponds to the servlet path.
THE_REQUEST
    The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). This does not include any additional headers sent by the browser.
REQUEST_URI
    The resource requested in the HTTP request line. (In the example above, this would be "/index.html".)
REQUEST_FILENAME
    The full local file system path to the file or script matching the request.
HTTPS
    Will contain the text "on" if the connection is using SSL/TLS, or "off" otherwise.

Other things you should be aware of:

The variables SCRIPT_FILENAME and REQUEST_FILENAME contain the same value - the value of the filename field of the internal request_rec structure of the Apache server. The first name is the commonly known CGI variable name while the second is the appropriate counterpart of REQUEST_URI (which contains the value of the uri field of request_rec).
%{ENV:variable}, where variable can be any Java system property, is also available.
%{SSL:variable}, where variable is the name of an SSL environment variable, is not implemented yet. Example: %{SSL:SSL_CIPHER_USEKEYSIZE} may expand to 128.
%{HTTP:header}, where header can be any HTTP MIME-header name, can always be used to obtain the value of a header sent in the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header 'Proxy-Connection:'.
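The variable forms described above (%{NAME}, %{ENV:variable} and %{HTTP:header}) can be illustrated with a small, self-contained Java sketch. This is not the valve's own resolver: the class name `VariableResolverSketch`, the two input maps, the lower-cased header keys and the choice to expand unknown names to an empty string are all assumptions made for illustration.

```java
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; illustrates %{...} expansion only, not Tomcat's implementation.
class VariableResolverSketch {

    private static final Pattern VAR = Pattern.compile("%\\{([^}]+)\\}");

    // serverVars: plain variables such as REQUEST_METHOD.
    // headers: request headers keyed by lower-case name (an assumption of this sketch).
    static String expand(String testString, Map<String, String> serverVars,
                         Map<String, String> headers) {
        Matcher m = VAR.matcher(testString);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String name = m.group(1).trim();
            String value;
            if (name.startsWith("ENV:")) {
                // %{ENV:variable} reads a Java system property.
                value = System.getProperty(name.substring(4));
            } else if (name.startsWith("HTTP:")) {
                // %{HTTP:header} reads an arbitrary request header.
                value = headers.get(name.substring(5).toLowerCase(Locale.ROOT));
            } else {
                value = serverVars.get(name);
            }
            // Unknown names expand to the empty string in this sketch.
            m.appendReplacement(out, Matcher.quoteReplacement(value == null ? "" : value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `expand("%{REQUEST_METHOD} %{HTTP:Proxy-Connection}", vars, headers)` substitutes each construct before the result would be matched against a CondPattern.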

CondPattern is the condition pattern, a regular expression which is applied to the current instance of the TestString. TestString is first evaluated, before being matched against CondPattern.

Remember: CondPattern is a Perl-compatible regular expression with some additions:

You can prefix the pattern string with a '!' character (exclamation mark) to specify a non-matching pattern.
There are some special variants of CondPatterns. Instead of real regular expression strings you can also use one of the following:
    '<CondPattern' (lexicographically precedes)
    Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically precedes CondPattern.
    '>CondPattern' (lexicographically follows)
    Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically follows CondPattern.
    '=CondPattern' (lexicographically equal)
    Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString is lexicographically equal to CondPattern (the two strings are exactly equal, character for character). If CondPattern is "" (two quotation marks) this compares TestString to the empty string.
    '-d' (is directory)
    Treats the TestString as a pathname and tests whether or not it exists, and is a directory.
    '-f' (is regular file)
    Treats the TestString as a pathname and tests whether or not it exists, and is a regular file.
    '-s' (is regular file, with size)
    Treats the TestString as a pathname and tests whether or not it exists, and is a regular file with size greater than zero.
Note: All of these tests can also be prefixed by an exclamation mark ('!') to negate their meaning.
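The special variants above can be sketched in plain Java to make their semantics concrete. This is an illustration only, not the valve's code: `CondPatternSketch` and `test` are hypothetical names, and the use of whole-string matching (`Matcher.matches()`) for ordinary patterns is an assumption, so the example patterns are written to behave the same under partial matching.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

// Hypothetical helper illustrating CondPattern semantics; not Tomcat code.
class CondPatternSketch {

    // Returns true if the (already expanded) testString satisfies condPattern.
    static boolean test(String testString, String condPattern) {
        boolean negate = condPattern.startsWith("!");
        String p = negate ? condPattern.substring(1) : condPattern;
        boolean result;
        if (p.equals("-d")) {
            result = Files.isDirectory(Paths.get(testString));
        } else if (p.equals("-f")) {
            result = Files.isRegularFile(Paths.get(testString));
        } else if (p.equals("-s")) {
            result = isNonEmptyFile(Paths.get(testString));
        } else if (p.startsWith("<")) {
            result = testString.compareTo(p.substring(1)) < 0;   // lexicographically precedes
        } else if (p.startsWith(">")) {
            result = testString.compareTo(p.substring(1)) > 0;   // lexicographically follows
        } else if (p.startsWith("=")) {
            String expected = p.substring(1);
            if (expected.equals("\"\"")) {
                expected = "";                                   // ="" compares to the empty string
            }
            result = testString.equals(expected);
        } else {
            // Assumption: whole-string match.
            result = Pattern.compile(p).matcher(testString).matches();
        }
        // A leading '!' inverts the outcome of any of the tests above.
        return negate != result;
    }

    private static boolean isNonEmptyFile(Path path) {
        try {
            return Files.isRegularFile(path) && Files.size(path) > 0;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For example, `test("abc", "<abd")` is true because "abc" sorts before "abd", and `test("Mozilla/5.0", "!^Mozilla.*")` is false because the negation inverts a successful match.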
You can also set special flags for CondPattern by appending [flags] as the third argument to the RewriteCond directive, where flags is a comma-separated list of any of the following flags:
    'nocase|NC' (no case)
    This makes the test case-insensitive - differences between 'A-Z' and 'a-z' are ignored, both in the expanded TestString and the CondPattern. This flag is effective only for comparisons between TestString and CondPattern. It has no effect on file system and subrequest checks.
    'ornext|OR' (or next condition)
    Use this to combine rule conditions with a local OR instead of the implicit AND. Typical example:

    RewriteCond %{REMOTE_HOST}  ^host1.*  [OR]
    RewriteCond %{REMOTE_HOST}  ^host2.*  [OR]
    RewriteCond %{REMOTE_HOST}  ^host3.*
    RewriteRule ...some special stuff for any of these hosts...

    Without this flag you would have to write the condition/rule pair three times.
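How the implicit AND and the local OR combine can be sketched as follows. The class `ConditionChainSketch`, its `Cond` type and the `hostConditions()` helper are hypothetical and exist only for this illustration; each condition is reduced to a predicate on one value, which is a simplification of a real RewriteCond.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of how [OR] groups conditions; not Tomcat code.
class ConditionChainSketch {

    static final class Cond {
        final Predicate<String> test;
        final boolean orNext;   // true when the condition carries the [OR] flag

        Cond(Predicate<String> test, boolean orNext) {
            this.test = test;
            this.orNext = orNext;
        }
    }

    // Conditions are ANDed, except that a run linked by [OR] needs only one hit.
    static boolean matches(List<Cond> conds, String value) {
        boolean matched = true;
        int i = 0;
        while (i < conds.size()) {
            Cond c = conds.get(i);
            matched = c.test.test(value);
            if (matched) {
                // One alternative succeeded: skip the rest of this OR group.
                while (i < conds.size() && conds.get(i).orNext) {
                    i++;
                }
            } else if (!c.orNext) {
                return false;   // a failed condition with no alternative left
            }
            i++;
        }
        return matched;
    }

    // The three REMOTE_HOST conditions from the example above.
    static List<Cond> hostConditions() {
        return Arrays.asList(
                new Cond(h -> h.matches("^host1.*"), true),
                new Cond(h -> h.matches("^host2.*"), true),
                new Cond(h -> h.matches("^host3.*"), false));
    }
}
```

With these conditions, a remote host of "host2.example.org" satisfies the group through its second alternative, while "other.example.org" fails all three and the rule would not be applied.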

Example:

To rewrite the Homepage of a site according to the ‘User-Agent:’ header of the request, you can use the following:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*
RewriteRule ^/$ /homepage.max.html [L]

RewriteCond %{HTTP_USER_AGENT} ^Lynx.*
RewriteRule ^/$ /homepage.min.html [L]

RewriteRule ^/$ /homepage.std.html [L]

Explanation: If you use a browser which identifies itself as ‘Mozilla’ (including Netscape Navigator, Mozilla etc), then you get the max homepage (which could include frames, or other special features). If you use the Lynx browser (which is terminal-based), then you get the min homepage (which could be a version designed for easy, text-only browsing). If neither of these conditions apply (you use any other browser, or your browser identifies itself as something non-standard), you get the std (standard) homepage.
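The decision logic of the three rules above can be traced with a short Java sketch. It only mirrors the order of evaluation for illustration; the class and method names are hypothetical, and a missing User-Agent header is treated as matching neither condition, which is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical trace of the three homepage rules; not Tomcat code.
class HomepageRulesSketch {

    static String rewrite(String path, String userAgent) {
        // All three rules share the pattern ^/$, so other paths are left alone.
        if (!Pattern.matches("^/$", path)) {
            return path;
        }
        String ua = userAgent == null ? "" : userAgent;
        if (Pattern.matches("^Mozilla.*", ua)) {
            return "/homepage.max.html";   // first rule, stopped by [L]
        }
        if (Pattern.matches("^Lynx.*", ua)) {
            return "/homepage.min.html";   // second rule, stopped by [L]
        }
        return "/homepage.std.html";       // unconditional fallback rule
    }
}
```

Because each rule carries the L flag, the first rule whose condition holds ends processing, which is why the unconditional third rule acts as the default.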
RewriteMap

Syntax: RewriteMap name rewriteMapClassName optionalParameters

The maps are implemented using an interface that users must implement. Its class name is org.apache.catalina.valves.rewrite.RewriteMap, and its code is:

package org.apache.catalina.valves.rewrite;

public interface RewriteMap {
    default String setParameters(String... params); // calls setParameters(String) with the first parameter if there is only one
    public String setParameters(String params);
    public String lookup(String key);
}

The referenced implementation of such a class – in our example rewriteMapClassName – will be instantiated and initialized with the optional parameter – optionalParameters from above (be careful with whitespace) – by calling setParameters(String). That instance will then be registered under the name given as the first parameter of the RewriteMap rule.

Note: Starting with Tomcat 9 you can use more than one parameter. These have to be separated by spaces. Parameters can be quoted with ". This enables space characters inside parameters.

That map instance will be given the lookup value that is configured in the corresponding RewriteRule by calling lookup(String). Your implementation is free to return null to indicate that the given default should be used, or to return a replacement value.

Say, you want to implement a rewrite map function that converts all lookup keys to uppercase. You would start by implementing a class that implements the RewriteMap interface.

package example.maps;

import org.apache.catalina.valves.rewrite.RewriteMap;

public class UpperCaseMap implements RewriteMap {

    @Override
    public String setParameters(String params) {
        // nothing to be done here
        return null;
    }

    @Override
    public String lookup(String key) {
        if (key == null) {
            return null;
        }
        return key.toUpperCase();
    }

}

Compile this class, put it into a jar and place that jar in ${CATALINA_BASE}/lib.

Having done that, you can now define a map with the RewriteMap directive and further on use that map in a RewriteRule.

RewriteMap uc example.maps.UpperCaseMap

RewriteRule ^/(.*)$ ${uc:$1}

With this setup a request to the url path /index.html would get routed to /INDEX.HTML.
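The role of a null return value and of the default in ${mapname:key|default} can be shown with a second, parameterized map. So that the sketch compiles without Tomcat on the classpath, it declares a local stand-in interface `MapLike` with the same two methods; in a real deployment the class would implement org.apache.catalina.valves.rewrite.RewriteMap instead. The "a=b;c=d" parameter format, the class name `StaticPairsMap` and the `expand` helper are all invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in for org.apache.catalina.valves.rewrite.RewriteMap (assumption for self-containment).
interface MapLike {
    String setParameters(String params);
    String lookup(String key);
}

// Hypothetical map that is configured from its parameter string.
class StaticPairsMap implements MapLike {

    private final Map<String, String> pairs = new HashMap<>();

    // Expects "key=value" pairs separated by ';' (a format invented for this sketch).
    @Override
    public String setParameters(String params) {
        if (params != null) {
            for (String pair : params.split(";")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    pairs.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
                }
            }
        }
        return null;
    }

    // Returns null for unknown keys so that the caller falls back to the default.
    @Override
    public String lookup(String key) {
        return pairs.get(key);
    }

    // Mimics the effect of ${mapname:key|default} for illustration.
    static String expand(MapLike map, String key, String defaultValue) {
        String value = map.lookup(key);
        return value != null ? value : defaultValue;
    }
}
```

After `setParameters("old=new;docs=manual")`, expanding the key "old" yields "new", while an unknown key yields whatever default the rule supplied.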
RewriteRule

Syntax: RewriteRule Pattern Substitution

The RewriteRule directive is the real rewriting workhorse. The directive can occur more than once, with each instance defining a single rewrite rule. The order in which these rules are defined is important - this is the order in which they will be applied at run-time.

Pattern is a Perl-compatible regular expression, which is applied to the current URL. ‘Current’ means the value of the URL when this rule is applied. This may not be the originally requested URL, which may already have matched a previous rule, and have been altered.

Security warning: Due to the way Java’s regex matching is done, poorly formed regex patterns are vulnerable to “catastrophic backtracking”, also known as “regular expression denial of service” or ReDoS. Therefore, extra caution should be used for RewriteRule patterns. In general it is difficult to automatically detect such vulnerable regex, and so a good defense is to read a bit on the subject of catastrophic backtracking. A good reference is the OWASP ReDoS guide.
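One way to experiment with a pattern before putting it into rewrite.config is to run it in Java under a time budget. The sketch below wraps the input in a CharSequence whose charAt method aborts once a deadline has passed; this is a general java.util.regex technique, not a feature of the rewrite valve, and the names `BoundedRegexSketch` and `matchesWithin` are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical test harness for vetting patterns; not part of Tomcat.
class BoundedRegexSketch {

    // CharSequence wrapper that aborts matching once the deadline has passed.
    static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public char charAt(int index) {
            // The regex engine reads input through charAt, so this check runs constantly.
            if (System.nanoTime() - deadlineNanos > 0) {
                throw new IllegalStateException("regex match exceeded its time budget");
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    // Matches the whole input, throwing IllegalStateException if the budget runs out.
    static boolean matchesWithin(Pattern pattern, String input, long budgetMillis) {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        return pattern.matcher(new DeadlineCharSequence(input, deadline)).matches();
    }
}
```

Patterns with nested quantifiers, such as (a+)+b, are the commonly cited risk. Feeding a candidate pattern long near-miss inputs through `matchesWithin` shows whether it stays within the budget or fails with an exception.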

Some hints on the syntax of regular expressions:

Text:
  .            Any single character
  [chars]      Character class: Any character of the class ‘chars’
  [^chars]     Character class: Not a character of the class ‘chars’
  text1|text2  Alternative: text1 or text2

Quantifiers:
  ?            0 or 1 occurrences of the preceding text
  *            0 or N occurrences of the preceding text (N > 0)
  +            1 or N occurrences of the preceding text (N > 1)

Grouping:
  (text)       Grouping of text
               (used either to set the borders of an alternative as above, or
               to make backreferences, where the Nth group can
               be referred to on the RHS of a RewriteRule as $N)

Anchors:
  ^            Start-of-line anchor
  $            End-of-line anchor

Escaping:
  \char        escape the given char
               (for instance, to specify the chars “.[]()” etc.)

For more information about regular expressions, have a look at the perl regular expression manpage (“perldoc perlre”). If you are interested in more detailed information about regular expressions and their variants (POSIX regex etc.) the following book is dedicated to this topic:

Mastering Regular Expressions, 2nd Edition
Jeffrey E.F. Friedl
O’Reilly & Associates, Inc. 2002
ISBN 978-0-596-00289-3

In the rules, the NOT character (‘!’) is also available as a possible pattern prefix. This enables you to negate a pattern; to say, for instance: ‘if the current URL does NOT match this pattern’. This can be used for exceptional cases, where it is easier to match the negative pattern, or as a last default rule.

Note: When using the NOT character to negate a pattern, you cannot include grouped wildcard parts in that pattern. This is because, when the pattern does NOT match (i.e., the negation matches), there are no contents for the groups. Thus, if negated patterns are used, you cannot use $N in the substitution string!

The substitution of a rewrite rule is the string which is substituted for (or replaces) the original URL which Pattern matched. In addition to plain text, it can include

back-references ($N) to the RewriteRule pattern
back-references (%N) to the last matched RewriteCond pattern
server-variables as in rule condition test-strings (%{VARNAME})
mapping-function calls (${mapname:key|default})

Back-references are identifiers of the form $N (N=0…9), which will be replaced by the contents of the Nth group of the matched Pattern. The server-variables are the same as for the TestString of a RewriteCond directive. The mapping-functions come from the RewriteMap directive and are explained there. These three types of variables are expanded in the order above.
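The two kinds of back-reference can be sketched in Java as follows. This illustrates the $N and %N semantics only; it omits server variables and map calls, and the names `SubstitutionSketch`, `expand` and `rewrite` are hypothetical. Whole-string matching is used as an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of $N / %N expansion; not Tomcat code.
class SubstitutionSketch {

    // Expands $N from the rule match and %N from the condition match.
    // A backslash before '$' or '%' keeps the character literal.
    static String expand(String substitution, Matcher ruleMatch, Matcher condMatch) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < substitution.length(); i++) {
            char c = substitution.charAt(i);
            char next = i + 1 < substitution.length() ? substitution.charAt(i + 1) : '\0';
            if (c == '\\' && (next == '$' || next == '%')) {
                out.append(next);
                i++;
            } else if ((c == '$' || c == '%') && Character.isDigit(next)) {
                Matcher m = (c == '$') ? ruleMatch : condMatch;
                int group = next - '0';
                if (m != null && group <= m.groupCount() && m.group(group) != null) {
                    out.append(m.group(group));
                }
                i++;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Convenience wrapper: returns null when the rule or the condition does not match.
    static String rewrite(String url, String rulePattern, String condValue,
                          String condPattern, String substitution) {
        Matcher rule = Pattern.compile(rulePattern).matcher(url);
        Matcher cond = Pattern.compile(condPattern).matcher(condValue);
        if (!rule.matches() || !cond.matches()) {
            return null;
        }
        return expand(substitution, rule, cond);
    }
}
```

With a rule pattern of ^/user/(\w+)$ and a condition pattern of ^(\w+)\.example\.com$ tested against the host name, the substitution /sites/%1/profile/$1 takes its first group from the condition and its second from the rule.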

As already mentioned, all rewrite rules are applied to the Substitution (in the order in which they are defined in the config file). The URL is completely replaced by the Substitution and the rewriting process continues until all rules have been applied, or it is explicitly terminated by a L flag.
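The sequential behavior described here, where each rule sees the output of the previous one until an L flag stops processing, can be sketched like this. The `RuleChainSketch` class, its `Rule` type and the three demo rules are hypothetical; conditions and all flags other than L are left out, and whole-string matching is an assumption.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ordered rule application; not Tomcat code.
class RuleChainSketch {

    static final class Rule {
        final Pattern pattern;
        final String substitution;
        final boolean last;   // true when the rule carries the L flag

        Rule(String pattern, String substitution, boolean last) {
            this.pattern = Pattern.compile(pattern);
            this.substitution = substitution;
            this.last = last;
        }
    }

    static String apply(List<Rule> rules, String url) {
        String current = url;
        for (Rule rule : rules) {
            Matcher m = rule.pattern.matcher(current);
            if (!m.matches()) {
                continue;                      // rule does not apply to the current URL
            }
            if (!rule.substitution.equals("-")) {
                // The URL is completely replaced by the expanded substitution.
                StringBuffer replaced = new StringBuffer();
                m.appendReplacement(replaced, rule.substitution);
                current = replaced.toString();
            }
            if (rule.last) {
                break;                         // L flag: stop rewriting here
            }
        }
        return current;
    }

    static List<Rule> demoRules() {
        return Arrays.asList(
                new Rule("^/old/(.*)$", "/new/$1", false),
                new Rule("^/new/docs$", "/manual", true),
                new Rule("^/manual$", "/never-reached", false));
    }
}
```

For the request path /old/docs, the first rule produces /new/docs, the second rewrites that to /manual and stops, so the third rule is never consulted.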

The special characters $ and % can be quoted by prepending them with a backslash character (\).

There is a special substitution string named ‘-’ which means: NO substitution! This is useful in providing rewriting rules which only match URLs but do not substitute anything for them. It is commonly used in conjunction with the C (chain) flag, in order to apply more than one pattern before substitution occurs.

Unlike newer mod_rewrite versions, the Tomcat rewrite valve does not automatically support absolute URLs (the specific redirect flag must be used to be able to specify an absolute URL, see below) or direct file serving.

Additionally you can set special flags for Substitution by appending [flags] as the third argument to the RewriteRule directive. Flags is a comma-separated list of any of the following flags:

'chain|C' (chained with next rule)
This flag chains the current rule with the next rule (which itself can be chained with the following rule, and so on). This has the following effect: if a rule matches, then processing continues as usual - the flag has no effect. If the rule does not match, then all following chained rules are skipped. For instance, it can be used to remove the '.www' part, inside a per-directory rule set, when you let an external redirect happen (where the '.www' part should not occur!).
'cookie|CO=NAME:VAL:domain[:lifetime[:path]]' (set cookie)
This sets a cookie in the client's browser. The cookie's name is specified by NAME and the value is VAL. The domain field is the domain of the cookie, such as '.apache.org', the optional lifetime is the lifetime of the cookie in minutes, and the optional path is the path of the cookie.
'env|E=VAR:VAL' (set environment variable)
This forces a request attribute named VAR to be set to the value VAL, where VAL can contain regexp backreferences ($N and %N) which will be expanded. You can use this flag more than once, to set more than one variable.
'forbidden|F' (force URL to be forbidden)
This forces the current URL to be forbidden - it immediately sends back an HTTP response of 403 (FORBIDDEN). Use this flag in conjunction with appropriate RewriteConds to conditionally block some URLs.
'gone|G' (force URL to be gone)
This forces the current URL to be gone - it immediately sends back an HTTP response of 410 (GONE). Use this flag to mark pages which no longer exist as gone.
'host|H=Host' (apply rewriting to host)
Rather than rewriting the URL, the virtual host will be rewritten.
'last|L' (last rule)
Stop the rewriting process here and don't apply any more rewrite rules. This corresponds to the Perl last command or the break command in C. Use this flag to prevent the currently rewritten URL from being rewritten further by following rules. For example, use it to rewrite the root-path URL ('/') to a real one, e.g., '/e/www/'.
'next|N' (next round)
Re-run the rewriting process (starting again with the first rewriting rule). This time, the URL to match is no longer the original URL, but rather the URL returned by the last rewriting rule. This corresponds to the Perl next command or the continue command in C. Use this flag to restart the rewriting process - to immediately go to the top of the loop.
Be careful not to create an infinite loop!
'nocase|NC' (no case)
This makes the Pattern case-insensitive, ignoring difference between 'A-Z' and 'a-z' when Pattern is matched against the current URL.
'noescape|NE' (no URI escaping of output)
This flag prevents the rewrite valve from applying the usual URI escaping rules to the result of a rewrite. Ordinarily, special characters (such as '%', '$', ';', and so on) will be escaped into their hexcode equivalents ('%25', '%24', and '%3B', respectively); this flag prevents this from happening. This allows percent symbols to appear in the output, as in

RewriteRule /foo/(.*) /bar?arg=P1\%3d$1 [R,NE]

which would turn '/foo/zed' into a safe request for '/bar?arg=P1=zed'.
'qsappend|QSA' (query string append)
This flag forces the rewrite engine to append a query string part of the substitution string to the existing string, instead of replacing it. Use this when you want to add more data to the query string via a rewrite rule.
'redirect|R [=code]' (force redirect)
Prefix Substitution with http://thishost[:thisport]/ (which makes the new URL a URI) to force an external redirection. If no code is given, an HTTP response of 302 (FOUND, previously MOVED TEMPORARILY) will be returned. If you want to use other response codes in the range 300-399, simply specify the appropriate number or use one of the following symbolic names: temp (default), permanent, seeother. Use this for rules to canonicalize the URL and return it to the client - to translate '/~' into '/u/', or to always append a slash to /u/user, etc.
Note: When you use this flag, make sure that the substitution field is a valid URL! Otherwise, you will be redirecting to an invalid location. Remember that this flag on its own will only prepend http://thishost[:thisport]/ to the URL, and rewriting will continue. Usually, you will want to stop rewriting at this point, and redirect immediately. To stop rewriting, you should add the 'L' flag.
'skip|S=num' (skip next rule(s))
This flag forces the rewriting engine to skip the next num rules in sequence, if the current rule matches. Use this to make pseudo if-then-else constructs: The last rule of the then-clause becomes skip=N, where N is the number of rules in the else-clause. (This is not the same as the 'chain|C' flag!)
'type|T=MIME-type' (force MIME type)
Force the MIME-type of the target file to be MIME-type. This can be used to set up the content-type based on some conditions. For example, the following snippet allows .php files to be displayed by mod_php if they are called with the .phps extension:

RewriteRule ^(.+\.php)s$ $1 [T=application/x-httpd-php-source]
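To make the flag syntax concrete, here is a small Java sketch that splits a flag list such as [R=permanent,L] into names and values and resolves the value of the redirect flag to a status code. It is illustrative only and does not reproduce the valve's parser: `RuleFlagsSketch` and its methods are hypothetical names, long and short flag names are not unified, and the numeric values for the symbolic names (302, 301, 303) are the standard HTTP status codes rather than something stated in the text above.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical flag-list parser; not Tomcat code.
class RuleFlagsSketch {

    // Parses "[R=permanent,L,NC]" into flag name -> value ("" when a flag has no value).
    static Map<String, String> parse(String flags) {
        Map<String, String> result = new LinkedHashMap<>();
        String body = flags.trim();
        if (body.startsWith("[") && body.endsWith("]")) {
            body = body.substring(1, body.length() - 1);
        }
        for (String part : body.split(",")) {
            String flag = part.trim();
            if (flag.isEmpty()) {
                continue;
            }
            int eq = flag.indexOf('=');
            String name = eq < 0 ? flag : flag.substring(0, eq);
            String value = eq < 0 ? "" : flag.substring(eq + 1);
            result.put(name.trim(), value.trim());
        }
        return result;
    }

    // Resolves the value of the redirect flag to an HTTP status code.
    static int redirectCode(String value) {
        if (value == null || value.isEmpty()) {
            return 302;                       // default when no code is given
        }
        switch (value.toLowerCase(Locale.ROOT)) {
            case "temp":
                return 302;
            case "permanent":
                return 301;
            case "seeother":
                return 303;
            default:
                int code = Integer.parseInt(value);
                if (code < 300 || code > 399) {
                    throw new IllegalArgumentException("redirect code out of range: " + code);
                }
                return code;
        }
    }
}
```

Parsing [R=permanent,L] gives an R entry with the value "permanent" and an L entry with an empty value; `redirectCode` then maps "permanent" to 301.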

The Apache Software Foundation
Apache Tomcat 9
Version 9.0.34, Apr 3 2020