Differences

This shows you the differences between two versions of the page.

litespeed_wiki:cache:drop_query_string [2017/12/01 22:03]
Eric Leu [More information]
litespeed_wiki:cache:drop_query_string [2020/05/04 13:55]
Shivam Saluja
Line 1: Line 1:
-====== ​Drop certain query string parameters ​based on rules to make request more efficient ​====== +====== ​Dropping Query Strings via Cache Rules ====== 
-This function ​is suit for users who's site bring junk query string, ​e.g. UTM code, and get too many different URLs which should store and serve with same cache but not\\ +**Please Note**: This wiki is valid for v2.9.x and below of the LiteSpeed Cache Plugin for WordPress. If you are using v3.0 or above, please see [[https://​docs.litespeedtech.com/​lscache/​lscwp/​overview/​|the new documentation]]. 
-Let's introduce a little about what is a UTM code?+ 
 +In an effort to make some requests more cache-friendly, LiteSpeed Enterprise v5.2.3+ may be configured to drop certain query string parameters.
 + 
 +===== Why is This Helpful? ​===== 
 +For each query string that is attached to a URL, a separate copy of the page is cached. In many cases this is intentional, desired behavior. However, when you have "junk" query string parameters that don't change the content of the page, it's redundant to cache separate copies.
 + 
 +Junk query strings include UTM codes and Google AdWords auto tags, among others.
 + 
 +==== What is a UTM code? ====
 A UTM code is a simple code that you can attach to a custom URL in order to track a source, medium, and campaign name. This enables Google Analytics to tell you where searchers came from as well as what campaign directed them to you.
  
-Starting with version 5.2.3, LSWS has a built-in Drop certain ​query string parameters based on rules system.+While such query strings may be useful for tracking purposes, they have no effect ​on the content of the page, and therefore should not be considered when storing the page in cache.
  
-===== How to Enable LSWS WordPressProtect Feature on cPanel ​=====+==== What is a Google AdWords auto tag? ==== 
 +Google AdWords can be configured to add tracking parameters to your URLs in order to pass information about the click. Similar to UTM codes, this kind of tag has no bearing on the content of the page, and therefore may be ignored when caching. These tags appear in the format ''&gclid=XXXXXXX''.
  
-As long as LSWS version is 5.2.3 or above, the LSWS Drop certain query string feature is enabled by default and does not need any extra configuration in the LSWS WebAdmin GUI or in Apache configurations. ​+===== How to Drop Query Strings =====
  
-You may wish to override the default settings at the server level, virtual-host level or even the ''.htaccess'' level. Before making any changes. +There are two ways to modify or drop query strings: the Apache-style configuration directive ''CacheKeyModify ...'', or ''E=cache-key-mod:'' in rewrite rules. The second method is more flexible and is the preferred way.
  
-The upper level configuration is inherit by lower level. if lower level add more configuration,​ it is in addition ​to the upper level.(the addition feature may not be fully implemented yet, will be fully implemented). If want to get rid of upper level config, need to do "​clear",​ then add new config.+Both methods serve the same purpose: ​to modify ​the query string attached ​to a URL
  
-Let's look at some examples for a WHM/cpanel EA4 environment:+==== Method 1Apache-style CacheKeyModify ====
  
-After you run the following, the WordPressProtect feature will be automatically enabled globally: +The directive can be added at the Apache server, vhost, and .htaccess levels.
-  /​usr/​local/​lsws/​admin/​misc/​lsup.sh -f -v 5.2.3 (or above version)+
  
-You may wish to set rule. You will need to set it at the server level of the Apache configuration file: +Upper-level configurations are inherited by lower levels. If a lower level adds more rules, they are applied in addition to those of the upper level. (This addition feature may not be fully implemented yet, but it //will// be fully implemented.) If a lower level should not use the upper level's configuration, the "clear" parameter should be used before adding the new rules.
  
 +The ''​CacheKeyModify''​ directive can be used multiple times. Adding multiple modifications in one line is not supported. Multiple lines are combined.
 +
 +This function is suitable for users whose sites receive junk query strings (e.g. UTM codes, Google AdWords auto tags), producing many different URLs that //should// be stored and served from the same cached page but in practice are not.
 +
 +=== Examples ===
 +
 +  * ''​CacheKeyModify -qs:​utm*''​ drops all query strings where the name part starts with "​utm"​
 +  * ''​CacheKeyModify -qs:​utm''​ drops the query string where the name exactly matches "​utm"​
 +  * ''CacheKeyModify -qs:gclid'' drops the query string where the name exactly matches "gclid"
 +  * ''​CacheKeyModify clear''​ discards all previous configurations.
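
The matching behavior described in the examples above can be modeled with a short Python sketch. This is only an illustration of exact versus prefix matching on parameter names, not LiteSpeed's implementation; the function name ''drop_query_params'' and the rule-list format are assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_query_params(url, rules):
    """Return url with query parameters removed according to rules.

    Each rule is a parameter name: "utm" matches only the name "utm",
    while "utm*" matches any name that starts with "utm".
    (Hypothetical helper; it only models the -qs: rules described above.)
    """
    parts = urlsplit(url)
    kept = []
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        dropped = any(
            name.startswith(rule[:-1]) if rule.endswith("*") else name == rule
            for rule in rules
        )
        if not dropped:
            kept.append((name, value))
    # Rebuild the URL from the parameters that survived the rules.
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = "https://testquerystring.com/?utm_source=google&utm_medium=email&page=2"
print(drop_query_params(url, ["utm*"]))  # prefix rule: both utm_* parameters go
print(drop_query_params(url, ["utm"]))   # exact rule: no parameter is named "utm"
```

With the prefix rule only ''page=2'' remains, whereas the exact rule leaves the URL unchanged because no parameter is named exactly "utm".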
 +
 +=== CacheKeyModify on cPanel ===
 +
 +As long as the LSWS version is 5.2.3 or above, this feature is enabled by default and does not need any further configuration in the LSWS WebAdmin GUI or in Apache configurations. You may wish to override the default settings at the server level, virtual-host level or even the ''​.htaccess''​ level.
 +
 +== Examples for a WHM/cpanel EA4 environment ==
 +
 +After you run the following, the drop query string feature will be automatically enabled globally (replace 5.2.3 with the appropriate version):
 +  /​usr/​local/​lsws/​admin/​misc/​lsup.sh -f -v 5.2.3 
 +
 +You may wish to set a rule. You will need to set it at the server level of the Apache configuration file:
   vi /etc/apache2/conf.d/includes/pre_main_global.conf
   ​   ​
Line 28: Line 57:
   </IfModule>
  
-This will drop all query strings ​that the name part starts with "​utm" ​ for all virtual hosts.+This will drop all query strings ​where the name part starts with "​utm" ​ for all virtual hosts.
  
-You can also drop query string with name exact matches "​utm":​+You can also drop query strings where the name exactly ​matches "​utm":​
   <IfModule Litespeed>
   CacheKeyModify -qs:utm
   </IfModule>
  
-No matter how the server level is set, the end user has the ability to clear it through ''​.htaccess''​ by adding the following:+Regardless of server-level settings, the end user has the ability to clear previous rules through ''​.htaccess''​ by adding the following:
  
   <IfModule Litespeed>
Line 41: Line 70:
   </IfModule>
  
-To verify the server and virtual host level settings, you may run the following command: +To verify the server- and virtual-host-level settings, you may run the following command:
  
   cd /etc/apache2/
Line 47: Line 76:
  
 The design logic looks like the following:
-We use ''​A'',''​B''​,''​C'' ​instead of rules.+Assume ​''​A'',''​B'' ​and ''​C'' ​refer to defined ​rules.
 ^ Server Level ^ VHost Level ^ .htaccess ^ Result ^
 |A|not set|not set|A|
Line 54: Line 83:
 |A|B|C|A+B+C|
  
-  ​* The addition ​feature may not be fully implemented ​yet on v 5.2.3, but will be fully implemented ​on next release.+* The feature ​to add rule sets may not be fully implemented on v5.2.3, but //will// be fully implemented ​in the next release.
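
The inheritance logic in the table can be sketched in Python as follows. This is a minimal model, assuming that each level's rules are appended to those inherited from the level above and that a "clear" entry discards everything collected so far; the function name ''effective_rules'' is hypothetical and does not come from LiteSpeed.

```python
def effective_rules(*levels):
    """Combine rule lists given in order: server, vhost, .htaccess.

    A level that is "not set" is passed as None. A "clear" entry discards
    all rules collected so far, modeling ''CacheKeyModify clear''.
    (Hypothetical helper written for illustration only.)
    """
    combined = []
    for level in levels:
        for rule in level or []:
            if rule == "clear":
                combined = []
            elif rule not in combined:
                combined.append(rule)
    return combined


print(effective_rules(["A"], None, None))              # server level only
print(effective_rules(["A"], ["B"], ["C"]))            # all three levels add up
print(effective_rules(["A"], ["B"], ["clear", "C"]))   # .htaccess clears first
```

The last call shows why "clear" has to come before the new rules: anything listed ahead of it at the same level is discarded along with the inherited rules.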
  
-===== How to check===== +==== Method 2: Rewrite Rules ==== 
-==== Prepare URL with junk query string==== +As an alternative to ''CacheKeyModify'', we can also use rewrite rules. This method supports multiple commands combined, and gives you more flexibility.
-Assume we have a public WordPress site with domain ​''​testquerystring.com'' ​and LSCache enabled. And use Campaign URL Builder, e.g.[[https://​ga-dev-tools.appspot.com/​campaign-url-builder/​ | Campaign URL Builder]] or other UTM plugin to generate URLwhich will looks like this: {{:​litespeed_wiki:​cache:​utm.png?700|}} \\ +
-Now, I can access my site with both of URLs: +
-  * <​nowiki>​https://​testquerystring.com </​nowiki>​ +
-  * <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=email&​utm_campaign=promo%20code</​nowiki>​+
  
-==== Setup rules to drop query string==== +For this example, we remove ''​utm_source'' ​with an exact match, ​and ''​utm_medium''​ with a prefix match.
-For testing purpose, we can simply add the following to .htaccess file +
-<​code>​ CacheKeyModify -qs:​utm_medium </​code>​ +
-==== Verify from dev tool==== +
-  * Before setup rules: +
-    * All king of query strings ​with same domain will store with different cache key. Above URLs will store 2 cache files +
-  * After setup rules: +
-    * All king of query strings with same domain will store with same cache key. Above URLs will store only one cache files +
-    * I can access these urls and [[ https://​www.litespeedtech.com/​support/​wiki/​doku.php/​litespeed_wiki:​cache:​lscwp:​troubleshooting:​general?​s[]=hit#​testing | check cache]] are hit at first time visit. +
-      - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=email1&​utm_campaign=promo%20code</​nowiki>​ +
-      - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=email12&​utm_campaign=promo%20code </​nowiki>​ +
-      - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=email123&​utm_campaign=promo%20code </​nowiki>​ +
-      - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=email1234&​utm_campaign=promo%20code</​nowiki>​+
  
-==== Verify from debug log==== 
-We can only check keywords ''​QS''​ & ''​KEY''​ with ''​CACHE''​ tag from debug log: 
-  tail -f /​etc/​apache2/​logs/​* | grep '​CACHE'​ | grep '​QS\|KEY'​ 
- 
-Verify ''​utm_medium''​ has been removed from CacheKey data -> QS 
-<​code>​ 
-[CACHE] Remove exact matched QS key [utm_medium],​ 
-[CACHE] modified QS in cache key is [], 
-[CACHE] CacheKey data: URI [/​testquerystring.com/?​],​ QS [utm_source=google&​utm_campaign=promo%20code],​ Vary Cookie [_lscache_vary=xxxxx],​ Private Cookie [wp_woocommerce_session_xxxxx],​ IP [x.x.x.x] 
-</​code>​ 
- 
-====More Flexibility==== 
-It also can be set via rewrite rule and it support multiple env combined. ​ 
 <code>
 RewriteCond %{QUERY_STRING} 'utm_source=google&utm_medium=email1&utm_campaign=promo%20code'
 RewriteRule  .* - [E=cache-key-mod:-qs:utm_source, E=cache-key-mod:-qs:utm_medium*]
 </code>
-Logs+ 
 +The log shows that only ''utm_campaign'' is left in the query string:
 <code>
 Remove exact matched QS key [utm_source],
Line 101: Line 103:
 </code>
  
 +===== How to Verify Query Strings Have Been Dropped =====
 +==== Prepare a URL with a Junk Query String====
 +Assume we have a public WordPress site with the domain ''​testquerystring.com''​ and LSCache enabled. Use [[https://​ga-dev-tools.appspot.com/​campaign-url-builder/​ | Campaign URL Builder]] or other UTM plugin to generate a URL, which will look like this:
  
-====More information==== +{{:litespeed_wiki:​cache:​utm.png?​700|}}
-If you change rule from ''​utm_medium''​ to ''​utm_medium*'',​ log will shows: +
-  Remove prefix matched QS key [utm_medium]+
  
 +Access the site with both of the URLs:
 +  * <​nowiki>​https://​testquerystring.com </​nowiki>​
 +  * <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=email&​utm_campaign=promo%20code</​nowiki>​
  
 +==== Set up Rules to Drop the Query String ====
 +For testing purposes, we can simply add the following to the .htaccess file:
 +<​code>​ CacheKeyModify -qs:​utm_medium </​code>​
 +
 +==== Verify From Developer Tool====
 +=== Before the rules are created ===
 +Each distinct query string on the same domain is stored under a different cache key, so the two URLs above are stored in 2 separate cache files.
 +=== After the rules are created ===
 +The ''utm_medium'' query string is stripped. Because other query strings are still attached to the URL, there will still be 2 separate cache files:
 +  - <​nowiki>​https://​testquerystring.com/</​nowiki>​
 +  - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_campaign=promo%20code</​nowiki>​ (notice ''&​utm_medium=email''​ has been stripped)
 +
 +If you visit ''<nowiki>https://testquerystring.com/?utm_source=google&utm_campaign=promo%20code</nowiki>'' and then access the following URLs, you will see that they are all served from a single cache file and are a [[ litespeed_wiki:cache:lscwp:troubleshooting:general?s[]=hit#testing | cache hit]] from the first visit:
 +  - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=text&​utm_campaign=promo%20code</​nowiki>​
 +  - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=instant-message&​utm_campaign=promo%20code </​nowiki>​
 +  - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=browser&​utm_campaign=promo%20code </​nowiki>​
 +  - <​nowiki>​https://​testquerystring.com/?​utm_source=google&​utm_medium=direct&​utm_campaign=promo%20code</​nowiki>​
 +
 +==== Verify From Debug Log====
 +We can search for the keywords ''​QS''​ & ''​KEY''​ appearing with ''​CACHE''​ in the debug log:
 +  tail -f /​etc/​apache2/​logs/​* | grep '​CACHE'​ | grep '​QS\|KEY'​
 +
 +We can then verify that ''utm_medium'' has been removed from ''CacheKey data -> QS'':
 +<​code>​
 +[CACHE] Remove exact matched QS key [utm_medium],​
 +[CACHE] modified QS in cache key is [],
 +[CACHE] CacheKey data: URI [/​testquerystring.com/?​],​ QS [utm_source=google&​utm_campaign=promo%20code],​ Vary Cookie [_lscache_vary=xxxxx],​ Private Cookie [wp_woocommerce_session_xxxxx],​ IP [x.x.x.x]
 +</​code>​
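
When many log lines need checking, the same verification can be scripted. The sketch below assumes the ''CacheKey data'' line has exactly the layout shown in the sample above; the regular expression and the function name ''cache_key_query_names'' are illustrative assumptions, and the log format may differ between LSWS versions.

```python
import re

# Matches the URI and QS fields of a "CacheKey data" debug log line,
# based solely on the sample line shown above.
CACHE_KEY_LINE = re.compile(
    r"CacheKey data: URI \[(?P<uri>[^\]]*)\],\s*QS \[(?P<qs>[^\]]*)\]"
)


def cache_key_query_names(log_line):
    """Return the parameter names found in the QS field, or None if the
    line is not a "CacheKey data" line. (Hypothetical helper.)"""
    match = CACHE_KEY_LINE.search(log_line)
    if match is None:
        return None
    return [pair.split("=", 1)[0] for pair in match.group("qs").split("&") if pair]


sample = (
    "[CACHE] CacheKey data: URI [/testquerystring.com/?], "
    "QS [utm_source=google&utm_campaign=promo%20code], "
    "Vary Cookie [_lscache_vary=xxxxx]"
)
names = cache_key_query_names(sample)
print(names)
print("utm_medium" in names)  # False when the parameter has been dropped
```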
  
-=====Refer=====+===== Reference ​=====
 [[https://www.launchdigitalmarketing.com/what-are-utm-codes/ | UTM Codes article]]
  • Last modified: 2021/11/11 20:54
  • by Lisa Clarke