Your internet traffic is being tracked

When you open your phone and click on or share links through websites and applications, those websites are secretly adding tracking information to the links.

While you may feel this isn't important or relevant to your life, the information being tracked in those links can have a large impact on it.

TODO: explain why that is a bad thing for privacy, and personal life

  • you didn't consent to being tracked; you just wanted to open a webpage
  • it can be used to make targeted ads work better and influence people's behaviors and beliefs
  • it can be used to track what you specifically are looking at on the internet and who you know
  • it can be used to track and build profiles of who knows who
  • even if you agree with what the groups in power are doing right now, you shouldn't just give them the tools they need to stop caring about what you think
  • it makes links harder for people to read
  • it makes links longer and wastes screen space
https://eggsexposed.com
http://www.example.com:8080/products/shoes?utm_source=facebook&utm_campaign=summer_sale&fbclid=abc123#reviews
ftp://192.168.0.2:20/lambda/documents
  • Protocol*: What protocol to use to communicate with the server. (https, ftp, etc...)
  • Hostname*: Name of the website used to look up the IP address of the server
  • Port: Network port to connect to (default to 443 for https, and 80 for http)
  • Path: The specific page or resource location on the server
  • Query Params: Key-value pairs that pass data to the page
  • Fragment: Links to a specific section within the page (not sent to server)
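
To make the parts described above concrete, here is a minimal sketch that splits one of the example URLs using Python's standard `urllib.parse` module. The helper name `describe_url` and the dictionary keys are made up for illustration; only `urlsplit` and its attributes come from the standard library.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts described above (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": parts.port,  # None when the URL does not name a port explicitly
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


example = "http://www.example.com:8080/products/shoes?utm_source=facebook&utm_campaign=summer_sale&fbclid=abc123#reviews"
for name, value in describe_url(example).items():
    print(f"{name:9} {value}")
```

Note that `urlsplit` does not fill in default ports, so a URL like https://eggsexposed.com reports a port of `None` and the client falls back to 443 on its own.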

Query Parameters

Query parameters are the part of a URL used to encode some data about a page in its URL. They are generally meant to carry optional values, but developers often don't read or remember the technical documentation and RFCs that outline how technologies are supposed to be used, and end up putting mandatory data in the optional parameters. [1]

?size=medium&color=light%20blue&utm_source=facebook
Structure
  • ? (Delimiter): Marks the start of query parameters in a URL
  • Key: The name of the data being passed
  • = (Assignment): Connects each key to its value
  • Value: The actual data being passed
  • & (Separator): Separates multiple key-value pairs
Parameters in this example
  • size=medium: tells the page to use a medium size
  • color=light%20blue: special characters are percent-encoded as a % followed by two hexadecimal digits (%20 is a space)
  • utm_source=facebook: this parameter was secretly added to the link when it was posted on Facebook
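
As a rough illustration of how a program reads this structure, the sketch below feeds the same example string to `parse_qsl` from Python's standard library, which splits on `&` and `=` and decodes `%20` back into a space. The variable names are only for illustration.

```python
from urllib.parse import parse_qsl

raw = "?size=medium&color=light%20blue&utm_source=facebook"

# parse_qsl expects the text after the "?" delimiter, so drop it first.
pairs = parse_qsl(raw.lstrip("?"))

for key, value in pairs:
    print(f"{key} = {value!r}")
```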

These parameters can be very useful for things like saying how many items should be returned with a query, what specific page the query is for, or for filters on a page itself.

limit=50, page=2, sort=latest, name=John%20Doe, etc...
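
Going the other direction, here is a small sketch of building a query string like that from plain values with Python's `urlencode`. The parameter names are just the examples from this post, not any particular site's API.

```python
from urllib.parse import quote, urlencode

params = {"limit": 50, "page": 2, "sort": "latest", "name": "John Doe"}

# quote_via=quote encodes spaces as %20; the default would use "+" instead.
query = urlencode(params, quote_via=quote)
print("?" + query)
```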

Extra query parameters in a website's URL are usually harmless to the function of the website, because the website and its server can simply ignore the parameters they don't use.

Over time ad tech companies learned that they can take any arbitrary URL that is displayed on their website and just add their own query parameters to it.

A common format for these tracking parameters is UTM tracking codes.

| Query parameter | Description |
| --- | --- |
| utm_source | how did you get to the site |
| utm_medium | what type of link was used to get you to the site |
| utm_campaign | what specific promotion brought you here |
| utm_term | search term used |
| utm_content | what specific page element was clicked to bring you to the page |

From there, the host site can use that data for analytics on how users reached the site and how the site is being used. If the page also loads scripts or tooling served by those same ad tech companies, those scripts can harvest the data and send it back to the ad tech company to track user habits.
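
To give a feel for how little effort this takes on the receiving end, here is a hedged sketch of pulling the UTM values out of an incoming link. The function name `extract_utm` is hypothetical, and real analytics tooling will do far more than this; the point is only that the data is sitting in plain sight in the URL.

```python
from urllib.parse import parse_qsl, urlsplit

# The five UTM parameter names described above.
UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content")


def extract_utm(url: str) -> dict[str, str]:
    """Return only the UTM key-value pairs found in a URL (hypothetical helper)."""
    query = urlsplit(url).query
    return {key: value for key, value in parse_qsl(query) if key in UTM_KEYS}


link = "http://www.example.com:8080/products/shoes?utm_source=facebook&utm_campaign=summer_sale&fbclid=abc123#reviews"
print(extract_utm(link))
```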

While UTM codes are probably the most common way that tracking information is added to links, they are not the only way and there is nothing stopping companies from using other techniques.

One such technique is the use of URL shorteners. Not only can URL shorteners hide tracking query parameters behind a short, nice-looking redirect, the companies that run them can also log the IP addresses of everyone who clicks a link and set cookies in your browser to track which sites you visit, down to the individual user.

How can you help protect yourself and others

Just remove them manually

The most basic low-tech way to remove the trackers is to delete them yourself, or simply not copy them. This helps protect the people you share links with, but it gets tedious to manually edit every link you want to visit or share.
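
If you are comfortable with a little scripting, the same manual steps can be sketched in a few lines of Python. This is only an illustration: the function name `strip_tracking` is made up, and the block list covers just the `utm_` codes and `fbclid` that appear in this post, which is an assumption and nowhere near a complete list of trackers.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit

# Illustrative block list only; real tools maintain much longer rule sets.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"fbclid"}


def strip_tracking(url: str) -> str:
    """Rebuild a URL without the known tracking parameters (hypothetical helper)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES) and key not in TRACKING_KEYS
    ]
    # Re-encode the surviving pairs and leave every other URL part untouched.
    return urlunsplit(parts._replace(query=urlencode(kept, quote_via=quote)))


print(strip_tracking("http://www.example.com:8080/products/shoes?utm_source=facebook&utm_campaign=summer_sale&fbclid=abc123#reviews"))
print(strip_tracking("https://example.com/shop?size=medium&color=light%20blue&utm_source=facebook"))
```

One caveat with this approach: the remaining parameters are decoded and re-encoded, so their exact spelling can change (for example a `+` used for a space comes back as `%20`), even though the values stay the same.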

Install browser extensions to automate it for you

You can install browser extensions that automatically remove tracking parameters from URLs as you browse:

ClearURLs

Source Code

Use websites with tools to do it for you

You can also use web tools that you can give a URL and it will clean it for you:

Mobile applications

There are some mobile apps that can also help remove trackers from links:


  1. While query parameters are not formally defined as optional, they are not part of the main path, which arguably should on its own define a unique, stable path to any given element. If a query parameter is needed to fetch a specific resource, then the path alone is not uniquely identifying that resource. YouTube is an example of developers not using these values the way they should be used. Links like https://www.youtube.com/watch?v=dQw4w9WgXcQ should instead be designed to look something like https://www.youtube.com/watch/video/dQw4w9WgXcQ