tutti-ch-segment-tagging

Tagging of the tutti platform using Segment.com.

Segment Tagging

How to: Processes that can be done by any dev

Adding an event/view or activating an existing property for an event/view can be partially done by any developer. The flow consists of modifying the views_and_events.csv file and then submitting it to Team Data to verify and commit the changes into the tagging plan.

1. Add new event/view

This section describes how to add a new event/view, assuming no new properties. If the new event/view also contains new properties, you will need to ask Team Data to do it.

Adding a new event/view (a view is a page on web or a screen on iOS and Android) requires you to modify the views_and_events.csv file.

  • Add the new event/view as a row at the end of the file.
  • Some of the columns represent properties. The value "excluded" indicates that the property should never be set for the corresponding event/view. The value "obligatory" indicates that you must attempt to set the property. You must decide which properties of the new event/view are excluded and which are obligatory. Often the new event/view occurs in the same data-context as an existing event/view. For example, the events Item Reply Stated and Item Reply Submitted both happen in the data-context of an ad, so they should have the same obligatory properties, such as itemRegion and itemLastPublishedOn. If your new event shares its data-context with an existing event, you can simply copy over the excluded/obligatory values instead of going over them one by one, which saves time.
  • Note that non-property columns still need a review even when the new event shares property settings with an existing event. For example, if the new event applies only to iOS, then the web, backend, and android columns should be set to false.
  • Inform Team Data of your changes and send them the CSV file with the name of the views/events you added, and they will take care of the following steps.
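The copy step described in the list above could be scripted roughly as follows. This is a minimal sketch and not part of the repo: the helper add_action_like and the column names used here (name, web, ios, android, backend) are assumptions for illustration only, so check the actual header of views_and_events.csv before adapting it.

```python
import csv
import io


def add_action_like(csv_text, new_name, template_name, overrides=None, name_col="name"):
    """Return csv_text with one row appended that copies the template row.

    name_col and the keys in overrides are hypothetical column names.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames
    rows = list(reader)

    template = next((r for r in rows if r[name_col] == template_name), None)
    if template is None:
        raise ValueError(f"template action not found: {template_name}")
    if any(r[name_col] == new_name for r in rows):
        raise ValueError(f"action already exists: {new_name}")

    # Copy every column from the template, then change only what differs.
    new_row = dict(template)
    new_row[name_col] = new_name
    new_row.update(overrides or {})
    rows.append(new_row)  # new actions go at the end of the file

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

The overrides argument is where the non-property columns get their review, for example setting web and android to false for an iOS-only event.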

Warning: The steps below are for Team Data, after the modified CSV is received and its changes are validated:

  • Create a new branch: git checkout -b new-events --track origin/new-events.
  • Run update-tagging-from-csv.py views_and_events.csv to update the taggingdb. Note that you can actually use any CSV file in the argument.
  • Change directory to taggingdb and run backup.sh to back up the database.
  • Run regenerate-master-files.py to regenerate the JSON master files (views_and_events.csv will be regenerated as well, but now with the new event/view).
  • Make a single commit with the message "Add new event My Event" and give a more detailed description in the summary section of the commit message.
  • Create a pull request and add Team Data members as reviewers (so that they get a notification).

2. Enable a property for an event

  • Open the file views_and_events.csv and, in the relevant row, set "obligatory" in the column corresponding to the property that must be enabled.
  • Inform Team Data of your changes and send them the CSV file with the name of the views/events you modified, and they will take care of the following steps.

Warning: The steps below are for Team Data, after the modified CSV is received and its changes are validated:

  • Run update-tagging-from-csv.py views_and_events.csv to update the taggingdb. Note that you can actually use any CSV file in the argument.
  • Change directory to taggingdb and run backup.sh to back up the database.
  • Run regenerate-master-files.py to regenerate the JSON master files (views_and_events.csv will be regenerated as well).
  • Make a single commit with the message "Enable prop myProp for 2 events" and give a more detailed description, such as the event names, in the summary section of the commit message.

How to: Processes that need to be done by Team Data

1. Add a new property

To add a new property you need to use taggingdb for now. Here are the steps for adding the new property suggestedSubCategory; they apply to any other new property as well.

  • Insert the new property into table public.properties: insert into properties (property, data_type, alive, reserved, documentation) values ('suggestedSubCategory', 'text', true, false, 'The sub-category that was suggested during item insertion.');
  • Exclude this new property from all events and views: insert into excluded_properties_views (property, view) select 'suggestedSubCategory', view from views; and insert into excluded_properties_events (property, event) select 'suggestedSubCategory', event from events;.
  • If the property has fixed values, insert these into table public.property_values. In the case of the suggestedSubCategory property, we know it has the same fixed values as itemSubCategory. The query then becomes: insert into property_values (property, value, alive) select 'suggestedSubCategory', value, true from property_values where property = 'itemSubCategory';. If however you need to insert property values one by one, the query becomes: insert into property_values (property, value, alive) values ('suggestedSubCategory', 'Cats', true);. Repeat with "Cats" replaced by the next property value, and so on.
  • Back up the database with the script taggingdb/backup.sh (run from within the taggingdb directory).
  • Regenerate master files with regenerate-master-files.py.
  • Make a single commit with the message "Add new prop suggestedSubCategory".
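The SQL steps above can be tried out safely before touching the real database. The sketch below is an illustration under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for the PostgreSQL taggingdb, the schema is simplified to only the columns named in the steps, and the helpers add_property and demo_db are hypothetical. With a PostgreSQL driver the placeholders and boolean literals would differ.

```python
import sqlite3


def add_property(conn, name, data_type, documentation, copy_values_from=None):
    """Insert a property, exclude it from all actions, optionally copy fixed values."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "insert into properties (property, data_type, alive, reserved, documentation) "
            "values (?, ?, 1, 0, ?)",
            (name, data_type, documentation),
        )
        # "view" is quoted only because it is also an SQL keyword.
        conn.execute(
            'insert into excluded_properties_views (property, "view") '
            'select ?, "view" from views',
            (name,),
        )
        conn.execute(
            "insert into excluded_properties_events (property, event) "
            "select ?, event from events",
            (name,),
        )
        if copy_values_from is not None:
            conn.execute(
                "insert into property_values (property, value, alive) "
                "select ?, value, 1 from property_values where property = ?",
                (name, copy_values_from),
            )


def demo_db():
    """Build a tiny in-memory stand-in for taggingdb (simplified, assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table properties (property text primary key, data_type text,
                                 alive int, reserved int, documentation text);
        create table views ("view" text primary key);
        create table events (event text primary key);
        create table excluded_properties_views (property text, "view" text);
        create table excluded_properties_events (property text, event text);
        create table property_values (property text, value text, alive int);
        insert into views values ('Item Detail');
        insert into events values ('Item Reply Submitted');
        insert into events values ('Item Shared Through Facebook');
        insert into property_values values ('itemSubCategory', 'Cats', 1);
        insert into property_values values ('itemSubCategory', 'Dogs', 1);
        """
    )
    return conn
```

Running all inserts in one transaction means a failure halfway through leaves no half-added property behind.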

2. Rename an event/view

Renaming is not supported because we use the event/view name as the natural primary key. To rename, delete the event/view and commit, then create the new event/view and commit again.

3. Delete an event/view

To delete an event/view you need to use taggingdb. Log in to taggingdb and perform the following steps:

  • Delete the rows that reference the event/view from excluded_properties_events or excluded_properties_views, e.g. delete from excluded_properties_events where event = 'My Redundant Event';.
  • Now that the event/view is no longer referenced in any other table, delete it from events or views, e.g. delete from events where event = 'My Redundant Event'; or delete from views where view = 'My Redundant View';.

Adding custom dimension to Google Analytics

This section describes how to add a custom dimension to Google Analytics.

  • Go into Google Analytics and determine the next available numeric ID for the new custom dimension.
    • Go to any GA property, e.g. Web, then go to Admin section, then to Property Settings, Custom Definitions, then Custom Dimensions.
    • Go to the last page of the table showing the custom dimensions.
    • The numeric ID of the new custom dimension will be the index number of the last existing dimension plus one.
    • Suppose the index number of the last dimension is 44. This means 45 will be the numeric ID of the new custom dimension.
  • Insert a row into table custom_dimension_ids in taggingdb
    • Run insert into custom_dimension_ids (custom_dimension_id, property) values (45, 'someProperty');
    • Note that the field custom_dimension_ids.property is a foreign key to the properties.property field.
    • Backup the taggingdb database using the taggingdb/manage.sh backup script and commit the changes.
  • In Segment, go to Google Analytics destination configuration page of each source: ANDROID - DEV, ANDROID - LIVE, IOS - DEV, IOS - LIVE, WEB - DEV, WEB - LIVE. On the configuration page, add the new custom dimension in the "Custom Dimensions" section.
  • Go back to Google Analytics. Add the custom dimension for each of these properties: Android, Android (dev), iOS, iOS (dev), Web, and Web (dev). The custom dimension scope is "hit".
  • Add the custom dimension also for the rollup properties Android + iOS + Web and Android + iOS + Web (dev). For the rollups there is an extra step: you have to name the custom dimension and then select, from the dropdown of existing custom dimensions defined in the underlying properties, the custom dimension that it refers to. The name in the rollup and the name of the referenced dimension should always be the same. That is, if you name the custom dimension in the rollup property myDim, then it should reference the myDim custom dimension from all three underlying properties. It sounds complicated, but it is fairly simple once you see the page where you add the custom dimension.

Adding a custom metric to Google Analytics

To add a custom metric instead of a custom dimension to GA, follow the instructions for adding a custom dimension, but use the GA section Custom Definitions / Custom Metrics and the taggingdb table custom_metric_ids.

Design

The tagging plan lives in a relational database, PostgreSQL. See the taggingdb directory. It is easy to run a local copy of the tagging database, assuming a PostgreSQL cluster is already set up locally.

The platform developers (Android, iOS, web) are given JSON master files that are built from this database. Using these master files to implement the tagging on a platform helps drastically reduce the probability of the following issues:

  • Typos in names of events, properties, pages, and screens.
  • Typos in values of properties that take on values from some finite set, e.g. {let, buy, rent, sell}.
  • Sudden appearance of new names of events, screens, pages, and properties not present in tagging database.
  • Wrong property data types, e.g. putting "twelve" into an integer-valued property.
  • Omitting obligatory properties and traits.

But there is no silver bullet. The following issues are still just as likely to occur:

  • Failing to fire an event/screen/page
  • Firing an event/page/screen multiple times instead of only once

Terminology

  • view: a page or a screen.
  • action: an event or a view.
  • tagging: deciding which events, views, properties, and traits need to be tracked and creating names for them.
  • item: an ad (the one created by a user).

Concepts

  • Segment is a data-pipeline. It has no analytics capabilities or similar. All it does is route data from point A (source) to point B (destination), which is exactly what we have it for. This relieves the data-team from building connections between the tools we use.
  • Segment is a tracking tool that replaces the need to put any other tags into code. Segment can forward the tracked events, pageviews, and user-profiling to any supported tool. This relieves the developers from touching the code when a new tool is needed.

Tagging process

We assume this repo is used by two roles: tagging creators, and tagging implementers.

Tagging creators create and maintain events/pages/screens. As of February 2018 we have as tagging creators Cliff and Dmitrii.

Tagging implementers are the platform developers: Jakub, Cip, Dmitry, Marko. Tagging implementers have read-only access to the tagging. One of the reasons for keeping the creator and implementer roles separate is that in the past, when we allowed tagging implementers to implement new events, we ended up with different names for the same events/pages/screens across platforms. This is because tagging implementers are naturally platform myopic. Tagging creators, on the other hand, are aware of all the events/views/pages across all platforms and are therefore in the best position to tag them.

Anyone at tutti.ch (PMs, Marketing, Sales, CEO, Happiness, Devs) can request tagging of an event, view, page, or identify call. Every request will be answered, but not every request will result in a tag being assigned.

Master files

Each platform has a corresponding master file in JSON format: tagging-ios.json, tagging-android.json, and tagging-web.json. The master file contains definitions of events, views, properties, and traits. On top of the definitions, the master file contains metadata such as API keys, lookup tables, version, and documentation.

Tagging is evolving with time. Things get added, deleted, renamed, etc. Therefore there can be multiple versions of each master file. We use semantic versioning for the master files, for example:

  • Schema of the master file changed. In this case we bump the major: 1.0.0 -> 2.0.0.
  • New keys are added (for example new events), we bump the minor: 1.0.0 -> 1.1.0.
  • Lookup values changed (e.g. category code 1220 referred to cats, and is now dogs), we bump the patch number: 1.0.0 -> 1.0.1.
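The three bump rules above can be written down as a small function. This is only a sketch: the function name bump_version and the change labels "schema", "keys", and "lookup" are made up for illustration and do not exist in the repo.

```python
def bump_version(version, change):
    """Return the next master-file version for a given kind of change.

    change is one of the hypothetical labels:
      "schema" -> major bump, "keys" -> minor bump, "lookup" -> patch bump.
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "schema":
        return f"{major + 1}.0.0"
    if change == "keys":
        return f"{major}.{minor + 1}.0"
    if change == "lookup":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump_version("1.0.0", "schema"))  # 2.0.0
print(bump_version("1.0.0", "keys"))    # 1.1.0
print(bump_version("1.0.0", "lookup"))  # 1.0.1
```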

The versions are tagged using annotated git tags. To access version 1.1.0 of the iOS master file, you would go to https://github.com/tutti-ch/segment-tagging/blob/1.1.0/tagging-ios.json.

Naming convention

Names for events and views should adhere to the object-action framework.

Names of events, views, properties, and traits are formatted as follows:

  • Names of events and views consist of two or more words, with each word capitalized. E.g.: Item Detail, Item Shared Through Facebook.
  • Properties and traits are a single token in camel case, e.g.: itemListMinPrice, name.
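The two formatting rules above lend themselves to an automated check. The following is a rough sketch rather than the repo's actual validation: the regular expressions are one possible reading of "two or more capitalized words" and "single camel-case token", and both helper names are hypothetical.

```python
import re

# Two or more space-separated words, each starting with an uppercase letter.
ACTION_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*( [A-Z][A-Za-z0-9]*)+$")

# One token: starts lowercase, later words start uppercase, no separators.
PROPERTY_NAME = re.compile(r"^[a-z][a-z0-9]*([A-Z][a-z0-9]*)*$")


def is_valid_action_name(name):
    """True if name looks like an event/view name, e.g. 'Item Detail'."""
    return ACTION_NAME.match(name) is not None


def is_valid_property_name(name):
    """True if name looks like a property/trait name, e.g. 'itemListMinPrice'."""
    return PROPERTY_NAME.match(name) is not None
```

A check like this only covers the format. Whether a name follows the object-action framework still needs a human review.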

Properties

The master files list the expected properties for each event and view. The tagging implementer should have logic that attempts to set each listed property. If setting a listed property is not possible, the property should be set to null or not set at all (the two are considered the same).
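As a minimal sketch of that logic, assuming the expected property names for an action have already been read from the master file (the function name build_properties and the shape of its arguments are illustrative, not the structure of the real master files):

```python
def build_properties(expected, available):
    """Attempt to set every expected property; skip those with no value.

    expected:  iterable of property names listed for the event/view.
    available: dict of values the platform could determine at runtime.
    Missing and None values are both left out, since null and "not set"
    are treated as the same thing.
    """
    properties = {}
    for name in expected:
        value = available.get(name)
        if value is not None:
            properties[name] = value
    return properties
```

Iterating over the expected list, rather than over whatever values happen to be available, also keeps names that are not in the tagging plan out of the payload.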

Q & A

Q: Can we have ignored IPs?

A: ???

Q: How to deal with multi-valued properties, e.g. color = [blue, red]?

A: Brantley suggested just passing an array, say ['blue', 'red'], which will get converted to the string "blue,red" or similar in Redshift. Alternatively we could pass a string "['blue', 'red']" in JSON format, which will arrive in Redshift in the same format. In the end both approaches result in text columns in Redshift that are easy to parse, either with split_part or with json_extract_array_element_text. I would advise passing multi-valued attributes as valid JSON strings. There may always be a destination that allows you to parse JSON in a text field into an array, but it is less likely that a destination will allow you to parse delimited text into an array.
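A short sketch of the JSON-string approach, using only Python's standard json module (the helper name encode_multi_valued is made up). Note that valid JSON uses double quotes, so the encoded string differs slightly from the single-quoted example in the answer above.

```python
import json


def encode_multi_valued(values):
    """Encode a multi-valued property as a JSON array string."""
    return json.dumps(list(values))


encoded = encode_multi_valued(["blue", "red"])
print(encoded)              # ["blue", "red"]
print(json.loads(encoded))  # ['blue', 'red']
```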

Q: The iOS SDK has a flushAt parameter. Unless number of events/screens reaches this value, they will not be sent to Segment. How to avoid then that we receive events/views 3 months after the fact because the user did not reach flushAt events during their previous session(s)?

A: Brantley will enquire and report back.

Q: In GA > Audience > Geo > Language, the language is not detected. Why?

A: Brantley mentioned that this is because we are using GA in cloud mode. We should try device mode.

Q: What happens in Redshift when you keep sending itemPrice as integer property then suddenly send it a text value?

A: Unknown value types will be dropped (=Null). If we change it long-term, we shall tell Segment, so they can replay the data in the new type.

Q: Are we allowed to compress the POST payload? E.g. will Segment libraries under the hood generate Content-Encoding: gzip header?

A: Nope. Only uncompressed. (client: max. 15kb | server: max. 32kb)

Q: What is the difference between saying "Redshift": true or leaving out Redshift entirely from the integrations object?

A: Doesn't change anything in the Redshift case. All other destinations will respond to this as expected. Redshift needs to be managed in the interface (--> selective sync).

Q: Is there an API-way to manage destinations? Especially GA custom dimensions?

A: No.

Q: When you have two Redshift destinations how can you refer to specific one in the integrations object?

A: You can't. See above, the integrations object has no effect on Redshift. Manage selective sync in the GUI.

Q: For the integration specification sent through the API to have any effect, must there be a dark grey line from the source to the integration (aka destination) in the Dashboard?

A: Need to ask Segment.

Q: What happens when wrong integration name is provided in the API call?

A: It fails. Ensure that you use the exact casing seen here - https://segment.com/docs/destinations/

Q: Events are assumed to be triggered by users. What about non-user triggered events? For example, we would like to have an event "Marked for NPS survey".

A: You could use a specific source for a job to only send subset to delighted. For example a Python source that sends events of these users only, based on whatever criteria.

Q: Are we able to set app version using Analytics.js?

A: Yes. Apps set it automatically, web can set it. However, it's better to add a property, since not all destinations support context mappings. That way we'd always have it available. Probably best to set both (context app.version and a custom property).

Q: Is new_visitor property tracked automatically by Segment?

A: Need to ask Segment.

Q: What are the possible values for item.paramters?

A: Need to ask backend devs.

Q: How is item.highlight flag defined? Is it true when the highlight applies?

A: Need to ask backend devs.

Q: How is item.epoch_time defined? Is it list time? Creation time?

A: The epoch_time is defined as the latest list_time of an ad. Note that only published ads can have a list_time.

Q: What is the payload size limit for a Segment API call?

A: The limit is 25 KB or 30 KB, Segment's Brantley was not sure. This limit is not enforced in the SDK but on Segment's backend. Most likely you will get an HTTP error, which is probably handled and transformed by the Segment SDK.

Q: Should we include personal data like name, phone number, address, city, and email in the properties?

A: Segment informed us that they will be GDPR compliant before the deadline. Segment told us it is safe to send them personal data and that many of their clients already do so. Segment will allow for easy data deletion, as required by GDPR. Segment will not pass private individuals' data to Google Analytics, a special case because Google does not want that data.

Q: Is there a limit on the number of properties per API call?

A: No, but there is a payload limit. See another question.

Q: How are properties that are null treated? Is it treated the same as simply not passing the property?

A: Segment told us that nulls do not need to be sent. However, if we always send null, then no corresponding Redshift table column will be created until at least one non-null value is sent to Segment.

Q: Are nested properties allowed?

A: Nested properties are allowed, but how they are treated depends on the destination. For Redshift as destination, arrays will be stringified (e.g. ['a', 'b', 'c'] will be put into a single varchar column), while nested objects will be flattened, with column names built from the keys of all levels separated by underscores. For example {'a': {'foo': 'bar'}} will be a column a_foo with value bar in Redshift.
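To get a feel for which columns a nested payload would produce, the underscore rule described above can be mimicked locally. This is an approximation for illustration only and not Segment's code: the function name is hypothetical, and json.dumps stands in for whatever exact array stringification Segment applies.

```python
import json


def flatten_for_warehouse(properties, prefix=""):
    """Approximate the column layout of nested properties in Redshift."""
    columns = {}
    for key, value in properties.items():
        column = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Nested objects: recurse and join the keys with underscores.
            columns.update(flatten_for_warehouse(value, column))
        elif isinstance(value, list):
            # Arrays: a single text column (exact format is an assumption).
            columns[column] = json.dumps(value)
        else:
            columns[column] = value
    return columns


print(flatten_for_warehouse({"a": {"foo": "bar"}}))  # {'a_foo': 'bar'}
```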

Q: Does Segment cover all of GA or just subset? Anything to look out for?

A: Segment covers most of the functionality of Google Analytics. The coverage depends on whether you send the "specced" events like Product Viewed, which will show up in the Ecommerce section in GA. For functionality not covered, a feature request can be made.

Q: Can you override common fields?

A: You can override common fields. For client-side sources, though, this should not be necessary. For data originating on the backend you may want to change the timestamp common field to send data from the past. Note that Google Analytics does not accept data with timestamps older than 4 hours.

Q: An anonymous user views 3 pages and then logs in. Will their 3 page views get the updated user ID of the logged-in user?

A: Yes, as per our call with Segment.

Q: Is it possible to update events and pages after the fact?

A: No, at least not in a programmatic way. Updating would be possible in special cases by contacting Segment and explaining the issue.

Q: Is it possible to delete data in Segment after the fact?

A: No, at least not in a programmatic way. Deleting would be possible in special cases by contacting Segment and explaining the issue.

Q: What is Segment's opinion on using the track method to log tutti's releases so that we can visualize the effect of releases on visits, sessions, ad insertions, etc.? We would also need to track these events under some special user ID.

A: We asked Segment but they are not aware of this. They will ask around internally and we may need to ask again.

Q: Does Segment identify presence of ad blockers?

A: No. Detecting ad blocker usage has to be done by clients.

Random mumbles

  • Would it be better to have default values like "Unknown" instead of null for properties? Need to think this one through carefully.
  • Marko suggested adding error messages to the master file. First we will need to define failure modes (e.g. non-existent property being set, non-existent event name, wrong property data type).
  • We assume that if a property is not available on one platform, it is not available on any other platform. Is that perhaps something we should not assume?
  • If my experience counts for anything, then we should think about pages we want to tag as a connected graph.
    • A connected, undirected graph implies that we should not have any numbering in page names like "ad insertion page 1", "ad insertion page 2". Such naming would be more appropriate if we could think of tagging as a tree (data structure).
  • The page method takes a category as its first argument.
    • What do we put there? Level-2 site? If we put level-2 site there, then we should be aware that there will be funnels that span multiple categories. Is that OK?
    • Should we think of the category as a partition of the connected graph? In other words, can a page, say premium features selection belong to more than one category, say "ad insertions" and "promote the ad"?
  • Creating the JSON tagging file by hand is tedious and error prone. It could result in errors, other than typos, that could go undetected, even when reviewed by another person.
    • Ideally we should generate JSON tagging file programmatically, with tagging stored in a relational database. An RDBMS provides extra layer of data quality through primary keys, foreign keys, triggers, and what not.
  • Do we make properties nullable or do we specify exactly the reason for the absence of value? E.g. region property could be "unknown", "not applicable", or "indeterminable" rather than just null.
  • For properties that are like enumerations (region, category), should the Segment SDK wrappers create callables? E.g. to populate region Bern into a property call region('bern') or region.bern. We want the call to fail if an invalid property value is used. Avoid storing garbage.
    • Failures should be monitored with tools like Sentry.
    • Ideally some invalid tagging should already be detected at compile time, though some values are only known at runtime, like the ad list id on the VI or the region on the LI.
      • So there will be errors which emerge only at runtime.
  • The tagging plan should be created with the thought that it is going to be mutated constantly. So deleting, updating, and adding new tags should be a smooth sailing.
  • Should we use site-wide unique page names or is it ok just to have uniqueness within each category? E.g. start page in category ad_insertion and in category ad_promotion may cause confusion in the tools downstream if these tools disregard the category. Ignoring the category will result in two distinct pages under single name start.
    • If we use site-wide unique page names, do we use a common prefix like ai for related pages? What about a suffix instead, for more user friendliness?
  • How do we become aware automatically of new properties being added to tutti API that we can subsequently pass on to Segment?
  • Assume by default that all properties must be passed for each pagescreen or event. Then provide a list of excluded properties. This way it is much harder to forget to include a property because that would require you to consciously add it to the excluded list. Having a list for each pagescreen that gives the included properties is more dangerous because it is easier to forget to add a property to the list (requires no conscious effort).
  • Since we are going for unique pagescreen and event names site-wide, not just category-wide, pages from the same coherent flow will inevitably end up with common parts, like the "ai" in ai_start_page, ai_preview_page. What if we made the prefix more distinct, like "[ai]_start_page", "[ai]_preview_page"? The names look a bit uglier this way, though.
  • The backend gives back category id, parent category id, and the names. How do we keep the names on the backend the same as what we have in DWH? Do we assume backend category names are the truth and rename the ad_categories_map in DWH to comply with that? As far as I remember the backend category names contain some minor errors. Alternatively we could provide a category code to name map in the tagging.json.
  • The object location_info contains keys area and subarea. What are those?
  • What happens if Segment is blocked by the client, perhaps because of an ad blocker?

Lookup values

The source of truth for some lookup objects like car brand, car color, etc., is the /conf/bconf directory of tutti repo. This directory contains the configuration files with the desired key-value lookup pairs, albeit some data munging is needed.

Car brand

To extract the car brand lookup values, get the file bconf.txt.cargroup. This file has Latin-1 encoding which my grep struggles with. Convert it to utf-8 with: iconv -f iso-8859-1 -t utf-8 bconf.txt.cargroup > bconf.txt.cargroup-utf8. Then extract the lookup key-values with: grep "^\*\.\*\.cargroup\.[0-9]\+\.name" bconf.txt.cargroup-utf8 | grep -o "[0-9]\+.*" | sed 's/.name=/ /' | tr [:upper:] [:lower:]
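The same extraction can be sketched in Python, which reads Latin-1 directly and so avoids the iconv step. This is an approximation of the shell pipeline above, not a tested replacement: it assumes lines of the form *.*.cargroup.<number>.name=<brand>, which is inferred from the grep and sed patterns, and the function names are hypothetical.

```python
import re

# Mirrors the grep pattern: lines starting with *.*.cargroup.<digits>.name
CARGROUP_LINE = re.compile(r"^\*\.\*\.cargroup\.(\d+)\.name=(.*)$")


def parse_car_brands(text):
    """Return {code: lowercase brand name} from bconf-style text."""
    brands = {}
    for line in text.splitlines():
        match = CARGROUP_LINE.match(line)
        if match:
            brands[int(match.group(1))] = match.group(2).strip().lower()
    return brands


def load_car_brands(path):
    """Read the Latin-1 encoded file and parse it."""
    with open(path, encoding="iso-8859-1") as handle:
        return parse_car_brands(handle.read())
```

Unlike tr, Python's lower() also lowercases accented letters, so the output may differ from the shell version for such brand names.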

Car type

To extract the car type lookup values, fetch the file bconf.txt.language. This file has Latin-1 encoding which my grep struggles with. Convert it to utf-8 with: iconv -f iso-8859-1 -t utf-8 bconf.txt.language > bconf.txt.language.utf8. Subsequently extract the key-values with grep "chassis\..*\.en.*\.value" bconf.txt.language.utf8 | tr [:upper:] [:lower:] | grep "chassis.*" -o | sed 's/chassis.//' | sed 's/.en.value=value:/ /'.

Car color

To extract the car color lookup values, fetch the file bconf.txt.car_params. This file has Latin-1 encoding which my grep struggles with. Convert it to utf-8 with: iconv -f iso-8859-1 -t utf-8 bconf.txt.car_params > bconf.txt.car_params.utf8. Subsequently extract the key-values with grep "color.*name.de" bconf.txt.car_params.utf8 | grep "[0-9]\+.*" -o | sed 's/.name.de=/ /' | tr [:upper:] [:lower:].

Squared meters

You can decode a squared meters value using one of two schemes. Which decoding scheme to use depends on the sub-category. For example, sub-category Land uses the decoding scheme with higher values of squared meters, while apartments use the scheme with lower values. The decoding schemes are stored in bconf.txt.adparams. This file has Latin-1 encoding which my grep struggles with. Convert it to utf-8 with: iconv -f iso-8859-1 -t utf-8 bconf.txt.adparams > bconf.txt.adparams.utf8. Subsequently extract the key-values with grep "sizelist.*list" bconf.txt.adparams.utf8 | grep "[0-9]\{1,2\}=.*" -o | tr = " ".

Mileage

To extract the mileage key-values, fetch the file bconf.txt.adparams. This file has Latin-1 encoding which my grep struggles with. Convert it to utf-8 with: iconv -f iso-8859-1 -t utf-8 bconf.txt.adparams > bconf.txt.adparams.utf8. The mileage value in this case is not a single mileage but a minimum and a maximum, that is, a single mileage code maps to a mileage range: 3=10'000 - 14'999. To extract the minimum mileages run grep mileage\\. bconf.txt.adparams.utf8 | sed 's/*.*.common.mileage.//' | tr -d "'" | grep "[0-9]\+=[0-9]\+" -o | tr "=" " ". To extract the maximum mileages run grep mileage\\. bconf.txt.adparams.utf8 | sed 's/*.*.common.mileage.//' | tr -d "'" | grep "[0-9]\+=[0-9]\+" | tr "=" " " | tr -d "-" | tr -s " " | cut -d" " -f1,3. Note that the last value 37=GT500K will not be picked up by the grep commands, but it is still a valid value!
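Both the minimum and the maximum can also be pulled out in one pass. The sketch below assumes lines of the form *.*.common.mileage.<code>=<min> - <max> with apostrophes as thousands separators, which is inferred from the example 3=10'000 - 14'999 and the sed pattern above; the function name is hypothetical. Like the grep commands, it does not pick up 37=GT500K, which needs to be added by hand.

```python
import re

# <code>=<min> - <max>, where numbers may contain ' as thousands separator.
MILEAGE_LINE = re.compile(r"common\.mileage\.(\d+)=([\d']+)\s*-\s*([\d']+)")


def parse_mileage_ranges(text):
    """Return {code: (min_mileage, max_mileage)} from bconf-style text."""
    ranges = {}
    for line in text.splitlines():
        match = MILEAGE_LINE.search(line)
        if match:
            code, low, high = match.groups()
            ranges[int(code)] = (
                int(low.replace("'", "")),
                int(high.replace("'", "")),
            )
    return ranges
```

When reading the real file, open it with encoding="iso-8859-1" to skip the iconv conversion.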

Rooms

To extract the key-values for rooms, fetch bconf.txt.adparams and convert it to UTF-8: iconv -f iso-8859-1 -t utf-8 bconf.txt.adparams > bconf.txt.adparams.utf8. Then grep: grep common.rooms bconf.txt.adparams.utf8 | grep "[0-9]\+=.*" -o | tr "=" " ".

Subarea

Subarea is what is referred to as "Quartier" on tutti. When making LI API requests, the parameter sa is the sub-area, but encoded as an integer. The mapping of integers to names can be obtained from the bconf text file. To extract the mapping you need the file bconf.txt.areas. Convert it to UTF-8: iconv -f iso-8859-1 -t utf-8 bconf.txt.areas > bconf.txt.areas.utf8. Then run: grep "subarea.*name.de.*" bconf.txt.areas.utf8.

Auto-triggered events

Auto-triggered events are events that are not explicitly triggered by tutti. Currently only iOS and Android have auto-triggered events, 3 of them: Application Opened, Application Updated, and Application Installed. We need to be aware of auto-triggered events because they don't have the same properties as our explicitly triggered events and therefore need to be validated differently. The auto-triggered events are defined here: https://segment.com/docs/spec/mobile/#overview-of-events and this page needs to be monitored for additions and removals of auto-triggered events.

Segment destination "Salesforce DMP"

The destination "Salesforce DMP" is not yet an official destination in Segment. As such, enabling it requires these steps:

  1. in the Segment admin interface (yeah login as admin)...
  2. go to respective source (Android-LIVE, iOS-PoC, whatever),
  3. push green button to add destination,
  4. in the catalog, ignore everything and change the URL to be https://app.segment.com/tutti/destinations/catalog/salesforce-dmp
  5. push green button Configure Salesforce DMP
  6. select the source to add DMP to your source
  7. configure all the shitz (in this case Nico does this)

Web path issue in Ad Insertion flow

There is an issue in web where the path context variable lags one page behind the actual flow, so the track contains a value for path for the previous page instead of for the current one.

Phil investigated this issue and found that Segment uses a function called canonical to determine the path of the current page (see the code here). In this function, Segment iterates over links inside the DOM until they find one with the attribute rel="canonical", which is what tells them the path. However, due to implementation details, the tagging is done before the canonicals are updated, which happens inside the React lifecycle. The result is that Segment gets the path before the value is updated and ends up getting the value for the previous AI step.

The web team will try to fix this issue when the Ad Insertion rewrite happens sometime in 2022.

It looks like the same issue happens in myads when switching tabs.