Data Pipeline Gap Analysis
Last batch loaded: 2026-05-13 10:28 UTC
Pipeline status: Critical
Critical: 9 comments missing from the comments table (89.9% loaded)
What was loaded today
Posts: all 1,918 posts inserted
Users: all 1,863 users inserted
Comments: 80 of 89 comments inserted; 9 missing
Keyword terms: all 12 keywords inserted
Keyword links: 2,029 of 2,142 scraped links mapped to a post; 113 unmapped
WhatsApp senders: all 61 senders inserted
WhatsApp links: 241 of 416 forwarded links mapped to a post; 175 unmapped
7 of 7 entities evaluated · Average per-entity load rate: 89.2%
Avg load rate: 89.2% (mean of per-entity load rates)
How this is calculated

Each entity's load rate is loaded ÷ raw × 100. The headline figure is the mean of all 7 entity rates. Note: every entity is weighted equally regardless of row count, so WhatsApp links (416 raw rows, 55.5%) moves the average as much as Posts (1,918 raw rows, 100.0%).

  • Posts: loaded (1,918) ÷ raw (1,918) × 100 = 100.0%
  • Users: loaded (1,863) ÷ raw (1,863) × 100 = 100.0%
  • Comments: loaded (80) ÷ raw (89) × 100 = 89.9%
  • Keyword terms: loaded (12) ÷ raw (12) × 100 = 100.0%
  • Keyword links: loaded (1,688) ÷ raw (2,142) × 100 = 78.8%
  • WhatsApp senders: loaded (61) ÷ raw (61) × 100 = 100.0%
  • WhatsApp links: loaded (231) ÷ raw (416) × 100 = 55.5%
(100.0% + 100.0% + 89.9% + 100.0% + 78.8% + 100.0% + 55.5%) ÷ 7 = 89.2%
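
The calculation above can be reproduced with a short Python sketch. The counts are copied from this report. The function names, and the pooled (row-weighted) rate shown for comparison, are illustrative assumptions and not part of the pipeline.

```python
# (loaded, raw) counts copied from the per-entity figures in this report.
COUNTS = {
    "Posts": (1918, 1918),
    "Users": (1863, 1863),
    "Comments": (80, 89),
    "Keyword terms": (12, 12),
    "Keyword links": (1688, 2142),
    "WhatsApp senders": (61, 61),
    "WhatsApp links": (231, 416),
}


def load_rate(loaded: int, raw: int) -> float:
    """Percentage of raw rows loaded. An empty source counts as 100% (assumption)."""
    return 100.0 if raw == 0 else loaded / raw * 100


def average_load_rate(counts: dict) -> float:
    """Unweighted mean of per-entity load rates, as described in the report."""
    rates = [load_rate(loaded, raw) for loaded, raw in counts.values()]
    return sum(rates) / len(rates)


def pooled_load_rate(counts: dict) -> float:
    """Row-weighted alternative (not used by the report): total loaded / total raw."""
    total_loaded = sum(loaded for loaded, _ in counts.values())
    total_raw = sum(raw for _, raw in counts.values())
    return load_rate(total_loaded, total_raw)


print(f"per-entity mean: {average_load_rate(COUNTS):.1f}%")
print(f"pooled:          {pooled_load_rate(COUNTS):.1f}%")
```

With these counts the per-entity mean rounds to 89.2%, matching the figure above, while the pooled rate comes out near 90.0% because the large, fully loaded Posts and Users tables dominate the row totals.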
Entity breakdown
Posts tiktok_posts_meta → posts
Raw 1,918
Loaded 1,918
Gaps detected 0
100.0% loaded
Clean

No gaps detected.

Users tiktok_users_meta → usernames
Raw 1,863
Loaded 1,863
Gaps detected 0
100.0% loaded
Clean

No gaps detected.

Comments tiktok_comments_meta → comments
Raw 89
Loaded 80
Gaps detected 9
89.9% loaded
Critical
🔴 Fatal gaps — record was NOT inserted
Gap reason: parent post not scraped (FK_DEPENDENCY_MISSING)
Count: 9
Impact: Comments cannot be inserted into the final comments table because their parent posts are absent.
Recommended action: Re-run the post scraper for the missing parent post IDs, then reload comments.
Samples:
  • comment_id: 7564920220838707976 · missing parent post
  • comment_id: 7565020009086796551 · missing parent post
  • comment_id: 7565095443253216008 · missing parent post
  • comment_id: 7565140281394397953 · missing parent post
  • comment_id: 7565105121052574472 · missing parent post
  • +4 more
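
A minimal sketch of how the missing parent post IDs could be collected before re-running the post scraper. The `post_id` field name, the record shape, and the tiny example IDs are hypothetical; the report does not show the real parent post IDs or the comments schema.

```python
from typing import Dict, Iterable, List, Set, Tuple


def split_orphans(
    comments: Iterable[Dict[str, str]], loaded_post_ids: Set[str]
) -> Tuple[List[Dict[str, str]], List[Dict[str, str]]]:
    """Partition comments into (insertable, orphaned) by parent post presence."""
    insertable, orphaned = [], []
    for comment in comments:
        target = insertable if comment["post_id"] in loaded_post_ids else orphaned
        target.append(comment)
    return insertable, orphaned


def missing_parent_ids(orphaned: Iterable[Dict[str, str]]) -> List[str]:
    """Deduplicated, sorted parent post IDs to hand to the post scraper."""
    return sorted({comment["post_id"] for comment in orphaned})


# Hypothetical example data: two comments point at a post that was never loaded.
comments = [
    {"comment_id": "c1", "post_id": "p10"},
    {"comment_id": "c2", "post_id": "p11"},
    {"comment_id": "c3", "post_id": "p11"},
]
insertable, orphaned = split_orphans(comments, loaded_post_ids={"p10"})
print(missing_parent_ids(orphaned))
```

Deduplicating the parent IDs matters because several orphaned comments can share one missing post, so the scraper queue may be shorter than the gap count.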
Keyword terms tiktok_script_out → keyword_sources
Raw 12
Loaded 12
Gaps detected 0
100.0% loaded
Clean

No gaps detected.

Keyword links tiktok_script_out → source_post_map
Raw 2,142
Loaded 1,688
Gaps detected 113
78.8% loaded
Critical
🔴 Fatal gaps — record was NOT inserted
Gap reason: post not scraped (FK_DEPENDENCY_MISSING)
Count: 113
Impact: Source link references a TikTok post (content_id known) that has never been ingested by the post scraper.
Recommended action: Add the post URL to the scraper queue and re-run the post loader.
Samples:
  • https://www.tiktok.com/@_____787658/video/7590052024738254102
  • https://www.tiktok.com/@2.xrxn/video/7447585483263069462
  • https://www.tiktok.com/@505.oj/video/7570337557997980935
  • https://www.tiktok.com/@a2a22b2b/video/7557468599313501452
  • https://www.tiktok.com/@abdullaalallash/video/7597142079390698774
  • +108 more
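
One way the recommended action could be prepared is to extract the content ID from each unmapped link and build a deduplicated scraper queue. This is a sketch under assumptions: the helper names and the queue shape are hypothetical, and the only input it relies on is the `/video/<id>` or `/photo/<id>` URL pattern visible in this report.

```python
import re
from typing import Iterable, List, Optional, Set
from urllib.parse import urlsplit, urlunsplit

# Matches the numeric ID in a /video/<id> or /photo/<id> path segment.
_CONTENT_PATH = re.compile(r"/(?:video|photo)/(\d+)")


def content_id(url: str) -> Optional[str]:
    """Return the numeric content ID from a full TikTok post URL, or None."""
    match = _CONTENT_PATH.search(urlsplit(url).path)
    return match.group(1) if match else None


def canonical_url(url: str) -> str:
    """Drop the query string and fragment (share and tracking parameters)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def scraper_queue(urls: Iterable[str], loaded_ids: Set[str]) -> List[str]:
    """Canonical URLs for posts not yet loaded, one entry per content ID."""
    queue = {}
    for url in urls:
        cid = content_id(url)
        if cid is not None and cid not in loaded_ids and cid not in queue:
            queue[cid] = canonical_url(url)
    return list(queue.values())


urls = [
    "https://www.tiktok.com/@2.xrxn/video/7447585483263069462",
    "https://www.tiktok.com/@2.xrxn/video/7447585483263069462?_r=1",
    "https://www.tiktok.com/@505.oj/video/7570337557997980935",
]
print(scraper_queue(urls, loaded_ids=set()))
```

Keying the queue on content ID rather than on the raw URL keeps the same post from being queued twice when it was shared with different tracking parameters.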
WhatsApp senders whatsapp_script_out → whatsapp_senders
Raw 61
Loaded 61
Gaps detected 0
100.0% loaded
Clean

No gaps detected.

WhatsApp links whatsapp_script_out → whatsapp_post_map
Raw 416
Loaded 231
Gaps detected 175
55.5% loaded
Critical
🔴 Fatal gaps — record was NOT inserted
Gap reason: post not scraped (FK_DEPENDENCY_MISSING)
Count: 1
Impact: Source link references a TikTok post (content_id known) that has never been ingested by the post scraper.
Recommended action: Add the post URL to the scraper queue and re-run the post loader.
Samples:
  • https://www.tiktok.com/@azrael_792/video/7592955395056880914?_r=1&u_code=e1dj87l7efe3j1&preview_pb=0&sharer_language=en&_d=ef9j05j49b9a6j&share_item_id=7592955395056880914&source=h5_m&timestamp=1769700969&item_author_type=2&utm_source=copy&tt_from=copy&enable_checksum=1&utm_medium=ios&share_link_id=8A05715D-CD2A-45C3-BA72-94D8971F3BDF&user_id=7091612352138757126&sec_user_id=MS4wLjABAAAATjb70hydNFr71Sd_yT4b2OltTibX2q5lR8YeucOkVt3a69WUs1hBGqszVXSyxBXu&social_share_type=0&ug_btm=b2001&utm_campaign=client_share&link_reflow_popup_iteration_sharer=%7B%22follow_to_play_duration%22:-1,%22click_empty_to_play%22:1,%22dynamic_cover%22:1,%22profile_clickable%22:1%7D&share_app_id=1233
🟡 Soft gaps — record inserted but incomplete
Gap reason: shortlink, resolver not run (URL_RESOLVER_PENDING)
Count: 150
Impact: vt.tiktok.com / vm.tiktok.com shortlink with no canonical_link populated yet. The resolver hasn't been run on this row. The post may already be in the database; we cannot know until the shortlink is followed.
Recommended action: Run resolve_canonical_links.py against raw_data.tiktok_script_out to populate canonical_link, then re-run the pipeline.
Samples:
  • https://vm.tiktok.com/ZNd7YaKeC/
  • https://vm.tiktok.com/ZS91AWAKmrBga-TyVa8/
  • https://vm.tiktok.com/ZS9e83RU16emy-cHj6M/
  • https://vm.tiktok.com/ZS9e8xgkNEyE6-49C4s/
  • https://vm.tiktok.com/ZSHokAGNt7uSG-6gQA4/
  • +145 more

Gap reason: non-video link, out of scope (OUT_OF_SCOPE)
Count: 24
Impact: Full URL that doesn't contain /video/<id> or /photo/<id>, likely a profile page, hashtag page, or other non-video TikTok URL. Not a pipeline failure; this link was never going to match a post.
Recommended action: Investigate why the scraper collected non-video links. No pipeline action needed.
Samples:
  • https://www.tiktok.com/@.hs.n.douaa?_r=1&_t=ZS-910slQDPi3W
  • https://www.tiktok.com/@.hs.n.douaa?_r=1&_t=ZS-910slQDPi3W
  • https://www.tiktok.com/@.majed.alshbh5?_r=1&_t=ZS-91U3r8J4pAA
  • https://www.tiktok.com/@.majed.alshbh5?_r=1&_t=ZS-91U3r8J4pAA
  • https://www.tiktok.com/@_____abn_sw______?_t=ZP-90bBLvanA2Y&_r=1
  • +19 more
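
The three gap reasons reported for WhatsApp links can be illustrated with a small classifier. This is a sketch of the rules as the report describes them, not the pipeline's actual code. The function name and the `MAPPED` label for links whose post is already loaded are assumptions.

```python
import re
from typing import Set
from urllib.parse import urlsplit

# Shortlink hosts named in the report; these need the resolver before matching.
SHORTLINK_HOSTS = {"vt.tiktok.com", "vm.tiktok.com"}

# A full post URL carries /video/<id> or /photo/<id> in its path.
_CONTENT_ID = re.compile(r"/(?:video|photo)/(\d+)")


def classify_link(url: str, loaded_ids: Set[str]) -> str:
    """Return the gap code for a forwarded link, or 'MAPPED' (hypothetical label)."""
    parts = urlsplit(url)
    if (parts.hostname or "") in SHORTLINK_HOSTS:
        # Target post is unknown until the shortlink redirect is followed.
        return "URL_RESOLVER_PENDING"
    match = _CONTENT_ID.search(parts.path)
    if match is None:
        # Profile pages, hashtag pages, and other non-post URLs.
        return "OUT_OF_SCOPE"
    return "MAPPED" if match.group(1) in loaded_ids else "FK_DEPENDENCY_MISSING"


for link in (
    "https://vm.tiktok.com/ZNd7YaKeC/",
    "https://www.tiktok.com/@.hs.n.douaa?_r=1&_t=ZS-910slQDPi3W",
    "https://www.tiktok.com/@azrael_792/video/7592955395056880914?_r=1",
):
    print(classify_link(link, loaded_ids=set()))
```

The shortlink check runs first because a shortlink path never contains a content ID, so without it those rows would be misfiled as out of scope.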