Per-entity load rate: loaded ÷ raw × 100. The overall rate is the unweighted mean of all 7 entity rates. Note: every entity is weighted equally regardless of its row count, so Keywords at 0.5% pulls the average down as strongly as Posts at 100% pulls it up.
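The calculation above can be sketched as follows. This is a minimal illustration, not the report's actual implementation: the function names, the zero-row handling, and the row counts are assumptions chosen only to reproduce the Keywords 0.5% / Posts 100% case from the note.

```python
from statistics import mean
from typing import Dict, Tuple


def load_rate(loaded: int, raw: int) -> float:
    """Percentage of raw rows that reached the final table."""
    if raw == 0:
        return 0.0  # assumption: an entity with no raw rows counts as 0%
    return loaded / raw * 100


def overall_rate(counts: Dict[str, Tuple[int, int]]) -> float:
    """Unweighted mean of the per-entity load rates."""
    return mean(load_rate(loaded, raw) for loaded, raw in counts.values())


# Illustrative (loaded, raw) counts only; these are not figures from the report.
example = {
    "Posts": (1000, 1000),  # 100%
    "Keywords": (1, 200),   # 0.5%
}
print(round(overall_rate(example), 2))
```

With only these two entities the mean lands near 50%, even though almost every row in the example loaded successfully. A row-weighted rate would tell a different story, which is why the equal-weighting caveat matters.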
No gaps detected.
No gaps detected.
| Gap reason | Count | Impact | Recommended action | Samples |
|---|---|---|---|---|
| parent post not scraped (FK_DEPENDENCY_MISSING) | 9 | Comments cannot be inserted into the final comments table because their parent posts are absent. | Re-run the post scraper for the missing parent post IDs, then reload comments. | comment_id: 7564920220838707976 · missing parent post; comment_id: 7565020009086796551 · missing parent post; comment_id: 7565095443253216008 · missing parent post; comment_id: 7565140281394397953 · missing parent post; comment_id: 7565105121052574472 · missing parent post; +4 more |
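The recommended action for this gap can be sketched as below. The `Comment` type, its `post_id` field, and both helper functions are hypothetical names introduced for illustration; the real schema is not shown in this report.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Comment:
    comment_id: str
    post_id: str  # hypothetical field: ID of the parent post


def split_by_parent(
    comments: Iterable[Comment], loaded_post_ids: Set[str]
) -> Tuple[List[Comment], List[Comment]]:
    """Separate loadable comments from those whose parent post is absent."""
    loadable: List[Comment] = []
    orphaned: List[Comment] = []
    for comment in comments:
        if comment.post_id in loaded_post_ids:
            loadable.append(comment)
        else:
            orphaned.append(comment)
    return loadable, orphaned


def posts_to_rescrape(orphaned: Iterable[Comment]) -> List[str]:
    """Distinct parent post IDs to hand back to the post scraper."""
    return sorted({comment.post_id for comment in orphaned})


# Made-up IDs for illustration only.
comments = [Comment("c1", "p1"), Comment("c2", "p2"), Comment("c3", "p2")]
loadable, orphaned = split_by_parent(comments, loaded_post_ids={"p1"})
print(posts_to_rescrape(orphaned))
```

Deduplicating the parent IDs before queuing them means several orphaned comments under one post trigger a single re-scrape.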
No gaps detected.
| Gap reason | Count | Impact | Recommended action | Samples |
|---|---|---|---|---|
| post not scraped (FK_DEPENDENCY_MISSING) | 113 | Source link references a TikTok post (content_id known) that has never been ingested by the post scraper. | Add the post URL to the scraper queue and re-run the post loader. | https://www.tiktok.com/@_____787658/video/7590052024738254102 https://www.tiktok.com/@2.xrxn/video/7447585483263069462 https://www.tiktok.com/@505.oj/video/7570337557997980935 https://www.tiktok.com/@a2a22b2b/video/7557468599313501452 https://www.tiktok.com/@abdullaalallash/video/7597142079390698774 +108 more |
No gaps detected.
| Gap reason | Count | Impact | Recommended action | Samples |
|---|---|---|---|---|
| post not scraped (FK_DEPENDENCY_MISSING) | 1 | Source link references a TikTok post (content_id known) that has never been ingested by the post scraper. | Add the post URL to the scraper queue and re-run the post loader. | https://www.tiktok.com/@azrael_792/video/7592955395056880914?_r=1&u_code=e1dj87l7efe3j1&preview_pb=0&sharer_language=en&_d=ef9j05j49b9a6j&share_item_id=7592955395056880914&source=h5_m&timestamp=1769700969&item_author_type=2&utm_source=copy&tt_from=copy&enable_checksum=1&utm_medium=ios&share_link_id=8A05715D-CD2A-45C3-BA72-94D8971F3BDF&user_id=7091612352138757126&sec_user_id=MS4wLjABAAAATjb70hydNFr71Sd_yT4b2OltTibX2q5lR8YeucOkVt3a69WUs1hBGqszVXSyxBXu&social_share_type=0&ug_btm=b2001&utm_campaign=client_share&link_reflow_popup_iteration_sharer=%7B%22follow_to_play_duration%22:-1,%22click_empty_to_play%22:1,%22dynamic_cover%22:1,%22profile_clickable%22:1%7D&share_app_id=1233 |
| Gap reason | Count | Impact | Recommended action | Samples |
|---|---|---|---|---|
| shortlink: resolver not run (URL_RESOLVER_PENDING) | 150 | vt.tiktok.com / vm.tiktok.com shortlink with no canonical_link populated yet. The resolver hasn't been run on this row. The post may already be in the database; we cannot know until the shortlink is followed. | Run resolve_canonical_links.py against raw_data.tiktok_script_out to populate canonical_link, then re-run the pipeline. | https://vm.tiktok.com/ZNd7YaKeC/ https://vm.tiktok.com/ZS91AWAKmrBga-TyVa8/ https://vm.tiktok.com/ZS9e83RU16emy-cHj6M/ https://vm.tiktok.com/ZS9e8xgkNEyE6-49C4s/ https://vm.tiktok.com/ZSHokAGNt7uSG-6gQA4/ +145 more |
| non-video link: out of scope (OUT_OF_SCOPE) | 24 | Full URL that doesn't contain `/video/<id>` or `/photo/<id>`; likely a profile page, hashtag page, or non-video TikTok URL. Not a pipeline failure; this link was never going to match a post. | Investigate why the scraper collected non-video links. No pipeline action needed. | https://www.tiktok.com/@.hs.n.douaa?_r=1&_t=ZS-910slQDPi3W https://www.tiktok.com/@.hs.n.douaa?_r=1&_t=ZS-910slQDPi3W https://www.tiktok.com/@.majed.alshbh5?_r=1&_t=ZS-91U3r8J4pAA https://www.tiktok.com/@.majed.alshbh5?_r=1&_t=ZS-91U3r8J4pAA https://www.tiktok.com/@_____abn_sw______?_t=ZP-90bBLvanA2Y&_r=1 +19 more |
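The rules described in the Impact column can be approximated with a small classifier. This is a sketch under stated assumptions, not the pipeline's own code: `classify_link` and the `POST_LINK` label are hypothetical, and the sketch ignores whether canonical_link is already populated, which the real check takes into account.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlparse

# Shortlink hosts named in the report.
SHORTLINK_HOSTS = {"vt.tiktok.com", "vm.tiktok.com"}
# Matches /video/<id> or /photo/<id> in the URL path and captures the numeric ID.
CONTENT_ID = re.compile(r"/(?:video|photo)/(\d+)")


def classify_link(url: str) -> Tuple[str, Optional[str]]:
    """Return (category, content_id) for a source link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in SHORTLINK_HOSTS:
        # The target is unknown until the shortlink is followed.
        return "URL_RESOLVER_PENDING", None
    match = CONTENT_ID.search(parsed.path)
    if match:
        # POST_LINK is a hypothetical label for links that carry a content_id.
        return "POST_LINK", match.group(1)
    # Profile pages, hashtag pages, and other non-video URLs.
    return "OUT_OF_SCOPE", None


print(classify_link("https://vm.tiktok.com/ZNd7YaKeC/"))
print(classify_link("https://www.tiktok.com/@505.oj/video/7570337557997980935"))
print(classify_link("https://www.tiktok.com/@.hs.n.douaa?_r=1&_t=ZS-910slQDPi3W"))
```

Matching only against the parsed path keeps query parameters such as share_item_id from being mistaken for a post ID.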